Consume events from AWS EventBridge in a self-hosted Kafka cluster outside AWS - apache-kafka

We have a SaaS that publishes its events to AWS EventBridge (a couple of million per day). We would love to consume those events and put them on our self-hosted Kafka cluster. What would be the best method to do this? We were thinking about Lambdas, but they seem expensive for this use case, and we don't want to manage too many other components. Does anyone have some good practices?

I was looking for a similar solution, but in my case it is from EventBridge to MSK within an AWS account. At this point it looks like the only option is to use a Lambda to feed into Kafka.

As of today, AWS only supports the following targets - https://docs.amazonaws.cn/en_us/eventbridge/latest/userguide/eb-targets.html#eb-console-targets
I have a similar use case where I need to send a message to RabbitMQ or Kafka on AWS, as I need priority logic implemented.
As AWS supports Lambda as a target, the message can be forwarded to a Lambda function, from where it can be fed to any system.
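If you do go the Lambda route, the function itself can stay small. Below is a rough sketch of an EventBridge-target Lambda that forwards each event to Kafka with the plain Java producer; the topic name, the KAFKA_BOOTSTRAP_SERVERS variable and the networking (the Lambda needs a route to your brokers, e.g. VPN/peering or a public TLS/SASL listener) are assumptions for illustration, not something EventBridge gives you out of the box.

```java
// Sketch of an EventBridge -> Kafka forwarder Lambda.
// Topic name and env var are placeholders; serialization is kept deliberately naive.
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Map;
import java.util.Properties;

public class EventBridgeToKafkaHandler implements RequestHandler<Map<String, Object>, Void> {

    // Reuse the producer across invocations of a warm Lambda container.
    private static final KafkaProducer<String, String> PRODUCER = createProducer();

    private static KafkaProducer<String, String> createProducer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", System.getenv("KAFKA_BOOTSTRAP_SERVERS")); // placeholder env var
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");
        return new KafkaProducer<>(props);
    }

    @Override
    public Void handleRequest(Map<String, Object> event, Context context) {
        // "detail-type" is part of the standard EventBridge envelope; used here as the record key.
        String key = String.valueOf(event.get("detail-type"));
        String value = event.toString(); // in real code, serialize the event with a JSON library
        PRODUCER.send(new ProducerRecord<>("eventbridge-events", key, value)); // placeholder topic
        PRODUCER.flush(); // don't lose buffered records when the invocation ends
        return null;
    }
}
```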

Related

How can I install a connector config in Kafka Connect

Is there any other way to deploy a connector config rather than POSTing the connector config to the Kafka Connect REST API? https://docs.confluent.io/platform/current/connect/references/restapi.html#tasks
I am thinking of some form of persistent approach, like a volume or S3, where Connect would grab those configs during bootstrap. I don't know / can't find whether that's available anywhere.
Regards
The REST API is the only way.
You can use abstractions like Terraform or Kubernetes resources, however, which wrap an HTTP client.
If you use other storage, that'll require you to write extra code to download files and call the REST API.
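To illustrate how small that extra code can be, here is a sketch that pushes a connector config via the REST API; the Connect URL, connector name and config values are placeholders. PUT /connectors/<name>/config is idempotent, so the same call works for creating and updating.

```java
// Sketch: push a connector config to the Kafka Connect REST API.
// The URL, connector name and config below are made-up placeholders.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DeployConnector {
    public static void main(String[] args) throws Exception {
        String connectUrl = "http://connect:8083";   // placeholder
        String connectorName = "my-sink";            // placeholder
        String config = """
            {
              "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
              "tasks.max": "1",
              "topics": "my-topic",
              "file": "/tmp/sink.txt"
            }
            """;

        // PUT /connectors/{name}/config creates the connector if missing, updates it otherwise.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(connectUrl + "/connectors/" + connectorName + "/config"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```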

Spring Cloud Data Flow with Azure Event Hub limitations?

We plan to use Spring Cloud Data Flow on Azure Cloud using Azure EventHub as a messaging binder.
On Azure Event Hub, there are hard limits:
100 namespaces
10 topics per namespace.
The Spring Cloud Azure Event Hub Stream Binder seems to be able to configure only one namespace, so how can we manage multiple namespaces?
Maybe we should use multiple binders, to have multiple instances of the Spring Cloud Azure Event Hub Stream Binder?
Does anyone have any ideas? or documentation we did not find?
Regards
RĂ©mi
Spring Cloud Data Flow and Spring Cloud Skipper support the concept of "platform accounts". Using that, you can set up multiple accounts, one for each namespace, or even for other K8s clusters. This opens up a lot of flexibility to work around these hard limits in the Azure stack.
We have a recipe on multi-platform deployments.
When deploying the streams from SCDF, you'd pick and choose the platform account (aka namespace or other configs), so the deployed stream apps (with the Azure binder in the classpath) would automatically be running in different namespaces, effectively dodging the limits enforced in Azure.
Provenance tracking of where the apps run and the audit trail are also automatically captured in SCDF, so at any given time you'd know who did what and in which namespace.

How can I have a GKE cluster "expire" and delete itself?

We stand up a lot of clusters for testing/PoC/development, and it's up to us to remember to delete them.
What I would like is a way of setting a TTL on an entire GKE cluster and having it deleted/purged automatically.
I could tag the clusters with a timestamp at creation and have an external process running on a schedule that reaps old clusters, but it'd be great if I didn't have to do that. It might be the only way, but maybe there is a GKE/K8s feature for this?
Is there a way to have the cluster delete itself without relying on an external service? I suppose it could spawn a Cloud Function itself, but I'm wondering if there is a native GKE/K8s feature to do this more elegantly.
You can create a GKE cluster with alpha features. Such clusters exist for one month at most and are then auto-deleted.
Read more: https://cloud.google.com/kubernetes-engine/docs/concepts/alpha-clusters
Try Cloud Scheduler and hook it up with your build server. Cloud Scheduler supports HTTP, App Engine, and Pub/Sub endpoints.
I don't believe there is a native way to do this, but it doesn't seem unreasonable to use Cloud Scheduler to periodically trigger a Cloud Function that looks for appropriately labelled clusters and triggers their deletion via the API.
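As a rough sketch of that reaper approach, assuming the google-cloud-container Java client and a made-up expires-at resource label holding an epoch-seconds timestamp that you set at creation time:

```java
// Sketch of a scheduled "cluster reaper". Label key and value format are examples only;
// GKE label values are restricted to lowercase letters, digits, hyphens and underscores,
// so an epoch-seconds value is used instead of an ISO timestamp.
import com.google.cloud.container.v1.ClusterManagerClient;
import com.google.container.v1.Cluster;

import java.time.Instant;

public class ClusterReaper {
    public static void main(String[] args) throws Exception {
        // "-" as the location lists clusters in all locations; project id is a placeholder.
        String parent = "projects/my-project/locations/-";
        try (ClusterManagerClient client = ClusterManagerClient.create()) {
            for (Cluster cluster : client.listClusters(parent).getClustersList()) {
                String expiresAt = cluster.getResourceLabelsMap().get("expires-at"); // example label
                if (expiresAt != null && Long.parseLong(expiresAt) < Instant.now().getEpochSecond()) {
                    String name = String.format("projects/%s/locations/%s/clusters/%s",
                            "my-project", cluster.getLocation(), cluster.getName());
                    System.out.println("Deleting expired cluster " + name);
                    client.deleteCluster(name);
                }
            }
        }
    }
}
```

Run it from a Cloud Function or any scheduled job with credentials that can delete clusters.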

Ignite: work with messaging and fewer dependencies

I'm writing a microservice for an existing Ignite cluster. I need basic communication with the Ignite messaging system and don't need other Ignite capabilities. I don't want to include the Ignite libraries, as they would bloat my microservice: ignite.zip is about 10 times larger than my server, and I only need a small subset of the functionality.
How can I send messages to existing Ignite cluster and receive messages from it?
EDIT: The Ignite documentation lists the REST API as one of the ways to use Ignite. I'm not sure how it can be used to work with Ignite messaging. Suppose I want to receive a message as soon as it becomes available in Ignite messaging? I don't want to poll for messages, as that's not efficient enough for me. If using the REST API, the question becomes: how (if it's possible at all) do I receive a message from the distributed messaging system using the Ignite REST API?
To do this you need only one JAR, ignite-core, which doesn't have any additional dependencies.
To achieve this functionality, you can start a client node in your application and use the IgniteMessaging API: https://apacheignite.readme.io/docs/messaging
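For example, a minimal client node that listens and sends on a topic could look like the sketch below; the topic name is a placeholder, and the discovery configuration must match your existing cluster.

```java
// Sketch: join the existing cluster as a client node and use IgniteMessaging.
// Only ignite-core is needed on the classpath. Topic name is an example;
// discovery SPI settings are omitted and must match the running cluster.
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class MessagingClient {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration().setClientMode(true);
        Ignite ignite = Ignition.start(cfg); // keep the node running for the service's lifetime

        // React to messages as they arrive (no polling); return true to keep the listener.
        ignite.message().localListen("my-topic", (nodeId, msg) -> {
            System.out.println("Received: " + msg);
            return true;
        });

        // Send a message to the same topic on the cluster.
        ignite.message().send("my-topic", "hello from the microservice");
    }
}
```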

How to use Kafka and Storm on Cloud Foundry?

I want to know if it is possible to run Kafka as a cloud-native application, and whether I can create a Kafka cluster as a service on Pivotal Web Services. I don't want only client integration; I want to run the Kafka cluster/service itself.
Thanks,
Anil
I can point you at a few starting points; there would be some work involved to go from those starting points to something fully functional.
One option is to deploy the Kafka cluster on Cloud Foundry (e.g. Pivotal Web Services) using Docker images. Spotify has Dockerized Kafka and kafka-proxy (including ZooKeeper). One thing to keep in mind is that PWS currently doesn't support apps with persistence (although this work is starting), so if you were to go this route right now, you would lose the data in Kafka when the application is rolled. Looking at that Spotify repo, it looks like the Docker images are generally run without any mounted volumes, so this persistence-less Kafka seems like it may be a valid use case (I don't know enough about Kafka to say).
The other option is to deploy Kafka directly on some IaaS (e.g. AWS) using BOSH. BOSH can be hard if you're seeing it for the first time, but it is the ideal way to deploy any distributed software that you want running on VMs. You will also be able to have persistent volumes attached to your Kafka VMs if necessary. Here is a Kafka BOSH release which may work.
Once you have your cluster running, you have two ways to integrate your Cloud Foundry applications with it. The simplest is just to provide it to your applications as a "user-provided service", which lets you flow Kafka cluster access info to your apps. The alternative would be to put a service broker in front of your cluster, which would be especially useful if you have many different people pushing apps that need to talk to the Kafka cluster. Rather than you having to manually tell people the access info each time, they can do something simple like cf bind-service SOME_APP YOUR_KAFKA_SERVICE. Here is a Kafka service broker, along with more info about service brokers in general.
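To make the user-provided-service option concrete: a bound app sees the access info in the VCAP_SERVICES environment variable. Here is a rough sketch of reading it, assuming a service named kafka whose credentials contain a brokers entry (both names are examples, not a fixed convention):

```java
// Sketch: read Kafka access info from VCAP_SERVICES inside a Cloud Foundry app.
// Assumes a user-provided service created roughly like:
//   cf create-user-provided-service kafka -p '{"brokers":"broker1:9092,broker2:9092"}'
//   cf bind-service SOME_APP kafka
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class KafkaBinding {
    public static String bootstrapServers() throws Exception {
        JsonNode vcap = new ObjectMapper().readTree(System.getenv("VCAP_SERVICES"));
        // User-provided services show up under the "user-provided" key.
        for (JsonNode service : vcap.path("user-provided")) {
            if ("kafka".equals(service.path("name").asText())) {
                return service.path("credentials").path("brokers").asText();
            }
        }
        throw new IllegalStateException("No 'kafka' service bound to this app");
    }
}
```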
According to the 12-factor app description (https://12factor.net/processes), Kafka should not run as an application on top of Cloud Foundry:
Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.
Kafka is often considered a "distributed commit log" and as such carries a large amount of state. Many companies use it to keep all events flowing through their distributed system of micro services for a long (sometimes unlimited) amount of time.
Therefore I would strongly recommend going for the second option in the accepted answer: Kafka topics should be bound to your applications in the form of stateful services.