Triggering a Kubernetes Job for a Kafka message

I have a kubernetes service that only does something when it consumes a message from a Kafka queue. The queue does not have messages very often, and running the service as a job triggered whenever a message is found would save resources.
I see that Kubernetes has this functionality for AMQP-type message services: https://kubernetes.io/docs/tasks/job/coarse-parallel-processing-work-queue/
Is there a way to adapt this for Kafka, given that Kafka does not support AMQP? I'd switch to a different messaging system, but I have other services that also read from this queue that require Kafka.

That Kafka consumer Service is all you really need. If you want to save resources, you can pair it with the KEDA autoscaler so that it scales up and down, even to zero, based on consumer group lag.
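For illustration, a hedged sketch of such a KEDA setup (the Deployment name, broker address, topic, and consumer group below are placeholders, not from the question):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
spec:
  scaleTargetRef:
    name: my-consumer            # hypothetical Deployment running the consumer
  minReplicaCount: 0             # scale to zero while the topic is idle
  maxReplicaCount: 5
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.example.com:9092
        consumerGroup: my-group
        topic: my-topic
        lagThreshold: "10"       # approximate lag per replica before scaling out
```

With minReplicaCount: 0 the consumer uses no resources at all between messages, which is effectively the job-per-message behaviour you are after.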
Or you can use a serverless platform such as Knative to trigger your workload from Kafka (or other messaging system) events.
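A Knative Eventing KafkaSource, for example, delivers each record as a CloudEvent to a sink that can scale to zero between events. A minimal sketch, assuming a Knative Service named event-handler (all names here are placeholders):

```yaml
apiVersion: sources.knative.dev/v1beta1
kind: KafkaSource
metadata:
  name: kafka-source
spec:
  consumerGroup: knative-group
  bootstrapServers:
    - kafka.example.com:9092
  topics:
    - my-topic
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-handler        # hypothetical Knative Service invoked per event
```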
Kafka does not support AMQP
Kafka Connect should be able to bridge AMQP to Kafka. For example, Apache Camel has connectors for both.
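As a sketch of that bridging idea, using Camel's YAML DSL (the queue name, topic name, and broker address are placeholder assumptions):

```yaml
# Camel route: drain an AMQP queue into a Kafka topic
- from:
    uri: "amqp:queue:incoming-work"
    steps:
      - to: "kafka:my-topic?brokers=kafka.example.com:9092"
```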

Related

How to connect an MQTT broker to a Knative Kafka source

Basically I want to send messages from an MQTT (Mosquitto) broker to a Knative event source (Kafka). With a plain Kafka broker I could use Confluent's Kafka Connect, but in this case it's a Knative event source rather than a broker. The problem lies with the conversion to CloudEvents.
Since your current MQTT broker can read/write to Kafka, you might use the Kafka source to convert the Kafka messages to CloudEvents and send them to your service.
If you're asking about how to connect the MQTT broker with Kafka, I'd suggest either finding or writing an MQTT source or using something outside the Knative ecosystem.
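For the "outside the Knative ecosystem" option, one possible sketch is an Apache Camel route (YAML DSL, Paho MQTT component; the broker addresses and topic names are assumptions) that copies MQTT messages into Kafka, from where a Kafka source can then turn them into CloudEvents:

```yaml
# Camel route: copy MQTT messages into Kafka for a KafkaSource to pick up
- from:
    uri: "paho:sensors/input?brokerUrl=tcp://mosquitto.example.com:1883"
    steps:
      - to: "kafka:knative-input?brokers=kafka.example.com:9092"
```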

How do I set up Elastic Node APM distributed tracing to work with Kafka and multiple Node services?

I'm using Kafka for a queue, with Node services producing and consuming messages to Kafka topics using kafka-node.
I've been using a home-brewed distributed tracing solution, but now we are moving to the Elastic APM.
This seems to be tailored to HTTP servers, but how do I configure it to work with Kafka?
I want to be able to track transactions like the following: Service A sends an HTTP request to Service B, which produces it to Kafka Topic C, from which it is consumed by Service D, which puts some data into Kafka Topic E, from which it is consumed by Service B.
I worked with the Elastic APM team, who had just rolled out this package: https://www.npmjs.com/package/elastic-apm-node
The directions are pretty self-explanatory, and it works like a charm.

Apache Kafka consumer groups and microservices running on Kubernetes, are they compatible?

So far, I have been using Spring Boot apps (with Spring Cloud Stream) and Kafka running without any supporting infrastructure (PaaS).
Since our corporate platform runs on Kubernetes, we need to move those Spring Boot apps into K8s so the apps can scale and so on. Obviously there will be more than one instance of every application, so we will define a consumer group per application to ensure unique delivery and processing of every message.
Kafka will be running outside Kubernetes.
Now my doubt is: since the apps deployed on k8s are accessed through a k8s Service that abstracts the underlying pods, and individual application pods can't be accessed directly from outside the k8s cluster, Kafka won't know how to reach individual instances of the consumer group to deliver messages, will it?
How can I make them work together?
Kafka brokers do not push data to clients; rather, clients poll() and pull data from the brokers. As long as the consumers can connect to the bootstrap servers, and the brokers' `advertised.listeners` are set to an IP and port the clients can reach, it will all work fine. The Kubernetes Service in front of your pods never enters the picture: each consumer pod opens its own outbound connection to the brokers, and the consumer group protocol assigns partitions across those connections.
Can Spring Cloud Data Flow solve your requirement to control the number of instances deployed?
And there is a community-released Spring Cloud Data Flow server for OpenShift:
https://github.com/donovanmuller/spring-cloud-dataflow-server-openshift

Basic Kafka topic availability checker

I need a simple health checker for Apache Kafka. I don't want something large and complex like Yahoo Kafka Manager; basically I want to check whether a topic is healthy and whether a consumer is healthy.
My first idea was to create a separate heartbeat topic and periodically send and read messages to/from it in order to check availability and latency.
The second idea is to read all the data from Apache ZooKeeper. I can get all brokers, partitions, topics etc. from ZK, but I don't know whether ZK can provide something like failure-detection info.
As I said, I need something simple that I can use in my app health checker.
Some existing tools you can try out, if you haven't yet:
Burrow: LinkedIn's Kafka consumer lag checking service.
Exhibitor: Netflix's ZooKeeper co-process for instance monitoring, backup/recovery, cleanup, and visualization.
Kafka System Tools: the Kafka command-line tools.
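If you do go with the heartbeat-topic idea from the question, it can stay very small. A hedged sketch as a Kubernetes CronJob that round-trips one message and fails if the read times out (it assumes an image with recent Kafka CLI tools on the PATH and a broker at kafka.example.com:9092, both placeholders):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kafka-heartbeat
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: heartbeat
              image: bitnami/kafka:latest   # assumption: any image shipping the Kafka CLI
              command:
                - /bin/sh
                - -c
                - |
                  # produce a timestamped message, then fail unless a record
                  # can be read back from the topic within 10 seconds
                  echo "heartbeat-$(date +%s)" | kafka-console-producer.sh \
                    --bootstrap-server kafka.example.com:9092 --topic heartbeat
                  kafka-console-consumer.sh \
                    --bootstrap-server kafka.example.com:9092 --topic heartbeat \
                    --from-beginning --max-messages 1 --timeout-ms 10000
```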

Apache Kafka - Advisory Message

Does Apache Kafka have an option similar to ActiveMQ's "Advisory Message"?
ActiveMQ Advisory Message -> http://activemq.apache.org/advisory-message.html
No, there is no such functionality. Instead of managing the cluster with advisory messages, Kafka relies on ZooKeeper: whenever some action is required (e.g. delete a topic, perform a rebalance), it creates an appropriate "command" node in ZK.
Having said this, Kafka exposes a lot of its underpinnings as JMX-accessible statistics.
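For example, those JMX statistics can be scraped with the Prometheus JMX exporter; a minimal, hedged config sketch (the pattern assumes the standard kafka.server MBean naming, e.g. kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec):

```yaml
# Prometheus JMX exporter rules for Kafka broker throughput counters
rules:
  - pattern: kafka.server<type=(.+), name=(.+)><>Count
    name: kafka_server_$1_$2_total
    type: COUNTER
```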