Monitoring a dynamic (scale up/down) Kafka cluster - apache-kafka

We are using a Kafka cluster and we want to monitor it.
Our current approach:
collect JMX metrics (Telegraf Jolokia plugin or jmxtrans)
store them in InfluxDB
render them with Grafana
But we ran into a problem: the list of Kafka brokers has to be configured statically.
Both jmxtrans and the Telegraf Jolokia plugin accept only a static list.
We would like a dynamic list (for example, configure only the Kafka cluster's ZooKeeper connection and fetch the broker list on each metrics-collection iteration) so that scaling Kafka up or down is handled.
Are there other tools for monitoring a dynamic Kafka cluster?

My final solution is the following:
A custom bash input plugin (because I don't want to build my own Telegraf with a single custom Go input plugin, and I don't know Go very well yet :) ).
In the bash plugin, zkCli is used to discover all Kafka nodes.
The bash script then posts a bulk Jolokia request to each Kafka node, aggregates the results, and formats them for Influx.
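The same discover, query, and format flow can be sketched in Python instead of bash. This is only an illustration, not the script described above: the MBean list, the Jolokia port 8778, the measurement name `kafka_broker`, and all function names are assumptions. The broker JSON is what ZooKeeper stores under /brokers/ids/<id>; reading it from ZooKeeper (for example via zkCli) is left out.

```python
import json
import urllib.request

# Hypothetical selection of MBean/attribute pairs to read from every broker.
DEFAULT_READS = [
    ("kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec", "OneMinuteRate"),
    ("kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions", "Value"),
]


def broker_from_znode(broker_id, znode_json):
    """Extract id, host and port from the JSON stored at /brokers/ids/<id>."""
    data = json.loads(znode_json)
    return {"id": str(broker_id), "host": data["host"], "port": data["port"]}


def jolokia_bulk_payload(reads=DEFAULT_READS):
    """Build the JSON array body of one Jolokia bulk read request."""
    return [{"type": "read", "mbean": mbean, "attribute": attr} for mbean, attr in reads]


def fetch_metrics(broker, jolokia_port=8778, timeout=5):
    """POST the bulk request to one broker's Jolokia agent (port is an assumption)."""
    req = urllib.request.Request(
        f"http://{broker['host']}:{jolokia_port}/jolokia/",
        data=json.dumps(jolokia_bulk_payload()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def to_influx_lines(broker, jolokia_response, measurement="kafka_broker"):
    """Turn a Jolokia bulk response into Influx line-protocol lines."""
    lines = []
    for item in jolokia_response:
        if item.get("status") != 200:
            continue  # skip reads this broker could not answer
        req = item["request"]
        # Use the MBean's name=... property as the "metric" tag.
        props = dict(p.split("=", 1) for p in req["mbean"].split(":", 1)[1].split(","))
        tags = f"broker_id={broker['id']},host={broker['host']},metric={props.get('name', 'unknown')}"
        lines.append(f"{measurement},{tags} {req['attribute']}={float(item['value'])}")
    return lines
```

Running the discovery on every collection interval, then calling `fetch_metrics` and `to_influx_lines` per broker and printing the lines, would let new brokers show up without a configuration change.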

Related

Attaching Jolokia to kafka brokers deployed using Strimzi operator

I am deploying a Kafka cluster on Kubernetes using the Strimzi Kafka operator. I need to be able to query Kafka JMX MBeans remotely through HTTP/REST using Jolokia (Jolokia is an agent that converts and exposes JMX MBean measurements for querying over HTTP…).
AFAIK, the Strimzi documentation does not provide any hint on how to attach Jolokia to Kafka brokers. Hence, can you please provide a hint on what kind of modifications to the deployment files (Strimzi operator and/or cluster deployment files) are needed so that Jolokia is attached to the broker/ZooKeeper instances?
Last I checked, Strimzi already offers the Prometheus JMX Exporter, not Jolokia. The Prometheus exporter also exposes MBeans over HTTP. https://strimzi.io/docs/operators/latest/overview.html#metrics-overview_str
But the concept is the same, and the fact that you're using Strimzi doesn't really matter, since the process is the same regardless of how Kafka is running: you need that JVM agent added to the KAFKA_OPTS environment variable. You might want to use a custom Docker image that has the Jolokia agent available.
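As a rough sketch of what "added to KAFKA_OPTS" means, the helper below builds the `-javaagent` flag in the `port=...,host=...` form the Jolokia JVM agent accepts and appends it to an existing value. The JAR path, port 8778, and the function names are assumptions for illustration; how the variable is set inside a Strimzi-managed pod is not covered here.

```python
def jolokia_agent_opt(jar_path, port=8778, host="0.0.0.0"):
    """Return the -javaagent flag for the Jolokia JVM agent."""
    return f"-javaagent:{jar_path}=port={port},host={host}"


def extend_kafka_opts(existing, agent_opt):
    """Append the agent flag to a KAFKA_OPTS value without duplicating it."""
    parts = existing.split()
    if agent_opt not in parts:
        parts.append(agent_opt)
    return " ".join(parts)


# Hypothetical location of the agent JAR inside a custom image.
opt = jolokia_agent_opt("/opt/jolokia/jolokia-jvm-agent.jar")
kafka_opts = extend_kafka_opts("-Xmx1g", opt)
```

The resulting `kafka_opts` string is what the broker's start script would need to see in its environment.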

Is there a sample example of opensource Kafka Cassandra connector configuration?

We are feeding events (logs) from Logstash to Apache Cassandra using the PerimeterX Cassandra Logstash output plugin. The plugin's throughput tops out at 8K because it opens only 2 connections to Cassandra, whereas Cassandra itself can consume data at a much higher rate, and we expect the throughput on the actual system to be 30K or higher.
Here, throughput is the capacity to consume the incoming events, measured in units/sec.
Hence we planned to introduce Kafka in the middle, which reaches a throughput of 45K with the Logstash output.
We are looking for help from this Stack Overflow post. We could configure the connector JAR as mentioned in the documentation, but there is no proper guide, and the current documentation is very confusing and goes in a loop about the configuration requirements. We don't see the plugin being called when Kafka is running with the target topic.
Some help with the correct configuration, or some documentation on Cassandra keyspaces, would be helpful.
After placing the JAR as described in the documentation, you need to run Kafka Connect, which will show all the configured connectors.
To start Kafka Connect in distributed mode, run the command below:
bin/connect-distributed.sh config/connect-distributed.properties
Kafka Connect exposes a REST API at http://localhost:8083.
You can configure your connectors through this REST API.
To register the connector, use the following endpoint:
POST /connectors – creates a new connector; the request body should be a JSON object containing a string name field and an object config field with the connector configuration parameters
A sample JSON for registering the connector is included in the kafka-connect-cassandra-sink-1.4.0.tar.gz file.
The official documentation provides a list of all endpoints.
More info available here
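A minimal sketch of the registration call follows. The request shape (a `name` string plus a `config` object sent to POST /connectors) is as described above. The connector class and the `contactPoints`, `loadBalancing.localDc`, and `topic.<topic>.<keyspace>.<table>.mapping` keys are written from memory of the DataStax sink and should be treated as assumptions to check against the sample JSON shipped in the tarball. All names and values in the usage note are hypothetical.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083"  # REST endpoint named in the answer


def connector_payload(name, topic, keyspace, table, mapping, contact_points, local_dc):
    """Build the JSON body for POST /connectors (config keys are assumptions)."""
    return {
        "name": name,
        "config": {
            "connector.class": "com.datastax.oss.kafka.sink.CassandraSinkConnector",
            "tasks.max": "1",
            "topics": topic,
            "contactPoints": contact_points,
            "loadBalancing.localDc": local_dc,
            # Maps Kafka record fields to columns of keyspace.table.
            f"topic.{topic}.{keyspace}.{table}.mapping": mapping,
        },
    }


def register_connector(payload, base_url=CONNECT_URL):
    """POST the payload to Kafka Connect and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `register_connector(connector_payload("logs-sink", "logs", "logging", "events", "id=value.id, message=value.message", "cassandra-host", "dc1"))` would register a sink for a hypothetical `logs` topic; a GET on /connectors afterwards should list it.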

Kafka Producer jmx metrics missing

We have Spring Boot applications deployed on OKD (The Origin Community Distribution of Kubernetes that powers Red Hat OpenShift). Without much tweaking by the devops team, Prometheus scrapes Kafka consumer metrics through the kubernetes-service-endpoints exporter job, as well as some producer metrics, but only for the Kafka Connect API, not for the standard Kafka producer API. I guess this is the configuration for that job:
https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml
What needs to change in the scrape config in order to collect the missing metrics?
This issue with Micrometer is the source of the problem.
So we could add the JMX exporter or wait for the issue to be resolved.

How to consume message from a topic in Prometheus

I am working on a Kafka --> Prometheus --> Grafana pipeline. I have a Java application which sends messages to a Kafka topic. But Prometheus shows only the message count of the topic. I run an instance of the JMX Exporter when I start Kafka:
export JMX_YAML=/home/kafka_2.12-2.3.0/prometheus/kafka-0-8-2.yml
export JMX_JAR=/home/kafka_2.12-2.3.0/prometheus/jmx_prometheus_javaagent-0.6.jar
export KAFKA_OPTS="$KAFKA_OPTS -javaagent:$JMX_JAR=7076:$JMX_YAML"
bin/kafka-server-start.sh config/server.properties
But I need to read the topic data in Prometheus. Is there any direct Kafka-to-Prometheus importer?
I have heard about the "Kafka Connect framework". How do I configure it inside Prometheus?
Prometheus doesn't run Kafka Connect; you would have to configure that separately.
Also, Prometheus is pull-based, so at the very least you would have to use the Pushgateway, assuming a Kafka connector for it did exist.
If you ultimately just want to display data in Grafana, there are existing connectors for Elasticsearch, Influx, Cassandra, and most JDBC databases.
Telegraf or Logstash could be used as alternatives to Kafka Connect, as well, or you can write your own consumer.
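If you do go the "write your own consumer" route, one possible shape is a small process that keeps the latest value per key and serves it in the Prometheus text format for scraping, which fits the pull model. This is a sketch under assumptions: the message schema (`{"sensor": ..., "value": ...}`), the metric name `topic_value`, and port 8000 are made up, and the Kafka client loop itself is left out (any consumer that yields raw message bytes would feed `observe`).

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class LatestValues:
    """Thread-safe store of the latest numeric value seen per sensor."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def observe(self, raw_message):
        """Record one message; the JSON schema here is hypothetical."""
        event = json.loads(raw_message)
        with self._lock:
            self._values[event["sensor"]] = float(event["value"])

    def render(self, metric="topic_value"):
        """Render the stored values in the Prometheus text exposition format."""
        with self._lock:
            items = sorted(self._values.items())
        lines = [f"# TYPE {metric} gauge"]
        lines += [f'{metric}{{sensor="{k}"}} {v}' for k, v in items]
        return "\n".join(lines) + "\n"


def serve(store, port=8000):
    """Expose store.render() over HTTP so Prometheus can scrape it."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = store.render().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", port), Handler).serve_forever()
```

A consumer thread would call `store.observe(message_bytes)` for each record while `serve(store)` runs in another thread. Note that this only suits numeric data; Prometheus is not a store for arbitrary message payloads.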

Is it possible to monitor kafka producer/consumer metrics in DCOS?

I am trying to do this without using Confluent Control Center, since I do not have a license.
I am able to see the Kafka Broker metrics by using dcos task metrics details <broker-id> and see that all of these are already exposed on my DCOS Prometheus instance.
However, I do not see any consumer/producer metrics available on Prometheus, despite having some producer/consumer tasks on DCOS.
Is there a process I can follow to expose Kafka producer/consumer metrics on DCOS? I tried the following: https://github.com/ibm-cloud-architecture/refarch-eda/blob/master/docs/kafka/monitoring.md .
But from my understanding we cannot use JMX on a Kafka instance hosted on DCOS (yet) (source: https://jira.mesosphere.com/browse/DCOS_OSS-3632?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel)
Any ideas?
You would have to add Prometheus JMX exporters to each of your Kafka Java processes, and then the Prometheus server would need to be able to scrape them. You would do this by downloading that JAR into each of the processes (containers?), then editing the KAFKA_OPTS environment variable to include the -javaagent option.
AFAIK, this does not require setting up a remotely accessible JMX port.
Note: Control Center doesn't monitor JMX values. It uses Kafka MetricsReporters and Interceptors. Using these interfaces, whether you choose to write your own implementations or find existing ones, doesn't require Control Center at all.