Kafka consumer lag through JMX - apache-kafka

I'm trying to monitor the lag of a consumer group in Kafka 0.10.
Our consumers keep track of their offsets in Kafka rather than ZooKeeper. This means I can get the figures using the following:
bin/kafka-consumer-groups.sh --bootstrap-server <broker> --describe --group <group-name>
This works fine. However, my broker already makes use of the Prometheus JMX exporter for collecting a number of stats. I've connected JConsole to the brokers but can't see the same data being reported in JMX as reported by kafka-consumer-groups.sh.
Is there any way to get this information from Kafka with JMX without needing any additional tools?

You could retrieve the attributes {topic}-{partition}.records-lag of the MBean kafka.consumer:type=consumer-fetch-manager-metrics,client-id={client-id} for all partitions. Note that these metrics are exposed by the consumer's JVM, not by the broker. They should be equivalent to the output of kafka-consumer-groups.sh.
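As a minimal sketch of how those attribute names can be turned into per-partition lag figures: the snippet below assumes the attribute values have already been read from the consumer's MBean by whatever JMX client you use, and only handles the naming scheme quoted above. The helper names and the sample values are hypothetical.

```python
import re

# Matches per-partition attribute names of the form "{topic}-{partition}.records-lag".
# The greedy topic group lets topic names themselves contain hyphens or dots.
_LAG_ATTR = re.compile(r"^(?P<topic>.+)-(?P<partition>\d+)\.records-lag$")


def fetch_manager_mbean(client_id):
    """Build the MBean name quoted in the answer for a given consumer client-id."""
    return f"kafka.consumer:type=consumer-fetch-manager-metrics,client-id={client_id}"


def partition_lags(attributes):
    """Extract {(topic, partition): lag} from a mapping of attribute name -> value."""
    lags = {}
    for name, value in attributes.items():
        match = _LAG_ATTR.match(name)
        if match:
            lags[(match["topic"], int(match["partition"]))] = int(value)
    return lags


# Hypothetical attribute values, as they might be read from one consumer's MBean.
sample = {
    "orders-0.records-lag": 12.0,
    "orders-1.records-lag": 0.0,
    "user-events.v2-3.records-lag": 7.0,
    "records-lag-max": 12.0,  # client-wide maximum, not a per-partition attribute
}

lags = partition_lags(sample)
total_lag = sum(lags.values())
```

Because each consumer only reports the partitions assigned to it, the figures from all consumers in the group would have to be combined to match the group-level view.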

Related

How to check that Kafka does rebalance?

I'm writing a Go service that works with Kafka. I have problems with bad commits when the broker rebalances. I want to run an experiment that forces Kafka to rebalance so I can see how the service behaves.
What I do:
running Kafka in Docker locally (broker, zookeeper, schema registry and control center)
created a topic with 2 partitions
running producer that sends messages to both partitions
Then I run two consumers with the same group ID and, after that, close one of them. It seems to me that the broker should start rebalancing at this moment. Or not? Whose logs should I check for it?
You can check that by running the following commands:
bin/kafka-consumer-groups --bootstrap-server host:9092 --list
and to describe:
bin/kafka-consumer-groups --bootstrap-server host:9092 --describe --group foo
Full documentation can be found here: Kafka consumer group
Whose logs should I check for it?
Check the logs of the consumer that is still running, assuming the library you're using actually logs such information.
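One way to use the describe command above to confirm a rebalance is to capture its output before and after closing a consumer and compare which consumer owns each partition. The sketch below assumes the output has a header row containing TOPIC, PARTITION and CONSUMER-ID columns (column sets differ between Kafka versions) and that no extra warning lines precede the header. The sample snapshots and function names are made up for illustration.

```python
def parse_describe(output):
    """Parse `--describe` output into a list of row dicts keyed by column name."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def assignments(output):
    """Map (topic, partition) -> consumer id for one snapshot."""
    return {
        (row["TOPIC"], int(row["PARTITION"])): row["CONSUMER-ID"]
        for row in parse_describe(output)
    }


def moved_partitions(before, after):
    """Return the partitions whose owning consumer differs between two snapshots."""
    return sorted(tp for tp in before if before[tp] != after.get(tp))


# Illustrative snapshots only; real output depends on the Kafka version.
BEFORE = """\
GROUP TOPIC  PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST        CLIENT-ID
foo   events 0         120            125            5   consumer-a  /172.18.0.1 client-a
foo   events 1         98             98             0   consumer-b  /172.18.0.1 client-b
"""

AFTER = """\
GROUP TOPIC  PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST        CLIENT-ID
foo   events 0         130            131            1   consumer-a  /172.18.0.1 client-a
foo   events 1         98             104            6   consumer-a  /172.18.0.1 client-a
"""

moved = moved_partitions(assignments(BEFORE), assignments(AFTER))
```

A non-empty `moved` list means at least one partition changed owner between the two snapshots, which is what you would expect once the group has rebalanced.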

Kafka consumer group description does not include all topics [duplicate]

What I want to achieve is to be sure that my Kafka Streams consumer does not have lag.
I have a simple Kafka Streams application that materializes one topic as a store in the form of a GlobalKTable.
When I try to describe the consumer group with the command:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group my-application-id
I can't see any results, and there is no error either. When I list all consumer groups with:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --all-groups
my application consumer is listed correctly.
Any idea where to find additional information about why I can't describe the consumer?
(Any other Kafka Streams consumers that write to topics can be described correctly.)
If your application only materializes a topic into a GlobalKTable, no consumer group is formed. Internally, the "global consumer" does not use subscribe() but assign(), there is no consumer group.id configured (as you can verify from the logs), and no offsets are committed.
The reason is that all application instances need to consume all topic partitions (i.e., a broadcast pattern). However, a consumer group is designed such that different instances read different partitions of the same topic. Also, per consumer group, only one offset can be committed per partition; if multiple instances read the same partition and committed offsets using the same group.id, the commits would overwrite each other.
Hence, using a consumer group while "broadcasting" data does not work.
However, all consumers should expose the lag metrics records-lag-max and records-lag (cf. https://kafka.apache.org/documentation/#consumer_fetch_monitoring). Hence, you should be able to hook in via JMX to monitor the lag. Kafka Streams also exposes client metrics via KafkaStreams#metrics().
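The overwrite problem described above can be illustrated with a toy model. This is not how the broker is implemented; it only assumes, as the answer states, that there is a single committed offset per partition per consumer group. The topic name and the offsets are hypothetical.

```python
# Toy model: one committed offset per (group.id, topic, partition) key.
committed = {}


def commit(group_id, topic, partition, offset):
    """Record a commit; a later commit for the same key replaces the earlier one."""
    committed[(group_id, topic, partition)] = offset


# Two application instances share a group.id and both read partition 0,
# as they would have to in a broadcast pattern.
commit("my-application-id", "config-topic", 0, 500)  # instance A has read up to 500
commit("my-application-id", "config-topic", 0, 120)  # instance B has read up to 120

# Only the last write survives, so instance A's progress is no longer recorded.
shared_offset = committed[("my-application-id", "config-topic", 0)]
```

With a single slot per partition, the stored offset reflects whichever instance committed last, so it cannot describe the progress of every instance at once.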

How to retrieve Kafka Consumer Configs

I have several consumers that connect to Kafka Cluster that I do not have control over. At the same time, I would like to have visibility into how those consumers are configured.
Is there an API to list all consumers (if there is one for publishers, it is an added benefit) and then read all their configs?
I am talking about these consumer settings:
https://docs.confluent.io/current/installation/configuration/consumer-configs.html#cp-config-consumer
This is not possible as most of those settings are configured at the consumer only and are not pushed to the brokers or any topic.
It's possible however to get a high-level description for a given consumer group:
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group consumer-group

Getting Kafka usage details

I am trying to find ways to get current usage statistics for my Kafka cluster. I am looking to collect the following information:
1. Number of topics in the Kafka cluster
2. Number of partitions per Kafka broker
3. Number of active consumers and producers
4. Number of client connections per Kafka broker
5. Number of messages on each partition, size of disk, etc.
6. Lagging replicas, consumer lag, etc.
7. Active consumer groups
I'm also open to any other statistics that can and should be collected; for now I am looking at collecting the stats above.
I can get 1 and 2 using ZooKeeper utilities, but I am lost on the rest. I have looked at the MBeans in JConsole but didn't find anything about the above. I also tried JmxTool to get these MBeans using a regex-based expression, but that didn't work either.
I am using Kafka v2.1 with the new consumer API, so ZooKeeper doesn't have any information about consumers.
Any pointers would be a great help!
Might as well use https://github.com/yahoo/kafka-manager or https://github.com/linkedin/cruise-control to get this information.
There are scripts under $KAFKA_HOME/bin which can help you.
Number of topics in kafka cluster
./kafka-topics.sh --zookeeper localhost:2181 --list
Number of partitions per kafka broker
./kafka-topics.sh --zookeeper localhost:2181 --describe
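The describe output lists the replica brokers for every partition, so the per-broker count still has to be tallied. A minimal sketch, assuming each partition line contains `Partition:`, `Leader:` and `Replicas:` fields in that order (exact whitespace and extra fields vary by version); the sample text and function name are made up for illustration.

```python
import re
from collections import Counter

# Captures the comma-separated replica list on each per-partition line.
# The topic summary line ("PartitionCount:...") does not match this pattern.
_PARTITION_LINE = re.compile(r"Partition:\s*\d+\s+Leader:\s*\S+\s+Replicas:\s*([\d,]+)")


def replicas_per_broker(describe_output):
    """Count how many partition replicas each broker id hosts."""
    counts = Counter()
    for match in _PARTITION_LINE.finditer(describe_output):
        for broker_id in match.group(1).split(","):
            counts[int(broker_id)] += 1
    return counts


# Illustrative output only; not captured from a real cluster.
SAMPLE = """\
Topic:events  PartitionCount:3  ReplicationFactor:2  Configs:
    Topic: events  Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2
    Topic: events  Partition: 1  Leader: 2  Replicas: 2,3  Isr: 2,3
    Topic: events  Partition: 2  Leader: 1  Replicas: 1,2  Isr: 1,2
"""

per_broker = replicas_per_broker(SAMPLE)
```

This counts every replica a broker hosts; to count only the partitions a broker leads, the `Leader:` field would be tallied instead.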
Number of messages on each partition, size of disk etc.
./kafka-log-dirs.sh --describe --bootstrap-server localhost:9092
Lagging replicas, consumer lag etc.
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group $GROUP_NAME --describe
Active consumer groups
Number of active consumers and producers
You can't get the active producers; see "Know existing producers for a kafka topic".
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
Number of client connections per kafka broker
./