Need to identify idle kafka consumers - apache-kafka

I have 3 Spark Streaming consumers with the same group id subscribed to a Kafka topic that has only one partition. That one partition is assigned to one consumer, and the other consumers are idle. We need to kill the idle consumers. We are using Spark with YARN.
How can we identify which consumers are idle?
Is there a Kafka command that shows which consumer is idle, or is there any other way to achieve the same?

kafka-consumer-groups --describe --group <name> will show all clients within a group and which of them are assigned a partition.
That said, it doesn't hurt to have backup processes in case one of the consumer threads fails.

Try this:
$ kafka-consumer-groups.sh --bootstrap-server <kafka_host:port> --describe --group group_name --members
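
To pick the idle members out of that listing automatically, something like the following could work. It is only a sketch: idle_members is a hypothetical helper name, and it assumes the --members output has a header row containing CONSUMER-ID and #PARTITIONS columns, which may differ between Kafka versions.

```shell
# Hypothetical helper: reads `--describe --members` output on stdin and
# prints the CONSUMER-ID of every member with zero assigned partitions.
# Assumes a header row naming the CONSUMER-ID and #PARTITIONS columns.
idle_members() {
  awk '
    # Skip lines until the header row, remembering the column positions.
    !found {
      for (i = 1; i <= NF; i++) {
        if ($i == "#PARTITIONS") pcol = i
        if ($i == "CONSUMER-ID") ccol = i
      }
      if (pcol && ccol) found = 1
      next
    }
    # Data rows: report members that have no partition assigned.
    NF >= pcol && $pcol == 0 { print $ccol }
  '
}
```

It would be used by piping the command above into it, for example: kafka-consumer-groups.sh --bootstrap-server <kafka_host:port> --describe --group group_name --members | idle_members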

Related

find no consumer in Kafka consumer group but consume is normal

I am using Kafka 0.11.
I have a consumer group with group.id = test, but when I run a command like kafka-consumer-groups --bootstrap-server localhost:9092 --group test --describe, I find no consumers under this group.
However, the service is consuming normally.
Does anyone know why? Thanks.

Kafka consumer group description does not include all topics [duplicate]

What I want to achieve is to be sure that my Kafka Streams consumer does not have lag.
I have a simple Kafka Streams application that materializes one topic as a store in the form of a GlobalKTable.
When I try to describe the consumer group with this command:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group my-application-id
I can't see any results, and there is no error either. When I list all consumer groups with:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --all-groups
my application consumer is listed correctly.
Any idea where I can find more information about why I can't describe the consumer?
(Any other Kafka Streams consumers that write to topics can be described correctly.)
If your application only materializes a topic into a GlobalKTable, no consumer group is formed. Internally, the "global consumer" does not use subscribe() but assign(), there is no consumer group.id configured (as you can verify from the logs), and no offsets are committed.
The reason is that all application instances need to consume all topic partitions (i.e., a broadcast pattern). However, a consumer group is designed so that different instances read different partitions of the same topic. Also, only one offset can be committed per partition per consumer group, so if multiple instances read the same partition and committed offsets using the same group.id, the commits would overwrite each other.
Hence, using a consumer group while "broadcasting" data does not work.
However, all consumers should expose the lag metrics records-lag-max and records-lag (cf. https://kafka.apache.org/documentation/#consumer_fetch_monitoring). Hence, you should be able to hook in via JMX to monitor the lag. Kafka Streams also exposes client metrics via KafkaStreams#metrics().

Re-processing/reading Kafka records/messages again - What is the purpose of Consumer Group Offset Reset?

My Kafka topic has 10 records/messages in total and 2 partitions with 5 messages each. My consumer group has 2 consumers, and each consumer has already read the 5 messages from its assigned partition. Now I want to re-process/read the messages in my topic from the beginning (offset 0).
I stopped my Kafka consumers and ran the following command to reset the consumer group offset to 0.
./kafka-consumer-groups.sh --group cg1 --reset-offsets --to-offset 0 --topic t1 --execute --bootstrap-server "..."
My expectation was that once I restarted my Kafka consumers they would start reading records from offset 0, i.e. the beginning, but that didn't happen and they polled from their last position, i.e. offset 5. Why is that so? I then had to make each of my consumers explicitly seek to offset 0 (the beginning) to re-process/read records from the start. In later test cycles, I didn't even run the above command to reset the offset for the Kafka consumer group.
My question is, if I have to make my consumers explicitly seek to beginning to make them re-process/read messages again, then what's the purpose of resetting the offset of kafka consumer group?
Handling Kafka consumer offsets is a bit tricky. A consumer uses the auto.offset.reset config only when its consumer group does not have a valid offset committed in the internal Kafka topic. (The other supported offset storage is ZooKeeper, but the internal Kafka topic is used as offset storage in recent Kafka versions.)
Consider below scenarios:
A consumer in the consumer group 'group1' has consumed 5 messages from topic 'testtopic', and the offsets have been committed to the internal Kafka topic. The next time the consumer starts, it will not use the 'auto.offset.reset' config. Instead, it will fetch the stored offset and continue fetching messages from that offset.
A consumer in the consumer group 'group2' is started as a new consumer to fetch messages from 'testtopic'. This is a new group, and no offsets are available in the internal Kafka topic. The 'auto.offset.reset' config is now used to decide where to start: either from the beginning of the topic, or from the latest offset (only new messages will be consumed).
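
The two scenarios can be modelled with a small decision function. This is only an illustration of the rule, not Kafka code: start_position and its arguments are hypothetical, and it treats a committed offset outside the current log range as invalid, in which case the policy applies as well.

```shell
# Hypothetical model of where a consumer starts reading one partition.
# Arguments: committed offset ("" if none), auto.offset.reset value,
# log start offset, log end offset.
start_position() {
  committed=$1
  policy=$2
  log_start=$3
  log_end=$4

  # A committed offset that lies inside the log wins;
  # auto.offset.reset is not consulted at all.
  if [ -n "$committed" ] && [ "$committed" -ge "$log_start" ] && [ "$committed" -le "$log_end" ]; then
    echo "$committed"
    return 0
  fi

  # No usable committed offset: fall back to the configured policy.
  case "$policy" in
    earliest) echo "$log_start" ;;
    latest)   echo "$log_end" ;;
    *)        echo "no valid offset and auto.offset.reset=$policy" >&2
              return 1 ;;
  esac
}
```

For example, start_position 5 earliest 0 10 yields 5 (the 'group1' case), while start_position "" earliest 0 10 yields 0 (the 'group2' case). After a successful reset to offset 0, the committed offset itself is 0, so the consumer starts from 0 regardless of the policy.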
The issue in your question is that the command to reset the offset is not taking effect, so you have to seek to the beginning manually before starting the consumer. The general form of the reset command is:
kafka-consumer-groups.sh --bootstrap-server <kafka_host:port> --group <group_id> [--topic <topic_name> or --all-topics] --reset-offsets [--to-earliest or --to-offset <offset>] --execute
There are three possible reasons for the reset command not working.
First, the log retention period is short and the offset you are trying to reset to is no longer available.
Second, a consumer instance in the consumer group is still running.
Third, the Kafka version is older than 0.11. The reset-offsets option is available only from Kafka 0.11.
From your question, the first and third cases are unlikely. Please check the second case: stop any running consumer instance and then try resetting the offsets.
The command below can be used to check whether a consumer group has an active consumer instance.
kafka-consumer-groups.sh --bootstrap-server <kafka_host:port> --group <group_id> --describe
Sample output:
Consumer group 'group1' has no active members.
TOPIC  PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST  CLIENT-ID
intro  0          0               99              99
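
The check and the reset can be chained so that the reset is only attempted once the group is reported as empty. The following is a rough sketch under a few assumptions: has_no_active_members and preview_reset are hypothetical helper names, the check relies on the exact message text shown in the sample output above, and 2>&1 is used because some versions may print that message on stderr. The --dry-run flag only previews the new offsets; replace it with --execute to apply them.

```shell
# Hypothetical helper: succeeds when the describe output on stdin says
# that the group has no active members (message text as in the sample).
has_no_active_members() {
  grep -q "has no active members"
}

# Hypothetical wrapper: previews a reset to the earliest offset, but only
# when the group is reported as having no active members.
# Arguments: bootstrap server, group id, topic.
preview_reset() {
  bootstrap=$1
  group=$2
  topic=$3

  if kafka-consumer-groups.sh --bootstrap-server "$bootstrap" \
       --group "$group" --describe 2>&1 | has_no_active_members; then
    kafka-consumer-groups.sh --bootstrap-server "$bootstrap" \
      --group "$group" --topic "$topic" \
      --reset-offsets --to-earliest --dry-run
  else
    echo "group $group was not reported as empty; stop its consumers first" >&2
    return 1
  fi
}
```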

How to check the lag of a consumer in Kafka which is assigned with a particular partition of a topic?

I want to check the lag for a consumer group whose partitions of a particular topic were assigned manually. Is this possible? I am using Kafka 0.10.0.1. I ran sh kafka-run-class.sh kafka.admin.ConsumerGroupCommand --new-consumer --describe --bootstrap-server localhost:9092 --group test, but it says no group exists, so I wonder whether we can check the lag for a consumer when we assign a partition manually.
In Nussknacker we are using the AKHQ GUI tool, which provides various monitoring options such as consumer and consumer group lag, as well as general Kafka operations such as topic, topic data, and schema registry management.
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group <group_id>
If you want API support or visual lag monitoring, you can use https://github.com/yahoo/kafka-manager
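
If you only need a single number from the command line, the LAG column of the describe output can be summed. This is a sketch: total_lag is a hypothetical helper, it assumes the output has a header row with a LAG column, and it skips rows where the lag is not a plain number (for example "-" when no offset has been committed). It only helps when the consumer commits offsets under a group.id; otherwise there is nothing for the tool to describe.

```shell
# Hypothetical helper: reads `--describe` output on stdin and prints the
# sum of the LAG column over all partitions.
total_lag() {
  awk '
    # Skip lines until the header row, remembering where LAG sits.
    !col {
      for (i = 1; i <= NF; i++) if ($i == "LAG") col = i
      next
    }
    # Add up rows whose LAG field is a plain non-negative integer.
    NF >= col && $col ~ /^[0-9]+$/ { sum += $col }
    END { print sum + 0 }
  '
}
```

Usage would look like: ./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group <group_id> | total_lag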