How to check consumer offsets when the offset store is Kafka?

Since the 0.8.1.1 release, Kafka provides the option to store consumer offsets in Kafka itself, instead of ZooKeeper (see this).
I'm not able to figure out how to check the details of the consumed offsets, as the current tools only provide consumer offset checks against ZooKeeper (I'm referring to this).
If there are any tools available to check consumer offset, please let me know.
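Besides the CLI tools discussed in the answers below, recent client libraries (roughly 2.0 and later) expose the Kafka-stored offsets through the AdminClient. A minimal sketch, assuming a hypothetical group my-group and a broker on localhost:9092:

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ShowCommittedOffsets {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        try (AdminClient admin = AdminClient.create(props)) {
            // Offsets the group has committed to the __consumer_offsets topic
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets("my-group")        // placeholder group name
                     .partitionsToOffsetAndMetadata()
                     .get();
            committed.forEach((tp, om) ->
                System.out.printf("%s-%d -> %d%n", tp.topic(), tp.partition(), om.offset()));
        }
    }
}

This reads the same committed offsets that the tools below display, straight from the group coordinator.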

I'm using Kafka 0.8.2 with offsets stored in Kafka. This tool works well for me:
./kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
--topic your-topic \
--group your-consumer-group \
--zookeeper localhost:2181
You get all the information you need: topic size, consumer lag, owner.

The following single command gives enough detail:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-second-application
You will get details like this:
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
first_topic 0 4 4 0 consumer-1-7cb31cf3-1621-4635-8f95-6ae85215b31b /10.200.237.53 consumer-1
first_topic 1 3 3 0 consumer-1-7cb31cf3-1621-4635-8f95-6ae85215b31b /10.200.237.53 consumer-1
first_topic 2 3 3 0 consumer-1-7cb31cf3-1621-4635-8f95-6ae85215b31b /10.200.237.53 consumer-1
first-topic 0 4 4 0 - - -

I'm using Kafka 2.1 and I use the kafka-consumer-groups command, which gives useful details like current offset, log-end offset, lag, etc. The simplest command syntax is
kafka-consumer-groups.sh \
--bootstrap-server localhost:29092 \
--describe --group <consumer group name>
And the sample output looks like this
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
your.topic 1 17721650 17721673 23 consumer-159-beb9050b /1.2.3.4 consumer-159
your.topic 3 17718700 17718719 19 consumer-159-beb9050b /1.2.3.4 consumer-159
your.topic 0 17721700 17721717 17 consumer-159-beb9050b /1.2.3.4 consumer-159
HTH
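For reference, the LAG column above is just LOG-END-OFFSET minus CURRENT-OFFSET. If you ever need the same numbers programmatically, here is a rough sketch that extends the AdminClient example near the top of this page (broker address and group name are placeholders):

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class ConsumerLag {
    public static void main(String[] args) throws Exception {
        String bootstrap = "localhost:29092";   // placeholder
        String group = "my-consumer-group";     // placeholder

        Properties adminProps = new Properties();
        adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
        Properties consumerProps = new Properties();
        consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);

        try (AdminClient admin = AdminClient.create(adminProps);
             KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(
                 consumerProps, new ByteArrayDeserializer(), new ByteArrayDeserializer())) {
            // CURRENT-OFFSET: what the group has committed
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets(group).partitionsToOffsetAndMetadata().get();
            // LOG-END-OFFSET: the latest offset of each partition
            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(committed.keySet());
            committed.forEach((tp, om) -> System.out.printf(
                "%s %d CURRENT-OFFSET=%d LOG-END-OFFSET=%d LAG=%d%n",
                tp.topic(), tp.partition(), om.offset(),
                endOffsets.get(tp), endOffsets.get(tp) - om.offset()));
        }
    }
}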

Related

Why is my kafka topic not being reset to 0?

When I describe one of my topics I get this status:
➜ local-kafka_2.12-2.0.0 bin/kafka-consumer-groups.sh --bootstrap-server myip:1025 --group mygroup --describe
Consumer group 'mygroup' has no active members.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
mytopic 0 858 858 0 - - -
When I try to reset it to the earliest, I get this status:
➜ local-kafka_2.12-2.0.0 bin/kafka-consumer-groups.sh --bootstrap-server myip:1025 --group mygroup --topic mytopic --reset-offsets --to-earliest --execute
TOPIC PARTITION NEW-OFFSET
mytopic 0 494
I would have expected the new offset to be at 0 rather than 494.
Question
1 - In the describe output the current offset is shown as 858, but resetting to earliest shows 494, which would leave a lag of 364. My question is: what happened to the other 494 (858 - 364) offsets? Are they gone because of some configuration on this topic? My retention.ms is set to 1 week.
2 - If the 494 records are gone, is there a way to recover them somehow?
If you have access to the data directory of your Kafka brokers, you can inspect the data that is present on disk using the command kafka-run-class.bat kafka.tools.DumpLogSegments.
For more information see e.g. here: https://medium.com/@durgaswaroop/a-practical-introduction-to-kafka-storage-internals-d5b544f6925f
Your data might have been deleted either due to the log retention time or due to the size limit on the logs (the configuration property log.retention.bytes).
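If you want to check which retention limits are in effect for the topic, one option is to read its configuration through the AdminClient. A small sketch, assuming the topic name mytopic from the question and a placeholder broker address:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class ShowRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "myip:1025"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "mytopic");
            Config config = admin.describeConfigs(Collections.singleton(topic))
                                 .all().get().get(topic);
            // Time-based and size-based retention; whichever limit is hit first triggers deletion
            System.out.println("retention.ms    = " + config.get("retention.ms").value());
            System.out.println("retention.bytes = " + config.get("retention.bytes").value());
        }
    }
}

At the topic level the size limit is called retention.bytes; log.retention.bytes is the broker-wide default.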

Kafka: Describe Consumer Group Offset

While the Kafka consumer application is up and running, we are able to use kafka-consumer-groups.sh to describe and retrieve the offset status.
However, if the application goes down, the command just reports that the group is REBALANCING.
Is there a way to just see the lag of a particular consumer group, even if the application is not up and running?
For example, I would like this output
GROUP|TOPIC|PARTITION|CURRENT-OFFSET|LOG-END-OFFSET|LAG
hrly_ingest_grp|src_hrly|4|63832846|63832846|0
hrly_ingest_grp|src_hrly|2|38372346|38372346|0
hrly_ingest_grp|src_hrly|0|58642250|58642250|0
hrly_ingest_grp|src_hrly|5|96295762|96295762|0
hrly_ingest_grp|src_hrly|3|50602337|50602337|0
hrly_ingest_grp|src_hrly|1|29288993|29288993|0
You can use kt (Kafka tool) - https://github.com/fgeller/kt
The command to query offset and lag is as follows:
kt group -group groupName -topic topicName -partitions all
Even if the consumer application is down, this command will show the offset of each consumer in that group:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group
Output:
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
topic3 0 241019 395308 154289 consumer2-e76ea8c3-5d30-4299-9005-47eb41f3d3c4 /127.0.0.1 consumer2
topic2 1 520678 803288 282610 consumer2-e76ea8c3-5d30-4299-9005-47eb41f3d3c4 /127.0.0.1 consumer2
topic3 1 241018 398817 157799 consumer2-e76ea8c3-5d30-4299-9005-47eb41f3d3c4 /127.0.0.1 consumer2
topic1 0 854144 855809 1665 consumer1-3fc8d6f1-581a-4472-bdf3-3515b4aee8c1 /127.0.0.1 consumer1
topic2 0 460537 803290 342753 consumer1-3fc8d6f1-581a-4472-bdf3-3515b4aee8c1 /127.0.0.1 consumer1
topic3 2 243655 398812 155157 consumer4-117fe4d3-c6c1-4178-8ee9-eb4a3954bee0 /127.0.0.1 consumer4
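These approaches read the committed offsets from the broker, which is why they keep working when no consumer is connected. If you also want to check the group's state programmatically (for example to distinguish an empty group from one that is stuck rebalancing), here is a hedged sketch using the AdminClient, with the group name taken from the question and a placeholder broker address:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ConsumerGroupDescription;

public class ShowGroupState {
    public static void main(String[] args) throws Exception {
        String group = "hrly_ingest_grp";  // group name from the question
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            ConsumerGroupDescription desc =
                admin.describeConsumerGroups(Collections.singleton(group)).all().get().get(group);
            // State is Empty when no members are connected; committed offsets remain readable
            System.out.println("state   = " + desc.state());
            System.out.println("members = " + desc.members().size());
        }
    }
}

The committed offsets and lag can then be fetched exactly as in the earlier sketches, regardless of whether any member is connected.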

Kafka Stream Consumer Group not showing offset

I have two Kafka Streams applications.
I can see their consumer groups using:
>bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092
streams-distribution-app
streams-collection-app
I can see offsets of the distribution-app using:
>bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group streams-distribution-app
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
o365_user_activity 0 3900 3985 85 streams-distribution-app-StreamThread-1 /127.0.0.1
o365_cassandra_data_load 0 - 0 - streams-distribution-app-StreamThread-1 /127.0.0.1 streams-distribution-app
But I can't view the offsets for collection-app:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group streams-collection-app
Note: This will not show information about old Zookeeper-based consumers.
The app is running and processing logs. I don't understand what could be the reason for the offset information not being shown.
UPDATE:
After quite some time (around 20 mins) I can finally see the offsets.
But there is a lot of lag.
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group streams-collection-app
Note: This will not show information about old Zookeeper-based consumers.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
o365_activity_contenturl 0 200 64941 64741 streams-collection-app-StreamThread-1 /127.0.0.1 streams-collection-app
These are the configs for stream-collection-app:
props.put(StreamsConfig.producerPrefix(ProducerConfig.RETRIES_CONFIG), Integer.MAX_VALUE);
props.put(StreamsConfig.producerPrefix(ProducerConfig.RETRY_BACKOFF_MS_CONFIG), Integer.MAX_VALUE);
props.put(StreamsConfig.producerPrefix(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG), Integer.MAX_VALUE);
props.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_RECORDS_CONFIG), 200);
UPDATE 2:
I tried increasing the num.stream.threads config and set it to 4, then restarted the application with some 40k+ records on the source topic.
Still there was no output when describing the streams app consumer group. After about 10-15 mins I could see the following offsets:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group streams-collection-app
Note: This will not show information about old Zookeeper-based consumers.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
o365_activity_contenturl 0 200 11274 11074 streams-collection-app-StreamThread-1 /127.0.0.1 streams-app-StreamThread-1
o365_activity_contenturl 1 200 11384 11184 streams-collection-app-StreamThread-2 /127.0.0.1 streams-app-StreamThread-2
o365_activity_contenturl 3 200 11343 11143 streams-collection-app-StreamThread-4 /127.0.0.1 streams-app-StreamThread-4
o365_activity_contenturl 2 200 11471 11271 streams-collection-app-StreamThread-3 /127.0.0.1 streams-app-StreamThread-3

kafka __consumer_offsets and unclean shutdown?

Can anyone explain how __consumer_offsets actually works?
For testing purposes I have a single instance of Kafka 0.11.0.0 with these overridden settings:
offsets.topic.replication.factor=1
broker.id=0
offsets.retention.minutes=43200
log.flush.scheduler.interval.ms=60000
log.retention.hours=720
log.flush.interval.ms=60000
log.retention.check.interval.ms=300000
log.segment.bytes=1073741824
And I have a single consumer called pigeon.
Everything works fine until I do a kill -9 on the Kafka server (unclean shutdown). After that it seems that the client loses its offset.
Before the kill -9:
Log from the client (using kafka-reactive):
2017-10-12 13:08:32.960 [DEBUG] o.a.k.c.c.i.ConsumerCoordinator - Group pigeon committed offset 275620 for partition ClusterEvents-0
Looking at ConsumerGroupCommand:
# ./kafka-run-class.sh kafka.admin.ConsumerGroupCommand --bootstrap-server localhost:9092 --group pigeon --new-consumer --describe
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
ClusterEvents 0 275620 275620 0 pigeon-1507813552573-b3c74e75-04c1-48d0-bf5a-b66c203861aa/10.84.2.238 pigeon-1507813552573
And looking at __consumer_offsets:
#./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic __consumer_offsets --from-beginning --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter" | grep pigeon
[pigeon,ClusterEvents,0]::[OffsetMetadata[264458,NO_METADATA],CommitTime 1507285596838,ExpirationTime 1509877596838]
<.....>
[pigeon,ClusterEvents,0]::[OffsetMetadata[275620,NO_METADATA],CommitTime 1507813712886,ExpirationTime 1510405712886]
So the first offset in __consumer_offsets is 264458, and we can see that offset 275620 is committed.
After kill -9:
Now let's do a kill -9 on the Kafka process, stop the consumer while Kafka is down, and after Kafka restarts look at the same data:
# ./kafka-run-class.sh kafka.admin.ConsumerGroupCommand --bootstrap-server localhost:9092 --group pigeon --new-consumer --describe
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
ClusterEvents 0 264458 275645 11187 - - -
#./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic __consumer_offsets --from-beginning --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter" | grep pigeon
[pigeon,ClusterEvents,0]::[OffsetMetadata[264458,NO_METADATA],CommitTime 1507285596838,ExpirationTime 1509877596838]
<.....>
[pigeon,ClusterEvents,0]::[OffsetMetadata[275620,NO_METADATA],CommitTime 1507813712886,ExpirationTime 1510405712886]
So although __consumer_offsets contains the same information, namely that offset 275620 is committed, ConsumerGroupCommand reports that the current offset is 264458. Why?
How does __consumer_offsets actually work?
If I restart the consumer, it will start consuming from offset 264458 and commit the latest offset; if I then do a kill -9 on Kafka again, it will once more start consuming from 264458.
Am I misunderstanding how this should work? At first I thought this was due to log changes not being fsynced to disk, so I decreased log.flush.interval.ms to 60s and waited a couple of minutes between kills, but that does not seem to help. And since __consumer_offsets contains a much greater committed value, why does an unclean shutdown set the offset back to 264458?
Apparently this was an issue in Kafka 0.11.0.0, and 0.11.0.1 fixes it.
More info [KAFKA-5600] - Group loading regression causing stale metadata/offsets cache

What command shows all of the topics and offsets of partitions in Kafka?

I'm looking for a Kafka command that shows all of the topics and the offsets of their partitions. If this could be retrieved dynamically, that would be perfect. Right now I'm using Java code to see this information, but it's very inconvenient.
Kafka ships with some tools you can use to accomplish this.
List topics:
# ./bin/kafka-topics.sh --list --zookeeper localhost:2181
test_topic_1
test_topic_2
...
List partitions and offsets:
# ./bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --broker-info --group test_group --topic test_topic --zookeeper localhost:2181
Group Topic Pid Offset logSize Lag Owner
test_group test_topic 0 698020 698021 1 test_group-0
test_group test_topic 1 235699 235699 0 test_group-1
test_group test_topic 2 117189 117189 0 test_group-2
Update for 0.9 (and higher) consumer APIs
If you're using the new APIs, there's a new tool you can use: kafka-consumer-groups.sh.
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group count_errors --describe
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER
count_errors logs 2 2908278 2908278 0 consumer-1_/10.8.0.55
count_errors logs 3 2907501 2907501 0 consumer-1_/10.8.0.43
count_errors logs 4 2907541 2907541 0 consumer-1_/10.8.0.177
count_errors logs 1 2907499 2907499 0 consumer-1_/10.8.0.115
count_errors logs 0 2907469 2907469 0 consumer-1_/10.8.0.126
You might want to try kt. It's also quite a bit faster than the bundled kafka-topics.
This is currently the most complete description you can get out of a topic with kt:
kt topic -brokers localhost:9092 -filter my_topic_name -partitions -leaders -replicas
It also outputs as JSON, so you can pipe it to jq for further flexibility.
If anyone is interested, you can get the offset information for all the consumer groups with the following command:
kafka-consumer-groups --bootstrap-server localhost:9092 --all-groups --describe
The parameter --all-groups is available from Kafka 2.4.0
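On tooling older than 2.4 there is no --all-groups flag, but the same result can be approximated with the AdminClient by listing the groups first and then fetching the offsets per group. A hedged sketch (broker address is a placeholder):

import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ConsumerGroupListing;

public class AllGroupOffsets {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            // Enumerate every consumer group known to the cluster, then dump its committed offsets
            for (ConsumerGroupListing g : admin.listConsumerGroups().all().get()) {
                System.out.println("group: " + g.groupId());
                admin.listConsumerGroupOffsets(g.groupId())
                     .partitionsToOffsetAndMetadata().get()
                     .forEach((tp, om) -> System.out.printf("  %s-%d -> %d%n",
                         tp.topic(), tp.partition(), om.offset()));
            }
        }
    }
}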
We're using Kafka 1.1.1 (the confluent-kafka-2.11 build) and make use of this tool - kafka-consumer-groups.
$ rpm -qf /bin/kafka-consumer-groups
confluent-kafka-2.11-1.1.1-1.noarch
For example:
$ kafka-consumer-groups --describe --group logstash | grep -E "TOPIC|filebeat"
Note: This will not show information about old Zookeeper-based consumers.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
beats_filebeat 0 20003914484 20003914888 404 logstash-0-XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX /192.168.1.1 logstash-0
beats_filebeat 1 19992522286 19992522709 423 logstash-0-XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX /192.168.1.1 logstash-0
beats_filebeat 2 19990597254 19990597637 383 logstash-0-XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX /192.168.1.1 logstash-0
beats_filebeat 7 19991718707 19991719268 561 logstash-0-YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY /192.168.1.2 logstash-0
beats_filebeat 8 20015611981 20015612509 528 logstash-0-YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY /192.168.1.2 logstash-0
beats_filebeat 5 19990536340 19990541331 4991 logstash-0-ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ /192.168.1.3 logstash-0
beats_filebeat 6 19990728038 19990733086 5048 logstash-0-ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ /192.168.1.3 logstash-0
beats_filebeat 3 19994613945 19994616297 2352 logstash-0-AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA /192.168.1.4 logstash-0
beats_filebeat 4 19990681602 19990684038 2436 logstash-0-AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA /192.168.1.4 logstash-0
Random Tip
NOTE: We use an alias that overloads kafka-consumer-groups like so in our /etc/profile.d/kafka.sh:
alias kafka-consumer-groups="KAFKA_JVM_PERFORMANCE_OPTS=\"-Djava.security.auth.login.config=$HOME/.kafka_client_jaas.conf\" kafka-consumer-groups --bootstrap-server ${KAFKA_HOSTS} --command-config /etc/kafka/security-enabler.properties"
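For completeness, the programmatic equivalent of the --command-config file and the JAAS option in that alias is to put the security settings directly into the client properties. A sketch with made-up SASL_SSL values; the actual mechanism and credentials depend on how your cluster is secured:

import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class SecuredAdminClient {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9093"); // placeholder
        // Roughly what the alias injects via the JAAS file and /etc/kafka/security-enabler.properties
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
            + "username=\"alice\" password=\"alice-secret\";"); // made-up credentials
        try (AdminClient admin = AdminClient.create(props)) {
            System.out.println(admin.listConsumerGroups().all().get());
        }
    }
}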