How to get the log end offset of all partitions for a given Kafka topic using the Kafka command line?

When I describe a Kafka topic, the output shows metadata such as the leader, replicas, and ISR of each partition, but not the log end offset.
How do I see the log end offset of each partition for a given topic?
I ran this: ./kafka-topics.sh --zookeeper zk-service:2181 --describe --topic "__consumer_offsets"
The output doesn't have an offset column.
Note: I need only the log end offset.

Since you're only looking for the log end offset for a topic, you can use kafka-run-class with the kafka.tools.GetOffsetShell class.
Assuming your topic is __consumer_offsets, you would get the end offset by running:
./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --time -1 --topic __consumer_offsets
Change the --broker-list localhost:9092 to your desired Kafka address. This will list all of the log end offsets for each partition in the topic.
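GetOffsetShell prints one line per partition in the form topic:partition:offset. As a minimal sketch, assuming that output format, the lines can be turned into a more readable listing with awk. The function name format_end_offsets is hypothetical, and the sample lines below only stand in for real tool output.

```shell
# Hypothetical helper: reads GetOffsetShell-style lines (topic:partition:offset)
# on stdin and prints one readable line per partition.
format_end_offsets() {
  awk -F ':' 'NF == 3 { printf "%s partition %s: log end offset %s\n", $1, $2, $3 }'
}

# Sample input standing in for real output. With a broker available you would
# pipe the GetOffsetShell command into the function instead.
printf '%s\n' '__consumer_offsets:0:0' '__consumer_offsets:1:42' | format_end_offsets
```

Lines that do not have exactly three colon-separated fields (warnings, blank lines) are skipped rather than reported.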

Install kafkacat; it's an easy-to-use Kafka tool:
sudo apt-get update
sudo apt-get install kafkacat
kafkacat -C -b <kafka-broker-ip-and-port> -t <topic> -o -1
This will not consume anything because the offset is incremented after a message is added. But it will give you the offsets for all the partitions. Note however that this isn't the current offset that you are consuming at... The above answers will help you more in terms of looking into partition lag.

Following is the command you would need to get the offset of all partitions for a given kafka topic for a given consumer group:
kafka-consumer-groups --bootstrap-server <kafka-broker-list-with-ports> --describe --group <consumer-group-name>
Please note that the <consumer-group-name> at the end is important as the offsets are committed by consumers that are typically a part of a consumer group.
The output of this command may look something like:
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
<topic-name> 0 62 62 0 <consumer-id> <host> <client>
In your post however, you're trying to get this information for the internal topic __consumer_offsets so you would need a consumer group which would have consumers consuming from this internal topic. You could perhaps do the following:
kafka-console-consumer --bootstrap-server <kafka-broker-list-with-ports> --topic __consumer_offsets --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter" --max-messages 5
Output of the above command:
[<consumer-group-name>,<topic-name>,0]::[OffsetMetadata[481690879,NO_METADATA],CommitTime 1479708539051,ExpirationTime 1480313339051]
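If the topic holds commits from many groups, the group names can be pulled out of this output mechanically. This is only a sketch: it assumes every line starts with [group,topic,partition] as in the sample above and that group names contain no commas. The function name list_groups_from_offsets_log is hypothetical.

```shell
# Hypothetical helper: reads OffsetsMessageFormatter-style lines on stdin,
# e.g. [group,topic,partition]::[OffsetMetadata[...],...], and prints each
# distinct consumer group name once, sorted.
list_groups_from_offsets_log() {
  sed -n 's/^\[\([^,]*\),.*$/\1/p' | sort -u
}

# Sample input standing in for real console-consumer output.
printf '%s\n' \
  '[group-b,orders,0]::[OffsetMetadata[10,NO_METADATA],CommitTime 1,ExpirationTime 2]' \
  '[group-a,orders,0]::[OffsetMetadata[7,NO_METADATA],CommitTime 1,ExpirationTime 2]' \
  '[group-b,orders,1]::[OffsetMetadata[3,NO_METADATA],CommitTime 1,ExpirationTime 2]' \
  | list_groups_from_offsets_log
```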
Just use the <consumer-group-name> from the output and put it in the kafka-consumer-groups command mentioned in the beginning and you'll get the offset details for all the 50 partitions for the given consumer group only.
I hope this helps.
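Since the question asks only for the log end offset, the --describe output shown earlier in this answer can be trimmed down to that column. The following is a rough sketch, assuming the whitespace-separated header and rows shown there; the function name end_offsets_from_describe is hypothetical. Column positions are read from the header row because some Kafka versions print an extra leading GROUP column.

```shell
# Hypothetical helper: reads `kafka-consumer-groups --describe` output on stdin
# and prints "topic partition log-end-offset lag" for each partition row.
end_offsets_from_describe() {
  awk '
    /LOG-END-OFFSET/ {                        # header row: remember column indexes
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    ("TOPIC" in col) && NF >= col["LAG"] {    # data rows that follow the header
      print $(col["TOPIC"]), $(col["PARTITION"]), $(col["LOG-END-OFFSET"]), $(col["LAG"])
    }
  '
}

# Sample input standing in for real output.
printf '%s\n' \
  'TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID' \
  'orders 0 62 62 0 consumer-1 /127.0.0.1 client-1' \
  | end_offsets_from_describe
```

Anything printed before the header, such as the note about Zookeeper-based consumers, is ignored.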

Related

Command to check whether a Kafka offset was reset

Recently I had the name of my topic changed, and it seems that my consumer then read all the messages from the topic, ignoring the offset. Does anyone know a command I can use to check whether my offset has been reset?
Thanks
Marcus
In Kafka version 2+, you can describe your consumer group to see the offset:
kafka-consumer-groups --describe --group <consumer group name> --bootstrap-server <kafka broker IP>:9092
To change the offset to the latest:
kafka-consumer-groups --bootstrap-server <kafka broker IP>:9092 --group <consumer group name> --topic <Topic name> --reset-offsets --to-latest --execute
Depending on how your Kafka consumer is configured, a consumer group can read messages from the beginning, from a specific offset, or from the latest offset.
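One way to check for a reset after the fact is to save the --describe output at two points in time and look for partitions whose committed offset went backwards. The sketch below assumes the whitespace-separated --describe format with a header row; the function name offsets_moved_backwards and the file names are hypothetical. It only catches resets to an earlier position, not a jump forward such as --to-latest.

```shell
# Hypothetical helper: compares two saved `--describe` outputs and prints
# "topic partition old-offset new-offset" for every partition whose committed
# offset decreased between the first and the second file.
offsets_moved_backwards() {
  awk '
    /CURRENT-OFFSET/ {                                 # header row in either file
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    !("TOPIC" in col) { next }                         # notes before the first header
    $(col["CURRENT-OFFSET"]) !~ /^[0-9]+$/ { next }    # blanks, notes, non-numeric offsets
    {
      key = $(col["TOPIC"]) " " $(col["PARTITION"])
      cur = $(col["CURRENT-OFFSET"])
    }
    FILENAME == ARGV[1] { before[key] = cur; next }    # first file: remember old offsets
    (key in before) && cur + 0 < before[key] + 0 {
      print key, before[key], cur
    }
  ' "$1" "$2"
}

# Sample snapshots standing in for real output saved at two points in time.
tmp="$(mktemp -d)"
header='TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID'
printf '%s\n' "$header" 'orders 0 62 62 0 c1 /127.0.0.1 client' > "$tmp/before.txt"
printf '%s\n' "$header" 'orders 0 0 62 62 c1 /127.0.0.1 client' > "$tmp/after.txt"
offsets_moved_backwards "$tmp/before.txt" "$tmp/after.txt"   # prints: orders 0 62 0
rm -r "$tmp"
```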

Why does kafka-consumer-groups produce no output for listed consumer?

When I run the command:
kafka-consumer-groups --bootstrap-server localhost:9092 --list
It produces a list of consumer groups. Here's some lengthy output:
$ kafka-consumer-groups --bootstrap-server localhost:9092 --list
Note: This will not show information about old Zookeeper-based consumers.
simulate_birdfeeder.e67c034c-a96d-11e9-861a-983b8f0e47c8
xfer_server.e65d3732-a96d-11e9-992f-983b8f0e47c8
xfer_server.e654e596-a96d-11e9-992f-983b8f0e47c8
scheduler_node.e6ba65ce-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e695c336-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e681a4c8-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6b936f4-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e6956c2e-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6de0222-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e68ec02c-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6b48f82-a96d-11e9-b125-983b8f0e47c8
scheduler_node.e6b436cc-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e68fed3a-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e658faf0-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e64f28ea-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e691a710-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e6889f30-a96d-11e9-861a-983b8f0e47c8
74b6a2e3-efe6-4a62-ad6b-a4782038db43
simulate_birdfeeder.e67ec7bc-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e67b9178-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e65b4b2a-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6dcad1e-a96d-11e9-b125-983b8f0e47c8
water_connector
simulate_birdfeeder.e6924904-a96d-11e9-861a-983b8f0e47c8
reporter
xfer_server.e652ee8a-a96d-11e9-992f-983b8f0e47c8
scheduler_node.e68d3446-a96d-11e9-b125-983b8f0e47c8
simulate_flexcharger.e560bef8-a96d-11e9-a249-983b8f0e47c8
511bcf37-7e6b-4a33-9a57-9ad498f9a089
simulator.e5d05a10-a96d-11e9-9375-983b8f0e47c8
trajectory.queue
simulate_birdfeeder.e662497a-a96d-11e9-861a-983b8f0e47c8
scheduler
simulate_birdfeeder.e6874ca2-a96d-11e9-861a-983b8f0e47c8
85f138e2-bb87-4885-98b8-a6536595cf7c
scheduler_node.e69bcd9e-a96d-11e9-b125-983b8f0e47c8
scheduler_node.e69b0c42-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e6612c7a-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e691649e-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e665d02c-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e6554ab8-a96d-11e9-861a-983b8f0e47c8
xfer_server.e65b81ee-a96d-11e9-992f-983b8f0e47c8
server.e6159012-a96d-11e9-bbdf-983b8f0e47c8
simulate_birdfeeder.e666c9b4-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e67dffb2-a96d-11e9-861a-983b8f0e47c8
KMOffsetCache-my-computer
simulate_birdfeeder.e697c230-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e683e0bc-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e65fbd40-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e69706c4-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e69cc438-a96d-11e9-b125-983b8f0e47c8
xfer_server.e64a93de-a96d-11e9-992f-983b8f0e47c8
scheduler_node.e6a55e72-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e685be50-a96d-11e9-861a-983b8f0e47c8
product_monitor
simulate_birdfeeder.e688f89a-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e66b29fa-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e6811242-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6a62be0-a96d-11e9-b125-983b8f0e47c8
scheduler_node.e6b9fa58-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e68a2986-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e690480c-a96d-11e9-861a-983b8f0e47c8
server.e60fec02-a96d-11e9-bbdf-983b8f0e47c8
scheduler_node.e6a45856-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e68385ae-a96d-11e9-861a-983b8f0e47c8
server
simulate_birdfeeder.e668c246-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e65f11a6-a96d-11e9-861a-983b8f0e47c8
xfer_server.e64f2bb0-a96d-11e9-992f-983b8f0e47c8
scheduler_node.e6b8e834-a96d-11e9-b125-983b8f0e47c8
scheduler_node.e6ddb13c-a96d-11e9-b125-983b8f0e47c8
I can describe some of them:
$ kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group server
Note: This will not show information about old Zookeeper-based consumers.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
reservationist.queue 0 456 456 0 rdkafka-fa9be154-c7ac-471f-a9d1-5bfd3f42da8c /127.0.0.1 rdkafka
However, for some consumer groups, if I describe them, there is no output at all. I'll get something like the following:
$ kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group scheduler_node.e6ddb13c-a96d-11e9-b125-983b8f0e47c8
Note: This will not show information about old Zookeeper-based consumers.
Aside from the warning, there is no output, but the consumer does exist, because it appears in the output of the list command above. If I use the zookeeper option instead, the command complains that the group does not exist.
Often the consumers that look like widget have a description with offsets, and the ones that look like widget.some-long-random-string have no output. Consumers that look like some-long-random-string also produce a description with offsets.
(I'm not 100% certain, but I'm pretty sure the missing consumer groups are subscribed; otherwise my application would be bombing, which isn't the case.)
I am using confluent kafka locally. My code and confluent are all running in my dev box:
$ confluent version kafka
This CLI is intended for development only, not for production
https://docs.confluent.io/current/cli/index.html
1.1.1-cp1
Can anyone give me some insight as to what is going on? Thanks.
kafka-consumer-groups \
--bootstrap-server localhost:9092 \
--describe \
--group your_consumer_group_name
will return no output if the consumers in the consumer group have not committed any offsets.
If the output of
bin/kafka-consumer-groups.sh \
--bootstrap-server localhost:9092 \
--describe \
--group server
and
bin/kafka-consumer-groups.sh \
--bootstrap-server localhost:9092 \
--new-consumer \
--describe \
--group server
matches and the consumer group appears in both consumer groups list output, my guess is that your consumers inside the consumer group server have not committed any offsets and this is why no information appears using the describe command.
Make sure that the consumers inside the server group commit their offsets successfully (either manually or automatically).
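To check many groups at once, the same test can be scripted. This is a rough sketch, assuming the --describe format shown in the question; the function name has_committed_offsets is hypothetical. Rows whose CURRENT-OFFSET is not a number are treated as "nothing committed", which is an assumption of this sketch rather than documented tool behaviour.

```shell
# Hypothetical helper: succeeds (exit status 0) if `--describe` output on stdin
# contains at least one partition row with a numeric CURRENT-OFFSET.
has_committed_offsets() {
  awk '
    /CURRENT-OFFSET/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
    ("CURRENT-OFFSET" in col) && $(col["CURRENT-OFFSET"]) ~ /^[0-9]+$/ { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Sample input standing in for a group whose describe output is only the note.
if printf '%s\n' 'Note: This will not show information about old Zookeeper-based consumers.' \
    | has_committed_offsets; then
  echo "group has committed offsets"
else
  echo "no committed offsets found"
fi
```

With a broker available, the describe command for each name from the list output would be piped into the function in place of the sample text.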

How to fetch recent messages from Kafka topic

Is there an option for fetching the most recent 10, 20, etc. messages from a Kafka topic? I can see the --from-beginning option to fetch all messages from the topic, but what if I want to fetch only a few messages: the first, last, middle, or latest 10? Are there options for that?
First N messages
You can use --max-messages N in order to fetch the first N messages of a topic.
For example, to get the first 10 messages, run
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning --max-messages 10
Next N messages
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --max-messages 10
Last N messages
To get the last N messages, you need to define a specific partition and the offset:
bin/kafka-simple-consumer-shell.sh --broker-list localhost:9092 --topic test --partition testPartition --offset yourOffset
M to N messages
Again, for this case you'd have to define both the partition and the offset.
For example, you can run the following in order to get N messages starting from an offset of your choice:
bin/kafka-simple-consumer-shell.sh --broker-list localhost:9092 --topic test --partition testPartition --offset yourOffset --max-messages 10
If you don't want to stick to the binaries, I would suggest using kt, a Kafka command line tool with more options and functionality.
For more details refer to the article How to fetch specific messages in Apache Kafka
Without specifying an offset and partition, you'll only be able to consume next N or first N. To consume in the "middle" of the unbounded stream, you need to give the offset
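For the "last N" case, the starting offset has to be worked out from the partition's log end offset. A minimal sketch of that arithmetic follows; the function name start_offset_for_last_n is hypothetical, and the earliest and end offsets are assumed to have been looked up separately (for example with kafka.tools.GetOffsetShell using --time -2 and --time -1).

```shell
# Hypothetical helper: prints the offset to start from in order to read the
# last N messages of one partition, given that partition's earliest offset and
# log end offset. The result never goes below the earliest available offset.
start_offset_for_last_n() {
  earliest=$1
  end=$2
  n=$3
  start=$((end - n))
  if [ "$start" -lt "$earliest" ]; then
    start=$earliest
  fi
  echo "$start"
}

start_offset_for_last_n 0 1000 10   # prints 990
```

Offsets are not always contiguous (compacted topics and transaction markers leave gaps), so starting at end minus N can yield fewer than N messages.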
Other than console consumer, there's kafkacat
First twenty
kafkacat -C -b <broker> -t topic -o earliest -c 20
And the previous twenty (from partition zero)
kafkacat -C -b <broker> -t topic -p 0 -o -20

Kafka: What is Current Offset or Record Count of Topic?

How do I get the current offset, or offset by partition, or record count for a given topic? It doesn't need to be perfect, but I want a ballpark idea of how much data is in a Kafka topic.
In order to get the offset for the partitions of a topic you can use kafka.tools.GetOffsetShell
./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic your_topic_name --time -1
If you want to get the latest offset for a particular group, you can also use:
./bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic your_topic_name --zookeeper localhost:2181 --group your_group_id
In order to count the entries within a topic, you can consume the whole topic (when you stop the consumer, the total number of consumed messages will be reported). Alternatively, you can use
./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list <broker>:<port> --topic <topic-name> --time -1 --offsets 1 | awk -F ":" '{sum += $3} END {print sum}'
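Summing the latest offsets gives the number of records ever written, which overstates what is currently in the topic once retention has deleted old segments. A closer ballpark is latest minus earliest per partition. The sketch below assumes two files of GetOffsetShell output, one captured with --time -2 (earliest) and one with --time -1 (latest); the function name approx_record_count and the file names are hypothetical.

```shell
# Hypothetical helper: takes two files of GetOffsetShell-style lines
# (topic:partition:offset), earliest offsets first and latest offsets second,
# and prints the sum over all partitions of (latest - earliest).
approx_record_count() {
  awk -F ':' '
    NF != 3 { next }                                       # skip anything unexpected
    FILENAME == ARGV[1] { earliest[$1 ":" $2] = $3; next } # first file: earliest offsets
    { total += $3 - earliest[$1 ":" $2] }                  # second file: latest offsets
    END { print total + 0 }
  ' "$1" "$2"
}

# Sample files standing in for real output.
tmp="$(mktemp -d)"
printf '%s\n' 'your_topic_name:0:100' 'your_topic_name:1:0' > "$tmp/earliest.txt"
printf '%s\n' 'your_topic_name:0:250' 'your_topic_name:1:50' > "$tmp/latest.txt"
approx_record_count "$tmp/earliest.txt" "$tmp/latest.txt"   # prints 200
rm -r "$tmp"
```

This is still an estimate: on compacted topics, offsets within the range may no longer correspond to stored records.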

Before consumers for a new topic are attached, I create the new topic and produce a message in Apache Kafka

Before consumers for the new topic are attached, I create the new topic and produce a first message in Apache Kafka.
Then consumers for the new topic are attached, but the first message is not consumed.
Why?
In this case, the state is already log-end offset=1, committed offset=1, lag=0.
Doesn't "committed offset=1" mean it has already been consumed?
My question is why it has already been consumed.
Let me know if I've got anything wrong.
This is my test case.
# create new topic
$ kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic NEW_TOPIC_NAME
# produce a first message
$ kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic NEW_TOPIC_NAME
> send a first message
# then execute consumer
$ kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic NEW_TOPIC_NAME
> # no consume a first message
But after consumers for the new topic are attached, I produce a second message and it is consumed normally.
By default, the kafka-console-consumer starts from the end of the topic.
If you want to consume messages produced earlier, you can set --from-beginning, for example:
kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic NEW_TOPIC_NAME --from-beginning