Kafka 0.10.1.0 change offset time - apache-kafka

I have an Elasticsearch pipeline set up with a Kafka cluster between 2 Logstash instances.
I need to reset the offset back 12 hours for a topic and start the consumer again.
bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list kfkserver:9092 --topic topicname --time 1488153601000
which returns topicname:0:3730858
1488153601000 <- 2017-02-27 00:00:01 UTC in epoch milliseconds
How can I set the offset time?
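The --time argument of GetOffsetShell is a Unix timestamp in milliseconds. A minimal sketch for deriving such a value, assuming GNU date (the -d option is not available in BSD/macOS date); the function names to_epoch_ms and ms_hours_before are hypothetical helpers, not part of Kafka:

```shell
# Hypothetical helpers for building the --time argument of GetOffsetShell.

# Convert a UTC wall-clock time to epoch milliseconds (assumes GNU date).
to_epoch_ms() {
  echo "$(( $(date -u -d "$1" +%s) * 1000 ))"
}

# Subtract a whole number of hours from an epoch-millisecond timestamp.
ms_hours_before() {
  echo "$(( $1 - $2 * 3600 * 1000 ))"
}

# The timestamp used in the question:
to_epoch_ms "2017-02-27 00:00:01"
# Twelve hours before noon of the same day gives the same value:
ms_hours_before "$(to_epoch_ms "2017-02-27 12:00:01")" 12
```

Both calls print 1488153601000, the value passed to --time above.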

If you're on 0.10.x and don't have the awesome offset management tool that was added in 0.11, there's a hack to use kafka-console-consumer.sh to change a consumer group's offset. This only works with the numeric offset though, not the timestamp.
First, stop whatever process is running that's using that consumer. Clean shutdown is best. Then, run a command that looks like this:
bin/kafka-console-consumer.sh --bootstrap-server :9092 \
--topic my-topic \
--partition 1 \
--consumer-property group.id=my-consumer-group \
--max-messages 0 \
--offset 12345
--max-messages 0 is important here; setting it to any other value, including 1, will consume that many messages and then commit the current latest offset in that topic/partition. This must be a bug in the console consumer.
Next, check your work with kafka-consumer-groups.sh:
./kafka-consumer-groups.sh --bootstrap-server :9092 \
--group my-consumer-group \
--describe
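Because this hack takes a numeric offset per partition, the topic:partition:offset lines printed by GetOffsetShell (as in the question) have to be turned into one console-consumer invocation per partition. A rough sketch of that step, assuming that output format; print_reset_commands is a hypothetical helper that only prints the commands for review and does not run them:

```shell
# Hypothetical helper: reads "topic:partition:offset" lines on stdin and
# prints (does not execute) one console-consumer command per partition.
# $1 = bootstrap server, $2 = consumer group id.
print_reset_commands() {
  awk -F: -v server="$1" -v group="$2" 'NF == 3 && $3 != "" {
    printf "bin/kafka-console-consumer.sh --bootstrap-server %s --topic %s --partition %s --consumer-property group.id=%s --max-messages 0 --offset %s\n", server, $1, $2, group, $3
  }'
}

# Example with the line returned in the question:
echo "topicname:0:3730858" | print_reset_commands kfkserver:9092 my-consumer-group
```

Printing instead of executing leaves room to check each offset before the group's committed position is overwritten.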

Related

How to decode messages in MM2 offset sync topic?

In reference to the offset sync topic covered in KIP-382, which maintains the cluster-to-cluster offset mapping: when consuming the messages from mm2-offset-syncs.target.internal, I found them to be serialized.
Is there a way to deserialize the output so that it's readable using the Kafka command-line consumer?
./kafka-console-consumer.sh --bootstrap-server localhost:xxxx --topic mm2-offset-syncs.dest.internal --from-beginning
Yes, you can use OffsetSyncFormatter to deserialize the content of your offset syncs topics. For example:
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic mm2-offset-syncs.target.internal \
--formatter org.apache.kafka.connect.mirror.formatters.OffsetSyncFormatter \
--from-beginning
For more details, see KIP-597: MirrorMaker2 internal topics Formatters.

Why does kafka-consumer-groups produce no output for listed consumer?

When I run the command:
kafka-consumer-groups --bootstrap-server localhost:9092 --list
It produces a list of consumer groups. Here's some lengthy output:
$ kafka-consumer-groups --bootstrap-server localhost:9092 --list
Note: This will not show information about old Zookeeper-based consumers.
simulate_birdfeeder.e67c034c-a96d-11e9-861a-983b8f0e47c8
xfer_server.e65d3732-a96d-11e9-992f-983b8f0e47c8
xfer_server.e654e596-a96d-11e9-992f-983b8f0e47c8
scheduler_node.e6ba65ce-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e695c336-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e681a4c8-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6b936f4-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e6956c2e-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6de0222-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e68ec02c-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6b48f82-a96d-11e9-b125-983b8f0e47c8
scheduler_node.e6b436cc-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e68fed3a-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e658faf0-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e64f28ea-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e691a710-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e6889f30-a96d-11e9-861a-983b8f0e47c8
74b6a2e3-efe6-4a62-ad6b-a4782038db43
simulate_birdfeeder.e67ec7bc-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e67b9178-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e65b4b2a-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6dcad1e-a96d-11e9-b125-983b8f0e47c8
water_connector
simulate_birdfeeder.e6924904-a96d-11e9-861a-983b8f0e47c8
reporter
xfer_server.e652ee8a-a96d-11e9-992f-983b8f0e47c8
scheduler_node.e68d3446-a96d-11e9-b125-983b8f0e47c8
simulate_flexcharger.e560bef8-a96d-11e9-a249-983b8f0e47c8
511bcf37-7e6b-4a33-9a57-9ad498f9a089
simulator.e5d05a10-a96d-11e9-9375-983b8f0e47c8
trajectory.queue
simulate_birdfeeder.e662497a-a96d-11e9-861a-983b8f0e47c8
scheduler
simulate_birdfeeder.e6874ca2-a96d-11e9-861a-983b8f0e47c8
85f138e2-bb87-4885-98b8-a6536595cf7c
scheduler_node.e69bcd9e-a96d-11e9-b125-983b8f0e47c8
scheduler_node.e69b0c42-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e6612c7a-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e691649e-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e665d02c-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e6554ab8-a96d-11e9-861a-983b8f0e47c8
xfer_server.e65b81ee-a96d-11e9-992f-983b8f0e47c8
server.e6159012-a96d-11e9-bbdf-983b8f0e47c8
simulate_birdfeeder.e666c9b4-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e67dffb2-a96d-11e9-861a-983b8f0e47c8
KMOffsetCache-my-computer
simulate_birdfeeder.e697c230-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e683e0bc-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e65fbd40-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e69706c4-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e69cc438-a96d-11e9-b125-983b8f0e47c8
xfer_server.e64a93de-a96d-11e9-992f-983b8f0e47c8
scheduler_node.e6a55e72-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e685be50-a96d-11e9-861a-983b8f0e47c8
product_monitor
simulate_birdfeeder.e688f89a-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e66b29fa-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e6811242-a96d-11e9-861a-983b8f0e47c8
scheduler_node.e6a62be0-a96d-11e9-b125-983b8f0e47c8
scheduler_node.e6b9fa58-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e68a2986-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e690480c-a96d-11e9-861a-983b8f0e47c8
server.e60fec02-a96d-11e9-bbdf-983b8f0e47c8
scheduler_node.e6a45856-a96d-11e9-b125-983b8f0e47c8
simulate_birdfeeder.e68385ae-a96d-11e9-861a-983b8f0e47c8
server
simulate_birdfeeder.e668c246-a96d-11e9-861a-983b8f0e47c8
simulate_birdfeeder.e65f11a6-a96d-11e9-861a-983b8f0e47c8
xfer_server.e64f2bb0-a96d-11e9-992f-983b8f0e47c8
scheduler_node.e6b8e834-a96d-11e9-b125-983b8f0e47c8
scheduler_node.e6ddb13c-a96d-11e9-b125-983b8f0e47c8
Some of these I can describe:
$ kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group server
Note: This will not show information about old Zookeeper-based consumers.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
reservationist.queue 0 456 456 0 rdkafka-fa9be154-c7ac-471f-a9d1-5bfd3f42da8c /127.0.0.1 rdkafka
However, for some consumer groups, if I describe them, there is no output at all. I'll get something like the following:
$ kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group scheduler_node.e6ddb13c-a96d-11e9-b125-983b8f0e47c8
Note: This will not show information about old Zookeeper-based consumers.
Aside from the warning, there is no output, but the consumer does exist, because it is listed in the former list command. Now, if I use the zookeeper option, the command complains about the group not existing.
Often the consumer groups that look like widget have a description with offsets, and the ones that look like widget.some-long-random-string have no output. Consumer groups that look like some-long-random-string also produce a description with offsets.
(I'm not 100% certain, but I'm pretty sure the missing consumer groups are subscribed to, otherwise my application would be bombing, which isn't the case.)
I am using confluent kafka locally. My code and confluent are all running in my dev box:
$ confluent version kafka
This CLI is intended for development only, not for production
https://docs.confluent.io/current/cli/index.html
1.1.1-cp1
Can anyone give me some insight as to what is going on? Thanks.
kafka-consumer-groups \
--bootstrap-server localhost:9092 \
--describe \
--group your_consumer_group_name
will return no output if the consumers in the consumer group have not committed any offsets.
If the output of
bin/kafka-consumer-groups.sh \
--bootstrap-server localhost:9092 \
--describe \
--group server
and
bin/kafka-consumer-groups.sh \
--bootstrap-server localhost:9092 \
--new-consumer \
--describe \
--group server
match and the consumer group appears in the list output for both, my guess is that the consumers inside the consumer group server have not committed any offsets, which is why the describe command shows no information.
Make sure that the consumers inside the server group commit their offsets successfully (either manually or automatically).
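To find out which of the listed groups are affected, the list output can be fed through describe one group at a time. A minimal sketch, assuming the "Note:" banner shown in the question is the only non-offset output; KAFKA_CG and groups_without_offsets are hypothetical names introduced for illustration:

```shell
# Assumed variable: name of the consumer-groups CLI (kafka-consumer-groups in
# Confluent packages, bin/kafka-consumer-groups.sh in Apache Kafka).
KAFKA_CG="${KAFKA_CG:-kafka-consumer-groups}"

# Hypothetical helper: reads group names on stdin and prints those whose
# --describe output contains nothing besides the "Note:" banner.
# $1 = bootstrap server.
groups_without_offsets() {
  while IFS= read -r group; do
    # Skip blank lines and the banner that --list prints as well.
    case "$group" in
      ''|Note:*) continue ;;
    esac
    # </dev/null keeps the CLI from swallowing the group list on stdin.
    rows=$("$KAFKA_CG" --bootstrap-server "$1" --describe --group "$group" \
             2>/dev/null </dev/null |
           awk '!/^Note:/ && NF { n++ } END { print n + 0 }')
    if [ "$rows" -eq 0 ]; then
      echo "$group"
    fi
  done
}
```

It would be used as
kafka-consumer-groups --bootstrap-server localhost:9092 --list | groups_without_offsets localhost:9092
Errors are discarded, so a group that fails to describe for any other reason is reported as well.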

How to get log end offset of all partitions for a given kafka topic using kafka command line?

When I describe a Kafka topic, it doesn't show the log end offset of any partition, but it shows all the other metadata such as ISR, Replicas, and Leader.
How do I see a log end offset of the partition for a given topic?
Ran this: ./kafka-topics.sh --zookeeper zk-service:2181 --describe --topic "__consumer_offsets"
The output doesn't have an offset column.
Note: I only need the log end offset.
Since you're only looking for the log end offset for a topic, you can use kafka-run-class with the kafka.tools.GetOffsetShell class.
Assuming your topic is __consumer_offsets, you would get the end offset by running:
./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --time -1 --topic __consumer_offsets
Change the --broker-list localhost:9092 to your desired Kafka address. This will list all of the log end offsets for each partition in the topic.
Install kafkacat; it's an easy-to-use Kafka tool:
sudo apt-get update
sudo apt-get install kafkacat
kafkacat -C -b <kafka-broker-ip-and-port> -t <topic> -o -1
This will not consume anything because the offset is incremented after a message is added. But it will give you the offsets for all the partitions. Note however that this isn't the current offset that you are consuming at... The above answers will help you more in terms of looking into partition lag.
Following is the command you would need to get the offset of all partitions for a given kafka topic for a given consumer group:
kafka-consumer-groups --bootstrap-server <kafka-broker-list-with-ports> --describe --group <consumer-group-name>
Please note that the <consumer-group-name> at the end is important as the offsets are committed by consumers that are typically a part of a consumer group.
The output of this command may look something like:
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
<topic-name> 0 62 62 0 <consumer-id> <host> <client>
In your post however, you're trying to get this information for the internal topic __consumer_offsets so you would need a consumer group which would have consumers consuming from this internal topic. You could perhaps do the following:
kafka-console-consumer --bootstrap-server <kafka-broker-list-with-ports> --topic __consumer_offsets --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter" --max-messages 5
Output of the above command:
[<consumer-group-name>,<topic-name>,0]::[OffsetMetadata[481690879,NO_METADATA],CommitTime 1479708539051,ExpirationTime 1480313339051]
Just use the <consumer-group-name> from the output and put it in the kafka-consumer-groups command mentioned in the beginning and you'll get the offset details for all the 50 partitions for the given consumer group only.
I hope this helps.
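When only a single number per group is needed, the LAG column of the describe output can be summed. A small sketch, assuming the header row shown earlier in this answer is present so the LAG column can be located by name; total_lag is a hypothetical helper and the second data row is a made-up sample:

```shell
# Hypothetical helper: sums the LAG column of `--describe` output read from
# stdin. The column is located via the header row; the "Note:" banner and
# rows whose lag is not numeric (for example "-") are ignored.
total_lag() {
  awk '
    lag == 0 { for (i = 1; i <= NF; i++) if ($i == "LAG") lag = i; next }
    $lag ~ /^[0-9]+$/ { sum += $lag }
    END { printf "%.0f\n", sum }
  '
}

# Example with illustrative rows in the format shown above:
total_lag <<'EOF'
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
reservationist.queue 0 456 456 0 rdkafka-1 /127.0.0.1 rdkafka
reservationist.queue 1 40 62 22 rdkafka-1 /127.0.0.1 rdkafka
EOF
```

The example prints 22. Looking the column up by name rather than by position keeps the helper working if a release adds columns in front of TOPIC.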

How to fetch recent messages from Kafka topic

Is there an option for fetching the most recent 10, 20, etc. messages from a Kafka topic? I can see the --from-beginning option to fetch all messages from the topic, but what if I want to fetch only a few messages: the first, last, middle, or latest 10? Are there options for that?
First N messages
You can use --max-messages N in order to fetch the first N messages of a topic.
For example, to get the first 10 messages, run
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning --max-messages 10
Next N messages
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --max-messages 10
Last N messages
To get the last N messages, you need to define a specific partition and the offset:
bin/kafka-simple-consumer-shell.sh --broker-list localhost:9092 --topic test --partition testPartition --offset yourOffset
M to N messages
Again, for this case you'd have to define both the partition and the offset.
For example, you can run the following in order to get N messages starting from an offset of your choice:
bin/kafka-simple-consumer-shell.sh --broker-list localhost:9092 --topic test --partition testPartition --offset yourOffset --max-messages 10
If you don't want to stick to the binaries, I would suggest using kt, a Kafka command-line tool with more options and functionality.
For more details, refer to the article How to fetch specific messages in Apache Kafka.
Without specifying an offset and a partition, you'll only be able to consume the next N or the first N messages. To consume from the "middle" of the unbounded stream, you need to give the offset.
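For the "last N" case, the offset to pass is the partition's log end offset minus N, clamped to the earliest offset still in the log. A minimal sketch of that arithmetic; last_n_start_offset is a hypothetical helper, the end offset 3730858 is only an example value, and on compacted or transactional topics fewer than N records may sit in that range:

```shell
# Hypothetical helper: start offset for reading the last N messages of one
# partition. $1 = log end offset, $2 = N, $3 = earliest offset (default 0).
last_n_start_offset() {
  end="$1"; n="$2"; earliest="${3:-0}"
  start=$(( end - n ))
  # Never start before the earliest offset still retained in the log.
  if [ "$start" -lt "$earliest" ]; then
    start="$earliest"
  fi
  echo "$start"
}

# Example: partition 0 ends at offset 3730858 and the last 10 messages are wanted.
start=$(last_n_start_offset 3730858 10)
echo "bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --partition 0 --offset $start --max-messages 10"
```

The log end offset itself can be read per partition with GetOffsetShell and --time -1.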
Besides the console consumer, there's kafkacat.
First twenty:
kafkacat -C -b <kafka-broker-ip-and-port> -t topic -o earliest -c 20
And the last twenty (from partition zero):
kafkacat -C -b <kafka-broker-ip-and-port> -t topic -p 0 -o -20

Kafka: What is Current Offset or Record Count of Topic?

How do I get the current offset, or offset by partition, or record count for a given topic? It doesn't need to be perfect, but I want a ballpark idea of how much data is in a Kafka topic.
In order to get the offset for the partitions of a topic you can use kafka.tools.GetOffsetShell
./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic your_topic_name --time -1
If you want to get the latest offset for a particular group, you can also use:
./bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic your_topic_name --zookeeper localhost:2181 --group your_group_id
To count the entries within a topic, you can consume the whole topic (when you stop the consumer, the total number of consumed messages is reported). Alternatively, you can sum the latest offsets of all partitions, which matches the record count only if every partition still starts at offset 0:
./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list <broker>:<port> --topic <topic-name> --time -1 --offsets 1 | awk -F ":" '{sum += $3} END {print sum}'
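Once retention has deleted old segments, partitions no longer start at offset 0, so a closer ballpark is the sum of (latest offset - earliest offset) per partition. A sketch under that assumption, using --time -2 for the earliest offsets; count_records and the file names are hypothetical, and the result is still only an estimate on compacted or transactional topics:

```shell
# Hypothetical helper: estimates the number of records in a topic as the sum
# over partitions of (latest offset - earliest offset).
# $1 = file holding `--time -2` (earliest) output
# $2 = file holding `--time -1` (latest) output
# Both files use the "topic:partition:offset" format printed by GetOffsetShell.
count_records() {
  awk -F: '
    FILENAME == ARGV[1] { earliest[$1 ":" $2] = $3; next }  # first file
    { total += $3 - earliest[$1 ":" $2] }                   # second file
    END { printf "%.0f\n", total }
  ' "$1" "$2"
}

# Usage (broker address and topic name are placeholders):
#   ./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic your_topic_name --time -2 > earliest.txt
#   ./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic your_topic_name --time -1 > latest.txt
#   count_records earliest.txt latest.txt
```

A partition missing from the earliest file is treated as starting at offset 0, which reduces to the plain sum above.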