How to re-consume messages from the beginning? - scala

I am using auto.offset.reset=earliest in my code and I commit offsets to Kafka with the code below:
val offsetRanges=rdd.asInstanceOf[HasOffsetRanges].offsetRanges
inputStream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
Now when I run my program it does not receive any messages, because all existing messages have already been committed.
I am testing this code in QA, so I want to reset the offsets to the beginning, but earliest does not seem to take effect: nothing is read, and no new messages are arriving in the topic. I want to re-read the messages from the beginning for testing purposes.
Can someone confirm whether earliest is ignored once offsets have been committed, and how to fetch messages from the beginning in that case?

The property auto.offset.reset is used only if there is no committed offset for the partition. You can reset the offsets for the whole group using the kafka-consumer-groups tool (shipped with Kafka):
kafka-consumer-groups --bootstrap-server <kafkahost:port> --group <group_id> --topic <topic_name> --reset-offsets --to-earliest --execute
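For the Spark Streaming case in the question, the same rule means you can simply switch to a group.id that has never committed anything, so auto.offset.reset=earliest takes effect again. A minimal sketch of the direct stream setup, assuming the spark-streaming-kafka-0-10 integration, an existing StreamingContext named ssc, and placeholder broker, group and topic names:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

// A fresh, previously unused group.id has no committed offsets,
// so auto.offset.reset=earliest decides where consumption starts.
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "kafkahost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "qa-replay-group",
  "auto.offset.reset" -> "earliest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val inputStream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent, Subscribe[String, String](Seq("topic_name"), kafkaParams))

Alternatively, keep the existing group id and run the kafka-consumer-groups reset above while the streaming job is stopped.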

Related

Kafka - command to fetch the current offset committed by this consumer

I am going through the documentation and, apart from the committed method, didn't find a way to fetch the current committed offset per partition for this consumer / consumer group.
Is there a simple way or command to find out the committed offset of a consumer?
You can use the Kafka tool as described in the documentation under "Checking consumer position":
> bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group
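If you would rather fetch it from code, the committed method mentioned in the question can be called without subscribing or polling. A rough Scala sketch, assuming a 2.4+ client (for the committed(Set) overload) and placeholder broker, group and topic names:

import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

object CommittedOffsetCheck extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", "my-group") // the group whose committed offset we want
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

  val consumer = new KafkaConsumer[String, String](props)
  val tp = new TopicPartition("my-topic", 0)

  // committed() asks the group coordinator for the last committed offset of
  // this partition for group.id; no subscribe or poll is needed
  val committed = consumer.committed(Collections.singleton(tp)).get(tp)
  println(if (committed == null) "no committed offset" else s"committed offset: ${committed.offset}")
  consumer.close()
}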

How to separate a consumer from a consumer group without losing the offset?

I have a Logstash Kafka consumer group subscribed to nearly 20 topics, and it isn't keeping up with one particular high-priority topic. I've decided to remove that topic from the consumer group and launch a separate consumer group for high-priority topics, but unfortunately by doing that I lose the offset that was stored for the old consumer group.
Is there any way I can start the new Logstash consumer group from the offset the old consumer group had reached?
Thanks
You can use the Kafka scripts to set the offset for the new group (a programmatic alternative is sketched after the steps below).
Sample scenario:
Stop the application.
Check the current offset for the group with the following command. The output will contain the current offset, log end offset, lag, etc. for each topic:
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group groupId --describe
Set the offset for the new group id that will be used by the new application (suppose the offset of the topic you want to switch is 10001):
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group newGroupId --to-offset 10001 --topic topicName --reset-offsets --execute
Remove topicName from the topics list of the old application.
Set up the new Logstash configuration with the newGroupId group id.
Start the old and new Logstash applications.
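The offset copy in step 3 can also be done programmatically with the AdminClient instead of the CLI. A hedged sketch, assuming Scala 2.13 collection converters, a Kafka 2.5+ client and cluster (for alterConsumerGroupOffsets), and the placeholder group and topic names from the steps above:

import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.admin.AdminClient

object CopyGroupOffsets extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  val admin = AdminClient.create(props)

  // Read the committed offsets of the old group
  val oldOffsets = admin.listConsumerGroupOffsets("groupId")
    .partitionsToOffsetAndMetadata().get()

  // Keep only the partitions of the topic being moved, then commit them
  // under the new group id (alterConsumerGroupOffsets requires Kafka 2.5+)
  val toCopy = oldOffsets.asScala.filter { case (tp, _) => tp.topic == "topicName" }
  admin.alterConsumerGroupOffsets("newGroupId", toCopy.asJava).all().get()

  admin.close()
}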

Kafka Consumer offset fetch

I am using a Kafka version where the offset storage is Kafka itself, i.e. the __consumer_offsets topic.
How do I retrieve the consumer offset when the consumer is down or inactive?
The command below works, but with the flaw that it reads from the beginning rather than only the latest entries:
kafka-console-consumer --consumer.config /tmp/consumer.config --formatter "kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter" --zookeeper <> --topic __consumer_offsets --from-beginning
If your consumer is down and you want the last committed offset of the topic for a specific group, please refer to the sources below:
https://github.com/yahoo/kafka-manager
KafkaConsumerOffsets
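If you prefer code over a UI, the AdminClient can read a group's committed offsets straight from the cluster, whether or not any consumer in the group is currently running. A rough Scala sketch with placeholder broker and group names:

import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.admin.AdminClient

object ShowCommittedOffsets extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  val admin = AdminClient.create(props)

  // Committed offsets live in __consumer_offsets and can be queried
  // for any group without an active consumer
  val offsets = admin.listConsumerGroupOffsets("my-group")
    .partitionsToOffsetAndMetadata().get()

  offsets.asScala.foreach { case (tp, om) =>
    println(s"${tp.topic}-${tp.partition}: ${om.offset}")
  }
  admin.close()
}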

Kafka 0.10.2 new consumer vs old consumer

I've spent some hours trying to figure out what is going on but haven't managed to find a solution.
Here is my setup on a single machine:
1 ZooKeeper instance running
3 brokers running (on ports 9092/9093/9094)
1 topic with 3 partitions and replication factor 3 (partitions properly assigned across the brokers)
I'm using the Kafka console producer to insert messages. If I check the replication offset (cat replication-offset-checkpoint), I can see that my messages are properly ingested by Kafka.
Now I use the Kafka console consumer (new):
sudo bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic testTopicPartitionned2
I don't see anything being consumed. I tried deleting my log folders (/tmp/kafka-logs-[1,2,3]) and creating new topics, but still nothing.
However, when I use the old Kafka consumer:
sudo bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic testTopicPartitionned2
I can see my messages.
Am I missing something big here to make the new consumer work?
Thanks in advance.
Check what setting the consumer is using for the auto.offset.reset property.
This determines where a consumer group without a previously committed offset will start reading messages from a partition.
Check the Kafka docs for more on this.
Try providing all your brokers in the --bootstrap-server argument to see if you notice any difference:
sudo bin/kafka-console-consumer.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --from-beginning --topic testTopicPartitionned2
Also, your topic name is rather long; I assume you've already made sure you're providing the correct topic name.
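A quick way to confirm from code that the messages really are in the topic, independent of any consumer group, is to compare beginning and end offsets per partition. A rough Scala sketch, using the broker list and topic name from the question:

import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

object CheckTopicOffsets extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094")
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

  val consumer = new KafkaConsumer[String, String](props)
  val partitions = consumer.partitionsFor("testTopicPartitionned2").asScala
    .map(p => new TopicPartition(p.topic, p.partition)).asJava

  // end offset minus beginning offset = number of messages retained per partition
  val begin = consumer.beginningOffsets(partitions)
  val end = consumer.endOffsets(partitions)
  partitions.asScala.foreach { tp =>
    println(s"$tp: ${end.get(tp) - begin.get(tp)} messages")
  }
  consumer.close()
}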

How to get all the messages in a topic from the Kafka server

I would like to get all the messages in a topic from the server, starting from the beginning.
Ex:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic testTopic --from-beginning
The console command above lets me get all the messages in a topic from the beginning, but I couldn't consume all the messages in a topic from the beginning using Java code.
You can get all messages using the following command:
cd Users/kv/kafka/bin
./kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic topicName --from-beginning --max-messages 100
The easiest way would be to start a consumer and drain all the messages. I don't know how many partitions you have in your topic or whether you already have an existing consumer group, but you have a few options:
Have a look at this API: https://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
1) If you already have a consumer in the same consumer group and still want to start consuming from the beginning, use the seek option listed in the API doc and set the offset to 0 for each consumer in the group. This will start consumption from the beginning.
2) Otherwise, you can start a few consumers in a new consumer group & you would not have to worry about seek.
PS: Please remember to provide more details about your setup in the future if you have more questions on Kafka. A lot of things depend on how you have configured your infrastructure & how you would prefer it to be and would thus vary from case to case.
// Assign the partition explicitly and rewind to the earliest available offset
TopicPartition topicPartition = new TopicPartition(topic, 0);
List<TopicPartition> partitions = Arrays.asList(topicPartition);
consumer.assign(partitions);
consumer.seekToBeginning(partitions);
Just change the consumer group (ConsumerConfig.GROUP_ID_CONFIG) to a new group id and set ConsumerConfig.AUTO_OFFSET_RESET_CONFIG to earliest.
Sample code:
props.put(ConsumerConfig.GROUP_ID_CONFIG, "newID");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
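Putting those two settings together, a minimal Scala sketch of a consumer that drains a topic from the beginning; the group id, topic and broker are placeholders, and poll(Duration) assumes a 2.0+ client:

import java.time.Duration
import java.util.{Collections, Properties}
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.{ConsumerConfig, KafkaConsumer}

object ReadFromBeginning extends App {
  val props = new Properties()
  props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
  props.put(ConsumerConfig.GROUP_ID_CONFIG, "newID")             // fresh group => no committed offsets
  props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest") // start at the earliest available offset
  props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
    "org.apache.kafka.common.serialization.StringDeserializer")
  props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
    "org.apache.kafka.common.serialization.StringDeserializer")

  val consumer = new KafkaConsumer[String, String](props)
  consumer.subscribe(Collections.singletonList("testTopic"))

  // Poll until no more records arrive, printing everything from the beginning
  var records = consumer.poll(Duration.ofSeconds(5))
  while (!records.isEmpty) {
    records.asScala.foreach(r => println(s"${r.partition}/${r.offset}: ${r.value}"))
    records = consumer.poll(Duration.ofSeconds(5))
  }
  consumer.close()
}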