delete specific messages from kafka topic __consumer_offsets - apache-kafka

I want to delete all messages in the __consumer_offsets topic whose key starts with a given prefix (that is, reset one particular consumer group without affecting the rest).
Is there a way to do this?

Kafka comes with a ConsumerGroupCommand tool. You can find some information in the Kafka documentation.
If you plan to reset a particular Consumer Group ("myConsumerGroup") without affecting the rest you can use
> bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --reset-offsets --group myConsumerGroup --topic topic1 --to-latest
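Depending on your requirements, you can reset the offsets for individual partitions of the topic with that tool; the built-in help and the documentation explain the options. Note that the reset is only applied when --execute is passed, and only while the consumer group is inactive.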
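To preview a reset before applying it, the command line can be assembled programmatically. The following is a minimal sketch, not part of the original answer: the helper name reset_offsets_command is hypothetical, and it assumes a Kafka version whose kafka-consumer-groups.sh supports --dry-run and --execute.

```python
import shlex


def reset_offsets_command(group, topic, target="--to-latest", execute=False,
                          bootstrap_server="localhost:9092"):
    """Build the argument list for a kafka-consumer-groups.sh offset reset.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    args = [
        "bin/kafka-consumer-groups.sh",
        "--bootstrap-server", bootstrap_server,
        "--reset-offsets",
        "--group", group,
        "--topic", topic,
        target,
    ]
    # --dry-run only reports the planned offsets; --execute applies them.
    args.append("--execute" if execute else "--dry-run")
    return args


# Print the preview command for the group from the answer above.
print(shlex.join(reset_offsets_command("myConsumerGroup", "topic1")))
```

Pass the resulting list to subprocess.run once the dry-run output looks right, switching to execute=True for the real reset.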

Related

Kafka List all partition with no leader

In my Kafka cluster there are more than 2k topics and each topic has 5 partitions. I want to list only those partitions that have no leader.
I can check each topic using the syntax below:
kafka-topics.sh --describe --topic <topic_name> --zookeeper <zookeeper_ip>:port
But the problem is that there are 2k+ topics, so this can't be done manually. I could also write a script that loops over each topic and finds the partitions with no leader, but I am interested in a more efficient way to get this information.
Using kafka-topics.sh you can specify the --unavailable-partitions flag to only list partitions that currently don't have a leader and hence cannot be used by Consumers or Producers.
For example:
kafka-topics.sh --describe --unavailable-partitions --zookeeper <zookeeper_ip>:port
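If the flag is not available in your version, the scripted route mentioned in the question can be done in one pass over the full --describe output (omitting --topic describes all topics). This is a rough sketch under assumptions: the sample text is illustrative rather than captured from a real cluster, and it assumes a missing leader is printed as either -1 or none, which may differ between Kafka versions.

```python
import re

# Matches per-partition lines such as "Topic: orders  Partition: 1  Leader: -1".
# The topic summary line ("PartitionCount:...") does not match this pattern.
PARTITION_LINE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(\S+)"
)

# Assumption: how a missing leader is rendered depends on the Kafka version.
NO_LEADER_MARKERS = ("-1", "none")


def leaderless_partitions(describe_output):
    """Return (topic, partition) pairs whose leader field marks no leader."""
    result = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match and match.group(3) in NO_LEADER_MARKERS:
            result.append((match.group(1), int(match.group(2))))
    return result


# Illustrative sample, not real cluster output.
sample = (
    "Topic:orders\tPartitionCount:2\tReplicationFactor:2\tConfigs:\n"
    "\tTopic: orders\tPartition: 0\tLeader: 1\tReplicas: 1,2\tIsr: 1,2\n"
    "\tTopic: orders\tPartition: 1\tLeader: -1\tReplicas: 3,4\tIsr: \n"
)
print(leaderless_partitions(sample))  # [('orders', 1)]
```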

How to find the consumer topic and the group id from the __consumer_offsets topic in kafka?

I am trying to parse the records in the __consumer_offsets topic in Kafka. The idea is to find the group id, topic and consumer that are creating load in my Kafka cluster.
The command I am executing is below
bin/kafka-console-consumer.sh --topic __consumer_offsets --bootstrap-server brokers --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter" --new-consumer --consumer.config consumer.conf
Now the output looks like this
[dedupeconsumergroup,daas.dedupe.avrosyslog.incoming,4]::[OffsetMetadata[8646,NO_METADATA],CommitTime 1538115746766,ExpirationTime 1538202146766]
[dedupeconsumergroup,daas.dedupe.avrosyslog.incoming,6]::[OffsetMetadata[8639,NO_METADATA],CommitTime 1538115746766,ExpirationTime 1538202146766]
Can someone help me understand this output or point me to the documentation? Thanks in advance.
[dedupeconsumergroup,daas.dedupe.avrosyslog.incoming,6]::[OffsetMetadata[8639,NO_METADATA],CommitTime 1538115746766,ExpirationTime 1538202146766]
It means that the consumer group named dedupeconsumergroup read up to and committed offset 8639 on partition 6 of topic daas.dedupe.avrosyslog.incoming. CommitTime and ExpirationTime are epoch timestamps in milliseconds; in this record they are 86400000 ms (24 hours) apart.
Which particular consumer read a particular offset is probably a little more complicated to find, as you would have to know which partition was assigned to which consumer at a given point in time; the assignment can change over time due to rebalancing.
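To see which group and topic commit most often, the formatter output can be parsed line by line. The sketch below assumes the exact line layout shown in the question (other Kafka versions may print a different format); the function names are made up for illustration.

```python
import re
from collections import Counter

# Layout assumed from the sample output in the question:
# [group,topic,partition]::[OffsetMetadata[offset,metadata],CommitTime ms,ExpirationTime ms]
OFFSET_LINE = re.compile(
    r"^\[(?P<group>[^,]+),(?P<topic>[^,]+),(?P<partition>\d+)\]::"
    r"\[OffsetMetadata\[(?P<offset>\d+),(?P<metadata>[^\]]*)\],"
    r"CommitTime (?P<commit_ms>\d+),ExpirationTime (?P<expire_ms>\d+)\]$"
)


def parse_offset_line(line):
    """Parse one formatter line into a dict, or return None if it does not match."""
    match = OFFSET_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    for field in ("partition", "offset", "commit_ms", "expire_ms"):
        record[field] = int(record[field])
    return record


def commits_per_group_topic(lines):
    """Count offset commits per (group, topic) pair."""
    counts = Counter()
    for line in lines:
        record = parse_offset_line(line)
        if record is not None:
            counts[(record["group"], record["topic"])] += 1
    return counts


lines = [
    "[dedupeconsumergroup,daas.dedupe.avrosyslog.incoming,4]::[OffsetMetadata[8646,NO_METADATA],CommitTime 1538115746766,ExpirationTime 1538202146766]",
    "[dedupeconsumergroup,daas.dedupe.avrosyslog.incoming,6]::[OffsetMetadata[8639,NO_METADATA],CommitTime 1538115746766,ExpirationTime 1538202146766]",
]
for (group, topic), count in commits_per_group_topic(lines).most_common():
    print(group, topic, count)
```

Piping the console consumer output into such a script for a few minutes gives a rough ranking of the groups that commit most frequently.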

How to check the lag of a consumer in Kafka which is assigned with a particular partition of a topic?

I want to check the lag for a consumer group that was manually assigned to a particular topic. Is this possible? I am using Kafka 0.10.0.1. I ran sh kafka-run-class.sh kafka.admin.ConsumerGroupCommand --new-consumer --describe --bootstrap-server localhost:9092 --group test, but it says that no group exists, so I wonder whether the lag can be checked at all when partitions are assigned manually.
In Nussknacker we use AKHQ, a GUI tool that provides monitoring of consumers and consumer group lag, as well as general Kafka operations such as topic, topic data and schema registry management.
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group <group_name>
If you want API support or visual lag monitoring, you can use https://github.com/yahoo/kafka-manager
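Whichever tool reports it, lag per partition is the log end offset minus the group's committed offset. A minimal sketch of that arithmetic, assuming you already have both sets of offsets as dictionaries keyed by partition number (the function name and the numbers are made up):

```python
def partition_lag(end_offsets, committed_offsets):
    """Return lag per partition: log end offset minus committed offset.

    Partitions without a committed offset map to None, because no lag
    can be computed for them.
    """
    lag = {}
    for partition, end_offset in end_offsets.items():
        committed = committed_offsets.get(partition)
        if committed is None:
            lag[partition] = None
        else:
            # Clamp at zero in case the two snapshots were taken at different times.
            lag[partition] = max(end_offset - committed, 0)
    return lag


# Made-up numbers: partition 1 has no committed offset yet.
print(partition_lag({0: 120, 1: 95}, {0: 100}))  # {0: 20, 1: None}
```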

Kafka 0.10.2 new consumer vs old consumer

I've spent some hours trying to figure out what is going on but haven't managed to find a solution.
Here is my set up on a single machine:
1 ZooKeeper instance running
3 brokers running (on ports 9092/9093/9094)
1 topic with 3 partitions and a replication factor of 3 (the partitions are properly assigned across the brokers)
I'm using the Kafka console producer to insert messages. If I check the replication offsets (cat replication-offset-checkpoint), I see that my messages are properly ingested by Kafka.
Now I use the kafka console consumer (new):
sudo bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic testTopicPartitionned2
I don't see anything consumed. I tried deleting my logs folders (/tmp/kafka-logs-[1,2,3]) and creating new topics, but still nothing.
However when I use the old kafka consumer:
sudo bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic testTopicPartitionned2
I can see my messages.
Am I missing something big here to make this new consumer work?
Thanks in advance.
Check which setting the consumer is using for the auto.offset.reset property.
It determines where a consumer group without a previously committed offset starts reading messages from a partition.
Check the Kafka docs for more on this.
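The decision that auto.offset.reset controls can be modelled in a few lines. This is a simplified illustration, not the client's actual code: the function name is hypothetical, and the real consumer also applies the policy when a committed offset is out of range, which is not modelled here.

```python
def starting_offset(committed, reset_policy, earliest, latest):
    """Return the offset a consumer would start from on one partition.

    committed: the group's committed offset, or None if there is none.
    reset_policy: value of auto.offset.reset ("earliest", "latest" or "none").
    earliest / latest: first available offset and log end offset.
    """
    if committed is not None:
        # A committed offset always wins; the policy is not consulted.
        return committed
    if reset_policy == "earliest":
        return earliest
    if reset_policy == "latest":
        return latest
    if reset_policy == "none":
        raise LookupError("no committed offset and auto.offset.reset=none")
    raise ValueError("unknown auto.offset.reset value: %r" % reset_policy)


# A new group with "latest" starts at the log end and sees only new messages.
print(starting_offset(None, "latest", earliest=0, latest=42))    # 42
print(starting_offset(None, "earliest", earliest=0, latest=42))  # 0
print(starting_offset(17, "latest", earliest=0, latest=42))      # 17
```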
Try providing all your brokers to the --bootstrap-server argument to see if you notice any difference:
sudo bin/kafka-console-consumer.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --from-beginning --topic testTopicPartitionned2
Also, your topic name is rather long. I assume you've already made sure you provide the correct topic name.

Consume and produce message in particular Kafka partition?

For reading all partitions in topic:
~bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic myTopic --from-beginning
How can I consume a particular partition of the topic (for instance, the one for partition key 13)?
And how can I produce a message to the partition for a particular partition key? Is it possible?
You can't with the console consumer and producer, but you can with higher-level clients (in any language that works for you).
You may use for example assign method to manually assign a specific topic-partition to consume (https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L906)
You may use a custom Partitioner to override the partitioning logic where you will decide manually how to partition your messages (https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java#L206-L208)
With the many clients that are available you can specify the partition number just like serejja has stated.
Also look into https://github.com/cakesolutions/scala-kafka-client which uses actors and provides multiple modes for manual partitions and offsets.
If you want to do the same on the terminal, I suggest using kafkacat. (https://github.com/edenhill/kafkacat)
My personal choice during development.
You can do things like
kafkacat -b localhost:9092 -f 'Topic %t[%p], offset::: %o, data: %s key: %k\n' -t testtopic
And for a specific partition, you just need to use the -p flag.
Console producer and consumer do not provide this flexibility. You could achieve this through Kafka APIs.
You could manually assign a partition to a consumer using the assign() method of KafkaConsumer. Manual assignment does not use the consumer group's automatic rebalancing, so please use it very carefully.
You could specify the partition in the record you send through KafkaProducer. If none is specified, the record is placed according to the Partitioner policy.
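The rule described above (an explicit partition wins, otherwise the key decides) can be illustrated with a toy model. Important assumption: zlib.crc32 is only a stand-in hash used for illustration; it is not the hash Kafka's default partitioner uses, so the partition numbers it yields will not match a real cluster. The function name is hypothetical.

```python
import zlib


def choose_partition(key, num_partitions, explicit_partition=None):
    """Toy model of partition selection for a keyed record.

    An explicitly requested partition is used as-is; otherwise the key is
    hashed (with a stand-in hash, NOT Kafka's own) and mapped to a partition.
    """
    if explicit_partition is not None:
        if not 0 <= explicit_partition < num_partitions:
            raise ValueError("partition out of range")
        return explicit_partition
    # Same key always maps to the same partition while num_partitions is fixed.
    return zlib.crc32(key) % num_partitions


print(choose_partition(b"13", 20, explicit_partition=13))  # 13
print(choose_partition(b"13", 20) == choose_partition(b"13", 20))  # True
```

The point of the model is that a partition key and a partition number are different things: a key of 13 only lands on partition 13 if you request that partition explicitly or supply a partitioner that maps it there.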
How can I consume particular partition of the topic? (for instance with partition key 13)
There is a flag called --partition in kafka-console-consumer
--partition <Integer: partition>    The partition to consume from. Consumption starts from the end of the partition unless '--offset' is specified.
The command is as follows:
bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic test --partition 0 --from-beginning