Removing a Kafka consumer group in ZooKeeper

I'm using kafka_2.9.2-0.8.1.1 with zookeeper 3.4.6.
Is there a utility that can automatically remove a consumer group from zookeeper? Or can I just remove everything under /consumers/[group_id] in zookeeper? If the latter, is there anything else I'm missing & can this be done with a live system?
Update:
As of kafka version 2.3.0, there is a new utility:
> bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --delete --group my-group
Related doc: http://kafka.apache.org/documentation/#basic_ops_consumer_lag
See below for more discussion

As of v0.9.0, Kafka ships with a suite of tools in its bin directory, one of which is kafka-consumer-groups.sh. It can delete a consumer group:
./kafka-consumer-groups.sh --zookeeper <zookeeper_url> --delete --group <group-name>

For new consumers (which use a kafka topic to manage offsets instead of zookeeper) you cannot delete the group information using kafka's built in tools.
Here is an example of trying to delete the group information for a new style consumer using the kafka-consumer-groups.sh script:
bin/kafka-consumer-groups.sh --bootstrap-server "kafka:9092" --delete --group "indexer" --topic "cleaned-logs"
Option '[delete]' is only valid with '[zookeeper]'. Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
Here's the important part of that response:
Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
This is kind of annoying from a monitoring perspective (esp. when tracking offsets via something like Burrow) because it means that if you change consumer group names in your code, you'll keep seeing the old groups reported as behind on their offsets until those offsets expire.
Hypothetically you could write a tombstone to the offsets topic manually (which is what happens during offset expiration), but I haven't found any tools that make this easy.

You can delete a group from Kafka with the CLI:
kafka-consumer-groups --bootstrap-server localhost:9092 --delete --group group_name

Currently, as far as I know, the only way to remove a Kafka consumer group is to manually delete the Zookeeper path /consumers/[group_id].
If you just want to delete a consumer group, there is nothing to worry about when manually deleting the Zookeeper path, but if you are doing it to rewind offsets, the following will be helpful.
First of all, you should stop all the consumers belonging to the consumer group before removing the Zookeeper path. If you don't, those consumers will not consume newly produced messages and will soon close their connections to the Zookeeper cluster.
When you restart the consumers, if you want them to start from the beginning, set the auto.offset.reset property to smallest (or earliest in new Kafka releases). The default value of the property is largest (or latest in new Kafka releases), which makes your restarted consumers read from after the largest offset, so they consume only newly produced messages. For more information about the property, refer to Consumer Config in the Kafka documentation.
FYI, there is a question "How can I rewind the offset in the consumer?" in the Kafka FAQ, but it was not much help to me.
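To make the naming difference concrete, here is a minimal Python sketch (not from the answer above) that picks the auto.offset.reset value for either consumer generation. The function names and the new_consumer flag are hypothetical, introduced only for illustration. Keep in mind that the property only takes effect when the group has no committed offset, which is why the group's stored offsets have to be removed first.

```python
def offset_reset_value(start_from_beginning, new_consumer):
    """Return the auto.offset.reset value for the given consumer generation.

    Old (ZooKeeper-based) consumers use smallest/largest,
    newer consumers use earliest/latest.
    """
    if new_consumer:
        return "earliest" if start_from_beginning else "latest"
    return "smallest" if start_from_beginning else "largest"


def consumer_properties(group_id, start_from_beginning, new_consumer=True):
    """Build the two consumer properties discussed above as a plain dict."""
    return {
        "group.id": group_id,
        "auto.offset.reset": offset_reset_value(start_from_beginning, new_consumer),
    }
```

For example, consumer_properties("my-group", True, new_consumer=False) yields a config with auto.offset.reset set to smallest, which is the value an old-style consumer needs in order to rewind.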

Related

MM2.0 consumer group behavior

I'm trying to run some tests to understand MM2 behavior. As part of that I had the following questions:
1. How do I correctly pass a custom consumer group for MM2 in mm2.properties?
Based on a related question, I tried passing <alias>.group.id=temp_cons_group in mm2.properties, and after restarting the MM2 instance I could see the consumer group mentioned in the MM2 logs.
However, when I list the consumer groups registered in the source broker, the group doesn't show up.
2. How do I test whether the property <alias>.consumer.auto.offset.reset works?
Here, I want to consume the same messages again, so following the related question I set <source_alias>.consumer.auto.offset.reset to earliest and restarted MM2.
I was able to see the property set correctly in the MM2 logs, but did not get the messages from the beginning in the target cluster topic.
3. How do I start an MM2 instance so that it consumes messages from a specific offset for a topic present in the source cluster?
1. MirrorMaker does not use a consumer group to run and instead uses the assign() API, so it's expected that you don't see a group.
2. It's hard to "test". One way to verify this configuration was picked up is to check it's present in the logs when MirrorMaker starts its consumers.
3. This is currently not trivial to do. There's a KIP in progress to improve the process but at the moment it requires manually updating the internal offset topic from your Connect instance. At a very high level, here's the process:
First, ensure MirrorMaker is not running. Then you need to find the offset records for MirrorMaker in the offsets topic using a command like:
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic <CONNECT_OFFSET_TOPIC> \
--from-beginning \
--property print.key=true | grep <SOURCE_CONNECTOR_NAME>
You will see records with offsets for each partition MirrorMaker handles. To update the offsets, you need to produce new records to this topic with the offsets you want. For each partition, ensure your record has the same key as the existing message so it replaces the existing stored offsets.
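The "same key, new value" step can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes each line printed by the console consumer is the key and the value separated by a tab (the tool's default separator when print.key=true), and that the value is a JSON object carrying the position in a field named "offset". The connector name and key layout in the usage note are made-up examples, so check them against what your own offsets topic actually contains. The function name is hypothetical.

```python
import json


def rewrite_offset_record(console_line, new_offset, separator="\t"):
    """Return (key, new_value) for one line of console-consumer output.

    The key is passed through untouched so the new record replaces the
    stored one. ASSUMPTION: the value is JSON with an "offset" field.
    """
    if not isinstance(new_offset, int) or new_offset < 0:
        raise ValueError("new_offset must be a non-negative integer")
    key, value = console_line.rstrip("\n").split(separator, 1)
    payload = json.loads(value)
    payload["offset"] = new_offset
    # Compact separators keep the value in the same style as the input.
    return key, json.dumps(payload, separators=(",", ":"))
```

Given a line such as `["mm2-source",{"cluster":"src","partition":0,"topic":"t"}]` followed by a tab and `{"offset":41}`, calling rewrite_offset_record(line, 100) returns the unchanged key together with the value `{"offset":100}`. The resulting pair would then have to be produced back to the offsets topic, for instance with a keyed producer, while MirrorMaker is still stopped.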

Kafka remove unused consumer groups

I have an application that uses Apache Kafka and creates a new consumer group on every startup. It takes a fixed string and appends a generated UUID to build the group id (e.g. my_consumer_group_123324234234, my_consumer_group_123324234235 ...). When I shut down the app, the old consumer groups stay around unused, and Kafka doesn't remove them until offsets.retention.minutes has elapsed.
I wonder if it is possible to remove unused consumer groups (filtered by a name pattern like 'my_consumer_group_*') with a script?
Yes, it should be possible using the kafka-consumer-groups.sh script included with Kafka.
You could create a script that periodically lists the existing consumer groups
kafka-consumer-groups.sh --bootstrap-server <kafka-servers-addrs> --list
Then describes each one of them
kafka-consumer-groups.sh --bootstrap-server <kafka-servers-addrs> --describe --group <consumer-group>
One option to detect if they are unused is to parse the output to see if it returns:
Consumer group '<consumer-group>' has no active members.
Note that relying on this message could be a bit brittle, since its wording could change across Kafka versions, so I'd look for a more robust approach (e.g. the status code that the script returns, if any, or initializing your own consumer).
And then deletes the ones that are unused:
kafka-consumer-groups.sh --bootstrap-server <kafka-servers-addrs> --delete --group <consumer-group1> --group <consumer-group2>
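The list, describe, delete loop above could be scripted roughly like this in Python. It is a sketch, not a tested cleanup tool: it assumes kafka-consumer-groups.sh is on the PATH, it uses the brittle "has no active members" message match mentioned above, and all function names are hypothetical. The three pure helpers can be checked without a broker; cleanup() is the only part that actually talks to Kafka.

```python
import subprocess

TOOL = "kafka-consumer-groups.sh"  # ASSUMPTION: the script is on the PATH


def matching_groups(list_output, prefix):
    """Pick group names starting with prefix out of the --list output."""
    names = (line.strip() for line in list_output.splitlines())
    return [name for name in names if name.startswith(prefix)]


def looks_unused(describe_output):
    """Brittle check: relies on the wording of the --describe message."""
    return "has no active members" in describe_output


def delete_command(bootstrap, groups):
    """Build one --delete invocation with a --group flag per group."""
    command = [TOOL, "--bootstrap-server", bootstrap, "--delete"]
    for group in groups:
        command += ["--group", group]
    return command


def run(command):
    """Run the tool and return stdout and stderr as one string."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout + result.stderr


def cleanup(bootstrap, prefix):
    """List groups, keep the unused ones with the prefix, delete them."""
    base = [TOOL, "--bootstrap-server", bootstrap]
    candidates = matching_groups(run(base + ["--list"]), prefix)
    unused = [g for g in candidates
              if looks_unused(run(base + ["--describe", "--group", g]))]
    if unused:
        run(delete_command(bootstrap, unused))
    return unused
```

A periodic job would call something like cleanup("localhost:9092", "my_consumer_group_"). Reading stderr together with stdout is deliberate, since the tool may print the "no active members" notice on either stream depending on the version.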

How to delete Kafka "ghost" consumers?

When I use Kafka Tool: https://www.kafkatool.com/ I see additional consumer groups that I do not see with kafka-consumer-groups.sh
I'm assuming that the additional consumer groups are coming from Zookeeper while kafka-consumer-groups.sh only shows what it sees on the brokers.
Is there a way to delete these "ghost" groups? They are not used. Can I manually browse Zookeeper and delete those nodes?
IIRC, Kafka Tool lists groups from Zookeeper rather than through the AdminClient protocol (--bootstrap-server) that kafka-consumer-groups uses.
Also, kafka-console-consumer creates random groups that get hidden in the kafka-consumer-groups output.
While you could remove them from Zookeeper, all inactive consumer groups will automatically go away, as per the offsets retention policies over time, and having them there doesn't cause any performance penalties.
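If you do want to see which groups exist only on the Zookeeper side, the comparison is a plain set difference. The Python sketch below assumes you already have the two lists of names (for example, the children of /consumers in Zookeeper and the output of kafka-consumer-groups.sh --list); how you obtain them is left out, and the function names are hypothetical.

```python
def ghost_groups(zookeeper_groups, broker_groups):
    """Return groups known to Zookeeper but not reported by the brokers."""
    return sorted(set(zookeeper_groups) - set(broker_groups))


def zookeeper_path(group_id):
    """Zookeeper node that holds an old-style consumer group."""
    return "/consumers/" + group_id
```

For instance, ghost_groups(["a", "b", "c"], ["b"]) gives ["a", "c"], and zookeeper_path("a") gives the node you would inspect before deciding whether to remove anything.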

Kafka topics not created empty

I have a Kafka cluster consisting of 3 servers, all connected through Zookeeper. But when I delete a topic that has some information and create the topic again with the same name, the offset does not start from zero.
I tried restarting both Kafka and Zookeeper and deleting the topics directly from Zookeeper.
What I expect is to have a clean topic when I create it again.
I found the problem. A consumer was consuming from the topic and the topic was never actually deleted. I used this tool to have a GUI that allowed me to see the topics easily: https://github.com/tchiotludo/kafkahq. Anyway, the consumers can be seen by running this:
bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092

Delete unused kafka consumer group

I'm using Apache Kafka 0.10 with a compacted topic as a distributed cache synch mechanism. When the application starts up it generates an instance specific consumer group id. As instances are added and removed for horizontal scalability, obviously we get a large number of group ids that should never be used again.
I'm sure that this is the perfect use case for KStreams and KTables, but I am trying to do this myself, both for intellectual reasons and because KStreams and KTables are described as alpha quality in 0.10.
Is there a Kafka API call that I can use that could delete an existing consumer group, knowing that it should never be used again?
Since Zookeeper does not maintain consumer offsets in version 0.10, is there a way to delete the consumer group using Kafka?
It's possible with the CLI in Kafka:
./bin/kafka-consumer-groups \
--bootstrap-server <bootstrap_server(s)> \
--topic <topic_name> \
--delete \
--group <consumer_group_name>
kafka-consumer-groups should be available in the bin directory of the Kafka installation.
Since Kafka 0.9, an internal topic is used to store committed offsets. You can configure how long those offsets should be kept via offsets.retention.minutes. (See also offsets.retention.check.interval.ms).
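As a rough way to reason about when an abandoned group disappears on its own, the two settings can be combined into an upper bound. The Python sketch below is only arithmetic under assumptions: as far as I know, the moment the retention clock starts (last commit versus the group becoming empty) depends on the broker version, so reference_time stands for whichever applies to your cluster. The function name is hypothetical and the numbers in the usage note are illustrative, not defaults.

```python
from datetime import datetime, timedelta


def latest_expected_removal(reference_time, retention_minutes, check_interval_ms):
    """Upper bound for when expired offsets should have been cleaned up.

    Offsets become eligible after retention_minutes, and the broker only
    looks for expired offsets once every check_interval_ms.
    """
    eligible_at = reference_time + timedelta(minutes=retention_minutes)
    return eligible_at + timedelta(milliseconds=check_interval_ms)
```

With reference_time = datetime(2024, 1, 1), retention_minutes = 1440 and check_interval_ms = 600000, the function returns datetime(2024, 1, 2, 0, 10), i.e. one day plus one check interval later.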