How to delete Kafka "ghost" consumers? - apache-kafka

When I use Kafka Tool: https://www.kafkatool.com/ I see additional consumer groups that I do not see with kafka-consumer-groups.sh
I'm assuming that the additional consumer groups are coming from Zookeeper while kafka-consumer-groups.sh only shows what it sees on the brokers.
Is there a way to delete these "ghost" groups? They are not used? Can I manually browse zookeeper and go delete those nodes?

IIRC, KafkaTool uses Zookeeper, not --bootstrap-server AdminClient protocol to list groups that kafka-consumer-groups does...
Also, kafka-console-consumer creates random groups that get hidden in the kafka-consumer-groups output.
While you could remove them from Zookeeper, all inactive consumer groups will automatically go away, as per the offsets retention policies over time, and having them there doesn't cause any performance penalties.

Related

MM2.0 consumer group behavior

I'm trying to run some tests to understand MM2 behavior. As part of that I had the following questions:
How to correctly pass a custom consumer group for MM2 in mm2.properties?
Based on this question, tried passing <alias>.group.id=temp_cons_group in mm2.properties and on restarting the MM2 instance could see the consumer group mentioned in the MM2 logs.
However, when I try listing consumer groups registered in the source broker, the group doesn't show up?
How to test if the property <alias>.consumer.auto.offset.reset works?
Here, I want to consume the same messages again so in reference to the question, tried setting <source_alias>.consumer.auto.offset.reset to earliest and restarted MM2.
I was able to see the property set correctly in MM2 logs but did not get the messages from the beginning in the target cluster topic.
How do I start a MM2 instance to start consuming messages from a specific offset for a topic present in the source cluster?
MirrorMaker does not use a consumer group to run and instead uses the assign() API, so it's expected that you don't see a group.
It's hard to "test". One way to verify this configuration was picked up is to check it's present in the logs when MirrorMaker starts its consumers.
This is currently not trivial to do. There's a KIP in progress to improve the process but at the moment it requires manually updating the internal offset topic from your Connect instance. At a very high level, here's the process:
First, ensure MirrorMaker is not running. Then you need to find the offset records for MirrorMaker in the offsets topic using a command like:
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic <CONNECT_OFFSET_TOPIC \
--from-beginning \
--property print.key=true | grep <SOURCE_CONNECTOR_NAME>
You will see records with offsets for each partition MirrorMaker handles. To update the offsets, you need to produce new records to this topic with the offsets you want. For each partition, ensure your record has the same key as the existing message so it replaces the existing stored offsets.

Kafka consumer group description does not include all topics [duplicate]

What I want to achieve is to be sure that my Kafka streams consumer does not have lag.
I have simple Kafka streams application that materialized one topic as store in form of GlobalKTable.
When I try to describe consumer on Kafka by command:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group my-application-id
I can't see any results. And there is no error either. When I list all consumers by:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --all-groups
my application consumer is listed correctly.
Any idea where to find additional information what is happening that I can't describe consumer?
(Any other Kafka streams consumers that write to topics can be described correctly.)
If your application does only materialize a topic into a GlobalKTable no consumer group is formed. Internally, the "global consumer" does not use subscribe() but assign() and there is no consumer group.id configured (as you can verify from the logs) and no offset are committed.
The reason is, that all application instances need to consume all topic partitions (ie, broadcast pattern). However, a consumer group is designed such that different instances read different partitions for the same topic. Also, per consumer group, only one offset can be committed per partition -- however, if multiple instance read the same partition and would commit offsets using the same group.id the commits would overwrite each other.
Hence, using a consumer group while "broadcasting" data does not work.
However, all consumers should expose a "lag" metrics records-lag-max and records-lag (cf https://kafka.apache.org/documentation/#consumer_fetch_monitoring). Hence, you should be able to hook in via JMX to monitor the lag. Kafka Streams includes client metrics via KafkaStreams#metrics(), too.

How to retrieve Kafka Consumer Configs

I have several consumers that connect to Kafka Cluster that I do not have control over. At the same time, I would like to have visibility into how those consumers are configured.
Is there an API to list all consumers (if there is one for publishers, it is an added benefit) and then read all their configs?
I am talking about these consumer settings:
https://docs.confluent.io/current/installation/configuration/consumer-configs.html#cp-config-consumer
This is not possible as most of those settings are configured at the consumer only and are not pushed to the brokers or any topic.
It's possible however to get a high-level description for a given consumer group:
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group consumer-group

Kafka topics not created empty

I have a Kafka cluster consisting on 3 servers all connected through Zookeeper. But when I delete a topic that has some information and create the topic again with the same name, the offset does not start from zero.
I tried restarting both Kafka and Zookeeper and deleting the topics directly from Zookeeper.
What I expect is to have a clean topic When I create it again.
I found the problem. A consumer was consuming from the topic and the topic was never actually deleted. I used this tool to have a GUI that allowed me to see the topics easily https://github.com/tchiotludo/kafkahq. Anyway, the consumers can be seen running this:
bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092

removing a kafka consumer group in zookeeper

I'm using kafka_2.9.2-0.8.1.1 with zookeeper 3.4.6.
Is there a utility that can automatically remove a consumer group from zookeeper? Or can I just remove everything under /consumers/[group_id] in zookeeper? If the latter, is there anything else I'm missing & can this be done with a live system?
Update:
As of kafka version 2.3.0, there is a new utility:
> bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --delete --group my-group
Related doc: http://kafka.apache.org/documentation/#basic_ops_consumer_lag
See below for more discussion
As of v0.9.0, Kafka ships with a suite of tools in the /bin one of which is the kafka-consumer-groups.sh tool. This will delete a consumer group. ./kafka-consumer-groups.sh --zookeeper <zookeeper_url> --delete --group <group-name>
For new consumers (which use a kafka topic to manage offsets instead of zookeeper) you cannot delete the group information using kafka's built in tools.
Here is an example of trying to delete the group information for a new style consumer using the kafka-consumer-groups.sh script:
bin/kafka-consumer-groups.sh --bootstrap-server "kafka:9092" --delete --group "indexer" --topic "cleaned-logs"
Option '[delete]' is only valid with '[zookeeper]'. Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
Here's the important part of that response:
Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
This is kind of annoying from a monitoring perspective (esp. when tracking offsets via something like burrow) because it means that if you change consumer group names in your code you'll keep seeing that old groups are behind on their offsets until those offsets expire.
Hypothetically you could write a tombstone to that topic manually (which is what happens during offset expiration) but I haven't found any tools that make this easy.
you can delete group from kafka by CLI
kafka-consumer-groups --bootstrap-server localhost:9092 --delete --group group_name
Currently, as I know, the only way to remove a Kafka consumer group is manually deleting Zookeeper path /consumers/[group_id].
If you just want to delete a consumer group, there is nothing to worry about manually deleting the Zookeeper path, but if you do it for rewinding offsets, the below will be helpful.
First of all, you should stop all the consumers belongs to the consumer group before removing the Zookeeper path. If you don't, those consumers will not consume newly produced messages and will soon close connections to the Zookeeper cluster.
When you restart the consumers, if you want the consumers to start off from the beginning, give auto.offset.reset property to smallest (or earliest in new Kafka releases). The default value of the property is largest (or latest in new Kafka releases) which makes your restarting consumers read after the largest offset which in turn consuming only newly produced messages. For more information about the property, refer to Consumer Config in the Kafka documentation.
FYI, there is a question How can I rewind the offset in the consumer? in Kafka FAQ, but it gave me not much help.