I have an application that uses Apache Kafka and creates a new consumer group on every startup. It takes a fixed string and appends a generated UUID to form the group id (e.g. my_consumer_group_123324234234, my_consumer_group_123324234235, ...). When I shut down the app, the old consumer groups stay around unused, because Kafka doesn't remove them until offsets.retention.minutes has elapsed.
I wonder if it is possible to remove unused consumer groups (filtered by a name pattern like 'my_consumer_group_*') with a script.
Yes, it should be possible using the kafka-consumer-groups.sh script included with Kafka.
You could create a script that periodically lists the existing consumer groups:
kafka-consumer-groups.sh --bootstrap-server <kafka-servers-addrs> --list
Then describe each one of them:
kafka-consumer-groups.sh --bootstrap-server <kafka-servers-addrs> --describe --group <consumer-group>
One option to detect if they are unused is to parse the output to see if it returns:
Consumer group '<consumer-group>' has no active members.
Note that relying on the message is a bit brittle, since its wording could change across Kafka versions, so I'd look for a more robust approach (e.g. the status code that the script returns, if any, or initializing your own consumer).
And then delete the ones that are unused:
kafka-consumer-groups.sh --bootstrap-server <kafka-servers-addrs> --delete --group <consumer-group1> --group <consumer-group2>
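The list, filter and delete steps above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names (matching_groups, delete_command, cleanup) are hypothetical, kafka-consumer-groups.sh is assumed to be on the PATH, and `--list` is assumed to print one group name per line. Instead of parsing the "no active members" message, it relies on the broker refusing to delete groups that still have members (the GroupNotEmptyException reported in another question in this thread), so please verify that behavior on your Kafka version before scheduling it.

```python
import fnmatch
import subprocess

TOOL = "kafka-consumer-groups.sh"  # assumption: the script is on the PATH


def matching_groups(list_output, pattern):
    """Return group names from `--list` output that match a glob pattern."""
    names = [line.strip() for line in list_output.splitlines() if line.strip()]
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


def delete_command(bootstrap, groups):
    """Build the `--delete` command line for the given groups."""
    cmd = [TOOL, "--bootstrap-server", bootstrap, "--delete"]
    for group in groups:
        cmd += ["--group", group]
    return cmd


def cleanup(bootstrap, pattern):
    """List all groups, then try to delete the ones matching the pattern."""
    listed = subprocess.run(
        [TOOL, "--bootstrap-server", bootstrap, "--list"],
        capture_output=True, text=True, check=True,
    ).stdout
    groups = matching_groups(listed, pattern)
    if groups:
        # check is left off on purpose: groups that are still in use are
        # expected to be rejected, and that should not abort the script.
        subprocess.run(delete_command(bootstrap, groups))
    return groups
```

Calling cleanup("localhost:9092", "my_consumer_group_*") from a cron job would then cover the periodic part.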
I am running this command to list the offsets of a topic, but it keeps giving me an error.
./kafka-consumer-groups.sh --bootstrap-server my.server.com:9092 --describe --group my-group
I also added --offsets, but the error remains the same.
./kafka-consumer-groups.sh --bootstrap-server my.server.com:9092 --describe --offsets --group my-group
It keeps giving me this error, followed by the usage text:
Found multiple arguments for option bootstrap-server, but you asked for only one
Option                                  Description
------                                  -----------
--all-groups                            Apply to all consumer groups.
--all-topics                            Consider all topics assigned to a
                                          group in the `reset-offsets` process.
--bootstrap-server <String: server to   REQUIRED: The server(s) to connect to.
  connect to>
--by-duration <String: duration>        Reset offsets to offset by duration
                                          from current timestamp. Format:
                                          'PnDTnHnMnS'
--command-config <String: command       Property file containing configs to be
  config property file>                   passed to Admin Client and Consumer.
--delete                                Pass in groups to delete topic
                                          partition offsets and ownership
                                          information over the entire consumer
                                          group. For instance --group g1 --
                                          group g2
--delete-offsets                        Delete offsets of consumer group.
                                          Supports one consumer group at the
                                          time, and multiple topics.
--describe                              Describe consumer group and list
                                          offset lag (number of messages not
                                          yet processed) related to given
                                          group.
--dry-run                               Only show results without executing
                                          changes on Consumer Groups.
                                          Supported operations: reset-offsets.
--execute                               Execute operation. Supported
                                          operations: reset-offsets.
--export                                Export operation execution to a CSV
                                          file. Supported operations: reset-
                                          offsets.
--from-file <String: path to CSV file>  Reset offsets to values defined in CSV
                                          file.
--group <String: consumer group>        The consumer group we wish to act on.
--help                                  Print usage information.
--list                                  List all consumer groups.
--members                               Describe members of the group. This
                                          option may be used with '--describe'
                                          and '--bootstrap-server' options
                                          only.
                                          Example: --bootstrap-server localhost:
                                          9092 --describe --group group1 --
                                          members
--offsets                               Describe the group and list all topic
                                          partitions in the group along with
                                          their offset lag. This is the
                                          default sub-action of and may be
                                          used with '--describe' and '--
                                          bootstrap-server' options only.
                                          Example: --bootstrap-server localhost:
                                          9092 --describe --group group1 --
                                          offsets
--reset-offsets                         Reset offsets of consumer group.
                                          Supports one consumer group at the
                                          time, and instances should be
                                          inactive
                                          Has 2 execution options: --dry-run
                                          (the default) to plan which offsets
                                          to reset, and --execute to update
                                          the offsets. Additionally, the --
                                          export option is used to export the
                                          results to a CSV format.
                                          You must choose one of the following
                                          reset specifications: --to-datetime,
                                          --by-period, --to-earliest, --to-
                                          latest, --shift-by, --from-file, --
                                          to-current.
                                          To define the scope use --all-topics
                                          or --topic. One scope must be
                                          specified unless you use '--from-
                                          file'.
--shift-by <Long: number-of-offsets>    Reset offsets shifting current offset
                                          by 'n', where 'n' can be positive or
                                          negative.
--state [String]                        When specified with '--describe',
                                          includes the state of the group.
                                          Example: --bootstrap-server localhost:
                                          9092 --describe --group group1 --
                                          state
                                          When specified with '--list', it
                                          displays the state of all groups. It
                                          can also be used to list groups with
                                          specific states.
                                          Example: --bootstrap-server localhost:
                                          9092 --list --state stable,empty
                                          This option may be used with '--
                                          describe', '--list' and '--bootstrap-
                                          server' options only.
--timeout <Long: timeout (ms)>          The timeout that can be set for some
                                          use cases. For example, it can be
                                          used when describing the group to
                                          specify the maximum amount of time
                                          in milliseconds to wait before the
                                          group stabilizes (when the group is
                                          just created, or is going through
                                          some changes). (default: 5000)
--to-current                            Reset offsets to current offset.
--to-datetime <String: datetime>        Reset offsets to offset from datetime.
                                          Format: 'YYYY-MM-DDTHH:mm:SS.sss'
--to-earliest                           Reset offsets to earliest offset.
--to-latest                             Reset offsets to latest offset.
--to-offset <Long: offset>              Reset offsets to a specific offset.
--topic <String: topic>                 The topic whose consumer group
                                          information should be deleted or
                                          topic whose should be included in
                                          the reset offset process. In `reset-
                                          offsets` case, partitions can be
                                          specified using this format: `topic1:
                                          0,1,2`, where 0,1,2 are the
                                          partition to be included in the
                                          process. Reset-offsets also supports
                                          multiple topic inputs.
--verbose                               Provide additional information, if
                                          any, when describing the group. This
                                          option may be used with '--
                                          offsets'/'--members'/'--state' and
                                          '--bootstrap-server' options only.
                                          Example: --bootstrap-server localhost:
                                          9092 --describe --group group1 --
                                          members --verbose
--version                               Display Kafka version.
I have supplied all the necessary arguments, so why is this happening?
I believe the issue is with your bootstrap-server naming convention, my.server.com. Can you try with an IP address instead, or with localhost:9092?
When I use Kafka Tool (https://www.kafkatool.com/), I see additional consumer groups that I do not see with kafka-consumer-groups.sh.
I'm assuming that the additional consumer groups are coming from Zookeeper, while kafka-consumer-groups.sh only shows what it sees on the brokers.
Is there a way to delete these "ghost" groups? They are not used. Can I manually browse Zookeeper and delete those nodes?
IIRC, KafkaTool lists groups from Zookeeper, not through the --bootstrap-server AdminClient protocol that kafka-consumer-groups uses.
Also, kafka-console-consumer creates random groups that get hidden in the kafka-consumer-groups output.
While you could remove them from Zookeeper, all inactive consumer groups will automatically go away, as per the offsets retention policies over time, and having them there doesn't cause any performance penalties.
I am running Kafka locally, following the instructions in the quick start guide,
and then I defined my consumer group configuration in config/consumer.properties so that my consumer can pick up messages using the defined group.id.
Running the following command,
bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092
results in,
test-consumer-group <-- group.id defined in conf/consumer.properties
console-consumer-67807 <-- when connecting to kafka via kafka-console-consumer.sh
I am able to connect to Kafka via a Python-based consumer that is configured to use the provided group.id, i.e. test-consumer-group.
First of all, I am not able to understand how and when Kafka creates consumer groups. It seems it loads conf/consumer.properties at some point and, additionally, implicitly creates a consumer group (in my case console-consumer-67807) when connecting via kafka-console-consumer.sh.
How can I explicitly create my own consumer group, let's say my-created-consumer-group?
You do not explicitly create consumer groups but rather build consumers which always belong to a consumer group. No matter which technology (Spark, Spring, Flink, ...) you are using, each Kafka Consumer will have a Consumer Group. The consumer group is configurable for each individual consumer.
It seems it loads the conf/consumer.properties at some point of time and additionally it implicitly creates consumer-group (in my case console-consumer-67807) when connecting via kafka-console-consumer.sh
If you do not tell your console consumer to actually make use of that file it will not be taken into consideration.
There are the following alternatives to provide the name of a consumer group:
Console Consumer with property file (--consumer.config)
This is what the file config/consumer.properties should look like:
# consumer group id
group.id=my-created-consumer-group
And this is how you would then ensure that the console-consumer takes this group.id into consideration:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning --consumer.config /path/to/config/consumer.properties
Console consumer with --group
For console consumers the consumer group gets created automatically with prefix "console-consumer" and suffix something like a PID, unless you provide your own consumer group by adding --group:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning --group my-created-consumer-group
Standard code-based consumer API
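When using the standard Java/Scala/... Consumer API, you can provide the consumer group through the properties:
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
Properties settings = new Properties();
settings.put(ConsumerConfig.GROUP_ID_CONFIG, "basic-consumer");
// set more properties (bootstrap servers, key/value deserializers, ...)
try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(settings)) {
    consumer.subscribe(Arrays.asList("test-topic"));
}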
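Since the question mentions a Python-based consumer, the same idea might look like this with the third-party kafka-python package. This is a sketch, not the asker's actual code: the helper names consumer_settings and consume are hypothetical, the broker address and topic are the ones used in the examples above, and kafka-python has to be installed separately. The point is only that the group comes into existence because a consumer joins with that group_id; nothing is created up front.

```python
def consumer_settings(group_id, bootstrap_servers="localhost:9092"):
    """Keyword arguments for kafka-python's KafkaConsumer (hypothetical helper)."""
    return {
        "bootstrap_servers": bootstrap_servers,
        "group_id": group_id,             # the consumer group this consumer joins
        "auto_offset_reset": "earliest",  # only used when the group has no committed offset
    }


def consume(topic, group_id):
    """Join the given group and print records until interrupted."""
    # Imported lazily so the settings helper works without the package installed.
    from kafka import KafkaConsumer  # third-party: kafka-python

    consumer = KafkaConsumer(topic, **consumer_settings(group_id))
    try:
        for record in consumer:
            print(record.topic, record.partition, record.offset, record.value)
    finally:
        consumer.close()
```

After running consume("test-topic", "my-created-consumer-group") once, the group should show up in the kafka-consumer-groups.sh --list output.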
I executed
"kafka-consumer-groups --bootstrap-server localhost:9092 --list"
and this results in one group being displayed: console-consumer-961
I then tried to delete this group:
kafka-consumer-groups --bootstrap-server localhost:9092 --delete --group console-consumer-961
But this results in an exception:
Error: Deletion of some consumer groups failed:
* Group 'console-consumer-961' could not be deleted due to: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.GroupNotEmptyException: The group is not empty.
I guess this group was created when I ran kafka-console-consumer.bat, but now this consumer is not running. How can I delete this consumer group?
You can validate the state of a consumer group by using the kafka-consumer-groups command line tool.
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group console-consumer-961 --state
This will show you the state of the consumer group. If the group is not empty, Kafka will not allow you to delete it. If it still shows that your console consumer is running, you need to make sure to shut it down properly (usually with CTRL+C).
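If you want to automate that check, the output of --describe --state could be parsed along these lines. This is a sketch under an assumption about the output layout: a header row starting with GROUP and one row per group whose last two columns are STATE and #MEMBERS. That layout is not guaranteed and may differ between Kafka versions. The function names are hypothetical.

```python
def parse_state_output(text):
    """Map group name -> (state, member_count) from `--describe --state` output."""
    result = {}
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines, the header row and anything that is not a data row.
        if len(tokens) < 3 or tokens[0] == "GROUP":
            continue
        if not tokens[-1].isdigit():
            continue
        # Read from the right, because ASSIGNMENT-STRATEGY can be blank.
        result[tokens[0]] = (tokens[-2], int(tokens[-1]))
    return result


def can_delete(text, group):
    """True only if the group is listed as Empty with zero members."""
    state, members = parse_state_output(text).get(group, (None, None))
    return state == "Empty" and members == 0
```

Reading the state and member count from the right-hand side of the row is a deliberate choice, since the assignment strategy column is typically blank for a group without members.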
When you delete a topic, offset information is now correctly reset. So when you create a topic with the same name, consumers start from the beginning of the new data.
You still can't delete new-style consumer groups with the kafka-consumer-groups tool, but your underlying issue is solved.
Before Kafka 0.10.2, there were hacks, but no clean solution to this issue.
I'm using kafka_2.9.2-0.8.1.1 with zookeeper 3.4.6.
Is there a utility that can automatically remove a consumer group from zookeeper? Or can I just remove everything under /consumers/[group_id] in zookeeper? If the latter, is there anything else I'm missing & can this be done with a live system?
Update:
As of kafka version 2.3.0, there is a new utility:
> bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --delete --group my-group
Related doc: http://kafka.apache.org/documentation/#basic_ops_consumer_lag
See below for more discussion
As of v0.9.0, Kafka ships with a suite of tools in the /bin directory, one of which is the kafka-consumer-groups.sh tool. This will delete a consumer group:
./kafka-consumer-groups.sh --zookeeper <zookeeper_url> --delete --group <group-name>
For new consumers (which use a Kafka topic to manage offsets instead of Zookeeper), you cannot delete the group information using Kafka's built-in tools.
Here is an example of trying to delete the group information for a new style consumer using the kafka-consumer-groups.sh script:
bin/kafka-consumer-groups.sh --bootstrap-server "kafka:9092" --delete --group "indexer" --topic "cleaned-logs"
Option '[delete]' is only valid with '[zookeeper]'. Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
Here's the important part of that response:
Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
This is kind of annoying from a monitoring perspective (esp. when tracking offsets via something like burrow) because it means that if you change consumer group names in your code you'll keep seeing that old groups are behind on their offsets until those offsets expire.
Hypothetically you could write a tombstone to that topic manually (which is what happens during offset expiration) but I haven't found any tools that make this easy.
You can delete a group from Kafka with the CLI:
kafka-consumer-groups --bootstrap-server localhost:9092 --delete --group group_name
Currently, as far as I know, the only way to remove a Kafka consumer group is to manually delete the Zookeeper path /consumers/[group_id].
If you just want to delete a consumer group, there is nothing to worry about when manually deleting the Zookeeper path, but if you are doing it to rewind offsets, the notes below will be helpful.
First of all, you should stop all the consumers belonging to the consumer group before removing the Zookeeper path. If you don't, those consumers will not consume newly produced messages and will soon close their connections to the Zookeeper cluster.
When you restart the consumers and want them to start from the beginning, set the auto.offset.reset property to smallest (or earliest in newer Kafka releases). The default value of the property is largest (or latest in newer Kafka releases), which makes your restarted consumers read from the largest offset, so they consume only newly produced messages. For more information about the property, refer to Consumer Config in the Kafka documentation.
FYI, there is a question "How can I rewind the offset in the consumer?" in the Kafka FAQ, but it did not help me much.
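The effect of auto.offset.reset after the group's offsets are gone can be illustrated with a small model. This is a simplified sketch, not Kafka code: the function and parameter names are hypothetical, and it ignores the case where a committed offset exists but is out of range. It only shows that the property matters when no committed offset is found for the group.

```python
# Old consumer property values mapped to the names used by newer releases.
LEGACY_NAMES = {"smallest": "earliest", "largest": "latest"}


def starting_offset(committed, log_start, log_end, auto_offset_reset="latest"):
    """Offset a restarted consumer would begin reading from (simplified model)."""
    if committed is not None:
        # A committed offset exists, so auto.offset.reset is not consulted.
        return committed
    policy = LEGACY_NAMES.get(auto_offset_reset, auto_offset_reset)
    if policy == "earliest":
        return log_start  # replay everything still retained in the partition
    if policy == "latest":
        return log_end    # only messages produced from now on
    raise ValueError("unsupported auto.offset.reset value: %r" % auto_offset_reset)
```

For a partition holding offsets 0 to 99 and a group whose offsets were just removed, starting_offset(None, 0, 100, "smallest") gives 0, while the default gives 100.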