How to run two console consumers in the same consumer group? - apache-kafka

When I run two instances of kafka-console-consumer with exactly the same properties (using the default config/consumer.properties), I get the same messages on both instances.
./bin/kafka-console-consumer.sh --bootstrap-server :9092 --topic test1
If both instances have the same consumer group id, shouldn't Kafka deliver a given message to only one of them? How do I run them as one consumer group?

From the Kafka docs I found this:
The default for console consumer's enable.auto.commit property when no group.id is provided is now set to false. This is to avoid polluting the consumer coordinator cache as the auto-generated group is not likely to be used by other consumers.
But here is the trick: list all consumer groups to see which groups your console consumers (I opened four of them) actually joined:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
Every console consumer started without an explicit group.id gets a different, auto-generated group id. This is why each instance receives every message and, with --from-beginning, always consumes from the beginning. The output shows one group per console consumer:
Note: This will not show information about old Zookeeper-based consumers.
console-consumer-66835
console-consumer-38647
console-consumer-18983
console-consumer-18365
console-consumer-96734
The easiest way to set group.id for a console consumer:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning --consumer-property group.id=test1
Read up on Managing Consumer Groups.

The trick is to pass either --consumer.config config/consumer.properties or --consumer-property group.id=test1, so that the group.id is specified explicitly.
./bin/kafka-console-consumer.sh \
--bootstrap-server localhost:9092 \
--topic test1 \
--consumer.config config/consumer.properties
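To see why sharing a group.id changes who receives what, here is a toy model in Python. It is only a sketch: assign_partitions is a hypothetical helper, the 4-partition topic and consumer names are made up, and the round-robin dealing is a simplification rather than Kafka's actual assignor. It shows the one rule that matters here: within a group, each partition is owned by exactly one member.

```python
from collections import defaultdict


def assign_partitions(partitions, members_by_group):
    """Toy model: inside each group, every partition has exactly one owner."""
    assignment = {}
    for group, members in members_by_group.items():
        owners = defaultdict(list)
        # Deal partitions out round-robin; Kafka's real assignors are more involved.
        for index, partition in enumerate(sorted(partitions)):
            owners[members[index % len(members)]].append(partition)
        assignment[group] = {member: owners[member] for member in members}
    return assignment


partitions = [0, 1, 2, 3]  # hypothetical 4-partition topic

# No explicit group.id: each console consumer sits alone in its own group,
# so each one is assigned every partition and sees every message.
separate = assign_partitions(
    partitions,
    {"console-consumer-66835": ["consumer-a"], "console-consumer-38647": ["consumer-b"]},
)

# Both consumers started with group.id=test1: the partitions are split.
shared = assign_partitions(partitions, {"test1": ["consumer-a", "consumer-b"]})

print(separate)
print(shared)
```

Note that if the topic has only one partition, only one member of a shared group receives messages and the other stays idle, so a topic needs at least two partitions before two consumers in the same group can share the load.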

Related

How to create a new consumer group in kafka

I am running Kafka locally, following the instructions in the quick start guide,
and I then defined my consumer group configuration in config/consumer.properties so that my consumer picks up messages using the defined group.id.
Running the following command,
bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092
results in,
test-consumer-group <-- group.id defined in conf/consumer.properties
console-consumer-67807 <-- when connecting to kafka via kafka-console-consumer.sh
I am able to connect to Kafka via a Python-based consumer that is configured to use the provided group.id, i.e. test-consumer-group.
First of all, I am not able to understand how/when kafka creates consumer groups. It seems it loads the conf/consumer.properties at some point of time and additionally it implicitly creates consumer-group (in my case console-consumer-67807) when connecting via kafka-console-consumer.sh.
How can I explicitly create my own consumer group, let's say my-created-consumer-group?
You do not explicitly create consumer groups but rather build consumers which always belong to a consumer group. No matter which technology (Spark, Spring, Flink, ...) you are using, each Kafka Consumer will have a Consumer Group. The consumer group is configurable for each individual consumer.
It seems it loads the conf/consumer.properties at some point of time and additionally it implicitly creates consumer-group (in my case console-consumer-67807) when connecting via kafka-console-consumer.sh
If you do not tell your console consumer to actually make use of that file it will not be taken into consideration.
There are the following alternatives to provide the name of a consumer group:
Console Consumer with property file (--consumer.config)
This is how the file config/consumer.properties should look:
# consumer group id
group.id=my-created-consumer-group
And this is how you would then ensure that the console-consumer takes this group.id into consideration:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning --consumer.config /path/to/config/consumer.properties
Console consumer with --group
For console consumers, the consumer group gets created automatically with the prefix "console-consumer" and a numeric suffix (a random number, e.g. console-consumer-67807), unless you provide your own consumer group by adding --group:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning --group my-created-consumer-group
Standard code-based consumer API
When using the standard Java/Scala/... Consumer API, you can provide the consumer group through the properties:
Properties settings = new Properties();
settings.put(ConsumerConfig.GROUP_ID_CONFIG, "basic-consumer");
// set more properties (bootstrap servers, deserializers, ...)
try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(settings)) {
    consumer.subscribe(Arrays.asList("test-topic"));
    // poll for records here
}
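Before starting a consumer, it can help to confirm which group.id a properties file actually sets. The following is a minimal sketch in Python; read_properties is a hypothetical helper that handles only simple key=value lines, not the full Java properties syntax (for example line continuations or ':' separators).

```python
def read_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Same content as the config/consumer.properties example above.
sample = """
# consumer group id
group.id=my-created-consumer-group
"""

props = read_properties(sample)
group_id = props.get("group.id")  # None would mean the file sets no group
print(group_id)
```

If group_id comes back as None, a console consumer started with that file would presumably fall back to an auto-generated console-consumer-* group.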

How to remove a stale consumer from a kafka broker?

kafka-consumer-groups.sh --bootstrap-server hostname:port --describe --group sub1
Consumer group 'sub1' has no active members.
kafka-consumer-groups.sh --bootstrap-server hostname:port --delete --group sub1
Option '[delete]' is only valid with '[zookeeper]'.
Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
Also, when I try to display my consumer details using ZooKeeper, it says the consumer group "sub1" does not exist.
kafka-consumer-groups.sh --zookeeper hostname:port --describe --group sub1
Note: This will only show information about consumers that use ZooKeeper (not those using the Java consumer API).
Error: The consumer group 'sub1' does not exist.
Group information for consumers that use Kafka to manage offsets instead of Zookeeper cannot be deleted with built-in tools. If you read the whole warning when trying to execute
kafka-consumer-groups.sh --bootstrap-server hostname:port --delete --group sub1
it clearly mentions why group metadata info cannot be deleted:
Option '[delete]' is only valid with '[zookeeper]'.
Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.

How to get the consumer group of the single consumer

I am playing around with Kafka. When I use
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic test
Kafka will automatically create a consumer group. I am wondering how to get the consumer group name?
You should use kafka-consumer-groups.sh. The following command lists all consumer groups:
bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092
Note: This will only show information about consumers that use the Java Consumer API (non-Zookeeper-based consumers).

How do I delete a Kafka Consumer Group to reset offsets?

I want to delete a Kafka consumer group so that when the application creates a consumer and subscribes to a topic, it can start at the beginning of the topic data.
This is with a single node development vm using the current latest Confluent Platform 3.1.2 which uses Kafka 0.10.1.1.
I try the normal syntax:
sudo /usr/bin/kafka-consumer-groups --new-consumer --bootstrap-server localhost:9092 --delete --group my_consumer_group
I get the error:
Option [delete] is only valid with [zookeeper]. Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
If I try the zookeeper variant:
sudo /usr/bin/kafka-consumer-groups --zookeeper localhost:2181 --delete --group my_consumer_group
I get:
Delete for group my_consumer_group failed because group does not exist.
If I list using the "old" consumer, I do not see my consumer group (or any other consumer groups)
sudo /usr/bin/kafka-consumer-groups --zookeeper localhost:2181 --list
If I list using the "new" consumer, I can see my consumer group but apparently I can't delete it:
sudo /usr/bin/kafka-consumer-groups --new-consumer --bootstrap-server localhost:9092 --list
This can be done with Kafka 1.1.x. From the documentation:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --delete --group my-group --group my-other-group
In Kafka 0.11 (or Confluent 3.3) you can reset the offsets of any existing consumer group without having to delete the topic. In fact you can change the offsets to any absolute offset value or timestamp or any relative position as well.
These new functions are all added with the new --reset-offsets flag on the kafka-consumer-groups command line tool.
See KIP-122 details here https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
Upgrading to the just released Confluent Platform 3.2 with Kafka 0.10.2 solved my underlying issue. When I delete a topic, offset information is now correctly reset. So when I create a topic with the same name, consumers start from the beginning of the new data.
I still can't delete new style consumer groups with the kafka-consumer-groups tool, but my underlying issue is solved.
Before Kafka 0.10.2, there were hacks, but no clean solution to this issue.
If you use the Java client, you can first get the beginning offset:
TopicPartition partition = new TopicPartition("YOUR_TOPIC", YOUR_PARTITION);
Map<TopicPartition, Long> map = consumer.beginningOffsets(Collections.singleton(partition));
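And the offset from which the consumer will start processing (if you do not delete the consumer group):
OffsetAndMetadata committed = consumer.committed(partition); // null if the group has no committed offset for this partition
Long committedOffset = (committed != null) ? committed.offset() : null;
Now, if starting from committedOffset is fine, just poll for records.
If you want to start from the beginning offset instead: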
consumer.seek(partition, map.get(partition));
You can also reset the offsets of a single topic without deleting the entire consumer group:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group my-group --topic my-topic --reset-offsets --to-earliest --execute
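Because a reset changes committed offsets for the whole group, it is worth previewing it first. Below is a small Python sketch that builds the same command as above but defaults to the tool's --dry-run option and only switches to --execute on request. reset_offsets_command is a hypothetical helper, and the tool path and broker address are assumptions copied from the example.

```python
import shlex


def reset_offsets_command(group, topic, execute=False,
                          bootstrap_server="localhost:9092",
                          tool="bin/kafka-consumer-groups.sh"):
    """Build the argument list for resetting one topic of a group to the earliest offset."""
    return [
        tool,
        "--bootstrap-server", bootstrap_server,
        "--group", group,
        "--topic", topic,
        "--reset-offsets", "--to-earliest",
        # Preview by default; only apply the change when explicitly asked to.
        "--execute" if execute else "--dry-run",
    ]


preview = reset_offsets_command("my-group", "my-topic")
apply_cmd = reset_offsets_command("my-group", "my-topic", execute=True)

print(shlex.join(preview))
print(shlex.join(apply_cmd))
```

The returned list can be handed to subprocess.run(args, check=True). The tool refuses to reset offsets while the group still has active members, so stop the consumers in the group first.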
If you're using Windows and you need a JAAS config with password & username to access your kafka cluster, you can use the following commands
#consumer.properties
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="XXXXXXXXXXXXXXX" password="XXXXXXXXXXXXXXXXXXX";
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
.\kafka\bin\windows\kafka-consumer-groups.bat --bootstrap-server xxxxxxxxxx.confluent.cloud:9092 --group cop-group --topic topic_name_ini --reset-offsets --to-earliest --execute --command-config .\kafka\config\consumer.properties
In the command above we used --command-config to pass a properties file for the JAAS config

Kafka Connect Offsets. Get/Set?

How do I get, set, or reset the offset of a Kafka Connect connector/task/sink?
I can use the /usr/bin/kafka-consumer-groups tool which runs kafka.admin.ConsumerGroupCommand to see the offsets for all my regular Kafka consumer groups. However, Kafka Connect tasks and groups do not show up with this tool.
Similarly, I can use the zookeeper-shell to connect to Zookeeper and I can see zookeeper entries for regular Kafka consumer groups, but not for Kafka Connect sinks.
As of 0.10.0.0, Connect doesn't provide an API for managing offsets. It's something we want to improve in the future, but it's not there yet. The ConsumerGroupCommand would be the right tool to manage offsets for sink connectors. Note that source connector offsets are stored in a special offsets topic for Connect (they aren't like normal Kafka offsets, since they are defined by the source system; see offset.storage.topic in the worker configuration docs). Since sink connectors use the new consumer, they won't store their offsets in ZooKeeper: all modern clients use native Kafka-based offset storage. The ConsumerGroupCommand can work with these offsets; you just need to pass the --new-consumer option.
You can't set offsets, but you can use kafka-consumer-groups.sh tool to "scroll" the feed forward.
The consumer group of your connector has a name of connect-*CONNECTOR NAME*, but you can double check:
unset JMX_PORT; ./bin/kafka-consumer-groups.sh --bootstrap-server *KAFKA HOSTS* --list
To view current offset:
unset JMX_PORT; ./bin/kafka-consumer-groups.sh --bootstrap-server *KAFKA HOSTS* --group connect-*CONNECTOR NAME* --describe
To move the offset forward:
unset JMX_PORT; ./bin/kafka-console-consumer.sh --bootstrap-server *KAFKA HOSTS* --topic *TOPIC* --max-messages 10000 --consumer-property group.id=connect-*CONNECTOR NAME* > /dev/null
I suppose you can move the offset backward as well by deleting the consumer group first, using the --delete flag.
Don't forget to pause and resume your connector via Kafka Connect REST API.
In my case (testing reading files into a producer and consuming them in the console, all locally), I just saw this in the producer output:
offset.storage.file.filename=/tmp/connect.offsets
So I wanted to open it, but it is binary, with some hardly recognizable characters.
I deleted it (renaming it also works), and then I could write into the same file and get the file content from the consumer again. You have to restart the console producer for this to take effect: on startup it tries to read the offset file and, if the file is missing, creates a new one, so the offset is reset.
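If you want to reset it without deletion, you can use the following (add --execute to actually apply the change; without it the tool only reports the offsets it would set):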
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group <group-name> --reset-offsets --to-earliest --topic <topic_name>
You can check all group names by:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
and check details of each group:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group <group_name> --describe
In a production environment with old ZooKeeper-based consumers, this offset is managed by ZooKeeper, so more steps (and caution) are needed. You can refer to these pages:
https://metabroadcast.com/blog/resetting-kafka-offsets
https://community.hortonworks.com/articles/81357/manually-resetting-offset-for-a-kafka-topic.html
Steps:
kafka-topics --list --zookeeper localhost:2181
kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic vital_signs --time -1   # -1 for the latest offset, -2 for the earliest
Then, inside zookeeper-shell:
set /consumers/{yourConsumerGroup}/offsets/{yourFancyTopic}/{partitionId} {newOffset}
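For a topic with several partitions, the set command has to be repeated once per partition. The Python sketch below generates those commands from the path template above. It applies only to old ZooKeeper-based consumers; zk_set_offset_commands is a hypothetical helper, and the group name and offsets are made-up example values.

```python
def zk_set_offset_commands(group, topic, offsets_by_partition):
    """Return one zookeeper-shell 'set' command per partition."""
    commands = []
    for partition, offset in sorted(offsets_by_partition.items()):
        path = f"/consumers/{group}/offsets/{topic}/{partition}"
        commands.append(f"set {path} {offset}")
    return commands


# Example values only: reset partition 0 to offset 0 and partition 1 to 1500.
commands = zk_set_offset_commands("my_consumer_group", "vital_signs", {0: 0, 1: 1500})
for command in commands:
    print(command)
```

Stop the consumers in the group before pasting the generated lines into zookeeper-shell, otherwise a running consumer may overwrite the values with its next offset commit.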