I'm running this Kafka command:
/opt/kafka_2.11/bin/kafka-consumer-groups.sh --bootstrap-server xxxxx:9092 \
--describe --group flink-cg
The result looks like this:
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
my_topic 0 481239571 484028280 2788709
The offset stays stuck even though my Flink job is running and there are no errors in the log file.
How can I check whether my offset is correct? I'm afraid my current offset holds a wrong value and that this is why it is stuck.
The mere fact that the Flink job is running doesn't necessarily mean that the offset should change. This depends on the configuration of your job, but by default the offset is only committed on checkpoint, so the first thing to check is whether your job is checkpointing properly (maybe you have configured a long interval between checkpoints).
If it is, or if you have enabled enable.auto.commit, then you should check whether backpressure on some operators is slowing down the reading of records.
It would be easier to tell if you could provide more information about the configuration and the job itself.
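One way to tell a stuck consumer from a slow one is to run the describe command twice, some time apart, and compare the numbers. The sketch below only does the arithmetic on two (CURRENT-OFFSET, LOG-END-OFFSET) readings. The helper names and the second reading are assumptions made up for illustration; nothing here is provided by Kafka or Flink.

```python
def lag(current_offset, log_end_offset):
    """Lag as shown by kafka-consumer-groups: log end minus committed offset."""
    return log_end_offset - current_offset


def classify_progress(before, after):
    """Compare two (current_offset, log_end_offset) readings of one partition.

    Returns "advancing" if the committed offset moved forward,
    "caught up" if it did not move but nothing is left to read,
    and "stuck" if it did not move while records are still waiting.
    """
    if after[0] > before[0]:
        return "advancing"
    if lag(*after) == 0:
        return "caught up"
    return "stuck"


# First reading taken from the describe output in the question.
first = (481239571, 484028280)
# Second reading is hypothetical: same committed offset, more data produced.
second = (481239571, 484030000)

print(lag(*first))                       # 2788709, matches the LAG column
print(classify_progress(first, second))  # stuck
```

Since the offset is committed on checkpoint, the two readings should be taken further apart than the checkpoint interval, otherwise "stuck" may just mean that no checkpoint has completed in between.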
We have a working topic-consumer setup in Kafka. While trying to create the same in another environment, the consumer does not start reading from the topic. I've tried restarting, deleting, renaming and many other things, but none of them worked.
When I describe the consumer group with the command kafka-consumer-groups --describe, I get this result:
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
G__BRC_SENDSMS BRC_SENDSMS 0 - 31 -
Note that the current offset is not zero but empty (-). Because it is empty, no lag is reported either. Is this normal? What should I do?
Regarding "create the same in another environment":
Only one consumer in group.id=G__BRC_SENDSMS can read from BRC_SENDSMS, partition 0 at the same time. Change the group id, and it should start to read data. Or add more partitions to the topic, and you should see the group rebalance and distribute messages between all consumer instances. If neither works, then check the networking configuration in this "other environment", such as the Kafka advertised.listeners.
If there is no current offset, then has your consumer actually committed yet?
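To make the "has it committed yet?" check concrete, here is a rough sketch that reads the describe output and lists partitions whose CURRENT-OFFSET is "-". It assumes the first non-empty line is the header and that no value contains spaces; parse_describe and partitions_without_commit are hypothetical helper names, not part of any Kafka tooling.

```python
def parse_describe(output):
    """Parse whitespace-separated describe output into a list of dicts.

    Assumes the first non-empty line is the header and that no column
    value contains spaces. A "-" is turned into None.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        values = [None if v == "-" else v for v in line.split()]
        rows.append(dict(zip(header, values)))
    return rows


def partitions_without_commit(rows):
    """Return (topic, partition) pairs that have no committed offset."""
    return [
        (row["TOPIC"], int(row["PARTITION"]))
        for row in rows
        if row["CURRENT-OFFSET"] is None
    ]


# Output copied from the question above.
sample = """\
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
G__BRC_SENDSMS BRC_SENDSMS 0 - 31 -
"""

print(partitions_without_commit(parse_describe(sample)))
# [('BRC_SENDSMS', 0)]
```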
My Kafka topic has 10 records/messages in total and 2 partitions with 5 messages each. My consumer group has 2 consumers, and each consumer has already read the 5 messages from its assigned partition. Now I want to re-process/read messages from my topic from the beginning (offset 0).
I stopped my Kafka consumers and ran the following command to reset the consumer group offset to 0.
./kafka-consumer-groups.sh --group cg1 --reset-offsets --to-offset 0 --topic t1 --execute --bootstrap-server "..."
My expectation was that once I restarted my Kafka consumers they would start reading records from offset 0, i.e. the beginning, but that didn't happen: they polled from their last position, i.e. offset 5. Why is that so? I then had to make each of my consumers explicitly seek to offset 0 (beginning) to re-process/read records from the beginning. And in later test cycles, I didn't even run the above command to reset the offset for the consumer group.
My question is: if I have to make my consumers explicitly seek to the beginning to make them re-process/read messages again, then what is the purpose of resetting the offset of the Kafka consumer group?
Handling Kafka consumer offsets is a bit tricky. A consumer program uses the auto.offset.reset config only when its consumer group does not have a valid offset committed in an internal Kafka topic. (The other supported offset storage is Zookeeper, but the internal Kafka topic is used as offset storage in recent Kafka versions.)
Consider the scenarios below:
A consumer in the consumer group 'group1' has consumed 5 messages from topic 'testtopic', and the offset details are committed to the internal Kafka topic. The next time the consumer starts, it will not use the 'auto.offset.reset' config. Instead, it will fetch the stored offset from storage and continue fetching messages from the retrieved offset.
A consumer in the consumer group 'group2' is started as a new consumer to fetch messages from 'testtopic'. This is a new group and no offset details are available in the internal Kafka topic, so the 'auto.offset.reset' config is used to decide where to start: either from the beginning of the topic or from the latest offset (only new messages will be consumed).
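The two scenarios can be modelled in a few lines. This is only a simplified sketch of the decision for a single partition, not the client's actual code; resolve_start_offset and its parameters are names invented for this example.

```python
def resolve_start_offset(committed, auto_offset_reset, log_start, log_end):
    """Model where a consumer starts reading on one partition.

    committed         -- offset stored for the group, or None if there is none
    auto_offset_reset -- "earliest" or "latest"
    log_start/log_end -- first available offset and next offset to be written
    """
    # A committed offset inside the available range always wins.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    # No valid commit: fall back to the configured policy.
    if auto_offset_reset == "earliest":
        return log_start
    if auto_offset_reset == "latest":
        return log_end
    raise ValueError(f"unsupported policy: {auto_offset_reset}")


# 'group1': 5 messages consumed and committed, so the policy is ignored.
print(resolve_start_offset(5, "earliest", 0, 5))     # 5
# 'group2': brand-new group, so the policy decides.
print(resolve_start_offset(None, "earliest", 0, 5))  # 0
print(resolve_start_offset(None, "latest", 0, 5))    # 5
```

Under this model, a successful reset of the group to offset 0 shows up as committed=0, and the consumer then starts at 0 without any explicit seek.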
As I understand your question, the issue is that the command to reset the offset is not working, so you have to manually seek to the beginning when starting the consumer. The general form of the command is:
kafka-consumer-groups.sh --bootstrap-server <kafka_host:port> --group <group_id> [--topic <topic_name> or --all-topics] --reset-offsets [--to-earliest or --to-offset <offset>] --execute
There are three possible reasons for the reset command not working:
The log retention period is short and the offset you are trying to reset to is no longer available.
A consumer instance in the consumer group is still running. In both of these cases, the reset-offsets command may not work.
The Kafka version is older than 0.11. The reset-offsets option is only available from Kafka 0.11 onwards.
From your question, the first and third cases are unlikely. Please check the second case: stop any running consumer instance and then try resetting the offsets.
The command below can be used to check whether a consumer group has an active consumer instance.
kafka-consumer-groups.sh --bootstrap-server <kafka_host:port> --group <group_id> --describe
Sample output:
Consumer group 'group1' has no active members.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
intro 0 0 99 99
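If you want to script the "check first, then reset" sequence, something along these lines could work. It is a sketch under assumptions: it expects the "no active members" notice to appear in the text you captured from the describe command, the helper names are made up, and the localhost:9092 address is a placeholder. The flags are the ones already shown in this answer.

```python
def has_no_active_members(describe_output, group):
    """True if the describe output carries the 'no active members' notice."""
    return f"Consumer group '{group}' has no active members." in describe_output


def reset_to_earliest_argv(bootstrap_server, group, topic):
    """Build the argument list for the reset command (e.g. for subprocess.run)."""
    return [
        "kafka-consumer-groups.sh",
        "--bootstrap-server", bootstrap_server,
        "--group", group,
        "--topic", topic,
        "--reset-offsets", "--to-earliest",
        "--execute",
    ]


# Sample output copied from above.
sample = """\
Consumer group 'group1' has no active members.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
intro 0 0 99 99
"""

if has_no_active_members(sample, "group1"):
    # Placeholder address; only print the command here instead of running it.
    print(" ".join(reset_to_earliest_argv("localhost:9092", "group1", "intro")))
```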
I'm new to Kafka and I've tried the kafka-python package.
I managed to set up a simple producer and consumer that can send and receive messages. In this case the consumer does not use a consumer group, as below:
consumer = KafkaConsumer(queue_name, bootstrap_servers='kafka:9092')
However, when I started to use the group_id as below, it stops receiving any messages:
consumer = KafkaConsumer(bootstrap_servers='kafka:9092', auto_offset_reset='earliest', group_id='my-group')
consumer.subscribe([queue_name])
For comparison, I've also tried the confluent-kafka-python package, where I have the following consumer code, which also doesn't work:
consumer = Consumer({
    'bootstrap.servers': 'kafka:9092',
    'group.id': 'mygroup',
    'auto.offset.reset': 'earliest'
})
consumer.subscribe([queue_name])
Also, running ./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list gives an empty result.
Is there any configuration I'm missing here?
By default, the consumer starts consuming from the last committed offsets, which is probably the last offset in your case.
The auto.offset.reset setting only applies when there are no committed offsets. Because the consumer commits offsets automatically by default, it usually only applies the first time you run it (there are a few other cases, but they don't matter in this example).
So to see messages flowing, you need to either start producing once your consumer is running or use a different group name so that auto.offset.reset applies.
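For the second option, a quick way to experiment is to generate a throwaway group name on each run. This is a minimal sketch for kafka-python; consumer_kwargs and open_consumer are hypothetical helpers, and the broker address is the one from the question.

```python
import uuid


def consumer_kwargs(group_prefix, bootstrap="kafka:9092"):
    """Build keyword arguments for kafka-python's KafkaConsumer.

    A fresh, random group id has no committed offsets, so
    auto_offset_reset='earliest' decides the starting position.
    """
    return {
        "bootstrap_servers": bootstrap,
        "group_id": f"{group_prefix}-{uuid.uuid4().hex[:8]}",
        "auto_offset_reset": "earliest",
    }


def open_consumer(topic, group_prefix="my-group"):
    """Create a consumer in a new group and subscribe it to the topic."""
    # Imported here so the helper above works without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(**consumer_kwargs(group_prefix))
    consumer.subscribe([topic])
    return consumer
```

Calling open_consumer(queue_name) and iterating over the result should then deliver the messages already in the topic. Random group names are only meant for testing; each one leaves its own committed offsets behind.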
I am using storm-kafka-client 1.2.1 and creating my spout config for KafkaTridentSpoutOpaque as below:
kafkaSpoutConfig = KafkaSpoutConfig.builder(brokerURL, kafkaTopic)
    .setProp(ConsumerConfig.GROUP_ID_CONFIG, "storm-kafka-group")
    .setProcessingGuarantee(ProcessingGuarantee.AT_MOST_ONCE)
    .setProp(ConsumerConfig.CLIENT_ID_CONFIG, InetAddress.getLocalHost().getHostName())
I am unable to find either my group id or the offset in Kafka or in Zookeeper. In Zookeeper I used zkCli.sh and tried ls /consumers, but there were none, which I think is because Kafka itself now maintains offsets rather than Zookeeper.
I also tried Kafka with the command below:
bin/kafka-run-class.sh kafka.admin.ConsumerGroupCommand --list --bootstrap-server localhost:9092
Note: This will not show information about old Zookeeper-based consumers.
console-consumer-20130
console-consumer-82696
console-consumer-6106
console-consumer-67393
console-consumer-14333
console-consumer-21174
console-consumer-64550
Can someone help me find my offset? And will my events in Kafka be replayed if I restart the topology?
Trident doesn't store offsets in Kafka, but in Storm's Zookeeper. If you're running with the default settings for Storm's Zookeeper config, the path in Storm's Zookeeper will be something like /coordinator/<your-topology-id>/meta.
The objects below that path contain the first and last offset, as well as the topic partition, for each batch. So e.g. /coordinator/<your-topology-id>/meta/15 would contain the first and last offset emitted in batch number 15.
Whether the spout replays offsets after restart is controlled by the FirstPollOffsetStrategy you set in the KafkaSpoutConfig. The default is UNCOMMITTED_EARLIEST, which does not start over on restart. See the Javadoc at https://github.com/apache/storm/blob/v1.2.1/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java#L126.
I'm using kafka_2.9.2-0.8.1.1 with zookeeper 3.4.6.
Is there a utility that can automatically remove a consumer group from zookeeper? Or can I just remove everything under /consumers/[group_id] in zookeeper? If the latter, is there anything else I'm missing & can this be done with a live system?
Update:
As of Kafka version 2.3.0, there is a new utility:
> bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --delete --group my-group
Related doc: http://kafka.apache.org/documentation/#basic_ops_consumer_lag
See below for more discussion
As of v0.9.0, Kafka ships with a suite of tools in /bin, one of which is kafka-consumer-groups.sh. It can delete a consumer group:
./kafka-consumer-groups.sh --zookeeper <zookeeper_url> --delete --group <group-name>
For new consumers (which use a Kafka topic to manage offsets instead of Zookeeper) you cannot delete the group information using Kafka's built-in tools.
Here is an example of trying to delete the group information for a new-style consumer using the kafka-consumer-groups.sh script:
bin/kafka-consumer-groups.sh --bootstrap-server "kafka:9092" --delete --group "indexer" --topic "cleaned-logs"
Option '[delete]' is only valid with '[zookeeper]'. Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
Here's the important part of that response:
Note that there's no need to delete group metadata for the new consumer as the group is deleted when the last committed offset for that group expires.
This is kind of annoying from a monitoring perspective (especially when tracking offsets with something like Burrow), because if you change consumer group names in your code, the old groups keep showing up as behind on their offsets until those offsets expire.
Hypothetically you could write a tombstone to that topic manually (which is what happens during offset expiration) but I haven't found any tools that make this easy.
You can delete a group from Kafka using the CLI:
kafka-consumer-groups --bootstrap-server localhost:9092 --delete --group group_name
As far as I know, the only way to remove a Kafka consumer group at the moment is to manually delete the Zookeeper path /consumers/[group_id].
If you just want to delete a consumer group, there is nothing to worry about when manually deleting the Zookeeper path, but if you are doing it to rewind offsets, the following will be helpful.
First of all, you should stop all the consumers belonging to the consumer group before removing the Zookeeper path. If you don't, those consumers will not consume newly produced messages and will soon close their connections to the Zookeeper cluster.
When you restart the consumers and want them to start from the beginning, set the auto.offset.reset property to smallest (or earliest in new Kafka releases). The default value of the property is largest (or latest in new Kafka releases), which makes your restarted consumers read from after the largest offset, so they consume only newly produced messages. For more information about the property, refer to Consumer Config in the Kafka documentation.
FYI, there is a question "How can I rewind the offset in the consumer?" in the Kafka FAQ, but it did not help me much.