I have an application that uses fs2-kafka for reading business events from a kafka cluster. In that application, I have multiple fs2-kafka consumers, each subscribed to a different topic. But one of the consumers seems to be stuck, as it does not consume any events.
Checking the consumer group's offsets yielded the following results:
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
consumer topic 0 - 5 - consumer-consumer-1-99c1c19a-faaf-40e6-a3dc-75b7d04e96f9 /10.0.3.2 consumer-consumer-1
(edited slightly for privacy)
I have also managed to get the CURRENT-OFFSET to 1 (though it seems no actual consuming happened, because none of my logs were triggered), but regardless, the group does not seem to want to move its offset.
The topic has just one partition and there is only one consumer/consumer group reading from it. There is no reason I can see for Kafka to hold consumers back from consuming. If it matters: this topic, like every other topic in this cluster, is created automatically using Kafka's "AUTO_CREATE_TOPICS". (This is a dev environment, where that is simply more convenient than creating topics by hand.)
The strangest thing is this: the same code, running against a different topic, works. Also, as is always the case with these things, the issue does not reproduce on my laptop. There are barely any differences between my local Kafka and the Kafka in our dev cluster.
Originally, I had just one consumer group for the entire application. I have since tried a separate consumer group per consumer, and even sharing a single consumer for reading from multiple topics. The only topic that is stuck is this one; every other topic works.
I have also tried:
Restarting kafka and the app, updating kafka to a newer version
Manually resetting the consumer group's offsets
Deleting the topic
Apart from wiping all of Kafka's data, I believe I have tried everything on both my side and Kafka's.
Related
How are these two set? The behaviour I observe with kafka-consumer-groups.sh is that when a new message is appended to a certain partition, it first increments that partition's LOG-END-OFFSET and LAG columns, and after some time the CURRENT-OFFSET column gets incremented and the LAG column decremented, although no offset was actually committed by any consumer, since there are no active consumers. Am I right, and does this always happen with consumer groups that have no active members, or is there a way to turn off the second stage, which simulates offsets being committed by non-existent consumers? This is genuinely confusing: you have to take into account that there are no active members in the consumer group in order to read the CURRENT-OFFSET and LAG columns correctly (they don't mean much in that case).
OK, it seems that the consumer actually does connect continuously, poll the messages and commit the offsets, but in a volatile fashion (disconnecting each time), so that kafka-consumer-groups.sh always reports as if there are no active members in the group.
It is a Flink job that behaves this way. Is that possible?
If the retention policy kicks in and deletes old messages, the lag could decrease (if fewer messages are published than deleted), since the CURRENT-OFFSET positions itself at the earliest available log.
I'd check what the retention policy for your topic is, since this may be due to deleted messages: the lag doesn't care about purged messages, only active ones.
This has nothing to do with connecting to and disconnecting from the Kafka cluster; that would be way too slow and ineffective. It has to do with the way the Flink Kafka consumer is implemented, which is described here: Flink Kafka Connector
The committed offsets are only a means to expose the consumer’s
progress for monitoring purposes.
What it basically does is this: it does not subscribe to topics the way standard consumers do, using consumer groups with their coordinator and leader mechanisms; instead it assigns partitions directly, and only commits offsets to a consumer group for monitoring purposes (although it has methods for using these offsets for continuation too, see here). That is why these groups appear to Kafka as having no active members while still getting offsets committed.
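For illustration, here is a minimal sketch with the plain Java consumer (broker address, group id and topic name are assumptions) of the difference between subscribing through the group coordinator and directly assigning partitions the way Flink does; in the assign() case the group id is only used for reporting committed offsets.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class AssignVsSubscribe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", "monitoring-group");          // with assign(), only used for committed-offset reporting
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Standard group management: the coordinator assigns partitions and the
            // group shows active members in kafka-consumer-groups.sh.
            // consumer.subscribe(List.of("topic"));

            // Flink-style direct assignment: no coordinator, no active members,
            // but offsets can still be committed under group.id for monitoring.
            consumer.assign(List.of(new TopicPartition("topic", 0)));

            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
            consumer.commitSync(); // commits under group.id even though there is no group membership
        }
    }
}
```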
I have an app that runs as several instances, and each instance needs to consume all messages from all partitions of a topic.
I have 2 strategies that I am aware of:
create a unique consumer group id for each app instance and subscribe and commit as usual,
the downside being that Kafka still needs to maintain a consumer group on behalf of each consumer.
ask Kafka for all partitions of the topic and assign the consumer to all of them. As I understand it, no consumer group is created on behalf of the consumer in Kafka in that case. So the question is whether there is still a need to commit offsets, as there is no consumer group on the Kafka side to keep up to date. The consumer was created without assigning it a 'group.id'.
When you call consumer.assign() instead of consumer.subscribe(), no group.id property is required, which means that no group is created or maintained by Kafka.
Committing offsets is basically keeping track of what has been processed so that you don't process it again. This can just as well be done manually, for example by reading the polled messages and writing the offsets to a file once the messages have been processed.
In this case, your program is responsible for writing the offsets and also for reading from the next offset upon restart using consumer.seek().
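A rough sketch of that approach, assuming the plain Java consumer and a local file as the offset store (the file path, broker address and topic name are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FileOffsetConsumer {
    public static void main(String[] args) throws IOException {
        Path offsetFile = Path.of("/tmp/topic-0.offset");   // hypothetical offset store
        TopicPartition tp = new TopicPartition("topic", 0);

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // note: no group.id, we manage offsets ourselves
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(List.of(tp));

            // On restart, continue from the offset we persisted last time (if any).
            if (Files.exists(offsetFile)) {
                long nextOffset = Long.parseLong(Files.readString(offsetFile).trim());
                consumer.seek(tp, nextOffset);
            } else {
                consumer.seekToBeginning(List.of(tp));
            }

            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    process(record);
                    // Persist the offset of the *next* message only after processing succeeded.
                    Files.writeString(offsetFile, Long.toString(record.offset() + 1));
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}
```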
The only drawback is that if you want to move your consumer from one machine to another, you would need to copy this file as well.
You can also store them in some database that is accessible from any machine in case you don't want the file to be copied (though writing to a file may be relatively simpler and faster).
On the other hand, if there is a consumer group, then as long as your consumer has access to Kafka, Kafka will let your consumer automatically resume from the last committed offset.
There will always be a consumer group setting. If you don't set it yourself, whatever consumer you're running will either use its own default or generate one.
Kafka will keep track of the offset of all consumers using the consumer group.
There is still a need to commit offsets. If no offsets are being committed, Kafka will have no idea what has been read already.
Here is the command to view all your consumer groups and their lag:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --all-groups
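As for committing: instead of relying on enable.auto.commit, you can commit offsets explicitly. A rough sketch with the plain Java consumer (group id, topic name and broker address are placeholders):

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommitExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker address
        props.put("group.id", "my-group");                  // placeholder group id
        props.put("enable.auto.commit", "false");           // we commit ourselves
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                    // Commit the offset of the next record to consume for this partition.
                    consumer.commitSync(Map.of(
                        new TopicPartition(record.topic(), record.partition()),
                        new OffsetAndMetadata(record.offset() + 1)));
                }
            }
        }
    }
}
```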
Is it possible to have multiple copies of an application listen to the same Kafka group/topic so that only one is reading it at a time, but the other ones will start working if the main one crashes/stops reading?
I need to make an application highly available but can't tolerate doubling the traffic to the data store on the other end of the application by having multiple copies actively running.
FYI - Technically I'm using MapR streams but it adheres to the Kafka API and functionality, in case anyone knows a MapR stream-specific feature that helps the situation.
It is possible. If multiple consumers are in the same consumer group, then when the group subscribes to a topic, Kafka performs partition assignment for your consumers: a partition can only be consumed by one consumer within the same group.
So you could give your topic only one partition; then only one consumer will consume messages while the others stay idle. Once that consumer shuts down, it triggers a group rebalance: Kafka performs the partition assignment again. In your case, a new consumer will then take over the work, processing messages from the last offset committed by the old consumer.
And if your use case supports parallel processing, you can run many processes (apps) doing the same work and give the topic multiple partitions. They will be assigned different partitions to consume and will process different messages, which speeds up processing and also tolerates failover. As said above, if some consumers fail, Kafka takes care of it for you: it reassigns their partitions to the remaining working consumers. So everything will be fine.
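A minimal sketch of that setup, assuming the plain Java consumer and a single-partition topic (the group id, topic name and broker address are made up): run the same program on every instance, and only one of them will hold the partition at a time.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ActiveStandbyConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", "ha-app");                    // same group id on every instance
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // With a single-partition topic, only one group member gets the partition;
            // the others keep polling but receive nothing until a rebalance hands it over.
            consumer.subscribe(List.of("single-partition-topic"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}
```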
I'm a newbie in Kafka. I had a glance at the Kafka documentation. It seems that dispatching messages to a subscribing consumer group is implemented by binding partitions to consumer instances.
One important thing we should remember when working with Apache Kafka is that the number of consumers in the same consumer group should be less than or equal to the number of partitions in the consumed topic. Otherwise, the extra consumers will not receive any messages from the topic.
In a non-prod environment, I didn't configure the topic's partitions. In that case, is there only a single partition in Kafka? And if I start multiple consumers sharing the same group and subscribe them to the topic, would the messages always be dispatched to the same instance in the group? In other words, do I have to partition the topic to get load balancing within a consumer group?
Thanks!
You are absolutely right. One partition cannot be processed in parallel (by one consumer group). You can treat a partition as atomic: it cannot be split.
If you configure the non-prod and prod environments with the same number of partitions per topic, that should help you find the correct number of consumers and catch problems before moving to prod.
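If you want to pin the partition count explicitly rather than relying on broker-side auto-creation, something along these lines with the AdminClient should do it (topic name, partition count, replication factor and broker address are placeholders):

```java
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicWithPartitions {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, replication factor 1: using the same layout in non-prod and prod
            // keeps consumer-count and rebalancing behaviour comparable across environments.
            NewTopic topic = new NewTopic("my-topic", 3, (short) 1);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```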
In the Kafka documentation:
Kafka handles this differently. Our topic is divided into a set of
totally ordered partitions, each of which is consumed by one consumer
at any given time. This means that the position of consumer in each
partition is just a single integer, the offset of the next message to
consume. This makes the state about what has been consumed very small,
just one number for each partition. This state can be periodically
checkpointed. This makes the equivalent of message acknowledgements
very cheap.
Yet, following their quick start guide in that same document, I was easily able to:
Create a topic with a single partition
Start a console-producer
Push a few messages
Start a consumer to consume --from-beginning
Start another consumer --from-beginning
And have both consumers successfully consume from the same partition.
But this seems at odds with the documentation above?
When using different consumer groups, consumers can consume the same partitions easily. You can think of group ids as different applications consuming a Kafka topic. Multiple applications might want to use the data in a Kafka topic differently and thus should not conflict with one another. That's why two consumers may consume one partition (in fact, it is the only way two consumers can consume the same partition).
And when you start a console consumer, it randomly generates a group id for it (link), so these consumers are doing exactly what I just described.
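As a sketch of that behaviour: two consumers with different group ids (placeholder names below, along with the broker address and topic) and auto.offset.reset=earliest will each read the same single-partition topic independently, much like two console consumers started with --from-beginning.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class TwoGroupsSamePartition {
    static KafkaConsumer<String, String> consumerForGroup(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", groupId);
        props.put("auto.offset.reset", "earliest");         // behaves like --from-beginning for a new group
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        return new KafkaConsumer<>(props);
    }

    public static void main(String[] args) {
        // Different group ids: each consumer tracks its own offset for the same partition.
        try (KafkaConsumer<String, String> a = consumerForGroup("group-a");
             KafkaConsumer<String, String> b = consumerForGroup("group-b")) {
            a.subscribe(List.of("topic"));
            b.subscribe(List.of("topic"));
            // Note: the first poll after subscribing may return nothing while the group is joining.
            for (ConsumerRecord<String, String> r : a.poll(Duration.ofSeconds(2))) {
                System.out.println("group-a read offset " + r.offset());
            }
            for (ConsumerRecord<String, String> r : b.poll(Duration.ofSeconds(2))) {
                System.out.println("group-b read offset " + r.offset());
            }
        }
    }
}
```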