Why did all the offsets disappear for consumers? - apache-kafka

I have a service with Kafka consumers. Previously, I created and closed the consumers every time after receiving records. I then changed the service to use pause / resume without closing the consumers (with ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG = false and consumer.commitSync(offsetAndMetadataMap);). The service worked great all week. After 7 days it was restarted. After the restart, all offsets had disappeared and the consumers began to receive all the old records. How could this happen? Where did the offsets go?

I guess the consumers of that consumer group were not up for the 7 days before the restart?
The internal offsets topic (__consumer_offsets), which holds the committed offsets of your groups, is a compacted topic, meaning compaction keeps only the latest value for each key.
In addition, the broker removes committed offsets that are older than the offset retention period.
That period is set by the broker property offsets.retention.minutes, and the default is 7 days:
KAFKA-3806: Increase offsets retention default to 7 days (KIP-186) #4648
It is configurable on the broker.
Offset expiration semantics has slightly changed in this version. According to the new semantics, offsets of partitions in a group will not be removed while the group is subscribed to the corresponding topic and is still active (has active consumers). If group becomes empty all its offsets will be removed after default offset retention period (or the one set by broker) has passed (unless the group becomes active again). Offsets associated with standalone (simple) consumers, that do not use Kafka group management, will be removed after default offset retention period (or the one set by broker) has passed since their last commit.
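The expiration rule quoted above can be modelled as a small pure function. This is only an illustrative sketch, not broker code: the name `offsets_expired` and its parameters are invented for this example, and it assumes the default retention of 10080 minutes.

```python
from datetime import datetime, timedelta

# Assumed default for the broker setting offsets.retention.minutes (7 days).
DEFAULT_RETENTION = timedelta(minutes=10080)


def offsets_expired(group_is_empty, reference_time, now,
                    retention=DEFAULT_RETENTION):
    """Hypothetical model of the quoted rule, not the broker's actual code.

    For a group-managed consumer, reference_time is the moment the group
    became empty. For a standalone consumer, pass group_is_empty=True and
    use the time of its last commit as reference_time.
    """
    if not group_is_empty:
        # A group with active consumers keeps its offsets.
        return False
    return now - reference_time >= retention


became_empty = datetime(2024, 1, 1)
# Active group: offsets are kept however much time has passed.
print(offsets_expired(False, became_empty, became_empty + timedelta(days=30)))
# Empty for 6 days: still inside the retention period.
print(offsets_expired(True, became_empty, became_empty + timedelta(days=6)))
# Empty for 8 days: eligible for removal.
print(offsets_expired(True, became_empty, became_empty + timedelta(days=8)))
```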

Related

Kafka retention policy on active consumer group

Does the Kafka clean up the logs only when no consumer is active on a consumer group?
When there is a lag in a partition with an active consumer, I expected the current offset (lag) to also adjust once the time set in the retention policy has passed, but it looks like the lagging records are still consumable after the retention period has passed as long as the consumer is attached to the group.
I tested with log.retention.check.interval.ms set to 1ms and log.cleanup.policy set to 'delete', along with the topic's retention.ms set to 1000ms, but the lagging records were still consumable way past the 1000ms.
When I remove the consumer and add a consumer again to the existing group, the offset gets adjusted as expected.
Does Kafka only adjust the offset when there is no active consumer?
If so, is there a way to update the current offset according to the retention policy other than removing and recreating the consumer?
Thanks in advance.
If there's an active consumer that's committing offsets back to Kafka's __consumer_offsets topic, then no, the offset information won't ever be removed, even if the original topic segments have been deleted and those offsets no longer exist in the topic. As the docs indicate, the group first needs to become inactive, and it also needs to remain inactive for the whole retention period.
offsets.retention.minutes
After a consumer group loses all its consumers (i.e. becomes empty) its offsets will be kept for this retention period before getting discarded
(emphasis added)
You can call seekToBeginning / seekToEnd if you always want to control your group's position yourself rather than rely on stored offsets.

capturing the data which is just about to get discarded from kafka topic?

Kafka topic's retention period is 7 days. But I need to push data which is expiring because of retention period to new kafka topic or some other storage.
So is there any method where I can access the data which is going to be deleted after 7 days just before it gets deleted? or way to set up some process where it will automatically push data which is going to get deleted to some place else.
Since Kafka 0.10, each message has a timestamp. Simply set up a consumer group that starts every hour, processes each topic partition from the initial offset (auto.offset.reset=earliest), and pushes to the new topic the messages whose timestamps fall in the window that is about to expire (one hour wide). The consumer group then stops and is restarted one hour later.

What consumer offset will be set if auto.offset.reset=earliest but topic has no messages

I have Kafka server version 2.4 and have set log.retention.hours=168 (so that messages in the topic get deleted after 7 days) and auto.offset.reset=earliest (so that if the consumer doesn't find a last committed offset, it processes from the beginning). Since I am using Kafka 2.4 and am not setting this property in my application, offsets.retention.minutes has its default value of 10080.
My Topic data is : 1,2,3,4,5,6,7,8,9,10
current consumer offset before shutting down consumer: 10
End offset:10
last committed offset by consumer: 10
So let's say my consumer is not running for the past 7 days and I have started the consumer on the 8th day. So my last committed offset by the consumer will get expired(due to offsets.retention.minutes=10080 property) and topic messages also will get deleted(due to log.retention.hours=168 property).
So wanted to know what consumer offset will be set by auto.offset.reset=earliest property now?
Although no data is available in the Kafka topic, your brokers still know the "next" offset within that partition. In your case the first and last offsets of this topic are both 10, even though it does not contain any data.
Therefore, as long as its committed offset 10 is still stored, your consumer will try to resume at offset 10 when started again (a committed offset marks the next record to read), independent of the consumer configuration auto.offset.reset.
Your example gets even more interesting when the topic has had offsets up to, say, 15 while the consumer was shut down after committing offset 10. Now imagine all of those offsets were removed from the topic due to the retention policy. If you start your consumer then, and only then, the consumer configuration auto.offset.reset comes into effect, as stated in the documentation:
"What to do when there is no initial offset in Kafka or if the current offset does not exist any more on the server (e.g. because that data has been deleted)"
As long as the Kafka topic is empty there is no offset "set" for the consumer. The consumer just tries to find the next available offset, either based on
the last committed offset or,
in case the last committed offset does not exist anymore, the configuration given through auto.offset.reset.
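The two cases above can be written as a short decision function. This is a simplified model for illustration only: the name `resolve_start_offset` and its arguments are invented here, and it treats a committed offset as usable when it lies between the earliest available offset and the log end offset.

```python
def resolve_start_offset(committed, log_start, log_end, reset_policy):
    """Hypothetical model of where a consumer starts reading a partition.

    committed    -- last committed offset, or None if none is stored
    log_start    -- earliest offset still present in the partition
    log_end      -- offset that the next produced record will receive
    reset_policy -- value of auto.offset.reset: "earliest" or "latest"
    """
    if committed is not None and log_start <= committed <= log_end:
        return committed  # auto.offset.reset is not consulted
    if reset_policy == "earliest":
        return log_start
    if reset_policy == "latest":
        return log_end
    raise ValueError(f"unsupported reset policy: {reset_policy}")


# Empty topic whose first and last offsets are both 10, committed offset 10.
print(resolve_start_offset(10, 10, 10, "earliest"))    # 10
# Committed offset 10, but everything below offset 15 has been deleted.
print(resolve_start_offset(10, 15, 15, "earliest"))    # 15
# Committed offset expired, records 10..19 still in the topic.
print(resolve_start_offset(None, 10, 20, "earliest"))  # 10
print(resolve_start_offset(None, 10, 20, "latest"))    # 20
```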
Just as an additional note: even though the messages seem to have been cleaned up by the retention policy, you may still see some data in the topic, as explained in "Data still remains in Kafka topic even after retention time/size".
Once the consumer group's offsets have been deleted, auto.offset.reset takes precedence and consumers will start consuming data from the beginning.
My Topic data is : 1,2,3,4,5,6,7,8,9,10
If the topic has the above data, the consumer will start from beginning, and all 1 to 10 records will be consumed
My Topic data is : 11,12,13,14,15,16,17,18,19,20
In this case if old data is purged due to retention, the consumer will reset the offset to earliest (earliest offset available at that time) and start consuming from there, for example in this scenario it will consume all from 11 to 20 (since 1 to 10 are purged)

Is it possible to lose Consumer group Offsets by Kafka brokers?

I was consuming from a Kafka topic (with a retention of 21 days) in a Kafka cluster as a consumer (earliest / from beginning) for 15 days continuously with consumer group x. On the 15th day the producer's team stopped producing, and I stopped the consumer on my side after confirming that no messages were left to consume. Then the Kafka cluster was also turned off. On the 16th day the Kafka cluster was turned on again, and on the 23rd day the producer team started their producer and I started my consumer as well. But when I started, I was getting messages from the beginning, not from where I left off, even though I am consuming with the same consumer group x. So my question is: why did this happen? Did the Kafka broker lose the information about the consumer group?
When a consumer group loses all its consumers, its offsets will be kept for the period configured in the broker property offsets.retention.minutes. This property defaults to 10080, which is the equivalent of 7 days, roughly the time between when your consumer stopped (the 15th day) and when it was resumed (the 23rd day).
You can tweak this property to increase the retention period of offsets. Alternatively, you can also tweak the property offsets.retention.check.interval.ms that dictates how often a check for stale offsets will occur.
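As a quick sanity check of the numbers, the arithmetic can be spelled out as follows. The day numbers are taken from the question, and the retention value is the default mentioned above.

```python
from datetime import timedelta

retention = timedelta(minutes=10080)  # default offsets.retention.minutes
stopped_day, resumed_day = 15, 23     # day numbers from the question
downtime = timedelta(days=resumed_day - stopped_day)

print(retention == timedelta(days=7))  # True
print(downtime > retention)            # True: the group was empty for too long
```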

Missing events on previously empty partition when restarting kafka streams app

I have a strange issue that I cannot understand how I can resolve. I have a kafka streams app (2.1.0) that reads from a topic with around 40 partitions. The partitions are using a range partition policy so at the moment some of them can be completely empty.
My issue is that during the downtime of the app one of those empty partitions was activated and a number of events were written to it. When the app was restored though, it read all the events from other partitions but it ignored the events already stored to the previous empty partition (the app has OffsetResetPolicy LATEST for the specific topic). On top of that when newer messages arrived to the specific partition it did consume them and somehow bypassed the previous ones.
My assumption is that __consumer_offsets does not have any entry for the specified partition when restoring, but how can I avoid this situation without losing events? I mean, the topic already exists with the specified number of partitions.
Does this sound familiar to anybody ? Am I missing something, do I need to set some parameter to kafka because I cannot figure out why this is happening ?
This is expected behaviour.
Your empty partition does not have a committed offset in __consumer_offsets. If there are no committed offsets for a partition, the offset policy specified in auto.offset.reset is used to decide at which offset to start consuming the events.
If auto.offset.reset is set to LATEST, your Streams app will only start consuming at the latest offset in the partition, i.e., after the events that were added during downtime and it will only consume events that were written to the partition after downtime.
If auto.offset.reset is set to EARLIEST, your Streams app will start from the earliest offset in the partition and read also the events written to the partition during downtime.
As #mazaneica mentioned in a comment to your question, auto.offset.reset only affects partitions without a committed offset. So your non-empty partitions will be fine, i.e., the Streams app will consume events from where it stopped before the downtime.
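To make the effect on the previously empty partition concrete, here is a small sketch that counts how many records a restarted app would never read from one partition. The function `missed_events` and the offset numbers are hypothetical, and the model assumes that a committed offset, when present, is still valid.

```python
def missed_events(committed, earliest, latest, policy):
    """Count records a restarted app never reads from one partition.

    committed -- committed offset for the partition, or None if there is none
    earliest  -- first offset available in the partition
    latest    -- log end offset at restart time
    policy    -- reset policy: "earliest" or "latest"
    """
    if committed is not None:
        return 0  # the app resumes where it stopped; the policy is ignored
    start = earliest if policy == "earliest" else latest
    return start - earliest


# Partition with a committed offset: nothing is skipped under either policy.
print(missed_events(120, 0, 150, "latest"))    # 0
# Previously empty partition that received 25 records during downtime.
print(missed_events(None, 0, 25, "latest"))    # 25
print(missed_events(None, 0, 25, "earliest"))  # 0
```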