Why do the offsets of the consumer-group (app-id) of my Kafka Streams Application get reset after application restart? - apache-kafka

I have a Kafka Streams application for which, whenever I restart it, the offsets for the topic it is consuming get reset. Hence, for all partitions, the lags increase and the app needs to reprocess all the data.
Correction: after the app restarts, the output topic receives a burst of events that were already processed. It is not the input topic offsets that get reset, as I said in the previous paragraph. However, the offsets of the internal topic (KTABLE-SUPPRESS-STATE-STORE) are getting reset, see comments below.
I have ensured the lag is 1 for every partition before the restart (this is for the output topic).
All consumers that belong to that consumer-group-id (app-id) are active.
The restart is immediate; it takes around 30 seconds.
The app is using exactly once as processing guarantee.
I have read the answer to "How does an offset expire for an Apache Kafka consumer group?".
I have tried with auto.offset.reset = latest and auto.offset.reset = earliest.
It seems like the offsets for these topics are not actually committed (but I am not sure about this).
I assume that after the restart the app should pick up from the latest committed offset for that consumer group.
I assume the same for the internal topic (KTABLE-SUPPRESS-STATE-STORE).
Does the Kafka Streams API ensure that all consumed offsets are committed before shutting down (after calling streams.close())?
I would really appreciate any clue about this.
This is the code the app executes:
final StreamsBuilder builder = new StreamsBuilder();
final KStream<..., ...> events = builder
    .stream(inputTopicNames, Consumed.with(..., ...));
events
    .filter((k, v) -> ...)
    .flatMapValues(v -> ...)
    .flatMapValues(v -> ...)
    .selectKey((k, v) -> v)
    .groupByKey(Grouped.with(..., ...))
    .reduce((agg, newValue) -> {
        return agg;
    })
    .toStream()
    .to(outPutTopicNameOfGroupedData, Produced.with(..., ...));
The offset reset happens only, and always, after a restart, and only for the KTABLE-SUPPRESS-STATE-STORE internal topic created by the Kafka Streams API.
I have tried with the Processing guarantee exactly once and at least once.
Once again, I will really appreciate any clue about this.
This has been solved in the release 2.2.1 (https://issues.apache.org/jira/browse/KAFKA-7895)

The offset reset happens only, and always, after a restart, and only for the KTABLE-SUPPRESS-STATE-STORE internal topic created by the Kafka Streams API.
This is currently (version 2.1) expected behavior, because the suppress() operator works in-memory only. Thus, on restart, the suppress buffer must be recreated from the changelog topic before processing can start.
Note, it is planned to let suppress() write to disk in future releases (cf. https://issues.apache.org/jira/browse/KAFKA-7224). This will avoid the overhead of recreating the buffer from the changelog topic.

I think @Matthias J. Sax's reply covers most of the internals of suppress. One thing I need to clarify though: when you say "restart the application", what exactly did you do? Did you shut down the whole application gracefully, and then restart it?

Commit frequency is controlled by the commit.interval.ms parameter. Check whether your offsets are actually being committed. By default, offsets are committed every 100 ms when processing.guarantee is exactly_once and every 30 seconds (30000 ms) when it is at_least_once.
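As a rough sketch of the settings involved, the snippet below builds a Streams configuration with an explicit commit interval. The keys are the standard Kafka Streams config names written as plain strings so that it compiles without the Kafka jars; the class name, application id, and broker address are placeholders, not values from the question.

```java
import java.util.Properties;

public class StreamsCommitSettings {

    // Builds a Kafka Streams configuration with an explicit commit interval.
    // StreamsConfig exposes constants for the same key names.
    static Properties build(String applicationId, String bootstrapServers, boolean exactlyOnce) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("processing.guarantee", exactlyOnce ? "exactly_once" : "at_least_once");
        // Set explicitly to mirror the defaults: 100 ms for exactly_once,
        // 30000 ms for at_least_once.
        props.setProperty("commit.interval.ms", exactlyOnce ? "100" : "30000");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values, for illustration only.
        Properties props = build("my-streams-app", "localhost:9092", true);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The resulting Properties object is what would be handed to the KafkaStreams constructor together with the topology.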


(Spring) Kafka appears to consume newly produced messages out of order

We have a Spring Boot / Spring Kafka application that is reading from a Kafka topic with a single partition. There is a single instance of the application running and it has a single-threaded KafkaMessageListenerContainer (not Concurrent). We have a single consumer group.
We want to manage offsets ourselves based on committing to a transactional database. At startup, we read initial offsets from our database and seek to that offset and begin reading older messages. (For example with an empty database, we would start at offset 0.) We do this via implementing ConsumerRebalanceListener and seek()ing in that callback. We pause() the KafkaMessageListenerContainer prior to starting it so that we don't read any messages prior to the ConsumerRebalanceListener being invoked (then we resume() the container inside the ConsumerRebalanceListener.onPartitionsAssigned() callback). We acknowledge messages manually as they are consumed.
While in the middle of reading these older messages (1000s of messages and 10s of seconds/minutes into the reading), a separate application produces messages into the same topic and partition we're reading from.
We observe that these newly produced messages are consumed immediately, intermingled with the older messages we're in the process of reading. So we observe message offsets that jump in this single consumer thread: from the basically sequential offsets of the older messages to ones that are from the new messages that were just produced, and then back to the older, sequential ones.
We don't see any errors in reading messages or anything that would trigger retries or anything like that. The reads of newer messages happen in the main thread as do the reads of older messages, so I don't believe there's another listener container running.
How could this happen? Doesn't this seem contrary to the ordering guarantees Kafka is supposed to provide? How can we prevent this behavior?
We have the following settings (some in properties, some in code, please excuse the mix):
properties.consumer.isolationLevel = KafkaProperties.IsolationLevel.READ_COMMITTED
properties.consumer.maxPollRecords = 500
containerProps.ackMode = ContainerProperties.AckMode.MANUAL
containerProps.eosMode = ContainerProperties.EOSMode.BETA
Spring Kafka 2.5.5.RELEASE
Kafka 2.5.1
(we could definitely try upgrading if there was a reason to believe this was the result of a bug that was fixed since then.)
I can share some code snippets for any of the above if it's interesting.

Kafka consumer consuming older messages on restarting

I am consuming Kafka messages from a topic, but the issue is that every time the consumer restarts it reads older processed messages.
I have used auto.offset.reset=earliest. Will setting it manually using commit async help me overcome this issue?
I see that Kafka already sets enable.auto.commit to true by default.
I have used auto.offset.reset=earliest. Will setting it manually using commit async help me overcome this issue?
When auto.offset.reset=earliest is set, the consumer reads from the earliest available offset only when the group has no committed offset for the partition; otherwise it resumes from the last committed offset. So, the first time you start your process with a new group.id and this set to earliest, it will read from the starting offset.
Here is how the issue can be debugged:
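The rule can be pictured with a small model. This is only an illustration of the decision, not Kafka's actual code; the class and method names are made up for the sketch, and it ignores the case where a committed offset has fallen out of range (which also triggers the reset policy).

```java
import java.util.OptionalLong;

public class StartingOffsetModel {

    // Simplified model of where a consumer starts reading a partition
    // after the partition has been assigned to it.
    static long startingOffset(OptionalLong committed, String autoOffsetReset,
                               long logStartOffset, long logEndOffset) {
        if (committed.isPresent()) {
            // A committed offset for the group wins; auto.offset.reset is not consulted.
            return committed.getAsLong();
        }
        switch (autoOffsetReset) {
            case "earliest":
                return logStartOffset;
            case "latest":
                return logEndOffset;
            default:
                throw new IllegalStateException(
                        "no committed offset and auto.offset.reset=" + autoOffsetReset);
        }
    }

    public static void main(String[] args) {
        // New group.id, nothing committed yet: start from the beginning of the log.
        System.out.println(startingOffset(OptionalLong.empty(), "earliest", 0L, 500L)); // 0
        // Same group.id after a restart, offset 420 was committed: resume there.
        System.out.println(startingOffset(OptionalLong.of(420L), "earliest", 0L, 500L)); // 420
    }
}
```

If a restart behaves like the first call rather than the second, the committed offset is missing for that group.id, which is what the remaining checks try to explain.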
If your consumer group.id is same across every restart, you need to check if the commit is actually happening.
Cross check if you are manually overriding enable.auto.commit to false anywhere.
Next, check the auto commit interval (auto.commit.interval.ms), which is 5 seconds by default. See whether you have changed it to something higher and whether you are restarting your process before a commit is triggered.
You can also use commitAsync() or commitSync() to trigger a commit manually. Use commitSync() (a blocking call) to test whether any exception is thrown while committing. A few possible errors during a commit are (from the docs):
CommitFailedException - when you try to commit to partitions that are no longer assigned to this consumer, for example because the consumer is no longer part of the group.
RebalanceInProgressException - if the consumer instance is in the middle of a rebalance, so it is not yet determined which partitions will be assigned to the consumer.
TimeoutException - if the timeout specified by default.api.timeout.ms expires before successful completion of the offset commit.
Apart from this..
Also check if you are doing seek() or seekToBeginning() in your consumer code anywhere. If you are doing this and calling poll() you will likely get older messages also.
If you are using Embedded Kafka for testing, the topic and the consumer groups will likely be created every time you restart your test, thereby reading from the start. Check whether this is a similar case.
Without looking into the code it is hard to tell what exactly is the error. This answer provides only an insight on debugging your scenario.
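The configuration checks above could be scripted roughly as follows. This is a sketch only: the class, the method, and the previousGroupId and expectedUptimeMs inputs are hypothetical helpers for illustration, while the property names and the fallback values (enable.auto.commit=true, auto.commit.interval.ms=5000) are the consumer's documented defaults.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Properties;

public class CommitConfigCheck {

    // Returns human-readable findings for the consumer settings discussed above.
    // previousGroupId: the group.id used by the previous run (hypothetical input).
    // expectedUptimeMs: how long the process usually runs before it is restarted.
    static List<String> findings(Properties props, String previousGroupId, long expectedUptimeMs) {
        List<String> out = new ArrayList<>();
        if (!Objects.equals(props.getProperty("group.id"), previousGroupId)) {
            out.add("group.id differs from the previous run; the consumer is treated as a new group");
        }
        boolean autoCommit = Boolean.parseBoolean(props.getProperty("enable.auto.commit", "true"));
        if (!autoCommit) {
            out.add("enable.auto.commit=false; offsets are stored only if the code commits them itself");
        } else {
            long intervalMs = Long.parseLong(props.getProperty("auto.commit.interval.ms", "5000"));
            if (intervalMs > expectedUptimeMs) {
                out.add("auto.commit.interval.ms=" + intervalMs
                        + " exceeds the expected uptime; a killed process may stop before the first auto commit");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("group.id", "my-group");
        props.setProperty("auto.commit.interval.ms", "60000");
        // Process is restarted roughly every 10 seconds in this made-up scenario.
        findings(props, "my-group", 10_000L).forEach(System.out::println);
    }
}
```

An empty result only rules out these configuration causes; a stray seek() call or a recreated embedded broker still has to be checked in the code itself.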

Apache Storm with Kafka offset management

I have built a sample topology with Storm using Kafka as a source. Here is a problem for which I need a solution.
Every time I kill a topology and start it again, the topology starts processing from the beginning.
Suppose Message A in Topic X was processed by Topology and then I kill the topology.
Now when I submit the topology again and Message A is still there in Topic X, it is processed again.
Is there a solution, maybe some sort of offset management, to handle this situation?
You shouldn't use storm-kafka for new code, it is deprecated since the underlying client API is deprecated in Kafka, and removed as of 2.0.0. Instead, use storm-kafka-client.
With storm-kafka-client you want to set a group id, and a first poll offset strategy.
KafkaSpoutConfig.builder(bootstrapServers, "your-topic")
    .setProp(ConsumerConfig.GROUP_ID_CONFIG, "kafkaSpoutTestGroup")
    .setFirstPollOffsetStrategy(FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST)
    .build();
The above will make your spout start at the earliest offset the first time you start it, and then it will pick up where it left off if you restart it. The group id is used by Kafka to recognize the spout when it restarts, so it can get the stored offset checkpoint back. Other offset strategies behave differently; you can check the javadoc for the FirstPollOffsetStrategy enum.
The spout periodically checkpoints how far it has got, and there is a setting in the config to control this. What gets checkpointed is controlled by the setProcessingGuarantee setting in the config, which can be set to at-least-once (only checkpoint acked offsets), at-most-once (checkpoint before the spout emits the message), or "any times" (checkpoint periodically, ignoring acks).
Take a look at one of the example topologies included with Storm https://github.com/apache/storm/blob/dc56e32f3dcdd9396a827a85029d60ed97474786/examples/storm-kafka-client-examples/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutTopologyMainNamedTopics.java#L93.
Make sure when creating your spoutconfig that it has a fixed spout id by which it can identify itself after a restart.
From official Storm site:
Important: When re-deploying a topology make sure that the settings
for SpoutConfig.zkRoot and SpoutConfig.id were not modified, otherwise
the spout will not be able to read its previous consumer state
information (i.e. the offsets) from ZooKeeper -- which may lead to
unexpected behavior and/or to data loss, depending on your use case.
Since I am facing a similar issue, I will take the opportunity to ask here. I have code like this:
KafkaTridentSpoutConfig.Builder kafkaSpoutConfigBuilder = KafkaTridentSpoutConfig.builder(bootstrapServers, topic);
kafkaSpoutConfigBuilder.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, fetchSizeBytes);
kafkaSpoutConfigBuilder.setProp(ConsumerConfig.GROUP_ID_CONFIG, clientId);
kafkaSpoutConfigBuilder.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
kafkaSpoutConfigBuilder.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
return new KafkaTridentSpoutOpaque(kafkaSpoutConfigBuilder.build());
But every time I restart the Storm local cluster, messages are read from the beginning. If I check the offsets directly in Kafka for that particular group, there is no lag. It is as if the offsets from Kafka are not read.
Using Kafka 2.8, Storm 2.2.0. I didn't have this issue with Storm 0.9.X.
Any idea?

How to configure kafka such that we have an option to read from the earliest, latest and also from any given offset?

I know about configuring kafka to read from earliest or latest message.
How do we include an additional option in case I need to read from a previous offset?
The reason I need to do this is that the earlier messages which were read need to be processed again due to some mistake in the processing logic earlier.
In the Java Kafka client, the consumer has methods that can be used to specify the next position to consume from.
public void seek(TopicPartition partition, long offset)
Overrides the fetch offsets that the consumer will use on the next poll(timeout). If this API is invoked for the same partition more than once, the latest offset will be used on the next poll(). Note that you may lose data if this API is arbitrarily used in the middle of consumption, to reset the fetch offsets.
This is usually enough; there are also seekToBeginning() and seekToEnd().
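One way to expose all three choices through a single option is sketched below. The option format ("earliest", "latest", or a number) and the StartPosition class are assumptions made for this example; the beginning and end offsets would come from the consumer's beginningOffsets() and endOffsets() calls.

```java
public class StartPosition {

    // Resolves a user-supplied option to a concrete offset for one partition.
    // Accepted values (a hypothetical format for this sketch):
    //   "earliest", "latest", or a non-negative number such as "1234".
    static long resolve(String option, long beginningOffset, long endOffset) {
        switch (option) {
            case "earliest":
                return beginningOffset;
            case "latest":
                return endOffset;
            default:
                long requested = Long.parseLong(option);
                // Clamp into the range that still exists in the log.
                return Math.max(beginningOffset, Math.min(requested, endOffset));
        }
    }

    public static void main(String[] args) {
        // With a real consumer, the result would be passed to
        // KafkaConsumer.seek(partition, offset) once the partition is assigned.
        System.out.println(resolve("earliest", 100L, 900L)); // 100
        System.out.println(resolve("latest", 100L, 900L));   // 900
        System.out.println(resolve("250", 100L, 900L));      // 250
        System.out.println(resolve("5", 100L, 900L));        // 100, offset 5 no longer exists
    }
}
```

Clamping is a design choice here; rejecting an out-of-range offset with an error may be preferable when reprocessing must start at an exact record.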
I'm trying to answer a similar but not quite the same question so let's see if my information may help you.
First, I have been working from this other SO question/answer
In short, you want to commit your offsets and the most common solution for that is ZooKeeper. So if your consumer encounters an error or needs to shut down, it can resume where it left off.
Myself I'm working with a high volume stream that is extremely large and my consumer (for a test) needs to start from the very tail each time. The documentation indicates I must use KafkaConsumer seek to declare my starting point.
I'll try to update my findings here once they are successful and reliable. For sure this is a solved problem.

Simple-Kafka-consumer message delivery duplication

I am trying to implement a simple Producer-->Kafka-->Consumer application in Java. I am able to produce as well as consume the messages successfully, but the problem occurs when I restart the consumer, wherein some of the already consumed messages are again getting picked up by consumer from Kafka (not all messages, but a few of the last consumed messages).
I have set autooffset.reset=largest in my consumer and my autocommit.interval.ms property is set to 1000 milliseconds.
Is this 'redelivery of some already consumed messages' a known problem, or is there any other settings that I am missing here?
Basically, is there a way to ensure none of the previously consumed messages are getting picked up/consumed by the consumer?
Kafka uses Zookeeper to store consumer offsets. Since Zookeeper operations are pretty slow, it's not advisable to commit the offset after consuming every message.
It's possible to add a shutdown hook to the consumer that will manually commit the topic offset before exit. However, this won't help in certain situations (like a JVM crash or kill -9). To guard against those situations, I'd advise implementing custom commit logic that commits the offset locally after processing each message (to a file or local database), and also commits the offset to Zookeeper every 1000 ms. Upon consumer startup, both locations should be queried, and the maximum of the two values should be used as the consumption offset.
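The startup step of that scheme might look like the following minimal sketch. The ResumeOffset class and its method are hypothetical, and how the two values are read (file, database, Zookeeper) is left out.

```java
import java.util.OptionalLong;

public class ResumeOffset {

    // Picks the offset to resume from when the offset is checkpointed in two
    // places: locally after every message, and remotely about once a second.
    static OptionalLong resumeFrom(OptionalLong localOffset, OptionalLong remoteOffset) {
        if (!localOffset.isPresent()) {
            return remoteOffset;
        }
        if (!remoteOffset.isPresent()) {
            return localOffset;
        }
        // The larger value is the most recent checkpoint that was written.
        return OptionalLong.of(Math.max(localOffset.getAsLong(), remoteOffset.getAsLong()));
    }

    public static void main(String[] args) {
        // The local checkpoint reached 1042; the periodic commit only reached 1000.
        System.out.println(resumeFrom(OptionalLong.of(1042L), OptionalLong.of(1000L)).getAsLong()); // 1042
    }
}
```

An empty result means neither location holds a checkpoint, in which case the consumer falls back to its offset reset setting.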