Kafka consumer consuming older messages on restarting - apache-kafka

I am consuming Kafka messages from a topic, but the issue is that every time the consumer restarts it reads older processed messages.
I have set auto.offset.reset=earliest. Will committing offsets manually using commitAsync help me overcome this issue?
I see that Kafka already has auto commit enabled (enable.auto.commit=true) by default.

I have used auto.offset.reset=earliest. Will setting it manually using commit async help me overcome this issue?
When auto.offset.reset=earliest is set, the consumer will read from the earliest available offset instead of the latest, but only when there is no committed offset for the group. So the first time you start your process with a new group.id and this setting, it will read from the starting offset.
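For reference, a minimal sketch of the consumer configuration involved here (broker address and topic name are placeholders):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");   // placeholder broker
    props.put("group.id", "my-group");                  // must stay the same across restarts
    props.put("enable.auto.commit", "true");            // the default
    props.put("auto.commit.interval.ms", "5000");       // the default interval (5 seconds)
    props.put("auto.offset.reset", "earliest");         // only consulted when no committed offset exists
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Collections.singletonList("my-topic"));  // placeholder topic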
Here is how the issue can be debugged:
If your consumer group.id is the same across every restart, you need to check whether the commit is actually happening.
Cross-check whether you are manually overriding enable.auto.commit to false anywhere.
Next, check the auto commit interval (auto.commit.interval.ms), which is 5 seconds by default, and see whether you have raised it and are restarting your process before the commit gets triggered.
You can also use commitAsync() or even commitSync() to trigger the commit manually. Use commitSync() (a blocking call) for testing whether there is any exception while committing; a sketch follows the list below. A few possible errors during committing are (from the docs):
CommitFailedException - thrown when you try to commit to partitions that are no longer assigned to this consumer, for example because the consumer is no longer part of the group
RebalanceInProgressException - thrown if the consumer instance is in the middle of a rebalance, so it is not yet determined which partitions will be assigned to the consumer
TimeoutException - thrown if the timeout specified by default.api.timeout.ms expires before successful completion of the offset commit
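As a sketch, with enable.auto.commit=false and a blocking commit after processing, any of the above exceptions will surface in the poll loop (process() is a hypothetical handler):

    import java.time.Duration;
    import org.apache.kafka.clients.consumer.CommitFailedException;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;

    try {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                process(record);        // hypothetical processing step
            }
            consumer.commitSync();      // blocking; throws if the commit fails
        }
    } catch (CommitFailedException e) {
        // the group rebalanced and these partitions are no longer assigned to this consumer
        e.printStackTrace();
    } finally {
        consumer.close();
    }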
Apart from this:
Also check whether you are calling seek() or seekToBeginning() anywhere in your consumer code. If you do this and then call poll(), you will likely get older messages as well.
If you are using Embedded Kafka for testing, the topic and the consumer group are likely created anew every time you restart your test, so the consumer reads from the start. Check whether that is your case.
Without looking at the code it is hard to tell what exactly the error is; this answer only provides insight into debugging your scenario.

Related

Using seek to get only uncommitted offsets from the beginning

I am using Spring Kafka and have a requirement where I have to listen to a DLQ topic and put the message onto another topic after a few minutes. Here I acknowledge a message only once it has been put onto the other topic; otherwise I do not commit it and call kafkaListenerEndpointRegistry.stop(), which stops my Kafka consumer. A scheduled cron job then runs every 3 minutes and starts the consumer via kafkaListenerEndpointRegistry.start(). Since auto.offset.reset is set to earliest, the consumer gets all messages from the previously uncommitted offset and checks their eligibility to be put onto the other topic.
This approach works fine for small volumes but for very large volumes I am not seeing the expected retries in both topics. So I suspect this might be happening because I am using kafkaListenerEndpointRegistry.stop() to stop the consumer. If I could seek to the beginning offset of each partition and get all messages from the uncommitted offset onward, I would not have to stop and start my consumer.
For this, I tried ConsumerSeekAware.onPartitionsAssigned and called callback.seekToBeginning() to reset the offsets. But it seems this also consumes all committed offsets, which puts a huge load on my services. So is there anything I am missing, or does seekToBeginning always read all messages (committed and uncommitted)?
And is there any way to trigger partition assignment manually while the Kafka consumer is running, so that it goes into the onPartitionsAssigned method?
auto.offset.reset is set to earliest, the consumer is getting all messages from the previously uncommitted offset
auto.offset.reset is meaningless if there is a committed offset; it just determines the behavior if there is no committed offset.
seekToBeginning always reads all messages (committed and uncommitted)
Kafka maintains 2 pointers - the current position and the committed offset. seek has nothing to do with the committed offset; seekToBeginning just changes the position to the earliest record, so the next poll will return all records.
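A minimal sketch of the two pointers, assuming a consumer that is already subscribed and has polled at least once (topic and partition are placeholders):

    import java.time.Duration;
    import java.util.Collections;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    TopicPartition tp = new TopicPartition("my-topic", 0);   // placeholder

    long position = consumer.position(tp);                   // where the next poll() will read from
    OffsetAndMetadata committed = consumer.committed(tp);    // last committed offset (may be null)

    consumer.seekToBeginning(Collections.singleton(tp));     // moves only the position...
    consumer.poll(Duration.ofMillis(500));                   // ...so this returns records from the
                                                             // earliest offset, committed or not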
This approach works fine for small volumes but for very large volumes I am not seeing the expected retries in both topics. So I suspect this might be happening because I am using kafkaListenerEndpointRegistry.stop() to stop the consumer.
That should not be a problem; you might want to consider using a container-stopping error handler instead: throw an exception and the container will stop itself (you should also set the stopImmediate container property). See the sketch after the docs link below.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#container-stopping-error-handlers
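A sketch of the wiring in Spring Kafka (the bean layout is illustrative, and exact classes vary by version; newer versions use the CommonErrorHandler variants):

    import org.springframework.context.annotation.Bean;
    import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
    import org.springframework.kafka.core.ConsumerFactory;
    import org.springframework.kafka.listener.ContainerStoppingErrorHandler;

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // stop the container when the listener throws, instead of calling registry.stop() yourself
        factory.setErrorHandler(new ContainerStoppingErrorHandler());
        // stop after the current record rather than finishing the rest of the polled batch
        factory.getContainerProperties().setStopImmediate(true);
        return factory;
    }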

Kafka consumer - how to recognize offset skipping/missing offsets?

Setup:
We have a Debezium/Kafka Connect setup with a Debezium Oracle producer and a Confluent JDBC consumer/sink.
Starting position / background / problem:
Due to high traffic we have decreased log.retention.minutes to 1 hour, which is suitable 99% of the time.
But in some rare cases one of the Kafka consumers slows down and can't keep up any longer. In that case messages are deleted in Kafka (due to the aforementioned retention period) before they have been picked up and handled by the consumer.
In the default config, the consumer will then skip the missing records by choosing the earliest available offset. This leads to inconsistencies on the target side.
Question:
How to handle those situations (if raising the log.retention.minutes isn't an option)?
Note: We would be fine if the consumer would just throw an exception/stop/etc. in case it can't find a message for its given offset.
What we've tried so far...
We tried setting auto.offset.reset to none for the consumer and expected the consumer to stop when it can't find an offset. In theory this should work. In practice it immediately throws an exception when the consumer gets instantiated, because there is no first/initial offset.
Final thoughts
So is there another config parameter we could use? (Something like "throw an exception if an offset is missing/skipped, but not on first start"?) Or is there a JMX metric we could monitor in case a consumer is skipping messages?
setting auto.offset.reset to none for the consumer and expected the consumer to stop in case it can't find an offset
That's what it'll do, yes.
In practice it immediately throws an exception when the consumer gets instantiated because there's no first/initial offset
You'll need to actually initialize the group first, then seek it to the earliest offset. E.g. kafka-consumer-groups --bootstrap-server <broker> --group connect-<name> --reset-offsets --to-earliest --all-topics --execute
Something like "throw an exception if an offset is missing/skipped, but not on first start"?
There's nothing to differentiate auto.offset.reset between "first" and "next" starts. But, you could create the connector with consumer.override.auto.offset.reset=earliest, then wait for it to be running, then set it back to none with a PUT /config call. Then repeat whenever it stops running again.
JMX metric we could monitor in case a consumer is skipping messages
Not that I know of; the metrics mostly report bytes processed. You'd have to additionally track how many bytes you expect it to read.
You'd need other monitoring solutions to detect log segments being deleted on the broker, and to compare those offset ranges with the offsets your consumer is currently reading. One possible approach is sketched below.
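As a rough sketch of that idea: periodically compare the group's committed offsets with the earliest offsets still available on the broker; if a committed offset has fallen behind the log start, records were deleted before they were consumed. The group id and the alert hook are placeholders, and admin/consumer are an AdminClient and any KafkaConsumer:

    import java.util.Map;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    // committed offsets of the group, via the AdminClient
    Map<TopicPartition, OffsetAndMetadata> committed =
            admin.listConsumerGroupOffsets("connect-myconnector")   // placeholder group id
                 .partitionsToOffsetAndMetadata().get();

    // earliest offsets still on the broker
    Map<TopicPartition, Long> logStarts = consumer.beginningOffsets(committed.keySet());

    for (Map.Entry<TopicPartition, OffsetAndMetadata> e : committed.entrySet()) {
        long logStart = logStarts.get(e.getKey());
        if (e.getValue().offset() < logStart) {
            // the range [committed, logStart) was deleted by retention and will be skipped
            alert(e.getKey(), e.getValue().offset(), logStart);     // hypothetical alerting hook
        }
    }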

Kafka fails to keep track of last-committed offset

Is there any known issue with the Kafka broker in managing offsets? The problem we are facing is that when we restart the Kafka consumer (i.e., an app restart), sometimes all the offsets are reset to 0.
We are completely clueless about why the consumers are not able to start from the last committed offset.
We are facing this issue in prod, wherein the whole queue of events is replayed again:
spring-boot version -- 2.2.6 release
spring-kafka - 2.3.7 release
kafka-client -2.3.1
apache-kafka - kafka_2.12-2.3.1
We have 10 topics with 50 partitions each, all belonging to the same group; we increase topic partitions and consumer count at run-time based on load.
auto-commit = false
sync commit each offset after processing
max-poll-records is set to 1
With all this config it runs as expected in the local setup; after deploying to prod we see such issues, but not at every restart.
Is there any config that I'm missing?
Completely Clueless!!!!!
Do not enable auto commit, per the suggestion in another answer; the listener container will commit the offsets more reliably and, as you say, you don't have the problem all the time.
Is it possible that you receive no records for a week?
Or, is it possible that your broker has a shorter offsets.retention.minutes property?
In Kafka 2.0 the default was changed from 1 day to 1 week. If the offsets have been removed because they expired and you restart the consumer, you'll get the behavior you observe.
You need to make sure that:
1) You are using the same Consumer Group ID
2) auto.offset.reset is set to latest
spring.kafka.consumer.group-id=your-consumer-group-id
spring.kafka.consumer.auto-offset-reset=latest
In case you are still seeing this issue, try enabling auto-commit:
spring.kafka.consumer.enable-auto-commit=true
If the issue goes away, it means that your manual commits are not working as expected.

Kafka enable.auto.commit set to false but poll still fetches "next" messages

I want to tell Kafka when my consumer has successfully processed a record, so I have turned auto-commit off by setting enable.auto.commit to false. I have two messages on a topic I am subscribed to, at offsets zero and one, and have created a consumer so that each call to poll will return at most one record (by setting max.poll.records to 1).
I now call consumer.poll(5000) and receive the first message, but I do not acknowledge it; I call neither commitSync nor commitAsync. If I call consumer.poll(5000) again, using the same consumer, I expect to get the exact same message I just read, but instead I receive the second message.
How do I get consumer.poll to keep handing out the same message until I explicitly acknowledge it?
What you described is the expected behaviour. Every time you call poll(), it will return the next messages. The offset you commit is only used when a new consumer connects, so it knows where to (re)start from.
In MessageHub, we've set session.timeout to 30 seconds, so you need to call poll() more often than that to avoid being disconnected. If your processing takes longer, I can think of 2 options:
Use Kafka 0.10.2 and set max.poll.interval.ms to tell your Kafka client to keep the session alive (without you having to call poll()) while you process the previous record. (This feature was added in 0.10.1, but we don't support that version; 0.10.2 works because it is able to work with 0.10.0 brokers.)
Use seek() to move back to the previous offset after poll(), so it keeps returning the same record; a sketch follows.
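A rough sketch of the seek() option, assuming max.poll.records=1 and enable.auto.commit=false as in the question (tryProcess() is a hypothetical handler that may fail):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.common.TopicPartition;

    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(5000);
        for (ConsumerRecord<String, String> record : records) {
            if (tryProcess(record)) {
                consumer.commitSync();   // acknowledge only after successful processing
            } else {
                // rewind the position so the next poll() returns the same record again
                consumer.seek(new TopicPartition(record.topic(), record.partition()),
                              record.offset());
            }
        }
    }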
Hope this helps!

Simple-Kafka-consumer message delivery duplication

I am trying to implement a simple Producer --> Kafka --> Consumer application in Java. I am able to produce and consume messages successfully, but the problem occurs when I restart the consumer: some of the already consumed messages are picked up again from Kafka (not all messages, but a few of the last consumed ones).
I have set autooffset.reset=largest in my consumer and my autocommit.interval.ms property is set to 1000 milliseconds.
Is this 'redelivery of some already consumed messages' a known problem, or is there any other settings that I am missing here?
Basically, is there a way to ensure none of the previously consumed messages are getting picked up/consumed by the consumer?
Kafka uses Zookeeper to store consumer offsets. Since Zookeeper operations are pretty slow, it's not advisable to commit the offset after consuming every message.
It's possible to add a shutdown hook to the consumer that manually commits the topic offset before exit. However, this won't help in certain situations (like a JVM crash or kill -9). To guard against those situations, I'd advise implementing custom commit logic that commits the offset locally after processing each message (to a file or local database) and also commits the offset to Zookeeper every 1000 ms. Upon consumer startup, both of these locations should be queried, and the maximum of the two values should be used as the consumption offset. A sketch of the shutdown-hook part follows.
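A rough sketch of the shutdown-hook part, using the modern KafkaConsumer API rather than the old Zookeeper-based consumer (topic name and saveOffsetLocally()/process() are hypothetical):

    import java.time.Duration;
    import java.util.Collections;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.errors.WakeupException;

    final KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    final Thread mainThread = Thread.currentThread();

    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        consumer.wakeup();                  // makes the blocked poll() throw WakeupException
        try {
            mainThread.join();              // wait for the poll loop to commit and close
        } catch (InterruptedException ignored) { }
    }));

    try {
        consumer.subscribe(Collections.singletonList("my-topic"));  // placeholder topic
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                process(record);            // hypothetical processing step
                saveOffsetLocally(record);  // hypothetical local store (file or local DB)
            }
        }
    } catch (WakeupException e) {
        // expected during shutdown
    } finally {
        consumer.commitSync();              // final commit before exit
        consumer.close();
    }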