I'm trying to implement a Kafka consumer in Java.
Assume that the consumer contains some message-processing logic that may throw an exception. In that case the consumer should sleep for some time and reprocess the last message.
My idea was to use manual offset management: an offset is not committed on fail, thus the consumer presumably will read from the old offset.
During testing I found out that a message is actually read only once, despite the fact that its offset is not committed. The last committed offset is only considered on application restart.
My questions are:
Am I doing the right thing?
What are the use cases for manual offset management?
KafkaConsumer keeps the latest offsets in memory. Thus, if an exception occurs (and you recover from it) and you want to read a message a second time, you need to use seek() before polling a second time.
Committing offsets is "only" there to preserve the offsets when the client is shut down or crashes (i.e., offsets stored reliably vs. in memory). On client startup, the latest committed offsets are fetched, and afterwards the client only uses its own in-memory offsets.
Manual offset management is useful if you want to "bundle" offset commits with some other action (e.g., a second "commit" in another system that must be in sync with the committed Kafka offsets).
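A minimal sketch of the retry pattern described above (the topic name, group id, and backoff duration are placeholder assumptions; auto-commit is disabled so an offset is only committed after successful processing, and seek() rewinds the in-memory position on failure):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RetryingConsumer {

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "retry-group");               // placeholder
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // placeholder
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    try {
                        process(record);          // your message-processing logic
                        consumer.commitSync();    // commit only after success
                    } catch (Exception e) {
                        // rewind the in-memory position so the next poll() re-reads this record
                        consumer.seek(new TopicPartition(record.topic(), record.partition()),
                                record.offset());
                        Thread.sleep(5000);       // back off before retrying
                        break;                    // leave the loop and poll() again
                    }
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // hypothetical processing that may throw
    }
}
```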
Related
I am using Spring Kafka and have a requirement where I have to listen to a DLQ topic and put the message onto another topic after a few minutes. Here I only acknowledge a message when it has been put onto the other topic; otherwise I do not commit it and call kafkaListenerEndpointRegistry.stop(), which stops my Kafka consumer. Then a scheduled cron job runs every 3 minutes and starts the consumer via kafkaListenerEndpointRegistry.start(), and since auto.offset.reset is set to earliest, the consumer gets all messages from the previously uncommitted offset and checks their eligibility to be put onto the other topic.
This approach works fine for small volumes, but for very large volumes I am not seeing the expected retries in both topics. So I suspect this might be happening because I am using kafkaListenerEndpointRegistry.stop() to stop the consumer. If I could seek to the beginning offset of each partition and get all messages from the uncommitted offset, then I wouldn't have to stop and start my consumer.
For this, I tried ConsumerSeekAware.onPartitionsAssigned and called callback.seekToBeginning() to reset the offsets. But it looks like it's also consuming all committed offsets, which puts a huge load on my services. So is there anything I am missing, or does seekToBeginning always read all messages (committed and uncommitted)?
And is there any way to trigger partition assignment manually while the Kafka consumer is running, so that it goes into the onPartitionsAssigned method?
auto.offset.reset is set to earliest then consumer is getting all msgs from previously uncommitted
auto.offset.reset is meaningless if there is a committed offset; it just determines the behavior if there is no committed offset.
seekToBeginning always read all msgs(committed and uncommitted).
Kafka maintains two pointers - the current position and the committed offset; seek has nothing to do with the committed offset. seekToBeginning just changes the position to the earliest record, so the next poll will return all records.
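The independence of the two pointers can be illustrated with a short sketch (assuming `consumer` is a `KafkaConsumer` already assigned the partition; the topic name is a placeholder):

```java
// imports assumed: org.apache.kafka.clients.consumer.*, org.apache.kafka.common.TopicPartition,
// java.util.Collections
TopicPartition tp = new TopicPartition("my-topic", 0);     // placeholder topic
consumer.seekToBeginning(Collections.singleton(tp));       // moves only the position pointer
long position = consumer.position(tp);                     // now the earliest available offset
OffsetAndMetadata committed = consumer.committed(tp);      // the committed offset is unchanged
// the next poll() starts from `position`, regardless of what `committed` says
```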
This approach is working fine for small volume but for very large volume I am not seeing the expected retries in both topics. So I am suspecting that this might be happening because I am using kafkaListenerEndpointRegistry.stop() to stop the consumer.
That should not be a problem; you might want to consider using a container stopping error handler instead; then throw an exception and the container will stop itself (you should also set the stopImmediate container property).
https://docs.spring.io/spring-kafka/docs/current/reference/html/#container-stopping-error-handlers
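A sketch of the container-stopping approach (the factory bean shape is an assumption to adapt to your configuration class; in newer Spring Kafka releases `setErrorHandler` is superseded by `setCommonErrorHandler`, so check your version):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.ContainerStoppingErrorHandler;

public class KafkaConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // when the listener throws, the container stops itself; no registry.stop() needed
        factory.setErrorHandler(new ContainerStoppingErrorHandler());
        // stop before dispatching the remaining records from the current poll
        factory.getContainerProperties().setStopImmediate(true);
        return factory;
    }
}
```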
I use Spring Kafka to insert messages into a database, processing them as a batch with a ConcurrentKafkaListenerContainerFactory container, and when an error occurs:
For a bad message, I send that message to another topic.
If the connection failed or timed out, I roll back both the database transaction and the producer transaction to prevent a false positive.
And I don't understand the assignmentCommitOption option: how does it work, and what is the difference between ALWAYS, NEVER, LATEST_ONLY, and LATEST_ONLY_NO_TX?
If there is no current committed offset for a partition that is assigned, this option controls whether or not to commit an initial offset during the assignment.
It is really only useful when using auto.offset.reset=latest.
Consider this scenario.
Application comes up and is assigned a "new" partition; the consumer will be positioned at the end of the topic.
No records are received from that topic/partition and the application is stopped.
A record is then published to the topic/partition and the consumer application restarted.
Since there is still no committed offset for the partition, it will again be positioned at the end and we won't receive the published record.
This may be what you want, but it may not be.
Setting the option to ALWAYS, LATEST_ONLY, or LATEST_ONLY_NO_TX (default) will cause the initial position to be committed during assignment so the published record will be received.
The _NO_TX variant commits the offset via the Consumer, the other one commits it via a transactional producer.
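Setting the option is a one-liner on the container properties (assuming an existing `ConcurrentKafkaListenerContainerFactory` named `factory`):

```java
// imports assumed: org.springframework.kafka.listener.ContainerProperties
factory.getContainerProperties().setAssignmentCommitOption(
        ContainerProperties.AssignmentCommitOption.ALWAYS);
// as described above, this mostly matters in combination with auto.offset.reset=latest
```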
I am going through the definitive guide and come across this phrase
If the committed offset is larger than the offset of the last message
the client actually processed, all messages between the last processed
offset and the committed offset will be missed by the consumer group.
Wondering when this happens? I understood the other use case, where the committed offset is smaller: the consumer would have gone down, leaving it unable to commit the latest offset.
This doesn't happen during normal consumption of messages.
I just checked the book, and a few pages later it explains how to commit a provided, arbitrary offset.
The reason it's mentioned there is to make the point that the client can jump back and forth. I cannot think of a good reason this would be used other than having to replay some messages due to an error, so an app could jump back and forth in the stream, replaying a few messages here and there.
This is a possibility when auto-commit is enabled and the actual processing of the messages (e.g., a DB save, calls to downstream systems, etc.) takes longer than the auto-commit interval (the value set for auto.commit.interval.ms). In the happy-path scenario this won't be a problem, but in an error case you may want to poll the same messages again. For this reason I disabled auto-commit for this use case in my project and manually commit the offset once the "actual processing" is done.
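A sketch of that manual-commit arrangement inside the poll loop, with `enable.auto.commit=false` (`saveToDatabase` is a hypothetical stand-in for the slow "actual processing"):

```java
// imports assumed: org.apache.kafka.clients.consumer.*, java.time.Duration
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
    saveToDatabase(record);   // hypothetical slow processing step
}
consumer.commitSync();        // commit only after every record in the batch is processed
```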
We have topics with retention set as 7 days (168 hours). Messages are consumed in real-time as and when the producer sends the message. Everything is working as expected. However recently on a production server, Devops changed the time zone from PST to EST accidentally as part of OS patch.
After the Kafka server restart, we saw a few (not all of them, but random) old messages being consumed by the consumers. We asked DevOps to change it back to PST and restart. The old messages reappeared this weekend as well.
We have not seen this problem in lower environments (Dev, QA, Stage etc).
Kafka version: kafka_2.12-0.11.0.2
Any help is highly appreciated.
Adding more info... Recently our CentOS had a patch update and somehow the admins changed the timezone from PST to EST and started the Kafka servers... After that, our consumers started seeing messages from offset 0. After debugging, I found the timezone change, and the admins changed back from EST to PST after 4 days. Our message producers were sending messages regularly before and after the timezone changes. After the timezone change from EST back to PST, the Kafka servers were restarted and I am seeing the warning below.
This log happened when we changed back from EST to PST (server.log):
[2018-06-13 18:36:34,430] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/app/kafka_2.12-0.11.0.2/data/__consumer_offsets-21/00000000000000002076.index) has non-zero size but the last offset is 2076 which is no larger than the base offset 2076.}. deleting /app/kafka_2.12-0.11.0.2/data/__consumer_offsets-21/00000000000000002076.timeindex, /app/kafka_2.12-0.11.0.2/data/__consumer_offsets-21/00000000000000002076.index, and /app/kafka_2.12-0.11.0.2/data/__consumer_offsets-21/00000000000000002076.txnindex and rebuilding index... (kafka.log.Log)
We restarted the consumers 3 days after the timezone change back from EST to PST and started seeing consumer messages from offset 0 again.
As of Kafka v2.3.0
You can set
"enable.auto.commit" : "true",// default is true as well
"auto.commit.interval.ms" : "1000"
This means that every second the consumer commits its offset to Kafka, or every time data is fetched from the specified topic it commits the latest offset.
So once your Kafka consumer has started and 1 second has elapsed, it will never re-read messages that were already received by the consumer and committed. This setting does not require the Kafka server to be restarted.
I think this is because you restart the program before you commit new offsets.
Managing offsets
For each consumer group, Kafka maintains the committed offset for each partition being consumed. When a consumer processes a message, it doesn't remove it from the partition. Instead, it just updates its current offset using a process called committing the offset.
If a consumer fails after processing a message but before committing its offset, the committed offset information will not reflect the processing of the message. This means that the message will be processed again by the next consumer in that group to be assigned the partition.
Committing offsets automatically
The easiest way to commit offsets is to let the Kafka consumer do it automatically. This is simple, but it gives less control than committing manually. By default, a consumer automatically commits offsets every 5 seconds, regardless of the progress the consumer is making towards processing the messages. In addition, when the consumer calls poll(), the latest offset returned from the previous call to poll() is committed (because it's probably been processed).
If the committed offset overtakes the processing of the messages and there is a consumer failure, it's possible that some messages might not be processed. This is because processing restarts at the committed offset, which is later than the last message to be processed before the failure. For this reason, if reliability is more important than simplicity, it's usually best to commit offsets manually.
Committing offsets manually
If enable.auto.commit is set to false, the consumer commits its offsets manually. It can do this either synchronously or asynchronously. A common pattern is to commit the offset of the latest processed message based on a periodic timer. This pattern means that every message is processed at least once, but the committed offset never overtakes the progress of messages that are actively being processed. The frequency of the periodic timer controls the number of messages that can be reprocessed following a consumer failure. Messages are retrieved again from the last saved committed offset when the application restarts or when the group rebalances.
The committed offset is the offset of the messages from which processing is resumed. This is usually the offset of the most recently processed message plus one.
From this article, which I think is very helpful.
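The "plus one" convention from the excerpt looks like this when committing per partition (the consumer calls are the real Kafka API; `record` stands for a message you have just finished processing):

```java
// imports assumed: org.apache.kafka.clients.consumer.*, org.apache.kafka.common.TopicPartition,
// java.util.Collections, java.util.Map
Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
        new TopicPartition(record.topic(), record.partition()),
        new OffsetAndMetadata(record.offset() + 1)); // resume AFTER the processed message
consumer.commitSync(offsets);
```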
I am trying to implement a simple Producer-->Kafka-->Consumer application in Java. I am able to produce and consume messages successfully, but the problem occurs when I restart the consumer: some of the already-consumed messages are picked up again by the consumer from Kafka (not all messages, but a few of the last ones consumed).
I have set autooffset.reset=largest in my consumer and my autocommit.interval.ms property is set to 1000 milliseconds.
Is this 'redelivery of some already consumed messages' a known problem, or is there any other settings that I am missing here?
Basically, is there a way to ensure none of the previously consumed messages are getting picked up/consumed by the consumer?
Kafka uses Zookeeper to store consumer offsets. Since Zookeeper operations are pretty slow, it's not advisable to commit the offset after consuming every message.
It's possible to add a shutdown hook to the consumer that manually commits the topic offset before exit. However, this won't help in certain situations (like a JVM crash or kill -9). To guard against those situations, I'd advise implementing custom commit logic that commits the offset locally after processing each message (to a file or local database), and also commits the offset to Zookeeper every 1000 ms. Upon consumer startup, both locations should be queried, and the maximum of the two values should be used as the consumption offset.
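The shutdown-hook part can be sketched as follows for the old high-level consumer this answer refers to (assuming `connector` is an existing `kafka.javaapi.consumer.ConsumerConnector`; as noted, this covers clean shutdowns only):

```java
// imports assumed: kafka.javaapi.consumer.ConsumerConnector
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    connector.commitOffsets();  // flush the in-memory offsets to Zookeeper before exit
    connector.shutdown();       // release the consumer's resources
}));
```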