Kafka Streams commits offset when producer throws an exception - apache-kafka

In my Kafka streams application I have a single processor that is scheduled to produce output messages every 60 seconds. Output message is built from messages that come from a single input topic. Sometimes it happens that the output message is bigger than the configured limit on broker (1MB by default). An exception is thrown and the application shuts down. Commit interval is set to default (60s).
In such case I would expect that on the next run all messages that were consumed during those 60s preceding the crash would be re-consumed. But in reality the offset of those messages is committed and the messages are not processed again on the next run.
Reading answers to similar questions it seems to me that the offset should not be committed. When I increase commit interval to 120s (processor still punctuates every 60s) then it works as expected and the offset is not committed.
I am using default processing guarantee but I have also tried exactly_once. Both have the same result. Calling context.commit() from processor seems to have no effect on the issue.
Am I doing something wrong here?

The contract of a Processor in Kafka Streams is, that you have fully processed an input record and forward() all corresponding output messages before process() return. -- This contract implies that Kafka Streams is allowed to commit the corresponding offset after process() returns.
It seem you "buffer" messages within process() in-memory to emit them later. This violated this contract. If you want to "buffer" messages, you should attach a state store to the Processor and put all those messages into the store (cf https://kafka.apache.org/25/documentation/streams/developer-guide/processor-api.html#state-stores). The store is managed by Kafka Streams for you and it's fault-tolerant. This way, after an error the state will be recovered and you don't loose any data (even if the input messages are not reprocessed).
I doubt that setting the commit interval to 120 seconds actually works as expected for all cases, because there is no alignment between when a commit happens and when punctuation is called.

Some of this will depend on the client you are using and whether it's based on librdkafka.
Some of the answer will also depend on how you are "looping" over the "poll" method. A typical example will look like the code under "Automatic Offset Committing" at https://kafka.apache.org/23/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
But this assumes quite a rapid poll loop (100ms + processing time) and a auto.commit.timeout.ms at 1000ms (the default is usually 5000ms).
If I read your question correctly, you seem to consuming messages once per 60 seconds?
Something to be aware of is that the behavior of kafka client is quite tied to how frequently poll is called (some libraries will wrap poll inside something like a "Consume" method). Calling poll frequently is important in order to appear "alive" to the broker. You will get other exceptions if you do not poll at least every max.poll.interval.ms (default 5min). It can lead to clients being kicked out of their consumer groups.
anyway, to the point... auto.commit.interval.ms is just a maximum. If a message has been accepted/acknowledged or StoreOffset has been used, then, on poll, the client can decide to update the offset on the broker. Maybe due to client side buffer size being hit or some other semantic.
Another thing to look at (esp if using a librdkafka based client. others have something similar) is enable.auto.offset.store (default true) this will "Automatically store offset of last message provided to application" so every time you poll/consume a message from the client it will StoreOffset. If you also use auto.commit then your offset may move in ways you might not expect.
See https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md for the full set of config for librdkafka.
There are many/many ways of consuming/acknowledging. I think for your case, the comment for max.poll.interval.ms on the config page might be relevant.
"
Note: It is recommended to set enable.auto.offset.store=false for long-time processing applications and then explicitly store offsets (using offsets_store()) after message processing
"
Sorry that this "answer" is a bit long winded. I hope there are some threads for you to pull on.

Related

Reconsume Kafka Message that failed during processing due to DB error

I am new to Kafka and would like to seek advice on what is the best practice to handle such scenario.
Scenario:
I have a spring boot application that has a consumer method that is listening for messages via the #KafkaListner annotation. Once an incoming message has occurred, the consumer method will process the message, which simply performs database updates to different tables via JdbcTemplate.
If the updates to the tables are successful, I will manually commit the message by calling the acknowledge() method. If the database update fails, instead of calling the acknowledge() method, I will call the nack() method with a given duration (E.g. 10 seconds) such that the message will reappear again to be consumed.
Things to note
I am not concerned with the ordering of the messages. Whatever event comes I just have to consume and process it, that's all.
I am only given a topic (no retryable topic and no dead letter topic)
Here is the problem
If I do the above method, my consumer becomes inconsistent. Let's say if I call the nack() method with a duration of 1min, meaning to say after 1 min, the same message will reappear.
Within this 1 min, there could "x" number of incoming messages to be consumed and processed. The observation made was none of these messages are getting consumed and processed.
What I want to know
Hence, I hope someone will advise me what I am doing wrongly and what is the best practice / way to handle such scenarios.
Thanks!
Records are always received in order; there is no way to defer the current record until later, but continue to process other records after this one when consuming from a single topic.
Kafka topics are a linear log and not a queue.
You would need to send it to another topic; the #RetryableTopic (non-blocking retrties) feature is specifically designed for this use case.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#retry-topic
You could also increase the container concurrency so at least you could continue to process records from other partitions.

Kafka producer resilience config: Fail but never block

I am currently learning some Kafka best practices from netflix (https://www.slideshare.net/wangxia5/netflix-kafka). It is a very good slide. However, I really dont understand one of the slides (slide 18) mentioned about producer resilience configuration, I hope someone in stackoverflow is very kind to give me insight for that (Cant find the video or reach out the author...).
The slide mentioned: Fail but never block in producer resilience configuration.
Block.on.buffer.full=false
Even thought this is the deprecated configuration, I guess the idea is to let producer fail right away rather than block to wait. In the latest kafka configuration, I can use a small value for block.max.ms to fail the producer to sends message rather than blocking it.
Question 1: Why we want to fail it right away, does it means retry later on rather than block it ?
Handle Potential Block for first meta data request
Question 2: I can understand the meta data in the consumer side. i.e registering consumer group and sort of stuff, but what is meta data request for producer point of view ? and is it potentially blocked ? Is there any kafka documentation to describe that
Periodically check whether Kafka producer was open successfully
Question 3: Is there a way we can check that and what benefits for that check ?
Thanks in advance :)
You have to keep in mind how a kafka producer works:
From the API-Documentation:
The producer consists of a pool of buffer space that holds records
that haven't yet been transmitted to the server as well as a
background I/O thread that is responsible for turning these records
into requests and transmitting them to the cluster.
If you call the send method to send a record to the broker, this message will be added to an internal buffer (the size of this buffer can be configured using the buffer.memory configuration property). Now different things can happen:
Happy path: The messages from the buffer will get converted into requests to the broker by the background I/O thread, the broker will ACK this messages and everything will be fine.
The messages can not be send to the kafka broker (connection to broker is broken, you are producing messages faster than they can send out, etc.). In this case it is up to you to decide what to do. Setting the max.block.ms (as an replacement for block.on.buffer.full) to a positive value the send message will block for this amount of time(1) and through a timeout exception afterwards.
Regarding your questions:
(1) If I got the slides right, Netflix explicitly wants to throw away the messages which they can't send to the broker (instead of blocking, retrying, failing ...). This of course highly depends on your application and the kind of messages you are dealing with. If it "just log messages" it might be no big deal. If it comes to financial transactions you may want to
(2) The producer needs some metadata about the cluster. E.g. it needs to know which key goes to which partition. There is a good blogpost by hortonworks how the producer works internaly. I think it is worth reading: https://community.hortonworks.com/articles/72429/how-kafka-producer-work-internally.html
Furthermore the statement:
Handle Potential Block for first meta data request
points to an issues which is as far as I know still around. The very first call of send will do a sync. metadata request to the broker and therefor may take longer.
(3) Connections to the producers are closed by the broker if the producer is idle for some time (see connections.max.idle.ms). I am not aware of some standard way to keep the connection of your consumer alive or even to check if the connection is still alive. What you could do is peridicaly send a metadatarequest to the broker (producer.partitionsFor(anyTopic)). But again maybe this is not an issue for your application.
(1) When it comes to details what is taken into account to calculate the time passed it get's a bit tricky. For max.block.ms it is actually:
metadata fetch time
buffer full block time
serialization time (customized serializer)
partitioning time (customized partitioner)

Kafka only once consumption guarantee

I see in some answers around stack-overflow and in general in the web the idea that Kafka does not support consumption acknowledge or that exactly once consumption is hard to achieve.
In the following entry as a sample
Is there any reason to use RabbitMQ over Kafka?, I can read the following statements:
RabbitMQ will keep all states about consumed/acknowledged/unacknowledged messages while Kafka doesn't
or
Exactly once guarantees are hard to get with Kafka.
This is not what I understand by reading the official Kafka documentation at:
https://kafka.apache.org/documentation/#design_consumerposition
The previous documentation states that Kafka does not use a traditional acknowledge implementation (as RabbitMQ). Instead they rely on the relationship partition-consumer and offset...
This makes the equivalent of message acknowledgements very cheap
Could somebody please explain why "only once consumption guarantee" in Kafka is difficult to achieve? and How this differs from Kafka vs other more traditional Message Broker as RabbitMQ? What am I missing?
If you mean exactly once the problem is like this.
Kafka consumer as you may know use a polling mechanism, that is consumers ask the server for messages. Also, you need to recall that the consumer commit message offsets, that is, it tells the cluster what is the next expected offset. So, imagine what could happen.
Consumer poll for messages and get message with offset = 1.
A) If consumer commit that offset immediately before processing the message, then it can crash and will never receive that message again because it was already committed, on next poll Kafka will return message with offset = 2. This is what they call at most once semantic.
B) If consumer process the message first and then commit the offset, what could happen is that after processing the message but before committing, the consumer crashes, so in that case next poll will get again the same message with offset = 1 and that message will be processed twice. This is what they call at least once.
In order to achieve exactly once, you need to process the message and commit that offset in an atomic operation, where you always do both or none of them. This is not so easy. One way to do this (if possible) is to store the result of the processing along with the offset of the message that generated that result. Then, when consumer starts it looks for the last processed offset outside Kafka and seek to that offset.

Multithreaded Kafka Consumer or PerPartition-PerConsumer

What should be the better approach while implementing kafka consumer.
Objective is read from Kafka and write back to db. Millions of Rows
Approach 1 :
Per Partition - Per Consumer - Wait for message to consume(i.e. written back to db) then proceed to next in polling loop.
Approach 2 :
Per Partition - Per Consumer - Send Record to worker thread or threadpool to be written back to db and later on commit the offset and keep on polling. Offset Management needs to be taken taken care. In this don't wait for message to written back to DB. Just keep on polling, pass the message to worker thread.
Any insights on both of them ?
Thanks
Approach 1:
The approach is applicable only if it is possible for you to estimate the message processing time otherwise it is not recommended.
Problem: In this approach the main problem is keeping the consumer alive, If you will wait for the messages to be completely processed before calling the poll() again, you have to make sure that your consumer should be alive until it calls poll() because kafka maintains a property named "session.timeout.ms". The kafka broker/cluster takes it action on the value of this property, if consumer is unable to call poll() again with in the time period of "session.timeout.ms", broker will mark consumer dead and it will be kicked out. Now, when consumer will finish the message processing and will call poll() again, it is considered as a new joiner and will again give the set of records starting from the offset as it was before. Keeping this scenario in mind, consumer will be stuck in an infinite loop where it will never proceed its offset.
Possible solution 1: To use this approach you need a good value of following property "session.timeout.ms" with the following side effects:
1: Value too low: Consumer will be marked dead as described above and will never proceed its offset, however messages will be processed but every time it finish the messages it will get the previous messages + new messages again.
2: Value too high: Broker will be very late in detecting the genuine failure of consumer that will result in record duplication and will effect the overall throughput.
Possible Solution 2: (Only valid for version 0.10.1.x) Official fix by Kafka in release (0.10.1.0).
In this approach, two notable entities are introduced: a new property "max.poll.interval.ms" that sets the maximum delay between client calls to poll() and a background thread that is responsible for keeping the consumer alive. So, in a scenario, when consumer calls a method poll() and then gets busy in message processing , the internal background thread will keep the heart beat alive and as a result consumer will stay alive. However, this internal background thread will itself remain alive until the timeout value for the property “max.poll.interval.ms” remains valid. So, this thread will wait for the consumer to call poll() with in the time period value of “max.poll.interval.ms” if not, it will send a leave request and will die itself as well."
Again the tricky part in this solution is to find a suitable value of this property: "max.poll.interval.ms" (very important, This time will be the time for which background thread will keep the heartbeat alive without the need of explicit calling poll()).
Approach 2: Using a worker thread is a good idea but then you have to maintain an internal queue or validation for received messages which can be complex and also you need to use manual commits against auto commits. For more information about commits see this and search heading "Commits and Offsets".
Problem: In this approach the main problem is to keep track of messages received and messages processed successfully. As, your consumer will receive the message it will pass message to respective worker thread and will commit the offset and move forward to receive more messages. During this process you have to take care of following issues:
What if the message is received and offset committed but later for whatever reason the worker thread failed to process the message, now how to get that message again ?
What if messages are received by consumer but there are no free worker threads to process ?
Solution: There can be different ways to resolve the above issues and one way is to use the internal queue to keep the messages and manual commits that will be sent only when worker thread will report the successful processing of the message. However a very careful implementation is required because it can leads to complex code and can also results in memory management or threading issues.
Suggestion: Depending upon your requirements, you can use one approach or the other with implementing fixed for the possible issues as described above. However I would recommend a more robust solution will be to use partition pause/resume. In very abstract way your consumer should do following steps:
1: poll () for messages.
2: Pause all the respective topics/partitions.
3: Assigned messages to worker threads and wait for their processing.
4: Keep calling poll() but as the partitions are paused there will be no extra message received while consumer will be kept alive. (Make sure no new topic is registered during this point)
5: If all worker threads should report message processing success/failure then commit the offsets accordingly.
6: Resume all the partitions.
Note: There can be better ways or other solutions possible depending upon your scenario and requirements. It's just an idea or one of the possible solutions.

Simple-Kafka-consumer message delivery duplication

I am trying to implement a simple Producer-->Kafka-->Consumer application in Java. I am able to produce as well as consume the messages successfully, but the problem occurs when I restart the consumer, wherein some of the already consumed messages are again getting picked up by consumer from Kafka (not all messages, but a few of the last consumed messages).
I have set autooffset.reset=largest in my consumer and my autocommit.interval.ms property is set to 1000 milliseconds.
Is this 'redelivery of some already consumed messages' a known problem, or is there any other settings that I am missing here?
Basically, is there a way to ensure none of the previously consumed messages are getting picked up/consumed by the consumer?
Kafka uses Zookeeper to store consumer offsets. Since Zookeeper operations are pretty slow, it's not advisable to commit offset after consumption of every message.
It's possible to add shutdown hook to consumer that will manually commit topic offset before exit. However, this won't help in certain situations (like jvm crash or kill -9). To guard againts that situations, I'd advise implementing custom commit logic that will commit offset locally after processing each message (file or local database), and also commit offset to Zookeeper every 1000ms. Upon consumer startup, both these locations should be queried, and maximum of two values should be used as consumption offset.