We have two Kafka clusters, each with 6 nodes, in an active/standby setup. There is a topic with 12 partitions, and 12 instances of our app are running. Our app is a consumer, and all instances use the same consumer group ID to receive events from Kafka. Event processing is sequential: an event comes in, we process it, then we do a manual ack. Processing an event and doing the manual acknowledge takes approximately 5 seconds. Although there are multiple instances, only one event at a time is processed. Recently we found an issue in production: consumer rebalancing is happening every 2 seconds, and because of this the message offset commit (manual ack) fails, the same event is delivered twice, and duplicate records are inserted into the database.
Kafka consumer config values are:
max.poll.interval.ms = 300000 (5 minutes)
max.poll.records = 500
heartbeat.interval.ms = 3000
session.timeout.ms = 10000
Errors seen:
1. Commit offset failed since the consumer is not part of an active group.
2. The time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time in message processing.
But message processing takes 5 seconds, not more than the configured max poll interval of 5 minutes. Since processing is sequential, only one consumer can poll and get an event at a time, and the other instances have to wait for their turn to poll. Is that causing the two errors above and the rebalancing? Appreciate the help.
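For reference, a minimal sketch of the consume → process → manual-ack loop described above, using the plain Java consumer with commitSync() as the manual ack. The broker address, topic name, and process() helper are placeholders, and the real app may use a framework such as spring-kafka, which behaves equivalently for this discussion:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SequentialConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");   // placeholder address
        props.put("group.id", "my-consumer-group");      // same group id on all 12 instances
        props.put("enable.auto.commit", "false");        // offsets are committed manually ("manual ack")
        props.put("max.poll.interval.ms", "300000");
        props.put("max.poll.records", "500");
        props.put("heartbeat.interval.ms", "3000");
        props.put("session.timeout.ms", "10000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));   // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    process(record);        // ~5 seconds per event; 500 records x 5 s far exceeds max.poll.interval.ms
                }
                consumer.commitSync();      // manual ack for everything returned by this poll
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // business logic, e.g. insert the event into the database
    }
}
```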
Related
I have a question regarding the handling of consumer death due to exceeding the timeout values.
my example configuration:
session.timeout.ms = 10000 (10 seconds)
heartbeat.interval.ms = 2000 (2 seconds)
max.poll.interval.ms = 300000 (5 minutes)
I have 1 topic, 10 partitions, 1 consumer group, 10 consumers (1 partition = 1 consumer).
From my understanding consuming messages in Kafka, very simplified, works as follows:
1. consumer polls 100 records from topic
2. a heartbeat signal is sent to broker
3. processing records in progress
4. processing records completes
5. finalize processing (commit, do nothing etc.)
repeat #1-5 in a loop
My question is: what happens if the time between heartbeats becomes longer than the configured session.timeout.ms? I understand the part that if the session times out, the broker initiates a rebalance, the consumer whose processing took longer than the session.timeout.ms value is marked as dead, and a different consumer is assigned/subscribed to that partition.
Okay, but what then...?
1. Is that long-processing consumer removed/unsubscribed from the topic, and is my application left with 9 working consumers? What if all the consumers exceed the timeout and are all considered dead? Am I left with a running application which does nothing because there are no consumers?
2. If the long-processing consumer finishes its work after the rebalance has already taken place, does the broker initiate a rebalance again and is the consumer assigned a partition anew? As I understand it, the consumer keeps running #1-5 in a loop, and sending a heartbeat to the broker also starts the process of re-adding the consumer to the consumer group from which it was removed after being marked dead, correct?
3. Does the application throw some sort of exception indicating that session.timeout.ms was exceeded, and is the processing abruptly stopped?
4. Also, what about the max.poll.interval.ms property: what if we exceed even that period and consumer X finishes processing after the max.poll.interval.ms value? The consumer has already exceeded session.timeout.ms, been excluded from the consumer group, and been marked dead, so what difference does this setting make when configuring a Kafka consumer?
We have a process which extracts data for processing, and this extraction consists of 50+ SQL queries (the majority being SELECTs, a few UPDATEs). They usually run fast, but of course it all depends on the DB load, possible locks, etc., and there is a possibility that the processing takes longer than the session timeout. I do not want to keep increasing the session timeout indefinitely until I "hit the spot". The process is idempotent; if it's repeated X times within X minutes, we do not care.
Please find the answers below.
#1. Yes. If all of your consumer instances are kicked out of the consumer group due to session.timeout.ms, you will be left with zero consumer instances; effectively, the consumer application is dead unless you restart it.
#2. This depends on how you write your consumer code with respect to poll() and the consumer record iteration. If you have a proper while(true) loop with a try/catch inside, your consumer will be able to re-join the consumer group after processing that long-running record, as sketched below.
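For illustration, a minimal sketch of such a loop (class and helper names are mine, not from the original post): the try/catch around commitSync() lets the consumer survive a CommitFailedException after a rebalance, and the next poll() makes it rejoin the group:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.CommitFailedException;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ResilientPollLoop {
    // 'props' holds the usual consumer configuration; process() is your record handler.
    static void run(Properties props) {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                try {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        process(record);          // possibly long-running
                    }
                    consumer.commitSync();        // throws CommitFailedException if the group already rebalanced
                } catch (CommitFailedException e) {
                    // The partitions were reassigned to another member; those records will be
                    // redelivered to whoever owns the partitions now. The next poll() rejoins the group.
                    System.err.println("Commit failed, rejoining group: " + e.getMessage());
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) { /* handler */ }
}
```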
#3. You will end up with the commit failed exception:
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
Again, whether the consumer automatically rejoins the consumer group depends on your code.
#4. The answer lies here:
session.timeout.ms
The amount of time a consumer can be out of contact with the brokers while still considered alive defaults to 3 seconds. If more than session.timeout.ms passes without the consumer sending a heartbeat to the group coordinator, it is considered dead and the group coordinator will trigger a rebalance of the consumer group to allocate partitions from the dead consumer to the other consumers in the group. This property is closely related to heartbeat.interval.ms. heartbeat.interval.ms controls how frequently the KafkaConsumer poll() method will send a heartbeat to the group coordinator, whereas session.timeout.ms controls how long a consumer can go without sending a heartbeat. Therefore, those two properties are typically modified together: heartbeat.interval.ms must be lower than session.timeout.ms, and is usually set to one-third of the timeout value. So if session.timeout.ms is 3 seconds, heartbeat.interval.ms should be 1 second. Setting session.timeout.ms lower than the default will allow consumer groups to detect and recover from failure sooner, but may also cause unwanted rebalances as a result of consumers taking longer to complete the poll loop or garbage collection. Setting session.timeout.ms higher will reduce the chance of accidental rebalance, but also means it will take longer to detect a real failure.
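As a small, illustrative config sketch of the relationship described above (the values are examples, not recommendations):

```java
import java.util.Properties;

public class ConsumerTimeouts {
    static Properties timeoutProps() {
        Properties props = new Properties();
        props.put("session.timeout.ms", "10000");     // consumer is declared dead after 10 s without a heartbeat
        props.put("heartbeat.interval.ms", "3000");   // roughly one-third of session.timeout.ms, as recommended above
        props.put("max.poll.interval.ms", "300000");  // separate upper bound on the time between poll() calls
        return props;
    }
}
```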
I am implementing Kafka delayed topic consumption with consumer.pause(<partitions>).
Pub/Sub Kafka shim turns pause into a NoOp:
https://github.com/googleapis/java-pubsublite-kafka/blob/v0.6.7/src/main/java/com/google/cloud/pubsublite/kafka/PubsubLiteConsumer.java#L590-L600
Is there any documentation on how to delay consumption of a Pub/Sub Lite topic by a set duration?
i.e. I want to consume all messages from a Pub/Sub Lite topic but with a synthetic 4 minute lag.
Here is my algorithm with Kafka native:
call consumer.poll()
resume all assigned partitions consumer.resume(consumer.assignment())
combine previously delayed records with recently polled records
separate records into
records that are old enough to process
records still too young to process
pause partitions for any records that are too young consumer.pause(<partitions of too young>)
keep a buffer of too young records to reconsider on the next pass, called delayed
process records that are old enough
rinse, repeat
We only commit offsets of records that are old enough; if the process dies, any records in the "too young" buffer will remain uncommitted, and they will be revisited by whichever consumer receives the partition in the ensuing rebalance.
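A hedged sketch of that loop against the plain Kafka consumer follows; the class name, the MIN_AGE constant, and the buffer layout are illustrative, and a production version would also need a ConsumerRebalanceListener to drop buffers for revoked partitions:

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class DelayedConsumer {
    private static final Duration MIN_AGE = Duration.ofMinutes(4);   // synthetic lag

    static void run(KafkaConsumer<String, String> consumer) {
        // Buffer of records that are still too young to process, per partition, in offset order.
        Map<TopicPartition, Deque<ConsumerRecord<String, String>>> delayed = new HashMap<>();

        while (true) {
            // Poll (partitions paused on the previous pass are skipped), then resume everything.
            ConsumerRecords<String, String> polled = consumer.poll(Duration.ofSeconds(1));
            consumer.resume(consumer.assignment());

            // Combine previously delayed records with recently polled records.
            for (ConsumerRecord<String, String> r : polled) {
                delayed.computeIfAbsent(new TopicPartition(r.topic(), r.partition()),
                                        tp -> new ArrayDeque<>()).addLast(r);
            }

            long cutoff = System.currentTimeMillis() - MIN_AGE.toMillis();
            Set<TopicPartition> tooYoung = new HashSet<>();
            Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();

            for (Map.Entry<TopicPartition, Deque<ConsumerRecord<String, String>>> e : delayed.entrySet()) {
                Deque<ConsumerRecord<String, String>> buffer = e.getValue();
                // Process records that are old enough, in order.
                while (!buffer.isEmpty() && buffer.peekFirst().timestamp() <= cutoff) {
                    ConsumerRecord<String, String> r = buffer.pollFirst();
                    process(r);
                    toCommit.put(e.getKey(), new OffsetAndMetadata(r.offset() + 1));
                }
                // Anything left in the buffer is still too young; keep its partition paused.
                if (!buffer.isEmpty()) {
                    tooYoung.add(e.getKey());
                }
            }

            // Only commit offsets of records that were old enough and have been processed.
            if (!toCommit.isEmpty()) {
                consumer.commitSync(toCommit);
            }
            tooYoung.retainAll(consumer.assignment());   // guard: only pause partitions we still own
            consumer.pause(tooYoung);
        }
    }

    private static void process(ConsumerRecord<String, String> r) { /* downstream handling + ack */ }
}
```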
Is there a more generalized form of this algorithm that will work with native Kafka and Pub/Sub Lite?
Edit: CloudTasks is a bad idea here as it disconnects the offset commit chain. I need to ensure I only commit offsets for records that have gotten an ack from the downstream system.
Something similar to the above would likely work fine if you remove the pause and resume stages. I'd note that with both systems, you are not guaranteed to receive all messages that exist on the server up to now in any given poll() call, so you may add extra delay if you are not given any records for a given partition in a poll call.
If you do the following with autocommit enabled, you should effectively delay processing by strictly more than 4 minutes.
call consumer.poll()
sleep until every record is at least 4 minutes old
process records
go to 1.
If you use manual commits, you can make the sleeps closer to 4 minutes on a per-message basis, but with the downside of needing to manage offsets manually:
call consumer.poll()
put records into ordered per-partition buffers
sleep until the oldest record for any partition is 4 minutes in the past
process records which are more than 4 minutes in the past
commit offsets for processed records
go to 1
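A hedged sketch of the manual-commit variant, assuming record timestamps are used to measure age; the class name, DELAY constant, and buffer layout are illustrative, and max.poll.interval.ms must be larger than the delay plus processing time for the long sleep to be safe:

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class SleepingDelayedConsumer {
    private static final long DELAY_MS = Duration.ofMinutes(4).toMillis();

    static void run(KafkaConsumer<String, String> consumer) throws InterruptedException {
        Map<TopicPartition, Deque<ConsumerRecord<String, String>>> buffers = new HashMap<>();

        while (true) {
            // 1. poll and 2. put records into ordered per-partition buffers
            for (ConsumerRecord<String, String> r : consumer.poll(Duration.ofSeconds(1))) {
                buffers.computeIfAbsent(new TopicPartition(r.topic(), r.partition()),
                                        tp -> new ArrayDeque<>()).addLast(r);
            }

            // 3. sleep until the oldest buffered record is at least DELAY_MS in the past
            long oldest = buffers.values().stream()
                    .filter(d -> !d.isEmpty())
                    .mapToLong(d -> d.peekFirst().timestamp())
                    .min().orElse(Long.MAX_VALUE);
            long wait = oldest == Long.MAX_VALUE ? 0 : oldest + DELAY_MS - System.currentTimeMillis();
            if (wait > 0) {
                Thread.sleep(wait);   // requires max.poll.interval.ms > DELAY_MS + processing time
            }

            // 4. process records more than DELAY_MS old and 5. commit only their offsets
            long cutoff = System.currentTimeMillis() - DELAY_MS;
            Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
            for (Map.Entry<TopicPartition, Deque<ConsumerRecord<String, String>>> e : buffers.entrySet()) {
                Deque<ConsumerRecord<String, String>> d = e.getValue();
                while (!d.isEmpty() && d.peekFirst().timestamp() <= cutoff) {
                    ConsumerRecord<String, String> r = d.pollFirst();
                    process(r);
                    toCommit.put(e.getKey(), new OffsetAndMetadata(r.offset() + 1));
                }
            }
            if (!toCommit.isEmpty()) {
                consumer.commitSync(toCommit);
            }
        }
    }

    private static void process(ConsumerRecord<String, String> r) { /* downstream handling + ack */ }
}
```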
Scenario:
Committing offsets manually after processing the messages.
session.timeout.ms: 10 seconds
max.poll.interval.ms: 5 minutes
Processing of messages consumed in a "poll()" is taking 6 minutes
Timeline:
A (0 seconds): app starts poll(), have consumed the messages and started processing (will take 6 minutes)
B (3 seconds): a heartbeat is sent
C (6 seconds): another heartbeat is sent
D (5 minutes): another heartbeat is sent (5 * 60 % 3 = 0) BUT "max.poll.interval.ms" (5 minutes) is reached
At point "D" will consumer:
send "LeaveGroup request" to consider this consumer "dead" and re-balance?
continue sending heartbeats every 3 seconds ?
If point "1" is the case, then
a. how will this consumer commit offsets after completing its 6 minutes of processing, considering that its partition(s) were reassigned due to the rebalance at point "D"?
b. should "max.poll.interval.ms" be set in advance according to the expected processing time?
If point "2" is the case, then will we never know if the processing is actually blocked ?
Thank you.
Starting with Kafka version 0.10.1.0, consumer heartbeats are sent in a background thread, so the client processing time can be longer than the session timeout without causing the consumer to be considered dead.
However, the max.poll.interval.ms still sets the maximum allowable time for a consumer to call the poll method.
In your case, with a processing time of 6 minutes, it means at point "D" that your consumer will be considered dead.
Your concerns are right: the consumer will then not be able to commit the messages after the 6 minutes, and it will get a CommitFailedException (as described in another answer on CommitFailedException).
To conclude, yes, you need to increase the max.poll.interval.ms time if you already know that your processing time will exceed the default time of 5 minutes.
Another option would be to limit the fetched records during a poll by decreasing the configuration max.poll.records which defaults to 500 and is described as: "The maximum number of records returned in a single call to poll()".
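As a hedged illustration of those two knobs (the values are examples only, not recommendations):

```java
import java.util.Properties;

public class LongProcessingOverrides {
    static Properties overrides() {
        Properties props = new Properties();
        // Allow up to 10 minutes between poll() calls instead of the 5-minute default...
        props.put("max.poll.interval.ms", "600000");
        // ...and/or return fewer records per poll so each batch can be finished in time.
        props.put("max.poll.records", "50");
        return props;
    }
}
```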
We have several applications consuming from Kafka, that regularly encounter a DisconnectException.
What happens is always like the following:
The application is subscribed to, say, partitions 5 and 6, and messages are processed from both partitions
From some time T, no messages are consumed from partition 5; only messages from partition 6 are consumed
At T + around 5 minutes, the Kafka consumer emits many log lines:
Error sending fetch request (sessionId=552335215, epoch=INITIAL) to node 0: org.apache.kafka.common.errors.DisconnectException.
After that, consumption resumes from partitions 5 and 6 and catches up the accumulated lag
The same issue occurs if the application consumes a single partition: in this case, no messages are consumed for 5 minutes.
My understanding, according to https://issues.apache.org/jira/browse/KAFKA-6520, is that in case of a connection issue, the Kafka consumer retries (with backoff, up to 1 second by default according to the reconnect.backoff.max.ms config), hiding the issue from the end user. The calls to poll() return 0 messages, so the polling loop goes on and on.
However, some interrogations:
If the fetch fails due to a connection issue, then the broker does not receive these requests, and after "max.poll.interval.ms" (50 seconds in our case) it should expel the consumer and trigger a rebalance. This is not happening, why?
Since the Kafka consumer retries every second, why would it take systematically 5 minutes to reconnect? Unless there is some infrastructure / network issue going on...
Otherwise, any client side configuration param which could explain the 5 minutes delay? Could this delay be somehow related to "metadata.max.age.ms"? (5 min by default)
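For reference, a hedged sketch of the client-side settings mentioned above with their usual defaults (worth verifying against your client version):

```java
import java.util.Properties;

public class ReconnectTuning {
    static Properties clientDefaults() {
        Properties props = new Properties();
        props.put("reconnect.backoff.ms", "50");        // initial backoff between reconnect attempts
        props.put("reconnect.backoff.max.ms", "1000");  // backoff cap (the 1 second mentioned above)
        props.put("metadata.max.age.ms", "300000");     // metadata refresh interval, 5 minutes by default
        return props;
    }
}
```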
In the Kafka documentation I'm trying to understand this property: max.poll.interval.ms
The maximum delay between invocations of poll() when using consumer group management. This places an upper bound on the amount of time that the consumer can be idle before fetching more records. If poll() is not called before expiration of this timeout, then the consumer is considered failed and the group will rebalance in order to reassign the partitions to another member.
This means each poll must happen before the poll timeout expires, which by default is 5 minutes. So my question is: exactly how much time does the consumer thread take between two consecutive polls?
For example: Consumer Thread 1
First poll--> with 100 records
--> process 100 records (took 1 minute)
--> consumer submitted offset
Second poll--> with 100 records
--> process 100 records (took 1 minute)
--> consumer submitted offset
Does the consumer take time between the first and second poll? If yes, why? And how can we change that time (assume this is a topic with huge amounts of data)?
It's not clear what you mean by "take time between"; if you are talking about the spring-kafka listener container, there is no wait or sleep between polls.
The consumer is polled immediately after the offsets are committed.
So, max.poll.interval.ms must be large enough for your listener to process max.poll.records (plus some extra, just in case).
But, no, there are no delays added between polls, just the time it takes the listener to handle the results of the poll.
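To make that concrete, a hedged sketch of sizing max.poll.interval.ms from the expected per-record handling time; the helper and the 2x safety margin are my own illustration, not a spring-kafka API:

```java
import java.util.Properties;

public class PollIntervalSizing {
    static Properties listenerProps(long worstCaseMsPerRecord, int maxPollRecords) {
        Properties props = new Properties();
        props.put("max.poll.records", String.valueOf(maxPollRecords));
        // The whole batch must be handled between two poll() calls, so leave headroom.
        long intervalMs = worstCaseMsPerRecord * maxPollRecords * 2;   // 2x safety margin (arbitrary)
        props.put("max.poll.interval.ms", String.valueOf(intervalMs));
        return props;
    }
}
```

For example, 200 ms per record with max.poll.records = 500 gives a max.poll.interval.ms of 200,000 ms (about 3.3 minutes) with that margin.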