heartbeat failed for group because it's rebalancing - apache-kafka

What is the exact reason for a heartbeat failure because the group is rebalancing? And what could cause a rebalance when all the consumers in the group are up?
Thank you.

Heartbeats are the basic mechanism to check whether all consumers are still up and running. If you get a heartbeat failure because the group is rebalancing, it indicates that your consumer instance took too long to send the next heartbeat, was considered dead, and a rebalance was therefore triggered.
If you want to prevent this from happening, you can either increase the timeout (session.timeout.ms), or make sure your consumer sends heartbeats more often (heartbeat.interval.ms). Heartbeats are basically embedded in poll(); thus, you need to make sure you call poll frequently enough. This can usually be achieved by limiting the number of records a single poll returns via max.poll.records (to shorten the time it takes to process all data that got fetched).
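For illustration, here is a minimal configuration sketch of the plain Java consumer with these three settings; the broker address, group id, and concrete values are placeholders rather than recommendations:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class TunedConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder group
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            // Give the consumer more headroom before it is considered dead ...
            props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
            // ... keep heartbeats frequent (must stay well below the session timeout) ...
            props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000");
            // ... and cap how many records a single poll() returns, so processing finishes sooner.
            props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // subscribe and poll as usual ...
            }
        }
    }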
Update
Since Kafka 0.10.1, heartbeats are sent in a background thread, and not when poll() is called (cf. https://cwiki.apache.org/confluence/display/KAFKA/KIP-62%3A+Allow+consumer+to+send+heartbeats+from+a+background+thread). In this new design, the configurations session.timeout.ms and heartbeat.interval.ms are still the same. Additionally, there is max.poll.interval.ms that determines how often poll() must be called. If you fail to call poll() within max.poll.interval.ms, the heartbeat thread assumes that the processing thread died, sends a leave-group request that triggers a rebalance, and stops sending heartbeats afterwards. If your processing thread is fine but just slow, the next call to poll() will initiate another rebalance to re-join the group.
For more details, cf. Difference between session.timeout.ms and max.poll.interval.ms for Kafka >= 0.10.1

Related

When does kafka consumer get evicted from the group?

I am using Spring Kafka and want to know when a Kafka consumer gets evicted from the group. Does it get evicted when the processing time is longer than the poll interval? If so, isn't the purpose of the heartbeat to indicate that the consumer is alive? In that case the consumer should never be evicted unless the process itself fails.
You are correct that the heartbeat thread tells the group that the consumer process is still alive. The reason for additionally considering a consumer to be gone when there is excessive time between polls is to prevent livelock.
Without this, a consumer might never poll, and so would hold on to partitions without making any progress through them.
The question then is really why there is a heartbeat and session timeout. The heartbeat thread is actually doing other stuff (pre-fetching) but I assume the reason it is used to check that consumers are alive is that it is generally talking to the broker more frequently than the polling thread as the latter has to process messages, and so a failed consumer process will be spotted earlier.
In short, there are three things that can trigger a rebalance: a change in the number of partitions at the broker end, polling taking longer than max.poll.interval.ms, and a gap between heartbeats longer than session.timeout.ms.
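As a rough sketch with the plain Java client (topic name and processing logic are placeholders, not from the question): heartbeats are handled by the background thread, while everything between two consecutive poll() calls must finish within max.poll.interval.ms.

    import java.time.Duration;
    import java.util.Collections;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;

    // 'consumer' is an already-created KafkaConsumer<String, String>.
    consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic
    while (true) {
        // Heartbeats are sent by the background thread (bounded by session.timeout.ms),
        // but the work inside this loop body must finish before the gap between
        // poll() calls exceeds max.poll.interval.ms, or the consumer leaves the
        // group and a rebalance is triggered.
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
        for (ConsumerRecord<String, String> record : records) {
            process(record); // placeholder for your processing logic
        }
    }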

Kafka Consumer - continue calling poll() while paused?

I read the docs on using the pause and resume methods for a kafka consumer, and they seem easy enough to implement. However, do I need another thread to continue calling the poll() method while paused to meet the heartbeat requirements and not trigger a rebalance?
My consumer runs SQL scripts after polling the topic and, depending on the messages returned, the scripts may take longer than the current session.timeout.ms interval (we have increased this value, but the length of time the scripts take can vary quite a bit, and regardless of the interval we will exceed it at times). I also want to avoid a rebalance, as safe ordering and data integrity are more important than throughput and error detection.
From version 0.10.1.0, heartbeats are sent via a separate thread, so pausing your processing thread won't affect the heartbeat thread.
You can check this for more information.
Yes, you need to continue calling poll() on the consumer, even if you pause all partitions, or it will be kicked out of any consumer group it's a member of and its assigned partitions will be transferred to another consumer. As to which thread ends up calling poll, that doesn't matter (so long as only a single thread interacts with the consumer at a time).
Quoting from KIP-62:
max.poll.interval.ms. This config sets the maximum delay between client calls to poll(). When the timeout expires, the consumer will stop sending heartbeats and send an explicit LeaveGroup request.
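A minimal sketch of that pattern, assuming the long-running SQL work is handed off to a separate executor; runSqlScripts() is a hypothetical stand-in for your scripts, and consumer is the already-created KafkaConsumer (only this one thread touches it):

    import java.time.Duration;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    ExecutorService executor = Executors.newSingleThreadExecutor();

    consumer.pause(consumer.assignment());              // stop fetching from all assigned partitions
    Future<?> sqlJob = executor.submit(() -> runSqlScripts()); // placeholder long-running work

    while (!sqlJob.isDone()) {
        // While paused, poll() returns no records but keeps the consumer in the group
        // (and, before 0.10.1, was also what kept heartbeats flowing).
        consumer.poll(Duration.ofMillis(500));
    }
    consumer.resume(consumer.assignment());             // fetch again from the next poll() onward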

Do Kafka consumers spin on poll() or are they woken up by a broadcast/signal from the broker?

If I poll() from a consumer in a while True: statement, I see that poll() is blocking. If the consumer is up to date with messages from the topic (offset = OFFSET_END), how is the consumer conducting its blocking poll()?
Does the consumer default adhere to a pub/sub mentality in which it sleeps and waits for a publish and a broadcast/signal from the broker?
Or is the consumer constantly spinning itself checking the topic?
I'm using the confluent python client, if that matters.
Thanks!
Kafka consumers are basically long-poll loops, driven (asynchronously) by the user thread calling poll().
The whole protocol is request-response, and entirely client driven. There is no form of broker-initiated "push".
fetch.max.wait.ms controls how long any single broker will wait before responding (if there is no data), while blocking of the user thread is controlled by the argument to poll().
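A small sketch of those two "waits", using the Java client names for illustration (the confluent Python client exposes equivalent settings under slightly different names); the values are arbitrary:

    import java.time.Duration;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecords;

    Properties props = new Properties();
    // Broker side: how long a broker may hold a fetch request open when it has
    // no (or not enough) data before answering anyway.
    props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "500");
    props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, "1");
    // ... props is then used to build the consumer as usual ...

    // Client side: how long this call blocks the user thread if nothing arrives.
    // There is no broker-initiated push; the client keeps issuing fetch requests.
    // 'consumer' is an already-created KafkaConsumer<String, String>.
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));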
Yes, you are right: it is effectively a while-true loop that waits to consume messages until the passed timeout elapses.
If it receives messages it returns immediately; otherwise it waits up to the passed timeout and then returns an empty record set.
The Kafka broker uses the parameters below to control when it sends messages to the consumer:
fetch.min.bytes: The broker will wait for this amount of data to accumulate BEFORE it sends the response to the consumer client.
fetch.wait.max.ms: The broker will wait for this amount of time BEFORE sending a response to the consumer client, unless it already has enough data to fill the response (fetch.min.bytes).
Processing of the consumed messages can delay the next call to poll(). max.poll.interval.ms guards against this: you must call the next poll() within max.poll.interval.ms, otherwise the consumer leaves the group and a rebalance is triggered.
You can get more detail about this here
max.poll.interval.ms: By increasing the interval between expected polls, you can give the consumer more time to handle a batch of records returned from poll(long). The drawback is that increasing this value may delay a group rebalance since the consumer will only join the rebalance inside the call to poll. You can use this setting to bound the time to finish a rebalance, but you risk slower progress if the consumer cannot actually call poll often enough.
max.poll.records: Use this setting to limit the total records returned from a single call to poll. This can make it easier to predict the maximum that must be handled within each poll interval. By tuning this value, you may be able to reduce the poll interval, which will reduce the impact of group rebalancing.
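For a rough feel of the arithmetic, assuming (purely for illustration) that each record takes about 50 ms to process:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    Properties props = new Properties();
    // Assumption for illustration: ~50 ms of processing per record.
    // 100 records per poll  ->  ~5,000 ms of work per poll iteration,
    // which must stay comfortably below max.poll.interval.ms (default 300,000 ms).
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
    props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "300000");
    // ... pass props to new KafkaConsumer<>(props) as usual ...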

Difference between session.timeout.ms and max.poll.interval.ms for Kafka >= 0.10.1

I am unclear why we need both session.timeout.ms and max.poll.interval.ms and when would we use one or the other or both? It seems like both settings indicate the upper bound on the time the coordinator will wait to get the heartbeat from a consumer before assuming it's dead.
Also how does it behave for versions 0.10.1.0+ based on KIP-62?
Before KIP-62, there is only session.timeout.ms (ie, Kafka 0.10.0 and earlier). max.poll.interval.ms is introduced via KIP-62 (part of Kafka 0.10.1).
KIP-62 decouples heartbeats from calls to poll() via a background heartbeat thread, allowing for a longer processing time (ie, time between two consecutive poll()) than the heartbeat interval.
Assume processing a message takes 1 minute. If heartbeat and poll are coupled (ie, before KIP-62), you would need to set session.timeout.ms larger than 1 minute to prevent the consumer from timing out. However, if a consumer dies, it then also takes longer than 1 minute to detect the failed consumer.
KIP-62 decouples polling and heartbeats, allowing heartbeats to be sent between two consecutive polls. Now you have two threads running, the heartbeat thread and the processing thread, and thus KIP-62 introduced a timeout for each: session.timeout.ms is for the heartbeat thread while max.poll.interval.ms is for the processing thread.
Assume you set session.timeout.ms=30000; thus, the consumer heartbeat thread must send a heartbeat to the broker before this time expires. On the other hand, if processing of a single message takes 1 minute, you can set max.poll.interval.ms larger than one minute to give the processing thread more time to process a message.
If the processing thread dies, it takes max.poll.interval.ms to detect this. However, if the whole consumer dies (and a dying processing thread most likely crashes the whole consumer including the heartbeat thread), it takes only session.timeout.ms to detect it.
The idea is to allow for a quick detection of a failing consumer even if processing itself takes quite long.
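A configuration sketch matching the example above; the exact values are illustrative, not recommendations:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    Properties props = new Properties();
    // Heartbeat thread: a heartbeat must reach the broker within 30 s ...
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
    props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "10000"); // ~1/3 of the session timeout
    // ... while the processing thread gets up to 2 minutes between poll() calls,
    // enough for the 1-minute-per-message example above.
    props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "120000");
    // ... pass props to new KafkaConsumer<>(props) as usual ...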
Implementation Detail
The new timeout max.poll.interval.ms is mainly a client side concept: if poll() is not called within max.poll.interval.ms, the heartbeat thread will detect this case and send a leave-group request to the broker. -- max.poll.interval.ms is still relevant for consumer group rebalances: if a rebalance is triggered, consumers have max.poll.interval.ms time to re-join the group by calling poll() client side which triggers a join-group request.

Confluent Kafka Consumer Configuration - How session.timeout.ms and max.poll.interval.ms are related?

I'm trying to understand how the default values of below two confluent consumer configurations work together.
max.poll.interval.ms - As per confluent documentation, the default value is 300,000 ms
session.timeout.ms - As per confluent documentation, the default value is 10,000 ms
heartbeat.interval.ms - As per confluent documentation, the default value is 3,000 ms
Let's say if I'm using these default values in my configuration. Now I've a question here.
For example, let's assume a consumer is sending heartbeats every 3,000 ms, my first poll happened at timestamp t1, and the second poll happened at t1 + 20,000 ms. Would this cause a rebalance because it exceeds session.timeout.ms? Or would it work fine since the consumer did send heartbeats at the expected times?
The previous thread here also explains session timeout and max poll timeout. Let me also explain my understanding of this.
ConsumerRecords poll(final long timeout)
is used to fetch data sequentially from a topic's partitions, starting from the last consumed offset or a manually set offset. It returns immediately if records are available; otherwise it waits up to the passed timeout and, if the timeout passes, returns an empty record set.
The poll API is called repeatedly to fetch any newly arrived messages and also ensures the liveness of the consumer. Underneath the covers:
session.timeout.ms: While the consumer keeps polling, the consumer coordinator sends heartbeats to the broker to show that the consumer's session is live and active. If the broker does not receive any heartbeat within session.timeout.ms, it removes that consumer from the group and triggers a rebalance.
You can think of session.timeout.ms as the maximum time the broker will wait for a heartbeat from the consumer, whereas heartbeat.interval.ms is how often the consumer is expected to send a heartbeat to the broker.
That explains why heartbeat.interval.ms must always be less than session.timeout.ms; ideally it is about 1/3 of the session timeout.
max.poll.interval.ms: The maximum delay between invocations of poll() when using consumer group management, i.e. the maximum time the consumer can be idle before fetching more records. If poll() is not called before this timeout expires, the consumer is considered failed and the group will rebalance in order to reassign the partitions to another consumer instance.
If you are doing long batch processing, it is good to increase max.poll.interval.ms, but note that increasing this value may delay a group rebalance since the consumer will only join the rebalance inside the call to poll. You can also keep the poll interval low by tuning max.poll.records.
Now let's discuss how they relate to each other.
While calling poll, the consumer checks heartbeat, session timeout, and poll timeout in the background in the following manner:
The consumer coordinator checks whether the consumer is in a rebalancing state. If it is still rebalancing, it waits for the coordinator to join the consumer to the group, then calls poll again. Note that if max.poll.interval.ms is large, a rebalance will take more time.
After poll and the rebalance have completed, the coordinator checks the session timeout:
If the session timeout has expired without a successful heartbeat, the old coordinator is considered disconnected, so the next poll will try to rebalance.
So the session timeout is directly tied to coordinator liveness: if the session times out, the consumer coordinator itself is considered dead and the next poll has to assign a new coordinator before rebalancing.
After the session timeout check, the coordinator validates the poll timeout (heartbeat.pollTimeoutExpired). If the poll timeout has expired, it means the foreground thread has stalled in between calls to poll(), so this member explicitly leaves the group and the next poll re-joins it as a new member; the whole consumer group coordinator is not affected.
After the session timeout and poll timeout validation, and before sending the heartbeat, the consumer coordinator checks the heartbeat timeout. If the heartbeat has exceeded its maximum delay, it pauses for the retry backoff and polls again.
If the heartbeat time is still within the limit, the consumer coordinator sends the heartbeat request (sendHeartbeatRequest).
If sendHeartbeatRequest succeeds, the thread resets the heartbeat time and calls poll; if it fails and the consumer group is not in a rebalancing state, it wakes up the consumer group coordinator to call poll again.
As mentioned in the linked answer, polling is independent of the heartbeat, so even if a poll takes quite long, the heartbeat thread is still allowed to send heartbeats to show your thread is alive; that means the session timeout is not directly linked to poll.
session.timeout.ms: the maximum time for the broker to receive a heartbeat.
max.poll.interval.ms: the maximum time for the independent processing thread between polls.
So if you set max.poll.interval.ms to 300,000 ms, you have 300,000 ms until the next poll, which means the consumer thread has at most 300,000 ms to complete processing. In between, the heartbeat thread keeps sending heartbeat requests every heartbeat.interval.ms, i.e. 3,000 ms, to indicate the thread is still alive. If no heartbeat arrives within session.timeout.ms, i.e. 10,000 ms, the coordinator is considered dead and the next poll will have to assign a new coordinator and rebalance.
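Spelling the same defaults out with the Java client property names (the scenario in the comments mirrors the question):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    Properties props = new Properties();
    props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "300000"); // up to 5 min between poll() calls
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "10000");    // a heartbeat must arrive within 10 s
    props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000");  // heartbeats sent every 3 s

    // With these values, a poll at t1 followed by the next poll at t1 + 20,000 ms is fine:
    // the background heartbeat thread keeps the session alive (3 s < 10 s),
    // and the gap between polls (20 s) is well below max.poll.interval.ms (300 s).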