Consumer rebalance while using autocommit - apache-kafka

We're using the Kafka consumer client 0.10.2.0 with the following configuration:
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
props.put(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 64 * 1024);
props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 16 * 1024);
props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, RoundRobinAssignor.class.getName());
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
props.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, "40000");
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "10000");
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
So as you can see, we're using autocommit.
The consumer API version that we're using has a dedicated thread for doing the autocommit.
So every second we have an autocommit, which means we have a heartbeat every second.
Our application's processing time may occasionally take more than 40 seconds (the request timeout interval).
What I wanted to ask is:
1 - If the processing takes, for example, a minute, will there be a rebalance even though there is the autocommit heartbeat every second?
2 - What is even weirder is that in the case of long execution times we seem to get the same message more than once. Is that normal? If the consumer has committed an offset, why does the rebalance cause the same offset to be consumed again?
Thanks,
Orel

You can use KafkaConsumer.pause() / KafkaConsumer.resume() to prevent consumer rebalancing during long processing pauses; see the KafkaConsumer JavaDocs.
Re. 2: Are you sure that those offsets are committed?
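A minimal sketch of that approach, assuming the long-running work can be sliced up between poll() calls (processingDone() is a hypothetical helper):
consumer.pause(consumer.assignment());  // poll() now returns no records
while (!processingDone()) {
    consumer.poll(100);                 // keeps the consumer live in the group
    // ... do a slice of the long-running work here ...
}
consumer.resume(consumer.assignment()); // fetching resumes on the next poll()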

Just to clarify: the auto-commit check runs on every poll(), and it checks whether the time elapsed is greater than the configured interval; only then does it commit.
E.g. if the commit interval is 5 secs and each poll takes 7 secs, the commit will happen after 7 secs.
For your questions:
1 - Auto-commit does not count as a heartbeat. If there is a long processing time, the commit obviously will not happen, and the delayed poll will lead to a session timeout, which in turn triggers a rebalance.
2 - This shouldn't happen unless you are seeking/resetting the offset to a previously committed offset, or a consumer rebalance occurred.
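A minimal poll loop makes that visible (subscription setup omitted; process() is a hypothetical handler):
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        process(record); // no commit can happen while this runs
    }
    // the next poll() is where the client checks auto.commit.interval.ms
    // and, if it has elapsed, commits the offsets returned above
}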

From Kafka v0.10.1.0 onwards, you don't need to keep polling just to send heartbeats. The Kafka consumer itself starts a separate thread for the heartbeat mechanism in the background. To learn more, read KIP-62.
In your case, you can set max.poll.interval.ms to the maximum time taken by your processor to handle max.poll.records records.
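For example, one extra line in the same props style as the question (the value is only an illustration; size it to your worst-case processing time):
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "300000"); // 5 minutes (the default); raise it if processing max.poll.records can take longer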

Related

Spring Kafka - Re-reading an offset after some time

I am using @KafkaListener with these props:
max.poll.records set to 50 (each record takes 40-60 sec to process)
enable-auto-commit=false
ack-mode set to MANUAL_IMMEDIATE
Below is the logic:
@KafkaListener(groupId = "ABC", topics = "Data1", containerFactory = "myCustomContainerFactory")
public void listen(ConsumerRecord<String, Object> record, Acknowledgment ack) {
    try {
        process(record);
        ack.acknowledge();
    } catch (Exception e) {
        reprocess(); // pause container and seek
    }
}
Other props like max.poll.interval.ms, session.timeout.ms and the heartbeat interval are at their default values.
I am not able to understand what's going wrong here.
Suppose 500 msgs are published to 2 partitions.
I am not sure why the consumer is not polling records as per the max.poll.records prop; it actually polls all 500 msgs as soon as the application starts or the msgs are published by the producer.
It is observed that after processing some records, for approx 5-7 mins, the consumer re-reads an offset again, one which actually was read, processed and acknowledged fine.
After an hour the log file shows that the same messages are read multiple times.
Any help is appreciated.
Thanks.
The default max.poll.interval.ms is 300,000 milliseconds (5 minutes).
You either need to reduce max.poll.records or increase the interval - otherwise Kafka will force a rebalance due to a non-responsive consumer.
With such a large processing time, I would recommend max.poll.records=1; you clearly don't need higher throughput.
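To make the arithmetic concrete using the figures from the question: 50 records at 40-60 seconds each is 2000-3000 seconds per poll loop, roughly ten times the 300-second default. With max.poll.records=1 the worst case is about 60 seconds, which the default interval covers comfortably. In the same property style as the question (the value below is this answer's suggestion):
max.poll.records=1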

Kafka consuming the same message multiple times

I am queuing a single message in a Kafka queue.
Kafka props:
enable.auto.commit=true
auto.commit.interval.ms=5000
max.poll.interval.ms=30000
Processing my message was taking around 10 mins, so the message kept getting reprocessed every 5 mins.
Then I changed the prop max.poll.interval.ms to 20 mins, and now the issue is fixed.
But my question is: why is this happening? Since I already have auto commit enabled and it should happen every 5 secs, why were my messages not marked as committed in the former case?
When enable.auto.commit is set to true, the largest offset is committed every auto.commit.interval.ms of time. However, this happens only when poll() is called: on every poll, the consumer checks whether it is time to commit the offsets it returned in the previous poll.
Now in your case, poll() is called every 20 minutes (max.poll.interval.ms), which means it might take up to an additional 20 minutes (plus the 5000 ms interval) before the offset is committed.
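To put numbers on it with the values from the question: the offsets for records fetched at t=0 are eligible for commit only at the next poll(), so with a 20-minute poll loop they are committed no earlier than t=20 min; a crash or rebalance inside that window means the same records are redelivered.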

What is the delay time between each poll

In the Kafka documentation, I'm trying to understand this property: max.poll.interval.ms
The maximum delay between invocations of poll() when using consumer group management. This places an upper bound on the amount of time that the consumer can be idle before fetching more records. If poll() is not called before expiration of this timeout, then the consumer is considered failed and the group will rebalance in order to reassign the partitions to another member.
This means each poll must happen before the poll timeout expires, which by default is 5 minutes. So my question is: exactly how much time does the consumer thread take between two consecutive polls?
For example: Consumer Thread 1
First poll--> with 100 records
--> process 100 records (took 1 minute)
--> consumer submitted offset
Second poll--> with 100 records
--> process 100 records (took 1 minute)
--> consumer submitted offset
Does the consumer take time between the first and second poll? If yes, why, and how can we change that time (assuming the topic has a huge amount of data)?
It's not clear what you mean by "take time between"; if you are talking about the spring-kafka listener container, there is no wait or sleep between polls.
The consumer is polled immediately after the offsets are committed.
So, max.poll.interval.ms must be large enough for your listener to process max.poll.records (plus some extra, just in case).
But, no, there are no delays added between polls, just the time it takes the listener to handle the results of the poll.
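Conceptually the container's main loop looks like this (a simplified sketch, not the actual spring-kafka source; the helper names are illustrative):
while (running) {
    ConsumerRecords<String, String> records = consumer.poll(pollTimeout);
    invokeListener(records); // dispatches each record to your @KafkaListener method
    commitOffsets();         // per the configured ack-mode
    // no sleep here; it loops straight back into poll()
}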

Relationship between maxPollRecords and autoCommitEnable in kafka

Can someone please give me a good example of the relationship between the Kafka params maxPollRecords and autoCommitEnable?
There is no relationship as such between them. Let me explain the two configs.
In Kafka there are two ways a consumer can commit offsets:
1. Manual offset commit - where the responsibility for committing offsets lies with the developer.
2. Enable auto commit - where the Kafka consumer takes responsibility for committing offsets for you. How it works is: on every poll() call you make on the consumer, it checks whether it is time to commit the offsets (dictated by the auto.commit.interval.ms configuration), and if so, it commits them.
For example, suppose auto.commit.interval.ms is set to 7 secs and every call to poll() takes 8 secs. On a particular call to poll(), the consumer will check whether the time to commit offsets has elapsed (which, in this example, it will have), and it will then commit the offsets fetched from the previous poll.
Offsets are also committed during the closing of a consumer.
Here are some links you can look at -
https://kafka.apache.org/documentation/#consumerconfigs
https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
Does Kafka lose a message if the consumer holds it longer than the auto-commit interval time?
Now, on to max.poll.records. With this configuration, you can tell the Kafka consumer the maximum number of records you would like it to return in a single call to poll(). Note that you will generally not change the default, unless your record processing is slow and you want to ensure that your consumer is not considered dead because of the slowness of processing too many records.
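For example (values are illustrative, in the same dotted-property style used elsewhere on this page), a slow processor might pair a small batch with a generous poll interval:
max.poll.records=10
max.poll.interval.ms=600000
so that ten slow records can be processed without the broker deciding the consumer is dead.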

Does Kafka lose a message if the consumer holds it longer than the auto-commit interval time?

Say the auto-commit interval is 30 seconds, and a consumer for some reason could not process a message, held it longer than 30 seconds, and then crashed. Does the auto-commit mechanism commit this offset anyway right before the consumer crashes?
If my assumption is correct, is the message lost, since its offset was committed but the message itself was never processed?
Let's say your consumer group name is Test and you have a single consumer in the consumer group.
When auto-commit is enabled, offsets are committed only during poll() calls and during the closing of a consumer.
For example: auto.commit.interval.ms is 5 secs, and every call to poll() takes 7 secs. On every call to poll(), the consumer checks whether the auto-commit interval has elapsed; if it has, as in this example, it commits the offsets.
Offsets are also committed during the closing of a consumer.
From the documentation -
"Close the consumer, waiting for up to the default timeout of 30 seconds for any needed cleanup. If auto-commit is enabled, this will commit the current offsets if possible within the default timeout".
You can read more about it here -
https://kafka.apache.org/10/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
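A minimal sketch of that shutdown path (process() is a hypothetical handler; the try/finally shape is the usual pattern):
try {
    while (running) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            process(record);
        }
    }
} finally {
    consumer.close(); // the final auto-commit of current offsets happens here, if enabled
}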
Now, to your question: if poll() is not called again and the consumer is not closed, the offsets won't be committed.
If the consumer receives message N, commits its offset, and then crashes before having fully processed it, then by default the consumer will consider this message processed after it restarts.
Note that the message is still on the broker, so it can be re-consumed and processed. But that requires some logic in your application to not only restart from the last committed position but also check whether previous records were processed successfully.
If your application typically takes a long time to process messages, you may want to switch from auto-commit to manual commits. That way you'll be able to better control when you commit and avoid this issue.
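A minimal sketch of that manual-commit style (assumes enable.auto.commit=false; process() is a hypothetical handler):
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        process(record);   // do the slow work first
    }
    consumer.commitSync(); // commit only after processing succeeded
}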