stop kafka consumer from consuming messages - apache-kafka

Is there any way to stop Kafka consumers from consuming messages for some time?
I want the consumer to stop for some time and later start consuming again from the last unconsumed message.

Most Kafka libraries have close or pause methods on the Consumer implementation. Or, you could throw some fatal exception during consumption.
Resuming from the last committed offset for a consumer group is the default behavior.
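With the Java client, for example, pause() and resume() stop and restart fetching without leaving the group. A minimal sketch (the broker address, group id, topic name, and the shouldPause() condition are all placeholders):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PausableConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "pausable-group");          // placeholder group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic")); // placeholder topic
            while (true) {
                // While paused, poll() returns no records but still heartbeats,
                // so the consumer keeps its partitions instead of rebalancing.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                records.forEach(r -> System.out.println(r.value()));

                if (shouldPause()) {
                    consumer.pause(consumer.assignment());  // stop fetching
                } else {
                    consumer.resume(consumer.paused());     // pick up where we left off
                }
            }
        }
    }

    // Placeholder for whatever condition should suspend consumption.
    private static boolean shouldPause() { return false; }
}
```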

Related

Time limit for pausing kafka consumer

I wanted to implement functionality that requires the Kafka queue to be paused and resumed. What I want to know is: is there any time limit up to which it can be paused?
Kafka doesn't really have "queues"; all messages in a topic are there to be consumed by Consumers. Your Consumers can consume messages in the way they prefer: a Consumer can start consuming messages from the beginning or from any offset they want, and they can also stop and resume as they want.
When a Consumer consumes messages, it can commit the offsets back to Kafka; if the consumer dies, when it comes back it will start from the last committed message.
If what you want is to poll a bunch of messages and do something with them for a long period of time, Kafka Consumers have a configuration max.poll.interval.ms that by default is 5 minutes. If you expect to consume a message and then be doing something with it for more than 5 minutes, you should increase that configuration; otherwise the consumer group will think your Consumer has died and will rebalance partitions.
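For example, with the Java client the limit can be raised when building the consumer config (the broker, group id, and values here are illustrative):

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class SlowProcessingConfig {
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "slow-processors");         // placeholder
        // Allow up to 15 minutes between poll() calls before the group
        // coordinator considers this consumer dead and rebalances.
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 900_000); // default: 300000 (5 min)
        // Optionally fetch fewer records per poll so each batch finishes sooner.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 50);          // default: 500
        return props;
    }
}
```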

CommitSync in Kafka

I'm using commitSync() after processing messages in Kafka. I want to know: for how long does commitSync() try to commit before raising an error? And if it raises an error, will the same message be polled again later, or is it assumed to be consumed?
If you don't specify a timeout, commitSync() blocks for the duration specified by default.api.timeout.ms. This is 60 seconds by default.
If it fails, that consumer instance will not poll the same messages again; they are considered consumed.
However, if that consumer instance was to crash, a new instance using the same consumer group would restart from the last successfully committed position.
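For example, with the Java client you can pass an explicit timeout to commitSync() and handle the failure modes yourself (the 10-second value is illustrative):

```java
import java.time.Duration;
import org.apache.kafka.clients.consumer.CommitFailedException;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.TimeoutException;

public class CommitHelper {
    // Call after a batch has been fully processed.
    static void commitProcessedOffsets(KafkaConsumer<String, String> consumer) {
        try {
            // Bound the commit at 10 seconds (illustrative) instead of
            // relying on default.api.timeout.ms (60 seconds by default).
            consumer.commitSync(Duration.ofSeconds(10));
        } catch (TimeoutException e) {
            // The commit did not complete in time. This instance keeps going
            // and will not re-poll the batch, but a restarted instance would
            // resume from the last offset that was successfully committed.
        } catch (CommitFailedException e) {
            // Usually means the group rebalanced (e.g. this consumer was
            // considered dead) and the partitions may now belong to another member.
        }
    }
}
```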

Kafka Consumer Rebalancing : In-Flight Message Processing is Aborted

When our application scales up or scales down, the Kafka consumer group rebalances.
For example, when the application scales down, one of the consumers is killed and the partitions that were previously assigned to it are redistributed across the other consumers in the group. When this happens, I see errors in my application logs saying the processing of the in-flight message has been aborted.
I know the entire consumer group pauses (i.e., does not read any new messages) while the consumer group is rebalancing. But what happens to the messages which were read by the consumers before pausing? Can we gracefully handle the messages which are currently being processed?
Thanks in advance!
The messages which were read but not committed will be ignored when a consumer rebalance occurs. After the rebalance is completed, the consumers will resume consuming from the last committed offset, so you won't be losing any messages.
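One way to make the hand-off graceful with the Java client is a ConsumerRebalanceListener: finish the in-flight work and commit the offsets of fully processed messages before partitions are revoked, so the next owner starts right after them. A minimal sketch (the topic name is a placeholder):

```java
import java.util.Collection;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class GracefulHandoff {
    static void subscribeWithGracefulHandoff(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(List.of("my-topic"), new ConsumerRebalanceListener() { // placeholder topic
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                // Called before partitions are taken away: finish or abort
                // in-flight work here, then commit the processed offsets so
                // the new owner does not re-read those messages.
                consumer.commitSync();
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // Consumption resumes from the last committed offset
                // of each newly assigned partition.
            }
        });
    }
}
```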

Should Kafka consumers be started before producers?

When I have a Kafka console producer produce some messages and then start a consumer, I am not getting those messages.
However, I am receiving messages produced by the producer after the consumer has been started. Should Kafka consumers be started before producers?
--from-beginning seems to give all messages, including ones that have already been consumed.
Please help me with this at both the console level and with a Java client example for starting the producer first and then consuming by starting a consumer.
Kafka stores messages for a configurable amount of time; the default is a week. Consumers do not need to be "available" to receive messages, but they do need to know where they should start reading from.
The console consumer defaults to the latest offset for all partitions, so if you're not actively producing data, you see nothing as a consumer. You can specify a group flag for the console consumer or a Java client; the group is what tracks which offsets have been read within the Kafka protocol, and where a read will resume from if you stop a consumer in that group.
Otherwise, I think you can only give an offset along with a single partition to consume from
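On the Java-client side, subscribing with a group id and auto.offset.reset=earliest is roughly the equivalent of running the console consumer with --group and --from-beginning: with no committed offset yet, the consumer starts at the beginning of each partition, and on later runs it resumes from the group's committed position. A sketch (broker, group, and topic names are placeholders):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class StartFromEarliest {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group"); // the group tracks committed offsets
        // With no committed offset for this group yet, start from the earliest
        // retained message instead of the default "latest".
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic")); // placeholder topic
            // Messages produced before this consumer started are returned here;
            // on later runs the group resumes from its last committed offset.
            consumer.poll(Duration.ofSeconds(5))
                    .forEach(r -> System.out.println(r.value()));
        }
    }
}
```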

Kafka Message at-least-once mode at multi-consumer

Kafka messaging uses at-least-once message delivery to ensure every message is processed, and uses a message offset to indicate which message is to be delivered next.
When there are multiple consumers, if some deadly message causes a consumer to crash during message processing, will this message be redelivered to other consumers and spread the death? If some slow message blocks a single consumer, can other consumers keep going and process subsequent messages?
Or even worse, if a slow and deadly message causes a consumer crash, will it cause other consumers to start from its offset again?
There are a few things to consider here:
A Kafka topic partition can be consumed by only one consumer in a consumer group at a time. So if two consumers belong to two different groups, they can consume from the same partition simultaneously.
Stored offsets are per consumer group. So each topic partition has a stored offset for each active (or recently active) consumer group with consumer(s) subscribed to that partition.
Offsets can be auto-committed at certain intervals, or manually committed (by the consumer application).
So let's look at the scenarios you described.
Some deadly message causes a consumer crash during message processing
If offsets are auto-committed, chances are by the time the processing of the message fails and crashes the consumer, the offset is already committed and the next consumer in the group that takes over would not see that message anymore.
If offsets are manually committed after processing is done, then the offset of that message will not be committed, because of the consumer crash (for simplicity, I am assuming one message is read and processed at a time, but this can be easily generalized). So any other consumer in the group that is (or will be) subscribed to that topic will read the message again after taking over that partition, and it's possible that it will crash other consumers too.
If offsets are committed before message processing, then the next consumers won't see the message, because the offset is already committed when the first consumer crashes.
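A minimal sketch of the "commit after processing" case with the Java client (broker, group, and topic names are placeholders); this gives at-least-once behavior, so a message whose processing crashes the consumer before the commit will be redelivered:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AtLeastOnceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "at-least-once-group");     // placeholder
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");         // commit manually
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic")); // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // if this throws and the consumer dies, the
                                     // offset was never committed, so the message
                                     // is redelivered after the rebalance
                }
                consumer.commitSync(); // commit only after processing succeeds
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // hypothetical business logic
    }
}
```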
Some slow message blocks a single consumer: As long as the consumer is considered alive, no other consumer in the group will take over its partitions. If the slowness means the consumer stops calling poll() for longer than max.poll.interval.ms (or stops heartbeating for longer than session.timeout.ms), the consumer will be considered dead and removed from the group. So whether another consumer in the group will read that message depends on how/when the offset is committed.
Slow and deadly message causes a consumer crash: This scenario should be similar to the previous ones in terms of how Kafka handles it. Either slowness is detected first or the crash occurs first. Again the main thing is how/when the offset is committed.
I hope that helps with your questions.