How to handle exceptions and message reprocessing in apache kafka

I have a Kafka cluster with a single topic. Three different consumer groups read the same messages from this topic and process them differently, each according to its own logic.
Is there any problem with having multiple consumer groups consume the same topic?
I ask because I am trying to implement an exception topic and reprocess the failed messages.
Suppose I have the message "secret" in topic A.
All 3 consumer groups took the message "secret".
Two of the consumer groups processed it successfully.
But one consumer group failed to process it,
so I put the message in the topic "failed_topic".
I want to retry this message for the failed consumer group only. But if I put it back into the original topic A, the other 2 consumer groups will process it a second time.
Can someone please let me know how I can implement proper reprocessing for this scenario?

First of all, in Kafka each consumer group has its own offset for each subscribed topic-partition, and these offsets are managed separately per consumer group. So a failure in one consumer group doesn't affect the other consumer groups.
You can check the current offsets of a consumer group with this CLI command:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group
is there any problem with creating same topic for multiple consumer groups
No, there is no problem. This is actually the normal behaviour of the topic-based publish/subscribe pattern.
To implement re-processing logic there are some important points to consider:
You should keep calling poll() even while you are re-processing the same message. Otherwise, after max.poll.interval.ms your consumer will be considered dead and its partitions will be revoked.
Calling poll() returns messages that your consumer group has not read yet. So one poll() returns up to max.poll.records messages, and the next poll() returns the next group of messages. To reprocess failed messages you therefore need to call the seek method.
public void seek(TopicPartition partition, long offset): Overrides the fetch offsets that the consumer will use on the next poll(timeout)
Ideally, the number of consumers in a consumer group should equal the number of partitions of the subscribed topic. Kafka will take care of assigning partitions to consumers evenly (one partition per consumer). But even if this condition is satisfied at the very beginning, a consumer may die after some time and Kafka may assign more than one partition to one consumer. This can lead to some problems. Suppose your consumer is responsible for two partitions: when you poll() you will get messages from both of these partitions, and when a message cannot be processed you should seek on all of the assigned partitions (not just the one the failed message comes from). Otherwise you may skip some messages.
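Let's try to write some pseudocode to implement re-processing logic in case of an exception, using this information (the String key/value types are just placeholders):
import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public void consumeLoop() {
    while (true) {
        // max.poll.records = 1, so each poll returns at most one record
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
        for (ConsumerRecord<String, String> currentRecord : records) {
            TopicPartition topicPartition =
                new TopicPartition(currentRecord.topic(), currentRecord.partition());
            try {
                processMessage(currentRecord);
            } catch (Exception e) {
                // rewind so that the next poll() returns the same record again
                consumer.seek(topicPartition, currentRecord.offset());
                continue;
            }
            // commit only after successful processing
            consumer.commitSync(Collections.singletonMap(
                topicPartition, new OffsetAndMetadata(currentRecord.offset() + 1)));
        }
    }
}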
Notes about the code:
max.poll.records is set to one to keep the seek logic simple.
On every exception we seek and then poll to get the same message again. (We have to keep polling to be considered alive by Kafka.)
Auto commit is disabled (enable.auto.commit=false).
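If max.poll.records is larger than one, the "seek all assigned partitions" advice can be turned into a small planning step. The following is only a sketch in plain Java with no Kafka dependency: Rec and SeekPlanner are hypothetical names (Rec stands in for ConsumerRecord). Given the batch returned by one poll() and the index of the record that failed, it works out, per partition, the offset to rewind to and the offset that is safe to commit.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ConsumerRecord: only the fields the planner needs.
final class Rec {
    final int partition;
    final long offset;
    Rec(int partition, long offset) { this.partition = partition; this.offset = offset; }
}

final class SeekPlanner {
    // Offsets to rewind to: for each partition, the first record at or after
    // the failed index, i.e. the earliest record that was not processed.
    static Map<Integer, Long> seekTargets(List<Rec> batch, int failedIndex) {
        Map<Integer, Long> targets = new LinkedHashMap<>();
        for (int i = failedIndex; i < batch.size(); i++) {
            Rec r = batch.get(i);
            // Offsets ascend within a partition, so the first hit is the smallest.
            targets.putIfAbsent(r.partition, r.offset);
        }
        return targets;
    }

    // Offsets that are safe to commit: last successfully processed offset + 1
    // for each partition that had a record before the failure.
    static Map<Integer, Long> commitTargets(List<Rec> batch, int failedIndex) {
        Map<Integer, Long> targets = new LinkedHashMap<>();
        for (int i = 0; i < failedIndex; i++) {
            Rec r = batch.get(i);
            targets.put(r.partition, r.offset + 1);
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Rec> batch = Arrays.asList(
            new Rec(0, 10), new Rec(1, 5), new Rec(0, 11), new Rec(1, 6));
        // The third record (partition 0, offset 11) fails.
        System.out.println("seek:   " + seekTargets(batch, 2));
        System.out.println("commit: " + commitTargets(batch, 2));
    }
}
```

In a real consumer, each entry of seekTargets would become a consumer.seek(...) call and commitTargets would be converted into the map passed to consumer.commitSync(...). Note that partition 1 is rewound as well, even though the failure happened on partition 0, because its remaining record was fetched but never processed.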

is there any problem with creating same topic for multiple consumer groups?
Not at all
if i keep this message in my actual topic A, the other 2 consumer groups process this message second time.
Exactly, and you would create a loop (the third group would fail and put it back, the other 2 accept it again, the third fails again, and so on).
Basically, you are asking about a "dead-letter queue" which would be a specific topic for each consumer group. Kafka can hold tens of thousands of topics, so this shouldn't be an issue in your use-case.
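To make the per-group dead-letter topic idea concrete, here is a minimal sketch in plain Java. Everything in it is an assumption for illustration: the DeadLetterRouter class, the retry limit, and the "<topic>.<group>.dlq" naming convention are hypothetical, not something Kafka prescribes. It only decides where a failed record should go; producing to the returned topic is left to the caller.

```java
import java.util.Optional;

// Hypothetical helper: decides what one consumer group does with a failed record.
final class DeadLetterRouter {
    private final String groupId;
    private final int maxAttempts;

    DeadLetterRouter(String groupId, int maxAttempts) {
        this.groupId = groupId;
        this.maxAttempts = maxAttempts;
    }

    // Assumed naming convention: <source topic>.<group id>.dlq
    // Each group gets its own topic, so other groups never see the retries.
    String deadLetterTopic(String sourceTopic) {
        return sourceTopic + "." + groupId + ".dlq";
    }

    // attempts = processing attempts already made for this record
    // (1 after the first failure). An empty result means "retry in place";
    // otherwise the record should be produced to the returned topic.
    Optional<String> destinationAfterFailure(String sourceTopic, int attempts) {
        if (attempts < maxAttempts) {
            return Optional.empty();
        }
        return Optional.of(deadLetterTopic(sourceTopic));
    }

    public static void main(String[] args) {
        DeadLetterRouter router = new DeadLetterRouter("group-3", 3);
        System.out.println(router.destinationAfterFailure("A", 1));
        System.out.println(router.destinationAfterFailure("A", 3));
    }
}
```

The failed group would then run a separate consumer on its own dead-letter topic, while the two healthy groups keep reading only topic A.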

Related

Is it possible to reset offsets to a topic for a kafka consumer group in a kafka connector?

My kafka sink connector reads from multiple topics (configured with 10 tasks) and processes upwards of 300 records from all topics. Based on the information held in each record, the connector may perform certain operations.
Here is an example of the key:value pair in a trigger record:
"REPROCESS":"my-topic-1"
Upon reading this record, I would then need to reset the offsets of the topic 'my-topic-1' to 0 in each of its partitions.
I have read in many places that creating a new KafkaConsumer, subscribing to the topic's partitions, then calling the subscribe(...) method is the recommended way. For example,
public class MyTask extends SinkTask {
    @Override
    public void put(Collection<SinkRecord> records) {
        records.forEach(record -> {
            if (record.key().toString().equals("REPROCESS")) {
                reprocessTopicRecords(record);
            } else {
                // do something else
            }
        });
    }

    private void reprocessTopicRecords(SinkRecord record) {
        KafkaConsumer<JsonNode, JsonNode> reprocessorConsumer =
            new KafkaConsumer<>(reprocessorProps, deserializer, deserializer);
        reprocessorConsumer.subscribe(Arrays.asList(record.value().toString()),
            new ConsumerRebalanceListener() {
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {}
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    // do offset reset here
                }
            }
        );
    }
}
However, the above strategy does not work for my case because:
1. It depends on a group rebalance taking place (does not always happen)
2. The 'partitions' passed to the onPartitionsAssigned method are dynamically assigned partitions, meaning they are only a subset of the full set of partitions that need to have their offsets reset. For example, this SinkTask will be assigned only 2 of the 8 partitions that hold the records for 'my-topic-1'.
I've also looked into using assign() but this is not compatible with the distributed consumer model (consumer groups) in the SinkConnector/SinkTask implementation.
I am aware that the kafka command line tool kafka-consumer-groups can do exactly what I want (I think):
https://gist.github.com/marwei/cd40657c481f94ebe273ecc16601674b
To summarize, I want to reset the offsets of all partitions for a given topic using Java APIs and let the Sink Connector pick up the offset changes and continue to do what it has been doing (processing records).
Thanks in advance.

I was able to achieve resetting offsets for a kafka connect consumer group by using a series of Confluent's kafka-rest-proxy APIs: https://docs.confluent.io/current/kafka-rest/api.html
This implementation no longer requires the 'trigger record' approach first described in the original post and is purely REST API based.
1. Temporarily delete the kafka connector (this deletes the connector's consumers and )
2. Create a consumer instance for the same consumer group ("connect-")
3. Have the instance subscribe to the requested topic you want to reset
4. Do a dummy poll ('subscribe' is evaluated lazily)
5. Reset consumer group topic offsets for specified topic
6. Do a dummy poll ('seek' is evaluated lazily), then commit the current offset state (in the proxy) for the consumer
7. Re-create kafka connector (with same connector name) - after re-balancing, consumers will join the group and read the last committed offset (starting from 0)
8. Delete the temporary consumer instance
If you are able to use the CLI, Steps 2-6 can be replaced with:
kafka-consumer-groups --bootstrap-server <kafkahost:port> --group <group_id> --topic <topic_name> --reset-offsets --to-earliest --execute
As for those of you trying to do this in the kafka connector code through native Java APIs, you're out of luck :-(

You're looking for the seek method. Either to an offset
consumer.seek(new TopicPartition("topic-name", partition), offset);
Or seekToBeginning
However, I feel like you'd be competing with the Connect Sink API's consumer group. In other words, assuming you set up the consumer with a separate group id, you're essentially consuming records from the source topic twice: once by Connect, and once by your own consumer instance.
Unless you explicitly seek Connect's own consumer instance as well (which is not exposed), you'd be getting into a weird state. For example, your task only executes on new records in the topic even though your own consumer would be looking at an old offset, or you'd still be getting even newer events while still processing old ones.
Also, you might eventually get a reprocess event at the very beginning of the topic (for example, because retention policies have expired the older records), causing your consumer to not progress at all and to constantly rebalance its group by seeking to the beginning.

We had to do a very similar offset resetting exercise.
KafkaConsumer.seek() combined with KafkaConsumer.commitSync() worked well.
There is another option that is worth mentioning, if you are dealing with lots of topics and partitions (javadoc):
AdminClient.alterConsumerGroupOffsets(
String groupId,
Map<TopicPartition,OffsetAndMetadata> offsets
)
We were lucky because we had the luxury to stop the Kafka Connect instance for a while, so there's no consumer group competing.

Kafka multiple consumer

When multiple consumers read from a topic with a single partition, is there any possibility that every consumer will get all the messages?
I have created two consumers with manual offset commit. I started the first consumer and, after 2 minutes, started the second consumer. The second consumer starts reading from the message where the first consumer stopped. Is there any possibility that the second consumer will read all the messages from the beginning? I'm new to Kafka, please help me out.

In your consumer, you would be using commitSync, which commits the offsets returned by the last poll. When you start your 2nd consumer, since it is in the same consumer group, it reads messages from the last committed offset.
Which messages a consumer receives depends on the consumer group it belongs to. Suppose you have 2 partitions and 2 consumers in a single consumer group: each consumer will then read from a different partition, which helps to achieve parallelism.
So, if you want your 2nd consumer to read from the beginning, you can do one of 2 things:
a) Put the 2nd consumer in a different consumer group. For this new consumer group there won't be any offset stored anywhere, so the auto.offset.reset config decides the starting offset. Set auto.offset.reset to earliest to start from the earliest available offset (latest would start from the end of the log instead).
b) Seek to start of all partitions your consumer is assigned by using: consumer.seekToBeginning(consumer.assignment())
Documentation: https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#seekToBeginning-java.util.Collection-
https://kafka.apache.org/documentation/#consumerconfigs
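The rule behind option a) can be written down as a small function. This is a simplified model in plain Java, not Kafka's implementation: the StartingOffset class and its parameters are hypothetical, and it ignores the case where a committed offset has fallen out of range.

```java
import java.util.OptionalLong;

// Simplified model of where a consumer starts reading a partition.
final class StartingOffset {
    // committed: the group's committed offset for the partition, if any.
    // reset:     the value of auto.offset.reset.
    // logStart / logEnd: earliest available offset and next offset to be written.
    static long resolve(OptionalLong committed, String reset, long logStart, long logEnd) {
        if (committed.isPresent()) {
            // An existing group resumes where it left off; auto.offset.reset is ignored.
            return committed.getAsLong();
        }
        switch (reset) {
            case "earliest":
                return logStart;
            case "latest":
                return logEnd;
            default:
                // e.g. "none": the model treats a missing committed offset as an error.
                throw new IllegalStateException(
                    "no committed offset and auto.offset.reset=" + reset);
        }
    }

    public static void main(String[] args) {
        // Same group as the first consumer: continues from the committed offset.
        System.out.println(resolve(OptionalLong.of(42), "earliest", 0, 100));
        // Brand-new group with earliest: starts from the beginning.
        System.out.println(resolve(OptionalLong.empty(), "earliest", 0, 100));
    }
}
```

This is why the second consumer in the question picked up where the first one stopped: it shared the group, so a committed offset already existed and auto.offset.reset never came into play.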

Within a single consumer group, a partition is always assigned to exactly one consumer, no matter how many consumers the group has. Only that consumer can read its data; the others won't consume from it until the partition is assigned to them. When the consumer goes down, a rebalance happens and the partition is assigned to another consumer. Since you are committing manually, the new consumer will start reading from the committed offset.
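The "one consumer per partition within a group" rule can be illustrated with a toy assignment function. This is a deliberately simplified round-robin model in plain Java, not one of Kafka's real assignors; AssignmentModel and the consumer names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model: deals partitions 0..partitionCount-1 out to the consumers in turn.
final class AssignmentModel {
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment;
        }
        for (int p = 0; p < partitionCount; p++) {
            // Every partition ends up with exactly one owner in the group.
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // One partition, two consumers: the second consumer stays idle.
        System.out.println(assign(1, Arrays.asList("c1", "c2")));
        // c1 goes down: after the rebalance c2 owns the partition.
        System.out.println(assign(1, Arrays.asList("c2")));
    }
}
```

With a single partition, adding consumers to the same group therefore gives you a standby rather than extra throughput.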

Does consumer consume from replica partitions if multiple consumers running under same consumer group?

I am writing a Kafka consumer application. I have a topic with 4 partitions - 1 is leader and 3 are followers. The producer uses a key to identify the partition to push a message to.
If I write a consumer and run it on different nodes, or start 4 instances of the same consumer, how will message consumption happen? Will all 4 instances get the same messages?
What happens in the case of multiple consumers (same group) consuming a single topic?
Do they get same data?
How offset is managed? Is it separate for each consumer?

I would suggest that you read at least the first few chapters of Confluent's definitive guide to Kafka to get a preliminary understanding of how Kafka works.
I've kept my answers brief. Please refer to the book for a detailed explanation.
How offset is managed? Is it separate for each consumer?
It depends on the group id. One committed offset is kept per group for each topic partition, not one per consumer.
What happens in the case of multiple consumers (same group) consuming a single topic?
There can be multiple consumers, and they can belong to the same group or to different groups.
If 2 consumers belong to the same group, they will not each get every message.
Do they get same data?
No. Once a message is read and its offset is committed, the committed offset for that group advances. So a different consumer in the same group will not receive that message.
Hope that helps :)

What happens in the case of multiple consumer(same group) consuming a single topic?
Answer: Producers send records to a particular partition based on the record’s key. The default partitioner for Java uses a hash of the record’s key to choose the partition. When there are multiple consumers in the same consumer group, each consumer gets a different partition. So, in this case, only a single consumer receives all the messages from a given partition. When the consumer that is receiving messages goes down, the group coordinator (one of the brokers in the cluster) triggers a rebalance, and that partition is then assigned to one of the available consumers.
Do they get same data?
Answer: If a consumer commits the offsets of the messages it consumed and then goes down, a rebalance occurs as stated above. The consumer that gets this partition will not get those messages again. But if the consumer goes down before committing its offsets, the consumer that gets this partition will get those messages again.
How offset is managed? Is it separate for each consumer?
Answer: No, the offset is not separate for each consumer. A partition is never assigned to multiple consumers in the same consumer group at the same time. The consumer that is assigned the partition picks up its committed offset as well by default.

kafka consumer reads the same message

I have a single Topic with 5 partitions.
I have 5 threads, each creating a Consumer
All consumers are in the same consumer group (same group.id).
I also gave each consumer a different and unique client.id
I see that 2 consumers are reading the same message to process
Should kafka handle this?
How do I troubleshoot it?

Consumers within the same group should not receive the same messages. The partitions should be split across all consumers and at any time Kafka's consumer group logic ensures only 1 consumer is assigned to each partition.
The exception is if 1 consumer crashes before it's able to commit its offset. In that case, the new consumer that gets assigned the partition will re-consume from the last committed offset.
You can use the consumer group tool kafka-consumer-groups that comes with Kafka to check the partitions assigned to each consumer in your group.

Kafka Message at-least-once mode at multi-consumer

Kafka messaging uses at-least-once message delivery to ensure every message is processed, and uses a message offset to indicate which message is to be delivered next.
When there are multiple consumers, if some deadly message causes a consumer to crash during message processing, will this message be redelivered to other consumers and spread the death? If some slow message blocks a single consumer, can other consumers keep going and process subsequent messages?
Or even worse, if a slow and deadly message causes a consumer to crash, will it cause other consumers to start from its offset again?

There are a few things to consider here:
A Kafka topic partition can be consumed by only one consumer in a consumer group at a time. Two consumers that belong to two different groups, however, can consume from the same partition simultaneously.
Stored offsets are per consumer group. So each topic partition has a stored offset for each active (or recently active) consumer group with consumer(s) subscribed to that partition.
Offsets can be auto-committed at certain intervals, or manually committed (by the consumer application).
So let's look at the scenarios you described.
Some deadly message causes a consumer crash during message processing:
If offsets are auto-committed, chances are by the time the processing of the message fails and crashes the consumer, the offset is already committed and the next consumer in the group that takes over would not see that message anymore.
If offsets are manually committed after processing is done, then the offset of that message will not be committed (for simplicity, I am assuming one message is read and processed at a time, but this can be easily generalized) because of the consumer crash. So any other consumer in the group that is (will be) subscribed to that topic will read the message again after taking over that partition. So it's possible that it will crash other consumers too. If offsets are committed before message processing, then the next consumers won't see the message because the offset is already committed when the first consumer crashed.
Some slow message blocks a single consumer: As long as the consumer is considered alive no other consumer in the group will take over. If the slowness goes beyond the consumer's session.timeout.ms (or, in clients that send heartbeats from a background thread, beyond max.poll.interval.ms between two poll() calls), the consumer will be considered dead and removed from the group. So whether another consumer in the group will read that message depends on how/when the offset is committed.
Slow and deadly message causes a consumer crash: This scenario should be similar to the previous ones in terms of how Kafka handles it. Either slowness is detected first or the crash occurs first. Again the main thing is how/when the offset is committed.
I hope that helps with your questions.
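The difference between committing before and after processing can be checked with a tiny simulation. It is only a model in plain Java under the same simplifying assumption as above (one partition, one message read and processed at a time); PoisonMessageModel and the "poison" marker are hypothetical, and no Kafka client is involved.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of a consumer group working through a single partition.
final class PoisonMessageModel {
    // Returns how many consumers crash. A record equal to "poison" kills
    // whichever consumer processes it; the next consumer in the group then
    // takes over from the group's committed offset.
    static int crashes(List<String> log, boolean commitBeforeProcessing, int consumers) {
        long committed = 0; // the group's committed offset for the partition
        int crashed = 0;
        while (crashed < consumers && committed < log.size()) {
            long position = committed; // a live consumer resumes here
            boolean alive = true;
            while (alive && position < log.size()) {
                String record = log.get((int) position);
                if (commitBeforeProcessing) {
                    committed = position + 1; // at-most-once style
                }
                if (record.equals("poison")) {
                    alive = false; // crash in the middle of processing
                    crashed++;
                } else {
                    if (!commitBeforeProcessing) {
                        committed = position + 1; // at-least-once style
                    }
                    position++;
                }
            }
        }
        return crashed;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("a", "poison", "b");
        System.out.println("commit after:  " + crashes(log, false, 3));
        System.out.println("commit before: " + crashes(log, true, 3));
    }
}
```

In this model, committing after processing lets the poison message take down every consumer in turn, while committing before processing loses that message but costs only one consumer. A dead-letter topic or a retry limit is the usual way to get out of the first situation without giving up at-least-once delivery for healthy messages.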
