Is it possible to configure/code a Kafka consumer application for "Exactly Once" failure recovery w/o calling Producer methods? - apache-kafka

Is it possible to configure/code a Kafka consumer application to unilaterally implement "Exactly Once Semantics" to handle failure recovery (i.e., resume where left off after a comm failure, etc) independent of producer code (calling KafkaProducer methods, etc)?
After some googling, it appears all the "Exactly Once Semantics" (EOS) demos I've found (at least so far) involve calling methods on both producer and consumer instances within the same application to accomplish this.
Here's an example: https://www.baeldung.com/kafka-exactly-once
Can an independent consumer/client application be configured for EOS failure recovery/resume - independent of producer code (i.e., calling KafkaProducer methods, etc)?
If so, can you point me to an example?

No, an independent consumer cannot be configured to consume messages from Kafka exactly-once.
You can have it either "at-most-once" or "at-least-once". Making it exactly-once depends heavily on what the consumer does with the data, and on how and when you commit the offsets back to Kafka.
You would have to implement this on your own. As an example, you could look at the implementation of Spark Structured Streaming (also: the spark-sql-kafka library), which uses write-ahead logs to ensure exactly-once semantics.

Although the other answer is correct, I would put it slightly differently:
the target / sink needs to be idempotent (a KV store, or an upsert into something like Kudu)
and the source needs to be replayable.
This blog explains it well, imho: https://www.waitingforcode.com/apache-spark-structured-streaming/fault-tolerance-apache-spark-structured-streaming/read
"...
Indeed, neither the replayable source nor commit log don't guarantee
exactly-once processing itself. What if the batch commit fails ? As
told previously, the engine will detect the last committed offsets as
offsets to reprocess and output once again the processed data to the
sink. It'll obviously lead to a duplicated output. But it'd be the
case only when the writes and the sink aren't idempotent.
An idempotent write is the one that generates the same written data
for given input. The idempotent sink is the one that writes given
generated row only once, even if it's sent multiple times. A good
example of such sink are key-value data stores. Now, if the writer is
idempotent, obviously it generates the same keys every time and since
the row identification is key-based, the whole process is idempotent.
Together with replayable source it guarantees exactly-once end-2-end
processing.
..."
As a native English speaker I'm not 100% sure the "don't" is correct, but I think we can get the drift.
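The combination described in the quote (replayable source plus idempotent sink) can be illustrated with a small sketch. Everything here is a hypothetical in-memory stand-in: `UpsertSink` and `ReplayDemo` are illustrative names, not Kafka, Spark, or Kudu APIs, and the list of records plays the role of a replayable partition.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a key-value sink (not a real Kafka or Kudu API).
class UpsertSink {
    private final Map<String, String> rows = new HashMap<>();

    // Upsert: writing the same key and value again leaves the store unchanged.
    void upsert(String key, String value) {
        rows.put(key, value);
    }

    Map<String, String> snapshot() {
        return new HashMap<>(rows);
    }
}

class ReplayDemo {
    // Deterministic write: key and value are derived only from the input record.
    static void process(List<String[]> records, int fromOffset, UpsertSink sink) {
        for (int offset = fromOffset; offset < records.size(); offset++) {
            String[] record = records.get(offset);
            sink.upsert(record[0], record[1]);
        }
    }

    public static void main(String[] args) {
        List<String[]> source = Arrays.asList(
            new String[] {"order-1", "PAID"},
            new String[] {"order-2", "PAID"},
            new String[] {"order-3", "REFUNDED"});
        UpsertSink sink = new UpsertSink();
        process(source, 0, sink); // first attempt; assume the offset commit is lost afterwards
        process(source, 0, sink); // replay from the last committed offset (still 0)
        System.out.println(sink.snapshot().size() + " rows after replay");
    }
}
```

The replay writes every record a second time, but because each write lands on the same key with the same value, the sink ends up in the same state as after a single pass.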

Related

Is message deduplication essential on the Kafka consumer side?

Kafka documentation states the following as the top scenario:
To process payments and financial transactions in real-time, such as
in stock exchanges, banks, and insurances
Also, regarding the main concepts, right at the very top:
Kafka provides various guarantees such as the ability to process
events exactly-once.
It’s funny the document says:
Many systems claim to provide "exactly once" delivery semantics, but
it is important to read the fine print, most of these claims are
misleading…
It seems obvious that payments/financial transactions must be processed "exactly-once", but the rest of the Kafka documentation doesn't make it obvious how this should be accomplished.
Let’s focus on the producer/publisher side:
If a producer attempts to publish a message and experiences a network
error it cannot be sure if this error happened before or after the
message was committed. This is similar to the semantics of inserting
into a database table with an autogenerated key. … Since 0.11.0.0, the
Kafka producer also supports an idempotent delivery option which
guarantees that resending will not result in duplicate entries in the
log.
KafkaProducer only ensures that it doesn’t incorrectly resubmit messages (resulting in duplicates) itself. Kafka cannot cover the case where client app code crashes (along with KafkaProducer) and it is not sure if it previously invoked send (or commitTransaction in case of transactional producer) which means that application-level retry will result in duplicate processing.
Exactly-once delivery for other destination systems generally
requires cooperation with such systems, but Kafka provides the offset
which makes implementing this feasible (see also Kafka Connect).
The above statement is only partially correct, meaning that while it exposes offsets on the Consumer side, it doesn’t make exactly-once feasible at all on the producer side.
Kafka consume-process-produce loop enables exactly-once processing leveraging sendOffsetsToTransaction, but again cannot cover the case of the possibility of duplicates on the first producer in the chain.
The provided official demo for EOS (Exactly once semantics) only provides an example for consume-process-produce EOS.
Solutions involving DB transaction log readers which read already committed transactions, also cannot be sure if they will produce duplicate messages in case they crash.
There is no support for a distributed transaction (XA) involving a database and the Kafka producer.
Does all of this mean that in order to ensure exactly-once processing for payments and financial transactions (Kafka's top use case!), we absolutely must perform business-level message deduplication on the consumer side, in spite of the Kafka transport-level "guarantees"/claims?
Note: I’m aware of:
Kafka Idempotent producer
but I would like a clear answer if deduplication is inevitable on the consumer side.
You must deduplicate on the consumer side, since a rebalance can cause events to be processed more than once within a consumer group, depending on the fetch size and commit interval parameters.
If a consumer exits without committing its offsets back to the broker, Kafka will assign those events to another consumer in the group. For example, suppose you pull a batch of 5 events and the consumer dies or restarts after processing the first 3 (because the external API/DB fails or, in the worst case, your server runs out of memory and crashes). The consumer dies abruptly without committing/acking to the broker, so after the rebalance the same batch is assigned to another consumer in the group, which receives the same events again and reprocesses them, resulting in duplication. A good read here: https://quarkus.io/blog/kafka-commit-strategies/
You can make use of a Kafka Streams state store for deduplication. There is no offset/partition tracking here; it's a kind of cache (persistent and time-bound on the cluster).
In my case we push the correlationId (a unique business identifier in the incoming event) into it after an event is processed successfully, and every new event is checked against it before processing to make sure it's not a duplicate. Just an FYI: enabling a state store creates additional internal topics in the Kafka cluster.
https://kafka.apache.org/10/documentation/streams/developer-guide/processor-api.html#state-stores
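As a rough sketch of the check-before-processing idea described above: the class below uses a plain in-memory `Set` as a stand-in for the state store, so it is not the Kafka Streams API and has no persistence or time-based expiry. `CorrelationIdDeduplicator` and `processOnce` are hypothetical names. Note that, as in the description, the ID is recorded only after the handler succeeds, so a crash between the two steps can still let one duplicate through.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a state store keyed by correlationId (in-memory, no expiry).
class CorrelationIdDeduplicator {
    private final Set<String> processedIds = new HashSet<>();

    // Runs the handler only if this correlationId has not been processed yet.
    // Returns true if the event was processed, false if it was skipped as a duplicate.
    boolean processOnce(String correlationId, Runnable handler) {
        if (processedIds.contains(correlationId)) {
            return false;
        }
        handler.run();                   // if this throws, the ID is not recorded
        processedIds.add(correlationId); // record only after successful processing
        return true;
    }

    public static void main(String[] args) {
        CorrelationIdDeduplicator dedup = new CorrelationIdDeduplicator();
        String[] batch = {"c-1", "c-2", "c-1"}; // "c-1" is redelivered after a rebalance
        for (String id : batch) {
            boolean handled = dedup.processOnce(id, () -> System.out.println("processing " + id));
            System.out.println(id + " handled=" + handled);
        }
    }
}
```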

Processing Unprocessed Records in Kafka on Recovery/Rebalance

I'm using Spring Kafka to interface with my Kafka instance. Assume that I have a single topic with, say, 2+ partitions.
In the instances where, for example, my Spring Kafka-based application crashes (or even rebalances), and then comes back online and there are messages waiting in the topic, I'm currently using a strategy where the latest committed offsets for each partition are stored in an external store, which I then look up on a consumer's assignment to a partition and then seek to that offset to resume processing.
(This is based on a strategy I'd read about in an O'Reilly book.)
Is there a better way of handling this situation in order to implement "exactly once" semantics and not to miss any waiting messages? Or is there a better/more idiomatic way with Spring Kafka to handle this situation?
Thanks in advance.
Is there a reason you don't checkpoint your offsets to kafka itself?
generally, your options for "exactly once" processing are:
1. store your offsets and your side-effects together transactionally. this is only possible if your side effects go into a transaction-capable system (say a database)
2. use kafka transactions. this is a simplified variant of 1 as long as your side effects go to the same kafka cluster you read from
3. come up with a scheme that allows you to detect and disregard duplicates downstream of your kafka pipeline (aka idempotence)
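The first option (offsets and side effects stored together) could look roughly like the sketch below. It is a simulation under stated assumptions: `TransactionalStore` is a hypothetical in-memory stand-in for a database transaction, the list stands in for one partition, and `failAtOffset` is only a test hook to simulate a rollback. In a real consumer, the stored offset would be applied on partition assignment with `KafkaConsumer.seek`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory "database": the result and the next offset change together or not at all.
class TransactionalStore {
    private final List<String> results = new ArrayList<>();
    private long nextOffset = 0;

    // Stand-in for a single database transaction covering both writes.
    synchronized void commit(String result, long newNextOffset, boolean simulateFailure) {
        if (simulateFailure) {
            throw new IllegalStateException("transaction rolled back"); // nothing is applied
        }
        results.add(result);
        nextOffset = newNextOffset;
    }

    synchronized long nextOffset() { return nextOffset; }

    synchronized List<String> results() { return new ArrayList<>(results); }
}

class OffsetInDbConsumer {
    // Resumes from the offset stored alongside the results, not from Kafka's committed offset.
    static void run(List<String> partition, TransactionalStore store, long failAtOffset) {
        for (long offset = store.nextOffset(); offset < partition.size(); offset++) {
            String result = partition.get((int) offset).toUpperCase();
            store.commit(result, offset + 1, offset == failAtOffset);
        }
    }

    public static void main(String[] args) {
        List<String> partition = Arrays.asList("a", "b", "c");
        TransactionalStore store = new TransactionalStore();
        try {
            run(partition, store, 1); // simulated crash while handling offset 1
        } catch (IllegalStateException e) {
            System.out.println("crashed, stored offset = " + store.nextOffset());
        }
        run(partition, store, -1);    // restart: resume from the stored offset
        System.out.println(store.results());
    }
}
```

Because the result and the offset are written in the same transaction, a restart never sees a result without its offset or an offset without its result, so each record contributes exactly one result.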

How to ensure exactly once semantics while processing kafka messages in Apache Storm

I need exactly-once delivery in my app. I explored Kafka and realised that to have a message produced exactly once, I have to set enable.idempotence=true in the producer config. This also sets acks=all, making the producer resend messages until all replicas have committed them. To ensure that the consumer neither processes a message twice nor leaves any message unprocessed, it is advised to commit the processing output and the offset to an external database in the same database transaction, so that either both are persisted or neither is, avoiding both duplicate processing and missed processing.
In the consumer, a message is left unprocessed if the consumer commits it first but fails before processing it, and a message is processed more than once if the consumer processes it first but fails before committing it.
Q1. Now I am wondering how I can imitate the same with Apache Storm. I guess exactly-once production of messages can be ensured by setting enable.idempotence=true in KafkaBolt. Am I right?
I am also wondering how I can avoid missed and duplicate message processing in Storm. For example, this doc page says that if I anchor a tuple (by passing it as the first parameter to OutputCollector.emit()) and then pass the tuple to OutputCollector.ack() or OutputCollector.fail(), Storm will avoid data loss. This is what it says exactly:
Now that you understand the reliability algorithm, let's go over all the failure cases and see how in each case Storm avoids data loss:
A tuple isn't acked because the task died: In this case the spout tuple ids at the root of the trees for the failed tuple will time out and be replayed.
Acker task dies: In this case all the spout tuples the acker was tracking will time out and be replayed.
Spout task dies: In this case the source that the spout talks to is responsible for replaying the messages. For example, queues like Kestrel and RabbitMQ will place all pending messages back on the queue when a client disconnects.
Q2. I guess this ensures that message is not left unprocessed, but does not avoid duplicate processing of messages. Am I correct with this? Also is there anything else that Storm offers to ensure exactly once semantics like kafka that I am missing?
Regarding Q1: Yes, you can get the same behavior from the KafkaBolt by setting that property, the KafkaBolt simply wraps a KafkaProducer.
Regarding semantics on the consuming side, you have the same options with Storm as you do with Kafka. When you read a message from Kafka, you can choose to commit before or after you do your processing (e.g. write to a database). If you do it before, and the program crashes, you will lose the message. Let's call this at-most-once processing. If you do it after, you risk processing the same message twice if the program crashes after the processing but before the commit, called at-least-once processing.
So, regarding Q2: Yes, using anchored tuples and acking will provide you with at-least-once semantics. Not using anchored tuple would give you at-most-once.
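The difference between committing before and after processing can be seen in a small simulation. This is a sketch with hypothetical names (`CommitOrderSimulation`, `run`); it models a single consumer, one crash, and a restart from the last committed offset, and does not use the Kafka or Storm APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CommitOrderSimulation {
    // Returns every message that was processed across one crash and the following restart.
    // The crash happens at crashAtOffset, between the two steps (commit and process).
    static List<String> run(List<String> log, boolean commitBeforeProcessing, int crashAtOffset) {
        List<String> processed = new ArrayList<>();
        int committed = 0;       // next offset to read after a restart
        boolean crashed = false; // the crash is simulated only once
        int offset = 0;
        while (offset < log.size()) {
            boolean crashNow = !crashed && offset == crashAtOffset;
            if (commitBeforeProcessing) {
                committed = offset + 1;          // commit first ...
                if (crashNow) {                  // ... crash before processing
                    crashed = true;
                    offset = committed;          // restart from the committed offset
                    continue;
                }
                processed.add(log.get(offset));
            } else {
                processed.add(log.get(offset));  // process first ...
                if (crashNow) {                  // ... crash before committing
                    crashed = true;
                    offset = committed;          // restart from the committed offset
                    continue;
                }
                committed = offset + 1;
            }
            offset++;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("m0", "m1", "m2");
        System.out.println("commit first:  " + run(log, true, 1));
        System.out.println("process first: " + run(log, false, 1));
    }
}
```

With the crash at offset 1, the commit-first run never processes `m1` (at-most-once), while the process-first run handles `m1` twice (at-least-once).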
Yes, there is something else Storm offers to ensure exactly once semantics called Trident, but it requires you to write your topology differently, and your data store has to be adapted to it so message deduplication can happen. See the documentation at https://storm.apache.org/releases/2.0.0/Trident-tutorial.html.
Also just to caution you: When documentation for Storm (or Kafka) talks about exactly-once semantics, there are some assumptions made about what kind of processing you'll do. For example, when Storm's Trident docs talk about exactly-once, there's an assumption that you'll adapt your database so you can decide when given a message whether it has already been stored. When Kafka's documentation talks about exactly-once, the assumption is that your processing will be reading from Kafka, doing some computation (most likely with no side effects) and writing back to Kafka.
This is just to say that for some types of processing, you may still need to pick between at-least-once and at-most-once. If you can make your processing idempotent, at-least-once is a good option.
Finally if your processing fits the "read from Kafka, do computation, write to Kafka" model, you can likely get nicer semantics out of Kafka Streams than Storm, as Storm can't provide the exactly-once semantics Kafka can provide in that case.

When to use Kafka transactional API?

I was trying to understand Kafka's transactional API. This link defines atomic read-process-write cycle as follows:
First, let’s consider what an atomic read-process-write cycle means. In a nutshell, it means that if an application consumes a message A at offset X of some topic-partition tp0, and writes message B to topic-partition tp1 after doing some processing on message A such that B = F(A), then the read-process-write cycle is atomic only if messages A and B are considered successfully consumed and published together, or not at all.
It further says the following:
Using vanilla Kafka producers and consumers configured for at-least-once delivery semantics, a stream processing application could lose exactly once processing semantics in the following ways:
1. The producer.send() could result in duplicate writes of message B due to internal retries. This is addressed by the idempotent producer and is not the focus of the rest of this post.
2. We may reprocess the input message A, resulting in duplicate B messages being written to the output, violating the exactly once processing semantics. Reprocessing may happen if the stream processing application crashes after writing B but before marking A as consumed. Thus when it resumes, it will consume A again and write B again, causing a duplicate.
3. Finally, in distributed environments, applications will crash or—worse!—temporarily lose connectivity to the rest of the system. Typically, new instances are automatically started to replace the ones which were deemed lost. Through this process, we may have multiple instances processing the same input topics and writing to the same output topics, causing duplicate outputs and violating the exactly once processing semantics. We call this the problem of “zombie instances.”
We designed transaction APIs in Kafka to solve the second and third problems. Transactions enable exactly-once processing in read-process-write cycles by making these cycles atomic and by facilitating zombie fencing.
Doubts:
Points 2 and 3 above describe when message duplication can occur which are dealt with using transactional API. Does transactional API also help to avoid message loss in any scenario?
Most online (for example, here and here) examples of Kafka transactional API involve:
    while (true) {
        ConsumerRecords records = consumer.poll(Long.MAX_VALUE);
        producer.beginTransaction();
        for (ConsumerRecord record : records) {
            producer.send(producerRecord("outputTopic", record));
        }
        producer.sendOffsetsToTransaction(currentOffsets(consumer), group);
        producer.commitTransaction();
    }
This is basically a read-process-write loop. So is the transactional API useful only in a read-process-write loop?
This article gives example of transactional API in non read-process-write scenario:
    producer.initTransactions();
    try {
        producer.beginTransaction();
        producer.send(record1);
        producer.send(record2);
        producer.commitTransaction();
    } catch (ProducerFencedException e) {
        producer.close();
    } catch (KafkaException e) {
        producer.abortTransaction();
    }
It says:
This allows a producer to send a batch of messages to multiple partitions such that either all messages in the batch are eventually visible to any consumer or none are ever visible to consumers.
Is this example correct, and does it show another way to use the transactional API, different from the read-process-write loop? (Note that it also does not commit offsets to the transaction.)
In my application, I simply consume messages from kafka, do processing and log them to the database. That is my whole pipeline.
a. So, I guess this is not read-process-write cycle. Is Kafka transactional API of any use to my scenario?
b. Also I need to ensure that each message is processed exactly once. I guess setting idempotent=true in producer will suffice and I dont need transactional API, right?
c. I may run multiple instances of pipeline, but I am not writing processing output to Kafka. So I guess this will never involve zombies (duplicate producers writing to kafka). So, I guess transactional API wont help me to avoid duplicate processing scenario, right? (I might have to persist both offset along with processing output to the database in the same database transaction and read the offset during producer restart to avoid duplicate processing.)
a. So, I guess this is not read-process-write cycle. Is Kafka
transactional API of any use to my scenario?
It is a read-process-write, except you are writing to a database instead of Kafka. Kafka has its own transaction manager and thus writing inside a transaction with idempotency would enable exactly once processing, assuming you can resume the state of your consumer-write processor correctly. You cannot do that with a DB because the DB's transaction manager doesn't sync with Kafka's. What you can do instead is make sure that even if kafka transactions are not atomic with respect to your database, they are still eventually consistent.
Let's assume your consumer reads, writes to the DB and then acks. If the DB fails you don't ack and you can resume normally based on the offset. If the ack fails you will process twice and save to the DB twice. If you can make this operation idempotent, then you are safe. This means that your processor must be pure and the DB has to dedupe: processing the same message twice should always lead to the same result on the DB.
b. Also I need to ensure that each message is processed exactly once.
I guess setting idempotent=true in producer will suffice and I dont
need transactional API, right?
Assuming that you respect the requirements from point a, exactly once processing with persistence on a different store also requires that between your initial write and the duplicate no other change has happened to the objects that you are saving. Imagine having a value written as X, then some other actor changes it to Y, then the message is reprocessed and changes it back to X. This can be avoided for example, by making your database table be a log, similar to a kafka topic.
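The X, Y, X scenario and the log-style table can be sketched as follows. This is only an illustration under assumptions: `MutableRow` and `AppendOnlyLog` are hypothetical in-memory classes, and the idea of keying each log entry by a unique change id (for example the source partition and offset) is my assumption about how duplicates would be recognised, not something the answer specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A plain mutable row: every write overwrites the current value.
class MutableRow {
    String value;

    void write(String newValue) {
        value = newValue;
    }
}

// Hypothetical log-style table: entries are only appended, each under a unique change id.
class AppendOnlyLog {
    private final Map<String, String> entries = new LinkedHashMap<>(); // keeps insertion order

    // Appends the change unless this change id was already logged; returns true if appended.
    boolean append(String changeId, String value) {
        return entries.putIfAbsent(changeId, value) == null;
    }

    // The current value is the most recently appended entry.
    String latest() {
        String last = null;
        for (String value : entries.values()) {
            last = value;
        }
        return last;
    }

    public static void main(String[] args) {
        MutableRow row = new MutableRow();
        row.write("X"); // message processed
        row.write("Y"); // another actor updates the value
        row.write("X"); // message reprocessed: the newer value is overwritten
        System.out.println("mutable row: " + row.value);

        AppendOnlyLog log = new AppendOnlyLog();
        log.append("partition-0:offset-41", "X"); // message processed
        log.append("other-actor:change-1", "Y");  // another actor updates the value
        log.append("partition-0:offset-41", "X"); // reprocessed: already logged, ignored
        System.out.println("log table:   " + log.latest());
    }
}
```

With the mutable row, the replay silently reverts the other actor's change; with the log, the replayed entry is recognised by its change id and the latest value stays `Y`.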
c. I may run multiple instances of pipeline, but I am not writing processing output to Kafka. So I guess this will never involve zombies (duplicate producers writing to kafka). So, I guess transactional API wont help me to avoid duplicate processing scenario, right? (I might have to persist both offset along with processing output to the database in the same database transaction and read the offset during producer restart to avoid duplicate processing.)
It is the producer which writes to the topic you consume from that may create zombie messages. That producer needs to play nice with kafka so that zombies are ignored. The transactional API together with your consumer will make sure that this producer writes atomically and your consumer reads committed messages, albeit not atomically. If you want exactly once, idempotency is enough. If the messages are supposed to be atomically written, you need transactions too. Either way your read-write/consume-produce processor needs to be pure and you have to dedupe. Your DB is also part of this processor since the DB is the one that actually persists.
I've looked for a bit on the internet, maybe this link helps you: processing guarantees
The links you posted: exactly once semantics and transactions in kafka are great.

Kafka: Is it good practice to keep topic offset in database?

I have started learning Kafka. I don't have much experience with live projects where Kafka is used.
I wanted to know whether offsets can be saved in a database in addition to being committed to the broker.
I think they should always be saved; otherwise some records will be missed or re-processed.
For example, if the offset is not saved in a database and a message is sent to the broker while the application (consumer) is being deployed or restarted, that message will be missed, because when the consumer comes back up it will read only the records that follow (or start from the beginning).
the short answer to your question is "it's complicated" :-)
the long answer to your question is something like:
kafka (without extra configuration and/or careful design of your code) is an at-least-once system (see official documentation). this means that yes, your consumer may see a particular set of records more than once. this won't happen on a graceful shutdown/rebalance, but will definitely happen if your application crashes.
newer versions of kafka support so called "exactly once". this involves configuring your clients differently (and a significant performance and latency hit), and the guarantees only ever hold if all your inputs and outputs are from/to the exact same kafka cluster. so if your consumer does anything like call an external HTTP API or insert into a database in response to seeing a kafka record we are back to at-least-once.
if your outputs go to a transactional system (like a classic ACID database) a common pattern would be to start a transaction, and in that transaction record both your outputs and the consumer offsets (you would also need to change your code to restore from these DB offsets and not the kafka default). this has better guarantees (but still won't help if your code interacts with non-transactional systems, like making an HTTP call)
another common design pattern to overcome at-least-once is to somehow "tag" every operation you do (record you produce, http call you make ...) with some UUID that derives from the original kafka records consumed to produce this output. this means if your consumer sees the same record again, it will perform the same operations again, and repeat the same "tag" value. this shifts the burden to downstream systems that must now remember (at least for some period of time) all the "tags" they have seen so they could disregard a repeat operation, or somehow design all your operations to be idempotent
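one possible way to derive such a "tag" is a name-based UUID computed from the record's coordinates, sketched below. the choice of topic/partition/offset plus an operation name as the input is an assumption for illustration, and `OperationTags` and `DownstreamService` are hypothetical names; only `UUID.nameUUIDFromBytes` is a real JDK call.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

class OperationTags {
    // Same record coordinates and operation name always yield the same (name-based) UUID.
    static UUID tagFor(String topic, int partition, long offset, String operation) {
        String name = topic + "/" + partition + "/" + offset + "/" + operation;
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }
}

// Hypothetical downstream system that remembers the tags it has already applied.
class DownstreamService {
    private final Set<UUID> seenTags = new HashSet<>();
    private int applied = 0;

    // Applies the operation only the first time a tag is seen; repeats are disregarded.
    boolean apply(UUID tag) {
        if (!seenTags.add(tag)) {
            return false;
        }
        applied++;
        return true;
    }

    int appliedCount() {
        return applied;
    }

    public static void main(String[] args) {
        DownstreamService service = new DownstreamService();
        UUID first = OperationTags.tagFor("payments", 0, 42L, "charge");
        UUID replay = OperationTags.tagFor("payments", 0, 42L, "charge"); // same record seen again
        System.out.println("first applied:  " + service.apply(first));
        System.out.println("replay applied: " + service.apply(replay));
        System.out.println("operations applied: " + service.appliedCount());
    }
}
```

in practice the downstream side would need to bound how long it remembers tags, which this sketch leaves out.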