Kafka commitTransaction acknowledgement failure - apache-kafka

According to Kafka's commitTransaction documentation, commitTransaction will fail with a TimeoutException if it doesn't receive a response within a certain time:
Note that this method will raise TimeoutException if the transaction cannot be committed before expiration of max.block.ms. Additionally, it will raise InterruptException if interrupted. It is safe to retry in either case, but it is not possible to attempt a different operation (such as abortTransaction) since the commit may already be in the progress of completing. If not retrying, the only option is to close the producer.
Consider an application in which a Kafka producer sends a group of records as Transaction A.
After the records have been successfully sent to the topic, the Kafka producer executes commitTransaction.
The Kafka cluster receives the commit request and successfully commits the records that are part of transaction A, then sends an acknowledgement of the successful commit.
However, due to some issue this acknowledgement is lost, causing a TimeoutException at the producer's commitTransaction call. Thus, even though the records have been committed on the Kafka cluster, from the producer's perspective the commit failed.
Generally in such a scenario the application would retry sending the transaction A records in a new transaction B, but this would lead to duplication of records as they were already committed as part of transaction A.
Is the above described scenario possible?
How do you handle loss of commitTransaction acknowledgement and the eventual duplication of records that is caused by it?
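As a minimal sketch of the retry-then-close pattern described in the javadoc above, using the standard Java producer (the topic name, record contents and retry limit are placeholders, not part of the question):

import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.TimeoutException;
import org.apache.kafka.common.serialization.StringSerializer;

public class TransactionalSend {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // A stable transactional.id is what lets a restarted producer resolve an
        // unfinished transaction via initTransactions().
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-tx-1");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();

        try {
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("orders", "key", "value"));

            // A TimeoutException from commitTransaction() does NOT mean the commit failed:
            // it may already have succeeded on the broker. Retrying the same commit is safe;
            // switching to abortTransaction() at this point is not.
            int attempts = 0;
            while (true) {
                try {
                    producer.commitTransaction();
                    break;
                } catch (TimeoutException e) {
                    if (++attempts >= 3) {
                        throw e; // outcome still unknown after several retries
                    }
                }
            }
        } catch (KafkaException e) {
            // Per the javadoc, once the commit may be in progress the remaining option is to
            // close the producer. On restart, initTransactions() with the same transactional.id
            // completes a commit that had already begun, or aborts an unfinished transaction;
            // the producer API itself does not report which of the two outcomes occurred.
            producer.close(Duration.ofSeconds(5));
            throw e;
        }
        producer.close();
    }
}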

Related

Kafka Streams Apps Threads fail transaction and are fenced and restarted after Kafka broker restart

We are noticing Streams Apps threads fail transactions during rolling restarts of our Kafka Brokers. The transaction failure causes stream thread fencing which in turn causes a restart of the thread and re-balancing. The re-balancing causes some delay in processing. Our goal is to make broker restarts as smooth as possible and prevent processing delays as much as possible.
For our rolling broker restarts we use the controlled.shutdown.enable=true configuration, and before each restart we wait for all partitions to be in sync across all replicas.
For our Streams Apps we have properly configured group.instance.id and an appropriate session.timeout.ms so that rolling restarts of the streams apps themselves are smooth and without re-balances.
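A rough sketch of the Streams side of that configuration (property names are the standard consumer/Streams configs; the values and the broker comment are illustrative, not taken from the question):

import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class RollingRestartFriendlyConfig {

    static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092");
        // Static membership plus a generous session timeout, so a bounced Streams
        // instance can rejoin without triggering a rebalance.
        props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "app-instance-1");
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 60_000);
        // Broker side (server.properties): controlled.shutdown.enable=true
        return props;
    }
}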
From the Kafka Streams app logs I have identified a sequence of events leading up to the fencing:
Broker starts shutting down
App logs error producing to topic due to NOT_LEADER_OR_FOLLOWER
App heartbeats fail because its group coordinator is the restarting broker
App discovers new group coordinator (this bounces a bit between the restarting broker and live brokers)
App stabilizes
Broker starting up again
App fails to do fetch request to starting broker due to FETCH_SESSION_ID_NOT_FOUND
App discovers starting broker as transaction coordinator
App transaction fails due to one of two reasons:
InvalidProducerEpochException: Producer attempted to produce with an old epoch.
ProducerFencedException: There is a newer producer with the same transactionalId which fences the current one
Stream threads end up in a fatal error state, get fenced, and are restarted, which causes a rebalance.
What could be causing the two exceptions that make the stream thread transactions fail? My intuition is that the broker starting up is assigned as transaction coordinator before it has synced its transaction state with the in-sync brokers, which could explain why it knows old epochs or different transactional ids.
How can we further identify what is going wrong here and how it can be improved?
You can set request.timeout.ms in Kafka Streams, which makes the Streams API wait for a longer period of time. Only if the Kafka broker is not back up within that period will it throw an exception, which can then be handled with a ProductionExceptionHandler, as described in Handling exceptions in Kafka streams.
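As a minimal sketch of that suggestion (the handler class name is made up; the interface and config keys are the standard kafka-streams ones):

import java.util.Map;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.streams.errors.ProductionExceptionHandler;

// Decide per failed record whether the stream thread should die or keep going.
public class LogAndContinueProductionHandler implements ProductionExceptionHandler {

    @Override
    public ProductionExceptionHandlerResponse handle(ProducerRecord<byte[], byte[]> record,
                                                     Exception exception) {
        System.err.printf("Failed to produce to %s: %s%n", record.topic(), exception);
        // CONTINUE drops the record and keeps the thread alive; FAIL stops the thread.
        return ProductionExceptionHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(Map<String, ?> configs) { }
}

// Registration in the Streams configuration (illustrative values):
//   props.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG,
//             LogAndContinueProductionHandler.class);
//   props.put(StreamsConfig.producerPrefix(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG), 60_000);

Note that fencing errors such as ProducerFencedException are generally treated as fatal for the task and, as far as I can tell, are not routed through this handler, so it mainly helps with timeout-style send failures.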

What happens when ISR are down but message was written in leader

When there's a failure writing a message to all ISR replicas, but the message is already persisted on the leader, what happens?
Even if the request fails, is the data still available to consumers?
Are consumers able to read "uncommitted" data?
The message is not visible to the KafkaConsumer until the topic configuration min.insync.replicas is fulfilled.
Kafka expects the failed replica broker to come back up so that the replication can complete.
Note that this scenario is only relevant if your KafkaProducer configuration acks is set to all. When you set it to 0 or 1, the consumer will be able to consume the message as soon as the data arrives at the partition leader.
In general, the client (here the KafkaProducer) only communicates with the partition leader and, depending on its mode (synchronous, asynchronous, or fire-and-forget) and its acks configuration, waits or does not wait for a reply. The replication of the data itself is independent of the producer and is handled by the brokers themselves (the follower replicas fetch the data from the partition leader). The leader only reports the success or failure of the replication, and the leader and replicas continue to ensure that the replication level set by the topic configuration replication.factor is satisfied.
The producer may still try to resend the message in case it was persisted by the leader but not acknowledged by the replicas, depending on its acks setting and on whether the exception is retriable. In that case you will end up with duplicate values. You can avoid this by enabling idempotence.
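A short sketch of the producer settings discussed here (the broker address and class name are placeholders):

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class SafeProducerConfig {

    static KafkaProducer<String, String> create() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        // send() only succeeds once all in-sync replicas have the record; the topic-level
        // min.insync.replicas setting controls how far the ISR may shrink before such
        // writes are rejected instead.
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        // Broker-side de-duplication of retried batches, so a retry after a lost
        // acknowledgement does not create duplicates within a partition.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);

        return new KafkaProducer<>(props);
    }
}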

How to handle kafka consumer failures

I am trying to understand how to handle failed consumer records. How do we know there is a record failure? What I am seeing is that when record processing fails in the consumer with a runtime exception, the consumer keeps retrying. But when the next record is available to process, it commits the offset of the latest record, which is expected. My question is how do we know about the failed record. In older messaging systems, failed messages are rolled back to queues and processing stops there. Then we know the queue is down and we can take action.
I can record the failed record into some db table, but what happens if this recording fails?
I can move failures to an error/dead-letter queue, but again, what happens if this move fails?
I am using Kafka 2.6 with Spring Boot 2.3.4. Any help would be appreciated.
Sounds like you would need to disable auto commits and manually commit the offsets yourself once your scope of "successfully processed" is achieved. If you include external processes like a database, then you will also need to increase the Kafka client timeouts so that Kafka doesn't think the consumer is dead while it waits on error logging/handling.
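A minimal sketch of that approach with the plain Java consumer (in Spring Kafka the equivalent is enable.auto.commit=false plus a manual ack mode); the topic names, timeouts and the dead-letter helper below are illustrative only:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ManualCommitLoop {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-processor");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Commit only after the record has been handled successfully.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        // Allow time for slow database writes / error handling before the group
        // considers this consumer dead and rebalances.
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "600000");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    try {
                        process(record);                  // your business logic
                    } catch (RuntimeException e) {
                        sendToDeadLetterTopic(record, e); // if this throws too, we never commit,
                                                          // so the record is seen again after restart
                    }
                    // Commit the next offset for exactly this partition.
                    consumer.commitSync(Collections.singletonMap(
                            new TopicPartition(record.topic(), record.partition()),
                            new OffsetAndMetadata(record.offset() + 1)));
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) { /* ... */ }

    private static void sendToDeadLetterTopic(ConsumerRecord<String, String> record, Exception e) {
        // Illustrative placeholder: publish the failed record to e.g. "orders.DLT"
        // with a shared producer, including the error details in headers.
    }
}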

Is it possible to log or handle automatic kafka producer retries when acks=all

I know I can set acks=all in the Kafka producer configuration to make the producer wait for an acknowledgement from the leader after all replicas receive the message. If the acknowledgement times out, the producer retries sending the message. This happens transparently, without requiring any code changes. Is it possible to get some stats on those retries? Is it possible to know which messages involved retries and how many? Does Kafka provide any kind of hook to be called before/after a retry so that we can log some message?
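As a partial illustration of what the client exposes: the producer publishes aggregate retry counters (among them record-retry-total and record-retry-rate) through its metrics() map and over JMX; to my knowledge there is no public per-message retry hook. A small sketch that logs those counters:

import java.util.Map;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

public class RetryMetrics {

    // Print the producer's retry-related metrics (aggregate counters, not per-message).
    static void logRetryStats(KafkaProducer<?, ?> producer) {
        for (Map.Entry<MetricName, ? extends Metric> entry : producer.metrics().entrySet()) {
            MetricName name = entry.getKey();
            if (name.name().contains("record-retry")) {
                System.out.printf("%s (group=%s, tags=%s) = %s%n",
                        name.name(), name.group(), name.tags(), entry.getValue().metricValue());
            }
        }
    }
}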

Put() vs Flush() in Kafka Connector Sink Task

I am trying to send data in batches to a NoSQL database using a Kafka sink connector. I am following the https://kafka.apache.org/documentation/#connect documentation and am confused about where the logic for sending records has to be implemented. Please help me understand how the records are processed internally and which of put() or flush() has to be used to process the records in a batch.
When a Kafka Connect worker is running a sink task, it will consume messages from the topic partition(s) assigned to the task. As it does so, it repeatedly passes a batch of messages to the sink task through the put(Collection<SinkRecord>) method. This will continue as long as the connector and its tasks are running.
Kafka Connect also will periodically record the progress of the sink tasks, namely the offset of the most recently processed message on each topic partition. This is called committing the offsets, and it does this so that if the connector stops unexpectedly and uncleanly, Kafka Connect knows where in each topic partition the task should resume processing messages. But just before Kafka Connect writes the offsets to Kafka, the Kafka Connect worker gives the sink connector an opportunity to do work during this stage via the flush(...) method.
A particular sink connector might not need to do anything (if put(...) did all of the work), or it might use this opportunity to submit all of the messages already processed via put(...) to the data store. For example, Confluent's JDBC sink connector writes each batch of messages passed through the put(...) method using a transaction (the size of which can be controlled via the connector's consumer settings), and thus the flush(...) method doesn't need to do anything. Confluent's Elasticsearch sink connector, on the other hand, simply accumulates all of the messages from a series of put(...) calls and only writes them to Elasticsearch during flush(...).
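A minimal sketch of that second style (buffer in put(), write in flush()); the class and helper names are made up, and a real task would also need retries and partition open/close handling:

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

public class BufferingNoSqlSinkTask extends SinkTask {

    private final List<SinkRecord> buffer = new ArrayList<>();

    @Override
    public void start(Map<String, String> props) {
        // Open the connection to the target store here (settings come from props).
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        // Called repeatedly with batches of records; either write them out right here,
        // or just buffer them and defer the write to flush().
        buffer.addAll(records);
    }

    @Override
    public void flush(Map<TopicPartition, OffsetAndMetadata> currentOffsets) {
        // Invoked just before Kafka Connect commits the consumer offsets, so writing the
        // buffered records here ensures nothing is marked processed in Kafka before it
        // has reached the target store.
        writeBatchToStore(buffer); // hypothetical helper wrapping the NoSQL client's batch write
        buffer.clear();
    }

    @Override
    public void stop() {
        // Close connections and release resources.
    }

    @Override
    public String version() {
        return "0.1.0";
    }

    private void writeBatchToStore(List<SinkRecord> records) {
        // Placeholder: translate the SinkRecords and issue the batch write.
    }
}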
The frequency at which the offsets are committed for source and sink connectors is controlled by the worker's offset.flush.interval.ms configuration property. The default is to commit offsets every 60 seconds, which is infrequent enough to improve performance and reduce overhead, but frequent enough to cap the potential amount of re-processing should the connector task unexpectedly die. Note that when the connector is shut down gracefully or experiences an exception, Kafka Connect will always have a chance to commit the offsets. It's only when the Kafka Connect worker is killed unexpectedly that it might not have a chance to commit the offsets identifying what messages had been processed. Thus, only after restarting from such a failure will the connector potentially re-process some messages that it had already processed just prior to the failure. And it's because messages will potentially be seen at least once that message processing should be idempotent. Take all of this, plus your connectors' behavior, into account when determining appropriate values for this setting.
Have a look at the Confluent documentation for Kafka Connect as well as open source sink connectors for more examples and details.