Kafka & JPA transaction management - apache-kafka

I am new to Kafka and I am trying to manage transactions that span Kafka and the database. I have already read many articles on this topic, but so far I have only been able to test one scenario successfully.
@Transactional
public void updateData(InputData data) {
    repository.save(data);
    kafkaTemplate.send(data.id, data);
}
In this case, if the Kafka transaction fails, the DB transaction is rolled back. This works fine.
But is it possible to run the Kafka transaction first and then the DB transaction, so that if the DB transaction fails, the Kafka transaction is aborted and the message posted to the Kafka topic remains uncommitted?
I tested such a scenario, but it didn't work: the message posted to the topic did not end up in an uncommitted state. Hence I want to check whether this scenario is possible.

I solved this problem by using nested @Transactional annotations: I put @Transactional("kafkaTxManager") on the method in which I start the Kafka transaction and @Transactional("chainedKafkaTxManager") on the method in which I start the DB transaction.
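A minimal sketch of that arrangement, assuming the two methods live in separate beans so that both @Transactional proxies apply; the class names (KafkaPublisher, DataService, DataRepository) are illustrative assumptions:

@Service
public class KafkaPublisher {

    private final KafkaTemplate<String, InputData> kafkaTemplate;
    private final DataService dataService;

    public KafkaPublisher(KafkaTemplate<String, InputData> kafkaTemplate, DataService dataService) {
        this.kafkaTemplate = kafkaTemplate;
        this.dataService = dataService;
    }

    // Outer transaction: the Kafka transaction is started first.
    @Transactional("kafkaTxManager")
    public void publish(InputData data) {
        kafkaTemplate.send(data.id, data);   // written to the topic, but not committed yet
        dataService.updateData(data);        // if this throws, the Kafka transaction is aborted
    }
}

@Service
public class DataService {

    private final DataRepository repository;

    public DataService(DataRepository repository) {
        this.repository = repository;
    }

    // Inner transaction: the DB work runs under the chained transaction manager.
    @Transactional("chainedKafkaTxManager")
    public void updateData(InputData data) {
        repository.save(data);
    }
}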

Related

Kafka transactions: how to emulate an error to roll back a transaction

I want to test Kafka transactional behavior in depth; in particular, I need to test what happens at the transaction boundary, if that is possible.
For example, I want the commit of the transaction to fail so that I can test what happens during a transaction rollback.
Is this possible?
For example, when talking about database transactions, one easy way to test a failed transaction is to try to persist a record with a field whose length exceeds the maximum.
But with Kafka I don't know how to emulate something similar that triggers a Kafka transaction rollback.
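This is not from the original thread, but one possible way to force an abort, assuming a transactional KafkaTemplate (i.e. a transaction-id-prefix is configured) and a transaction-manager bean named "kafkaTransactionManager" (both assumptions): throw inside the transactional method right after send(). The record is written to the topic but never committed, and a consumer configured with isolation.level=read_committed will not see it.

@Transactional("kafkaTransactionManager")
public void publishAndFail(String topic, String payload) {
    kafkaTemplate.send(topic, payload);
    // Simulated failure inside the transaction boundary: the exception aborts
    // the Kafka transaction, so the record above stays uncommitted.
    throw new RuntimeException("Simulated failure");
}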

Kafka transaction management at producer end

I am trying to understand how Kafka behaves when the producer runs inside a transaction.
I have Oracle database insert operations running in the same transaction, and their changes are undone if the transaction is rolled back.
How does the Kafka producer behave in case of a transaction rollback?
Will the message be rolled back, or does Kafka not support rollback?
I know that JMS messages are committed to the queue only when the transaction is committed. I am looking for a similar solution, if it is supported.
Note: the producer code is written using Spring Boot.
You are trying to update two systems:
update a record in your Oracle database
send an event to Apache Kafka
This is a challenge because you would like it to be atomic: either everything gets executed or nothing does; otherwise you will end up with inconsistencies between your database and Kafka.
You might send a Kafka message even though the database transaction was rolled back.
Or the other way around (if you send the message just after the commit): you might commit the database transaction and then crash (for some reason) just before sending the Kafka event.
One of the simplest solutions is to use the outbox pattern:
Let's say you want to update an order table and send an orderEvent to Kafka.
Instead of sending the event to Kafka in the same transaction,
you save it to a database table (the outbox) using the same transaction as the order update.
A separate process reads the outbox table and makes sure the events are sent to Kafka (with at-least-once semantics).
Your consumer needs to be idempotent.
In this post, I explain in more detail how to implement this solution:
https://mirakl.tech/sending-kafka-message-in-a-transactional-way-34d6d19bb7b2
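To make that flow concrete, here is a minimal sketch of the pattern, not taken from the linked post; the entity, repository, and topic names (Order, OutboxEvent, OutboxRepository, "order-events") and the polling relay are illustrative assumptions:

@Service
public class OrderService {

    private final OrderRepository orderRepository;
    private final OutboxRepository outboxRepository;

    public OrderService(OrderRepository orderRepository, OutboxRepository outboxRepository) {
        this.orderRepository = orderRepository;
        this.outboxRepository = outboxRepository;
    }

    // One local DB transaction: the order update and the outbox row commit (or roll back) together.
    @Transactional
    public void updateOrder(Order order, String orderEventPayload) {
        orderRepository.save(order);
        outboxRepository.save(new OutboxEvent(order.getId(), orderEventPayload));
    }
}

@Component
public class OutboxRelay {

    private final OutboxRepository outboxRepository;
    private final KafkaTemplate<String, String> kafkaTemplate;

    public OutboxRelay(OutboxRepository outboxRepository, KafkaTemplate<String, String> kafkaTemplate) {
        this.outboxRepository = outboxRepository;
        this.kafkaTemplate = kafkaTemplate;
    }

    // Polling publisher: relays pending outbox rows to Kafka with at-least-once semantics.
    @Scheduled(fixedDelay = 1000)
    public void relay() {
        for (OutboxEvent event : outboxRepository.findAll()) {
            try {
                // Wait for the broker acknowledgement before deleting the row,
                // otherwise a failed asynchronous send could lose the event.
                kafkaTemplate.send("order-events", event.getAggregateId(), event.getPayload()).get();
                outboxRepository.delete(event);
            } catch (Exception e) {
                // Leave the row in place; it will be retried on the next run,
                // which is why the consumer must be idempotent.
                return;
            }
        }
    }
}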

SAGA and local transactions with Kafka and Postgres in Spring Boot

I haven't worked with SAGAs and spring-kafka (and spring-cloud-stream-kafka-binder) for a while.
Context: there are several (3+) Spring Boot microservices that have to span a business transaction in order to keep data in an eventually consistent state. They use the database-per-service approach (each service stores its data in Postgres) and collaborate via Kafka as an event store.
I'm going to apply the SAGA pattern (either the choreography or the orchestration approach; let's stick with the first one) to manage transactions across the services.
The question is: how do I support local transactions when using an RDBMS (Postgres) as the data store along with Kafka as the event store / messaging middleware?
Nowadays, does spring-kafka actually support JTA transactions, and would it be enough to wrap the RDBMS and the Kafka producer into @Transactional methods? Or do we still have to apply some of the transactional microservices patterns (like Transactional Outbox, Transaction Log Tailing, or Polling Publisher)?
Thanks in advance
Kafka does not support JTA/XA. The best you can do is "Best Effort 1PC" - see Dave Syer's Javaworld article; you have to handle possible duplicates.
Spring for Apache Kafka provides the ChainedKafkaTransactionManager; for consumer-initiated transactions, the CKTM should be injected into the listener container.
The CKTM should have the KTM first, followed by the RDBMS, so the RDBMS transaction will be committed first; if it fails, the Kafka tx will roll back and the record redelivered. If the DB succeeds but Kafka fails, the record will be redelivered (with default configuration).
For producer-only transactions, you can use @Transactional. In that case, the TM can just be the RDBMS TM and Spring Kafka will synchronize a local Kafka transaction, committing last.
See here for more information.
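A minimal configuration sketch of that consumer-initiated setup, assuming a JPA-backed datastore; the bean names and generic types are illustrative assumptions:

@Configuration
public class KafkaTxConfig {

    @Bean
    public ChainedKafkaTransactionManager<Object, Object> chainedKafkaTxManager(
            KafkaTransactionManager<Object, Object> kafkaTransactionManager,
            JpaTransactionManager jpaTransactionManager) {
        // KTM first, DB second: commits happen in reverse order, so the DB commits first;
        // if the DB commit fails, the Kafka transaction is rolled back and the record is redelivered.
        return new ChainedKafkaTransactionManager<>(kafkaTransactionManager, jpaTransactionManager);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory,
            ChainedKafkaTransactionManager<Object, Object> chainedKafkaTxManager) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Inject the chained TM so consumer-initiated transactions span Kafka and the DB.
        factory.getContainerProperties().setTransactionManager(chainedKafkaTxManager);
        return factory;
    }
}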

How to implement Kafka manual offset commits for database transactions

I am using Kafka with .NET services that consume messages from Kafka; once I take a message from Kafka, I send it to the database and then perform some operations there.
My code currently works as follows:
The .NET service executes a method that returns a message from Kafka. At this point Kafka already records the message as consumed by my code.
After the message is returned, I perform the database operations, so if something goes wrong in the DB, the message does not get processed. But when I retry the process from my service, it does not receive the same message again, because it is already marked as consumed in Kafka.
So I want the consumer to commit not when my service returns the Kafka message, but only after the DB operation has completed successfully; only then should the message be marked as consumed in Kafka.
Can anyone suggest how I can implement manual offset commits for this case?
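The question is about a .NET consumer, but the pattern is the same in any client: disable auto-commit and commit the offset only after the DB work succeeds. Here is a minimal sketch with the plain Java consumer; the topic, group id, and saveToDatabase(...) helper are illustrative assumptions:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommitConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "db-writer");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // no automatic offset commits
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("input-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    saveToDatabase(record.value()); // if this throws, the offset is never committed
                }
                // Only mark the batch as consumed once every record has been persisted;
                // after a failure and restart, the same records are redelivered.
                consumer.commitSync();
            }
        }
    }

    private static void saveToDatabase(String value) {
        // placeholder for the actual DB insert/update
    }
}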

How does Spring Kafka / Spring Cloud Stream guarantee transactionality/atomicity involving a database and Kafka?

Spring Kafka, and consequently Spring Cloud Stream, allows us to create transactional producers and processors. We can see that functionality in action in one of the sample projects: https://github.com/spring-cloud/spring-cloud-stream-samples/tree/master/transaction-kafka-samples:
@Transactional
@StreamListener(Processor.INPUT)
@SendTo(Processor.OUTPUT)
public PersonEvent process(PersonEvent data) {
    logger.info("Received event={}", data);
    Person person = new Person();
    person.setName(data.getName());
    if (shouldFail.get()) {
        shouldFail.set(false);
        throw new RuntimeException("Simulated network error");
    } else {
        // We fail every other request as a test
        shouldFail.set(true);
    }
    logger.info("Saving person={}", person);
    Person savedPerson = repository.save(person);
    PersonEvent event = new PersonEvent();
    event.setName(savedPerson.getName());
    event.setType("PersonSaved");
    logger.info("Sent event={}", event);
    return event;
}
In this excerpt, there is a read from a Kafka topic, a write to a database, and another write to a different Kafka topic, all of it done transactionally.
What I wonder, and would like to have answered, is how this is technically achieved and implemented.
Since the datasource and Kafka don't participate in an XA transaction (two-phase commit), how does the implementation guarantee that a local transaction can read from Kafka, commit to a database, and write to Kafka, all of this transactionally?
There is no such guarantee; atomicity is only guaranteed within Kafka itself.
Spring provides transaction synchronization, so the commits are close together, but it is still possible for the DB to commit while the Kafka transaction does not. So you have to deal with the possibility of duplicates.
The correct way to do this, when using spring-kafka directly, is NOT with @Transactional but to use a ChainedKafkaTransactionManager in the listener container.
See Transaction Synchronization.
Also see Distributed transactions in Spring, with and without XA and the "Best Efforts 1PC pattern" for background.
However, with Stream there is no support for the chained transaction manager, so @Transactional is required (with the DB transaction manager). This provides results similar to the chained tx manager, with the DB committing first, just before Kafka.