I have many transactional consumers with a ChainedKafkaTransactionManager based on a JpaTransactionManager and a KafkaTransactionManager (all #KafkaListener's).
The JPA one needs a ThreadLocal variable to be set, to be able to know to which DB to connect to (is the tenant id).
When starting the application, in the onPartitionsAssigned listener, spring-kafka is trying to create a chained txn, hence trying to create a JPA txn, but there's no tenant set, then it fails.
That tenant is set through a http filter and/or kafka interceptors (through event headers).
I tried using the auto-wired KafkaListenerEndpointRegistry with setAutoStartup(false), but I see that the consumers don't receive any events, probably because they aren't initialized yet (I thought they were initialized on-demand).
If I set a mock tenant id and call registry.start() when the application is ready, the initializations seem to be done in other threads (probably because I'm using a ConcurrentKafkaListenerContainerFactory), so it doesn't work.
Is there a way to avoid the JPA transaction on that initial onPartitionsAssigned listener, that is part of the consumer initialization?
If your chained TM has the KafkaTM first, followed by JPA TM (which would be the normal case), you can achieve similar functionality by just injecting the Kafka TM into the container and using #Transactional (with just the JPA TM on the listener) to start the JPA transaction when the listener is called.
The time between the transaction commits will be marginally increased but it would provide similar functionality.
If that won't work for you, open a GitHub issue; we can either disable the initial commit on assignment, or do it without a transaction at all (optionally).
Related
For migration purpose, we need to produce records using two different serialisers, and thus two different KafkaProducers ( one String and one Avro) in the same transaction.
But all the transaction stuff is done through one KafkaProducer instance as follows :
kafkaProducer.beginTransaction();
...
kafkaProducer.send(record);
...
kafkaProducer.commitTransaction();
Can I use a second kafkaProducer (with the second serializer) and use the same transactionnal.id and do like this :
kafkaProducer.beginTransaction();
...
kafkaProducer.send(record);
kafkaProducer2.send(record);
...
kafkaProducer.commitTransaction();
All will be part of the same transaction , all consistent ?
EDIT 1 :
According to what I saw in the java implementation, there is some mechanism when calling commitTransaction() like calling flush() on the producer itself.. so I think the model above won't work..
Any chance of achieving this without instantiating a full new instance of everything in parallel ?
You can only have a single producer active in a transaction at a time.
If you start 2 producers with the same transactional.id, one of them will be fenced and won't be able to commit its records and all ecords won't be part of the same transaction.
You need to use a single producer and one possible workaround is to configure it to use the BytesSerializer and handle the convertion of your Objects to bytes in your logic explicitely.
I have a stateless EJB which inserts data into database, sends a response immediately and in the last step calls an asynchronous EJB. Asynchronous EJB can run for long (I mean 5-10 mins which is longer then JPA transaction timeout). The asynchronous ejb needs to read (and work on it) the same record tree (only read) as the one persisted by stateless EJB.
Is seems that the asynchronous bean tries to read the record tree before it was commited or inserted (JPA) by the statelsss EJB so record tree is not visible by async bean.
Stateless EJB:
#Stateless
public class ReceiverBean {
public void receiverOfIncomingRequest(data) {
long id = persistRequest(data);
sendResponseToJmsBasedOnIncomingData(data);
processorAsyncBean.calculate(id);
}
}
}
Asynchronous EJB:
#Stateless
public class ProcessorAsyncBean {
#Asynchronous
public void calculate(id) {
Data data = dao.getById(id); <- DATA IS ALLWAYS NULL HERE!
// the following method going to send
// data to external system via internet (TCP/IP)
Result result = doSomethingForLongWithData(data);
updateData(id, result);
}
#TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public void updateData(id, result) {
dao.update(id, result);
}
Maybe I can use a JMS queue to send a signal with ID to the processor bean instead of calling asyc ejb (and message driven bean read data from database) but I want to avoid that if possible.
Another solution can be to pass the whole record tree as a detached JPA object to the processor async EJB instead of reading data back from database.
Can I make async EJB work well in this structure somehow?
-- UPDATE --
I was thinking about using Weblogic JMS. There is another issue here. In case of big load, when there are 100 000 or more data in queue (that will be normal) and there is no internet connection then all of my data in the queue will fail. In case of that exception (or any) appears during sending data via internet (by doSomethingForLongWithData method) the data will be rollbacked to the original queue based on the redelivery-limit and repetitaion settings of Weblogic. This rollback event will generate 100 000 or more threads on Weblogic in the managed server to manage redelivery. That new tons of background processes can kill or at least slow down the server.
I can use IBM MQ as well because we have MQ infrastructure. MQ does not have this kind of affect on Weblogic server but MQ does not have redelivery-limit and delay function. So in case of error (rollback) the message will appear immediately on the MQ again, without delay and I built a hand mill. Thread.sleep() in the catch condition is not a solution in EE application I guess...
Is seems that the asynchronous bean tries to read the record tree before it was commited or inserted (JPA) by the statelsss EJB so record tree is not visible by async bean.
This is expected behavior with bean managed transactions. Your are starting the asynchronous EJB from the EJB with its own transaction context. The asynchronous EJB never uses the callers transaction context (see EJB spec 4.5.3).
As long as you are not using transaction isolation level "read uncommited" with your persistence, you won't see the still not commited data from the caller.
You must think about the case, when the asynch job won't commit (e.g. applicationserver shutdown or abnormal abortion). Is the following calculation and update critical? Is the asynchronous process recoverable if not executed successfully or not even called?
You can think about using bean managed transactions, commiting before calling the asynchronous EJB. Or you can delegate the data update to another EJB with a new transactin context. This will be commited before the call of the asynchronous EJB. This is usally ok for uncritical stuff, missing or failing.
Using persistent and transactional JMS messages along with a dead letter queue has the advantage of a reliable processing of your caclulation and update, even with stopping / starting application server in between or with temporal errors during processing.
You just need to call async method next to the one with transaction markup, so when transaction is committed.
For example, caller of receiverOfIncomingRequest() method, could add
processorAsyncBean.calculate(id);
call next to it.
UPDATE : extended example
CallerMDB
#TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED)
public void onMessage(Message message) {
long id = receiverBean.receiverOfIncomingRequest(data);
processorAsyncBean.calculate(id);
}
ReceiverBean
#TransactionAttribute(TransactionAttributeType.REQUIRED)
public long receiverOfIncomingRequest(data) {
long id = persistRequest(data);
sendResponseToJmsBasedOnIncomingData(data);
return id;
}
We have a micro-services architecture, with Kafka used as the communication mechanism between the services. Some of the services have their own databases. Say the user makes a call to Service A, which should result in a record (or set of records) being created in that service’s database. Additionally, this event should be reported to other services, as an item on a Kafka topic. What is the best way of ensuring that the database record(s) are only written if the Kafka topic is successfully updated (essentially creating a distributed transaction around the database update and the Kafka update)?
We are thinking of using spring-kafka (in a Spring Boot WebFlux service), and I can see that it has a KafkaTransactionManager, but from what I understand this is more about Kafka transactions themselves (ensuring consistency across the Kafka producers and consumers), rather than synchronising transactions across two systems (see here: “Kafka doesn't support XA and you have to deal with the possibility that the DB tx might commit while the Kafka tx rolls back.”). Additionally, I think this class relies on Spring’s transaction framework which, at least as far as I currently understand, is thread-bound, and won’t work if using a reactive approach (e.g. WebFlux) where different parts of an operation may execute on different threads. (We are using reactive-pg-client, so are manually handling transactions, rather than using Spring’s framework.)
Some options I can think of:
Don’t write the data to the database: only write it to Kafka. Then use a consumer (in Service A) to update the database. This seems like it might not be the most efficient, and will have problems in that the service which the user called cannot immediately see the database changes it should have just created.
Don’t write directly to Kafka: write to the database only, and use something like Debezium to report the change to Kafka. The problem here is that the changes are based on individual database records, whereas the business significant event to store in Kafka might involve a combination of data from multiple tables.
Write to the database first (if that fails, do nothing and just throw the exception). Then, when writing to Kafka, assume that the write might fail. Use the built-in auto-retry functionality to get it to keep trying for a while. If that eventually completely fails, try to write to a dead letter queue and create some sort of manual mechanism for admins to sort it out. And if writing to the DLQ fails (i.e. Kafka is completely down), just log it some other way (e.g. to the database), and again create some sort of manual mechanism for admins to sort it out.
Anyone got any thoughts or advice on the above, or able to correct any mistakes in my assumptions above?
Thanks in advance!
I'd suggest to use a slightly altered variant of approach 2.
Write into your database only, but in addition to the actual table writes, also write "events" into a special table within that same database; these event records would contain the aggregations you need. In the easiest way, you'd simply insert another entity e.g. mapped by JPA, which contains a JSON property with the aggregate payload. Of course this could be automated by some means of transaction listener / framework component.
Then use Debezium to capture the changes just from that table and stream them into Kafka. That way you have both: eventually consistent state in Kafka (the events in Kafka may trail behind or you might see a few events a second time after a restart, but eventually they'll reflect the database state) without the need for distributed transactions, and the business level event semantics you're after.
(Disclaimer: I'm the lead of Debezium; funnily enough I'm just in the process of writing a blog post discussing this approach in more detail)
Here are the posts
https://debezium.io/blog/2018/09/20/materializing-aggregate-views-with-hibernate-and-debezium/
https://debezium.io/blog/2019/02/19/reliable-microservices-data-exchange-with-the-outbox-pattern/
first of all, I have to say that I’m no Kafka, nor a Spring expert but I think that it’s more a conceptual challenge when writing to independent resources and the solution should be adaptable to your technology stack. Furthermore, I should say that this solution tries to solve the problem without an external component like Debezium, because in my opinion each additional component brings challenges in testing, maintaining and running an application which is often underestimated when choosing such an option. Also not every database can be used as a Debezium-source.
To make sure that we are talking about the same goals, let’s clarify the situation in an simplified airline example, where customers can buy tickets. After a successful order the customer will receive a message (mail, push-notification, …) that is sent by an external messaging system (the system we have to talk with).
In a traditional JMS world with an XA transaction between our database (where we store orders) and the JMS provider it would look like the following: The client sets the order to our app where we start a transaction. The app stores the order in its database. Then the message is sent to JMS and you can commit the transaction. Both operations participate at the transaction even when they’re talking to their own resources. As the XA transaction guarantees ACID we’re fine.
Let’s bring Kafka (or any other resource that is not able to participate at the XA transaction) in the game. As there is no coordinator that syncs both transactions anymore the main idea of the following is to split processing in two parts with a persistent state.
When you store the order in your database you can also store the message (with aggregated data) in the same database (e.g. as JSON in a CLOB-column) that you want to send to Kafka afterwards. Same resource – ACID guaranteed, everything fine so far. Now you need a mechanism that polls your “KafkaTasks”-Table for new tasks that should be send to a Kafka-Topic (e.g. with a timer service, maybe #Scheduled annotation can be used in Spring). After the message has been successfully sent to Kafka you can delete the task entry. This ensures that the message to Kafka is only sent when the order is also successfully stored in application database. Did we achieve the same guarantees as we have when using a XA transaction? Unfortunately, no, as there is still the chance that writing to Kafka works but the deletion of the task fails. In this case the retry-mechanism (you would need one as mentioned in your question) would reprocess the task an sends the message twice. If your business case is happy with this “at-least-once”-guarantee you’re done here with a imho semi-complex solution that could be easily implemented as framework functionality so not everyone has to bother with the details.
If you need “exactly-once” then you cannot store your state in the application database (in this case “deletion of a task” is the “state”) but instead you must store it in Kafka (assuming that you have ACID guarantees between two Kafka topics). An example: Let’s say you have 100 tasks in the table (IDs 1 to 100) and the task job processes the first 10. You write your Kafka messages to their topic and another message with the ID 10 to “your topic”. All in the same Kafka-transaction. In the next cycle you consume your topic (value is 10) and take this value to get the next 10 tasks (and delete the already processed tasks).
If there are easier (in-application) solutions with the same guarantees I’m looking forward to hear from you!
Sorry for the long answer but I hope it helps.
All the approach described above are the best way to approach the problem and are well defined pattern. You can explore these in the links provided below.
Pattern: Transactional outbox
Publish an event or message as part of a database transaction by saving it in an OUTBOX in the database.
http://microservices.io/patterns/data/transactional-outbox.html
Pattern: Polling publisher
Publish messages by polling the outbox in the database.
http://microservices.io/patterns/data/polling-publisher.html
Pattern: Transaction log tailing
Publish changes made to the database by tailing the transaction log.
http://microservices.io/patterns/data/transaction-log-tailing.html
Debezium is a valid answer but (as I've experienced) it can require some extra overhead of running an extra pod and making sure that pod doesn't fall over. This could just be me griping about a few back to back instances where pods OOM errored and didn't come back up, networking rule rollouts dropped some messages, WAL access to an aws aurora db started behaving oddly... It seems that everything that could have gone wrong, did. Not saying Debezium is bad, it's fantastically stable, but often for devs running it becomes a networking skill rather than a coding skill.
As a KISS solution using normal coding solutions that will work 99.99% of the time (and inform you of the .01%) would be:
Start Transaction
Sync save to DB
-> If fail, then bail out.
Async send message to kafka.
Block until the topic reports that it has received the
message.
-> if it times out or fails Abort Transaction.
-> if it succeeds Commit Transaction.
I'd suggest to use a new approach 2-phase message. In this new approach, much less codes are needed, and you don't need Debeziums any more.
https://betterprogramming.pub/an-alternative-to-outbox-pattern-7564562843ae
For this new approach, what you need to do is:
When writing your database, write an event record to an auxiliary table.
Submit a 2-phase message to DTM
Write a service to query whether an event is saved in the auxiliary table.
With the help of DTM SDK, you can accomplish the above 3 steps with 8 lines in Go, much less codes than other solutions.
msg := dtmcli.NewMsg(DtmServer, gid).
Add(busi.Busi+"/TransIn", &TransReq{Amount: 30})
err := msg.DoAndSubmitDB(busi.Busi+"/QueryPrepared", db, func(tx *sql.Tx) error {
return AdjustBalance(tx, busi.TransOutUID, -req.Amount)
})
app.GET(BusiAPI+"/QueryPrepared", dtmutil.WrapHandler2(func(c *gin.Context) interface{} {
return MustBarrierFromGin(c).QueryPrepared(db)
}))
Each of your origin options has its disadvantage:
The user cannot immediately see the database changes it have just created.
Debezium will capture the log of the database, which may be much larger than the events you wanted. Also deployment and maintenance of Debezium is not an easy job.
"built-in auto-retry functionality" is not cheap, it may require much codes or maintenance efforts.
I am trying to implement an event driven architecture to handle distributed transactions. Each service has its own database and uses Kafka to send messages to inform other microservices about the operations.
An example:
Order service -------> | Kafka |------->Payment Service
| |
Orders MariaDB DB Payment MariaDB Database
Order receives an order request. It has to store the new Order in its DB and publish a message so that Payment Service realizes it has to charge for the item:
private OrderBusiness orderBusiness;
#PostMapping
public Order createOrder(#RequestBody Order order){
logger.debug("createOrder()");
//a.- Save the order in the DB
orderBusiness.createOrder(order);
//b. Publish in the topic so that Payment Service charges for the item.
try{
orderSource.output().send(MessageBuilder.withPayload(order).build());
}catch(Exception e){
logger.error("{}", e);
}
return order;
}
These are my doubts:
Steps a.- (save in Order DB) and b.- (publish the message) should be performed in a transaction, atomically. How can I achieve that?
This is related to the previous one: I send the message with: orderSource.output().send(MessageBuilder.withPayload(order).build()); This operations is asynchronous and ALWAYS returns true, no matter if the Kafka broker is down. How can I know that the message has reached the Kafka broker?
Steps a.- (save in Order DB) and b.- (publish the message) should be
performed in a transaction, atomically. How can I achieve that?
Kafka currently does not support transactions (and thus also no rollback or commit), which you'd need to synchronize something like this. So in short: you can't do what you want to do. This will change in the near-ish future, when KIP-98 is merged, but that might take some time yet. Also, even with transactions in Kafka, an atomic transaction across two systems is a very hard thing to do, everything that follows will only be improved upon by transactional support in Kafka, it will still not entirely solve your issue. For that you would need to look into implementing some form of two phase commit across your systems.
You can get somewhat close by configuring producer properties, but in the end you will have to chose between at least once or at most once for one of your systems (MariaDB or Kafka).
Let's start with what you can do in Kafka do ensure delivery of a message and further down we'll dive into your options for the overall process flow and what the consequences are.
Guaranteed delivery
You can configure how many brokers have to confirm receipt of your messages, before the request is returned to you with the parameter acks: by setting this to all you tell the broker to wait until all replicas have acknowledged your message before returning an answer to you. This is still no 100% guarantee that your message will not be lost, since it has only been written to the page cache yet and there are theoretical scenarios with a broker failing before it is persisted to disc, where the message might still be lost. But this is as good a guarantee as you are going to get.
You can further reduce the risk of data loss by lowering the intervall at which brokers force an fsync to disc (emphasized text and/or flush.ms) but please be aware, that these values can bring with them heavy performance penalties.
In addition to these settings you will need to wait for your Kafka producer to return the response for your request to you and check whether an exception occurred. This sort of ties into the second part of your question, so I will go into that further down.
If the response is clean, you can be as sure as possible that your data got to Kafka and start worrying about MariaDB.
Everything we have covered so far only addresses how to ensure that Kafka got your messages, but you also need to write data into MariaDB, and this can fail as well, which would make it necessary to recall a message you potentially already sent to Kafka - and this you can't do.
So basically you need to choose one system in which you are better able to deal with duplicates/missing values (depending on whether or not you resend partial failures) and that will influence the order you do things in.
Option 1
In this option you initialize a transaction in MariaDB, then send the message to Kafka, wait for a response and if the send was successful you commit the transaction in MariaDB. Should sending to Kafka fail, you can rollback your transaction in MariaDB and everything is dandy.
If however, sending to Kafka is successful and your commit to MariaDB fails for some reason, then there is no way of getting back the message from Kafka. So you will either be missing a message in MariaDB or have a duplicate message in Kafka, if you resend everything later on.
Option 2
This is pretty much just the other way around, but you are probably better able to delete a message that was written in MariaDB, depending on your data model.
Of course you can mitigate both approaches by keeping track of failed sends and retrying just these later on, but all of that is more of a bandaid on the bigger issue.
Personally I'd go with approach 1, since the chance of a commit failing should be somewhat smaller than the send itself and implement some sort of dupe check on the other side of Kafka.
This is related to the previous one: I send the message with:
orderSource.output().send(MessageBuilder.withPayload(order).build());
This operations is asynchronous and ALWAYS returns true, no matter if
the Kafka broker is down. How can I know that the message has reached
the Kafka broker?
Now first of, I'll admit I am unfamiliar with Spring, so this may not be of use to you, but the following code snippet illustrates one way of checking produce responses for exceptions.
By calling flush you block until all sends have finished (and either failed or succeeded) and then check the results.
Producer<String, String> producer = new KafkaProducer<>(myConfig);
final ArrayList<Exception> exceptionList = new ArrayList<>();
for(MessageType message : messages){
producer.send(new ProducerRecord<String, String>("myTopic", message.getKey(), message.getValue()), new Callback() {
#Override
public void onCompletion(RecordMetadata metadata, Exception exception) {
if (exception != null) {
exceptionList.add(exception);
}
}
});
}
producer.flush();
if (!exceptionList.isEmpty()) {
// do stuff
}
I think the proper way for implementing Event Sourcing is by having Kafka be filled directly from events pushed by a plugin that reads from the RDBMS binlog e.g using Confluent BottledWater (https://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/) or more active Debezium (http://debezium.io/). Then consuming Microservices can listen to those events, consume them and act on their respective databases being eventually consistent with the RDBMS database.
Have a look here to my full answer for a guideline:
https://stackoverflow.com/a/43607887/986160
I would like to translate the concept of JMS topics using HornetQ core API.
The problem i see from my brief examination it would appear the main class JMSServerManagerImpl (from hornetq-jms.jar) uses jndi to coordinate the various collaborators it requires. I would like to avoid jndi as it is not self contained and is a globally shared object which is a problem especially in an osgi environment. One alternative is to copy starting at JMSServerManagerImpl but that seems like a lot of work.
I would rather have a confirmation that my approach to emulating how topics are supported in hornetq is the right way to solve this problem. If anyone has sufficient knowledge perhaps they can comment on what i think is the approach to writing my own emulation of topics using the core api.
ASSUMPTION
if a message consumer fails (via rollback) the container will try deliverying the message to another different consumer for the same topic.
EMULATION
wrap each message that is added for the topic.
sender sends msg w/ an acknowledgement handler set.
the wrapper for (1) would rollback after the real listener returns.
the sender then acknowledges delivery
I am assuming after 4 the msg is delivered after being given to all msg receivers. If i have made any mistakes or my assumptions are wrong please comment. Im not sure exactly if this assumption of how acknowledgements work is correct so any pointers would be nice.
If you are trying to figure out how to send a message to multiple consumers using the core API; here is what I recommend
Create queue 1 and bind to address1
Create queue 2 and bind to address1
Make queue N and bind to address 1
Send a message on address1
Start N consumers where each consumer listens on queue 1-N
This way it basically works like a topic.
http://hornetq.sourceforge.net/docs/hornetq-2.0.0.BETA5/user-manual/en/html/using-jms.html
7.5. Directly instantiating JMS Resources without using JNDI