JPA EntityManager - when does the transaction start?

I am confused about the life cycle of transactions, entity managers, and the persistence context in the EJB container.
I use the entity manager this way:
@PersistenceContext(unitName = "..")
private EntityManager em;
in every stateless EJB.
My questions are simply:
When does the transaction start?
How is the transaction propagated? That is, when stateless EJBs call each other, do they keep using the same transaction?
When is the transaction committed?

For container-managed transactions:
The transaction (TX) starts when the first transactional method is invoked. By default, all EJB methods are transactional (equivalent to TransactionAttributeType.REQUIRED, which is the default setting).
The default TX propagation keeps the same TX across all local EJB calls. This is equivalent to an explicit TransactionAttributeType.REQUIRED on all invoked methods.
The transaction is committed when the first method in the invocation chain (the one the TX was created for) returns.
You can have fine-grained control over TX propagation by annotating your EJB methods with different TransactionAttributeType values.
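To make the start, propagation, and commit points concrete, here is a minimal plain-Java sketch. It is a toy model, not container code: `PropagationSketch`, `Attr`, and `invoke` are hypothetical names standing in for the interceptor a container wraps around each business method, and only REQUIRED and REQUIRES_NEW are modeled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: this is NOT how an EJB container is implemented.
class PropagationSketch {
    enum Attr { REQUIRED, REQUIRES_NEW }

    // Ordered log of transaction events so the behaviour can be inspected.
    static final List<String> log = new ArrayList<>();
    // Transactions on the current call chain; the top one is the current TX.
    private static final Deque<Integer> active = new ArrayDeque<>();
    private static int nextId = 1;

    // Plays the role of the container interceptor around a business method.
    static void invoke(Attr attr, Runnable businessMethod) {
        // REQUIRED starts a TX only if none is active; REQUIRES_NEW always does.
        boolean startsTx = attr == Attr.REQUIRES_NEW || active.isEmpty();
        if (startsTx) {
            active.push(nextId++);
            log.add("begin tx" + active.peek());
        } else {
            log.add("join tx" + active.peek());
        }
        try {
            businessMethod.run();
            // Only the method that started the TX commits it, when it returns.
            if (startsTx) {
                log.add("commit tx" + active.pop());
            }
        } catch (RuntimeException e) {
            if (startsTx) {
                log.add("rollback tx" + active.pop());
            }
            throw e;
        }
    }

    static List<String> demo() {
        log.clear();
        active.clear();
        nextId = 1;
        invoke(Attr.REQUIRED, () -> {              // first method: starts tx1
            invoke(Attr.REQUIRED, () -> { });      // nested call: joins tx1
            invoke(Attr.REQUIRES_NEW, () -> { });  // nested call: own tx2
        });
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this model the outer transaction commits only after the first method returns, while the REQUIRES_NEW call begins and commits its own transaction in between.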

Related

XA or non XA in JEE

I have a question about this paragraph:
"Initially, all transactions are local. If a non-XA data source connection is the first resource connection enlisted in a transaction scope, it will become a global transaction when a (second) XA data source connection joins it. If a second non-XA data source connection attempts to join, an exception is thrown." -> link https://docs.oracle.com/cd/E19229-01/819-1644/detrans.html (Global and Local Transactions).
Can I have the first connection non XA and the second XA? So the first becomes XA without any exception being thrown? (I'm in doubt)
Can I have the first transaction marked xa, the second marked xa, and the third non xa? (I suppose not)
What happens if the first EJB (trans-type=required) uses XA on its DB and calls a remote EJB (trans-type=required, deployed in another app server) with a non-XA DB? Would I have two distinct transactions at that moment, so that XA is not the right choice? What happens if the two EJBs are on the same server but in two distinct EARs?
"In scenarios where there is only a single one-phase commit resource provider that participates in the transaction and where all the two-phase commit resource-providers that participate in the transaction are used in a read-only fashion. In this case, the two-phase commit resources all vote read-only during the prepare phase of two-phase commit. Because the one-phase commit resource provider is the only provider to complete any updates, the one-phase commit resource does not have to be prepared."
https://www.ibm.com/support/knowledgecenter/SSEQTP_8.5.5/com.ibm.websphere.base.doc/ae/cjta_trans.html
What does read-only mean? So we can mix XA updates with read-only non-XA?
Some of these should really be split out into separate questions. I can answer the first couple of questions.
Can I have the first connection non XA and the second XA?
Yes, if you are willing to use Last Participant Support
So the first become xa without any Exception thrown?
No, the transaction manager cannot convert a non-xa capable connection into one that is xa capable. A normal non-xa commit or rollback will be performed on the connection, but it still participates in the transaction alongside the XA resources. I'll discuss how this is done further down in summarizing the Last Participant Support optimization.
Can I have the first transaction marked xa, second marked xa and third non xa?
I assume you meant to say first connection marked xa, and so forth. Yes, you can do this by relying on Last Participant Support.
What does read-only mean?
read-only refers to usage of the transactional resource in a way that does not modify any data. For example, you might run a query that locks a row in a database and reads data from it, but does not perform any updates on it.
So we can mix xa updates with readonly non xa?
You have this in reverse. The document that you cited indicates that the XA resources can be read only and the non-xa resource can make updates. This works because the XA resources have a spec-defined way of indicating to the transaction manager that they did not modify any data (by voting XA_RDONLY in their response to the xa.prepare request). Because they haven't written any data, they only need to release their locks, so the commit of the overall transaction reduces to a non-xa commit/rollback of the one-phase resource, after which either resolution of the xa-capable resources (commit or rollback) has the same effect.
Last Participant Support
Last Participant Support, mentioned earlier, is a feature of the application server that simulates the participation of a non-xa resource as part of a transaction alongside one or more xa-capable resources. There are some risks involved in relying on this optimization, namely a timing window where the transaction can be left in-doubt, requiring manual intervention to resolve it.
Here is how it works:
You operate on all of the enlisted resources (xa and non-xa) as you normally would, and when you are ready, you invoke the userTransaction.commit operation (or rely on container managed transactions to issue the commit for you). When the transaction manager receives the request to commit, it sees that there is a non-xa resource involved and orders the prepare/commit operations to the backend in a special way.
First, it tells all of the xa-capable resources to do xa.prepare, and receives the vote from each of them. If all indicate that they have successfully prepared and would be able to commit, then the transaction manager proceeds to issue a commit to the non-xa resource. If the commit of the non-xa resource succeeds, then the transaction manager commits all of the xa-capable resources. Even if the system goes down at this point, it is written in the recovery log that these resources must commit, and the transaction manager will later find them during a recovery attempt and commit them, with their corresponding records in the back end being locked until that happens. If the commit of the non-xa resource fails, then the transaction manager would instead proceed to roll back all of the xa-capable resources.
The risk here comes from the possibility that the request to commit the non-xa capable resource might not return at all, leaving the transaction manager no way of knowing whether that resource has committed or rolled back, and thus no way of knowing whether to commit or roll back the xa-capable resources. That leaves the transaction in-doubt and in need of manual intervention to properly recover. Only enable/rely upon Last Participant Support if you are okay with accepting this risk.
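The ordering described above can be sketched in plain Java. This is a toy model under stated assumptions: `XaParticipant`, `OnePhaseParticipant`, and the enums are hypothetical types invented for illustration, not the `javax.transaction.xa` API, and recovery logging is reduced to a comment.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the commit ordering; all types here are hypothetical.
class LastParticipantSketch {
    enum Vote { PREPARED, READ_ONLY, ABORT }
    enum OnePhaseResult { COMMITTED, FAILED, UNKNOWN }
    enum Outcome { COMMITTED, ROLLED_BACK, IN_DOUBT }

    interface XaParticipant {
        Vote prepare();
        void commit();
        void rollback();
    }

    interface OnePhaseParticipant {
        OnePhaseResult commit();
        void rollback();
    }

    static Outcome commit(List<? extends XaParticipant> xaResources, OnePhaseParticipant last) {
        // XA resources that still need a final commit or rollback.
        List<XaParticipant> pending = new ArrayList<>(xaResources);
        for (XaParticipant xa : xaResources) {
            Vote vote = xa.prepare();
            if (vote != Vote.PREPARED) {
                pending.remove(xa); // read-only or aborted: nothing left to resolve
            }
            if (vote == Vote.ABORT) {
                pending.forEach(XaParticipant::rollback);
                last.rollback();
                return Outcome.ROLLED_BACK;
            }
        }
        // All XA resources are prepared; the non-xa commit decides the outcome.
        switch (last.commit()) {
            case COMMITTED:
                // A real transaction manager records this decision in its recovery log.
                pending.forEach(XaParticipant::commit);
                return Outcome.COMMITTED;
            case FAILED:
                pending.forEach(XaParticipant::rollback);
                return Outcome.ROLLED_BACK;
            default:
                // The commit never returned: prepared XA resources stay locked.
                return Outcome.IN_DOUBT;
        }
    }

    // Test doubles that record the calls they receive.
    static final class RecordingXa implements XaParticipant {
        private final String name;
        private final Vote vote;
        private final List<String> log;
        RecordingXa(String name, Vote vote, List<String> log) {
            this.name = name;
            this.vote = vote;
            this.log = log;
        }
        public Vote prepare() { log.add(name + ".prepare->" + vote); return vote; }
        public void commit() { log.add(name + ".commit"); }
        public void rollback() { log.add(name + ".rollback"); }
    }

    static final class RecordingOnePhase implements OnePhaseParticipant {
        private final OnePhaseResult result;
        private final List<String> log;
        RecordingOnePhase(OnePhaseResult result, List<String> log) {
            this.result = result;
            this.log = log;
        }
        public OnePhaseResult commit() { log.add("nonXa.commit->" + result); return result; }
        public void rollback() { log.add("nonXa.rollback"); }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Outcome outcome = commit(
                List.of(new RecordingXa("xa1", Vote.PREPARED, log),
                        new RecordingXa("xa2", Vote.READ_ONLY, log)),
                new RecordingOnePhase(OnePhaseResult.COMMITTED, log));
        System.out.println(outcome + " " + log);
    }
}
```

The `UNKNOWN` branch is the timing window mentioned above: the sketch can only report `IN_DOUBT`, because nothing in the protocol tells it which way the non-xa resource went.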

JPA PESSIMISTIC_WRITE lock and multiple transactions

I'm using a PESSIMISTIC_WRITE lock on my repository method, so it locks my object until the end of the transaction. However, I've got a problem. Within one endpoint (controller -> service) I start a transaction, then I need to update my object and send a message to Kafka; after that, within the same method, I need to update my object again and send to Kafka again. Because it's one transaction, the changes only take effect locally in the cache. But I need to save to the database and then send to Kafka, then change my object again, save it to the database, and send another Kafka message. I can't use REQUIRES_NEW to create a new transaction, because my object is locked. So how can I deal with this?
This lock is used in many parts of my project to deal with parallel transactions.
You should create a new service which will orchestrate the flow. That way you will be able to obtain the same pessimistic lock again in the second operation.
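import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;
@Service
class OrchestratorService {
    ...
    // Not transactional itself: each call below runs in its own transaction.
    void executeFlow() {
        someService.executeFirstOperationAndSendKafkaEvent();
        someService.executeSecondOperationAndSendKafkaEvent();
    }
}
@Service
class SomeService {
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void executeFirstOperationAndSendKafkaEvent() {
        // any lock obtained inside this method is released once this method finishes
        ...
    }
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void executeSecondOperationAndSendKafkaEvent() {
        // any lock obtained inside this method is released once this method finishes
        ...
    }
}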
There is one more important aspect worth mentioning: sending a Kafka event is not transactional. @Transactional only guarantees that changes made to the datasource (in this case the DB) will be transactional. Hence the following scenarios are possible:
if the event is sent inside the transaction scope, the transaction can be rolled back after the Kafka event was sent successfully
if the event is sent after the transaction commits, sending the event may fail even though the transaction was committed successfully
For this reason it's good to split the process into a few phases:
apply the business changes in the DB and store a flag in the DB saying that a Kafka event should be sent but hasn't been sent yet,
outside the TX scope, send the event to Kafka,
in a new TX, change the flag to say that the event has been sent, or schedule a retry if there was an error while sending the event.
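The three phases can be sketched in plain Java as follows. This is an in-memory illustration only: the map stands in for a DB table, the `Predicate` stands in for a Kafka producer call that reports success or failure, and `OutboxSketch` and its method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// In-memory sketch of the three phases; all names are hypothetical.
class OutboxSketch {
    enum Status { PENDING, SENT }

    static final class Row {
        final String payload;
        Status status = Status.PENDING;
        Row(String payload) { this.payload = payload; }
    }

    // Stand-in for a DB table of events that still have to be published.
    private final Map<Long, Row> outbox = new LinkedHashMap<>();
    private long nextId = 1;

    // Phase 1 (one DB transaction in a real system): apply the business
    // change and record that an event still has to be sent.
    long applyChangeAndEnqueue(String payload) {
        long id = nextId++;
        outbox.put(id, new Row(payload));
        return id;
    }

    // Phases 2 and 3: send each pending event outside any transaction, then
    // flip its flag (a new short transaction in a real system). Failed sends
    // stay PENDING, so the next run retries them.
    int publishPending(Predicate<String> sender) {
        int sent = 0;
        for (Row row : outbox.values()) {
            if (row.status == Status.PENDING && sender.test(row.payload)) {
                row.status = Status.SENT;
                sent++;
            }
        }
        return sent;
    }

    Status statusOf(long id) {
        return outbox.get(id).status;
    }

    public static void main(String[] args) {
        OutboxSketch service = new OutboxSketch();
        long id = service.applyChangeAndEnqueue("order-updated");
        service.publishPending(payload -> false);  // broker unavailable
        System.out.println(service.statusOf(id));  // PENDING
        service.publishPending(payload -> true);   // retry succeeds
        System.out.println(service.statusOf(id));  // SENT
    }
}
```

Note that if the flag update in the last phase fails after a successful send, the event is sent again on the next run, so consumers should tolerate duplicates.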

On Partitions Assignment and ChainedKafkaTransactionManager at startup with JPA

I have many transactional consumers with a ChainedKafkaTransactionManager based on a JpaTransactionManager and a KafkaTransactionManager (all @KafkaListeners).
The JPA one needs a ThreadLocal variable to be set so it knows which DB to connect to (it is the tenant id).
When starting the application, in the onPartitionsAssigned listener, spring-kafka tries to create a chained txn, hence tries to create a JPA txn, but there's no tenant set, so it fails.
That tenant is set through an HTTP filter and/or Kafka interceptors (through event headers).
I tried using the auto-wired KafkaListenerEndpointRegistry with setAutoStartup(false), but I see that the consumers don't receive any events, probably because they aren't initialized yet (I thought they were initialized on-demand).
If I set a mock tenant id and call registry.start() when the application is ready, the initializations seem to be done in other threads (probably because I'm using a ConcurrentKafkaListenerContainerFactory), so it doesn't work.
Is there a way to avoid the JPA transaction on that initial onPartitionsAssigned listener, that is part of the consumer initialization?
If your chained TM has the KafkaTM first, followed by JPA TM (which would be the normal case), you can achieve similar functionality by just injecting the Kafka TM into the container and using @Transactional (with just the JPA TM on the listener) to start the JPA transaction when the listener is called.
The time between the transaction commits will be marginally increased but it would provide similar functionality.
If that won't work for you, open a GitHub issue; we can either disable the initial commit on assignment, or do it without a transaction at all (optionally).

Is the JPA EntityManager thread local?

I read somewhere that the entity manager is thread local. Does that mean that when a thread finishes its execution, all entities that are in the persistent state for this entity manager become detached again and available for modification?
I have a microservice architecture in which each microservice has 30 consumer threads, and I am doing read, write, and delete operations. Is it necessary to clear the entity manager?

JPA transaction handling between @Stateless and @Asynchronous EJBs

I have a stateless EJB which inserts data into the database, sends a response immediately, and in the last step calls an asynchronous EJB. The asynchronous EJB can run for a long time (5-10 minutes, which is longer than the JPA transaction timeout). The asynchronous EJB needs to read (read only) and work on the same record tree as the one persisted by the stateless EJB.
It seems that the asynchronous bean tries to read the record tree before it was committed or inserted (JPA) by the stateless EJB, so the record tree is not visible to the async bean.
Stateless EJB:
@Stateless
public class ReceiverBean {
    public void receiverOfIncomingRequest(Data data) {
        long id = persistRequest(data);
        sendResponseToJmsBasedOnIncomingData(data);
        // invoked while this method's transaction is still open
        processorAsyncBean.calculate(id);
    }
}
Asynchronous EJB:
@Stateless
public class ProcessorAsyncBean {
    @Asynchronous
    public void calculate(long id) {
        Data data = dao.getById(id); // <- DATA IS ALWAYS NULL HERE!
        // the following method sends
        // data to an external system via the internet (TCP/IP)
        Result result = doSomethingForLongWithData(data);
        updateData(id, result);
    }
    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    public void updateData(long id, Result result) {
        dao.update(id, result);
    }
}
Maybe I can use a JMS queue to send a signal with the ID to the processor bean instead of calling the async EJB (and have a message-driven bean read the data from the database), but I want to avoid that if possible.
Another solution could be to pass the whole record tree as a detached JPA object to the processor async EJB instead of reading the data back from the database.
Can I make async EJB work well in this structure somehow?
-- UPDATE --
I was thinking about using Weblogic JMS, but there is another issue. Under heavy load, when there are 100 000 or more messages in the queue (which will be normal) and there is no internet connection, all of the messages in the queue will fail. If that exception (or any other) occurs while sending data via the internet (in the doSomethingForLongWithData method), the message will be rolled back to the original queue according to the redelivery-limit and repetition settings of Weblogic. This rollback will generate 100 000 or more threads on the Weblogic managed server to handle redelivery. That many new background processes can kill or at least slow down the server.
I can use IBM MQ as well because we have MQ infrastructure. MQ does not have this kind of effect on the Weblogic server, but MQ does not have a redelivery-limit and delay function. So in case of an error (rollback) the message reappears on the MQ queue immediately, without delay, and I end up with a tight redelivery loop. Thread.sleep() in the catch block is not a solution in an EE application, I guess...
It seems that the asynchronous bean tries to read the record tree before it was committed or inserted (JPA) by the stateless EJB, so the record tree is not visible to the async bean.
This is expected behavior with container-managed transactions. You are starting the asynchronous EJB from an EJB that has its own transaction context. The asynchronous EJB never uses the caller's transaction context (see EJB spec 4.5.3).
As long as you are not using the transaction isolation level "read uncommitted" with your persistence, you won't see the caller's not yet committed data.
You must think about the case where the async job doesn't commit (e.g. application server shutdown or abnormal termination). Are the subsequent calculation and update critical? Is the asynchronous process recoverable if it is not executed successfully or not even called?
You can think about using bean-managed transactions, committing before calling the asynchronous EJB. Or you can delegate the data update to another EJB with a new transaction context. This will be committed before the call to the asynchronous EJB. This is usually OK for non-critical work that can tolerate being missed or failing.
Using persistent and transactional JMS messages along with a dead letter queue has the advantage of reliable processing of your calculation and update, even if the application server is stopped and started in between or temporary errors occur during processing.
You just need to call the async method after the transactional method has returned, that is, once its transaction has been committed.
For example, the caller of the receiverOfIncomingRequest() method could add the
processorAsyncBean.calculate(id);
call right after it.
UPDATE : extended example
CallerMDB
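@TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED)
public void onMessage(Message message) {
    // data: payload taken from the message (extraction not shown)
    // runs in its own transaction, which is committed when the call returns
    long id = receiverBean.receiverOfIncomingRequest(data);
    // the persisted record is now visible to the async bean
    processorAsyncBean.calculate(id);
}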
ReceiverBean
@TransactionAttribute(TransactionAttributeType.REQUIRED)
public long receiverOfIncomingRequest(Data data) {
    long id = persistRequest(data);
    sendResponseToJmsBasedOnIncomingData(data);
    return id;
}