Spring Data JPA - how to guarantee flush order?

I have a quite complex save process using Spring Data JPA repositories in one transaction:
mainRepo.save();
relatedRepo1.save();
relatedRepoOfRelatedRepo1.save();
...
And in the end I call (on mainRepo):
@Modifying
@Query("update mainEntity set finished = true where id = :id")
void setFinishedTrue(@Param("id") UUID id);
I want to guarantee that when setFinishedTrue(id) is called, all the related data is actually in the database, because it will start an integration process that requires all needed data to be available.

If you are using standard settings, JPA will flush pending changes before executing queries, so you are fine.
If you want to be really, really sure, you can add an explicit flush operation.
You can do this either by using the JpaRepository.flush() method or by injecting the EntityManager and calling flush() on it explicitly.
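For illustration, a minimal sketch of the explicit-flush variant (the service name and MainRepo are hypothetical; MainRepo is assumed to extend JpaRepository so it exposes flush() and the setFinishedTrue method from the question):

import java.util.UUID;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class FinishService {

    @PersistenceContext
    private EntityManager entityManager; // transaction-scoped proxy

    private final MainRepo mainRepo; // hypothetical repository extending JpaRepository

    public FinishService(MainRepo mainRepo) {
        this.mainRepo = mainRepo;
    }

    @Transactional
    public void saveAndFinish(MainEntity main, UUID id) {
        mainRepo.save(main);
        // ... save related entities via the other repositories ...

        // Either of these forces pending INSERTs/UPDATEs to the database now:
        mainRepo.flush();        // JpaRepository.flush()
        entityManager.flush();   // or flush the persistence context directly

        mainRepo.setFinishedTrue(id); // the @Modifying update now runs against flushed data
    }
}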

Related

How to use document locks to prevent external modification of records during Spring Data Mongodb transactions

I have a question regarding Spring Data Mongo and Mongo Transactions.
I have successfully implemented transactions, and have verified that commit and rollback work as expected using the Spring @Transactional annotation.
However, I am having a hard time getting the transactions to work the way I would expect in the Spring Data environment.
Spring Data does the Mongo -> Java object mapping, so the typical pattern for updating something is to fetch it from the database, make modifications, and then save it back to the database. Prior to implementing transactions, we had been using Spring's optimistic locking to account for the possibility of updates happening to a record between the fetch and the update.
I was hoping that I would be able to drop the optimistic locking infrastructure for all of my updates once we were able to use transactions. That is, I was hoping that, in the context of a transaction, the fetch would create a lock so that I could then do my updates and save, isolated so that no one could get in and make changes as they could previously.
However, based on what I have seen, the fetch does not create any kind of lock, so nothing prevents any other connection from updating the record, which means it appears that I have to maintain all of my optimistic locking code despite having native MongoDB transaction support.
I know I could use MongoDB's findAndModify methods to do my updates, and that would not allow interim modifications to occur, but that is contrary to the standard pattern of Spring Data, which loads the data into a Java object. So, rather than just being able to manipulate Java objects, I would have to either sprinkle Mongo-specific code throughout the app or create repository methods for every particular type of update I want to make.
Does anyone have any suggestions on how to handle this situation cleanly while maintaining the Spring Data paradigm of just using Java Objects?
Thanks in advance!
I was unable to find any way to do a 'read' lock within a Spring/MongoDB transaction.
However, in order to be able to continue to use the following pattern:
fetch record
make changes
save record
I ended up creating a method which does a findAndModify in order to 'lock' a record during fetch, then I can make the changes and do the save, and it all happens in the same transaction. If another process/thread attempts to update a 'locked' record during the transaction, it is blocked until my transaction completes.
For the lockForUpdate method, I leveraged the version field that Spring already uses for optimistic locking, simply because it is convenient and can easily be repurposed for a simple lock operation.
I also added my implementation to a base repository implementation to enable lockForUpdate on all repositories.
This is the gist of my solution with a bit of domain specific complexity removed:
import java.io.Serializable;

import org.springframework.data.mongodb.core.FindAndModifyOptions;
import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.data.mongodb.repository.query.MongoEntityInformation;
import org.springframework.data.mongodb.repository.support.SimpleMongoRepository;

import static org.springframework.data.mongodb.core.query.Criteria.where;
import static org.springframework.data.mongodb.core.query.Query.query;

public class BaseRepositoryImpl<T, ID extends Serializable> extends SimpleMongoRepository<T, ID>
        implements BaseRepository<T, ID> {

    private final MongoEntityInformation<T, ID> entityInformation;
    private final MongoOperations mongoOperations;

    public BaseRepositoryImpl(MongoEntityInformation<T, ID> metadata, MongoOperations mongoOperations) {
        super(metadata, mongoOperations);
        this.entityInformation = metadata;
        this.mongoOperations = mongoOperations;
    }

    public T lockForUpdate(ID id) {
        // Verify the class has a version field before trying to increment it to lock a record.
        try {
            getEntityClass().getMethod("getVersion");
        } catch (NoSuchMethodException e) {
            throw new InvalidConfigurationException("Unable to lock record without a version field", e);
        }
        // Incrementing the version inside the transaction takes a write lock on the document.
        return mongoOperations.findAndModify(query(where("_id").is(id)),
                new Update().inc("version", 1L), new FindAndModifyOptions().returnNew(true), getEntityClass());
    }

    private Class<T> getEntityClass() {
        return entityInformation.getJavaType();
    }
}
Then you can make calls along these lines when in the context of a transaction:
Record record = recordRepository.lockForUpdate(recordId);
...make changes to record...
recordRepository.save(record);
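To make lockForUpdate available on every repository, Spring Data's repositoryBaseClass setting can point at the implementation above. A sketch, assuming a matching BaseRepository interface and a hypothetical base package:

import java.io.Serializable;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.repository.MongoRepository;
import org.springframework.data.mongodb.repository.config.EnableMongoRepositories;

// The interface that domain repositories extend:
public interface BaseRepository<T, ID extends Serializable> extends MongoRepository<T, ID> {
    T lockForUpdate(ID id);
}

// Registers BaseRepositoryImpl as the base class for all Mongo repositories.
@Configuration
@EnableMongoRepositories(basePackages = "com.example.repo", // hypothetical package
        repositoryBaseClass = BaseRepositoryImpl.class)
class MongoRepositoryConfig {
}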

Issue with timing of SQL query firing in JPA

I am using JPA for data persistence.
I am unable to explain a behaviour in my program.
I have an entity A which has another entity B as its member. In my code I create a new instance of A, set an instance of B (fetched from the database) on A, and then save A using the EntityManager. I am using container-managed transactions, hence all transactions are supposed to commit at the end of the method.
In the very same method, after persisting A, I try to fetch an entity of class C. C, like A, has B as its member. I use a JPQL query to fetch C by the id of the B instance I previously associated with A's instance.
The issue is that while fetching C, JPA also executes the SQL query to save A. I expected that to happen at the end of the transaction (i.e. when the method ends).
But it happens when I try to fetch C. If I don't fetch C, then the SQL query for saving A is issued when the method ends.
What can be the reason for this behaviour?
The JPA provider needs to flush the persistence context before query execution if there is a possibility that the query results would not be consistent with the current persistence context state.
You can set the flush mode to COMMIT for the desired (or all) sessions. Just keep in mind to manually flush the session if a query depends on dirty persistence context state. The default flush mode is AUTO, meaning that the persistence context may be flushed before query execution.
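A sketch of what that looks like with the plain JPA API (entity names A, B, and C are borrowed from the question; FlushModeType.COMMIT defers automatic flushing to commit time):

import javax.persistence.EntityManager;
import javax.persistence.FlushModeType;

// Defer automatic flushing for this persistence context to commit time.
entityManager.setFlushMode(FlushModeType.COMMIT);

entityManager.persist(a); // the INSERT for A is now buffered until commit

// If a query depends on the unflushed change, flush manually first:
entityManager.flush();
List<C> cs = entityManager
        .createQuery("select c from C c where c.b.id = :bId", C.class)
        .setParameter("bId", b.getId())
        .getResultList();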
The reason is the database isolation level. It's READ_COMMITTED by default.
Read more about isolation levels here:
https://en.wikipedia.org/wiki/Isolation_%28database_systems%29#Read_committed
So, in order not to break this isolation, JPA must execute all SQL statements in the buffer so that all data in the transaction has reached the database.

How to prevent non-repeatable query results using persistence API in Java SE?

I am using Java SE and learning about the use of a persistence API (toplink-essentials) to manage entities in a Derby DB. Note: this is (distance-learning) university work, but it is not 'homework'; this issue crops up in the course materials.
I have two threads operating on the same set of entities. My problem is that, every way I have tried, the entities within a query result set (the query performed within a transaction) in one thread can be modified so that the result set is no longer valid for the rest of the transaction.
e.g. from one thread this operation is performed:
static void updatePrices(EntityManager manager, double percentage) {
    EntityTransaction transaction = manager.getTransaction();
    transaction.begin();
    Query query = manager.createQuery("SELECT i FROM Instrument i where i.sold = 'no'");
    List<Instrument> results = (List<Instrument>) query.getResultList();
    // force thread interruption here (testing non-repeatable read)
    try { Thread.sleep(2000); } catch (Exception e) { }
    for (Instrument i : results) {
        i.updatePrice(percentage);
    }
    transaction.commit();
    System.out.println("Price update committed");
}
And if it is interrupted from another thread with this method:
private static void sellInstrument(EntityManager manager, int id) {
    EntityTransaction transaction = manager.getTransaction();
    transaction.begin();
    Instrument instrument = manager.find(Instrument.class, id);
    System.out.println("Selling: " + instrument.toFullString());
    instrument.setSold(true);
    transaction.commit();
    System.out.println("Instrument sale committed");
}
What can happen is that when the thread within updatePrices() resumes, its query result set is invalid, and the price of a sold item ends up being updated to a different price from the one at which it was sold. (The shop wishes to keep records of items that were sold in the DB.) Since there are concurrent transactions occurring, I am using a different EntityManager for each thread (from the same factory).
Is it possible (through locking or some kind of context propagation) to prevent the results of a query becoming 'invalid' during an (interrupted) transaction? I have an idea that this kind of scenario is what Java EE is for, but what I want to know is whether it's doable in Java SE.
Edit:
Taking Vineet and Pascal's advice: using the @Version annotation in the entity's class (with an additional DB column) causes the large transaction (updatePrices()) to fail with an OptimisticLockException. This is very expensive if it happens at the end of a large set of query results, though. Is there any way to cause my query (inside updatePrices()) to lock the relevant rows, causing the thread inside sellInstrument() to either block or throw an exception (and then abort)? That would be much cheaper. (From what I understand, I do not have pessimistic locking in TopLink Essentials.)
Thread safety
I have a doubt about the way you manage your EntityManager. While an EntityManagerFactory is thread-safe (and should be created once at application startup), an EntityManager is not, and you should typically use one EntityManager per thread (or synchronize access to it, but I would use one per thread).
Concurrency
JPA 1.0 supports (only) optimistic locking (if you use a Version attribute) and two lock modes that allow you to avoid dirty reads and non-repeatable reads through the EntityManager.lock() API. I recommend reading Read and Write Locking and/or the whole section 3.4 Optimistic Locking and Concurrency of the JPA 1.0 spec for full details.
PS: Note that pessimistic locking is not supported in JPA 1.0, or only through provider-specific extensions (it has been added to JPA 2.0, along with other locking options). Just in case, TopLink supports it through the eclipselink.pessimistic-lock query hint.
As written in the JPA wiki, TopLink Essentials is supposed to support pessimistic locking in JPA 1.0 via a query hint:
// eclipselink.pessimistic-lock
Query query = em.createQuery("select f from Foo f where f.bar = :bar");
query.setParameter("bar", "foobar");
query.setHint("eclipselink.pessimistic-lock", "Lock");
query.getResultList();
I don't use TopLink so I can't confirm this hint is supported in all versions. If it isn't, then you'll have to use a native SQL query if you want to generate a "FOR UPDATE".
You might want to take a look at the EntityManager.lock() method, which allows you to obtain an optimistic or a pessimistic lock on an entity once a transaction has been initialized.
Going by your description of the problem, you wish to lock the database record once it has been 'selected' from the database. This can be achieved via a pessimistic lock, which is more or less equivalent to a SELECT ... FROM tbl FOR UPDATE statement.
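A sketch of the lock() route, reusing the sellInstrument() example from the question; note that PESSIMISTIC_WRITE is a JPA 2.0 lock mode, so under TopLink Essentials (JPA 1.0) only the optimistic READ/WRITE modes or the query hint above are available:

import javax.persistence.EntityTransaction;
import javax.persistence.LockModeType;

EntityTransaction transaction = manager.getTransaction();
transaction.begin();
Instrument instrument = manager.find(Instrument.class, id);
// JPA 2.0: typically issues SELECT ... FOR UPDATE, blocking concurrent writers
// until this transaction completes.
manager.lock(instrument, LockModeType.PESSIMISTIC_WRITE);
instrument.setSold(true);
transaction.commit();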

Create new or update existing entity at one go with JPA

I have a JPA entity that has a timestamp field and is distinguished by a complex identifier field. What I need is to update the timestamp in an entity that has already been stored; otherwise, create and store a new entity with the current timestamp.
As it turns out, the task is not as simple as it seems at first sight. The problem is that in a concurrent environment I get a nasty "Unique index or primary key violation" exception. Here's my code:
// Load existing entity, if any.
Entity e = entityManager.find(Entity.class, id);
if (e == null) {
    // Could not find entity with the specified id in the database, so create a new one.
    e = entityManager.merge(new Entity(id));
}
// Set current time...
e.setTimestamp(new Date());
// ...and finally save entity.
entityManager.flush();
Please note that in this example the entity identifier is not generated on insert; it is known in advance.
When two or more threads run this block of code in parallel, they may simultaneously get null from the entityManager.find(Entity.class, id) call, so they will attempt to save two or more entities at the same time with the same identifier, resulting in the error.
I think that there are a few solutions to the problem:
Sure, I could synchronize this code block with a global lock to prevent concurrent access to the database, but would that be the most efficient way?
Some databases support a very handy MERGE statement that updates an existing row or creates a new one if none exists. But I doubt that OpenJPA (the JPA implementation of my choice) supports it.
Even if JPA does not support SQL MERGE, I can always fall back to plain old JDBC and do whatever I want with the database. But I don't want to leave a comfortable API and mess with a hairy JDBC+SQL combination.
There is a magic trick to fix it using the standard JPA API only, but I don't know it yet.
Please help.
You are referring to the transaction isolation of JPA transactions, i.e. the behaviour of transactions when they access other transactions' resources.
According to this article:
READ_COMMITTED is the expected default Transaction Isolation level for using [..] EJB3 JPA
This means that - yes, you will have problems with the above code.
But JPA doesn't support custom isolation levels.
This thread discusses the topic more extensively. Depending on whether you use Spring or EJB, I think you can make use of the proper transaction strategy.
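One common pattern under READ_COMMITTED is to attempt the insert and fall back to a re-read when the unique-key violation signals that another thread won the race. A sketch, not taken from the thread above; the exception type varies by provider, and a constraint violation usually marks the transaction rollback-only, so in practice the whole transaction is retried from the top:

import java.util.Date;
import javax.persistence.PersistenceException;

Entity e = entityManager.find(Entity.class, id);
if (e == null) {
    try {
        e = new Entity(id);
        entityManager.persist(e);
        entityManager.flush(); // force the INSERT so a duplicate-key race surfaces here
    } catch (PersistenceException racedInsert) {
        // Another transaction inserted the row first; re-read and update instead
        // (typically by retrying the whole transaction).
        e = entityManager.find(Entity.class, id);
    }
}
e.setTimestamp(new Date());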

JPA NamedQuery does not pick up changes to modified Entity

I have a method that retrieves entities using a NamedQuery. I update a value on each entity and then run another named query (in the same method and transaction) filtering by the old value, and it returns the same entities as if I had not changed them.
I understand that the EntityManager needs to be flushed, and that this should happen automatically, but that doesn't make any difference.
I enabled Hibernate SQL logging and can see that the entities are not updated when I call flush, but only when the container transaction commits.
EntityManager entityManager = getPrimaryEntityManager();
MyEntity myEntity = entityManager.find(MyEntityImpl.class, allocationId);
myEntity.setStateId(State.ACTIVE);
// Flush the entity manager to pick up any changes to entity states before we run this query.
entityManager.flush();
Query countQuery = entityManager.createNamedQuery("MyEntity.getCountByState");
// We're telling the persistence provider that we want the query to do automatic flushing
// before this particular query is executed.
countQuery.setFlushMode(FlushModeType.AUTO);
countQuery.setParameter("stateId", State.CHECKING);
Long count = (Long) countQuery.getSingleResult();
// Count should be zero but isn't. It doesn't see my change above.
To be honest, I'm not that familiar with JPA, but I ran into similar problems with Hibernate's session manager. My fix was to manually remove the specified object from Hibernate's session before querying it again, so it is forced to do a lookup from the database and doesn't get the object from the cache. You might try doing the same with JPA's EntityManager.
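The JPA analogue of evicting from the Hibernate session would be detaching the instance before re-reading it. A sketch, using the names from the question; note em.detach() is JPA 2.0, while em.clear() (which detaches everything) also exists in JPA 1.0:

// Push pending changes, then evict the instance so the next find() hits the database.
entityManager.flush();
entityManager.detach(myEntity); // or entityManager.clear() to detach everything
MyEntity fresh = entityManager.find(MyEntityImpl.class, allocationId);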
I've just had the same issue and discovered two things:
Firstly, you should check the FlushMode for the persistence context and/or the query.
Secondly, make sure that the entity manager is exactly the same object for both transaction management and query execution. In my case, I had a Mockito spy on the entityManager, which was enough to break the transaction management.