I am trying to avoid duplicate insertions with the following:
List<City> cities = repo.findAllByCityCode(cityCode);
List<String> citycodes = requestPayload.getListOfCityCodes();
uniqueList = removeDuplicateCitiesFromDatabase(cities,citycodes);
repo.saveAll(uniqueList);
The above block of code is likely to be triggered multiple times by the same request (due to duplicate events in an event-driven microservice application).
I was wondering whether I can combine @Transactional with serializable isolation and pessimistic locking on the JPA methods to avoid duplicate entries.
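For reference, the filtering step in the snippet above could look roughly like the sketch below. The real removeDuplicateCitiesFromDatabase is not shown in the question, so the City class, the CityDeduplicator holder and the method body here are assumptions made purely for illustration. Note that an in-memory filter like this cannot by itself stop two concurrent invocations from both deciding that a code is missing; a unique constraint on the city code column is the usual backstop for that race.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the City entity; only the city code matters here.
final class City {
    private final String cityCode;

    City(String cityCode) {
        this.cityCode = cityCode;
    }

    String getCityCode() {
        return cityCode;
    }
}

// Hypothetical holder class; the question does not say where the helper lives.
final class CityDeduplicator {
    // Returns one new City per requested code that is neither already stored
    // nor repeated within the request itself. Request order is preserved.
    static List<City> removeDuplicateCitiesFromDatabase(List<City> existing, List<String> requestedCodes) {
        Set<String> known = new HashSet<>();
        for (City city : existing) {
            known.add(city.getCityCode());
        }
        List<City> unique = new ArrayList<>();
        for (String code : requestedCodes) {
            // Set.add returns false when the code was already present.
            if (known.add(code)) {
                unique.add(new City(code));
            }
        }
        return unique;
    }
}
```

Only the codes that survive the filter would then be passed to repo.saveAll.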
I have a quite complex save process of Spring data JPA repos in one Transaction:
mainRepo.save();
relatedRepo1.save();
relatedRepoOfRelatedRepo1.save();
...
And in the end I call (on mainRepo):
@Modifying
@Query("update mainEntity set finished = true where id = :id")
void setFinishedTrue(@Param("id") UUID id);
I want to guarantee that when setFinishedTrue(id) is called, all the related data is actually in the database, because the call starts an integration process that requires all of that data to be available.
With the default settings (flush mode AUTO), JPA flushes pending changes to the database before executing a query, so the related data will already have been flushed within the same transaction when setFinishedTrue(id) runs. So you are fine.
If you want to be absolutely sure, you can add an explicit flush.
You can do this either by calling JpaRepository.flush() or by injecting the EntityManager and calling flush() on it.
I have some very large queries that EF is generating, which result in slow response times and high CPU use. As an optimization, I thought I'd try MARS and async parallel queries to pull back multiple, simpler result sets in parallel and combine them in memory.
i.e. I'd like to do this:
public async Task<IEnumerable<TResult>> GetResult<TResult>() where TResult : class
{
    // Each call creates its own context, so the two queries never share a DbContext.
    using (var context = new Context())
    {
        return await context.Set<TResult>().ToListAsync().ConfigureAwait(false);
    }
}
// Start both queries, then wait for both to finish.
var result1Task = GetResult<TResult1>();
var result2Task = GetResult<TResult2>();
await Task.WhenAll(result1Task, result2Task).ConfigureAwait(false);
var result1 = result1Task.Result;
var result2 = result2Task.Result;
But I am not sure whether this takes advantage of connection pooling, since it creates a new DbContext for each task.
I found this article, but it isn't using Entity Framework.
I found this one using Core and it wasn't a recommended strategy.
And this one uses Entity Framework for .NET Framework, but its example is a stored procedure, whereas I just want to issue, say, 3 read queries in parallel, not call an SP.
Ideally I am looking for a way to get multiple result sets using LINQ to generate the SQL (vs. using strings such as select Id, VendorName From Vendors....) and to map the results to a class automatically, without having to use strings (vendorID = (int)vendorReader["BusinessEntityID"];).
Is this possible or a pipe dream?
The requirement for running multiple queries concurrently is usually not solved in ORMs with parallelism in the application. It is not safe to access a single DbContext from multiple threads. Instead, a pattern known as future queries is used. For EF6 this is available in the third-party library Z.EntityFramework.Plus.EF6: https://www.nuget.org/packages/Z.EntityFramework.Plus.EF6/
The API is very simple and consists of an extension method that will cause the queries to be added to an internal list until the time when one of the queries is materialized (e.g. by calling ToList). At this time, all the queries are sent to the server in a single batch, and the results are returned together as well.
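To make the deferral idea concrete, here is a minimal plain-Java sketch of the pattern described above. It is not the Z.EntityFramework.Plus API: FutureBatch, FutureValue, defer and executeAll are hypothetical names, and the "queries" are ordinary suppliers run one after another, where a real implementation would send them to the server as one batch. It is also not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration of deferred ("future") queries; not the EF Plus API.
final class FutureBatch {
    private final List<FutureValue<?>> pending = new ArrayList<>();
    private int roundTrips = 0;

    // Registers a query without running it and returns a handle to its result.
    <T> FutureValue<T> defer(Supplier<T> query) {
        FutureValue<T> future = new FutureValue<>(this, query);
        pending.add(future);
        return future;
    }

    // Runs every pending query together and counts it as one round trip.
    void executeAll() {
        if (pending.isEmpty()) {
            return;
        }
        roundTrips++;
        List<FutureValue<?>> toRun = new ArrayList<>(pending);
        pending.clear();
        for (FutureValue<?> future : toRun) {
            future.resolve();
        }
    }

    int getRoundTrips() {
        return roundTrips;
    }
}

final class FutureValue<T> {
    private final FutureBatch batch;
    private final Supplier<T> query;
    private T value;
    private boolean resolved;

    FutureValue(FutureBatch batch, Supplier<T> query) {
        this.batch = batch;
        this.query = query;
    }

    void resolve() {
        value = query.get();
        resolved = true;
    }

    // Materializing any one result triggers the whole batch.
    T get() {
        if (!resolved) {
            batch.executeAll();
        }
        return value;
    }
}
```

Calling get() on the first handle resolves every deferred query, so later get() calls on the other handles return immediately without another round trip.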
I have a question regarding Spring Data Mongo and Mongo Transactions.
I have successfully implemented Transactions, and have verified the commit and rollback works as expected utilizing the Spring @Transactional annotation.
However, I am having a hard time getting the transactions to work the way I would expect in the Spring Data environment.
Spring Data does Mongo -> Java Object mapping. So, the typical pattern for updating something is to fetch it from the database, make modifications, and then save it back to the database. Prior to implementing transactions, we have been using Spring's Optimistic Locking to account for the possibility of updates happening to a record between the fetch and the update.
I was hoping that I would be able to not include the optimistic locking infrastructure for all of my updates once we were able to use Transactions. So, I was hoping that, in the context of a transaction, the fetch would create a lock, so that I could then do my updates and save, and I would be isolated so that no one could get in and make changes like previously.
However, based on what I have seen, the fetch does not create any kind of lock, so nothing prevents any other connection from updating the record, which means it appears that I have to maintain all of my optimistic locking code despite having native mongodb transaction support.
I know I could use MongoDB's findAndModify methods to do my updates, and that would prevent interim modifications from occurring, but that is contrary to the standard pattern of Spring Data, which loads the data into a Java Object. So, rather than just being able to manipulate Java Objects, I would have to either sprinkle Mongo-specific code throughout the app, or create Repository methods for every particular type of update I want to make.
Does anyone have any suggestions on how to handle this situation cleanly while maintaining the Spring Data paradigm of just using Java Objects?
Thanks in advance!
I was unable to find any way to do a 'read' lock within a Spring/MongoDB transaction.
However, in order to be able to continue using the following pattern:
fetch record
make changes
save record
I ended up creating a method which does a findAndModify in order to 'lock' a record during fetch, then I can make the changes and do the save, and it all happens in the same transaction. If another process/thread attempts to update a 'locked' record during the transaction, it is blocked until my transaction completes.
For the lockForUpdate method, I leveraged the version field that Spring already uses for Optimistic locking, simply because it is convenient and can easily be modified for a simple lock operation.
I also added my implementation to a Base Repository implementation to enable 'lockForUpdate' on all repositories.
This is the gist of my solution with a bit of domain specific complexity removed:
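import static org.springframework.data.mongodb.core.query.Criteria.where;
import static org.springframework.data.mongodb.core.query.Query.query;
import java.io.Serializable;
import org.springframework.data.mongodb.core.FindAndModifyOptions;
import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.data.mongodb.repository.query.MongoEntityInformation;
import org.springframework.data.mongodb.repository.support.SimpleMongoRepository;
// BaseRepository and InvalidConfigurationException are application-specific types.
public class BaseRepositoryImpl<T, ID extends Serializable> extends SimpleMongoRepository<T, ID>
        implements BaseRepository<T, ID> {
    private final MongoEntityInformation<T, ID> entityInformation;
    private final MongoOperations mongoOperations;
    public BaseRepositoryImpl(MongoEntityInformation<T, ID> metadata, MongoOperations mongoOperations) {
        super(metadata, mongoOperations);
        this.entityInformation = metadata;
        this.mongoOperations = mongoOperations;
    }
    public T lockForUpdate(ID id) {
        // Verify the class has a version before trying to increment the version in order to lock a record
        try {
            getEntityClass().getMethod("getVersion");
        } catch (NoSuchMethodException e) {
            throw new InvalidConfigurationException("Unable to lock record without a version field", e);
        }
        // Incrementing the version is a write, so it takes the write lock inside the transaction
        return mongoOperations.findAndModify(query(where("_id").is(id)),
                new Update().inc("version", 1L), new FindAndModifyOptions().returnNew(true), getEntityClass());
    }
    private Class<T> getEntityClass() {
        return entityInformation.getJavaType();
    }
}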
Then you can make calls along these lines when in the context of a Transaction:
Record record = recordRepository.lockForUpdate(recordId);
// ...make changes to record...
recordRepository.save(record);
I use Spring Boot 2 with Spring Data.
In a one-to-many relation, what is a good way to remove the relation in a REST architecture?
The child and parent continue to exist; only the relation must be removed.
@DeleteMapping(value="/{id}/child/{childId}")
public void deleteChildRelation(@PathVariable("id") Integer id, @PathVariable("childId") Integer childId){
service.deleteChildRelation(id, childId);
}
We can get the parent, remove the child and save,
or use a @Query annotation and do something like:
@Modifying
@Query("update Child c set c.parent = null where c.id = :id")
void deleteChildRelation(@Param("id") Long id);
The first approach is the JPA way to do it. It is slower, but it leaves you with a consistent session, employs optimistic locking, and also updates JPA's second-level cache. Use it if any of that matters to you.
If you just want the relation to be gone, the second approach is faster and simpler, since it does a single database round trip.
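For the first approach, a helper on the parent that keeps both sides of the relation in sync is a common way to do the "remove child" step. The sketch below is only an illustration: the Parent and Child classes and their method names are assumptions, and the JPA annotations are left out so that it runs as plain Java. In a real service you would load the parent, call the helper, and save inside one transaction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical entities; JPA annotations (@Entity, @OneToMany, @ManyToOne)
// are omitted so the sketch runs as plain Java.
final class Child {
    private final int id;
    private Parent parent;

    Child(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }

    Parent getParent() {
        return parent;
    }

    void setParent(Parent parent) {
        this.parent = parent;
    }
}

final class Parent {
    private final List<Child> children = new ArrayList<>();

    void addChild(Child child) {
        children.add(child);
        child.setParent(this);
    }

    // Detaches the child with the given id and clears its back-reference, so
    // both sides of the relation agree. Returns false if no such child exists.
    boolean removeChild(int childId) {
        Iterator<Child> it = children.iterator();
        while (it.hasNext()) {
            Child child = it.next();
            if (child.getId() == childId) {
                it.remove();
                child.setParent(null);
                return true;
            }
        }
        return false;
    }

    List<Child> getChildren() {
        return Collections.unmodifiableList(children);
    }
}
```

Neither object is deleted here; only the link between them is removed, which matches the requirement that parent and child continue to exist.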
I am using Java SE and learning about the use of a persistence API (toplink-essentials) to manage entities in a Derby DB. Note: this is (distance learning) university work, but it is not 'homework'; this issue crops up in the course materials.
I have two threads operating on the same set of entities. My problem is that every way I have tried, the entities within a query result set (query performed within a transaction) in one thread can be modified so that the result set is no longer valid for the rest of the transaction.
e.g. from one thread this operation is performed:
static void updatePrices(EntityManager manager, double percentage) {
EntityTransaction transaction = manager.getTransaction();
transaction.begin();
Query query = manager.createQuery("SELECT i FROM Instrument i where i.sold = 'no'");
List<Instrument> results = (List<Instrument>) query.getResultList();
// force thread interruption here (testing non-repeatable read)
try { Thread.sleep(2000); } catch (Exception e) { }
for (Instrument i : results) {
i.updatePrice(percentage);
}
transaction.commit();
System.out.println("Price update committed");
}
And if it is interrupted from another thread with this method:
private static void sellInstrument(EntityManager manager, int id)
{
EntityTransaction transaction = manager.getTransaction();
transaction.begin();
Instrument instrument = manager.find(Instrument.class, id);
System.out.println("Selling: " + instrument.toFullString());
instrument.setSold(true);
transaction.commit();
System.out.println("Instrument sale committed");
}
What can happen is that when the thread within updatePrices() resumes, its query result set is invalid, and the price of a sold item ends up being updated to a different price from the one at which it was sold. (The shop wishes to keep records of items that were sold in the DB.) Since there are concurrent transactions occurring, I am using a different EntityManager for each thread (from the same factory).
Is it possible (through locking or some kind of context propagation) to prevent the results of a query becoming 'invalid' during an (interrupted) transaction? I have an idea that this kind of scenario is what Java EE is for, but what I want to know is whether it's doable in Java SE.
Edit:
Taking Vineet and Pascal's advice: using the @Version annotation in the entity's class (with an additional DB column) causes the large transaction ( updatePrices() ) to fail with OptimisticLockException. This is very expensive if it happens at the end of a large set of query results, though. Is there any way to make my query (inside updatePrices() ) lock the relevant rows, so that the thread inside sellInstrument() either blocks or throws an exception and aborts? This would be much cheaper. (From what I understand, I do not have pessimistic locking in Toplink Essentials.)
Thread safety
I have a doubt about the way you manage your EntityManager. While an EntityManagerFactory is thread-safe (and should be created once at application startup), an EntityManager is not, and you should typically use one EntityManager per thread (or synchronize access to it, but I would use one per thread).
Concurrency
JPA 1.0 supports (only) optimistic locking (if you use a Version attribute) and two lock modes that help avoid dirty reads and non-repeatable reads through the EntityManager.lock() API. I recommend reading Read and Write Locking and/or the whole section 3.4 Optimistic Locking and Concurrency of the JPA 1.0 spec for full details.
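To illustrate what the version check amounts to, here is a small in-memory sketch. It is not JPA code: VersionedStore, Row and the method names are hypothetical, and a map stands in for a table with a version column. The point is only that an update succeeds if, and only if, the version read earlier is still the current one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a table with a version column.
final class VersionedStore {
    static final class Row {
        final double price;
        final long version;

        Row(double price, long version) {
            this.price = price;
            this.version = version;
        }
    }

    private final Map<Integer, Row> rows = new HashMap<>();

    synchronized void insert(int id, double price) {
        rows.put(id, new Row(price, 0L));
    }

    synchronized Row read(int id) {
        return rows.get(id);
    }

    // Mirrors "UPDATE ... SET price = ?, version = version + 1
    //          WHERE id = ? AND version = ?": the write only succeeds if
    // nobody changed the row since it was read.
    synchronized boolean update(int id, double newPrice, long expectedVersion) {
        Row current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            // A JPA provider would signal this with OptimisticLockException.
            return false;
        }
        rows.put(id, new Row(newPrice, current.version + 1));
        return true;
    }
}
```

If two threads read the same row and both try to write, the second write sees a stale version and is rejected, which is the failure the question observes at commit time.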
PS: Note that Pessimistic locking is not supported in JPA 1.0 or only through provider specific extensions (it has been added to JPA 2.0, as well as other locking options). Just in case, Toplink supports it through the eclipselink.pessimistic-lock query hint.
As written in the JPA wiki, TopLink Essentials is supposed to support pessimistic locking in JPA 1.0 via a query hint:
// eclipselink.pessimistic-lock
Query query = em.createQuery("select f from Foo f where f.bar=:bar");
query.setParameter("bar", "foobar");
query.setHint("eclipselink.pessimistic-lock", "Lock");
query.getResultList();
I don't use TopLink so I can't confirm this hint is supported in all versions. If it isn't, then you'll have to use a native SQL query if you want to generate a "FOR UPDATE".
You might want to take a look at the EntityManager.lock() method, which allows you to obtain an optimistic or a pessimistic lock on an entity once a transaction has been initialized.
Going by your description of the problem, you wish to lock the database record once it has been 'selected' from the database. This can be achieved via a pessimistic lock, which is more or less equivalent to a SELECT ... FROM tbl FOR UPDATE statement.