How to properly use Locking or Transactions to prevent duplicates using Spring Data - spring-data

What is the best way to check if a record exists and if it doesn't, create it (avoiding duplicates)?
Keep in mind that this is a distributed application running across many application servers.
I'm trying to avoid these:
Race Conditions
TOCTOU
A simple example:
Person.java
@Entity
public class Person {
    @Id
    @GeneratedValue
    private long id;
    private String firstName;
    private String lastName;
    // Getters and setters omitted
}
PersonRepository.java
public interface PersonRepository extends CrudRepository<Person, Long> {
    Person findByFirstName(String firstName);
}
Some Method
public void someMethod() {
    Person john = new Person();
    john.setFirstName("John");
    john.setLastName("Doe");
    if (personRepo.findByFirstName(john.getFirstName()) == null) {
        personRepo.save(john);
    } else {
        // Don't save Person
    }
}
Clearly as the code currently stands, there is a chance that the Person could be inserted in the database in between the time I checked if it already exists and when I insert it myself. Thus a duplicate would be created.
How should I avoid this?
Based on my initial research, perhaps a combination of
@Transactional
@Lock
But the exact configuration is what I'm unsure of. Any guidance would be greatly appreciated. To reiterate, this application will be distributed across multiple servers so this must still work in a highly-available, distributed environment.

For inserts: if you want to prevent the same record from being persisted twice, take precautions on the DB side. In your example, if firstName should be unique, define a unique index on that column (or on the group of columns that should be unique together) and let the DB handle the check. You just insert, and you get an exception if you're inserting a record that already exists.
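For updates: use the @Version (javax.persistence.Version) annotation like this:
@Version
private long version;
Define a version column in the table. Hibernate (or any other JPA provider) populates the value automatically and adds the version to the WHERE clause when the entity is updated. If someone tries to update a stale copy of the entity, the UPDATE matches 0 rows and the change is rejected. Be careful: at the SQL level this shows up only as an update count of 0. For entity updates the JPA provider reports it as an OptimisticLockException at flush or commit, so be prepared to handle that; for a hand-written bulk UPDATE you have to check the update count yourself.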
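To illustrate what the version check amounts to, here is a minimal in-memory sketch in plain Java. It is not Hibernate's implementation: VersionedTable and its methods are hypothetical names, and update only imitates an "UPDATE ... WHERE id = ? AND version = ?" statement by returning the number of rows it would have changed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a table with a version column. */
final class VersionedTable {
    private static final class Row {
        String value;
        long version;
    }

    private final Map<Long, Row> rows = new HashMap<>();

    synchronized void insert(long id, String value) {
        Row row = new Row();
        row.value = value;
        row.version = 0L;
        rows.put(id, row);
    }

    synchronized long currentVersion(long id) {
        return rows.get(id).version;
    }

    synchronized String currentValue(long id) {
        return rows.get(id).value;
    }

    /**
     * Imitates "UPDATE t SET value = ?, version = version + 1 WHERE id = ? AND version = ?".
     * Returns 1 if the row was updated, 0 if the row is missing or the expected version is stale.
     */
    synchronized int update(long id, String newValue, long expectedVersion) {
        Row row = rows.get(id);
        if (row == null || row.version != expectedVersion) {
            return 0; // someone else changed the row since it was read
        }
        row.value = newValue;
        row.version++;
        return 1;
    }
}
```

Two callers that both read version 0 and then call update with that version get 1 and 0 respectively: the second one is working from a stale copy and has to re-read the row before retrying.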

Related

jpa repository save method returns different id from the one inserted into database

I'm using spring data (jpaRepository) + Oracle 11g Database.
Here's the code of my JUnit test:
@Test
public void testAjoutUtilisateur() {
    Utilisateur utilisateur = new Utilisateur();
    (...)
    utilisateur = repository.save(utilisateur);
    Utilisateur dbutilisateur = repository.findOne(utilisateur.getIdutilisateur());
    assertNotNull(dbutilisateur);
When I debug, I find that the "utilisateur" object returned by the repository.save method has an id like "2100", while the corresponding inserted row in the database has an id like "43".
I have an Oracle database with a sequence and a trigger that auto-increment the id of my "Utilisateur" table.
Here is the id definition in my Utilisateur entity:
@Entity
@NamedQuery(name="Utilisateur.findAll", query="SELECT u FROM Utilisateur u")
@SequenceGenerator(sequenceName="ID_UTILISATEUR_SEQ", name="ID_UTILISATEUR_SEQ")
public class Utilisateur implements Serializable {
    private static final long serialVersionUID = 1L;
    @Id
    @GeneratedValue(strategy=GenerationType.SEQUENCE, generator="ID_UTILISATEUR_SEQ")
    private Long idutilisateur;
Where is the problem? Is it within the save method?
Thank you.
Edit:
I figured out that the problem was already solved by @jhadesdev's solution; the rows I was talking about had been inserted while the triggers were still active.
Finally, I should mention that by default the JUnit test does not leave any data in the database (it inserts, then rolls back). To disable this behaviour, specify the @TransactionConfiguration(defaultRollback=false) annotation on the test class.
For example (in my case):
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath:context/dao-context.xml" })
@TransactionConfiguration(defaultRollback=false)
@Transactional
public class UtilisateurRepositoryTest {
Hope it can help someone.
The problem is that two separate mechanisms are in place to generate the key:
one at Hibernate level which is to call a sequence and use the value to populate an Id column and send it to the database as the insert key
and another mechanism at the database that Hibernate does not know about: the column is incremented via a trigger.
Hibernate thinks that the insert was made with the value of the sequence, but in the database something else occurred. The simplest solution would probably be to remove the trigger mechanism, and let Hibernate populate the key based on the sequence only.

JPA @PrePersist & LockModeType.OPTIMISTIC_FORCE_INCREMENT

I came across an interesting situation that I already know how to work around, but I was wondering if there is an elegant solution for it.
I have an entity which cannot have a @Version field, since it is based on a legacy database and the table has no column that could hold this kind of value.
Basically it is something like this:
@Entity
public class MyEntity {
    @Id
    private int id;
    @Temporal(TemporalType.DATE)
    private java.util.Date lastUpdated;
}
This is basically just for EULA (End User License Agreement) checking.
I want the Date to be updated when the EULA has to be re-accepted (the new EULA date is obtained elsewhere).
For that I was planning to use:
@PrePersist
@PreUpdate
protected void setPersistTime() {
    this.lastUpdated = new Date();
}
The @PrePersist callback is called correctly when the entity is stored for the first time, but on subsequent saves JPA seems to think that the entity is unchanged, so @PreUpdate is not called because there is nothing to update.
I was planning to use
em.refresh(myEntity, LockModeType.OPTIMISTIC_FORCE_INCREMENT);
But that won't work without @Version, which I cannot use because of the legacy DB (there is no version column I could use, and the Date is of the wrong type for it).
By the way, I'm using EclipseLink.

JPA 2.0 retrieve entity by business key

I know there have been a number of similar posts about this, but I couldn't find a clear answer to my problem.
To make it as simple as possible, say I have such an entity:
@Entity
public class Person implements Serializable {
    @Id
    private Long id; // PK
    private String name; // business key
    /* getters and setters */
    /*
       override equals() and hashCode()
       to use the **name** field
    */
}
So, id is the PK and name is the business key.
Say that I get a list of names, with possible duplicates, which I want to store.
If I simply create one object per name, and let JPA make it persistent, my final table will contain duplicate names - Not acceptable.
My question is what you think is the best approach, considering the alternatives I describe here below and (especially welcome) your own.
Possible solution 1: check the entity manager
Before creating a new person object, check if one with the same person name is already managed.
Problem: The entity manager can only be queried by PK. Is there any workaround I don't know about?
Possible solution 2: find objects by query
Query query = em.createQuery("SELECT p FROM Person p WHERE p.name = ...");
List<Person> list = query.getResultList();
Questions: Should the objects requested be already loaded in the em, will this still fetch from database? If so, I suppose it would still be not very efficient if done very frequently, due to parsing the query?
Possible solution 3: keep a separate dictionary
This is possible because equals() and hashCode() are overridden to use the field name.
Map<String, Person> personDict = new HashMap<String, Person>();
for (String n : incomingNames) {
    Person p = personDict.get(n);
    if (p == null) {
        p = new Person();
        p.setName(n);
        em.persist(p);
        personDict.put(n, p);
    }
    // do something with it
}
Problem 1: Wasting memory for large collections, as this is essentially what the entity manager does (not quite though!)
Problem 2: Suppose that I have a more complex schema, and that after the initial writing my application gets closed, started again, and needs to re-load the database. If all tables are loaded explicitly into the em, then I can easily re-populate the dictionaries (one per entity), but if I use lazy fetch and/or cascade read, then it's not so easy.
I started recently with JPA (I use EclipseLink), so perhaps I am missing something fundamental here, because this issue seems to boil down to a very common usage pattern.
Please enlighten me!
The best solution which I can think of is pretty simple, use a Unique Constraint
@Entity
@Table(uniqueConstraints = @UniqueConstraint(columnNames = "name"))
public class Person implements Serializable {
    @Id
    private Long id; // PK
    private String name; // business key
}
The only way to ensure that the field can be used (correctly) as a key is to create a unique constraint on it. You can do this using @Table(uniqueConstraints = @UniqueConstraint(columnNames = "name")) on the class (@UniqueConstraint is only valid nested inside another annotation such as @Table) or using @Column(unique = true) on the field.
Upon trying to insert a duplicate key the EntityManager (actually, the DB) will throw an exception. This scenario is also true for a manually set primary key.
The only way to prevent the exception is to do a select on the key and check if it exists.
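As a rough illustration of letting the unique constraint arbitrate instead of relying only on a separate check followed by an insert, the sketch below imitates a table with a unique index on name, in memory and in plain Java. UniqueNameTable, insert and getOrCreate are hypothetical names; ConcurrentHashMap.putIfAbsent stands in for the database's atomic uniqueness check, and a false return value stands in for the duplicate-key exception.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory stand-in for a Person table with a unique index on name. */
final class UniqueNameTable {
    private final ConcurrentMap<String, Long> idByName = new ConcurrentHashMap<>();
    private final AtomicLong sequence = new AtomicLong(1);

    /**
     * Imitates an INSERT guarded by a unique index: the check and the write are one atomic step.
     * Returns true if the row was inserted, false if the name already existed
     * (where a real database would raise a duplicate-key error).
     */
    boolean insert(String name) {
        // Ids may have gaps after a rejected insert, much like a database sequence.
        long id = sequence.getAndIncrement();
        return idByName.putIfAbsent(name, id) == null;
    }

    /** Insert first, then read back: whoever won the insert, every caller sees the same row. */
    long getOrCreate(String name) {
        insert(name);
        return idByName.get(name);
    }

    int rowCount() {
        return idByName.size();
    }
}
```

With real JPA the same shape would be: attempt the insert, treat the constraint violation as "someone else got there first", and then select the existing row by its business key.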

Persisting a list of an interface type with JPA2

I suspect there's no perfect solution to this problem, so least-worst solutions are more than welcome.
I'm implementing a dashboard using PrimeFaces and I would like to persist the model backing it (using JPA2). I've written my own implementation of DashboardModel and DashboardColumn with the necessary annotations and other fields I need. The model is shown below:
@Entity
public class DashboardSettings implements DashboardModel, Serializable {
    @Id
    private long id;
    @OrderColumn( name="COLUMN_ORDER" )
    private List<DashboardColumn> columns;
    ...a few other fields...
    public DashboardSettings() {}
    @Override
    public void addColumn(DashboardColumn column) {
        this.columns.add(column);
    }
    @Override
    public List<DashboardColumn> getColumns() {
        return columns;
    }
    ...snip...
}
The problem is the columns field. I would like this field to be persisted into its own table, but because DashboardColumn is an interface (and from a third party, so it can't be changed) the field currently gets stored in a blob. If I change the type of the columns field to my own implementation (DashboardColumnSettings), which is marked with @Entity, the addColumn method would cease to work correctly: it would have to do a type check and cast.
The type check and cast is not the end of the world as this code will only be consumed by our development team but it is a trip hazard. Is there any way to have the columns field persisted while at the same time leaving it as a DashboardColumn?
You can try to use the targetEntity attribute, though I'm not sure it would be better than an explicit cast:
@OrderColumn( name="COLUMN_ORDER" )
@OneToMany(targetEntity = DashboardColumnSettings.class)
private List<DashboardColumn> columns;
Depends on the JPA implementation (you don't mention which one); the JPA spec doesn't define support for interface fields, nor for Collections of interfaces. DataNucleus JPA certainly allows it, primarily because we support it for JDO also, being something that is part of the JDO spec.

Portable JPA Batch / Bulk Insert

I just came across a feature written by someone else that seems slightly inefficient, but my knowledge of JPA isn't good enough to find a portable solution that's not Hibernate-specific.
In a nutshell, the DAO method, which is called within a loop to insert each of the new entities, does an "entityManager.merge(object);".
Isn't there a way defined in the JPA spec to pass a list of entities to the DAO method and do a bulk / batch insert instead of calling merge for every single object?
Also, since the DAO method is annotated with "@Transactional", I'm wondering whether every single merge call happens within its own transaction, which would not help performance.
Any idea?
No, there is no batch insert operation in vanilla JPA.
Yes, each insert will be done within its own transaction, unless the caller is already running inside one. The @Transactional annotation (with no qualifiers) means a propagation level of REQUIRED (join the existing transaction, or create one if none exists). Assuming you have:
public class Dao {
    @Transactional
    public void insert(SomeEntity entity) {
        ...
    }
}
you do this:
public class Batch {
    private Dao dao;
    @Transactional
    public void insert(List<SomeEntity> entities) {
        for (SomeEntity entity : entities) {
            dao.insert(entity);
        }
    }
    public void setDao(Dao dao) {
        this.dao = dao;
    }
}
That way the entire group of inserts gets wrapped in a single transaction. If you're talking about a very large number of inserts you may want to split it into groups of 1000, 10000 or whatever works as a sufficiently large uncommitted transaction may starve the database of resources and possibly fail due to size alone.
Note: @Transactional is a Spring annotation. See Transaction Management in the Spring Reference.
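Splitting a very large insert into groups, as suggested above, can be sketched with a small helper in plain Java. Chunks and split are hypothetical names, not part of JPA or Spring, and the right chunk size is something to measure rather than assume.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that cuts a list into consecutive groups of at most chunkSize items. */
final class Chunks {
    private Chunks() {
    }

    static <T> List<List<T>> split(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

Each chunk could then be passed to its own Batch.insert(chunk) call so that every group commits separately, assuming the calling code is not itself running inside a transaction.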
What you could do, if you were in a crafty mood, is:
@Entity
public class SomeEntityBatch {
    @Id
    @GeneratedValue
    private int batchID;
    @OneToMany(cascade = {CascadeType.PERSIST, CascadeType.MERGE})
    private List<SomeEntity> entities;
    protected SomeEntityBatch() {} // JPA requires a no-arg constructor
    public SomeEntityBatch(List<SomeEntity> entities) {
        this.entities = entities;
    }
}
List<SomeEntity> entitiesToPersist;
em.persist(new SomeEntityBatch(entitiesToPersist));
// remove the SomeEntityBatch object later
Because of the cascade, that will cause the entities to be inserted in a single operation.
I doubt there is any practical advantage to doing this over simply persisting individual objects in a loop. It would be interesting to look at the SQL that the JPA implementation emits, and to benchmark it.