JPA Entity out of sync with database using multiple sessions - jpa

I have a Java EE application with JPA implemented using Eclipselink. I have implemented a basic user login system using the ExternalContext session map. However it seems that often a session becomes out of sync with the database.
The basic process is
1. User A Creates a BidOrder entity.
2. User B creates an AskOrder entity.
3. A monitor checks the two orders match and if so creates an OrderBook entity
4. The changes are Pushed to all users (using Primefaces 5.3 Push)
When I check the prices I use this method in my SessionScoped backing bean for my main view:
public void findLatestPrices()
{
logger.log(Level.INFO, "findLatestPrices with user {0} ",user.getUserId());
findAllOrderBooks();
latestPricesId = new ArrayList<String>();
setLatestPricesResults(request.findLatestPrices());
for (Iterator<OrderBook> it = orderBookResults.iterator(); it.hasNext();)
{
OrderBook orderBook = it.next();
logger.log(Level.INFO, "Found {0} orderbook price", orderBook.getPrice());
}
logger.log(Level.INFO, "End of findLatestPrices with user {0} ",user.getUserId());
}
This calls my RequestScoped Stateful ejb:
public List<OrderBook> findLatestPrices() {
List<OrderBook> orderBooks;
List<OrderBook> orderBooksFiltered;
Map<Member, OrderBook> map = new LinkedHashMap<Member, OrderBook>();
try {
orderBooks = em.createNamedQuery(
"findAllOrderBooks").setHint("javax.persistence.cache.storeMode", "REFRESH")
.getResultList();
for (Iterator<OrderBook> it = orderBooks.iterator(); it.hasNext();) {
OrderBook orderBook = it.next();
Member member = orderBook.getBidOrderId().getMember();
map.put(member, orderBook);
logger.log(Level.INFO, "findLatestPrices orderbook price : {0}",
orderBook.getPrice());
logger.log(Level.INFO, "findLatestPrices orderbook bidorder member : {0}",
orderBook.getBidOrderId().getMember().getMemberId());
logger.log(Level.INFO, "findLatestPrices orderbook lastupdate : {0}",
orderBook.getLastUpdate());
}
...}
I create the EntityManager in the above bean in the following way:
@PersistenceContext
private EntityManager em;
From the logging I can see that sessions return data that is out of sync with the database, i.e. single results when I'd expect two etc. As you can see I've tried setHint to refresh the cache. I've also tried @Cacheable(false) on my OrderBook entity and @Cache(refreshAlways=true) but to no avail.
I'm sending a push event in the @PostPersist of the entity that is created (OrderBook). The JavaScript event handler in my xhtml page then calls the following remoteCommand:
<p:remoteCommand name="updateWidget"
autoRun="false"
actionListener="#{parliamentManager.findLatestPrices}"
update="resultDisplay"
onstart="onStart()"
oncomplete="onComplete()"
onsuccess="onSuccess()"
onerror="onError()">
<f:actionListener binding="#{parliamentManager.findTraders()}" />
<f:actionListener binding="#{parliamentManager.findPortfolios()}" />
</p:remoteCommand>
It seems that often the results of findLatestPrices do not include the latest OrderBook entities for all sessions. Is it possible that an entity is not persisted immediately on a call to @PostPersist, working on the theory that the push is sent to some sessions before the entity is fully persisted and reflected by JPA?
To demonstrate I added a simple command button to call updateWidget() manually. If the session is not updated and I click the button it always updates to the latest data.
Thanks,
Zobbo

There is no locking between sessions, so I'm not quite sure what you mean. Optimistic locking to prevent overwriting stale data is recommended in most JPA provider documentation.
You haven't shown or specified how you are obtaining the EntityManager, or how long it lives, but there are two levels of caching. The first is the EntityManager itself, which tracks changes to managed entities and maintains their identity. JPA allows but doesn't mandate a second level of caching, shared at the EntityManagerFactory level. This second-level cache is what javax.persistence.cache.storeMode is aimed at: it controls how data read from the database is put into the shared cache. If the entities are already loaded in the first-level cache, which is meant to represent a transactional scope, they are returned as-is, preserving any unsynchronized changes the application might have made, which the JPA provider is required to track.
The only way JPA gives to force a refresh of a managed entity is by calling em.refresh(), though it can also be accomplished by calling em.clear(), then re-reading the entity using the javax.persistence.cache.storeMode REFRESH hint. EclipseLink also has an "eclipselink.refresh" query hint that can be used to force the query to refresh the managed entity instance.
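To illustrate why a stale managed instance survives a re-query until it is refreshed or the context is cleared, here is a minimal plain-Java toy model. It is not the JPA API: ToyContext, Price, and the database map are hypothetical stand-ins for a persistence context, an entity, and a table, and find/refresh/clear only mimic the behaviour described above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a first-level cache; NOT the JPA API. All names are hypothetical. */
class FirstLevelCacheSketch {

    /** Mutable stand-in for a managed entity. */
    static final class Price {
        final long id;
        int value;

        Price(long id, int value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Stand-in for a persistence context over a shared "database" map (id -> value). */
    static final class ToyContext {
        private final Map<Long, Integer> database;
        private final Map<Long, Price> managed = new HashMap<>();

        ToyContext(Map<Long, Integer> database) {
            this.database = database;
        }

        /** Returns the already-managed instance if there is one; reads the database otherwise. */
        Price find(long id) {
            return managed.computeIfAbsent(id, key -> new Price(key, database.get(key)));
        }

        /** Overwrites the managed instance with the current database state. */
        void refresh(Price price) {
            price.value = database.get(price.id);
        }

        /** Detaches everything, so the next find reads the database again. */
        void clear() {
            managed.clear();
        }
    }

    public static void main(String[] args) {
        Map<Long, Integer> database = new HashMap<>();
        database.put(1L, 100);
        ToyContext context = new ToyContext(database);

        Price price = context.find(1L);
        database.put(1L, 120);                       // another session commits a change
        System.out.println(context.find(1L).value);  // still the managed, stale value
        context.refresh(price);
        System.out.println(price.value);             // now the database value
    }
}
```

In the real application the corresponding calls would be em.refresh(entity), or em.clear() followed by the query with the refresh hint.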

Related

JPA/EclipseLink handling of @Version

In an application using EclipseLink 2.5 and container-managed transactions that needs to merge detached entities from time to time, each entity contains a field with the @Version annotation. Some implementation of optimistic locking is necessary, since entities are mapped to DTOs and sent to the client, which might then request an update on these entities based on the changes they have made to the corresponding DTOs. The problem I am facing is that whenever persist() or merge() is called on an entity, the entity added to the persistence context in the case of persist(), or the updated entity returned by merge(), does not contain the updated version field. To demonstrate this through an example, suppose we have the following entity:
@Entity
public class FooEntity implements Serializable {
    @Id
    @GeneratedValue(generator = "MyGenerator")
    private Long fooId;
    @Version
    private Long version;
    @Column(nullable = false)
    private String description;
    // generic setters and getters
}
This entity then gets persisted/merged in an EJB in a fashion similar to the following:
@Stateless
public class FooEjb {
    @PersistenceContext(unitName = "FooApp")
    private EntityManager entityManager;

    public FooEntity create() {
        FooEntity entity = new FooEntity();
        entity.setDescription("bar");
        entityManager.persist(entity);
        return entity;
    }

    public FooEntity update(Long fooId, String description) {
        FooEntity entityToUpdate = entityManager.find(FooEntity.class, fooId);
        if (entityToUpdate != null) {
            entityToUpdate.setDescription(description);
            return entityManager.merge(entityToUpdate);
        }
        return null;
    }
}
Calling these EJB methods shows the following behavior:
FooEntity newEntity = fooEjbInstance.create();
newEntity.getFooId(); // returns the generated, non-null value; for the sake of example 43L
newEntity.getVersion(); // returns null
entityManager.find(FooEntity.class, 43L).getVersion(); // returns 1L
// entity with fooId == 42L exists and has a version value of 1L
FooEntity updatedEntity = fooEjbInstance.update(42L, "fhtagn");
updatedEntity.getVersion(); // returns the old value, i.e. 1L
entityManager.find(FooEntity.class, 42L).getVersion(); // returns 2L
This makes the returned entity unsuitable for passing to the client, as any state changes made by the client cannot be persisted due to the merge/persist call rightly causing an OptimisticLockException.
The EJB methods are not explicitly annotated with @TransactionAttribute, which per the EJB specification should cause the default value of TransactionAttributeType.REQUIRED to be applied. The current theory is that the phenomenon perceived here has to do with the version field being updated only when the transaction is committed. Since by the time one of the EJB methods above returns, its associated transaction has not yet been committed (and will in fact be committed immediately after the method returns), the version field has not yet been updated. There was a mention of in-object vs. in-cache storage of the version field in this question, but I have not been able to find definitive documentation on this. Is this as a whole working as designed, either according to JPA 2.0 or the EclipseLink implementation? If so, how could I best deal with the aforementioned problem?
Merge and persist don't need to go to the database immediately, so operations that depend on triggers, sequencing, and versions might need to call flush or commit to have those values set. Flush forces the context to synchronize with the database, and should set the values appropriately in managed objects. IDs can be assigned on the persist call itself when sequences allow for pre-allocation, but usually not when identity columns or triggers are used.
Since an EntityManager context represents a transaction, it is completely isolated from other contexts/transactions. Until the transaction commits, version and other changes cannot be picked up by other processes anyway, so it shouldn't matter when synchronization occurs to other processes. JPA states that most exceptions occur either on the persist/merge call or can be delayed until the context synchronizes with the database (flush/commit), depending on the nature of the operation. Optimistic locking is meant for systems where these collisions are infrequent and retries are less expensive than pessimistically locking on every operation.
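The deferred version assignment described above can be sketched with a small plain-Java toy model. This is not EclipseLink's implementation: ToyContext, Foo, and the dirty flag are hypothetical simplifications that only show a version staying unset after persist and being bumped when the context is flushed.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of deferred version assignment; NOT the JPA API. All names are hypothetical. */
class VersionOnFlushSketch {

    /** Stand-in for an entity with a version field. */
    static final class Foo {
        Long version;        // stays null until the first flush
        String description;
        boolean dirty;       // simplification: real providers detect changes themselves

        void setDescription(String description) {
            this.description = description;
            this.dirty = true;
        }
    }

    /** Stand-in for a persistence context that writes only on flush. */
    static final class ToyContext {
        private final List<Foo> managed = new ArrayList<>();

        /** Registers the entity only; nothing is written and no version is set yet. */
        void persist(Foo foo) {
            foo.dirty = true;
            managed.add(foo);
        }

        /** Writes pending changes and bumps the version of every changed entity. */
        void flush() {
            for (Foo foo : managed) {
                if (foo.dirty) {
                    foo.version = (foo.version == null) ? 1L : foo.version + 1;
                    foo.dirty = false;
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyContext context = new ToyContext();
        Foo foo = new Foo();
        foo.setDescription("bar");
        context.persist(foo);
        System.out.println(foo.version);  // not assigned yet
        context.flush();
        System.out.println(foo.version);  // assigned by the flush
    }
}
```

In the EJB from the question, the analogous step would be calling entityManager.flush() after persist() or merge() and before returning the entity, so the returned instance already carries the new version.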

Why do changes to my JPA entity not get persisted to the database?

In a Spring Boot Application, I have an entity Task with a status that changes during execution:
@Entity
public class Task {
public enum State {
PENDING,
RUNNING,
DONE
}
@Id @GeneratedValue
private long id;
private String name;
private State state = State.PENDING;
// Setters omitted
public void setState(State state) {
this.state = state; // THIS SHOULD BE WRITTEN TO THE DATABASE
}
public void start() {
this.setState(State.RUNNING);
// do useful stuff
try { Thread.sleep(2000); } catch(InterruptedException e) {}
this.setState(State.DONE);
}
}
If state changes, the object should be saved in the database. I'm using this Spring Data interface as repository:
public interface TaskRepository extends CrudRepository<Task,Long> {}
And this code to create and start a Task:
Task t1 = new Task("Task 1");
Task persisted = taskRepository.save(t1);
persisted.start();
From my understanding, persisted is now attached to a persistence session, and if the object changes, these changes should be stored in the database. But this is not happening: when reloading it, the state is PENDING.
Any ideas what I'm doing wrong here?
tl;dr
Attaching an instance to a persistence context does not mean every change of the state of the object gets persisted directly. Change detection only occurs on certain events during the lifecycle of the persistence context.
Details
You seem to have misunderstood the way change detection works. A very central concept of JPA is the so-called persistence context. It is basically an implementation of the unit-of-work pattern. You can add entities to it in two ways: by loading them from the database (executing a query or issuing an EntityManager.find(…)) or by actively adding them to the persistence context. This is what the call to the save(…) method effectively does.
An important point to realize here is that "adding an entity to the persistence context" does not have to be equal to "stored in the database". The persistence provider is free to postpone the database interaction as long as it thinks is reasonable. Providers usually do that to be able to batch up modifying operations on the data. In a lot of cases however, an initial save(…) (which translates to an EntityManager.persist(…)) will be executed directly, e.g. if you're using auto id increment.
That said, the entity has now become a managed entity. That means the persistence context is aware of it and will transparently persist the changes made to the entity when events occur that require it. The two most important ones are the following:
The persistence context gets closed. In Spring environments the lifecycle of the persistence context is usually bound to a transaction. In your particular example, the repositories have a default transaction (and thus persistence context) boundary. If you need the entity to stay managed beyond it, you need to extend the transaction lifecycle (usually by introducing a service layer that has @Transactional annotations). In web applications we often see the Open Entity Manager In View pattern, which is basically a request-bound lifecycle.
The persistence context is flushed. This can happen either manually (by calling EntityManager.flush()) or transparently. E.g. if the persistence provider needs to issue a query, it will usually flush the persistence context to make sure currently pending changes can be found by the query. Imagine you loaded a user, changed his address to a new place and then issued a query to find users by their addresses. The provider will be smart enough to flush the address change first and execute the query afterwards.
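As a rough illustration of change detection happening only at flush time, here is a minimal plain-Java sketch of a unit of work that compares each managed object with a snapshot. It is a toy model, not how Hibernate or Spring Data is implemented: ToyContext, its database map, and the string-valued state are hypothetical simplifications.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy unit of work with snapshot-based dirty checking; NOT the JPA API. Names are hypothetical. */
class DirtyCheckingSketch {

    /** Stand-in for the Task entity; state is a plain string here. */
    static final class Task {
        final long id;
        String state = "PENDING";

        Task(long id) {
            this.id = id;
        }
    }

    static final class ToyContext {
        /** Stand-in for the table: id -> stored state. */
        final Map<Long, String> database = new HashMap<>();
        /** Managed instances with the state last written for each (null = never written). */
        private final Map<Task, String> snapshots = new IdentityHashMap<>();

        /** Makes the task managed; nothing is written yet. */
        void persist(Task task) {
            snapshots.put(task, null);
        }

        /** Compares every managed task with its snapshot and writes only the changed ones. */
        int flush() {
            int writes = 0;
            for (Map.Entry<Task, String> entry : snapshots.entrySet()) {
                Task task = entry.getKey();
                if (!task.state.equals(entry.getValue())) {
                    database.put(task.id, task.state);
                    entry.setValue(task.state);
                    writes++;
                }
            }
            return writes;
        }
    }

    public static void main(String[] args) {
        ToyContext context = new ToyContext();
        Task task = new Task(1L);
        context.persist(task);
        context.flush();                               // like save(...) in its own transaction

        task.state = "DONE";                           // changed, but nothing triggers a flush
        System.out.println(context.database.get(1L));  // the table still holds the old state

        context.flush();                               // e.g. the surrounding transaction commits
        System.out.println(context.database.get(1L));  // now the change is stored
    }
}
```

In the Spring example this corresponds to keeping the entity managed until a flush happens, for instance by running persisted.start() inside a method that is itself transactional.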

Managing Context Lifetime in Entity Framework

I'm having an issue with my Context lifetimes in an N-Tier application.
An example of a wrapper I am using:
Public Class User
Private _user As DB.User
Private context As New DB.MyContainer
Public Sub New(ByVal UserID As Integer)
_user = context.Users.FirstOrDefault(Function(x) x.Id = UserID)
End Sub
Public Sub Save()
context.SaveChanges()
End Sub
This method is causing issues in my UI layer. The data can be updated by the UI layer, and this will still return "stale" data because the context has not been disposed. If I call context.Dispose() in Finalize(), then I am unable to access any of the properties of the class.
Should I just call .reload() every time, or should I shorten the context's lifetime? To shorten it, wouldn't I have to detach the entity, then reattach it to the new context when Save() is called?
Please see this article:
http://msdn.microsoft.com/en-us/magazine/ee335715.aspx
Create a new ObjectContext instance in a Using statement for each
service method so that it is disposed of before the method returns.
This step is critical for scalability of your service. It makes sure
that database connections are not kept open across service calls and
that temporary state used by a particular operation is garbage
collected when that operation is over. The Entity Framework
automatically caches metadata and other information it needs in the
app domain, and ADO.NET pools database connections, so re-creating the
context each time is a quick operation.

Repository pattern with EF4 CTP5

I'm trying to implement the repository pattern with ef4 ctp5, I came up with something but I'm no expert in ef so I want to know if what I did is good.
this is my db context
public class Db : DbContext
{
public DbSet<User> Users { get; set; }
public DbSet<Role> Roles { get; set; }
}
and the repository: (simplified)
public class Repo<T> : IRepo<T> where T : Entity, new()
{
private readonly DbContext context;
public Repo()
{
context = new Db();
}
public IEnumerable<T> GetAll()
{
return context.Set<T>().AsEnumerable();
}
public long Insert(T o)
{
context.Set<T>().Add(o);
context.SaveChanges();
return o.Id;
}
}
You need to step back and think about what the repository should be doing. A repository is used for retrieving records, adding records, and updating records. The repository you created barely handles the first case, handles the second case but not efficiently, and doesn't at all handle the 3rd case.
Most generic repositories have an interface along the lines of
public interface IRepository<T> where T : class
{
IQueryable<T> Get();
void Add(T item);
void Delete(T item);
void CommitChanges();
}
For retrieving records, you can't just call the whole set with AsEnumerable() because that will load every database record for that table into memory. If you only want Users with the username of username1, you don't need to download every user from the database, as that would be a large performance hit on both the database and the client for no benefit at all.
Instead, as you will see from the interface I posted above, you want to return an IQueryable<T> object. IQueryables allow whatever class calls the repository to use LINQ and add filters to the database query, and once the IQueryable is run, it runs entirely on the database, retrieving only the records you want. The database is much better at sorting and filtering data than your systems, so it's best to do as much on the DB as you can.
Now in regards to inserting data, you have the right idea but you don't want to call SaveChanges() immediately. The reason is that it's best to call SaveChanges() after all your db operations have been queued. For example, if you want to create a user and his profile in one action, you can't via your method, because each Insert call will cause the data to be inserted into the database immediately.
Instead, what you want is to separate out the SaveChanges() call into the CommitChanges method I have above.
This is also needed to handle updating data in your database. In order to change an Entity's data, Entity Framework keeps track of all records it has received and watches them to see if any changes have been made. However, you still have to tell the Entity Framework to send all changed data up to the database. This happens with the context.SaveChanges() call. Therefore, you need this to be a separate call so you are able to actually update edited data, which your current implementation does not handle.
Edit:
Your comment made me realize another issue that I see. One downfall is that you are creating a data context inside of the repository, and this isn't good. You really should have all (or most) of your created repositories sharing the same instance of your data context.
Entity Framework keeps track of what context an entity is tracked in, and will throw an exception if you attempt to update an entity in one context with another. This can occur in your situation when you start editing entities related to one another. It also means that your SaveChanges() call is not transactional, and each entity is updated/added/deleted in its own transaction, which can get messy.
My solution to this in my Repositories, is that the DbContext is passed into the repository in the constructor.
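The idea of passing one context into every repository and committing once can be sketched as follows. This is a plain-Java analogue, not Entity Framework code: ToyContext stands in for the DbContext, commitChanges for a separate CommitChanges/SaveChanges call, and all class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java analogue of repositories sharing one context. All names are hypothetical. */
class SharedContextSketch {

    static final class User {
        final String name;

        User(String name) {
            this.name = name;
        }
    }

    static final class Profile {
        final String bio;

        Profile(String bio) {
            this.bio = bio;
        }
    }

    /** Stand-in for a data context: queues additions and writes them in one commit. */
    static final class ToyContext {
        final List<Object> stored = new ArrayList<>();
        private final List<Object> pending = new ArrayList<>();

        void add(Object item) {
            pending.add(item);
        }

        /** Moves every queued item to the store in one step; returns how many were written. */
        int commitChanges() {
            int count = pending.size();
            stored.addAll(pending);
            pending.clear();
            return count;
        }
    }

    /** Repository that receives its context in the constructor instead of creating one. */
    static final class Repository<T> {
        private final ToyContext context;

        Repository(ToyContext context) {
            this.context = context;
        }

        /** Queues the item on the shared context; nothing is written until commitChanges(). */
        void add(T item) {
            context.add(item);
        }
    }

    public static void main(String[] args) {
        ToyContext context = new ToyContext();
        Repository<User> users = new Repository<>(context);
        Repository<Profile> profiles = new Repository<>(context);

        users.add(new User("username1"));
        profiles.add(new Profile("hello"));
        System.out.println(context.stored.size());    // nothing written yet
        System.out.println(context.commitChanges());  // both items written together
    }
}
```

Because both repositories queue work on the same context, the user and the profile are written by a single commit rather than by two independent saves.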
I may get voted down for this, but DbContext already is a repository. When you expose your domain models as collection properties of your concrete DbContext, then EF CTP5 creates a repository for you. It presents a collection like interface for access to domain models whilst allowing you to pass queries (as linq, or spec objects) for filtering of results.
If you need an interface, CTP5 doesn't provide one for you. I've wrapped my own around the DBContext and simply exposed the publicly available members from the object. It's an adapter for testability and DI.
I'll comment for clarification if what I said isn't apparently obvious.

JPA, scope, and autosave?

I am using JPA and let's say I do something like this
public class MoRun extends Thread {...
public void run() {
final EntityManagerFactory emFactory = Persistence.createEntityManagerFactory("pu");
EntityManager manager = emFactory.createEntityManager();
manager.setFlushMode(FlushModeType.COMMIT);
someMethod(manager);
...
}
public void someMethod(EntityManager manager){
Query query = manager.createNamedQuery("byStates");
List<State> list = query.getResultList();
for (State state : list) {
if(someTest)
state.setValue(...)
}
...
}
So for those objects that pass "someTest" and have their values updated, are those changes automatically persisted to the db even though there is no transaction and I don't explicitly call "manager.save(state)" on the object? I ask because it seems like they are, and I was wondering if the flush is doing it?
According to the javadoc of FlushMode (I'm assuming this is a JPA 1.0 question), and as pointed out by @Konrad:
If there is no transaction active, the persistence provider must not flush to the database.
Since you're very likely using a transaction-type="RESOURCE_LOCAL" for your persistence unit, since I don't see any begin/commit surrounding your calls to your EntityManager (which is not good, more on this just after), for me there is no transaction active so I wouldn't expect anything to be flushed.
Anyway, as reminded in the nice JPA Concepts page:
With <persistence-unit transaction-type="RESOURCE_LOCAL"> you are responsible for EntityManager (PersistenceContext/Cache) creating and tracking...
You must use the EntityManagerFactory to get an EntityManager
The resulting EntityManager instance is a PersistenceContext/Cache
An EntityManagerFactory can be injected via the @PersistenceUnit annotation only (not @PersistenceContext)
You are not allowed to use @PersistenceContext to refer to a unit of type RESOURCE_LOCAL
You must use the EntityTransaction API to begin/commit around every call to your EntityManager
Calling entityManagerFactory.createEntityManager() twice results in two separate EntityManager instances and therefore two separate PersistenceContexts/Caches.
It is almost never a good idea to have more than one instance of an EntityManager in use (don't create a second one unless you've destroyed the first)
So, in my opinion, you should fix your code here; there is no real point in wondering about unexpected behavior if your code is not correct. Just perform the calls to your EntityManager inside a transaction.
How do you know there is no transaction? Are you using it from EJB? In that case I bet there is a transaction.
From docs (http://java.sun.com/javaee/6/docs/api/javax/persistence/FlushModeType.html):
If FlushModeType.COMMIT is set, the
effect of updates made to entities in
the persistence context upon queries
is unspecified.
If there is no transaction active, the
persistence provider must not flush to
the database.
If you are in a transaction, attached entities (i.e. those loaded in the same transaction) are automatically written to the database.