Why do changes to my JPA entity not get persisted to the database?

In a Spring Boot application, I have an entity Task with a status that changes during execution:
@Entity
public class Task {
    public enum State {
        PENDING,
        RUNNING,
        DONE
    }
    @Id @GeneratedValue
    private long id;
    private String name;
    private State state = State.PENDING;
    // Setters omitted
    public void setState(State state) {
        this.state = state; // THIS SHOULD BE WRITTEN TO THE DATABASE
    }
    public void start() {
        this.setState(State.RUNNING);
        // do useful stuff
        try { Thread.sleep(2000); } catch(InterruptedException e) {}
        this.setState(State.DONE);
    }
}
If state changes, the object should be saved in the database. I'm using this Spring Data interface as repository:
public interface TaskRepository extends CrudRepository<Task,Long> {}
And this code to create and start a Task:
Task t1 = new Task("Task 1");
Task persisted = taskRepository.save(t1);
persisted.start();
From my understanding, persisted is now attached to a persistence session, and if the object changes, these changes should be stored in the database. But this is not happening: when reloading it, the state is PENDING.
Any ideas what I'm doing wrong here?

tl;dr
Attaching an instance to a persistence context does not mean every change to the state of the object gets persisted immediately. Change detection only occurs on certain events during the lifecycle of the persistence context.
Details
You seem to have misunderstood the way change detection works. A very central concept of JPA is the so-called persistence context. It is basically an implementation of the unit-of-work pattern. You can add entities to it in two ways: by loading them from the database (executing a query or issuing an EntityManager.find(…)) or by actively adding them to the persistence context. The latter is what the call to the save(…) method effectively does.
An important point to realize here is that "adding an entity to the persistence context" is not necessarily the same as "storing it in the database". The persistence provider is free to postpone the database interaction for as long as it considers reasonable. Providers usually do that to be able to batch up modifying operations on the data. In a lot of cases, however, an initial save(…) (which translates to an EntityManager.persist(…)) will be executed directly, e.g. if you're using auto-incremented ids.
That said, the entity has now become a managed entity. That means the persistence context is aware of it and will transparently persist the changes made to the entity when events occur that require it. The two most important ones are the following:
The persistence context gets closed. In Spring environments the lifecycle of the persistence context is usually bound to a transaction. In your particular example, the repositories have a default transaction (and thus persistence context) boundary. If you need the entity to stay managed beyond that boundary, you need to extend the transaction lifecycle (usually by introducing a service layer that has @Transactional annotations). In web applications we often see the Open Entity Manager In View pattern, which is basically a request-bound lifecycle.
The persistence context is flushed. This can happen either manually (by calling EntityManager.flush()) or transparently. E.g. if the persistence provider needs to issue a query, it will usually flush the persistence context to make sure currently pending changes can be found by the query. Imagine you loaded a user, changed their address to a new place and then issued a query to find users by their addresses. The provider will be smart enough to flush the address change first and execute the query afterwards.
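The two events above can be illustrated with a toy model in plain Java. This is not JPA: ToyTask, ToyContext and the in-memory DATABASE map are hypothetical names invented for this sketch. It only shows that a change made while the context is still open gets written when the context is flushed or closed, whereas a change made after the context has been closed is never written.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types belong to JPA or Spring.
class ToyTask {
    final long id;
    String state = "PENDING";
    ToyTask(long id) { this.id = id; }
}

class ToyContext implements AutoCloseable {
    // Stands in for the database table: id -> stored state.
    static final Map<Long, String> DATABASE = new HashMap<>();

    private final List<ToyTask> managed = new ArrayList<>();
    private boolean open = true;

    // Roughly what save(...) does: make the entity managed.
    ToyTask persist(ToyTask task) {
        if (!open) throw new IllegalStateException("context is closed");
        managed.add(task);
        return task;
    }

    // Write the current state of every managed entity.
    void flush() {
        for (ToyTask task : managed) {
            DATABASE.put(task.id, task.state);
        }
    }

    // Closing flushes once more and detaches everything.
    @Override
    public void close() {
        flush();
        managed.clear();
        open = false;
    }
}

class ToyLifecycleDemo {
    // Mirrors the question: the context ends right after save(...).
    static String changeAfterClose() {
        ToyTask task = new ToyTask(1L);
        try (ToyContext context = new ToyContext()) {
            context.persist(task);
        }
        task.state = "DONE"; // detached: nobody writes this
        return ToyContext.DATABASE.get(1L);
    }

    // Mirrors the fix: the change happens while the context is open.
    static String changeBeforeClose() {
        ToyTask task = new ToyTask(2L);
        try (ToyContext context = new ToyContext()) {
            context.persist(task);
            task.state = "DONE"; // still managed: written on close
        }
        return ToyContext.DATABASE.get(2L);
    }

    public static void main(String[] args) {
        System.out.println(changeAfterClose());  // PENDING
        System.out.println(changeBeforeClose()); // DONE
    }
}
```

In the original scenario, the equivalent of changeBeforeClose() would be to call start() inside a method whose transaction is still running, for example a service method annotated with @Transactional.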

Related

How do database connections get managed for Spring Data JPA repositories?

I have a question about how Spring Data repositories handle the datasource connections. Assuming Spring Data repositories open and close the connection when the method executes, how does the transaction started by declaring @Transactional in my service layer span multiple repository calls?
Who handles the database connections? The @Transactional annotation or the JPA repository?
tl;dr
Ultimately it's the Spring JPA / transaction infrastructure managing the connection via the thread-bound management of EntityManager instances. The scope of the transaction is controlled by @Transactional annotations in the user code, with defaults provided by Spring Data JPA's repository implementation. Connection acquisition is performed eagerly in case an OpenEntityManagerInViewFilter is used (enabled by default in Spring Boot 1.x and 2.x).
Details
SimpleJpaRepository is equipped with Spring's @Transactional annotations so that it makes sure it runs transactions in cases where JPA requires them (e.g. to execute a call to EntityManager.persist(…) or ….merge(…)). Their default configuration makes sure they automatically take part in transactions started at higher levels of abstraction. I.e. if you have a Spring component that's @Transactional itself, repositories will simply participate in the already running transaction:
@Component
class MyService {
    private final FirstRepository first;
    private final SecondRepository second;
    // Constructor omitted for brevity
    @Transactional
    void someMethod() {
        … = first.save(…);
        … = second.save(…);
    }
}
Both repositories participate in the transaction and a failure in one of them will roll back the entire transaction.
To achieve that, the JpaTransactionManager will use the transaction management API exposed by JPA's EntityManager to start a transaction and acquire a connection for the lifetime of the EntityManager instance. See JpaTransactionManager.doBegin(…) for details.
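The idea of thread-bound participation can be sketched with a toy model in plain Java. This is not Spring's implementation: ToyTransaction, ToyRepository, ToyService and the connection counter are hypothetical names made up for the illustration. The sketch only shows why two repository calls acquire two connections when called on their own, but share a single one when an outer transactional method is already running on the same thread.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types are invented for the sketch, not Spring classes.
class ToyTransaction {
    // The transaction (and its connection) is bound to the current thread.
    private static final ThreadLocal<ToyTransaction> CURRENT = new ThreadLocal<>();
    // Counts how often a simulated connection was acquired.
    static int connectionsAcquired = 0;

    final List<String> writes = new ArrayList<>();

    // Join the transaction bound to this thread, or start one if there is none.
    static void required(Runnable work) {
        if (CURRENT.get() != null) {
            work.run(); // participate in the running transaction
            return;
        }
        CURRENT.set(new ToyTransaction());
        connectionsAcquired++; // a new transaction needs its own connection
        try {
            work.run();
        } finally {
            CURRENT.remove(); // transaction ends, connection is released
        }
    }

    static ToyTransaction current() { return CURRENT.get(); }
}

class ToyRepository {
    // Like a repository method with a default transaction boundary.
    void save(String entity) {
        ToyTransaction.required(() -> ToyTransaction.current().writes.add(entity));
    }
}

class ToyService {
    private final ToyRepository first = new ToyRepository();
    private final ToyRepository second = new ToyRepository();

    // Like a service method annotated with @Transactional.
    void someMethod() {
        ToyTransaction.required(() -> {
            first.save("a");
            second.save("b");
        });
    }
}

class ToyTransactionDemo {
    static int connectionsForDirectCalls() {
        int before = ToyTransaction.connectionsAcquired;
        new ToyRepository().save("a");
        new ToyRepository().save("b");
        return ToyTransaction.connectionsAcquired - before;
    }

    static int connectionsForServiceCall() {
        int before = ToyTransaction.connectionsAcquired;
        new ToyService().someMethod();
        return ToyTransaction.connectionsAcquired - before;
    }

    public static void main(String[] args) {
        System.out.println(connectionsForDirectCalls()); // 2
        System.out.println(connectionsForServiceCall()); // 1
    }
}
```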
The role of an OpenEntityManagerInViewFilter or –Interceptor
Unless explicitly deactivated, Spring Boot 1.x and 2.x web applications run with an OpenEntityManagerInViewFilter deployed. It's used to create an EntityManager and thus acquire a connection pretty early and keep it around until very late in the request processing, namely after the view has been rendered. This has the effect of JPA lazy-loading being available to the view rendering but keeps the connection open for longer than if it was only needed for the actual transactional work.
That topic is quite a controversial one, as it's a tricky balance between developer convenience (the ability to traverse object relations that are loaded lazily in the view rendering phase) and the risk of exactly that triggering expensive additional queries and keeping the resources in use for a longer time.

How are state changes to JPA entities actually tracked

When I javax.persistence.EntityManager.find() an @Entity, the EntityManager checks the transaction for an existing instance of its associated persistence context. If one exists, then
if the entity being searched for is present in the context, then that is what is returned to the caller of EntityManager.find
if the entity being searched for is not present in the context, then EntityManager gets it from the datasource and puts it there, and then that is what is returned to the caller of EntityManager.find
And if the transaction does not contain an existing instance of the manager's associated persistence context, then the manager creates one, associates it with the transaction, finds the entity in the datasource, and adds it to that context for management, and then returns that entity to the caller of find.
--> the result is the same in that the caller now has a managed entity that exists in the persistence context. (important: the persistence context is attached to the transaction, so if the transaction has ended at the point at which the client gets hold of the 'managed' entity, well then the persistence context is no more and the entity is 'detached' and no longer managed).
Now, when I make state changes using setters or other internal state-changing methods on my @Entity instance, those changes are tracked because my entity is part of a persistence context that will get flushed to the datasource when the transaction finally commits. My question is HOW are the state changes tracked and by what? If I were making the changes via some intermediary object, then that intermediary object could update the persistence context accordingly, but I'm not (or am I?). I'm making the changes directly using my @Entity annotated object. So how are these changes being tracked?
Perhaps there are events that are being listened for? Listened for by what? I'm reading books and articles on the subject, but I can't pin this one point down.
State changes are tracked by the JPA vendor's internal implementation during the entity's lifecycle.
The dirty checking strategy is vendor specific. It can be done by comparing fields or by bytecode enhancement, as posted in JPA dirty checking.
Although it's vendor specific, the persistence context will become aware of the state changes during state synchronization, on flush or on commit.
It's important to remember all the points at which a flush can happen:
Manually
Before querying
Before commit
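The field comparison strategy mentioned above can be sketched in plain Java. This is a toy model and not the implementation of any particular JPA provider: ToyUser and SnapshotContext are hypothetical names. The point it tries to make is that the setter (here a plain field write) does not notify anyone. The context keeps a copy of the values it saw when the entity became managed and only discovers the difference when it synchronizes, i.e. on flush.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model only: not how any particular JPA provider is implemented.
class ToyUser {
    String name;
    String address;
    ToyUser(String name, String address) {
        this.name = name;
        this.address = address;
    }
}

class SnapshotContext {
    // Per managed entity: the field values as they were when it became managed.
    private final Map<ToyUser, String[]> snapshots = new IdentityHashMap<>();

    void manage(ToyUser user) {
        snapshots.put(user, new String[] { user.name, user.address });
    }

    // On flush, compare current values with the snapshot and report differences.
    List<String> flush() {
        List<String> updates = new ArrayList<>();
        for (Map.Entry<ToyUser, String[]> entry : snapshots.entrySet()) {
            ToyUser user = entry.getKey();
            String[] loaded = entry.getValue();
            if (!Objects.equals(loaded[0], user.name)) {
                updates.add("name=" + user.name);
                loaded[0] = user.name; // snapshot now matches what was written
            }
            if (!Objects.equals(loaded[1], user.address)) {
                updates.add("address=" + user.address);
                loaded[1] = user.address;
            }
        }
        return updates;
    }
}

class SnapshotDemo {
    static List<String> changeThenFlush() {
        SnapshotContext context = new SnapshotContext();
        ToyUser user = new ToyUser("Alice", "Old Street 1");
        context.manage(user);
        user.address = "New Street 2"; // plain field write, nobody is notified
        return context.flush();        // the difference is found only here
    }

    public static void main(String[] args) {
        System.out.println(changeThenFlush()); // [address=New Street 2]
    }
}
```

Bytecode enhancement, the other strategy named above, would instead record the change at the moment the field is written. That variant is not shown here.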

Managing Context Lifetime in Entity Framework

I'm having an issue with my Context lifetimes in an N-Tier application.
An example of a wrapper I am using:
Public Class User
    Private _user As DB.User
    Private context As New DB.MyContainer
    Public Sub New(ByVal UserID As Integer)
        _user = context.Users.FirstOrDefault(Function(x) x.Id = UserID)
    End Sub
    Public Sub Save()
        context.SaveChanges()
    End Sub
End Class
This approach is causing issues in my UI layer. The data can be updated by the UI layer, and this will still return "stale" data because the context has not been disposed. If I call context.Dispose() in Finalize(), then I am unable to access any of the properties of the class.
Should I just call .reload() every time, or should I shorten the lifetime of the context? To shorten it, wouldn't I have to detach the entity and then reattach it to the new context when Save() is called?
Please see this article:
http://msdn.microsoft.com/en-us/magazine/ee335715.aspx
Create a new ObjectContext instance in a Using statement for each
service method so that it is disposed of before the method returns.
This step is critical for scalability of your service. It makes sure
that database connections are not kept open across service calls and
that temporary state used by a particular operation is garbage
collected when that operation is over. The Entity Framework
automatically caches metadata and other information it needs in the
app domain, and ADO.NET pools database connections, so re-creating the
context each time is a quick operation.

Having static Repository class in a webforms project reuses entity framework connections?

I have a
public static class Repository
in my webforms project.
In the static block of that class I setup my entity framework entity object:
private static readonly ProjectEntities db;
static Repository()
{
db = new ProjectEntities("Name=ProjectEntities");
}
Then I setup some public static methods like this:
public static Order GetOrder(int orderID)
{
return db.Orders.First(o => o.OrderID == orderID);
}
The problem is that when, for instance, a deletion fails (because of some constraint), I randomly get traces of that in subsequent connections, showing up as exceptions from queries that should be innocent. For instance, exceptions about deletions as a result of select queries.
I never
db.AcceptAllChanges();
upon any exception, and I should not have to, because across page accesses, there should be no trace of failed queries. Or should it? Is the cleaning responsibility on me?
Those problems should not be because of me using static (please say it is not like that), so is it related to entity framework connection pooling?
Generally speaking, the Entity Framework context is meant to be short lived, i.e. it is generally regarded as a unit of work whereby you create it for a particular task and dispose of it at the end. It's a lightweight object and should be used in this way.
Your issue is a result of the object being long lived (i.e. in a singleton shared across requests). In this case the internal state of the context becomes invalid: you try to delete something, it cannot persist those changes to the database, and it is therefore left in an invalid state.
You could probably resolve your issue by calling the refresh method before making use of the object in every case - this will cause the object to update its state based on the database - but this will probably cause other issues.
However, this is the wrong thing to do - the context should be created, used and disposed per request.
Hope this helps.
I would seriously suggest you investigate the lifecycle management of your context object.
Have a look at this excellent answer as to what your options are.

JPA, scope, and autosave?

I am using JPA and let's say I do something like this:
public class MoRun extends Thread {...
    public void run() {
        final EntityManagerFactory emFactory = Persistence.createEntityManagerFactory("pu");
        EntityManager manager = emFactory.createEntityManager();
        manager.setFlushMode(FlushModeType.COMMIT);
        someMethod(manager);
        ...
    }
    public void someMethod(EntityManager manager){
        Query query = manager.createNamedQuery("byStates");
        List<State> list = query.getResultList();
        for (State state : list) {
            if(someTest)
                state.setValue(...)
        }
        ...
    }
So for those objects that pass "someTest" and have their values updated, are those changes automatically persisted to the db even though there is no transaction and I don't explicitly "manager.save(state)" the object? I ask because it seems like they are, and I was wondering if the flush is doing it.
According to the javadoc of FlushModeType (I'm assuming this is a JPA 1.0 question), and as pointed out by @Konrad:
If there is no transaction active, the persistence provider must not flush to the database.
Since I don't see any begin/commit surrounding your calls to your EntityManager (which is not good, more on this just after), and since you're very likely using transaction-type="RESOURCE_LOCAL" for your persistence unit, there is no transaction active as far as I can tell, so I wouldn't expect anything to be flushed.
Anyway, as reminded in the nice JPA Concepts page:
With <persistence-unit transaction-type="RESOURCE_LOCAL"> you are responsible for EntityManager (PersistenceContext/Cache) creating and tracking...
You must use the EntityManagerFactory to get an EntityManager
The resulting EntityManager instance is a PersistenceContext/Cache
An EntityManagerFactory can be injected via the @PersistenceUnit annotation only (not @PersistenceContext)
You are not allowed to use @PersistenceContext to refer to a unit of type RESOURCE_LOCAL
You must use the EntityTransaction API to begin/commit around every call to your EntityManager
Calling entityManagerFactory.createEntityManager() twice results in two separate EntityManager instances and therefore two separate PersistenceContexts/Caches.
It is almost never a good idea to have more than one instance of an EntityManager in use (don't create a second one unless you've destroyed the first)
So, in my opinion, you should fix your code here; there is no real point in wondering about unexpected behavior if your code is not correct. Just perform the calls to your EntityManager inside a transaction.
How do you know there is no transaction? Are you using it from EJB? In that case I bet there is a transaction.
From docs (http://java.sun.com/javaee/6/docs/api/javax/persistence/FlushModeType.html):
If FlushModeType.COMMIT is set, the effect of updates made to entities in the persistence context upon queries is unspecified.
If there is no transaction active, the persistence provider must not flush to the database.
If you are in a transaction, attached entities (i.e. those loaded in the same transaction) are automatically written to the database.