When does flush and clear commit? - jpa

I'm using JPA EclipseLink 2.0 with Glassfish 3.1.2.2
I want to know if after I call
em.flush()
em.clear()
The objects are immediately committed to the database. My problem is that I'm doing so many transactions that I'm getting an OutOfMemoryError. I want to avoid this by flushing the transaction's objects.
After I flush and clear, I can't see any entity committed to the database immediately. I can only see them AFTER the whole process is done, which tells me this isn't actually committing.
If flush and clear doesn't commit:
1) What does it actually do?
2) Why am I no longer getting OutOfMemory?
Please tell me if I'm right:
The objects that were allocated in my RAM are sent to the database, but the changes are not yet committed. This only means I cleared my RAM; the objects are now in the DB server, but the transaction is not yet committed.

Entities are synchronized to the connected database at transaction commit time. If you have only one ongoing transaction (here: JTA/container managed), changes on one or more entities get written to the DB the moment you call flush() on the EntityManager instance.
However, changes become "visible" only after the transaction has been properly completed by the container (here: Glassfish), which is responsible for transaction handling. For reference, see section 7.6.1 (p. 294) of the JPA Spec 2.0, which defines:
A new persistence context begins when the container-managed entity manager is invoked (Specifically, when one of the methods of the EntityManager interface is invoked) in the scope of an active JTA transaction, and there is no current persistence context already associated with the JTA transaction. The persistence context is created and then associated with the JTA transaction.
The persistence context ends when the associated JTA transaction commits or rolls back, and all entities that were managed by the EntityManager become detached.
In section 3.2.4 (Synchronization to the Database) of the JPA Spec 2.0 we find:
The state of persistent entities is synchronized to the database at transaction commit.
[..]
The persistence provider runtime is permitted to perform synchronization to the database at other times as well when a transaction is active. The flush method can be used by the application to force synchronization.
It applies to entities associated with the persistence context. The EntityManager and Query setFlushMode methods can be used to control synchronization semantics. The effect of FlushModeType.AUTO is defined in section 3.8.7. If FlushModeType.COMMIT is specified, flushing will occur at transaction commit; the persistence provider is permitted, but not required, to perform to flush at other times. If there is no transaction active, the persistence provider must not flush to the database.
Most likely in your scenario, the container (Glassfish) and/or your application is configured for FlushModeType.COMMIT(*1). In case FlushModeType.AUTO is in place, it is up to the Persistence Provider (EclipseLink) which "is responsible for ensuring that all updates to the state of all entities in the persistence context which could potentially affect the result of the query are visible to the processing of the query." (Section 3.8.7, p. 122)
By contrast, the clear() method does NOT commit anything by itself. It simply detaches all managed entities from the current persistence context, so any changes to entities that have not yet been flushed are lost. For reference, see p. 70 of the linked JPA Spec.
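The difference between flush(), clear() and commit can be sketched with a toy model. The classes below (ToyDatabase, ToyPersistenceContext) are hypothetical and are not the JPA API; they only mimic the behaviour described above, assuming a database that hides uncommitted rows from other sessions.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical classes, NOT the JPA API.
class ToyDatabase {
    final List<String> committed = new ArrayList<>();   // rows visible to other sessions
    final List<String> uncommitted = new ArrayList<>(); // rows written inside the open transaction

    void commit()   { committed.addAll(uncommitted); uncommitted.clear(); }
    void rollback() { uncommitted.clear(); }
}

class ToyPersistenceContext {
    private final ToyDatabase db;
    final List<String> managed = new ArrayList<>();          // entities held in application memory
    private final List<String> pending = new ArrayList<>();  // changes not yet sent to the database

    ToyPersistenceContext(ToyDatabase db) { this.db = db; }

    void persist(String entity) { managed.add(entity); pending.add(entity); }

    // flush: send pending changes to the database, still inside the open transaction
    void flush() { db.uncommitted.addAll(pending); pending.clear(); }

    // clear: detach everything; changes that were never flushed are lost
    void clear() { managed.clear(); pending.clear(); }
}

class FlushClearDemo {
    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        ToyPersistenceContext em = new ToyPersistenceContext(db);
        em.persist("order-1");
        em.flush();
        em.clear();
        System.out.println("managed after clear: " + em.managed.size());
        System.out.println("visible before commit: " + db.committed);
        db.commit();
        System.out.println("visible after commit: " + db.committed);
    }
}
```

In this model the application memory is released by clear(), the row already sits on the database side after flush(), and it becomes visible only once the transaction commits, which matches what the question observed.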
With respect to the OutOfMemoryError, it's hard to tell what is causing it and under which circumstances, as you did not provide much detail. However, I would:
1) read the aforementioned sections of the JPA specification,
2) check how your environment is configured, and
3) reevaluate how your application is written/implemented; it may be making false assumptions about the transaction handling of the container it is running in.
Related to 2., you might check your persistence.xml whether it configures
<property name="eclipselink.persistence-context.flush-mode" value="COMMIT" />
and change it to AUTO to see if there is any difference.
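For the memory problem itself, a common approach is to flush and clear after every fixed number of persisted entities, so the persistence context never holds more than one batch. The sketch below assumes a hypothetical BatchEntityManager interface that stands in for the persist(), flush() and clear() methods of javax.persistence.EntityManager, so that it compiles without a JPA dependency; the batch size is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the three EntityManager methods used here.
interface BatchEntityManager {
    void persist(Object entity);
    void flush();
    void clear();
}

class BatchWriter {
    // Persists all items, flushing and clearing after every batchSize entities.
    static void persistInBatches(BatchEntityManager em, List<?> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int count = 0;
        for (Object item : items) {
            em.persist(item);
            count++;
            if (count % batchSize == 0) {
                em.flush(); // send the batch to the database (still uncommitted)
                em.clear(); // detach the batch so it can be garbage collected
            }
        }
        // A final partial batch is left for the flush that happens at commit.
    }
}

// Recording fake, used only to show the call pattern.
class RecordingEntityManager implements BatchEntityManager {
    int managed = 0;
    int maxManaged = 0;
    int flushes = 0;

    public void persist(Object entity) { managed++; maxManaged = Math.max(maxManaged, managed); }
    public void flush() { flushes++; }
    public void clear() { managed = 0; }
}

class BatchDemo {
    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) items.add(i);
        RecordingEntityManager em = new RecordingEntityManager();
        BatchWriter.persistInBatches(em, items, 3);
        System.out.println("flushes=" + em.flushes + ", max managed=" + em.maxManaged);
    }
}
```

With a real EntityManager the loop body would look the same; everything still belongs to one transaction and becomes visible only at commit.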
Hope it helps.
Footnotes
*1: But that's a good guess, as you did not provide that much detail on your setup/environment.

On transaction commit, JPA flushes automatically. You should see the objects in the DB right after the first transaction ends, not only after the whole process ends. Check whether you really run several transactions or just one.

Related

Why is persistence context called persistence context?

Is it context because it acts as a stepping stone until the object is permanently stored in db?
(Because context is a dynamic feeling from where to where rather than a static feeling?)
A context in computing terms is defined as follows:
In computer science, a task context is the minimal set of data used by a task that must be saved to allow a task to be interrupted, and later continued from the same point.
(Source: Wikipedia, "Context (computing)")
A persistence context is a specific context relating to database persistence. Just like any other context it will store required state relating to database persistence.
Is it context because it acts as a stepping stone until the object is permanently stored in db?
JPA works with transactions; this can be hidden when you use a web framework that automatically begins and ends a transaction for each HTTP request. The persistence context acts as a kind of cache during a transaction, storing any database reads. Any updates made are also kept in the context until the transaction finishes or you manually flush, at which point they are written to the database.

transaction-type="RESOURCE_LOCAL" with jta-data-source

I came across an old project using OpenJPA with DB2 running on WebSphere Liberty 18. In the persistence.xml file there is a persistence unit with the following declaration:
<persistence-unit name="my-pu" transaction-type="RESOURCE_LOCAL">
<provider>org.apache.openjpa.persistence.PersistenceProviderImpl</provider>
<jta-data-source>jdbc/my-data-source</jta-data-source>
</persistence-unit>
In the case that we are using RESOURCE_LOCAL transactions and there is code to manually manage the transactions scattered throughout the whole application, shouldn't the data source be declared as "non-jta-data-source"? Interestingly it seems the application is working fine despite that. Any ideas why it works fine?
<non-jta-data-source> specifies a data source that refuses to enlist in JTA transactions. This means, if you do userTransaction.begin (or take advantage of any API by which the container starts a transaction for you), and you perform some operations on the data source (which is marked with transactional="false" in Liberty) those operations will not be part of the encompassing JTA transaction and can be committed or rolled back independently. It's definitely an advanced pattern, and if you don't know what you are doing, or temporarily forget that the data source doesn't enlist, you can end up writing code that corrupts your data.
At this point, you may be wondering why JPA even has such an option. I expect it isn't intended for the end user's usage of JPA programming model at all, but is really for the JPA persistence provider (Hibernate/EclipseLink/OpenJPA) implementation. For example, if you consider the scenario where a JTA transaction is active on the thread and you perform an operation via JPA where the JPA persistence provider needs to generate a unique key for you, and the persistence provider needs to run some database command to reserve the next block of unique keys, the JPA persistence provider can't just do that within your transaction because you might end up rolling it back, and then the same block of unique keys could be given out twice and errors would occur. The JPA persistence provider really needs to suspend your transaction, run its own transaction, and then resume yours.
In my opinion suspend/resume would have been the natural solution here, but the JTA spec doesn't provide a standard way to obtain the TransactionManager, and so my guess is that the JPA spec invented its own solution for situations like this of requiring a data source that bypasses transaction enlistment as an alternative. A JPA provider can run its own transactional operations on the non-jta-data-source while your JTA transaction continues on unimpacted by it.
You'll also notice with the example I chose, that it doesn't apply to a number of paths through JPA. If your JPA entity is configured to have the database generate the unique keys instead, then the persistence provider doesn't need to perform its own database operations on a non-jta-data-source. If you aren't using JTA transactions, then the persistence provider doesn't need to worry about enlisting in your transaction because it can just use a different connection, so it doesn't need a non-jta-data-source there either.
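The key-block scenario from the answer can be illustrated with a toy model. ToySequenceTable and KeyBlockDemo are hypothetical names and the block size of 50 is arbitrary; the sketch only shows why a reservation that is enlisted in the caller's transaction can hand out the same block twice after a rollback, while an independently committed reservation cannot.

```java
// Toy model only: a sequence table with transactional semantics.
class ToySequenceTable {
    private long committedNext = 1; // value that survives a rollback
    private long workingNext = 1;   // value seen inside the open transaction

    // Reserves a block of keys and returns the first key of the block.
    long reserveBlock(int size) {
        long start = workingNext;
        workingNext += size;
        return start;
    }

    void commit()   { committedNext = workingNext; }
    void rollback() { workingNext = committedNext; }
}

class KeyBlockDemo {
    // The reservation is part of the caller's transaction, so a rollback undoes it.
    static long[] enlistedReservation() {
        ToySequenceTable table = new ToySequenceTable();
        long first = table.reserveBlock(50);
        table.rollback();                     // the caller's transaction rolls back
        long second = table.reserveBlock(50); // the same block is handed out again
        return new long[] { first, second };
    }

    // The reservation is committed on its own, as with a non-enlisting data source.
    static long[] independentReservation() {
        ToySequenceTable table = new ToySequenceTable();
        long first = table.reserveBlock(50);
        table.commit();                       // the provider commits independently
        // The caller's rollback never reaches the table because it was not enlisted.
        long second = table.reserveBlock(50);
        return new long[] { first, second };
    }

    public static void main(String[] args) {
        long[] enlisted = enlistedReservation();
        long[] independent = independentReservation();
        System.out.println("enlisted: " + enlisted[0] + ", " + enlisted[1]);
        System.out.println("independent: " + independent[0] + ", " + independent[1]);
    }
}
```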

Is @PostRemove out of transaction?

I found the following information in the spec, but it's not clear enough for me, as I am not a native English speaker.
The PostPersist and PostRemove callback methods are invoked for an entity after the entity has been made persistent or removed. These callbacks will also be invoked on all entities to which these operations are cascaded. The PostPersist and PostRemove methods will be invoked after the database insert and delete operations respectively. These database operations may occur directly after the persist, merge, or remove operations have been invoked or they may occur directly after a flush operation has occurred (which may be at the end of the transaction). Generated primary key values are available in the PostPersist method.
My question is: can any transaction-related work still be rolled back after @PostRemove?
Let's say my entity deletes some offline files in @PostRemove:
import javax.persistence.*;
@Entity
class MyEntity {
    @Id
    private Long id;

    @PostRemove
    private void onPostRemove() {
        // delete offline files related to this entity
        // not restorable!
    }
}
Is it possible that those offline files are deleted from storage while the entity is still left in the database (because of a rollback)?
Yes, it is possible that your files are deleted and your entities are still left in the DB after a rollback. @PostRemove runs inside the transaction.
If you want to be absolutely sure that your files are deleted if and only if the transaction completes successfully, then you should delete the files after commit() succeeds, not in the callback methods. But if you also need to be sure that the entity is removed if and only if the file is deleted, then you have a problem: you need a transactional way of accessing the file system.
For a simple solution, move your files into a to_be_deleted folder during the DB transaction. You can use the callback methods for that. The files are finally deleted when commit() succeeds and restored on failure.
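A minimal sketch of the to_be_deleted idea, using only java.nio.file. The class StagedFileDeleter and its method names are hypothetical; how afterCommit() and afterRollback() get called depends on your transaction handling, and the sketch assumes the staging folder is on the same file system as the original files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: stages files for deletion and finishes or undoes the work later.
class StagedFileDeleter {
    private final Path stagingDir;
    // staged location -> original location
    private final Map<Path, Path> staged = new LinkedHashMap<>();

    StagedFileDeleter(Path stagingDir) {
        this.stagingDir = stagingDir;
    }

    // Call during the transaction, for example from a @PostRemove callback
    // (a real callback would have to wrap the checked IOException).
    void stage(Path file) throws IOException {
        Files.createDirectories(stagingDir);
        Path target = stagingDir.resolve(staged.size() + "_" + file.getFileName());
        Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
        staged.put(target, file);
    }

    // Call after commit() has succeeded: the staged files are deleted for good.
    void afterCommit() throws IOException {
        for (Path stagedFile : staged.keySet()) {
            Files.deleteIfExists(stagedFile);
        }
        staged.clear();
    }

    // Call after a rollback: the staged files are moved back to where they were.
    void afterRollback() throws IOException {
        for (Map.Entry<Path, Path> entry : staged.entrySet()) {
            Files.move(entry.getKey(), entry.getValue());
        }
        staged.clear();
    }
}
```

This does not make the file system transactional; a crash between commit() and afterCommit() leaves files in the staging folder, which a cleanup job would have to handle.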
If you want to elaborate on it a bit more and your application is running in a Java EE container, then you might want to look at CDI events or even at a JCA adapter. If you are using Spring, you can register a TransactionSynchronizationAdapter.
It depends.
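If you're using explicit flushes (EntityManager#flush()), the callbacks fire at flush time and the transaction could still be rolled back afterwards. Otherwise, the callbacks prefixed with Post run when the provider flushes at commit time, that is, after the database operations have been executed; the commit itself can still fail after that.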

How are state changes to JPA entities actually tracked

When I call javax.persistence.EntityManager.find() for an @Entity, the EntityManager checks the transaction for an existing instance of its associated persistence context. If one exists, then
if the entity being searched for is present in the context, then that is what is returned to the caller of EntityManager.find
if the entity being searched for is not present in the context, then EntityManager gets it from the datasource and puts it there, and then that is what is returned to the caller of EntityManager.find
And if the transaction does not contain an existing instance of the manager's associated persistence context, then the manager creates one, associates it with the transaction, finds the entity in the datasource, adds it to that context for management, and then returns that entity to the caller of find.
--> the result is the same in that the caller now has a managed entity that exists in the persistence context. (important: the persistence context is attached to the transaction, so if the transaction has ended at the point at which the client gets hold of the 'managed' entity, then the persistence context is no more and the entity is 'detached' and no longer managed).
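The lookup described above can be sketched as a toy model. ToyCustomer, ToyTx and ToyFinder are hypothetical and not JPA classes; the "datasource" is just a map from id to name.

```java
import java.util.HashMap;
import java.util.Map;

// Toy entity, hypothetical.
class ToyCustomer {
    final long id;
    String name;

    ToyCustomer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Toy transaction that may carry a persistence context.
class ToyTx {
    Map<Long, ToyCustomer> context; // null until the first find() in this transaction
}

class ToyFinder {
    private final Map<Long, String> dataSource; // id -> stored name
    int dataSourceReads = 0;

    ToyFinder(Map<Long, String> dataSource) {
        this.dataSource = dataSource;
    }

    ToyCustomer find(ToyTx tx, long id) {
        if (tx.context == null) {
            tx.context = new HashMap<>(); // create the context and associate it with the transaction
        }
        ToyCustomer managed = tx.context.get(id);
        if (managed != null) {
            return managed;               // already managed: return the same instance
        }
        dataSourceReads++;
        String name = dataSource.get(id);
        if (name == null) {
            return null;                  // no such row
        }
        ToyCustomer loaded = new ToyCustomer(id, name);
        tx.context.put(id, loaded);       // the loaded entity is now managed
        return loaded;
    }
}
```

Two find() calls for the same id within one ToyTx return the same object and read the data source once; a different ToyTx gets its own instance.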
Now, when I make state changes using setters or other internal state-changing methods on my @Entity instance, those changes are tracked because my entity is part of a persistence context that will get flushed to the datasource when the transaction finally commits. My question is HOW are the state changes tracked, and by what? If I were making the changes via some intermediary object, then that intermediary object could update the persistence context accordingly, but I'm not (or am I?). I'm making the changes directly on my @Entity annotated object. So how are these changes being tracked?
Perhaps there are events that are being listened for? Listened for by what? I'm reading books and articles on the subject, but I can't pin this one point down.
State changes are tracked by the JPA vendor's internal implementation during the entity's lifecycle.
The dirty checking strategy is vendor specific. It can be done by comparing fields against a snapshot or by bytecode enhancement.
Although it's vendor specific, the persistence context becomes aware of the state changes during state synchronization, on flush or on commit.
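A minimal sketch of the field-comparison strategy, assuming a single tracked field. ToyPerson and SnapshotTracker are hypothetical; real providers are considerably more involved, and bytecode enhancement works differently.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy entity, hypothetical.
class ToyPerson {
    String name;

    ToyPerson(String name) {
        this.name = name;
    }
}

// Keeps a copy of each managed entity's state and compares it at flush time.
class SnapshotTracker {
    // entity instance -> name it had when it was loaded or last flushed
    private final Map<ToyPerson, String> snapshots = new IdentityHashMap<>();

    void manage(ToyPerson person) {
        snapshots.put(person, person.name);
    }

    // Returns the managed entities whose current state differs from the snapshot.
    List<ToyPerson> findDirty() {
        List<ToyPerson> dirty = new ArrayList<>();
        for (Map.Entry<ToyPerson, String> entry : snapshots.entrySet()) {
            if (!Objects.equals(entry.getValue(), entry.getKey().name)) {
                dirty.add(entry.getKey());
            }
        }
        return dirty;
    }

    // "Flush": a real provider would issue UPDATE statements for the dirty entities here.
    int flush() {
        List<ToyPerson> dirty = findDirty();
        for (ToyPerson person : dirty) {
            snapshots.put(person, person.name); // refresh the snapshot
        }
        return dirty.size();
    }
}
```

The entity itself notifies nobody when a setter runs; in this model the change is only discovered when the tracker compares state at flush time.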
It's important to remember all the points where flushes can happen:
Manually
Before querying
Before commit

Play Framework - JPA - @Transactional error?

I'm experiencing very strange behaviour with transactions using play-2.2.2 with JPA and EclipseLink.
My controller action is annotated with @Transactional like this:
@Transactional
public static Result submitOrder() {
// class does call private Methods which persist some entities (methods not annotated)
//...
The action calls private methods to persist data (this should happen in the same transaction, since no other transaction is started).
During the method calls (at random locations) data gets written to the DB (inserts and updates). Debugging shows that the same (active) transaction is used before and after the write. EntityTransactionImpl:commit is never executed and the transaction stays active until the request is finished (watched via play.db.jpa.JPA.em().getTransaction()).
How is it possible that the data is written although the transaction is still active?
It breaks the setRollbackOnly mechanism, since already written data isn't rolled back.
Could there be some kind of timeout that triggers these writes?
Can you suggest a debugging entry point to narrow down the problem (where can I start debugging the actual write operations, if not in EntityTransactionImpl:commit)?
Dependencies in build.sbt
persistence.xml
The behaviour described above seemed very odd at first, but then I read about FlushMode and now it makes sense.
The FlushMode of EclipseLink, as well as Hibernate, defaults to FlushModeType.AUTO.
FlushModeType.AUTO automatically flushes entities to the DB when the provider thinks it's necessary. This can be because of a read operation (query) on a persisted (but not yet flushed) entity, but it also seemed to happen at random during my observations.
This breaks the rollback-on-failure mechanism, which I thought must be the standard behaviour of @Transactional.
To achieve a proper rollback (on failure or if setRollbackOnly() is set) of all persisted but not flushed entities at transaction commit, you have to explicitly set the FlushMode at the beginning of your action.
JPA.em().setFlushMode(FlushModeType.COMMIT);
If you're using Eclipselink, you can also set the following property to make it default behaviour:
<property name="eclipselink.persistence-context.flush-mode" value="commit" />
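The difference between the two flush modes can be sketched with a toy model. ToyFlushMode and ToyFlushingContext are hypothetical and not the JPA API. The sketch assumes the common behaviour where AUTO flushes pending changes before a query runs and COMMIT defers them until commit; the spec permits, but does not require, a provider to flush at other times under COMMIT.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for FlushModeType, hypothetical.
enum ToyFlushMode { AUTO, COMMIT }

class ToyFlushingContext {
    private final ToyFlushMode mode;
    private final List<String> pending = new ArrayList<>(); // changes held in memory
    final List<String> sentToDb = new ArrayList<>();        // statements already executed in the open transaction

    ToyFlushingContext(ToyFlushMode mode) {
        this.mode = mode;
    }

    void persist(String row) {
        pending.add(row);
    }

    // A query must see pending changes under AUTO, so the context flushes first.
    List<String> query() {
        if (mode == ToyFlushMode.AUTO) {
            flush();
        }
        return new ArrayList<>(sentToDb);
    }

    void flush() {
        sentToDb.addAll(pending);
        pending.clear();
    }

    // Commit always flushes whatever is still pending.
    void commit() {
        flush();
    }
}

class FlushModeDemo {
    public static void main(String[] args) {
        ToyFlushingContext auto = new ToyFlushingContext(ToyFlushMode.AUTO);
        auto.persist("order-1");
        System.out.println("AUTO, rows seen by query: " + auto.query().size());

        ToyFlushingContext commit = new ToyFlushingContext(ToyFlushMode.COMMIT);
        commit.persist("order-1");
        System.out.println("COMMIT, rows seen by query: " + commit.query().size());
        commit.commit();
        System.out.println("COMMIT, rows sent after commit: " + commit.sentToDb.size());
    }
}
```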
Links which helped me understand:
Eclipselink Context Flushmode
what to use flush mode auto or commit
performance tuning hibernate