Why is persistence context called persistence context? - jpa

Why is persistence context called persistence context?
Is it context because it acts as a stepping stone until the object is permanently stored in db?
(That is, does "context" suggest something dynamic, a passage from one place to another, rather than something static?)

A context in computing terms is defined as follows:
In computer science, a task context is the minimal set of data used by a task that must be saved to allow a task to be interrupted, and later continued from the same point.
(Source: Wikipedia, "Context (computing)")
A persistence context is a specific context relating to database persistence. Just like any other context it will store required state relating to database persistence.
Is it context because it acts as a stepping stone until the object is permanently stored in db?
JPA works with transactions. This can be hidden if you use a web framework that automatically begins and ends a transaction for each HTTP request. The persistence context acts as a kind of cache during a transaction, storing any entities read from the database. Any updates you make are also held in the context until the transaction finishes or you manually flush, at which point they are written to the database.
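To make the "cache plus deferred writes" idea concrete, here is a minimal sketch in plain Java. It is a toy model only: ToyPersistenceContext and its methods are made-up names for illustration and are not part of the JPA API, and a simple Map stands in for the database.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes belong to JPA.
class ToyContextDemo {
    public static void main(String[] args) {
        Map<Long, String> database = new HashMap<>(); // stands in for the real DB
        database.put(1L, "Alice");

        ToyPersistenceContext ctx = new ToyPersistenceContext(database);
        ctx.find(1L);             // first read goes to the "database"
        ctx.find(1L);             // second read is served from the context
        ctx.update(1L, "Alicia"); // the change is held in the context only

        System.out.println(database.get(1L)); // prints "Alice"
        ctx.flush();
        System.out.println(database.get(1L)); // prints "Alicia"
    }
}

class ToyPersistenceContext {
    private final Map<Long, String> database;                  // hypothetical backing store
    private final Map<Long, String> cache = new HashMap<>();   // state read or changed so far
    private final Map<Long, String> pending = new HashMap<>(); // writes not yet sent
    int databaseReads = 0;                                     // counts real reads

    ToyPersistenceContext(Map<Long, String> database) {
        this.database = database;
    }

    // Return the cached value if present; otherwise read it once and cache it.
    String find(long id) {
        if (!cache.containsKey(id)) {
            databaseReads++;
            cache.put(id, database.get(id));
        }
        return cache.get(id);
    }

    // Record the change in the context; the database is not touched yet.
    void update(long id, String value) {
        cache.put(id, value);
        pending.put(id, value);
    }

    // Send all pending changes to the database.
    void flush() {
        database.putAll(pending);
        pending.clear();
    }
}
```

A real provider does much more (entity identity, relationships, transactions), but the shape is the same: reads are remembered, and writes wait for a flush.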

Related

Should my logs table be managed by entity framework?

I want to log exceptions to my database to ensure failures are recorded. I am using Entity Framework.
Should I set up an encapsulated logging service that records to a table which is not managed by Entity Framework, or should I just make an EF class called Log?
I'm thinking that a log is not really an entity that represents part of my application but rather metadata, which is why I ask.
Consider a separate (bounded) context for your general logging. If logs happen to reference top-level entities you can define minimal entity definitions for these as well. Logging operations are heavy-write, so by keeping a separate DbContext you minimize the spin-up time.
When it comes to auditing (i.e. persisting change tracking), I commonly use a pattern that hooks directly into the DbContext events and records information whenever entities are updated, inserted, or deleted.

When does flush and clear commit?

I'm using JPA EclipseLink 2.0 with Glassfish 3.1.2.2
I want to know if after I call
em.flush()
em.clear()
The objects are immediately committed to the database. My problem is that I'm doing so many transactions that I'm getting OutOfMemory. I want to avoid this by flushing the transaction's objects.
After I flush and clear, I can't see any entity immediately committed to the database. I can only see them AFTER the whole process is done, which tells me this isn't actually committing.
If flush and clear doesn't commit:
1) What does it actually do?
2) Why am I no longer getting OutOfMemory?
Please tell me if I'm right:
The objects that were allocated in my RAM are sent to the database, but the changes are not yet committed. This only means I cleared my RAM; the objects are now in the DB server, but the transaction is not yet committed.
Entities are synchronized to the connected database at transaction commit time. If you only have n = 1 ongoing transaction (here: JTA/container managed), changes on one or more entities get written to the DB the moment you call flush() on the EntityManager instance.
However, changes become "visible" only after the transaction has been committed by the container (here: Glassfish), which is responsible for transaction handling. For reference, see section 7.6.1 (p. 294) of the JPA 2.0 specification, which defines:
A new persistence context begins when the container-managed entity manager is invoked (Specifically, when one of the methods of the EntityManager interface is invoked) in the scope of an active JTA transaction, and there is no current persistence context already associated with the JTA transaction. The persistence context is created and then associated with the JTA transaction.
The persistence context ends when the associated JTA transaction commits or rolls back, and all entities that were managed by the EntityManager become detached.
In section 3.2.4 (Synchronization to the Database) of the JPA Spec 2.0 we find:
The state of persistent entities is synchronized to the database at transaction commit.
[..]
The persistence provider runtime is permitted to perform synchronization to the database at other times as well when a transaction is active. The flush method can be used by the application to force synchronization.
It applies to entities associated with the persistence context. The EntityManager and Query setFlushMode methods can be used to control synchronization semantics. The effect of FlushModeType.AUTO is defined in section 3.8.7. If FlushModeType.COMMIT is specified, flushing will occur at transaction commit; the persistence provider is permitted, but not required, to perform to flush at other times. If there is no transaction active, the persistence provider must not flush to the database.
Most likely in your scenario, the container (Glassfish) and/or your application is configured for FlushModeType.COMMIT(*1). In case FlushModeType.AUTO is in place, it is up to the Persistence Provider (EclipseLink) which "is responsible for ensuring that all updates to the state of all entities in the persistence context which could potentially affect the result of the query are visible to the processing of the query." (Section 3.8.7, p. 122)
By contrast, the clear() method does NOT commit anything by itself. It simply detaches all managed entities from the current persistence context, so any changes to entities that have not yet been flushed are lost. For reference, see p. 70 of the linked JPA Spec.
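The difference between flush(), clear(), and commit can be sketched with a small plain-Java model. This is an illustration under stated assumptions, not JPA code: ToyTransaction, its three lists, and the batch size of 25 are all hypothetical. It shows why flushing and clearing in batches keeps memory bounded while nothing becomes visible until commit.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: ToyTransaction is a made-up class, not a JPA type.
class FlushClearDemo {
    public static void main(String[] args) {
        ToyTransaction tx = new ToyTransaction();
        for (int i = 1; i <= 100; i++) {
            tx.persist("row-" + i);
            if (i % 25 == 0) { // hypothetical batch size
                tx.flush();    // send pending rows to the DB inside the open transaction
                tx.clear();    // drop the in-memory entities
            }
        }
        System.out.println("managed in memory: " + tx.managedCount());    // 0
        System.out.println("visible before commit: " + tx.visibleRows()); // 0
        tx.commit();
        System.out.println("visible after commit: " + tx.visibleRows());  // 100
    }
}

class ToyTransaction {
    private final List<String> managed = new ArrayList<>();     // entities held in memory
    private final List<String> pending = new ArrayList<>();     // changes not yet flushed
    private final List<String> uncommitted = new ArrayList<>(); // sent to the DB, transaction still open
    private final List<String> committed = new ArrayList<>();   // visible to other sessions

    void persist(String row) {
        managed.add(row);
        pending.add(row);
    }

    // Write pending changes to the database without ending the transaction.
    void flush() {
        uncommitted.addAll(pending);
        pending.clear();
    }

    // Detach everything; changes that were never flushed are lost.
    void clear() {
        managed.clear();
        pending.clear();
    }

    // Flush whatever is left, then make all written rows visible.
    void commit() {
        flush();
        committed.addAll(uncommitted);
        uncommitted.clear();
        managed.clear();
    }

    int managedCount() { return managed.size(); }
    int visibleRows() { return committed.size(); }
}
```

In this model, calling clear() without a preceding flush() silently discards the pending rows, which mirrors the warning above.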
With respect to the OutOfMemoryError, it's hard to tell what's causing this under which circumstances, as you did not provide much detail either. However, I would:
1. read the aforementioned sections of the JPA specification,
2. check how your environment is configured, and
3. reevaluate how your application is written/implemented, as it is potentially making false assumptions about the transaction handling of the container it is running in.
Related to 2., you might check your persistence.xml whether it configures
<property name="eclipselink.persistence-context.flush-mode" value="COMMIT" />
and change it to AUTO to see if there is any difference.
Hope it helps.
Footnotes
*1: This is only a guess, as you did not provide much detail on your setup/environment.
On transaction commit, JPA flushes automatically. You should see the objects in the DB right after the first transaction ends, not only after the whole process ends. Check whether you are really running several transactions or just one.

How are state changes to JPA entities actually tracked

When I call javax.persistence.EntityManager.find() for an @Entity, the EntityManager checks the transaction for an existing instance of its associated persistence context. If one exists, then
if the entity being searched for is present in the context, then that is what is returned to the caller of EntityManager.find
if the entity being searched for is not present in the context, then the EntityManager gets it from the datasource and puts it there, and then that is what is returned to the caller of EntityManager.find
And if the transaction does not contain an existing instance of the manager's associated persistence context, then the manager creates one, associates it with the transaction, finds the entity in the datasource, adds it to that context for management, and then returns that entity to the caller of find.
--> the result is the same in that the caller now has a managed entity that exists in the persistence context. (Important: the persistence context is attached to the transaction, so if the transaction has ended by the time the client gets hold of the 'managed' entity, then the persistence context is gone and the entity is 'detached' and no longer managed.)
Now, when I make state changes using setters or other internal state-changing methods on my @Entity instance, those changes are tracked because my entity is part of a persistence context that will get flushed to the datasource when the transaction finally commits. My question is HOW are the state changes tracked, and by what? If I were making the changes via some intermediary object, then that intermediary object could update the persistence context accordingly, but I'm not (or am I?). I'm making the changes directly on my @Entity-annotated object. So how are these changes being tracked?
Perhaps there are events that are being listened for? Listened for by what? I'm reading books and articles on the subject, but I can't pin this one point down.
State changes are tracked by the JPA vendor's internal implementation during the entity's lifecycle.
The dirty checking strategy is vendor specific. It can be done by comparing fields or by bytecode enhancement, as described in JPA dirty checking.
Although it is vendor specific, the persistence context becomes aware of the state changes during state synchronization, on flush or on commit.
It's important to remember all the points at which a flush can happen:
Manually
Before querying
Before commit
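As an illustration of the field-comparison strategy mentioned above, the following plain-Java sketch keeps a snapshot of each managed object and compares it with the current state at flush time. SnapshotTracker and Customer are hypothetical names invented for this example; real providers use their own internal mechanisms, and some rely on bytecode enhancement instead.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model only: shows snapshot comparison, not any vendor's implementation.
class DirtyCheckDemo {
    public static void main(String[] args) {
        SnapshotTracker tracker = new SnapshotTracker();
        Customer c = new Customer("Alice");
        tracker.manage(c);
        c.setName("Alicia"); // plain setter, nobody is notified
        System.out.println(tracker.flush()); // one UPDATE statement for the changed name
    }
}

class Customer {
    private String name;
    Customer(String name) { this.name = name; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

class SnapshotTracker {
    // Managed entity -> copy of its name as of the last synchronization.
    private final Map<Customer, String> snapshots = new IdentityHashMap<>();

    // Start tracking the entity and remember its current state.
    void manage(Customer c) {
        snapshots.put(c, c.getName());
    }

    // Compare current state with the snapshot; emit a statement for each
    // changed entity and refresh its snapshot.
    List<String> flush() {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<Customer, String> e : snapshots.entrySet()) {
            String current = e.getKey().getName();
            if (!Objects.equals(current, e.getValue())) {
                statements.add("UPDATE customer SET name = '" + current + "'");
                e.setValue(current);
            }
        }
        return statements;
    }
}
```

The entity itself never reports the change. The tracker discovers it only when it synchronizes, which is why the flush points listed above matter.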

Application Managed Transaction (JTA) and Container Managed Persistence (JPA)

Currently I'm working on a software that consists of two parts.
One part is a kind of company wide framework for data processing (kind of self written process engine). It uses JTA UserTransactions and calls a sub processor that is written by our project.
Our "sub processor" is an application on its own. It uses container managed persistence via JPA. (WebSphere with OpenJPA)
A typical workflow is:
process engine loads process data -> start user transaction -> calls sub processor -> writes process data -> ends user transaction
We now experience the following wrong behavior:
The user transaction is committed in the process engine, all the meta data of the process is stored into the db BUT the data the entity manager holds inside our sub processor application is not written to the db.
Is there some manual communication necessary to commit the content of our entity manager?
The problem we observed had nothing to do with JTA and transactions.
We tried to clear a blob column, but this was not possible. I will create another question for it.

Entity Framework Context Management

I have some questions regarding the entity framework context instance management.
My code base is composed of many independent components/agents, all running in one process, that can only do work by reading messages from their corresponding queues at their convenience (whenever a component is ready, it picks up the next message from its queue, so there is no concurrency issue at the component level).
Each of the components needs to interact with the database independently of the other components. I am wondering what the better way is to set up the context instance(s) for each component. Here are some of the options:
1> Have one instance of the context used by all components. --> I think this is the worst, as it creates many concurrency issues?
2> Give every component an independent instance of the context. --> This looks fine but
is it OK to have many independent context instances in one process while all components are running concurrently in this process?
should I create a new instance of the context for every new message that the component processes, or keep one context instance for the life of the component? I think the latter makes more sense, but I am more used to using the context in a using {} block, and I am not sure whether keeping one context alive for the life of each component has any complications in the way I am using it.
can I rely on optimistic concurrency so that two different independent components won't put the same record in the database, given that all contexts are in one process?
BTW, I am using entity framework 4.1.
Definitely use one context per component / agent. If a component is multithreaded, use one context per thread. If each message is processed as a separate "logical transaction", then use one context per message.
Why:
Context internally uses two very important design patterns - Identity map and Unit of work. This answer describes the behaviour enforced by these patterns.
Context and anything else in EF is not thread safe.
Optimistic concurrency doesn't mean that different contexts will not put the same record in the database. Optimistic concurrency means that update statements compare the current state in the database with the last state known to the context (there is a delay between loading a record and saving new values, and another context can change the record in the meantime). If the record has changed, you will get an exception and you must handle it somehow.
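The compare-on-update behaviour described in the last point can be sketched independently of any ORM. The example below is written in plain Java rather than against the Entity Framework API, and VersionedStore, Row, and the version counter are assumptions made for illustration. An update succeeds only if the version the caller last saw is still the current one.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: illustrates an optimistic version check, not EF's implementation.
class OptimisticDemo {
    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert(1L, "initial");

        // Two independent "contexts" load the same record.
        VersionedStore.Row seenByA = store.read(1L);
        VersionedStore.Row seenByB = store.read(1L);

        store.update(1L, "from A", seenByA.version); // succeeds and bumps the version
        try {
            store.update(1L, "from B", seenByB.version); // stale version, rejected
        } catch (IllegalStateException e) {
            System.out.println("conflict: " + e.getMessage());
        }
    }
}

class VersionedStore {
    static final class Row {
        final String value;
        final int version;
        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    synchronized void insert(long id, String value) {
        rows.put(id, new Row(value, 1));
    }

    synchronized Row read(long id) {
        return rows.get(id);
    }

    // Apply the update only if nobody changed the row since the caller read it.
    synchronized void update(long id, String value, int expectedVersion) {
        Row current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            throw new IllegalStateException("row " + id + " was changed by another context");
        }
        rows.put(id, new Row(value, expectedVersion + 1));
    }
}
```

How the losing side handles the conflict (reload and retry, merge, or report an error) is an application decision.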