Lifetime of EntityManager instance - JPA

I have a requirement in which we create a single Stateless bean which obtains a container-managed EntityManager instance (via @PersistenceContext) in an EJB3 environment.
Inside this single Stateless bean we create threads which are executed at specific time intervals. These threads would be running for months.
My question is whether the single EntityManager instance obtained from the container (with a container-managed persistence context) can be used for the entire lifetime (> 1 year).

To the lifetime of the EntityManager: I think it is more a question of the DB connection lifetime. In that case, when the JPA provider detects a connection time-out, and you have configured your JDBC connection string with autoReconnect=true, you would expect another connection to be built. You should also look into setting a large timeout.
On the other hand, you are probably overlooking that in EJB you are not allowed to spawn your own threads. In your case you would run into problems with managed entities (changed in different threads) and with transactions. I would use the Timer Service instead.
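As a sketch of the Timer Service approach recommended above, assuming an EJB 3.1 container (the bean, entity, and query are hypothetical):

```java
import javax.ejb.Schedule;
import javax.ejb.Singleton;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

// The container injects a fresh persistence context and manages the
// transaction around each timer callback, so no EntityManager has to
// survive for months between runs.
@Singleton
public class CleanupTimer {

    @PersistenceContext
    private EntityManager em;

    // Fires every night at 02:00; the container opens and closes the
    // transaction (and the underlying connection) per invocation.
    @Schedule(hour = "2", minute = "0", persistent = false)
    public void runPeriodicWork() {
        em.createQuery("DELETE FROM AuditRecord a WHERE a.expired = true")
          .executeUpdate();
    }
}
```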

An EntityManager is meant to represent a transactional working set. To me it doesn't make sense to use a single transactional space for the entire life of a long-lived thread, but how feasible this is depends on your design and your provider's implementation. If you are going to use a single EM, make sure it is not shared between threads, and monitor its resource usage: JPA requires an EM to cache every entity read through it as a managed instance, so you may want to call em.clear() at logical points to detach managed instances and let them be garbage collected.
I don't think injection will work here, because the container ties the injected EntityManager to the life of the bean, not the life of your thread. You will want to obtain the EntityManagerFactory and manage your own EntityManager lifecycles within your threads.
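A minimal sketch of that pattern, assuming a standalone JPA setup (the task class is invented; the factory would come from Persistence.createEntityManagerFactory and be shared application-wide):

```java
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;

public class WorkerTask implements Runnable {

    // One factory for the whole application; creating it is expensive.
    private final EntityManagerFactory emf;

    public WorkerTask(EntityManagerFactory emf) {
        this.emf = emf;
    }

    @Override
    public void run() {
        // A short-lived EntityManager per unit of work, owned by this thread.
        EntityManager em = emf.createEntityManager();
        try {
            em.getTransaction().begin();
            // ... do the periodic work through em ...
            em.getTransaction().commit();
        } catch (RuntimeException e) {
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
            throw e;
        } finally {
            em.close(); // releases managed entities and the connection
        }
    }
}
```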

Related

If a @Transactional annotated method is called from another @Transactional annotated method, does this result in two connections being consumed?

Lately, our Hikari connection pool has been getting exhausted more than a couple of times. The exception thrown is as follows:
"org.springframework.transaction.CannotCreateTransactionException:
Could not open JDBC Connection for transaction;
nested exception is java.sql.SQLTransientConnectionException:
HikariPool-1 - Connection is not available, request timed out after 30000ms."
What I have observed is that some unwary developers have added the @Transactional annotation to a lot of simple get calls on the DB (Postgres).
We use JdbcTemplate to make DB calls, with a default connection pool size of 10.
The public endpoints at the controllers are already annotated as @Transactional. Can adding the @Transactional annotation to the DAO beans result in nested connections when the service layer (a separate bean) calling the DAO layer is already @Transactional?
We also have a few scheduled CRON jobs that are not exposed via a public API or a controller. Do I still need to add @Transactional to the parent-level methods of such cron/internal jobs to make DB calls optimally? We are not expecting to roll back changes for these cron jobs, and we already use JdbcTemplate, which uses the Hikari connection pool. Is @Transactional needed at all in such cases to optimise performance?
All configurations are the default Spring Boot configurations, so the default propagation is REQUIRED unless explicitly set.
@Transactional should mainly be used in the service layer when you are about to use database connections; it should be applied to the minimum set of methods that represent a business flow.
I think you are overusing it, which may create unnecessary connections and thus time-outs on an overloaded pool.
Note that you are using only one database connection pool (HikariCP), and not also PgBouncer, for example.
Also see further performance/configuration tweaks for HikariCP on its wiki page.
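To illustrate the propagation point: with the default REQUIRED propagation, an inner @Transactional method joins the caller's transaction rather than borrowing a second connection from the pool. A sketch, with invented class names:

```java
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class OrderService {

    private final OrderDao orderDao;

    public OrderService(OrderDao orderDao) {
        this.orderDao = orderDao;
    }

    @Transactional // borrows one Hikari connection for the whole flow
    public void placeOrder(long id) {
        orderDao.load(id); // joins this transaction; no extra connection
    }
}

@Repository
class OrderDao {

    // REQUIRED (the default): reuse the caller's transaction if one exists,
    // open a new one (and take a connection) only when called outside any.
    @Transactional
    public Object load(long id) {
        return null; // placeholder for a JdbcTemplate query
    }
}
```

So the annotation on the DAO does not by itself consume a second connection; the pool exhaustion is more likely caused by many long-held transactions opened where none was needed.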

Always open EntityManager

I have found this question, Keeping JPA EntityManager open?, but I still have some concerns.
Is it a good idea to keep an EntityManager open for the whole application life? Does it consume resources such as a database connection? Does it hold on to entities, or will it release them if it uses weak references? I use EclipseLink 2.x.
Thanks
Zlaja
EntityManager was designed to be rather short-lived. It is technically possible to keep it open for a long time, but sooner or later you will face the following issues:
As you wrote, the EntityManager keeps loaded entities, and indeed it keeps them using weak references (at least with Hibernate; I'm not sure whether this is required by the JPA spec). So they should be freed before the JVM runs out of memory. Unfortunately, I have seen that holding a large number of entities hurts EM performance significantly as the number grows.
An open EM may consume a database connection, e.g. when there are lazily-loadable objects in memory.
EM by definition is not thread-safe, so in web applications (for instance) reusing/sharing one instance is totally unacceptable.
Probably the biggest problem is that when any error occurs in the EM (e.g. on transaction commit, due to violation of a DB constraint), JPA requires that the EM be closed and discarded as soon as possible. This puts all your in-memory entities into the detached state, meaning that touching any lazily-loaded collection or reference will fail. A workaround is to reload all entities, but that is difficult in bigger applications where they are scattered across the application layers. A cleaner solution is to work with detached entities and use EntityManager.merge(), but this usually requires changing the programming model and is, in particular, somewhat contradictory to the "always-open" entity manager approach. You should pick one approach and stick to it.
So generally it's better to keep the EntityManager short-lived; it really simplifies a lot of things.
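The merge-based workaround mentioned above can be sketched like this (the Customer entity and its setter are invented for illustration):

```java
import javax.persistence.EntityManager;

public class DetachedUpdate {

    // Work with detached entities: mutate them outside any persistence
    // context, then merge the changes into a fresh, short-lived EM.
    public static void applyChanges(EntityManager em, Customer detached) {
        detached.setEmail("new@example.com"); // edited while detached

        em.getTransaction().begin();
        // merge() copies the detached state onto a managed copy; keep using
        // the returned instance, not the original detached one.
        Customer managed = em.merge(detached);
        em.getTransaction().commit();
    }
}
```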

Reusing entity manager by em.clear() or creating a new entity manager?

In my case of application-managed transactions, I have to choose between:
Using one single EntityManager and calling clear() before each new transaction, sharing the EntityManager via a ThreadLocal.
Creating a new EntityManager for each transaction.
I don't have much experience with JPA. My question is: which one is better in terms of performance?
I would recommend creating a new EntityManager per transaction. This is the way JPA was designed. The EntityManager should not be an expensive object to create. (the EntityManagerFactory is very expensive though, so ensure you only have one of those).
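The recommended pattern can be sketched with a small helper (the helper name is invented; only the EntityManagerFactory is kept for the application's lifetime):

```java
import java.util.function.Consumer;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;

public final class TxRunner {

    // Cheap per call: the factory, not the manager, holds the expensive state.
    public static void inTransaction(EntityManagerFactory emf,
                                     Consumer<EntityManager> work) {
        EntityManager em = emf.createEntityManager();
        try {
            em.getTransaction().begin();
            work.accept(em);
            em.getTransaction().commit();
        } catch (RuntimeException e) {
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
            throw e;
        } finally {
            em.close(); // drops the persistence context with its cached entities
        }
    }
}
```

This also sidesteps the ThreadLocal bookkeeping: each transaction gets its own EM, so thread-safety and clear() discipline are non-issues.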
The link provided by okwap is very helpful. To make sure it does not slip through, and to follow the board rules, I put a copy here:
- an EntityManager contains a persistence context, that will track
everything read through it, so to avoid bloated memory, you should
acquire a new one, or clear it at some point
- if you read the same object through two different EntityManagers you
will get different objects back, so you will lose object identity, which
is something to consider
Based on that, I will add that reading through two different EntityManagers may even give objects with different content, if a database transaction was committed by someone else in the meantime. But when reading repeatedly through the same EntityManager, the second read will just get the object from the EntityManager's cache, so the newer state will simply not be visible.
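The identity point can be illustrated with a toy stand-in for a persistence context's first-level cache (this is a simplified model, not any provider's real implementation): within one context, repeated reads return the identical instance; across two contexts, you get distinct instances for the same row.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a persistence context's first-level cache.
class ToyContext {
    private final Map<Long, Object> cache = new HashMap<>();

    // Repeated reads of the same id return the identical instance
    // (and therefore possibly stale state).
    Object find(long id) {
        return cache.computeIfAbsent(id, k -> new Object());
    }
}

public class IdentityDemo {
    public static void main(String[] args) {
        ToyContext em1 = new ToyContext();
        ToyContext em2 = new ToyContext();

        // Same context: same instance.
        System.out.println(em1.find(1L) == em1.find(1L)); // true

        // Different contexts: different instances for the same id.
        System.out.println(em1.find(1L) == em2.find(1L)); // false
    }
}
```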

Reworking EF nested connections to avoid MSDTC on Azure

I've deployed to Azure and Azure SQL, which doesn't support MSDTC and I'm having trouble understanding how to rework my code to prevent what I assume is nested connections. I'm fairly new to EF and my knowledge of TransactionScope is not wonderful, so I'm not sure that I have the right pattern.
I am trying to use repos, which call on a shared instance of the ObjectContext (I tried to dispose on EndRequest but had issues, so this is another problem for me).
I have a transaction which calls SaveChanges on the ObjectContext instance several times, but at some point it becomes disposed. What is governing this and can you recommend what I can do to get it working correctly?
If you want to avoid issues with distributed transactions, you must handle the connection manually, because you need only one opened connection per TransactionScope = one context instance with one connection used for all queries and database updates. The code should look like:
using (var context = new YourObjectContext()) {
    context.Connection.Open();
    // run all queries and SaveChanges calls on this one open connection
    ...
}
I am trying to use repos, which call on a shared instance of the
ObjectContext (I tried to dispose on EndRequest but had issues, so
this is another problem for me).
If you share your context instance among multiple requests, or even worse if you use just a single context instance to handle all your requests, you should stop now and completely redesign your application. Otherwise it will not work correctly.

EF usage from thread spawned from Role.OnStart()

I'm using EF code-first to manage my DB connection, with an explicit connection string declared in web.config. I would like to schedule some DB cleaning process (like deleting test transactions every day), so I'm spawning a thread from Role.OnStart() with proper concurrency management among the instances.
But I'm getting database exceptions, like the DB not matching my model, whereas I'm sure it does (the same code used from "inside" the app works fine). So my guess is that web.config is not used from the thread, so EF probably uses the default connection string.
What would be the best way to use my connection string from there?
Thanks
The OnStart method doesn't run in the same process as your web application, meaning it doesn't use web.config. I suggest you store the connection string in the service configuration and read it from there when initializing your context.
Another advantage is that you can change the setting without re-deploying the application.