Spring Data JPA and repeated findById calls with the same id in a non-transactional method

Working with Spring Data JPA, I have this method in a @Service class:

@Transactional(propagation = Propagation.NEVER)
public void getClientByIdThreeTimes() {
    clientRepository.findById(1L);
    clientRepository.findById(1L);
    clientRepository.findById(1L);
}
Will it hit the database three times? It should, because there is no transactional environment (propagation = Propagation.NEVER) and each query is a transaction in itself, so each time the query is executed an EntityManager is created with its own persistence context, right?

I am seeing some weird behaviour: when I make an HTTP request and this method is executed, only the first query is sent to the database and the next two calls are served from cache, but if I call this method from inside my application (like a Spring Batch task), three SQL statements are sent to the database. I don't understand it; shouldn't the behaviour be the same?
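A likely explanation, assuming this is a Spring Boot web application with default settings: Spring Boot enables Open Session in View, which binds one EntityManager to the entire HTTP request, so all three findById calls share a single persistence context and only the first one hits the database. A batch task runs outside a web request, so there is no shared context and each call gets its own EntityManager. You can confirm this by disabling the feature:

```properties
# application.properties: Open Session in View is on by default in Spring Boot web apps
spring.jpa.open-in-view=false
```

With it disabled, the HTTP request should also send three queries to the database.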
Thanks

Related

Spring Data JPA - how to guarantee flush order?

I have a fairly complex save process using Spring Data JPA repositories, all in one transaction:

mainRepo.save();
relatedRepo1.save();
relatedRepoOfRelatedRepo1.save();
...

And at the end I call (on mainRepo):

@Modifying
@Query("update mainEntity set finished = true where id = :id")
void setFinishedTrue(@Param("id") UUID id);

I want to guarantee that by the time setFinishedTrue(id) is called, all the related data is actually in the database, because it will start an integration process that requires all the needed data to be available.
With standard settings, JPA will flush pending changes before executing a query, so you are fine.
If you want to be really, really sure, you can add an explicit flush operation.
You can do this either by using the JpaRepository.flush() operation or by injecting the EntityManager and calling flush() on it directly.

Ninject and WCF Web Services make Entity Framework slow

I have the following setup: WCF web services hosted in IIS, with Entity Framework 6 used to retrieve data from the DB. The web services are initialized in Global.asax.cs, which inherits from NinjectHttpApplication (so we use Ninject for dependency injection). In this NinjectHttpApplication, in the CreateKernel method, we bind the EF DbContext as follows:

protected override IKernel CreateKernel()
{
    var kernel = new StandardKernel();
    kernel.Bind<DbContext>().To<MyCustomContext>().InTransientScope();
    return kernel;
}
Then, every time a service is called, the context is obtained as follows in its constructor:
_context = kernel.Get<DbContext>();
Then, the service retrieves data from the DB as follows:
data = _context.Set<TEntity>().Where(<whatever filter>);
Having said that, my problem is the following: I have a service which is called many times (with a complex, long query with multiple joins), and every time it is called, EF takes ages to produce the SQL to send to the DB from the LINQ to Entities query I've coded. The execution of the query in the DB itself is nothing (600 milliseconds), but EF takes ages to produce the SQL every single time this service is called. I suspect this is because kernel.Bind<DbContext>().To<MyContext>().InTransientScope() forces EF to create a new instance of the DbContext every time there is a call.
I've made a few tests with unit tests and the behaviour is totally different: if you instantiate the service multiple times from the same unit test method and call it, EF takes a long time to produce the query only the first time; it then takes no time to produce the SQL for the subsequent calls (the same query, but with different parameters to filter the data to retrieve). From the unit test, CreateKernel() is of course only called once, in the Initialize() method (just like in the web service's Global.asax.cs), so I don't know what is causing this huge delay. I suspect EF is able to keep/cache the query pre-compiled with the unit-test approach but not in the real web application. Any clue why?
Please note that the LINQ to Entities query is parameterized (strings and a date are the params).
Any help is much appreciated.
I see that you bind your DbContext with InTransientScope(), which means that every time you get a DbContext from Ninject, it creates a new DbContext for you.
You could consider using InThreadScope() instead of InTransientScope(), which makes Ninject return the same instance within the same thread.
There is also singleton scope, which always returns the same instance, but that would let the DbContext grow too big.

Bulk-indexing JPA entities modified during a Spring transaction into an Elasticsearch index

I have a JPA entity class that is also an Elasticsearch document. The environment is a Spring Boot application using Spring Data JPA and Spring Data Elasticsearch.

@Entity
@Document(indexName = ...)
@EntityListeners(MyJpaEntityListener.class)
public class MyEntity {
    // ID, constructor and stuff following here
}
When an instance of this entity gets created, updated or deleted, it gets reindexed into Elasticsearch. This is currently achieved with a JPA EntityListener which reacts to PostPersist, PostUpdate and PostRemove events.
public class MyJpaEntityListener {

    @PostPersist
    @PostUpdate
    public void postPersistOrUpdate(MyEntity entity) {
        // Elasticsearch indexing code goes here
    }

    @PostRemove
    public void postRemove(MyEntity entity) {
        // Elasticsearch indexing code goes here
    }
}
That's all working fine at the moment when a single entity or a few entities get modified during a single transaction: each modification triggers a separate index operation. But if a lot of entities get modified inside one transaction, it gets slow.
I would like to bulk-index all entities that were modified at the end of a transaction (or after it commits). I took a look at TransactionalEventListeners, AOP and the TransactionSynchronizationManager, but wasn't able to come up with a good setup so far.
How can I collect all modified entities per transaction in an elegant way, without doing it by hand in every service method myself?
And how can I trigger a bulk index at the end of a transaction with the entities collected in that transaction?
Thanks for your time and help!
A different, and in my opinion elegant, approach (since it doesn't mix your services and entities with Elasticsearch-related code) is to use Spring aspects with @AfterReturning on the service layer's transactional methods.
The pointcut expression can be adjusted to catch all the service methods you want.
@Order(1) guarantees that this advice runs after the transaction has committed.
The code below is just a sample; you have to adapt it to work with your project.
@Aspect
@Component
@Order(1)
public class StoreDataToElasticAspect {

    @Autowired
    private SampleElasticsearchRepository elasticsearchRepository;

    @AfterReturning(pointcut = "execution(* com.example.DatabaseService.bulkInsert(..))")
    public void synonymsInserted(JoinPoint joinPoint) {
        Object[] args = joinPoint.getArgs();
        // create Elasticsearch documents from the method params;
        // you can also inject database services here if more
        // information is needed for the documents
        List<String> ids = (List<String>) args[0];
        // create the batch from the ids
        elasticsearchRepository.save(batch);
    }
}
And here is an example with a logging aspect.
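The TransactionSynchronizationManager route mentioned in the question boils down to a thread-local buffer that each entity listener writes to and that an after-commit hook drains exactly once. A minimal, framework-free sketch of that pattern (DirtyEntityBuffer and all other names here are made up for illustration):

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Consumer;

// Collects the ids of entities touched during one "transaction" and
// hands them to a bulk-index callback exactly once at commit time.
class DirtyEntityBuffer {

    private final ThreadLocal<Set<Long>> dirtyIds =
            ThreadLocal.withInitial(LinkedHashSet::new);

    // Would be called from the JPA entity listener on every persist/update/remove.
    void register(long id) {
        dirtyIds.get().add(id);
    }

    // Would be called once from an after-commit hook: drains the buffer
    // and triggers a single bulk-index call for all collected ids.
    void commit(Consumer<Set<Long>> bulkIndexer) {
        Set<Long> batch = dirtyIds.get();
        if (!batch.isEmpty()) {
            bulkIndexer.accept(new LinkedHashSet<>(batch));
        }
        batch.clear();
    }
}
```

In a real setup, register() would be invoked from the JPA entity listener and commit() from a TransactionSynchronization's afterCommit callback, turning the per-entity index calls into one bulk request.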

Trigger JPA callbacks only once per transaction or batch

I am deleting rows in a batch as follows (in an EJB):

int i = 0;
List<Category> list = ...; // sent by a client, which is JSF in this case
for (Category category : list) {
    if (++i % 49 == 0) {
        i = 0;
        entityManager.flush();
    }
    entityManager.remove(entityManager.contains(category) ? category : entityManager.merge(category));
}

where Category is a JPA entity.
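The counter logic that drives the periodic flush can be checked on its own; a framework-free sketch in which the flushes counter stands in for entityManager.flush():

```java
// Simulates the counter from the delete loop above: with a batch size
// of 49, "flush" fires once per 49 processed items.
class FlushCounter {

    static int countFlushes(int items, int batchSize) {
        int i = 0;
        int flushes = 0;
        for (int n = 0; n < items; n++) {
            if (++i % batchSize == 0) {
                i = 0;     // same reset as in the original loop
                flushes++; // stands in for entityManager.flush()
            }
        }
        return flushes;
    }
}
```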
There is a callback that listens to this delete event:

@ApplicationScoped
public class CategoryListener {

    @PostPersist
    @PostUpdate
    @PostRemove
    public void onChange(Category category) {
        // ...
    }
}
This callback method is invoked as many times as the number of rows deleted; for example, it will be called 10 times if 10 rows are deleted.
Is there a way to invoke the callback only once at the end of a transaction, i.e. as soon as the EJB method in which this code is executed returns, or at least once per batch, i.e. whenever entityManager.flush() occurs? The former is preferred in this case.
Additional Information :
I am doing some real-time updates using WebSockets, where clients are to be notified when such CRUD operations are performed on a few database tables. It is therefore pointless to send a message to all the associated clients for every single row deleted in a batch. They should instead be notified only once, as soon as a transaction (or at least a batch) ends.
The following JPA 2.1 criteria batch-delete approach does not work, because it does not operate directly on entities. No JPA callbacks will be triggered by this approach, nor by its JPQL equivalent.

CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
CriteriaDelete<Category> criteriaDelete = criteriaBuilder.createCriteriaDelete(Category.class);
Root<Category> root = criteriaDelete.from(entityManager.getMetamodel().entity(Category.class));
criteriaDelete.where(root.in(list));
entityManager.createQuery(criteriaDelete).executeUpdate();
I am using EclipseLink 2.5.2 with JPA 2.1.
Unfortunately, JPA entity callbacks are required to be invoked for each entity instance they listen on, so you will need to add your own functionality to ensure the listener logic runs only once per batch or transaction. The other alternative is provider-specific behaviour, in this case EclipseLink's session event listeners (https://wiki.eclipse.org/Introduction_to_EclipseLink_Sessions_(ELUG)#Session_Event_Manager_Events): listen for the PostCalculateUnitOfWorkChangeSet event, or some other event that is triggered when you need it.
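Short of provider events, rolling your own usually means separating collection from notification: record each deleted row where the @PostRemove callback fired, then emit a single message after the loop (or at transaction end). A minimal, framework-free sketch of that idea (DeletionNotifier and the message format are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Records deleted ids during the batch and produces one summary
// notification instead of one message per row.
class DeletionNotifier {

    private final List<Long> deletedIds = new ArrayList<>();

    // Would be invoked where the @PostRemove callback fired before.
    void onRemoved(long id) {
        deletedIds.add(id);
    }

    // Called once after the EJB method's loop: builds the single
    // payload that would be pushed over the WebSocket, then resets.
    String drainToMessage() {
        String message = "deleted:" + deletedIds;
        deletedIds.clear();
        return message;
    }
}
```

The EJB method (or an interceptor around it) would call drainToMessage() once after the delete loop, so the connected clients receive a single WebSocket message per batch.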

When does EntityManager commit?

I have the following service:

@Stateless
@LocalBean
public class RandomService {

    @EJB RandomString stringTokenizer;

    @PersistenceContext
    EntityManager em;

    public String generate(Actions action) {
        Token token = new Token();
        token.setAction(action);
        token.setExpiry(new Date());
        token.setToken(stringTokenizer.randomize());
        em.persist(token);
        // em.flush();
        return String.format("%010d", token.getId()) + token.getToken();
    }
}
If I do not put em.flush(), the line token.getId() returns null (using a DB-generated sequence), though I know that if I return the Token instead of a String to the calling service, the id is set. So it seems that the EM flushes when the service returns a Token object but not when I return a String. By putting flush() there I get what I need; is that right, though?
Do not confuse flushing with committing. During flush() the JPA provider physically sends the generated SQL to the database and, in your case, reads the generated ID and populates it in the bean. Note that persist() returns void and makes the very instance you pass in managed, so the ID is set on that same object once the flush happens; it is merge() that returns a managed copy which you should use instead of the original:
token = em.merge(token);
Committing, on the other hand, performs a database commit. Obviously it will trigger a flush() first, but that won't help you here. Since you are asking: every EJB method is transactional by default. This means the transaction is committed when you leave the first EJB on the stack; if you call one EJB from another, the callee joins the caller's transaction by default (see: transaction propagation behaviour).
Also note that the rules for when a provider flushes are a bit involved, since every provider tries to flush as late as possible and in batches.
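Incidentally, once the flush has populated the ID, the return statement's "%010d" format pads it to a fixed width of ten digits, so the generated-key part of the token always has the same length. A quick standalone check of that formatting:

```java
// "%010d" left-pads the numeric id with zeros to a fixed width of 10,
// which is what gives the token its constant-length prefix.
class TokenFormat {

    static String prefix(long id) {
        return String.format("%010d", id);
    }
}
```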