How to set transaction isolation per request using Hibernate + Dropwizard + Guice for DI and Spring Data JPA? - postgresql

How, without using Spring as the DI framework (which offers the @Transactional annotation that allows setting a custom isolation level per transaction), could I set a custom isolation level for specific/sensitive transactions in a web service built on Dropwizard + Hibernate + Guice (DI) + Spring Data JPA + PostgreSQL (all recent versions)?
Here's a simple working example of a web service with the exact same stack (pull requests are more than welcome):
https://github.com/jeep87c/dropwizard-guice-springDataJPA-hibernate
We use Spring Data JPA as an abstraction layer above Hibernate simply to save us from writing our own implementation for each DAO. In an ideal world, it would be the web service's only dependency on Spring, but as you'll see in the code sample, we are doing a kind of hack: we resolve the DAO implementations with the Spring DI framework (using the beanFactory) so we can then register them in Guice. (I'm more than open to a better solution if you have one, but that is not the subject of this question.)
In this code sample, AbsenceResource.create deliberately performs a duplicate persist of the received payload. And there's an acceptance test, AbsenceResourceAcceptanceTest.rollbackTest, exercising this compromised API route and expecting the rollback to happen.
The business requirement here is that, to create a new absence, the service must first verify that no other absence collides with it for the same employee in the same company. The sample repo I provide is actually simpler than my real-life scenario, where it must check for collisions with both absence and vacation entities for an employee in a multi-tenant (per-company) environment with a single table per entity (multi-tenancy via column filtering on the company id).
To prevent any concurrency issue resulting in the race condition that would let two colliding absences be wrongly inserted, we would like to set the isolation level to Serializable for this specific kind of transaction, which the PostgreSQL documentation indicates is the only isolation level that avoids this kind of issue.
We looked into the dropwizard-hibernate library, but unfortunately it doesn't provide any way to set the isolation level per transaction.
So before I spend hours replacing Guice with Spring as our DI framework (as that looks like the only option for now), I'm seeking other, simpler solutions that would achieve the same result.
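For context, the raw JDBC hook itself is simple; the open question is wiring it per request. With a plain Hibernate Session, one can set the isolation level on the underlying connection before beginning the transaction, roughly like this (a sketch only, assuming direct access to a sessionFactory, which the Dropwizard/Guice/Spring Data JPA stack above does not readily expose per request):

// Sketch only: force Serializable isolation on the JDBC connection behind a
// Hibernate Session before starting the transaction.
Session session = sessionFactory.openSession();
session.doWork(connection ->
        connection.setTransactionIsolation(java.sql.Connection.TRANSACTION_SERIALIZABLE));
Transaction tx = session.beginTransaction();
try {
    // ... check for colliding absences, then persist the new one ...
    tx.commit();
} catch (RuntimeException e) {
    tx.rollback();
    throw e;
}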

Related

transaction-type="RESOURCE_LOCAL" with jta-data-source

I came across an old project using OpenJPA with DB2, running on WebSphere Liberty 18. In the persistence.xml file there is a persistence unit with the following declaration:
<persistence-unit name="my-pu" transaction-type="RESOURCE_LOCAL">
    <provider>org.apache.openjpa.persistence.PersistenceProviderImpl</provider>
    <jta-data-source>jdbc/my-data-source</jta-data-source>
</persistence-unit>
Given that we are using RESOURCE_LOCAL transactions and there is code that manually manages transactions scattered throughout the whole application, shouldn't the data source be declared as "non-jta-data-source"? Interestingly, the application seems to work fine despite that. Any ideas why?
<non-jta-data-source> specifies a data source that refuses to enlist in JTA transactions. This means that if you do userTransaction.begin (or take advantage of any API by which the container starts a transaction for you) and then perform some operations on that data source (which is marked with transactional="false" in Liberty), those operations will not be part of the encompassing JTA transaction and can be committed or rolled back independently. It's definitely an advanced pattern, and if you don't know what you are doing, or temporarily forget that the data source doesn't enlist, you can end up writing code that corrupts your data.
At this point, you may be wondering why JPA even has such an option. I expect it isn't intended for the end user's usage of the JPA programming model at all, but is really for the JPA persistence provider (Hibernate/EclipseLink/OpenJPA) implementation. For example, consider the scenario where a JTA transaction is active on the thread and you perform an operation via JPA for which the persistence provider needs to generate a unique key, and the provider needs to run some database command to reserve the next block of unique keys. The provider can't just do that within your transaction, because you might end up rolling it back, and then the same block of unique keys could be given out twice and errors would occur. The persistence provider really needs to suspend your transaction, run its own transaction, and then resume yours.
In my opinion suspend/resume would have been the natural solution here, but the JTA spec doesn't provide a standard way to obtain the TransactionManager, so my guess is that the JPA spec invented its own solution for situations like this by requiring, as an alternative, a data source that bypasses transaction enlistment. A JPA provider can run its own transactional operations on the non-jta-data-source while your JTA transaction continues on, unimpacted by it.
You'll also notice, with the example I chose, that it doesn't apply to a number of paths through JPA. If your JPA entity is configured to have the database generate the unique keys instead, then the persistence provider doesn't need to perform its own database operations on a non-jta-data-source. If you aren't using JTA transactions, then the persistence provider doesn't need to worry about enlisting in your transaction because it can just use a different connection, so it doesn't need a non-jta-data-source there either.
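To make the enlistment behavior concrete, here is a minimal sketch (keyDataSource, order and KEY_BLOCK are illustrative names, not from the project above):

// Work done on a non-jta-data-source commits independently of the
// surrounding JTA transaction.
userTransaction.begin();
em.persist(order);                                   // enlisted in the JTA transaction
try (Connection con = keyDataSource.getConnection(); // non-enlisting data source
     Statement stmt = con.createStatement()) {
    stmt.executeUpdate("UPDATE KEY_BLOCK SET NEXT_KEY = NEXT_KEY + 50");
}                                                    // commits outside the JTA transaction
userTransaction.rollback();                          // the persist is rolled back...
// ...but the KEY_BLOCK update above stays committed.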

DDD, CQRS, onion architecture, EF Core for an enterprise-level app

Recently I have found that the following approach works great for many projects I have worked on.
The issue, however, is that I have read that the EF Core DbContext is a UoW by itself, and that I should NOT create my own UoW and repositories. But in that case, I am unable to abstract my persistence layer from my application logic layer.
TL;DR question is:
Is it possible NOT to have my own repositories nor my own UoW and still follow the mentioned architecture, with DbContext as the UoW?
My architecture is like follows:
Layer 1 (most inner):
Aggregates, Entities, POCO domain classes, Value Objects
Layer 2:
Domain services
Layer 3:
Application services (CQRS commands, queries, handlers) and Repository Interfaces
Layer 4A (persistence layer):
Repository implementations (DbContext injected here)
EF Core mappings (ORM mappings)
Layer 4B:
ASP.NET MVC API (DI registered here)
API controllers just issue commands and queries (via MediatR).
The advantage of the above approach is that the app core (layers 1, 2 and 3) is completely abstracted from persistence.
The disadvantage is that you really have to write your own Repositories.
Is this a correct approach? Or am I missing something?
Why is a DbContext a unit of work?
The DbContext captures all the changes that you are making and applies them within one single transaction via one single commit (SaveChanges).
Why shouldn't you create your own?
Ideally, you should only be committing to one single data store via one single transaction. If you are either saving to multiple data stores in multiple transactions or saving to the same data store in several transactions, then you face the likely possibility of data corruption. If you are using a distributed transaction across multiple data stores, well then God help you.
SaveChanges should therefore be sufficient, so why create your own?
But what about abstraction?
If SaveChanges is sufficient, then how do we abstract out our dependency on EF? You can introduce an IUnitOfWork interface with a single method, Commit, which you can implement by calling DbContext.SaveChanges.
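As an illustration of that interface (transposed to Java/JPA to match the main question's stack, since this thread otherwise uses Java; in EF Core, Commit would simply delegate to DbContext.SaveChanges):

// The application layer depends only on this interface, never on the ORM.
public interface UnitOfWork {
    void commit();
}

// One possible implementation; assumes a resource-local EntityManager,
// purely for illustration.
public class JpaUnitOfWork implements UnitOfWork {
    private final EntityManager em;

    public JpaUnitOfWork(EntityManager em) {
        this.em = em;
    }

    @Override
    public void commit() {
        em.getTransaction().commit(); // EF Core analogue: DbContext.SaveChanges()
    }
}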
And repositories?
I am not sure I understand not creating Repositories as a hard rule. As part of abstracting out your persistence layer, it is helpful to have a layer such as IRepository to provide that separation. That said, you should not be creating a repository per table. A repository per Aggregate is more appropriate. Each repository will load the entire Aggregate to ensure consistency within the boundary of the Aggregate.
...
In general, I would caution against following advice that speaks in absolutes if you don't understand the reasoning behind that advice. You should be able to formulate the same conclusion given the same starting information for yourself. Otherwise, you are just applying rote memorization to a pattern that does not always benefit from that approach.

Can Couchbase be used as the underlying JobRepository for Spring Batch?

We have a requirement where we have to read a batch of entities of a given type from the database, submit info about each entity to a service which will call back later with some data to update in the caller entity, and then save all the caller entities with the updated data. We thought of using Spring Batch; however, we use Couchbase as our database, which is eventually consistent and has no support for transactions.
I was going through the Spring Batch documentation and came across the Spring Batch Meta-Data ERD diagram here:
https://docs.spring.io/spring-batch/4.1.x/reference/html/index-single.html#metaDataSchema
With the above information in mind, my question is:
Can Couchbase be used as the underlying JobRepository for Spring Batch? What are the things I should keep in mind if it's possible to use it? Any links to example implementations would be welcome.
The JobRepository needs to be transactional in order for Spring Batch to work properly. Here is an excerpt from the Transaction Configuration for the JobRepository section of the reference documentation:
The behavior of the framework is not well defined if the repository methods are not transactional.
Since Couchbase has no support for transactions, as you mentioned, it is not possible to use it as the underlying data source for the JobRepository.
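A common workaround (my sketch, not part of the original answer) is to keep Spring Batch's metadata in a small transactional relational store while the job's business data stays in Couchbase; for example, with an embedded H2 database:

// With @EnableBatchProcessing, Spring Batch builds a JDBC-backed, transactional
// JobRepository from the DataSource bean below. H2 is just an example store.
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Bean
    public DataSource dataSource() {
        return new EmbeddedDatabaseBuilder()
                .setType(EmbeddedDatabaseType.H2)
                // metadata schema script shipped with spring-batch-core
                .addScript("classpath:org/springframework/batch/core/schema-h2.sql")
                .build();
    }
}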

Using an abstraction layer over DbContext

DbContext in EF Code First implements the Unit of Work and Repository patterns, as the MSDN site says:
A DbContext instance represents a combination of the Unit Of Work and Repository patterns such that it can be used to query from a database and group together changes that will then be written back to the store as a unit. DbContext is conceptually similar to ObjectContext.
Does that mean that using other UoW and Repository abstractions (such as IRepository and IUnitOfWork) over DbContext is wrong?
In other words, does using another abstraction layer over DbContext add any value to our code?
Value such as a technology-independent DAL (our domain would depend on IRepository and IUnitOfWork instead of DbContext).
Consider this - you currently have two strong ORMs, each having its pros and cons over the other:
Entity Framework
NHibernate
Additionally there are several more micro ORMs, such as:
Dapper
Massive
PetaPoco
...
And to make things even more complicated, there are clients / drivers for non-SQL databases such as:
C# driver for MongoDb
StackExchange Driver for Redis
...
And of course, one more thing that always has to be taken into consideration is whether there will be testing that includes mocking the data access layer.
The decision whether to use the UoW/Repository pattern should come from your project itself.
If your project is short-term, with a limited budget, and you are not likely to use anything but Entity Framework and SQL, then introducing a UoW/Repository abstraction layer will just take additional, pointless development time which you could have used on something else, or to complete the project earlier and earn some extra cash.
However, if the project is long-running and involves a more complex development lifecycle that includes continuous testing, then the UoW/Repository pattern is a must. With the number of databases now in use and the NoSQL movement coming heavily into the .NET ecosystem, nailing down the selection of ORM and database might cause severe refactoring once you decide to scale out (e.g., scaling out with MongoDb is much cheaper than with SQL, so your client might suddenly ask you to move everything to MongoDb). As sides are shifting constantly right now and new ideas are being implemented (such as combined graph+document databases), no one can make a good statement about which database will be the best choice for your project one year from now.
There is no bool answer to this question.
This is just my point of view, and it comes from a developer who works on both short-term and long-running projects.
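To make the kind of abstraction discussed above concrete, here is a minimal sketch (in Java terms to match the rest of this thread; Customer and the repository names are illustrative). The domain depends only on the interface, so the ORM or database behind it can be swapped, or mocked in tests:

// Domain-facing abstraction: no ORM types leak through this interface.
public interface CustomerRepository {
    Optional<Customer> findById(long id);
    void add(Customer customer);
}

// One possible implementation bound to JPA; a MongoDB-backed implementation
// (or a test double) can replace it without touching the domain layer.
public class JpaCustomerRepository implements CustomerRepository {
    private final EntityManager em;

    public JpaCustomerRepository(EntityManager em) {
        this.em = em;
    }

    @Override
    public Optional<Customer> findById(long id) {
        return Optional.ofNullable(em.find(Customer.class, id));
    }

    @Override
    public void add(Customer customer) {
        em.persist(customer);
    }
}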

Design decision for a Java EE 6 (EJB, JSF, CDI, JPA) application

I am developing a small (but growing) Java EE project based on the technologies EJB 3.1, JSF 2, CDI (Weld) and JPA 2, deployed on JBoss AS 7.1 Beta1.
As a starting point I created a Maven project based on the Knappsack Maven archetypes.
My architecture is basically the same as the one provided by the archetype, and as my project grows, this archetype seems to be reaching its limits. I want to modify the basic idea of the archetype according to my needs. But let me first explain how the project is organized at the moment.
The whole project is built around Seam-like Home classes. The view references them (via EL in XHTML templates). Most of the Home classes are @Named and @RequestScoped (or, for short, @Model) or @ConversationScoped, and Enterprise Java Beans are @Injected into them. Basically these (normally @Local) EJBs are responsible for the database access (a kind of DAO), so that transactions are managed automatically by the container. So every DAO class has its own EntityManager injected via CDI. At the moment every DAO integrates aspects which logically belong together (e.g. there is a SchoolDao in the archetype which is responsible for creating Teachers, Students and Courses).
This of course results in growing DAOs which have no well-defined task and which become hard to maintain and hard to understand. And as a painful side effect, the risk of duplicate code grows.
As a consequence I want to break up this design and have only DAOs which are responsible for one specific task (a StudentDao, a TeacherDao and so on). And at this point I am in trouble. As each DAO has a reference to its own EntityManager, it cannot be guaranteed that something like the following will work (I think it never will):
Teacher teacher = teacherDao.find(teacherId); // managed by TeacherDao's EntityManager
course.setTeacher(teacher);
courseDao.save(course); // CourseDao's EntityManager has never seen this Teacher
The JPA implementation complains about a null value for column COURSE.TEACHER_ID (assuming Course has a non-nullable FK relationship to Teacher). Each DAO holds its own EntityManager; the teacher is managed by the one in the TeacherDao, but the other one in the CourseDao tries to merge the Course @Entity.
Maybe the archetype I used is not suitable for larger applications. But what would an appropriate design for such an application be, IF the technologies I used are obligatory (EJB 3.1 for container-managed transactions [and later on other business-related stuff], JSF as the view technology, JPA as the database mapper, and CDI as the 'must-have because it's hip' :)?
Edit:
I now have an EntityManager injected in the base class all other DAO classes inherit from. So all DAOs use the same instance (the debugger shows the same object id), but I still have the problem that all entities I read from the database are immediately detached. This makes me wonder, as it means that there is either no container-managed transaction, or the transaction gets closed immediately after the entity is read. Each DAO is a @Local @Stateless EJB. They are injected into my JSF beans (@Named and @RequestScoped), from where I want to make use of the CRUD operations. Is there anything I am missing?
Having each DAO hold its own EntityManager is a very bad design.
You should have an EntityManager per transaction/request and pass it to each DAO, have them all share the same one, or get it from the context.
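For illustration, a minimal sketch assuming container-managed, transaction-scoped persistence contexts (the unit name is illustrative): every DAO injected this way that participates in the same JTA transaction sees the same persistence context, so the Teacher stays managed across DAO calls.

@Stateless
public class TeacherDao {

    // Container-managed, transaction-scoped persistence context: shared with
    // every other component enlisted in the same JTA transaction.
    @PersistenceContext(unitName = "my-pu")
    private EntityManager em;

    public Teacher find(Long id) {
        return em.find(Teacher.class, id);
    }
}

Note that this only helps if teacherDao.find and courseDao.save actually run inside one transaction: calling each @Stateless DAO directly from a JSF bean starts and commits a separate transaction per call, which also explains the immediately detached entities from the edit above. Wrapping both calls in a single @Stateless service method keeps them in one transaction and one persistence context.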