Can couchbase be used as the underlying JobRepository for spring-batch? - spring-batch

We have a requirement to read a batch of entities of a given type from the database, submit information about each entity to a service that will call back later with data to update on the calling entity, and then save all the calling entities with the updated data. We considered using spring-batch; however, we use Couchbase as our database, which is eventually consistent and has no support for transactions.
I was going through the spring-batch documentation and came across the Spring Batch Meta-Data ERD here:
https://docs.spring.io/spring-batch/4.1.x/reference/html/index-single.html#metaDataSchema
With the above information in mind, my question is:
Can Couchbase be used as the underlying job repository for spring-batch? If it's possible, what should I keep in mind? Any links to example implementations would be welcome.

The JobRepository needs to be transactional in order for Spring Batch to work properly. Here is an excerpt from the Transaction Configuration for the JobRepository section of the reference documentation:
The behavior of the framework is not well defined if the repository methods are not transactional.
Since Couchbase has no support for transactions as you mentioned, it is not possible to use it as an underlying datasource for the JobRepository.

Related

how to implement multi-tenant using spring-data-mongodb

I am new to multi-tenancy with MongoDB using spring-data-mongodb. We need to use spring-data-mongodb for REST APIs and for scheduled tasks (we have more than one scheduler in our application) in the same codebase, in a thread-safe way. Will autowiring MongoTemplate make the application thread-safe, given that the same MongoTemplate will be accessed from both the schedulers and the APIs? Please suggest good practice for this situation.
Regards
Kris
MongoTemplate itself is thread-safe: you can call it from multiple threads at the same time, and it will send the different requests to MongoDB correctly.
But that doesn't guarantee consistency: if the scheduler is running and executes multiple updates in the same task, an API call can possibly get some updated records and some other records that aren't updated yet.
By the way: multi-tenancy is having data from multiple organisational entities in the same database. I'm not sure how that links to your question, did you mean multi-threading?
If you use a different database per tenant, then you can't use a single autowired MongoTemplate.
For autowiring, there must be a single instance, but since the database connection string is a dependency of a MongoTemplate, there must be a single database as well.
You could go for an approach where you do not auto-wire the MongoTemplate directly, but use some sort of factory pattern to create the correct MongoTemplate for the current tenant. See Making spring-data-mongodb multi-tenant for some examples. (It's an old question, but its answers get updated every now and then).
Or you could go with an infrastructural solution, and deploy separate instances of your application, one for each tenant, e.g. on Kubernetes.
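The factory approach mentioned above could be sketched roughly as follows. This is a minimal, library-free illustration, not the code from the linked question: `TenantContext` and `TenantResourceFactory` are hypothetical names, and the type parameter `T` stands in for `MongoTemplate`. It assumes the tenant id is bound to the current thread before any data access, which a request filter would do for API calls and which each scheduled task would have to do explicitly.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical holder for the tenant id of the current thread. */
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {}

    static void set(String tenantId) { CURRENT.set(tenantId); }

    static String get() {
        String tenantId = CURRENT.get();
        if (tenantId == null) {
            throw new IllegalStateException("No tenant bound to the current thread");
        }
        return tenantId;
    }

    /** Call when the request or scheduled task ends, so pooled threads do not leak a tenant. */
    static void clear() { CURRENT.remove(); }
}

/** Hypothetical factory that lazily creates and caches one resource per tenant. */
final class TenantResourceFactory<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> creator;

    TenantResourceFactory(Function<String, T> creator) {
        this.creator = creator;
    }

    /** Returns the resource for the tenant bound to the calling thread. */
    T forCurrentTenant() {
        // computeIfAbsent is atomic per key, so concurrent callers share one instance per tenant.
        return cache.computeIfAbsent(TenantContext.get(), creator);
    }
}
```

Application code would then inject the factory instead of a `MongoTemplate` and call `forCurrentTenant()` at the point of use; the `creator` function is where a per-tenant template would be built from that tenant's connection settings.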

How to keep a history of edit of Entities in a JPA application

A JavaEE and JPA application need to keep a record of all the changes made by the user.
Currently, for all the entities, there are fields to record createdBy and lastEditedBy properties. Yet, the requirement of recording all edits is not possible with those properties.
What is the best way to record the history of all edits for a particular entity?
I do not use Spring.
You can use JaVers, which is a database- and framework-agnostic tool for maintaining operation history.
There are two big differences between JaVers and Envers:
Envers is the Hibernate plugin. It has good integration with Hibernate
but you can use it only with traditional SQL databases. If you chose
NoSQL database or SQL but with another persistence framework (for
example JOOQ) — Envers is not an option.
On the contrary, JaVers can be used with any kind of database and any
kind of persistence framework. For now, JaVers comes with repository
implementations for MongoDB and popular SQL databases. Other databases
(like Cassandra, Elastic) might be added in the future.
Envers’ audit model is table-oriented. You can think about Envers as
the tool for versioning database records.
JaVers’ audit model is object-oriented. It’s all about objects’
Snapshots. JaVers saves them to the single table (or the collection in
Mongo) as JSON documents with unified structure.
You can also achieve this using triggers and storing object differences.
Edit:
JaversAuditableAspect for any kind of repository.
It defines the pointcut on any method annotated with the method-level @JaversAuditable annotation. Choose it if you have repositories that are not managed by Spring Data.
    @Bean
    public JaversAuditableAspect javersAuditableAspect() {
        return new JaversAuditableAspect(javers(), authorProvider(), commitPropertiesProvider());
    }
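The "storing object differences" idea mentioned above can also be illustrated without any library. The following is only a sketch under stated assumptions: `SnapshotDiff` and `Change` are hypothetical names, and an entity snapshot is assumed to be a flat map of property names to values. It does not reflect how JaVers or Envers work internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

/** Hypothetical helper that compares two property snapshots of the same entity. */
final class SnapshotDiff {

    /** One changed property; a null old or new value means the property was absent. */
    static final class Change {
        final String property;
        final Object oldValue;
        final Object newValue;

        Change(String property, Object oldValue, Object newValue) {
            this.property = property;
            this.oldValue = oldValue;
            this.newValue = newValue;
        }

        @Override
        public String toString() {
            return property + ": " + oldValue + " -> " + newValue;
        }
    }

    private SnapshotDiff() {}

    /** Returns the properties whose values differ, ordered by property name. */
    static List<Change> diff(Map<String, Object> before, Map<String, Object> after) {
        // Union of property names from both snapshots, sorted for a stable order.
        TreeSet<String> names = new TreeSet<>(before.keySet());
        names.addAll(after.keySet());

        List<Change> changes = new ArrayList<>();
        for (String name : names) {
            Object oldValue = before.get(name);
            Object newValue = after.get(name);
            if (!Objects.equals(oldValue, newValue)) {
                changes.add(new Change(name, oldValue, newValue));
            }
        }
        return changes;
    }
}
```

Each returned list could be stored in a history table alongside the entity id, the editing user, and a timestamp, which is the information the question asks to keep.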
You can use Hibernate's Envers to audit your entities. It allows you to keep track of ALL changes made to entities, even deleted ones. Most probably you are already using Hibernate (as the JPA provider), so integration should be no problem.
https://hibernate.org/orm/envers/

How to set transaction isolation per request using Hibernate + Dropwizard + Guice for DI and Spring Data JPA?

Without using Spring as the DI framework (it offers the @Transactional annotation, which supports a custom isolation level per transaction), how could I set a custom isolation level for specific, sensitive transactions in a web service built on Dropwizard + Hibernate + Guice (DI) + Spring Data JPA + PostgreSQL (all recent versions)?
Here's a simple working example of web service with the exact same stack (pull requests are more than welcome):
https://github.com/jeep87c/dropwizard-guice-springDataJPA-hibernate
We use Spring Data JPA as an abstraction layer above Hibernate to simply save us from writing our own implementation for each DAOs. In an ideal world, it would be the only dependency to Spring for the web service but as you'll see in this code sample, we are doing kind of a hack by resolving the DAO implementation with Spring DI framework (using the beanFactory) so we can then register these in Guice. (I'm more than open to a better solution if you have one but this is not the subject of this question)
In this code sample, AbsenceResource.create deliberately persists the received payload twice. The acceptance test AbsenceResourceAcceptanceTest.rollbackTest exercises this deliberately broken API route and expects the rollback to happen.
The business requirement is that, to create a new absence, the service must first verify that no other absence collides with it for the same employee in the same company. The sample repo is simpler than my real-life scenario, where the service must check for collisions with both absence and vacation entities for an employee in a multi-tenant (per company) environment that has a single table per entity (multi-tenancy with column filtering on the company id).
To prevent a race condition that would let two colliding absences be wrongly inserted, we would like to set the isolation level to Serializable for this specific kind of transaction; according to the PostgreSQL documentation, it is the only level that avoids this kind of issue.
We looked into dropwizard-hibernate library but unfortunately, it doesn't provide any way to set the isolation level per transaction.
So before I spend hours replacing Guice with Spring as the DI framework in our web service (which looks like the only option for now), I'm looking for other, simpler solutions that would achieve the same result.
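One direction that does not depend on Spring's DI is to set the level directly on the JDBC connection, since `java.sql.Connection.setTransactionIsolation` is part of standard JDBC. The following is a minimal sketch, not a tested integration with dropwizard-hibernate: `IsolationScope` and `SqlWork` are hypothetical names, and it assumes the caller hands in a connection with no transaction already in progress (how that connection is obtained from Hibernate or the Dropwizard unit of work is left open).

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper; not part of Dropwizard, Guice, Hibernate, or Spring Data. */
final class IsolationScope {

    /** Hypothetical callback type for work that runs on the connection. */
    @FunctionalInterface
    interface SqlWork<T> {
        T run(Connection connection) throws SQLException;
    }

    private IsolationScope() {}

    /**
     * Runs the work in one transaction at the given isolation level,
     * e.g. Connection.TRANSACTION_SERIALIZABLE, then restores the previous settings.
     * Assumes no transaction is in progress on the connection when called.
     */
    static <T> T inTransaction(Connection connection, int isolationLevel, SqlWork<T> work)
            throws SQLException {
        int previousLevel = connection.getTransactionIsolation();
        boolean previousAutoCommit = connection.getAutoCommit();

        // The level has to be set before the transaction executes its first statement.
        connection.setTransactionIsolation(isolationLevel);
        connection.setAutoCommit(false);
        try {
            T result = work.run(connection);
            connection.commit();
            return result;
        } catch (SQLException | RuntimeException e) {
            connection.rollback();
            throw e;
        } finally {
            // Restore the settings so a pooled connection is not returned with a stricter level.
            connection.setAutoCommit(previousAutoCommit);
            connection.setTransactionIsolation(previousLevel);
        }
    }
}
```

Under Serializable isolation PostgreSQL may abort one of two conflicting transactions with a serialization failure (SQLSTATE 40001), so the caller of such a helper would normally be prepared to retry the whole unit of work.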

Spring Batch Execution Status Backed by Database

From the Spring Guides:
For starters, the @EnableBatchProcessing annotation adds many critical
beans that support jobs and saves you a lot of leg work. This example
uses a memory-based database (provided by @EnableBatchProcessing),
meaning that when it’s done, the data is gone.
How can I make the execution state backed by a database (or some other persistent record) so that, in case the application crashes, the job is resumed from the previous state?
My solution so far is to have my ItemReader be a JdbcCursorItemReader that reads records from a table whose column X is NULL, and my ItemWriter be a JdbcBatchItemWriter that updates each record with data in column X, making it non-null (so that it won't be picked up on the next execution). However, this seems really hackish and I believe there's a more elegant way. Can anyone please shed some light?
When using the @EnableBatchProcessing annotation, if you provide a DataSource bean definition called dataSource, Spring Batch will use that database for the job repository store instead of the in-memory map. You can read more about this functionality in the documentation here: http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/core/configuration/annotation/EnableBatchProcessing.html

Auditing with Spring Data JPA

I am using Spring Data JPA in an application in which all entity objects need auditing. I know that I can have each entity either implement Auditable or extend AbstractAuditable, but my problem is with the overall auditing implementation.
The example on the Spring Data JPA reference pages seems to indicate that you need an AuditorAware bean for each entity. Is there any way to avoid this extra code and handle it in one place or through one configuration?
The generic parameter of AuditorAware is not the entity you want to capture the auditing information for, but the type of the auditor, i.e. whoever creates or modifies the entity. It will typically be the currently logged-in user or the like, so a single AuditorAware bean serves all entities.