spring batch transaction manager clarification - spring-batch

We have two different datasources (one for Spring Batch and one for the business domain). When we configure a Spring Batch job, which datasource is the transaction manager supposed to reference: the one pointing to the Spring Batch schema, or the one pointing to the business schema?

I would say you should use the same transaction manager for your data and meta-data, so that they remain consistent.

The standard Spring Batch transaction manager is the one referenced by the bean named transactionManager.
If you have more than one transaction manager in your Spring configuration, you'll need to specify the name of the bean you want to use.
If you have more than one datasource, one for the Spring Batch metadata tables and one for the processed data, you need distributed transactions.
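
For example, here is a minimal sketch (assuming Spring Batch 4.x with @EnableBatchProcessing; the bean names batchDataSource and batchTransactionManager are illustrative) of pointing Spring Batch at a specific datasource and transaction manager by overriding DefaultBatchConfigurer:

    import javax.sql.DataSource;

    import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.beans.factory.annotation.Qualifier;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.transaction.PlatformTransactionManager;

    @Configuration
    @EnableBatchProcessing
    public class BatchConfig extends DefaultBatchConfigurer {

        private final PlatformTransactionManager batchTransactionManager;

        public BatchConfig(@Qualifier("batchDataSource") DataSource batchDataSource,
                           @Qualifier("batchTransactionManager") PlatformTransactionManager batchTransactionManager) {
            super(batchDataSource); // the BATCH_* metadata tables live in this schema
            this.batchTransactionManager = batchTransactionManager;
        }

        @Override
        public PlatformTransactionManager getTransactionManager() {
            // used by the JobRepository and, by default, by the steps
            return batchTransactionManager;
        }
    }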

Related

Can Couchbase be used as the underlying JobRepository for spring-batch?

We have a requirement where we have to read a batch of an entity type from the database, submit information about each entity to a service that will call back later with some data to update in the caller entity, and then save all the caller entities with the updated data. We thought of using Spring Batch; however, we use Couchbase as our database, which is eventually consistent and has no support for transactions.
I was going through the spring-batch documentation and I came across the Spring Batch Meta-Data ERD diagram here :
https://docs.spring.io/spring-batch/4.1.x/reference/html/index-single.html#metaDataSchema
With the above information in mind, my question is:
Can Couchbase be used as the underlying job repository for Spring Batch? What should I keep in mind if it's possible to use it? Any links to example implementations would be welcome.
The JobRepository needs to be transactional in order for Spring Batch to work properly. Here is an excerpt from the Transaction Configuration for the JobRepository section of the reference documentation:
The behavior of the framework is not well defined if the repository methods are not transactional.
Since Couchbase has no support for transactions, as you mentioned, it is not possible to use it as the underlying datasource for the JobRepository.

Spring Data JPA | handle multiple DB without entityManager, DataSource and TransactionManager

I've read several examples (ex1, ex2, ex3) of using several databases with Spring Data (JPA). All of them suggest creating one configuration file for each database, with three beans (EntityManager, DataSource and TransactionManager) each. Is this the only way to handle multiple databases, or is there an easier way to do it?
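
For reference, this is a minimal sketch of the pattern those examples describe (package names, bean names, and connection settings are illustrative, and Spring Boot's DataSourceBuilder is used for brevity); each database gets one such configuration class with its own three beans:

    import javax.sql.DataSource;

    import org.springframework.boot.jdbc.DataSourceBuilder;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
    import org.springframework.orm.jpa.JpaTransactionManager;
    import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
    import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter;
    import org.springframework.transaction.PlatformTransactionManager;

    @Configuration
    @EnableJpaRepositories(
            basePackages = "com.example.orders",
            entityManagerFactoryRef = "ordersEntityManagerFactory",
            transactionManagerRef = "ordersTransactionManager")
    public class OrdersDbConfig {

        @Bean
        public DataSource ordersDataSource() {
            return DataSourceBuilder.create()
                    .url("jdbc:postgresql://localhost:5432/orders")
                    .username("orders")
                    .password("secret")
                    .build();
        }

        @Bean
        public LocalContainerEntityManagerFactoryBean ordersEntityManagerFactory() {
            LocalContainerEntityManagerFactoryBean emf = new LocalContainerEntityManagerFactoryBean();
            emf.setDataSource(ordersDataSource());
            emf.setPackagesToScan("com.example.orders"); // entities for this database only
            emf.setJpaVendorAdapter(new HibernateJpaVendorAdapter());
            return emf;
        }

        @Bean
        public PlatformTransactionManager ordersTransactionManager() {
            return new JpaTransactionManager(ordersEntityManagerFactory().getObject());
        }
    }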

Spring Batch Execution Status Backed by Database

From the Spring Guides:
For starters, the @EnableBatchProcessing annotation adds many critical beans that support jobs and saves you a lot of leg work. This example uses a memory-based database (provided by @EnableBatchProcessing), meaning that when it's done, the data is gone.
How can I make the execution state backed by a database (or some other persistent record) so that, in case the application crashes, the job is resumed from the previous state?
My solution, until now, has been to make my ItemReader a JdbcCursorItemReader that reads records from a table whose column X is not NULL, and my ItemWriter a JdbcBatchItemWriter that updates each record with data in column X, making it non-null (so that it won't be picked up on the next execution). However, this seems really hackish and I believe there's a more elegant way. Can anyone please shed some light?
When using the @EnableBatchProcessing annotation, if you provide a DataSource bean definition called dataSource, Spring Batch will use that database for the job repository store instead of the in-memory map. You can read more about this functionality in the documentation here: http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/core/configuration/annotation/EnableBatchProcessing.html
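
A minimal sketch of this (assuming a Spring Boot application; the connection settings are purely illustrative): simply exposing a DataSource bean makes @EnableBatchProcessing build a JDBC-backed job repository on it, so execution state survives a crash and a restart can resume the job.

    import javax.sql.DataSource;

    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.boot.jdbc.DataSourceBuilder;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    @EnableBatchProcessing
    public class JobRepositoryConfig {

        @Bean
        public DataSource dataSource() {
            // Point this at the database holding the BATCH_* metadata tables.
            return DataSourceBuilder.create()
                    .url("jdbc:postgresql://localhost:5432/batchdb")
                    .username("batch")
                    .password("secret")
                    .build();
        }
    }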

Application Managed Transaction (JTA) and Container Managed Persistence (JPA)

Currently I'm working on a piece of software that consists of two parts.
One part is a kind of company-wide framework for data processing (a kind of self-written process engine). It uses JTA UserTransactions and calls a sub-processor that is written by our project.
Our "sub-processor" is an application in its own right. It uses container-managed persistence via JPA (WebSphere with OpenJPA).
A typical workflow is:
process engine loads process data -> starts user transaction -> calls sub-processor -> writes process data -> ends user transaction
We are now experiencing the following incorrect behavior:
The user transaction is committed in the process engine and all the process metadata is stored in the DB, BUT the data the entity manager holds inside our sub-processor application is not written to the DB.
Is there some manual communication necessary to commit the content of our entity manager?
The problem we observed had nothing to do with JTA and transactions.
We tried to clean a BLOB column, but this was not possible. I will create another question for it.

Auditing with Spring Data JPA

I am using Spring Data JPA in an application in which all entity objects need auditing. I know that I can have each one either implement Auditable or extend AbstractAuditable, but my problem concerns the overall auditing implementation.
The example on the Spring Data JPA reference pages seems to indicate that you need an AuditableAware bean for each entity. Is there any way to avoid this extra code and handle it in one place or through one configuration?
The generic parameter of AuditorAware is not the entity you want to capture the auditing information for but rather the creating/modifying one. So it will typically be the user currently logged in or the like.
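
So a single bean is enough for all audited entities. A minimal sketch (assuming Spring Data JPA 2.x, where getCurrentAuditor returns an Optional, and Spring Security for resolving the logged-in user):

    import java.util.Optional;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.domain.AuditorAware;
    import org.springframework.data.jpa.repository.config.EnableJpaAuditing;
    import org.springframework.security.core.Authentication;
    import org.springframework.security.core.context.SecurityContextHolder;

    @Configuration
    @EnableJpaAuditing
    public class AuditConfig {

        @Bean
        public AuditorAware<String> auditorAware() {
            // One bean for the whole application: it resolves who is acting,
            // not which entity is being audited.
            return () -> Optional.ofNullable(SecurityContextHolder.getContext().getAuthentication())
                    .map(Authentication::getName);
        }
    }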