Can I use Spring JdbcTemplate in Spring Batch Job Implementation? - spring-batch

As per the Spring Batch documentation, it provides a variety of flavors of ItemReader for reading data from a database. In my case, a lot of business validation needs to be performed against the database.
Let's say that after reading data from any of the sources below, I want to validate it against multiple databases. Can I use Spring JdbcTemplate in a Spring Batch job implementation?
1. HibernatePagingItemReader
2. HibernateCursorItemReader
3. JpaPagingItemReader
4. JdbcPagingItemReader
5. JdbcCursorItemReader

You can use whatever mechanism you desire, including JdbcTemplate, to read from the database with Spring Batch. Spring Batch as a framework doesn't impose any such restriction.
Spring Batch provides those convenient readers (the ones you listed) for simple use cases, and if they don't fit your requirements, you are free to write your own readers too.
JdbcPagingItemReader itself uses a NamedParameterJdbcTemplate created on the DataSource that you provide.
Your requirement is not entirely clear to me, but I think you can do either of the following:
1. Composite reader - write your own composite reader that uses one of the Spring Batch readers as a delegate and then applies your validation logic to the items it reads.
2. Validate in a processor - read your items with the Spring Batch provided readers, then process/validate them in an ItemProcessor (a minimal sketch follows below). Chaining of processors is possible in Spring Batch (see Chaining ItemProcessors), so you can put different transformations in different processors and produce a final output after the chain.
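For the processor-based option, here is a minimal sketch of a validating ItemProcessor that uses a plain JdbcTemplate; the Customer type, table and column names are placeholders for your own domain:

```java
import org.springframework.batch.item.ItemProcessor;
import org.springframework.jdbc.core.JdbcTemplate;

public class ValidatingItemProcessor implements ItemProcessor<Customer, Customer> {

    private final JdbcTemplate jdbcTemplate;

    public ValidatingItemProcessor(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public Customer process(Customer item) {
        // Validate the item that was just read against another database.
        Integer count = jdbcTemplate.queryForObject(
                "SELECT COUNT(*) FROM reference_accounts WHERE account_id = ?",
                Integer.class, item.getAccountId());
        // Returning null filters the item out of the chunk, so it never reaches the writer.
        return (count != null && count > 0) ? item : null;
    }
}
```

The JdbcTemplate can be built on any DataSource, so the validation can run against a different database from the one you read from.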

Related

Can Spring Batch use no-sql database to store batch metadata?

Can Spring Batch use a NoSQL database (e.g. Firestore, MongoDB, etc.) to store batch metadata? If yes, can you share a sample?
It can, but it does not provide an implementation yet. We have a feature request for that here: https://github.com/spring-projects/spring-batch/issues/877.
That said, it is a matter of implementing a single interface: JobRepository. Otherwise, you can implement the 4 DAOs required by Spring Batch using your NoSQL database (JobInstanceDao, JobExecutionDao, StepExecutionDao, ExecutionContextDao) and use them with the provided SimpleJobRepository.
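For illustration, here is a minimal sketch of wiring a SimpleJobRepository from those four DAOs; the DAO beans themselves are assumed to be your own NoSQL-backed implementations of the Spring Batch DAO interfaces (e.g. built on MongoTemplate):

```java
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.repository.dao.ExecutionContextDao;
import org.springframework.batch.core.repository.dao.JobExecutionDao;
import org.springframework.batch.core.repository.dao.JobInstanceDao;
import org.springframework.batch.core.repository.dao.StepExecutionDao;
import org.springframework.batch.core.repository.support.SimpleJobRepository;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class NoSqlJobRepositoryConfig {

    // The four DAO parameters are expected to be your own NoSQL implementations
    // of these Spring Batch interfaces, registered as beans elsewhere.
    @Bean
    public JobRepository jobRepository(JobInstanceDao jobInstanceDao,
                                       JobExecutionDao jobExecutionDao,
                                       StepExecutionDao stepExecutionDao,
                                       ExecutionContextDao executionContextDao) {
        return new SimpleJobRepository(jobInstanceDao, jobExecutionDao,
                stepExecutionDao, executionContextDao);
    }
}
```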

SpringBatch is blocking insertion of data in other tables

I am using Postgres as my SQL database. My Spring Boot application uses Spring Batch for processing and insertion of data. I am auditing my code flow; for example, if a third-party API that I call fails, I audit that failure event. This piece of code is in my Spring Batch writer. I see logs showing that my audit DTO is created, but I don't see the data in the audit table. If I move the auditing code outside the Spring Batch writer, it works. What should be done so that my audit table insertion code works inside the Spring Batch writer?
More details would be needed to be sure, but I assume your writer writes to the 3rd party API and you write the audit log to the same DataSource that you use for the Spring Batch metadata.
Every write of a chunk that Spring Batch does in a writer is wrapped in a transaction. Such a transaction will be rolled back if you throw an exception in the writer.
You need to write the audit log outside of the transaction created by Spring Batch, for example by using Spring transaction management and starting a new transaction with propagation level REQUIRES_NEW.
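As an illustration, here is a minimal sketch of such an audit service; the AuditRepository and AuditEvent types are placeholders for whatever persistence mechanism you actually use:

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

@Service
public class AuditService {

    private final AuditRepository auditRepository; // placeholder repository bean

    public AuditService(AuditRepository auditRepository) {
        this.auditRepository = auditRepository;
    }

    // REQUIRES_NEW suspends the chunk transaction and commits the audit row
    // independently, so it survives even if the chunk is rolled back.
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void recordFailure(String message) {
        auditRepository.save(new AuditEvent(message));
    }
}
```

Note that the writer must call this method on an injected AuditService bean (i.e. through the Spring proxy); a self-invocation within the same class would bypass the @Transactional annotation.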

Spring batch with MongoDB and transactions

I have a Spring Batch application with two databases: one SQL DB for the Spring Batch meta data, and another which is a MongoDB where all the business data is stored. The relational DB still uses DataSourceTransactionManager.
However, I don't think the Mongo writes are done within an active transaction with rollbacks. Here is an excerpt from the official Spring Batch documentation on MongoItemWriter:
A ItemWriter implementation that writes to a MongoDB store using an implementation of Spring Data's MongoOperations. Since MongoDB is not a transactional store, a best effort is made to persist written data at the last moment, yet still honor job status contracts. No attempt to roll back is made if an error occurs during writing.
However this is not the case any more; MongoDB introduced ACID transactions in version 4.
How do I go about adding transactions to my writes? I could use @Transactional on my service methods when I use ItemWriterAdapter, but I still don't know what to do with MongoItemWriter... What is the right configuration here? Thank you.
I have a Spring Batch application with two databases: one SQL DB for the Spring Batch meta data, and another which is a MongoDB where all the business data is stored.
I invite you to take a look at the following posts to understand the implications of this design choice:
How to java-configure separate datasources for spring batch data and business data? Should I even do it?
How does Spring Batch transaction management work?
In your case, you have a distributed transaction across two data sources:
SQL datasource for the job repository, which is managed by a DataSourceTransactionManager
MongoDB for your step (using the MongoItemWriter), which is managed by a MongoTransactionManager
If you want technical meta-data and business data to be committed/rolled back in the scope of the same distributed transaction, you need to use a JtaTransactionManager that coordinates the DataSourceTransactionManager and MongoTransactionManager. You can find some resources about the matter here: https://stackoverflow.com/a/56547839/5019386.
BTW, there is a feature request to use MongoDB as a job repository in Spring Batch: https://github.com/spring-projects/spring-batch/issues/877. When this is implemented, you could store both business data and technical meta-data in the same datasource (so no need for a distributed transaction anymore) and you would be able to use the same MongoTransactionManager for both the job repository and your step.
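Independently of the distributed-transaction question, here is a rough sketch (Spring Batch 4.x / Spring Data MongoDB 3.x style; bean names, the item type and the chunk size are assumptions) of how a MongoTransactionManager can be wired onto the chunk-oriented step that writes to MongoDB. Note that this only scopes the chunk transaction to MongoDB; it does not, by itself, coordinate that transaction with the SQL-backed job repository:

```java
import org.bson.Document;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.data.MongoItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.data.mongodb.MongoDatabaseFactory;
import org.springframework.data.mongodb.MongoTransactionManager;

@Bean
public MongoTransactionManager mongoTransactionManager(MongoDatabaseFactory databaseFactory) {
    // Requires MongoDB 4+ running as a replica set; multi-document transactions
    // are not available on a standalone server.
    return new MongoTransactionManager(databaseFactory);
}

@Bean
public Step mongoWriterStep(StepBuilderFactory steps,
                            ItemReader<Document> reader,
                            MongoItemWriter<Document> writer,
                            MongoTransactionManager mongoTransactionManager) {
    return steps.get("mongoWriterStep")
            .<Document, Document>chunk(100)
            .reader(reader)
            .writer(writer)
            // Each chunk is committed/rolled back through the Mongo transaction manager.
            .transactionManager(mongoTransactionManager)
            .build();
}
```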

Applying drools rules using spring batch

We have a scenario where I have to get data from one database and update data in another database after applying business rules.
I want to use Spring Batch + Drools + Hibernate.
Can we apply the rules in batch, given that we have millions of records at a time?
I am not an expert on Drools; I am simply trying to give some context about Spring Batch.
Spring Batch is a read -> process -> write framework, and what you do with Drools is the same as what you do in the process step of Spring Batch, i.e. you transform a read item in an ItemProcessor.
The way Spring Batch helps you handle a large number of items is by implementing chunk-oriented processing: you read N items in one go, transform them one by one in the processor, and then write them as a bulk in the writer. This way you are basically reducing the number of DB calls.
There is further scope for performance improvement by implementing parallelism via partitioning etc., if your data can be partitioned on some criterion.
So you read items in bulk, transform them one by one, and then write them in bulk to the target database. I don't think Hibernate is a good tool for bulk update/insert at the write step; I would go with plain JDBC.
Drools comes into the picture at the transformation step, and that is going to be your custom code; its performance has nothing to do with Spring Batch, i.e. how you initialize sessions, pre-compile rules, etc. You will have to plug this code in such a way that you don't initialize the Drools session and the like every time; that should be a one-time activity (see the sketch below).
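To make the one-time initialization point concrete, here is a rough sketch of a Drools-backed ItemProcessor; the Order type and classpath-packaged rules (kmodule.xml) are assumptions:

```java
import org.kie.api.KieBase;
import org.kie.api.KieServices;
import org.kie.api.runtime.KieSession;
import org.springframework.batch.item.ItemProcessor;

public class DroolsItemProcessor implements ItemProcessor<Order, Order> {

    // Build the KieBase once; compiling the rules is the expensive, one-time part.
    private final KieBase kieBase = KieServices.Factory.get()
            .getKieClasspathContainer()
            .getKieBase();

    @Override
    public Order process(Order item) {
        // Creating a session per item is cheap compared to compiling the rules.
        KieSession session = kieBase.newKieSession();
        try {
            session.insert(item);
            session.fireAllRules();
            return item; // rules are assumed to mutate the item in place
        } finally {
            session.dispose();
        }
    }
}
```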

How to chain two readers in Spring Batch using Java configuration

This is a Spring Batch problem.
I would like to read some information from a CSV, then use that to read from two different tables in a database, then perform an update on those rows. I have a reader that reads from a CSV, and I can write to two tables by making a composite writer.
I would prefer a solution that uses Java configuration (it's too bad that so many examples on the Web use XML configuration and haven't been updated to Java configuration).
The more sample code you can provide, the better; in particular, if I had to use a listener or a processor, how would I perform the query and get the result?
What you're really looking for isn't chaining of readers but using an ItemProcessor to enrich the data that was read in from the CSV. I'd expect your step to be something along the lines of FlatFileItemReader for the reader, your own custom ItemProcessor that enriches the object provided from the reader, and then (as you mentioned) a CompositeItemWriter that delegates the writes to the appropriate other writers.
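As a rough Java-configuration sketch along those lines (Spring Batch 4.x builder style; CsvRow, EnrichedRow, the table, the column and the bean names are placeholders):

```java
import java.util.Arrays;

import javax.sql.DataSource;

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.CompositeItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
public class EnrichmentStepConfig {

    // The processor enriches each CSV row with data looked up from the database,
    // so the writers have everything they need for the updates.
    @Bean
    public ItemProcessor<CsvRow, EnrichedRow> enrichingProcessor(DataSource dataSource) {
        JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
        return row -> {
            String status = jdbcTemplate.queryForObject(
                    "SELECT status FROM accounts WHERE id = ?", String.class, row.getId());
            return new EnrichedRow(row, status);
        };
    }

    @Bean
    public Step enrichAndUpdateStep(StepBuilderFactory steps,
                                    ItemReader<CsvRow> csvReader, // e.g. a FlatFileItemReader bean
                                    ItemProcessor<CsvRow, EnrichedRow> enrichingProcessor,
                                    ItemWriter<EnrichedRow> tableOneWriter,
                                    ItemWriter<EnrichedRow> tableTwoWriter) {
        // The composite writer delegates each chunk to both table writers.
        CompositeItemWriter<EnrichedRow> compositeWriter = new CompositeItemWriter<>();
        compositeWriter.setDelegates(Arrays.asList(tableOneWriter, tableTwoWriter));
        return steps.get("enrichAndUpdateStep")
                .<CsvRow, EnrichedRow>chunk(100)
                .reader(csvReader)
                .processor(enrichingProcessor)
                .writer(compositeWriter)
                .build();
    }
}
```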