Spring batch: several hosts, one database

My question is simple: is Spring Batch designed in such a way that it allows multiple batch-processing hosts to be connected to the same database containing all of Spring Batch's tables, like BATCH_JOB_EXECUTION, BATCH_JOB_INSTANCE, etc.?
I'm asking because my application is currently organized in this way, and I'm getting DB deadlocks on those tables...

Related

Can the same metadata tables be used for multiple Spring Batch applications?

Example:
I have two different batches, batch-a and batch-b, running on Azure and connecting to an on-prem DB.
batch-a is deployed first and creates the metadata tables.
Let's say batch-b is deployed a few months later.
Can it use the same metadata tables that were created and used by batch-a?
If batch-a and batch-b are different jobs, then yes, you can use the same Spring Batch metadata tables: as long as batch-a and batch-b connect to the same DB, the Spring Batch framework will take care of it automatically.
See my article here: https://prateek-ashtikar512.medium.com/spring-batch-metadata-in-different-schema-c18813a0448a
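For illustration, a minimal sketch of what this looks like in code, assuming Spring Batch 5 and Spring Boot (the class name, bean names, and connection details below are placeholders, not from the linked article): each application simply points its job repository at the same shared database. Spring Batch identifies job instances by job name and identifying job parameters, so batch-a and batch-b can share the BATCH_* tables without clashing.

    import javax.sql.DataSource;

    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.boot.jdbc.DataSourceBuilder;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jdbc.support.JdbcTransactionManager;

    // Used with the same connection settings in both batch-a and batch-b, so their
    // job repositories read and write the same BATCH_* metadata tables.
    @Configuration
    @EnableBatchProcessing(dataSourceRef = "batchDataSource",
                           transactionManagerRef = "batchTransactionManager")
    public class SharedMetadataConfig {

        @Bean
        public DataSource batchDataSource() {
            return DataSourceBuilder.create()
                    .url("jdbc:sqlserver://onprem-host:1433;databaseName=batchdb") // placeholder
                    .username("batch")                                             // placeholder
                    .password("secret")                                            // placeholder
                    .build();
        }

        @Bean
        public JdbcTransactionManager batchTransactionManager(DataSource batchDataSource) {
            return new JdbcTransactionManager(batchDataSource);
        }
    }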

Spring batch with MongoDB and transactions

I have a Spring Batch application with two databases: one SQL DB for the Spring Batch meta data, and another which is a MongoDB where all the business data is stored. The relational DB still uses DataSourceTransactionManager.
However, I don't think the Mongo writes are done within an active transaction with rollbacks. Here is an excerpt from the official Spring Batch documentation on MongoItemWriter:
A ItemWriter implementation that writes to a MongoDB store using an implementation of Spring Data's MongoOperations. Since MongoDB is not a transactional store, a best effort is made to persist written data at the last moment, yet still honor job status contracts. No attempt to roll back is made if an error occurs during writing.
However, this is no longer the case; MongoDB introduced ACID transactions in version 4.
How do I go about adding transactions to my writes? I could use @Transactional on my service methods when I use ItemWriterAdapter, but I still don't know what to do with MongoItemWriter... What is the right configuration here? Thank you.
I have a Spring Batch application with two databases: one SQL DB for the Spring Batch meta data, and another which is a MongoDB where all the business data is stored.
I invite you to take a look at the following posts to understand the implications of this design choice:
How to java-configure separate datasources for spring batch data and business data? Should I even do it?
How does Spring Batch transaction management work?
In your case, you have a distributed transaction across two data sources:
SQL datasource for the job repository, which is managed by a DataSourceTransactionManager
MongoDB for your step (using the MongoItemWriter), which is managed by a MongoTransactionManager
If you want technical meta-data and business data to be committed/rolled back in the scope of the same distributed transaction, you need to use a JtaTransactionManager that coordinates the DataSourceTransactionManager and MongoTransactionManager. You can find some resources about the matter here: https://stackoverflow.com/a/56547839/5019386.
BTW, there is a feature request to use MongoDB as a job repository in Spring Batch: https://github.com/spring-projects/spring-batch/issues/877. When this is implemented, you could store both business data and technical meta-data in the same datasource (so no need for a distributed transaction anymore) and you would be able to use the same MongoTransactionManager for both the job repository and your step.
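To make the MongoItemWriter part of the question concrete, here is a minimal sketch of the Mongo side only, against the Spring Batch 5 API (the Person record, collection name, and reader bean are hypothetical). The chunk-oriented step is given a MongoTransactionManager, so each chunk's Mongo writes run inside a Mongo transaction. Note that this alone does not coordinate with the job repository's DataSourceTransactionManager; that is what the JTA setup described above is for.

    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.repository.JobRepository;
    import org.springframework.batch.core.step.builder.StepBuilder;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.batch.item.data.MongoItemWriter;
    import org.springframework.batch.item.data.builder.MongoItemWriterBuilder;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.mongodb.MongoDatabaseFactory;
    import org.springframework.data.mongodb.MongoTransactionManager;
    import org.springframework.data.mongodb.core.MongoTemplate;

    @Configuration
    public class MongoStepConfig {

        // Hypothetical business type stored in MongoDB.
        public record Person(String id, String name) {}

        @Bean
        public MongoTransactionManager mongoTransactionManager(MongoDatabaseFactory factory) {
            return new MongoTransactionManager(factory);
        }

        @Bean
        public MongoItemWriter<Person> personWriter(MongoTemplate mongoTemplate) {
            return new MongoItemWriterBuilder<Person>()
                    .template(mongoTemplate)
                    .collection("people") // placeholder collection
                    .build();
        }

        @Bean
        public Step mongoStep(JobRepository jobRepository,
                              MongoTransactionManager mongoTransactionManager,
                              ItemReader<Person> personReader,
                              MongoItemWriter<Person> personWriter) {
            // The transaction manager passed to chunk() wraps the writes of each chunk,
            // so the MongoItemWriter runs inside a Mongo transaction.
            return new StepBuilder("mongoStep", jobRepository)
                    .<Person, Person>chunk(100, mongoTransactionManager)
                    .reader(personReader)
                    .writer(personWriter)
                    .build();
        }
    }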

Spring batch 2.2.0 not writing data to file

We have a Spring Batch application which inserts data into a few tables, then selects data from a few tables based on multiple business conditions, and writes the data to a feed file (flat text file). When run, the application generates an empty feed file with only headers and no data. The select query, when run separately in SQL Developer, takes 2 hours and fetches the data (approx. 50 million records). We are using JdbcCursorItemReader and FlatFileItemWriter in the application. Below are the configuration details used:
maxBatchSize=100
fileFetchSize=1000
commitInterval=10000
There are no errors or exceptions while the application runs. I wanted to know if we are missing anything here, or if any Spring Batch components are not being used properly. Any pointers in this regard would be really helpful.
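For reference, the two components are typically wired along these lines (a sketch against the current Spring Batch API rather than 2.2.0; the SQL, column names, and output path are placeholders, not the asker's actual configuration, and the commitInterval maps to the step's chunk size, which is not shown here):

    import java.util.Map;
    import javax.sql.DataSource;
    import org.springframework.batch.item.database.JdbcCursorItemReader;
    import org.springframework.batch.item.file.FlatFileItemWriter;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.core.io.FileSystemResource;
    import org.springframework.jdbc.core.ColumnMapRowMapper;

    @Configuration
    public class FeedFileConfig {

        @Bean
        public JdbcCursorItemReader<Map<String, Object>> feedReader(DataSource dataSource) {
            JdbcCursorItemReader<Map<String, Object>> reader = new JdbcCursorItemReader<>();
            reader.setName("feedReader");
            reader.setDataSource(dataSource);
            reader.setSql("SELECT id, name FROM some_table"); // placeholder query
            reader.setFetchSize(1000);                        // corresponds to fileFetchSize above
            reader.setRowMapper(new ColumnMapRowMapper());
            return reader;
        }

        @Bean
        public FlatFileItemWriter<Map<String, Object>> feedWriter() {
            FlatFileItemWriter<Map<String, Object>> writer = new FlatFileItemWriter<>();
            writer.setResource(new FileSystemResource("/data/feed.txt")); // placeholder path
            writer.setHeaderCallback(w -> w.write("ID|NAME"));            // header row, as in the empty file
            writer.setLineAggregator(item -> item.get("id") + "|" + item.get("name"));
            return writer;
        }
    }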

Spring Batch reading from Oracle View

I have an external Oracle database to which I have access only through one view.
It takes more than 40 minutes for the view to provide results, and there are around 50,000 records in the result set.
We do not have control over optimizing the Oracle view.
I have to process the result set and persist it to a table in another Postgres database.
Is using Spring Batch recommended for my requirement?
Yes, Spring Batch will work fine for this.
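If it helps, a minimal sketch of such a job's step (Spring Batch 5 API; the view, table, and column names are placeholders, and the two datasources are assumed to be defined elsewhere as oracleDataSource and postgresDataSource): stream the view with a cursor-based reader and batch-insert into Postgres in chunks, so the 50,000 rows never have to be held in memory at once.

    import java.util.Map;
    import javax.sql.DataSource;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.repository.JobRepository;
    import org.springframework.batch.core.step.builder.StepBuilder;
    import org.springframework.batch.item.database.JdbcBatchItemWriter;
    import org.springframework.batch.item.database.JdbcCursorItemReader;
    import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
    import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jdbc.core.ColumnMapRowMapper;
    import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
    import org.springframework.jdbc.support.JdbcTransactionManager;

    @Configuration
    public class OracleViewToPostgresConfig {

        @Bean
        public JdbcCursorItemReader<Map<String, Object>> viewReader(DataSource oracleDataSource) {
            return new JdbcCursorItemReaderBuilder<Map<String, Object>>()
                    .name("viewReader")
                    .dataSource(oracleDataSource)
                    .sql("SELECT col_a, col_b FROM the_view") // placeholder
                    .fetchSize(1000)                          // stream rows instead of loading them all
                    .rowMapper(new ColumnMapRowMapper())
                    .build();
        }

        @Bean
        public JdbcBatchItemWriter<Map<String, Object>> targetWriter(DataSource postgresDataSource) {
            return new JdbcBatchItemWriterBuilder<Map<String, Object>>()
                    // parameter names match the uppercase column labels Oracle returns
                    .sql("INSERT INTO target_table (col_a, col_b) VALUES (:COL_A, :COL_B)") // placeholder
                    .dataSource(postgresDataSource)
                    .itemSqlParameterSourceProvider(MapSqlParameterSource::new)
                    .build();
        }

        @Bean
        public JdbcTransactionManager postgresTransactionManager(DataSource postgresDataSource) {
            // The chunk transaction manager wraps the Postgres writes.
            return new JdbcTransactionManager(postgresDataSource);
        }

        @Bean
        public Step copyStep(JobRepository jobRepository,
                             JdbcTransactionManager postgresTransactionManager,
                             JdbcCursorItemReader<Map<String, Object>> viewReader,
                             JdbcBatchItemWriter<Map<String, Object>> targetWriter) {
            return new StepBuilder("copyStep", jobRepository)
                    .<Map<String, Object>, Map<String, Object>>chunk(500, postgresTransactionManager)
                    .reader(viewReader)
                    .writer(targetWriter)
                    .build();
        }
    }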

Can I use Spring JdbcTemplate in Spring Batch Job Implementation?

As per the Spring Batch documentation, it provides a variety of ItemReader flavors to read data from a database. In my case, there is a lot of business validation that needs to be performed against the database.
Let's say that after reading data from any of the sources below, I want to validate it against multiple databases. Can I use Spring JdbcTemplate in my Spring Batch job implementation?
1. HibernatePagingItemReader
2. HibernateCursorItemReader
3. JpaPagingItemReader
4. JdbcPagingItemReader
5. JdbcCursorItemReader
You can use whatever mechanism you desire, including JdbcTemplate, to read from the database with Spring Batch; the framework doesn't impose any such restrictions.
Spring Batch provides those convenient readers (the ones you listed) for simple use cases, and if they don't fit your requirement, you are free to write your own readers too.
JdbcPagingItemReader itself uses a NamedParameterJdbcTemplate created on the datasource that you provide.
Your requirement is not entirely clear to me, but I guess you can do either of two things:
1. Composite reader - Write your own composite reader that uses one of the Spring Batch readers as its delegate, then apply your validation logic to the items it reads.
2. Validate in a processor - Read your items with the Spring Batch provided readers, then process/validate them in an ItemProcessor (see the sketch below). Chaining of processors is possible in Spring Batch (Chaining ItemProcessors), so you can put different transformations in different processors and produce a final output after the chain.
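To illustrate option 2, a minimal sketch of an ItemProcessor that uses JdbcTemplate for a validation lookup (the Customer type, the query, and the blacklist rule are hypothetical):

    import javax.sql.DataSource;
    import org.springframework.batch.item.ItemProcessor;
    import org.springframework.jdbc.core.JdbcTemplate;

    public class CustomerValidationProcessor implements ItemProcessor<Customer, Customer> {

        private final JdbcTemplate jdbcTemplate;

        public CustomerValidationProcessor(DataSource validationDataSource) {
            // A plain JdbcTemplate over whichever database holds the validation data.
            this.jdbcTemplate = new JdbcTemplate(validationDataSource);
        }

        @Override
        public Customer process(Customer item) {
            Integer hits = jdbcTemplate.queryForObject(
                    "SELECT COUNT(*) FROM blacklist WHERE customer_id = ?", // hypothetical rule
                    Integer.class, item.getId());
            // Returning null filters the item out of the chunk; throwing an exception
            // would fail or skip it instead, depending on the step's fault-tolerance settings.
            return (hits != null && hits > 0) ? null : item;
        }
    }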