[Spring Batch][Mongo] Cannot read jobParameters in ItemReader read() - mongodb

I have set up a step configuration and an ItemReader that reads data from MongoDB in the same file, like this:
@Bean("STEP_FETCH_DATA")
public Step fetchDatabaseStep(
        ItemReader<ExampleDao> dataReader,
        DataProcessor dataProcessor,
        DataWriter dataWriter,
        @Qualifier("TASK_EXECUTOR") TaskExecutor taskExecutor
) {
    log.info("Initialize step: {}", "STEP_FETCH_DATA");
    return stepBuilderFactory.get("STEP_FETCH_DATA")
            .<ExampleDao, ExampleDao>chunk(chunkSize)
            .processor(dataProcessor)
            .reader(dataReader)
            .writer(dataWriter)
            .taskExecutor(taskExecutor)
            .build();
}
@Bean("dataReader")
@StepScope
public ItemReader<ExampleDao> read(@Value("#{jobParameters.get(\"batchRunDate\")}") String batchRunDate)
        throws UnexpectedInputException, ParseException, NonTransientResourceException {
    log.info("Reading start... batchRunDate : {}", batchRunDate);
    MongoItemReader<ExampleDao> reader = new MongoItemReader<>();
    reader.setTemplate(mongoTemplate);
    reader.setSort(new HashMap<String, Sort.Direction>() {{
        put("_id", Sort.Direction.DESC);
    }});
    reader.setTargetType(ExampleDao.class);
    reader.setQuery("{}");
    return reader;
}
With the code above, the reader can access my job parameter and works as expected.
However, if I create a separate class to hold my Mongo ItemReader, like this:
@Component
@Slf4j
public class DataReaderExample {

    @Autowired
    private MongoTemplate mongoTemplate;

    @Bean
    @StepScope
    public ItemReader<ExampleDao> read(@Value("#{jobParameters.get(\"batchRunDate\")}") String batchRunDate)
            throws UnexpectedInputException, ParseException, NonTransientResourceException {
        log.info("Reading start... batchRunDate : {}", batchRunDate);
        MongoItemReader<ExampleDao> reader = new MongoItemReader<>();
        reader.setTemplate(mongoTemplate);
        reader.setSort(new HashMap<String, Sort.Direction>() {{
            put("_id", Sort.Direction.DESC);
        }});
        reader.setTargetType(ExampleDao.class);
        reader.setQuery("{}");
        return reader;
    }
}
Then I set up the step configuration like this. (Notice the .reader(dataReadExample.read(null)); I expected the @Value("#{jobParameters.get(\"batchRunDate\")}") on the read() argument to override the null value.)
@Bean("STEP_FETCH_DATA")
public Step fetchDatabaseStep(
        DataReaderExample dataReadExample,
        DataProcessor dataProcessor,
        DataWriter dataWriter,
        @Qualifier("TASK_EXECUTOR") TaskExecutor taskExecutor
) {
    log.info("Initialize step: {}", "STEP_FETCH_DATA");
    return stepBuilderFactory.get("STEP_FETCH_DATA")
            .<ExampleDao, ExampleDao>chunk(chunkSize)
            .processor(dataProcessor)
            .reader(dataReadExample.read(null))
            .writer(dataWriter)
            .taskExecutor(taskExecutor)
            .build();
}
Now my log.info("Reading start... batchRunDate : {}", batchRunDate) statement always prints null, and the @Value("#{jobParameters.get(\"batchRunDate\")}") expression has no effect. It seems I cannot access the jobParameters.
Can anyone explain this behavior and show how to move the ItemReader to another class? My goal is to separate the ItemReader into its own class. Thanks!

Your DataReaderExample is declared as a @Component; it should rather be a @Configuration class in which you declare bean definitions.
I also suggest renaming the read method to itemReader or something similar, because its purpose is to define the item reader bean, not to actually read the data.
Once that is done, you can import your DataReaderExample configuration class into your application context and autowire the item reader into your step:
@Bean("STEP_FETCH_DATA")
public Step fetchDatabaseStep(
        ItemReader<ExampleDao> itemReader,
        DataProcessor dataProcessor,
        DataWriter dataWriter,
        @Qualifier("TASK_EXECUTOR") TaskExecutor taskExecutor
) {
    log.info("Initialize step: {}", "STEP_FETCH_DATA");
    return stepBuilderFactory.get("STEP_FETCH_DATA")
            .<ExampleDao, ExampleDao>chunk(chunkSize)
            .processor(dataProcessor)
            .reader(itemReader)
            .writer(dataWriter)
            .taskExecutor(taskExecutor)
            .build();
}

Related

Should Job/Step/Reader/Writer all be beans?

As far as I can tell from the examples in the Spring Batch reference doc, objects like job/step/reader/writer are all marked as @Bean, like the following:
@Bean
public Job footballJob() {
    return this.jobBuilderFactory.get("footballJob")
            .listener(sampleListener())
            ...
            .build();
}

@Bean
public Step sampleStep(PlatformTransactionManager transactionManager) {
    return this.stepBuilderFactory.get("sampleStep")
            .transactionManager(transactionManager)
            .<String, String>chunk(10)
            .reader(itemReader())
            .writer(itemWriter())
            .build();
}
I have a scenario where the server side receives requests and runs jobs concurrently (different job names, or the same job name with different JobParameters). The idea is to create a new job object (including steps/readers/writers) in each concurrent thread, so I probably will not declare the job method as @Bean and will instead create a new job each time.
There is also a difference in how parameters are passed to objects like the reader. When using @Bean, parameters must be put into, e.g., JobParameters and late-bound into the object using @StepScope, as in the following example:
@StepScope
@Bean
public FlatFileItemReader flatFileItemReader(@Value("#{jobParameters['input.file.name']}") String name) {
    return new FlatFileItemReaderBuilder<Foo>()
            .name("flatFileItemReader")
            .resource(new FileSystemResource(name))
            .build();
}
When not using @Bean, I can just pass the parameter directly, with no need to put the data into JobParameters, like the following:
public FlatFileItemReader flatFileItemReader(String name) {
    return new FlatFileItemReaderBuilder<Foo>()
            .name("flatFileItemReader")
            .resource(new FileSystemResource(name))
            .build();
}
A simple test shows that it works without @Bean. But I want to confirm formally:
1. Is using @Bean on job/step/reader/writer mandatory or not?
2. If it is not mandatory, when I create an object like a reader myself, do I need to call afterPropertiesSet() manually?
Thanks!
1. Is using @Bean on job/step/reader/writer mandatory or not?
No, it is not mandatory to declare batch artefacts as beans. But you would want to at least declare the Job as a bean to benefit from Spring's dependency injection (like injecting the job repository reference into the job, etc.) and to be able to do something like:
ApplicationContext context = new AnnotationConfigApplicationContext(MyJobConfig.class);
Job job = context.getBean(Job.class);
JobLauncher jobLauncher = context.getBean(JobLauncher.class);
jobLauncher.run(job, new JobParameters());
2. If it is not mandatory, when I create an object like a reader myself, do I need to call afterPropertiesSet() manually?
I guess that by "when I new an object like a reader" you mean creating a new instance manually. In this case, yes: if the object is not managed by Spring, you need to call that method yourself. If the object is declared as a bean, Spring will call the afterPropertiesSet() method automatically. Here is a quick sample:
import org.springframework.beans.factory.InitializingBean;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TestAfterPropertiesSet {

    @Bean
    public MyBean myBean() {
        return new MyBean();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(TestAfterPropertiesSet.class);
        MyBean myBean = context.getBean(MyBean.class);
        myBean.sayHello();
    }

    static class MyBean implements InitializingBean {

        @Override
        public void afterPropertiesSet() throws Exception {
            System.out.println("MyBean.afterPropertiesSet");
        }

        public void sayHello() {
            System.out.println("Hello");
        }
    }
}
This prints:
MyBean.afterPropertiesSet
Hello
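By contrast, when an object is created manually rather than declared as a bean, invoking the callback is your responsibility. A minimal sketch, using a FlatFileItemReader purely for illustration (the input.txt resource is a placeholder):

```java
// Manually constructed reader: Spring never sees this instance,
// so the InitializingBean callback must be invoked explicitly
// before the reader is first used.
FlatFileItemReader<String> reader = new FlatFileItemReader<>();
reader.setResource(new FileSystemResource("input.txt"));
reader.setLineMapper(new PassThroughLineMapper());
reader.afterPropertiesSet(); // would happen automatically for a bean
```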

How do I create custom ItemReader for each step in my Spring Batch project

I am trying to use a custom reader, processor and writer in each step:
@Bean
public Step step1(StepBuilderFactory factory,
                  ItemReader reader,
                  ExpireAssessmentWriter writer,
                  AssessmentItemProcessor processor,
                  PlatformTransactionManager platformTransactionManager) {
    return stepBuilderFactory.get("step1")
            .transactionManager(platformTransactionManager)
            .<Assessment, Assessment>chunk(10)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .build();
}

//update aggregate balance table
@Bean
public Step step2(StepBuilderFactory factory,
                  ItemReader reader,
                  BalanceItemWriter writer,
                  BalanceProcessor processor,
                  PlatformTransactionManager platformTransactionManager) {
    return stepBuilderFactory.get("step2")
            .transactionManager(platformTransactionManager)
            .<Assessment, Assessment>chunk(10)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .build();
}

@Bean
public Step step3(StepBuilderFactory factory,
                  ItemReader<Assessment> reader,
                  CustomWriter3 writer,
                  CustomItemProcessor3 processor,
                  PlatformTransactionManager platformTransactionManager) {
    return stepBuilderFactory.get("step3")
            .transactionManager(platformTransactionManager)
            .<Assessment, Assessment>chunk(10)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .build();
}
The first step works fine, but only when I leave this reader in the same class:
private static final String READER_QUERY = "SELECT * FROM TABLE1 WHERE COLUMN='TEST'";

@Bean
public JdbcCursorItemReader<Assessment> reader(DataSource dataSource) {
    return new JdbcCursorItemReaderBuilder<Assessment>()
            .dataSource(dataSource)
            .name("AssessmentUtilityReader")
            .sql(READER_QUERY)
            .rowMapper(new AssessmentMapper())
            .build();
}
How can I create a custom reader for each of these steps that reads its own query? Can I create a custom reader that extends JdbcCursorItemReader and returns this same snippet of code?
@Bean
public JdbcCursorItemReader<Assessment> reader(DataSource dataSource) {
    return new JdbcCursorItemReaderBuilder<Assessment>()
            .dataSource(dataSource)
            .name("AssessmentUtilityReader")
            .sql(READER_QUERY)
            .rowMapper(new AssessmentMapper())
            .build();
}
Since the item type is the same for all steps, you can create a method that accepts a query and returns an item reader:
public JdbcCursorItemReader<Assessment> getReader(DataSource dataSource, String query) {
    return new JdbcCursorItemReaderBuilder<Assessment>()
            .dataSource(dataSource)
            .name("AssessmentUtilityReader") // can be passed as a parameter as well
            .sql(query)
            .rowMapper(new AssessmentMapper())
            .build();
}
Then call this method in each step definition and pass the required query for each step.
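For example, a step definition could call the method with its own query (a sketch based on the step signatures in the question; the TABLE2 query string is a hypothetical placeholder):

```java
@Bean
public Step step2(StepBuilderFactory stepBuilderFactory,
                  DataSource dataSource,
                  BalanceItemWriter writer,
                  BalanceProcessor processor,
                  PlatformTransactionManager platformTransactionManager) {
    return stepBuilderFactory.get("step2")
            .transactionManager(platformTransactionManager)
            .<Assessment, Assessment>chunk(10)
            // each step passes the query it needs to the shared factory method
            .reader(getReader(dataSource, "SELECT * FROM TABLE2 WHERE COLUMN='TEST'"))
            .processor(processor)
            .writer(writer)
            .build();
}
```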
To turn your reader into a custom component that can be autowired, add the following class:
@Component
public class AssessmentUtilityReader extends JdbcCursorItemReader<Assessment> {

    public AssessmentUtilityReader(final DataSource dataSource) {
        setName(getClass().getSimpleName());
        setDataSource(dataSource);
        setRowMapper(new AssessmentMapper());
        // language=SQL
        setSql(
                """
                SELECT *
                FROM TABLE1
                WHERE COLUMN = 'TEST'
                """);
    }
}
Hint: the comment (// language=SQL) is a hint for IntelliJ to use SQL highlighting in the following lines. It is optional.
Then simply autowire it in the step definition:
@Bean
public Step step3(StepBuilderFactory factory,
                  AssessmentUtilityReader assessmentUtilityReader,
                  CustomWriter3 writer,
                  CustomItemProcessor3 processor,
                  PlatformTransactionManager platformTransactionManager) {
    return stepBuilderFactory.get("step3")
            .transactionManager(platformTransactionManager)
            .<Assessment, Assessment>chunk(10)
            .reader(assessmentUtilityReader)
            .processor(processor)
            .writer(writer)
            .build();
}

How to pass JobParameters to myBatisPagingItemReader without using #StepScope

I am using Spring Batch's restart functionality so that it reads from the last failed point forward. My restart works fine as long as I don't use the @StepScope annotation on my myBatisPagingItemReader bean method.
I have to use @StepScope so that I can do late binding to get the jobParameters via the input parameter of my myBatisPagingItemReader bean method:
@Value("#{jobParameters['run-date']}")
If I use @StepScope, the restart does not work.
I tried adding a new JobParameterExecutionContextCopyListener() listener to copy JobParameters into the ExecutionContext.
But how will I get access to the ExecutionContext inside myBatisPagingItemReader, given that I don't override the ItemReader's open method?
I am not sure how I can get access to jobParameters when running myBatisPagingItemReader without using @StepScope. Please share any inputs.
Also, I am not sure if my understanding of Spring Batch restart is correct regarding what happens when a new (stateful) instance is created under @StepScope.
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory,
                      ItemReader<Model> myBatisPagingItemReader,
                      ItemProcessor<Model, Model> itemProcessor,
                      ItemWriter<Model> itemWriter) {
        return stepBuilderFactory.get("data-load")
                .<Model, Model>chunk(10)
                .reader(myBatisPagingItemReader)
                .processor(itemProcessor)
                .writer(itemWriter)
                .listener(itemReadListener())
                .listener(new JobParameterExecutionContextCopyListener())
                .build();
    }

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory, @Qualifier("step1") Step step1) {
        return jobBuilderFactory.get("load-job")
                .incrementer(new RunIdIncrementer())
                .start(step1)
                .listener(jobExecutionListener())
                .build();
    }
}

@Component
public class BatchInputReader {

    @Bean
    //@StepScope
    public ItemReader<Model> myBatisPagingItemReader(SqlSessionFactory sqlSessionFactory) {
        MyBatisPagingItemReader<Model> reader = new MyBatisPagingItemReader<>();
        Map<String, Object> parameterValues = new HashMap<>();
        // populate parameterValues from jobParameters ??
        reader.setSqlSessionFactory(sqlSessionFactory);
        reader.setParameterValues(parameterValues);
        reader.setQueryId("query");
        return reader;
    }
}
You are declaring a Spring bean (myBatisPagingItemReader) in a class annotated with @Component (BatchInputReader). This is not correct.
What you need to do is declare the MyBatis reader as a bean in your configuration class BatchConfig. Once this is done and the bean is annotated with @StepScope, you can pass job parameters as follows:
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Bean
    @StepScope
    public ItemReader<Model> myBatisPagingItemReader(
            SqlSessionFactory sqlSessionFactory,
            @Value("#{jobParameters['param1']}") String param1,
            @Value("#{jobParameters['param2']}") String param2) {
        MyBatisPagingItemReader<Model> reader = new MyBatisPagingItemReader<>();
        Map<String, Object> parameterValues = new HashMap<>();
        // populate parameterValues from jobParameters ?? => Those can now be accessed from the method parameters
        reader.setSqlSessionFactory(sqlSessionFactory);
        reader.setParameterValues(parameterValues);
        reader.setQueryId("query");
        return reader;
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory,
                      ItemReader<Model> myBatisPagingItemReader,
                      ItemProcessor<Model, Model> itemProcessor,
                      ItemWriter<Model> itemWriter) {
        return stepBuilderFactory.get("data-load")
                .<Model, Model>chunk(10)
                .reader(myBatisPagingItemReader)
                .processor(itemProcessor)
                .writer(itemWriter)
                .listener(itemReadListener())
                .listener(new JobParameterExecutionContextCopyListener())
                .build();
    }

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory, @Qualifier("step1") Step step1) {
        return jobBuilderFactory.get("load-job")
                .incrementer(new RunIdIncrementer())
                .start(step1)
                .listener(jobExecutionListener())
                .build();
    }
}
More details about this can be found in the Late Binding of Job and Step Attributes section of the reference documentation. BatchInputReader will be left empty and is not needed anymore. Less is more! :-)
Hope this helps.
Adding to my question: I have added myBatisPagingItemReader() to my @Configuration-annotated class as suggested.
Restart example when I use the @StepScope annotation on myBatisPagingItemReader(): the reader fetches 5 records and I have the chunk size (commit-interval) set to 3.
Job Instance - 01 - Job Parameter - 01/02/2019.
chunk-1:
- process record-1
- process record-2
- process record-3
- writer - writes all 3 records
- chunk-1 commit successful
chunk-2:
- process record-4
- process record-5 - throws an exception
Job completes and is set to 'FAILED' status.
Now the job is restarted using the same job parameter.
Job Instance - 01 - Job Parameter - 01/02/2019.
chunk-1:
- process record-1
- process record-2
- process record-3
- writer - writes all 3 records
- chunk-1 commit successful
chunk-2:
- process record-4
- process record-5 - throws an exception
Job completes and is set to 'FAILED' status.
Please note: here, because I am using the @StepScope annotation on the myBatisPagingItemReader() bean method, the job creates a new instance; see the log messages below.
Creating object in scope=step, name=scopedTarget.myBatisPagingItemReader
Registered destruction callback in scope=step, name=scopedTarget.myBatisPagingItemReader
As it is a new instance, it starts the process from the beginning instead of starting from chunk-2.
If I don't use @StepScope, it restarts from chunk-2, because the restarted job step sets MyBatisPagingItemReader.read.count=3.
I would like to use @StepScope for the late binding. If I use @StepScope, is it possible for my myBatisPagingItemReader to restore the read.count from the last failure so that restart works?
Or
If I don't use @StepScope, is there a way to get job parameters inside myBatisPagingItemReader?
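For the second question, one common approach is to keep the reader a singleton and pull the parameters from the StepExecution when the step starts, e.g. via a @BeforeStep annotated method. This is only a sketch of the idea, not code from the thread; the runDate key and Model type are placeholders:

```java
public class ParamAwareReader extends MyBatisPagingItemReader<Model> {

    @BeforeStep
    public void retrieveJobParameters(StepExecution stepExecution) {
        // Grab the job parameter at step start; the reader bean stays a
        // singleton, so the restart state (read.count) is preserved.
        String runDate = stepExecution.getJobParameters().getString("run-date");
        Map<String, Object> parameterValues = new HashMap<>();
        parameterValues.put("runDate", runDate);
        setParameterValues(parameterValues);
    }
}
```

Depending on the Spring Batch version, the annotated method may need to be registered explicitly on the step with .listener(reader) for the callback to fire.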

How do I set JobParameters in spring batch with spring-boot

I followed the guide at http://spring.io/guides/gs/batch-processing/ but it describes a job with no configurable parameters. I'm using Maven to build my project.
I'm porting an existing job that I have defined in XML and would like to pass in the jobParameters through the command line.
I tried the following:
@Configuration
@EnableBatchProcessing
public class MyBatchConfiguration {

    // other beans omitted

    @Bean
    public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
        return new FileSystemResource(dest);
    }
}
Then I compile my project using :
mvn clean package
Then I try to launch the program like this:
java -jar my-jarfile.jar dest=/tmp/foo
And I get an exception saying:
[...]
Caused by: org.springframework.expression.spel.SpelEvaluationException:
EL1008E:(pos 0): Field or property 'jobParameters' cannot be found on object of
type 'org.springframework.beans.factory.config.BeanExpressionContext'
Thanks!
Parse the job parameters from the command line, then create and populate a JobParameters object:
public JobParameters getJobParameters() {
    JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
    jobParametersBuilder.addString("dest", <dest_from_cmd_line>);
    jobParametersBuilder.addDate("date", <date_from_cmd_line>);
    return jobParametersBuilder.toJobParameters();
}
Pass them to your job via the JobLauncher:
JobLauncher jobLauncher = context.getBean(JobLauncher.class);
JobExecution jobExecution = jobLauncher.run(job, jobParameters);
Now you can access them using code like:
@Bean
@StepScope
public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
    return new FileSystemResource(dest);
}
Or in a @Configuration class that configures Spring Batch job artifacts like ItemReader, ItemWriter, etc.:
@Bean
@StepScope
public JdbcCursorItemReader<MyPojo> reader(@Value("#{jobParameters}") Map jobParameters) {
    return MyReaderHelper.getReader(jobParameters);
}
I managed to get this working by simply annotating my bean as follows:
@Bean
@StepScope
public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
    return new FileSystemResource(dest);
}

Using JdbcTemplate with Named parameters in spring batch

I'm trying to pass a parameter to my query in Spring Batch. I decided to create a tasklet and use JdbcTemplate as follows:
public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext)
        throws EpsilonBatchBusinessException {
    LOGGER.debug("Enter execute.");
    JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
    jdbcTemplate.query(queryString,
            new PreparedStatementSetter() {
                public void setValues(PreparedStatement preparedStatement) throws SQLException {
                    preparedStatement.setInt(1, runNumber);
                }
            },
            rowMapper);
    LOGGER.debug("Exit execute.");
    return RepeatStatus.FINISHED;
}
I am injecting a dataSource, queryString, rowMapper object, and the parameter (runNumber) into this bean. The tasklet will be called within a step to create a list. I usually pass the row mapper to a JdbcCursorItemReader Spring bean and wouldn't write a tasklet, but my query string needs a parameter, hence the tasklet. I am just not sure whether this tasklet will do the trick the way JdbcCursorItemReader does. Your input will be appreciated.
A better option would be to use the JdbcCursorItemReader and write a custom PreparedStatementSetter.
The PreparedStatementSetter interface is very simple; pretty much all the code you'd need to write is below. Once the setter is written, all you need to do is configure it as a new bean with the runNumber value injected in the config, and then inject that bean into a JdbcCursorItemReader. This allows you to use all the usual ItemReaders and ItemWriters instead of having to implement everything by hand in a Tasklet.
package com.foo;

import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.springframework.jdbc.core.PreparedStatementSetter;

public class YourParamSetter implements PreparedStatementSetter {

    private int runNumber;

    public void setValues(PreparedStatement ps) throws SQLException {
        ps.setInt(1, runNumber);
    }

    public void setRunNumber(int runNumber) {
        this.runNumber = runNumber;
    }

    public int getRunNumber() {
        return runNumber;
    }
}
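Wiring the setter into a JdbcCursorItemReader could then look roughly like this. A sketch only: the Foo row type, the query, and the hard-coded run number are illustrative, not from the question:

```java
@Bean
public JdbcCursorItemReader<Foo> reader(DataSource dataSource) {
    // In real config the run number would be injected, not hard-coded.
    YourParamSetter paramSetter = new YourParamSetter();
    paramSetter.setRunNumber(42);

    JdbcCursorItemReader<Foo> reader = new JdbcCursorItemReader<>();
    reader.setDataSource(dataSource);
    reader.setSql("SELECT * FROM SOME_TABLE WHERE RUN_NUMBER = ?");
    reader.setPreparedStatementSetter(paramSetter);
    reader.setRowMapper(new BeanPropertyRowMapper<>(Foo.class));
    return reader;
}
```

With this in place, the existing chunk-oriented step can consume the reader directly and the tasklet is no longer needed.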