How to pass JobParameters to myBatisPagingItemReader without using @StepScope - spring-batch

I am using the Spring Batch restart functionality so that a failed job resumes from the last failed point. My restart works fine as long as I don't use the @StepScope annotation on my myBatisPagingItemReader bean method.
I have to use @StepScope so that I can do late binding and get the jobParameters through an input parameter of my myBatisPagingItemReader bean method:
@Value("#{jobParameters['run-date']}")
If I use @StepScope, the restart does not work.
I tried adding a new JobParameterExecutionContextCopyListener() as a listener to copy the JobParameters into the ExecutionContext.
But how would I get access to the ExecutionContext inside myBatisPagingItemReader, since I don't have access to the ItemReader's open method?
I am not sure how I can access the jobParameters when running myBatisPagingItemReader without @StepScope. Any inputs are appreciated.
I am also not sure whether my understanding of how Spring Batch restart works is correct when a new (stateful) instance is created because of @StepScope.
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory,
                      ItemReader<Model> myBatisPagingItemReader,
                      ItemProcessor<Model, Model> itemProcessor,
                      ItemWriter<Model> itemWriter) {
        return stepBuilderFactory.get("data-load")
                .<Model, Model>chunk(10)
                .reader(myBatisPagingItemReader)
                .processor(itemProcessor)
                .writer(itemWriter)
                .listener(itemReadListener())
                .listener(new JobParameterExecutionContextCopyListener())
                .build();
    }

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory, @Qualifier("step1") Step step1) {
        return jobBuilderFactory.get("load-job")
                .incrementer(new RunIdIncrementer())
                .start(step1)
                .listener(jobExecutionListener())
                .build();
    }
}

@Component
public class BatchInputReader {

    @Bean
    // @StepScope
    public ItemReader<Model> myBatisPagingItemReader(SqlSessionFactory sqlSessionFactory) {
        MyBatisPagingItemReader<Model> reader = new MyBatisPagingItemReader<>();
        Map<String, Object> parameterValues = new HashMap<>();
        // populate parameterValues from jobParameters ??
        reader.setSqlSessionFactory(sqlSessionFactory);
        reader.setParameterValues(parameterValues);
        reader.setQueryId("query");
        return reader;
    }
}

You are declaring a Spring bean (myBatisPagingItemReader) in a class annotated with @Component (BatchInputReader). This is not correct.
What you need to do is declare the MyBatis reader as a bean in your configuration class BatchConfig. Once this is done and the bean is annotated with @StepScope, you can pass job parameters as follows:
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Bean
    @StepScope
    public ItemReader<Model> myBatisPagingItemReader(
            SqlSessionFactory sqlSessionFactory,
            @Value("#{jobParameters['param1']}") String param1,
            @Value("#{jobParameters['param2']}") String param2) {
        MyBatisPagingItemReader<Model> reader = new MyBatisPagingItemReader<>();
        Map<String, Object> parameterValues = new HashMap<>();
        // The job parameters are now available as method parameters
        parameterValues.put("param1", param1);
        parameterValues.put("param2", param2);
        reader.setSqlSessionFactory(sqlSessionFactory);
        reader.setParameterValues(parameterValues);
        reader.setQueryId("query");
        return reader;
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory,
                      ItemReader<Model> myBatisPagingItemReader,
                      ItemProcessor<Model, Model> itemProcessor,
                      ItemWriter<Model> itemWriter) {
        return stepBuilderFactory.get("data-load")
                .<Model, Model>chunk(10)
                .reader(myBatisPagingItemReader)
                .processor(itemProcessor)
                .writer(itemWriter)
                .listener(itemReadListener())
                .listener(new JobParameterExecutionContextCopyListener())
                .build();
    }

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory, @Qualifier("step1") Step step1) {
        return jobBuilderFactory.get("load-job")
                .incrementer(new RunIdIncrementer())
                .start(step1)
                .listener(jobExecutionListener())
                .build();
    }
}
More details about this can be found in the Late Binding of Job and Step Attributes section of the reference documentation. BatchInputReader will be left empty and is not needed anymore. Less is more! :-)
Hope this helps.
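To make the late binding concrete, here is a minimal launch sketch (the parameter names param1/param2 and their values are placeholders, not something prescribed by Spring Batch):
// assumes an ApplicationContext that holds the BatchConfig beans
JobLauncher jobLauncher = context.getBean(JobLauncher.class);
Job job = context.getBean("job", Job.class);
JobParameters jobParameters = new JobParametersBuilder()
        .addString("param1", "01/02/2019")   // resolved by #{jobParameters['param1']}
        .addString("param2", "some-value")   // resolved by #{jobParameters['param2']}
        .toJobParameters();
jobLauncher.run(job, jobParameters);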

Adding to my question: I have added myBatisPagingItemReader() to my @Configuration-annotated class as suggested.
Here is a restart example when I use the @StepScope annotation on myBatisPagingItemReader(): the reader fetches 5 records and the chunk size (commit-interval) is set to 3.
Job Instance - 01 - Job Parameter - 01/02/2019.
chunk-1:
- process record-1
- process record-2
- process record-3
- writer writes all 3 records
- chunk-1 commit successful
chunk-2:
- process record-4
- process record-5 - throws an exception
Job completes and is set to 'FAILED' status.
Now the job is restarted using the same job parameter.
Job Instance - 01 - Job Parameter - 01/02/2019.
chunk-1:
- process record-1
- process record-2
- process record-3
- writer writes all 3 records
- chunk-1 commit successful
chunk-2:
- process record-4
- process record-5 - throws an exception
Job completes and is set to 'FAILED' status.
Please note: because I am using the @StepScope annotation on the myBatisPagingItemReader() bean method, the job creates a new reader instance; see the log messages below.
Creating object in scope=step, name=scopedTarget.myBatisPagingItemReader
Registered destruction callback in scope=step, name=scopedTarget.myBatisPagingItemReader
As it is a new instance, it starts processing from the beginning instead of resuming from chunk-2.
If I don't use @StepScope, it restarts from chunk-2 because the restarted step sets MyBatisPagingItemReader.read.count=3.
I would like to use @StepScope for the late binding. If I use @StepScope, is it possible for my myBatisPagingItemReader to pick up the read.count from the last failure so that restart works?
Or
If I don't use @StepScope, is there a way to get the job parameters inside myBatisPagingItemReader?

Related

[Spring Batch][Mongo] Cannot read jobParameters in ItemReader read()

I have set up a step configuration and an ItemReader to read data from MongoDB in the same file, like this:
@Bean("STEP_FETCH_DATA")
public Step fetchDatabaseStep(
        ItemReader<ExampleDao> dataReader,
        DataProcessor dataProcessor,
        DataWriter dataWriter,
        @Qualifier("TASK_EXECUTOR") TaskExecutor taskExecutor) {
    log.info("Initialize step: {}", "STEP_FETCH_DATA");
    return stepBuilderFactory.get("STEP_FETCH_DATA")
            .<ExampleDao, ExampleDao>chunk(chunkSize)
            .processor(dataProcessor)
            .reader(dataReader)
            .writer(dataWriter)
            .taskExecutor(taskExecutor)
            .build();
}

@Bean("dataReader")
@StepScope
public ItemReader<ExampleDao> read(@Value("#{jobParameters.get(\"batchRunDate\")}") String batchRunDate)
        throws UnexpectedInputException, ParseException, NonTransientResourceException {
    log.info("Reading start... batchRunDate : {}", batchRunDate);
    MongoItemReader<ExampleDao> reader = new MongoItemReader<>();
    reader.setTemplate(mongoTemplate);
    reader.setSort(new HashMap<String, Sort.Direction>() {{
        put("_id", Sort.Direction.DESC);
    }});
    reader.setTargetType(ExampleDao.class);
    reader.setQuery("{}");
    return reader;
}
With the above code, the reader can access my job parameter and works as expected.
However, if I create a class to contain my Mongo ItemReader like this:
@Component
@Slf4j
public class DataReaderExample {

    @Autowired
    private MongoTemplate mongoTemplate;

    @Bean
    @StepScope
    public ItemReader<ExampleDao> read(@Value("#{jobParameters.get(\"batchRunDate\")}") String batchRunDate)
            throws UnexpectedInputException, ParseException, NonTransientResourceException {
        log.info("Reading start... batchRunDate : {}", batchRunDate);
        MongoItemReader<ExampleDao> reader = new MongoItemReader<>();
        reader.setTemplate(mongoTemplate);
        reader.setSort(new HashMap<String, Sort.Direction>() {{
            put("_id", Sort.Direction.DESC);
        }});
        reader.setTargetType(ExampleDao.class);
        reader.setQuery("{}");
        return reader;
    }
}
Then I set up the step configuration like this. (Notice the .reader(dataReadExample.read(null)); I expected the @Value("#{jobParameters.get(\"batchRunDate\")}") on the read() argument to override the null value.)
@Bean("STEP_FETCH_DATA")
public Step fetchDatabaseStep(
        DataReaderExample dataReadExample,
        DataProcessor dataProcessor,
        DataWriter dataWriter,
        @Qualifier("TASK_EXECUTOR") TaskExecutor taskExecutor) {
    log.info("Initialize step: {}", "STEP_FETCH_DATA");
    return stepBuilderFactory.get("STEP_FETCH_DATA")
            .<ExampleDao, ExampleDao>chunk(chunkSize)
            .processor(dataProcessor)
            .reader(dataReadExample.read(null))
            .writer(dataWriter)
            .taskExecutor(taskExecutor)
            .build();
}
My log.info("Reading start... batchRunDate : {}", batchRunDate); always prints a null value, and the @Value("#{jobParameters.get(\"batchRunDate\")}") expression is not applied. It seems I cannot access the jobParameters.
Could anyone explain this behavior and how to move the ItemReader to another class? My goal is to separate the ItemReader into its own class. Thanks!
Your DataReaderExample is declared as a @Component; it should rather be a @Configuration class in which you declare bean definitions.
I also suggest renaming the read method to itemReader or something similar, because its purpose is to define the item reader bean, not to actually read the data.
Once that is done, you can import your DataReaderExample configuration class into your application context and autowire the item reader into your step:
@Bean("STEP_FETCH_DATA")
public Step fetchDatabaseStep(
        ItemReader<ExampleDao> itemReader,
        DataProcessor dataProcessor,
        DataWriter dataWriter,
        @Qualifier("TASK_EXECUTOR") TaskExecutor taskExecutor) {
    log.info("Initialize step: {}", "STEP_FETCH_DATA");
    return stepBuilderFactory.get("STEP_FETCH_DATA")
            .<ExampleDao, ExampleDao>chunk(chunkSize)
            .processor(dataProcessor)
            .reader(itemReader)
            .writer(dataWriter)
            .taskExecutor(taskExecutor)
            .build();
}
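For illustration, the reworked DataReaderExample could look roughly like the sketch below (based on the code in the question, with the method renamed as suggested; injecting MongoTemplate as a method parameter is an assumption made here to avoid field injection):
@Configuration
@Slf4j
public class DataReaderExample {

    @Bean("dataReader")
    @StepScope
    public ItemReader<ExampleDao> itemReader(
            MongoTemplate mongoTemplate,
            @Value("#{jobParameters['batchRunDate']}") String batchRunDate) {
        log.info("Configuring reader... batchRunDate : {}", batchRunDate);
        MongoItemReader<ExampleDao> reader = new MongoItemReader<>();
        reader.setTemplate(mongoTemplate);
        reader.setSort(new HashMap<String, Sort.Direction>() {{
            put("_id", Sort.Direction.DESC);
        }});
        reader.setTargetType(ExampleDao.class);
        reader.setQuery("{}");
        return reader;
    }
}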

X-Ray configuration for Spring Batch Job

X-Ray is integrated into my service and everything works fine when some endpoints are triggered from other services.
The Spring Batch job is used to process some data and push part of it to an SNS topic. This job is launched via a SimpleJobLauncher.
The issue is that during the push to SNS from my Spring Batch job, the following exception is thrown: SegmentNotFoundException: No segment in progress.
Based on the documentation, it looks like I need to pass the trace ID to the job:
https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-java-multithreading.html
Does anyone know the best way to integrate X-Ray with Spring Batch? And what would be the cleanest solution?
I solved this issue in the following way:
I passed the name, trace ID, and parent ID to my job via job parameters when launching it:
Entity segment = AWSXRay.getGlobalRecorder().getTraceEntity();
asyncJobLauncher.run(
        myJob,
        new JobParametersBuilder()
                .addLong(JOB_UNIQUENESS_KEY, System.nanoTime())
                .addString(X_RAY_NAME_ID_KEY, segment.getName())
                .addString(X_RAY_TRACE_ID_KEY, segment.getTraceId().toString())
                .addString(X_RAY_PARENT_ID_KEY, segment.getParentId())
                .toJobParameters()
);
I implemented a job listener that creates a new X-Ray segment when the job starts:
@Slf4j
@Component
@RequiredArgsConstructor
public class XRayJobListener implements JobExecutionListener {

    @Value("${spring.application.name}")
    private String appName;

    @Override
    public void beforeJob(@NonNull JobExecution jobExecution) {
        AWSXRayRecorder recorder = AWSXRay.getGlobalRecorder();
        String name = Objects.requireNonNullElse(
                jobExecution.getJobParameters().getString(X_RAY_NAME_ID_KEY),
                appName
        );
        Optional<String> traceIdOpt =
                Optional.ofNullable(jobExecution.getJobParameters().getString(X_RAY_TRACE_ID_KEY));
        TraceID traceID = traceIdOpt
                .map(TraceID::fromString)
                .orElseGet(TraceID::create);
        String parentId = jobExecution.getJobParameters().getString(X_RAY_PARENT_ID_KEY);
        recorder.beginSegment(name, traceID, parentId);
    }

    @Override
    public void afterJob(@NonNull JobExecution jobExecution) {
        AWSXRay.getGlobalRecorder().endSegment();
    }
}
And this listener is added to the configuration of my job:
@Bean
public Job myJob(
        JobBuilderFactory jobBuilderFactory,
        Step myStep1,
        Step myStep2,
        XRayJobListener xRayJobListener) {
    return jobBuilderFactory
            .get("myJob")
            .incrementer(new RunIdIncrementer())
            .listener(xRayJobListener)
            .start(myStep1)
            .next(myStep2)
            .build();
}
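The JOB_UNIQUENESS_KEY and X_RAY_* constants are not shown in the answer; a possible definition, with assumed names and arbitrary values, could be:
public final class XRayJobParameterKeys {

    public static final String JOB_UNIQUENESS_KEY = "jobUniquenessKey";
    public static final String X_RAY_NAME_ID_KEY = "xRayName";
    public static final String X_RAY_TRACE_ID_KEY = "xRayTraceId";
    public static final String X_RAY_PARENT_ID_KEY = "xRayParentId";

    private XRayJobParameterKeys() {
    }
}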

Should Job/Step/Reader/Writer all be bean?

As far as I can tell from the examples in the Spring Batch reference doc, objects like the job/step/reader/writer are all marked as @Bean, like the following:
@Bean
public Job footballJob() {
    return this.jobBuilderFactory.get("footballJob")
            .listener(sampleListener())
            ...
            .build();
}

@Bean
public Step sampleStep(PlatformTransactionManager transactionManager) {
    return this.stepBuilderFactory.get("sampleStep")
            .transactionManager(transactionManager)
            .<String, String>chunk(10)
            .reader(itemReader())
            .writer(itemWriter())
            .build();
}
I have a scenario where the server side receives requests and runs jobs concurrently (different job names, or the same job name with different JobParameters). The intent is to create a new job object (including steps/readers/writers) in each concurrent thread, so I probably will not declare the job method as a @Bean and will instead new up a job each time.
And there is actually a difference in how parameters are passed to an object like a reader. When using @Bean, parameters must be put into, e.g., JobParameters and late-bound into the object using @StepScope, like the following example:
@StepScope
@Bean
public FlatFileItemReader flatFileItemReader(@Value("#{jobParameters['input.file.name']}") String name) {
    return new FlatFileItemReaderBuilder<Foo>()
            .name("flatFileItemReader")
            .resource(new FileSystemResource(name))
            .build();
}
If not using @Bean, I can just pass the parameter directly, with no need to put the data into JobParameters, like the following:
public FlatFileItemReader flatFileItemReader(String name) {
    return new FlatFileItemReaderBuilder<Foo>()
            .name("flatFileItemReader")
            .resource(new FileSystemResource(name))
            .build();
}
A simple test shows that the no-@Bean approach works. But I want to confirm formally:
1. Is using @Bean on job/step/reader/writer mandatory or not?
2. If it is not mandatory, when I new up an object like a reader, do I need to call afterPropertiesSet() manually?
Thanks!
1. Is using @Bean on job/step/reader/writer mandatory or not?
No, it is not mandatory to declare batch artefacts as beans. But you would want to at least declare the Job as a bean to benefit from Spring's dependency injection (like injecting the job repository reference into the job, etc) and be able to do something like:
ApplicationContext context = new AnnotationConfigApplicationContext(MyJobConfig.class);
Job job = context.getBean(Job.class);
JobLauncher jobLauncher = context.getBean(JobLauncher.class);
jobLauncher.run(job, new JobParameters());
2. If it is not mandatory, when I new up an object like a reader, do I need to call afterPropertiesSet() manually?
I guess that by "when I new up an object like a reader" you mean creating a new instance manually. In that case yes: if the object is not managed by Spring, you need to call that method yourself. If the object is declared as a bean, Spring will call the afterPropertiesSet() method automatically. Here is a quick sample:
import org.springframework.beans.factory.InitializingBean;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TestAfterPropertiesSet {

    @Bean
    public MyBean myBean() {
        return new MyBean();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(TestAfterPropertiesSet.class);
        MyBean myBean = context.getBean(MyBean.class);
        myBean.sayHello();
    }

    static class MyBean implements InitializingBean {

        @Override
        public void afterPropertiesSet() throws Exception {
            System.out.println("MyBean.afterPropertiesSet");
        }

        public void sayHello() {
            System.out.println("Hello");
        }
    }
}
This prints:
MyBean.afterPropertiesSet
Hello
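Applied to the reader example from the question, the manual (non-@Bean) path could look like the sketch below; the file path and field names are made up for illustration, and afterPropertiesSet() is called explicitly because the instance is not managed by Spring:
// manual instantiation, outside the Spring container
FlatFileItemReader<Foo> reader = new FlatFileItemReaderBuilder<Foo>()
        .name("flatFileItemReader")
        .resource(new FileSystemResource("/tmp/input.csv")) // hypothetical file
        .delimited()
        .names("field1", "field2")                           // hypothetical field names
        .targetType(Foo.class)
        .build();
reader.afterPropertiesSet(); // no container around to call it for us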

I am getting error Table 'test.batch_job_instance' doesn't exist

I am new to Spring Batch. I have configured my job with an in-memory repository, but it still seems to be using the database to persist the job metadata.
My Spring Batch configuration is:
@Configuration
public class BatchConfiguration {

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private JobBuilderFactory jobBuilder;

    @Bean
    public JobLauncher jobLauncher() throws Exception {
        SimpleJobLauncher job = new SimpleJobLauncher();
        job.setJobRepository(getJobRepo());
        job.afterPropertiesSet();
        return job;
    }

    @Bean
    public PlatformTransactionManager getTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public JobRepository getJobRepo() throws Exception {
        return new MapJobRepositoryFactoryBean(getTransactionManager()).getObject();
    }

    @Bean
    public Step step1(JdbcBatchItemWriter<Person> writer) throws Exception {
        return stepBuilderFactory.get("step1")
                .<Person, Person>chunk(10)
                .reader(reader())
                .processor(processor())
                .writer(writer)
                .repository(getJobRepo())
                .build();
    }

    @Bean
    public Job job(@Qualifier("step1") Step step1) throws Exception {
        return jobBuilder.get("myJob").start(step1).repository(getJobRepo()).build();
    }
}
How do I resolve the above issue?
If you are using Spring Boot, a simple property in your application.properties will solve the issue:
spring.batch.initialize-schema=ALWAYS
For a non-Spring Boot setup: this error shows up when a DataSource bean is declared in the batch configuration. To work around the problem, I added an embedded datasource, since I didn't want to create those tables in the application database:
@Bean
public DataSource mysqlDataSource() {
    // create your application datasource here
}

@Bean
@Primary
public DataSource batchEmbeddedDatasource() {
    // in-memory datasource required by Spring Batch
    EmbeddedDatabaseBuilder builder = new EmbeddedDatabaseBuilder();
    return builder.setType(EmbeddedDatabaseType.H2)
            .addScript("classpath:schema-drop-h2.sql")
            .addScript("classpath:schema-h2.sql")
            .build();
}
The initialization scripts can be found inside the spring-batch-core-xxx.jar under the org.springframework.batch.core package. Note that I used an in-memory database, but the solution is also valid for other database systems.
For those who face the same problem with a MySQL database on CentOS (or most Unix-based systems): table names are case-sensitive on Linux. Setting lower_case_table_names=1 solved the problem.
See the official MySQL documentation for this variable.
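As a rough sketch, that variable is typically set in the MySQL server configuration file (the exact location depends on the installation):
[mysqld]
lower_case_table_names=1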
For those using versions of Spring Boot greater than 2.5, this worked inside application.properties:
spring.batch.jdbc.initialize-schema = ALWAYS
This solved my case:
spring.batch.jdbc.initialize-schema=ALWAYS

How do I set JobParameters in Spring Batch with Spring Boot

I followed the guide at http://spring.io/guides/gs/batch-processing/ but it describes a job with no configurable parameters. I'm using Maven to build my project.
I'm porting an existing job that I have defined in XML and would like to pass in the jobParameters through the command line.
I tried the following:
@Configuration
@EnableBatchProcessing
public class MyBatchConfiguration {

    // other beans omitted

    @Bean
    public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
        return new FileSystemResource(dest);
    }
}
Then I compile my project using:
mvn clean package
Then I try to launch the program like this:
java -jar my-jarfile.jar dest=/tmp/foo
And I get an exception saying:
[...]
Caused by: org.springframework.expression.spel.SpelEvaluationException:
EL1008E:(pos 0): Field or property 'jobParameters' cannot be found on object of
type 'org.springframework.beans.factory.config.BeanExpressionContext'
Thanks!
Parse the job parameters from the command line, then create and populate a JobParameters object:
public JobParameters getJobParameters() {
    JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
    jobParametersBuilder.addString("dest", <dest_from_cmd_line>);
    jobParametersBuilder.addDate("date", <date_from_cmd_line>);
    return jobParametersBuilder.toJobParameters();
}
Pass them to your job via the JobLauncher:
JobLauncher jobLauncher = context.getBean(JobLauncher.class);
JobExecution jobExecution = jobLauncher.run(job, jobParameters);
Now you can access them using code like:
@Bean
@StepScope
public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
    return new FileSystemResource(dest);
}
Or in a @Configuration class that configures Spring Batch job artifacts like ItemReader, ItemWriter, etc.:
@Bean
@StepScope
public JdbcCursorItemReader<MyPojo> reader(@Value("#{jobParameters}") Map jobParameters) {
    return MyReaderHelper.getReader(jobParameters);
}
I managed to get this working by simply annotating my bean as follows:
@Bean
@StepScope
public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
    return new FileSystemResource(dest);
}
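A note on launching: with Spring Boot's batch auto-configuration, arguments passed on the command line are converted into job parameters by the auto-configured job launching runner, so (assuming that setup) the dest parameter can be supplied directly, for example:
java -jar my-jarfile.jar dest=/tmp/foo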