Using Spring Batch, I am trying to start a job with some parameters, but the parameters from the previous instance are used.
Spring is started using ApplicationContext context = SpringApplication.run(Application.class, args);
My job bean :
@Bean
public Job closingJob(JobCompletionNotificationListener listener,
Step step1,
Step step2,
Step step3,
JobParametersValidator validator) {
return jobBuilderFactory.get("quarterly-closing")
.incrementer(new RunIdIncrementer())
.validator(validator)
.listener(listener)
.flow(step1)
.next(step2)
.next(step3)
.end()
.build();
}
In the logs :
2020-04-15 08:51:10,259 - INFO - [] {o.s.b.a.b.JobLauncherCommandLineRunner} --> Running default command line with: [run.id(long)=1, my.param=secondRun ]
2020-04-15 08:51:10,422 - INFO - [] {o.s.b.c.l.s.SimpleJobLauncher} --> Job: [FlowJob: [name=my-job]] launched with the following parameters: [{run.id=2, my.param=firstRun}]
I saw a similar question, but it has only one answer, which doesn't help me.
Edit: I tried it with a custom JobParametersIncrementer, but it doesn't work: it still uses the parameters of the previous instance.
@Bean
public JobParametersIncrementer incrementer(){
return parameters -> {
if (parameters==null || parameters.isEmpty()) {
return new JobParametersBuilder().addLong("run.id",1L).toJobParameters();
}
long id = parameters.getLong("run.id",1L) + 1;
return new JobParametersBuilder().addLong("run.id", id).toJobParameters();
};
}
From your log statements, it does look like the RunIdIncrementer is being hit.
2020-04-15 08:51:10,259 - INFO - [] {o.s.b.a.b.JobLauncherCommandLineRunner} --> Running default command line with: [run.id(long)=1, my.param=secondRun ]
2020-04-15 08:51:10,422 - INFO - [] {o.s.b.c.l.s.SimpleJobLauncher} --> Job: [FlowJob: [name=my-job]] launched with the following parameters: [{run.id=2, my.param=firstRun}]
JobLauncherCommandLineRunner says:
run.id(long)=1
SimpleJobLauncher says:
run.id=2
JobLauncherCommandLineRunner
The log statement for JobLauncherCommandLineRunner happens before any modifications to the JobParameters occur.
Relevant Code Snippet (Source):
public void run(String... args) throws JobExecutionException {
logger.info("Running default command line with: " + Arrays.asList(args));
launchJobFromProperties(StringUtils.splitArrayElementsIntoProperties(args, "="));
}
Assumption
I'm assuming you're trying to execute a job more than once with a static run.id=1 parameter, and you keep getting run.id=2 from your incrementer. If you want the incrementer to guarantee unique JobParameters, you have to try a different approach.
Look at the JobParametersIncrementer snippet below, which takes a set of JobParameters and adds a parameter random=(random long value):
@Override
public JobParameters getNext(JobParameters params) {
Long random = (long) (Math.random() * 1000000);
return new JobParametersBuilder(params).addLong("random", random).toJobParameters();
}
Using Spring Batch, I am trying to start a job with some parameters, but the parameters from the previous instance are used.
According to your job definition, you are using a JobParametersIncrementer. When you add an incrementer to your job definition, you are basically telling Spring Batch the following: whenever I run my job, increment the parameters of the previous instance using my incrementer and create a new job instance.
Using a job parameters incrementer makes sense when there is a logical sequence of job instances (aka it is possible to calculate the next job instance from the previous one).
So Spring Batch will take the parameters of the previous instance, pass them to your incrementer, and use those returned by the incrementer to create the "next" job instance in the sequence.
Hopefully this makes it clear why "parameters from previous instance are used".
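To make that sequence concrete, here is a minimal plain-Java sketch. It is a simplified model and does not use Spring Batch classes: job parameters are represented as a Map, and IncrementerSketch and nextRunId are hypothetical names. The copy-then-increment step follows the same idea as RunIdIncrementer, which copies the incoming parameters and bumps run.id, so the old my.param value travels along.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model (not Spring Batch code): job parameters are a plain map.
class IncrementerSketch {

    // Copies the previous instance's parameters and bumps run.id,
    // in the same spirit as RunIdIncrementer.
    static Map<String, Object> nextRunId(Map<String, Object> previous) {
        Map<String, Object> next = new LinkedHashMap<>(previous);
        long id = previous.containsKey("run.id") ? (Long) previous.get("run.id") : 0L;
        next.put("run.id", id + 1);
        return next;
    }

    public static void main(String[] args) {
        Map<String, Object> firstInstance = new LinkedHashMap<>();
        firstInstance.put("run.id", 1L);
        firstInstance.put("my.param", "firstRun");

        // The "next" instance is derived from the previous one, so my.param
        // keeps its old value unless something overrides it afterwards.
        System.out.println(nextRunId(firstInstance)); // {run.id=2, my.param=firstRun}
    }
}
```

The printed map matches the parameters reported by SimpleJobLauncher in the question's log.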
Related
I am new to Spring Batch and have some questions about restart. I know the restart feature is enabled by default. Do I need any extra code to restart a job? Which jobs are restartable? How can I test that my batch app is restartable? I tried to stop the batch in the middle of processing and run it again, but it always executes a new job.
Below is my code:
@Bean
@Qualifier("dataTransferJob")
public Job dataJob() {
return jobBuilderFactory.get("data-transfer-job")
.listener(jobExecutionListener())
.flow(step()).end().build();
}
@Bean
public Step step() {
return stepBuilderFactory.get("data-transfer-step")
.<TestData, TestDataVO>chunk(100)
.reader(reader())
.processor(process())
.writer(writer)
.taskExecutor(threadPool)
.transactionManager(transactionManager)
.listener(stepExecutionListener())
.listener(chunkListener())
.throttleLimit(10)
.build();
}
@PersistenceContext
private EntityManager em;
@Bean(destroyMethod="")
public ItemReader<TestData> reader() {
JpaPagingItemReader<TestData> itemReader = new JpaPagingItemReader<>();
try {
String sqlQuery = "SELECT * FROM TEST_DATA";
JpaNativeQueryProvider<TestData> queryProvider = new JpaNativeQueryProvider<TestData>();
queryProvider.setSqlQuery(sqlQuery);
queryProvider.setEntityClass(TestData.class);
queryProvider.afterPropertiesSet();
itemReader.setEntityManagerFactory(em.getEntityManagerFactory());
itemReader.setPageSize(100);
itemReader.setQueryProvider(queryProvider);
itemReader.afterPropertiesSet();
itemReader.setSaveState(true);
}
catch (Exception e) {
System.out.println("BatchConfiguration.reader() ==> error " + e.getMessage());
}
return itemReader;
}
And I launch the job using a CommandLineRunner:
@Autowired
JobLauncher jobLauncher;
@Autowired
@Qualifier("dataTransferJob")
Job dataJob;
JobParametersBuilder paramsBuilder = new JobParametersBuilder();
paramsBuilder.addString("date", LocalDateTime.now().toString());
JobExecution jobExecution=jobLauncher.run(dataJob, paramsBuilder.toJobParameters());
In Spring Batch, a job instance is identified by the (identifying) job parameters. Please check the "The Domain Language of Batch" section to understand the difference between the Job, JobInstance and JobExecution concepts and how parameters are used to identify job instances.
I tried to stop the batch in the middle of processing and run it again, but it always executes a new job.
In your case, since you are adding the current time as a job parameter on each run here:
JobParametersBuilder paramsBuilder = new JobParametersBuilder();
paramsBuilder.addString("date", LocalDateTime.now().toString());
you end up with a different job instance each time. If you want to start the same job instance again, you need to pass the same timestamp of the first attempt as a job parameter.
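The identity rule can be sketched in plain Java. This is a simplified model, not how Spring Batch is implemented: JobInstanceRegistrySketch and instanceIdFor are hypothetical names, and the map here only stands in for the lookup that the real framework performs against its job repository.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Simplified model (not Spring Batch code): a job instance is looked up by
// job name plus its identifying parameters.
class JobInstanceRegistrySketch {
    private final Map<String, Long> instances = new HashMap<>();
    private long nextId = 1;

    // Returns the existing instance id for the same name and parameters,
    // or registers a new instance when the combination was not seen before.
    long instanceIdFor(String jobName, Map<String, String> params) {
        // Sorting the parameters makes the key independent of insertion order.
        String key = jobName + new TreeMap<>(params);
        return instances.computeIfAbsent(key, k -> nextId++);
    }

    public static void main(String[] args) {
        JobInstanceRegistrySketch registry = new JobInstanceRegistrySketch();
        long first = registry.instanceIdFor("data-transfer-job",
                Collections.singletonMap("date", "2020-01-01T10:00:00"));
        long retry = registry.instanceIdFor("data-transfer-job",
                Collections.singletonMap("date", "2020-01-01T10:00:00"));
        long other = registry.instanceIdFor("data-transfer-job",
                Collections.singletonMap("date", "2020-01-01T10:05:00"));

        System.out.println(first == retry); // true: same parameters, same instance
        System.out.println(first == other); // false: a new timestamp means a new instance
    }
}
```

With LocalDateTime.now() as the parameter value, every launch falls into the second case, which is why a restart never happens.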
I want to use the Spring Batch (v3.0.9) restart functionality so that, when a JobInstance is restarted, the processing step reads from the last failed chunk onward. My restart works fine as long as I don't put the @StepScope annotation on my myBatisPagingItemReader bean method.
I was using @StepScope so that I can use late binding to get the JobParameters in my myBatisPagingItemReader bean method: @Value("#{jobParameters['run-date']}").
If I use the @StepScope annotation on the myBatisPagingItemReader() bean method, the restart does not work because it creates a new instance (scope=step, name=scopedTarget.myBatisPagingItemReader).
If I use step scope, is it possible for my myBatisPagingItemReader to set the read.count from the last failure to get the restart working?
I have explained this issue with an example below.
@Configuration
@EnableBatchProcessing
public class BatchConfig {
@Bean
public Step step1(StepBuilderFactory stepBuilderFactory,
ItemReader<Model> myBatisPagingItemReader,
ItemProcessor<Model, Model> itemProcessor,
ItemWriter<Model> itemWriter) {
return stepBuilderFactory.get("data-load")
.<Model, Model>chunk(10)
.reader(myBatisPagingItemReader)
.processor(itemProcessor)
.writer(itemWriter)
.listener(itemReadListener())
.listener(new JobParameterExecutionContextCopyListener())
.build();
}
@Bean
public Job job(JobBuilderFactory jobBuilderFactory, @Qualifier("step1")
Step step1) {
return jobBuilderFactory.get("load-job")
.incrementer(new RunIdIncrementer())
.start(step1)
.listener(jobExecutionListener())
.build();
}
@Bean
@StepScope
public ItemReader<Model> myBatisPagingItemReader(
SqlSessionFactory sqlSessionFactory,
@Value("#{jobParameters['run-date']}") String runDate)
{
MyBatisPagingItemReader<Model> reader = new
MyBatisPagingItemReader<>();
Map<String, Object> parameterValues = new HashMap<>();
parameterValues.put("runDate", runDate);
reader.setSqlSessionFactory(sqlSessionFactory);
reader.setParameterValues(parameterValues);
reader.setQueryId("query");
return reader;
}
}
Restart example: when I use the @StepScope annotation on myBatisPagingItemReader(), the reader fetches 5 records and the chunk size (commit-interval) is set to 3.
Job Instance - 01 - Job Parameter - 01/02/2019.
chunk-1:
- process record-1
- process record-2
- process record-3
writer - writes all 3 records
chunk-1 commit successful
chunk-2:
process record-4
process record-5 - throws an exception
Job completes and is set to 'FAILED' status
Now the job is restarted using the same job parameter.
Job Instance - 01 - Job Parameter - 01/02/2019.
chunk-1:
process record-1
process record-2
process record-3
writer - writes all 3 records
chunk-1 commit successful
chunk-2:
process record-4
process record-5 - throws an exception
Job completes and is set to 'FAILED' status
The @StepScope annotation on the myBatisPagingItemReader() bean method creates a new instance; see the log messages below.
Creating object in scope=step, name=scopedTarget.myBatisPagingItemReader
Registered destruction callback in scope=step, name=scopedTarget.myBatisPagingItemReader
Because it is a new instance, it starts processing from the beginning instead of starting from chunk-2.
If I don't use @StepScope, it restarts from chunk-2, because the restarted job step sets MyBatisPagingItemReader.read.count=3.
The issue here is that you are returning an ItemReader instead of the concrete class (MyBatisPagingItemReader) or at least ItemStreamReader. When you use Spring Batch's step scope, we create a proxy to allow for late initialization. The proxy is based on the return type of the method (ItemReader in your case). Because the proxy is of type ItemReader, Spring Batch does not know that your bean also implements ItemStream, and it is that interface that enables restartability. By default, Spring Batch automatically registers all beans of type ItemStream for you (you can also explicitly register the beans yourself, but that is typically not needed).
To address your issue, the following should work (note the change in the return type):
@Bean
@StepScope
public MyBatisPagingItemReader<Model> myBatisPagingItemReader(
SqlSessionFactory sqlSessionFactory,
@Value("#{jobParameters['run-date']}") String runDate) {
MyBatisPagingItemReader<Model> reader =
new MyBatisPagingItemReader<>();
Map<String, Object> parameterValues = new HashMap<>();
parameterValues.put("runDate", runDate);
reader.setSqlSessionFactory(sqlSessionFactory);
reader.setParameterValues(parameterValues);
reader.setQueryId("query");
return reader;
}
This is why I recommend that, where possible, @Bean-annotated methods return the most concrete type available, so that Spring can help as much as possible.
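As a rough illustration of why the declared type matters, the effect can be reproduced with a plain JDK dynamic proxy. This is an analogy only: SketchReader, SketchStream, and PagingReader are hypothetical stand-ins for ItemReader, ItemStream, and MyBatisPagingItemReader, and Spring creates its scoped proxies with its own machinery. The point is that a proxy built from a narrow type does not expose the other interfaces of its target.

```java
import java.lang.reflect.Proxy;

class ProxyTypeSketch {

    // Hypothetical stand-ins for ItemReader and ItemStream.
    interface SketchReader { String read(); }
    interface SketchStream { void open(); }

    // The concrete reader implements both, like a restartable item reader.
    static class PagingReader implements SketchReader, SketchStream {
        public String read() { return "item"; }
        public void open() { /* a real reader would restore its position here */ }
    }

    // Builds a proxy that exposes only the given interfaces of the target.
    static Object proxyFor(Object target, Class<?>... exposed) {
        return Proxy.newProxyInstance(
                ProxyTypeSketch.class.getClassLoader(),
                exposed,
                (proxy, method, methodArgs) -> method.invoke(target, methodArgs));
    }

    public static void main(String[] args) {
        PagingReader target = new PagingReader();

        Object narrow = proxyFor(target, SketchReader.class);
        Object wide = proxyFor(target, SketchReader.class, SketchStream.class);

        System.out.println(narrow instanceof SketchStream); // false: the stream side is hidden
        System.out.println(wide instanceof SketchStream);   // true: lifecycle callbacks are visible
    }
}
```

Declaring the bean method with the concrete return type corresponds to the second case, where the caller can see that the object also takes part in the open/update/close lifecycle.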
Our Spring Batch job has a single Step with an ItemReader, ItemProcessor, and ItemWriter. We are running the same job concurrently with different parameters. The ItemReader is stateful as it contains an input stream that it reads from.
So, we don't want the same instance of the ItemReader to be used for every JobInstance (Job + Parameters) invocation.
I am not quite sure which is the best "scoping" for this situation.
1) Should the Step be annotated with #JobScope and ItemReader be a prototype?
OR
2) Should the Step be annotated with #StepScope and ItemReader be a prototype?
OR
3) Should both the Step and ItemReader be annotated as Prototype?
The end result should be such that a new ItemReader is created for every new execution of the Job with different identifying parameters (ie, for every new JobInstance).
Thanks.
-AP_
Here's how it goes from a class instantiation standpoint (from least to most instances):
Singleton (per JVM)
JobScope (per job)
StepScope (per step)
Prototype (per reference)
If you have multiple jobs running in a single JVM (assuming you aren't in a partitioned step), JobScope will be sufficient. If you have a partitioned step, you'll want StepScope. Prototype would be overkill in all scenarios.
However, if these jobs are launching in different JVMs (and not a partitioned step), then a simple Singleton bean will be just fine.
Not every component (Step, ItemReader, ItemProcessor, ItemWriter) has to be a Spring bean. For instance, with the Spring Batch Java API, only your Job needs to be a Spring bean, but not your Steps, Readers, and Writers:
@Autowired
private JobBuilderFactory jobs;
@Autowired
private StepBuilderFactory steps;
@Bean
public Job job() throws Exception {
return this.jobs.get(JOB_NAME) // create jobbuilder
.start(step1(JOB_NAME)) // add step 1
.next(step2(JOB_NAME)) // add step 2
.build(); // create job
}
private Step step1(String jobName) throws Exception {
return steps.get(jobName + "_Step_1").chunk(10) //
.faultTolerant() //
.reader(() -> null) // you could use lambdas
.writer(items -> {
}) //
.build();
}
private Step step2(String jobName) throws Exception {
return steps.get(jobName + "_Step_2").chunk(10) //
.faultTolerant() //
.reader(createDbItemReader(ds, sqlString, rowmapper)) //
.writer(createFileWriter(resource, aggregator)) //
.build();
}
The only thing you have to pay attention to is that you have to call the afterPropertiesSet() methods yourself when creating instances like JdbcCursorItemReader or FlatFileItemReader/Writer:
private static <T> ItemReader<T> createDbItemReader(DataSource ds, String sql, RowMapper<T> rowMapper) throws Exception {
JdbcCursorItemReader<T> reader = new JdbcCursorItemReader<>();
reader.setDataSource(ds);
reader.setSql(sql);
reader.setRowMapper(rowMapper);
reader.afterPropertiesSet(); // don't forget
return reader;
}
private static <T> ItemWriter<T> createFileWriter(Resource target, LineAggregator<T> aggregator) throws Exception {
FlatFileItemWriter<T> writer = new FlatFileItemWriter<>();
writer.setEncoding("UTF-8");
writer.setResource(target);
writer.setLineAggregator(aggregator);
writer.afterPropertiesSet(); // don't forget
return writer;
}
This way, there is no need to fiddle with scopes. Every Job will have its own instances of its Steps and their Readers and Writers.
Another advantage of this approach is that you can now create your jobs completely dynamically.
I have a Spring Batch (2.2.2) application and for some reason cannot make the job parameters incrementer work. The job is declared like this:
<job id="job1" xmlns="http://www.springframework.org/schema/batch" incrementer="incrementer">
<step id="step1" parent="step" />
</job>
<bean id="incrementer" class="org.springframework.batch.core.launch.support.RunIdIncrementer" />
When I put a breakpoint into the incrementer it is not even called.
Calling the job two times with the same parameter gives the following exception :
A job instance already exists and is complete for parameters={fail=false}. If you want to run this job again, change the parameters.
I checked the official samples here
https://github.com/spring-projects/spring-batch-admin/tree/master/spring-batch-admin-sample
and it has the same problem
Old question, but still applies to current version (3.0.5):
If you are starting the job execution via
JobExecution jobExecution = launcher.run(job, jobParameters);
using e.g. the SimpleJobLauncher class, then the incrementer is never called. If you check the callers of the method JobParametersIncrementer.getNext(JobParameters), the number of callers is limited:
org.springframework.batch.core.launch.support.CommandLineJobRunner
CommandLineJobRunner does the call to getNext() conditionally on "-next" before calling the launcher:
if (opts.contains("-next")) {
JobParameters nextParameters = getNextJobParameters(job);
Map<String, JobParameter> map = new HashMap<String, JobParameter>(nextParameters.getParameters());
map.putAll(jobParameters.getParameters());
jobParameters = new JobParameters(map);
}
JobExecution jobExecution = launcher.run(job, jobParameters);
org.springframework.batch.core.launch.support.SimpleJobOperator
This is used by the Spring Batch Admin web application and it is basically the same implementation as in CommandLineJobRunner:
if (lastInstances.isEmpty()) {
parameters = incrementer.getNext(new JobParameters());
if (parameters == null) {
throw new JobParametersNotFoundException("No bootstrap parameters found for job=" + jobName);
}
}
else {
List<JobExecution> lastExecutions = jobExplorer.getJobExecutions(lastInstances.get(0));
parameters = incrementer.getNext(lastExecutions.get(0).getJobParameters());
}
logger.info(String.format("Attempting to launch job with name=%s and parameters=%s", jobName, parameters));
try {
return jobLauncher.run(job, parameters).getId();
}
catch (JobExecutionAlreadyRunningException e) {
throw new UnexpectedJobExecutionException(String.format(ILLEGAL_STATE_MSG, "job already running", jobName,
parameters), e);
}
So if you are using the JobLauncher class for starting jobs, you must take care of calling the incrementer yourself before calling the jobLauncher, to enhance the job parameters with your desired values.
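The "increment first, then launch" order can be sketched in plain Java. This is a simplified model rather than Spring Batch code: parameters are plain maps, and ManualIncrementSketch, prepare, and bumpRunId are hypothetical names. The merge order follows the -next branch of CommandLineJobRunner quoted above, where explicitly supplied parameters are put on top of the incremented ones.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Simplified model (not Spring Batch code) of "increment first, then launch".
class ManualIncrementSketch {

    // Hypothetical incrementer: copies the parameters and bumps run.id.
    static Map<String, Object> bumpRunId(Map<String, Object> previous) {
        Map<String, Object> next = new HashMap<>(previous);
        long last = previous.containsKey("run.id") ? (Long) previous.get("run.id") : 0L;
        next.put("run.id", last + 1);
        return next;
    }

    // Applies the incrementer to the last run's parameters, then lets the
    // explicitly supplied parameters override the incremented ones.
    static Map<String, Object> prepare(Map<String, Object> lastRun,
                                       Map<String, Object> supplied,
                                       UnaryOperator<Map<String, Object>> incrementer) {
        Map<String, Object> merged = new HashMap<>(incrementer.apply(lastRun));
        merged.putAll(supplied);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> lastRun = new HashMap<>();
        lastRun.put("run.id", 1L);
        lastRun.put("fail", "false");

        Map<String, Object> supplied = new HashMap<>();
        supplied.put("fail", "true");

        Map<String, Object> params = prepare(lastRun, supplied, ManualIncrementSketch::bumpRunId);
        System.out.println(params.get("run.id")); // 2
        System.out.println(params.get("fail"));   // true
        // In real code, this is the point where launcher.run(job, jobParameters) would be called.
    }
}
```

Because run.id changes on every call, the resulting parameters differ from those of the completed instance, which avoids the "job instance already exists and is complete" error.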
I see that nobody gave you a correct answer, so here it is (even if it is one year late, maybe it will help others):
You can directly call the main function of the CommandLineJobRunner class.
For example:
String[] args = new String[]{"spring/batch/jobs/helloWorldJob.xml", "helloWorldJob", "name(string)=Spring", "-next"};
CommandLineJobRunner.main(args);
Basically this is what you are doing when you are running Spring Batch from the command line (or any other java application in fact).
If you run with CommandLineJobRunner, use the -next option; otherwise, the best solution is to use a timestamp job parameter to make every job instance different from the others.
We read most of our data from a DB. Sometimes the result-set is empty, and for that case we want the job to stop immediately, and not hand over to a writer. We don't want to create a file, if there is no input.
Currently we achieve this goal with a Step-Listener that returns a certain String, which is the input for a transition to either the next business-step or a delete-step, which deletes the file we created before (the file contains no real data).
Is there a way to make the job end as soon as the reader realizes that there is no input?
New edit (more elegant way)
This approach elegantly moves to the next step or ends the batch application when the file is not found, and prevents unwanted steps (and their listeners) from executing.
-> Check for the presence of the file in a tasklet, say FileValidatorTasklet.
-> When the file is not found, set some exit status (enum or final string); here we have set EXIT_CODE.
Sample tasklet:
public class FileValidatorTasklet implements Tasklet {
static final String EXIT_CODE = "SOME_EXIT_CODE";
static final String EXIT_DESC = "SOME_EXIT_DESC";
@Override
public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
boolean isFileFound = false;
//do file check and set isFileFound
if(!isFileFound){
stepContribution.setExitStatus(new ExitStatus(EXIT_CODE, EXIT_DESC));
}
return RepeatStatus.FINISHED;
}
}
-> In the job configuration of this application, after executing FileValidatorTasklet, check for the presence of the EXIT_CODE.
-> Provide the subsequent path for this job if the code is found, otherwise the normal flow of the job. (Here we simply terminate the job if the EXIT_CODE is found, and otherwise continue with the next steps.)
Sample config:
public Job myJob(JobBuilderFactory jobs) {
return jobs.get("offersLoaderJob")
.start(fileValidatorStep).on(EXIT_CODE).end() // if EXIT_CODE is found , then end the job
.from(fileValidatorStep) // else continue the job from here, after this step
.next(step2)
.next(finalStep)
.end()
.build();
}
Here we have taken advantage of the conditional step flow in Spring Batch.
We have to define two separate paths from step A. The flow is like A->B->C or A->D->E.
Old answer:
I have been through this, so I am sharing my approach. It's better to
throw new RuntimeException("msg");
It will begin shutting down the Spring application rather than terminating exactly at that point. All methods like close() in the reader/writer will be called, and the destroy methods of all beans will be called.
Note: when executing this in a listener, remember that by this point all the beans have been initialized and the code in their initialization (like afterPropertiesSet()) has executed.
I think the above is the correct way, but if you want to terminate at exactly that point, you can try
System.exit(1);
It would likely be cleaner to use a JobExecutionDecider and based on the read count from the StepExecution set a new FlowExecutionStatus and route it to the end of the job.
Joshua's answer addresses the stopping of the job instead of transitioning to the next business step.
Your file writer might still create the file unnecessarily. You can create something like a LazyItemWriter with a delegate (FlatFileItemWriter) that calls delegate.open() only once, on the first call to the write method. Of course, delegate.close() must then be called only if the delegate was previously opened. This makes sure that no empty file is created, so deleting it is no longer a concern.
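A minimal sketch of that lazy-delegate idea, in plain Java so it runs without Spring Batch on the classpath. SimpleWriter is a hypothetical interface standing in for a writer with open/write/close callbacks, and LazyWriter and RecordingWriter are illustrative names; a real version would wrap FlatFileItemWriter and implement the Spring Batch interfaces instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LazyWriterSketch {

    // Hypothetical minimal writer contract with lifecycle callbacks.
    interface SimpleWriter<T> {
        void open();
        void write(List<T> items);
        void close();
    }

    // Opens the delegate on the first write and closes it only if it was opened.
    static class LazyWriter<T> implements SimpleWriter<T> {
        private final SimpleWriter<T> delegate;
        private boolean opened;

        LazyWriter(SimpleWriter<T> delegate) { this.delegate = delegate; }

        @Override public void open() { /* deliberately deferred until the first write */ }

        @Override public void write(List<T> items) {
            if (!opened) {
                delegate.open();
                opened = true;
            }
            delegate.write(items);
        }

        @Override public void close() {
            if (opened) {
                delegate.close();
                opened = false;
            }
        }
    }

    // Records lifecycle calls so the behaviour can be observed.
    static class RecordingWriter implements SimpleWriter<String> {
        final List<String> calls = new ArrayList<>();
        @Override public void open() { calls.add("open"); }
        @Override public void write(List<String> items) { calls.add("write" + items); }
        @Override public void close() { calls.add("close"); }
    }

    public static void main(String[] args) {
        RecordingWriter untouched = new RecordingWriter();
        LazyWriter<String> noInput = new LazyWriter<>(untouched);
        noInput.open();
        noInput.close();
        System.out.println(untouched.calls); // []

        RecordingWriter used = new RecordingWriter();
        LazyWriter<String> withInput = new LazyWriter<>(used);
        withInput.open();
        withInput.write(Arrays.asList("a", "b"));
        withInput.close();
        System.out.println(used.calls); // [open, write[a, b], close]
    }
}
```

In the first run the delegate is never opened, which is the case where a file-based delegate would not create an empty file.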
I have the same question as the OP. I am using all annotations, and if the reader returns as null when no results (in my case a File) are found, then the Job bean will fail to be initialized with an UnsatisfiedDependencyException, and that exception is thrown to stdout.
If I create a Reader and then return it without a File specified, then the Job will be created. After that, an ItemStreamException is thrown, but it goes to my log, because at that point I am past the Job autowiring and inside the Step. That seems preferable, at least for what I am doing.
Any other solution would be appreciated.
NiksVij's answer works for me. I implemented it like this:
@Component
public class FileValidatorTasklet implements Tasklet {
private final ImportProperties importProperties;
@Autowired
public FileValidatorTasklet(ImportProperties importProperties) {
this.importProperties = importProperties;
}
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
String folderPath = importProperties.getPathInput();
String itemName = importProperties.getItemName();
File currentItem = new File(folderPath + File.separator + itemName);
if (currentItem.exists()) {
contribution.setExitStatus(new ExitStatus("FILE_FOUND", "FILE_FOUND"));
} else {
contribution.setExitStatus(new ExitStatus("NO_FILE_FOUND", "NO_FILE_FOUND"));
}
return RepeatStatus.FINISHED;
}
}
and in the batch configuration:
@Bean
public Step fileValidatorStep() {
return this.stepBuilderFactory.get("step1")
.tasklet(fileValidatorTasklet)
.build();
}
@Bean
public Job tdZuHostJob() throws Exception {
return jobBuilderFactory.get("tdZuHostJob")
.incrementer(new RunIdIncrementer())
.listener(jobCompletionNotificationListener)
.start(fileValidatorStep()).on("NO_FILE_FOUND").end()
.from(fileValidatorStep()).on("FILE_FOUND").to(testStep()).end()
.build();
}