How to run three Spring Batch jobs sequentially
I created three batch jobs and one cron trigger expression. I need to run the three jobs sequentially: when the first batch job completes successfully, the second should run, and then the third.
For example: the 1st batch job gives input to the 2nd, the 2nd batch job gives input to the 3rd, and the 3rd then processes it.
For sequential execution see §5.3.1 Sequential Flow:
<job id="job">
    <step id="stepA" parent="s1" next="stepB"/>
    <step id="stepB" parent="s2" next="stepC"/>
    <step id="stepC" parent="s3"/>
</job>
The output of each step should be a file or DB records. In that case Spring Batch can restart the job execution from the interrupted (failed) step (see also §5.1.4.2 Restarting a completed step). Inter-step communication is possible via the ExecutionContext (see StepExecution and StepExecutionListener).
If you don't want to dump intermediate results, you don't need sequential step execution: you simply feed the output of one processor to another.
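As a rough illustration of the sequential flow described above, here is a plain-Java sketch (deliberately not the Spring Batch API). Each step reads its input from a shared context map and writes its output back to it, and the chain stops at the first step that fails. The names SequentialFlowSketch, Step, and the context keys are made up for this example; in Spring Batch the corresponding pieces would be the job's steps and the ExecutionContext.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a sequential flow; all names here are hypothetical.
class SequentialFlowSketch {

    // A step reads from and writes to the shared context and reports success.
    interface Step {
        boolean execute(Map<String, Object> context);
    }

    // LinkedHashMap keeps the steps in the order they were added.
    private final Map<String, Step> steps = new LinkedHashMap<>();

    SequentialFlowSketch add(String id, Step step) {
        steps.put(id, step);
        return this;
    }

    // Runs the steps in order and returns the ids of the steps that completed.
    // Execution stops at the first step that returns false or throws.
    List<String> run(Map<String, Object> context) {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, Step> entry : steps.entrySet()) {
            boolean ok;
            try {
                ok = entry.getValue().execute(context);
            } catch (RuntimeException e) {
                ok = false;
            }
            if (!ok) {
                break;
            }
            completed.add(entry.getKey());
        }
        return completed;
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        List<String> completed = new SequentialFlowSketch()
                .add("stepA", c -> { c.put("a.out", 2); return true; })
                .add("stepB", c -> { c.put("b.out", (Integer) c.get("a.out") * 10); return true; })
                .add("stepC", c -> { c.put("c.out", (Integer) c.get("b.out") + 1); return true; })
                .run(context);
        System.out.println(completed + " -> " + context.get("c.out"));
    }
}
```

The list returned by run plays the role of the persisted step statuses: a restart could skip the ids already in it, which is roughly what Spring Batch does with its own metadata.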
Related
I want to create a chain process. When a parent job ends, a child job (or multiple child jobs) starts. The main problem is that I want only the pods of the current job to be deployed.
Something like: job 1 starts --> job 1 finishes --> job 2 starts --> job 2 finishes --> job 3 starts --> job 3 finishes.
How can I do this? I thought that I could create job 1, job 2, and job 3 in a disabled state at the same time and enable each one when its parent job ends (maybe with a service?).
I recently read about a tool called argo-workflow but I am not sure if it will do the chain effect that I am searching for.
Yes. A CI/CD solution like Argo Workflows is the way to go. You can check this example of how to execute different tasks one by one using steps here.
Why is Rundeck not launching scheduled Spark jobs while the previous job is still executing?
Rundeck skips the jobs that are scheduled to launch while the previous job is executing, and only after that execution completes does it launch a new job based on the schedule.
But I want to launch a scheduled job even if the previous job is executing.
Check your workflow strategy. There is an explanation of it here:
https://www.rundeck.com/blog/howto-controlling-how-and-where-rundeck-jobs-execute
You can design a workflow strategy based on "Parallel" to launch the jobs simultaneously on your node.
Example using the parallel strategy with a parent job.
Example jobs:
Job one, Job two and Parent Job (using parallel strategy).
I have been working with Spring Batch for a few months, but a question came up a few days ago. I have to process a file and then update a database from it, but this is not a scheduled batch process because it has to be executed just once.
Is Spring Batch recommended for unscheduled processes like this one? Or does the fact that it is not scheduled have nothing to do with whether to use Spring Batch?
Thanks
Yes, the fact that your job has to be executed only once has nothing to do with using Spring Batch or not. There is a difference between developing the job (using Spring Batch or not) and scheduling the job (using cron, quartz, etc).
For your use case (process a file and then update a database), I would recommend using Spring Batch to develop your job. Then, you can choose to run it:
only once or on demand (Spring Batch provides APIs to run the job)
or schedule it to run repeatedly using your favourite scheduler
We have a bunch of Spring Batch jobs and we need to invoke them in a specific order. Is there a best practice we should follow? I was thinking of using AutoSys or a cron scheduler that checks the status of each job and decides whether to invoke the next one, but I am open to other suggestions.
The approach sounds right, though it's harder to build something like this in cron. Scheduler tools like AutoSys or Control-M usually provide orchestration out of the box.
I have used cron to schedule Spring Batch jobs. I had to schedule about 3 main jobs, 6 jobs in all.
I had the same scenario, where the next job depends on the first.
In that case you can use the Spring Batch tables to check whether the previous job is COMPLETED or not.
You will find the batch tables details here - http://docs.spring.io/spring-batch/reference/html/metaDataSchema.html
The tables are -
BATCH_JOB_INSTANCE
BATCH_JOB_EXECUTION
BATCH_JOB_EXECUTION_CONTEXT
BATCH_STEP_EXECUTION
BATCH_STEP_EXECUTION_CONTEXT
With these it is easy to schedule the jobs from cron, but managing jobs in cron is somewhat of a pain.
To use a scheduler tool you need to configure it, and that takes a good amount of time. But once the scheduler tool is up, it is easy to schedule and manage jobs.
In most cases scheduling is a one-time activity, so I guess it is better not to waste time on a scheduler tool and to go for cron instead.
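The metadata-table check mentioned above could look roughly like the following sketch, which a cron-launched wrapper might call before starting the next job. It assumes the default BATCH_ table prefix and a plain JDBC connection to the job repository database; the class and method names (PreviousJobGate, latestStatus, mayLaunchNext) are invented for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper; only the table and column names come from the Spring Batch schema.
class PreviousJobGate {

    // Statuses of a job's executions, newest first (assumes the default BATCH_ prefix).
    static final String LATEST_STATUS_SQL =
            "SELECT e.STATUS FROM BATCH_JOB_EXECUTION e "
          + "JOIN BATCH_JOB_INSTANCE i ON e.JOB_INSTANCE_ID = i.JOB_INSTANCE_ID "
          + "WHERE i.JOB_NAME = ? ORDER BY e.CREATE_TIME DESC";

    // Returns the most recent status for jobName, or null if the job has never run.
    static String latestStatus(Connection connection, String jobName) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(LATEST_STATUS_SQL)) {
            ps.setString(1, jobName);
            ps.setMaxRows(1); // only the newest execution is needed
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }

    // The next job may start only if the previous one ended with COMPLETED.
    static boolean mayLaunchNext(String previousStatus) {
        return "COMPLETED".equals(previousStatus);
    }
}
```

A wrapper would call mayLaunchNext(latestStatus(connection, "firstJob")) and exit without launching anything when it returns false; "firstJob" here stands for whatever name the preceding job was registered under.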
Requirement:
The manager doesn’t want multiple instances of the job in the job instance and job execution tables. He wants just one instance, though multiple executions are fine.
Implications:
The job cannot end with a batch status of COMPLETED since such an instance can never be restarted.
My approach:
I try to end the job with a batch status of STOPPED so that the next run of the job runs the same instance that was run previously. (Note that if the job fails then there’s no issue, since a failed instance can be re-run.) I plan to have no parameter for the job so that the default instance created on each run matches the instance that already exists in the database (job instance table). In this way, I don’t have to worry about passing any parameter to the job when restarting, since there’s just one instance and it has no differentiating parameters.
Issues:
If all steps complete and BATCH_STEP_EXECUTION gets updated to COMPLETED for all such steps, I cannot rerun this same job instance anymore, even if I manage to have the job execution end with a status of STOPPED. I get the message: All steps already completed or no steps configured for this job.
I know what the message means but I’m trying to have all steps end in a status other than completed so that whenever I re-run my single instance, those steps can be re-run as well rather than getting that above message.
I am aware that a step execution status can be used to derive the job execution status. For example, if a step ends in STOPPED, you can use that status to instruct the job to end in STOPPED as well through the <end>, <fail>, or <stop> elements. This is not what I am looking for, as I already know how to work with them, unless there’s a way to use them such that they affect what’s updated in the BATCH_STEP_EXECUTION table.
In that table, I want the steps to end with STOPPED if they are successful rather than with COMPLETED. Any ideas? Is it possible to achieve this in the first place?
To set the step execution's exit status:
// ExitStatus here is org.springframework.batch.core.ExitStatus
StepExecution stepExecution = StepSynchronizationManager.getContext().getStepExecution();
stepExecution.setExitStatus(new ExitStatus("XYZ"));
An example of 'end on' in a step:
<batch:step id="step1">
    <batch:tasklet ref="aaaTasklet"/>
    <batch:end on="END1"/>
    <batch:end on="XYZ"/>
    <batch:next on="*" to="step2"/>
    <batch:listeners>
        <batch:listener ref="aaaTaskletListener"/>
    </batch:listeners>
</batch:step>
<batch:step id="step2">
    .
    .
    .
</batch:step>
Batch schema:
http://www.springframework.org/schema/batch/spring-batch-2.2.xsd
In it, search for:
<xsd:element name="end">
Jobs (like steps) must complete so that Spring Batch can correctly manage the job's lifecycle.
The way to allow running the same job multiple times is to use an extra job parameter that makes every job instance different from the others; you can use a JobParametersIncrementer or just add a new parameter like startDate = new Date().
IMO your way should be avoided.
When getting the "All steps already completed or no steps configured for this job." it might be that you are running the same job with exactly the same params.
To solve this easily you can just add a timestamp to the JobParams like this:
@Test
public void importFilesTest() throws JobParametersInvalidException, JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException {
    // A fresh timestamp makes the parameters, and so the job instance, unique per run
    jobLauncher.run(job, new JobParametersBuilder()
            .addLong("timestamp", System.currentTimeMillis())
            .toJobParameters());
}
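To see why an extra parameter helps, here is a toy model in plain Java (not the Spring Batch API) of how a job instance is identified by the job name plus its parameters, and how an instance that already completed is refused on relaunch. JobInstanceModel and its methods are hypothetical names for this sketch; in Spring Batch itself the refusal surfaces as the JobInstanceAlreadyCompleteException declared in the test above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of job-instance identity; all names are hypothetical.
class JobInstanceModel {

    // Last recorded status per instance key.
    private final Map<String, String> lastStatus = new HashMap<>();

    // Same job name and same parameters always give the same key (sorted for stability).
    static String instanceKey(String jobName, Map<String, Object> params) {
        return jobName + new TreeMap<String, Object>(params);
    }

    // Records a successful run, or throws if this exact instance already completed.
    String launch(String jobName, Map<String, Object> params) {
        String key = instanceKey(jobName, params);
        if ("COMPLETED".equals(lastStatus.get(key))) {
            throw new IllegalStateException("Instance already complete: " + key);
        }
        lastStatus.put(key, "COMPLETED");
        return key;
    }
}
```

In this model, launching "importFiles" twice with no parameters fails on the second call, while adding a different timestamp value to each call yields two distinct keys and both runs go through.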
I had the same issue. I wanted to restart the job on failure. I used a decider to make the flow decision in the batch job configuration file.
When I intentionally failed/stopped the job and reran it with corrected data, the job jumped directly to the decider without rerunning the step that I wanted to rerun in case of failure. I researched it and came to the conclusion that you cannot rerun the step with the same job parameters, even if the job failed or was stopped, because in the batch job tables the status of the step is COMPLETED. As per the Spring Batch lifecycle, the job restarts from the point at which it failed. In my case that point was the decider, so the job ran from the decider onward to correctly manage the batch job lifecycle.
Hope this is helpful for you.