Spring Batch - Executing multiple instances of a job at same time - spring-batch

I need a clarification.
Is it possible to run multiple instances of a job at the same time?
Currently, we have a single instance of a job running at any given time.
If it is possible, please let me know how to do it.

Yes, you can. Spring Batch distinguishes job instances by their identifying JobParameters. So if you always pass different JobParameters to the same job, you will have multiple instances of the same job running.
A simple way is to add a UUID parameter to each request to start a job.
Example:
final JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
jobParametersBuilder.addString("instance_id", UUID.randomUUID().toString(), true);
jobLauncher.run(job, jobParametersBuilder.toJobParameters());
The boolean 'true' at the end signals to Spring Batch that the parameter is part of the 'identity' of the job instance, so you will always get a new instance with each 'run' of the job.

Yes, you can run tasks in parallel; this is also covered in the Spring Batch documentation.
But there are certain things to consider.
Does your application logic need parallel execution? If you are going to run steps in parallel, you have to build the application logic so that the work done by the parallel steps does not overlap (unless overlapping is what your application intends).
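One way to keep parallel work from overlapping is to give each worker its own slice of the input. The following is only a plain-Java sketch of that idea, not Spring Batch code; the class name RangePartitioner and its split method are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spring Batch): divides the item indexes
// [0, total) into contiguous, non-overlapping [start, end) ranges.
final class RangePartitioner {

    static List<long[]> split(long total, int parts) {
        if (total < 0 || parts <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and parts must be > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        long base = total / parts;   // minimum size of every range
        long extra = total % parts;  // the first 'extra' ranges get one more item
        long start = 0;
        for (int i = 0; i < parts; i++) {
            long size = base + (i < extra ? 1 : 0);
            ranges.add(new long[] {start, start + size});
            start += size;
        }
        return ranges;
    }
}
```

Each parallel step or worker would then process only the items inside its own range, so no item is handled twice.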

Yes, it's completely possible to have multiple instances (or executions) of a job run concurrently.

Related

How to stop a Spring Batch job no matter what step it's on

I know there's a common pattern to stop a job when certain exceptions are thrown. But we allow our users to stop any job at any time.
I have a number of microservices, each running a different batch job. In the front end, I have a controller method that looks up all running jobs, gets the execution ID, and then uses JobOperator to issue a stop command. But execution appears to continue.
jobOperator.stop(Long.parseLong(jobExecId));
All of the examples I've seen have issued just this command and updated the JobRepository, which I do.
jobExecution.setEndTime(new Date());
jobExecution.setStatus(BatchStatus.ABANDONED);
jobExecution.setExitStatus(ExitStatus.STOPPED);
jobRepository.update(jobExecution);
Is there something more I should be doing?
I suppose calling Thread.currentThread().interrupt() is the right way to proceed.
Spring Batch will detect the interrupt in ThreadStepInterruptionPolicy and stop the job.
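To show what interrupt-based stopping relies on, here is a plain-Java sketch of a worker that checks its interrupt flag between units of work. It is not Spring Batch code, and InterruptibleWorker and runAndStop are hypothetical names; it only illustrates that an interrupt is cooperative, so a running task stops only at the points where it checks the flag.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a long-running step; not a Spring Batch class.
final class InterruptibleWorker implements Runnable {
    private final AtomicInteger processed = new AtomicInteger();
    private volatile boolean stoppedEarly = false;

    @Override
    public void run() {
        // The interrupt flag is checked between units of work.
        while (!Thread.currentThread().isInterrupted()) {
            processed.incrementAndGet();
            try {
                Thread.sleep(1); // simulated work
            } catch (InterruptedException e) {
                // sleep() clears the flag when it throws, so restore it
                // to let the loop condition see the interrupt.
                Thread.currentThread().interrupt();
            }
        }
        stoppedEarly = true;
    }

    int processedCount() { return processed.get(); }

    boolean stoppedEarly() { return stoppedEarly; }

    // Starts a worker thread, interrupts it after runMillis, waits for it to end.
    static InterruptibleWorker runAndStop(long runMillis) {
        InterruptibleWorker worker = new InterruptibleWorker();
        Thread thread = new Thread(worker, "worker");
        thread.start();
        try {
            Thread.sleep(runMillis);
            thread.interrupt(); // must target the worker thread, not the caller
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for worker", e);
        }
        return worker;
    }
}
```

Note that the interrupt has to be delivered to the thread that actually runs the work. Calling Thread.currentThread().interrupt() from a controller thread only flags the controller's own thread.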

Spring batch job is thread-safe?

I need to parallelize a single step of a Spring Batch job. Before the step to be parallelized, tasklets run that put some results in the parameters of the job.
The results produced by the tasklets are needed by the Partitioner and by the items of the step to be parallelized.
I have a doubt that I can't resolve. Since the same job can run multiple times simultaneously with different initial parameters, are the tasklets and step items thread-safe?
No, tasklets and chunk-oriented step components are not thread-safe. If they are shared between multiple job instances/executions running concurrently, you need to make them thread-safe.
You can achieve this by using JobScoped steps and StepScoped readers/writers. You can also use the SynchronizedItemStreamReader and the (upcoming) SynchronizedItemStreamWriter to make readers and writers thread-safe. All item readers and writers provided by Spring Batch state whether they are thread-safe in their Javadoc.
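The idea behind a synchronized reader can be sketched in plain Java as a decorator that serializes calls to a non-thread-safe delegate. The types below (SimpleReader, ListReader, SynchronizedReader, ReaderDemo) are hypothetical and only loosely modelled on the reader contract; they are not the Spring Batch classes themselves.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical minimal reader contract: read() returns null when exhausted.
interface SimpleReader<T> {
    T read();
}

// Not thread-safe: a plain iterator shared by callers.
final class ListReader<T> implements SimpleReader<T> {
    private final Iterator<T> iterator;

    ListReader(List<T> items) { this.iterator = items.iterator(); }

    @Override
    public T read() { return iterator.hasNext() ? iterator.next() : null; }
}

// Decorator that lets only one thread at a time call the delegate.
final class SynchronizedReader<T> implements SimpleReader<T> {
    private final SimpleReader<T> delegate;

    SynchronizedReader(SimpleReader<T> delegate) { this.delegate = delegate; }

    @Override
    public synchronized T read() { return delegate.read(); }
}

final class ReaderDemo {
    // Drains the reader from several threads and returns everything that was read.
    static List<Integer> readAllConcurrently(SimpleReader<Integer> reader, int threads) {
        ConcurrentLinkedQueue<Integer> seen = new ConcurrentLinkedQueue<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                Integer item;
                while ((item = reader.read()) != null) {
                    seen.add(item);
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while joining readers", e);
            }
        }
        return new ArrayList<>(seen);
    }
}
```

With the wrapper in place, each item is handed to exactly one thread. Step or job scoping takes the other route: each execution gets its own component instance, so nothing is shared in the first place.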
You do not want to run multiple instances of the same job. It would be better to run multiple tasks or processes within the same step or job. You might want to look up job partitioning or remote chunking for concurrent processing.
If they have to be isolated jobs, you could have your concurrent jobs write to, say, a message queue in their final (writer) step, and then have another job read from that queue.
https://docs.spring.io/spring-batch/2.1.x/cases/parallel.html

Quartz: Undesired multiple job run in same time

I have some jobs that run on their own schedules, but when, for example, two jobs are due at the same time, Quartz starts one of them two or three times. Has anyone had the same problem? And how can I resolve it?
I am not entirely sure, but there could be a couple of reasons:
1). you are creating a new scheduler instance every time you trigger/schedule a job
2). or you are running the exact same Execute method of a class, or have the same job running every time.
So, when you declare your scheduler, instead of working with some other scheduler instance, use the default scheduler every time, e.g.,
private IScheduler scheduler = StdSchedulerFactory.GetDefaultScheduler();
That is what I have done. I have a lot of notifications/jobs triggered through Quartz.Net and haven't faced any issues yet.
Let me know if any of this helps. Cheers!

Is there any utility to run multiple spring batch jobs programmatically ?

I am invoking Spring Batch jobs based on events. For a specific event I have a couple of jobs that could execute in parallel. Is there any utility class that can execute multiple jobs in parallel? Thanks
We don't offer anything specific for launching multiple jobs based on a single message out of the box with Spring Batch. However, writing a message handler that can handle that scenario should be pretty trivial.
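As a rough sketch of such a handler, the plain-Java class below submits several launches to a thread pool and waits for all of them. ParallelLauncher and launchAll are hypothetical names, and it is an assumption that each Callable would wrap something like a jobLauncher.run(job, parameters) call; the sketch itself uses only the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs every launch on its own pool thread and
// returns the results in the same order as the input list.
final class ParallelLauncher {

    static <T> List<T> launchAll(List<Callable<T>> launches) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, launches.size()));
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<T> future : pool.invokeAll(launches)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while launching jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A job launch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In an event handler, each Callable in the list would represent one job to start for that event. Jobs launched this way share components only if they are wired to share them, so the usual thread-safety concerns still apply.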

SpringBatch: getting the executionid of a completed instance by its JobParameters

My software is choreographing a number of Spring Batch jobs. The output of a job is partially an input for the next job. It may happen that the entire process (the entire job chain) is restarted, even if one or more jobs in the chain have already completed successfully. In this case, when I try to run one of the jobs again with the same parameters, I get a JobInstanceAlreadyCompletedException, as expected. I could skip it and go on to the next job, but I would need to access the context of the completed instance to get the output produced by its steps and pass it on to the next job.
According to the JobExplorer APIs, this is only possible if you have the executionId of the completed instance. I can't get it from the JobInstanceAlreadyCompletedException, and it looks like there are no APIs for getting it from the already used parameter list. Do you know a way to get this executionId given the parameters? Or to get access, in whatever way, to the completed instance's job context?
Why not put all these jobs into one main job and use JobSteps to integrate them? That way, already completed sub-jobs are treated as completed steps and will not be started again. Moreover, all information remains available in the job/step contexts, even after a restart.
Another way would be to save all needed parameters and information into a file and use it to start the next job, instead of depending on the JobExecution info. Your last step could simply be a tasklet that writes an appropriate properties file.
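The file-based handoff could look roughly like the plain-Java sketch below, which uses java.util.Properties. JobHandoff and the key name in the usage note are hypothetical; in a real setup the write would happen in the final tasklet of one job and the read at the start of the next.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper for passing one job's output to the next via a properties file.
final class JobHandoff {

    // Called at the end of the upstream job: persists its output values.
    static void write(Path file, Properties output) {
        try (Writer writer = Files.newBufferedWriter(file)) {
            output.store(writer, "output of upstream job");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called before the downstream job starts: loads the values back.
    static Properties read(Path file) {
        Properties input = new Properties();
        try (Reader reader = Files.newBufferedReader(file)) {
            input.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return input;
    }
}
```

For example, the upstream job might store a value under a key such as "export.path", and the downstream launcher would read it back and turn it into a job parameter. Because the file survives a restart of the chain, the downstream job no longer needs the execution ID of the completed instance.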