Searching for actually running spring batch jobs - spring-batch

I'm trying to find a way to determine whether a JobExecution obtained from JobExplorer is actually running.
The problem is that the explorer finds job executions by running a JDBC query. But in some cases (e.g. the container running the Spring Batch job was killed or restarted) executions remain persisted in the DB in a running state, even though they are orphaned and no processing is actually happening at runtime.
A simple method like
    private boolean isJobRunning(String jobName) {
        var activeExecutions = jobExplorer.findRunningJobExecutions(jobName);
        return activeExecutions
                .stream()
                .anyMatch(jobExecution -> jobExecution.getStatus().isRunning());
    }
doesn't work as expected.
Is there any way to find actually active jobs?
Thanks in advance

Related

Spring Batch fails with Transaction error when triggered by Spring Integration

I have a simple Spring Boot (2.6.4) app that uses Spring Batch and Spring Integration to read data from a file and emit Kafka events. I have configured Spring Integration to poll a configurable directory and trigger the Spring Batch Job as soon as a new file arrives.
As soon as the Job is "kicked" I get this error:
Existing transaction detected in JobRepository. Please fix this and try again (e.g. remove @Transactional annotations from client).
I assume this is triggered by the following Spring Integration configuration:
    return IntegrationFlows.from(fileReadingMessageSource,
            c -> c.poller(Pollers.fixedDelay(period)
                    .taskExecutor(taskExecutor)
                    .maxMessagesPerPoll(maxMessagesPerPoll)
                    .transactionSynchronizationFactory(transactionSynchronizationFactory())
                    .transactional(new PseudoTransactionManager())))
The TransactionSynchronizationFactory is configured to move the incoming file into an error or success directory, based on the outcome of the job.
My understanding is that a Spring Batch Job does not like to be executed within the context of an existing transaction, since it needs to manage its own transactions to deal with failures across multiple steps.
I also understand that I could set the value of setValidateTransactionState of JobRepositoryFactoryBean to false, but I would rather not mess around with the Spring Batch internals.
As suggested elsewhere, I have also tried to use an Async Job Launcher, but no luck.
    @Bean
    public JobLauncher getJobLauncher() {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
        jobLauncher.setJobRepository(this.jobRepository);
        return jobLauncher;
    }
The app is using an in-memory H2 data-source, but that doesn't seem to affect the error.
I'm unable to find a solution to this problem, any recommendation?
Edit - full project here: https://github.com/luciano-fiandesio/spring-batch-and-integration-demo

Spring Batch: How to start a job but not execute it, but execute it in an other java instance

I plan to use Spring Batch. We would like to initiate new job executions in the pods that answer the frontend requests.
Pseudo Code:
    @PostMapping(path = "/request-report/{id}")
    public void requestReport(String id){
        this.jobOperator.start("reportJob", new Properties("1"));
    }
But we don't want the job to be executed in the frontend pod. For that we would like to build a separate microservice pod.
I see the following solutions:
1) Do a REST call from the frontend pod to the spring-batch pod and start the job there. I could do this, but if possible I would like to skip that step and integrate via the Spring Batch DB.
2) In the frontend pod, create a JobLauncher that has a SimpleAsyncTaskExecutor with size zero, so it will never execute a job.
https://docs.spring.io/spring-batch/docs/current/reference/html/job.html#configuringJobLauncher
    @Bean
    public JobLauncher jobLauncher() throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(jobRepository());
        jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }
3) In the frontend pod, do not use the BatchAutoConfiguration but leave out some parts. But which ones?
I think I would also have to write some software that scans the job table, checks whether a job that has not been started is present, and starts it in the spring-batch pod.
Thanks for your help!
If you configure your job launcher with a SimpleAsyncTaskExecutor, it will start jobs in separate threads in the same JVM (hence in the same container and pod).
So unless you provide a custom TaskExecutor that launches jobs in a separate JVM/container/pod, you would need to change your architecture to use a job queue. The section Launching Batch Jobs through Messages from the reference docs shows how to set up such a pattern.
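The job-queue idea can be illustrated with a minimal plain-Java sketch. It is only an illustration under stated assumptions: the in-memory queue stands in for a real message broker or shared table, and JobQueueSketch, JobRequest, requestReport and drain are hypothetical names, not Spring Batch or Spring Integration API. The frontend side only records a request; the worker side (which would live in the separate batch pod) is the only place a job is actually launched.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

public class JobQueueSketch {

    // Hypothetical request object: which job to run and for which report id.
    public static final class JobRequest {
        public final String jobName;
        public final String reportId;

        public JobRequest(String jobName, String reportId) {
            this.jobName = jobName;
            this.reportId = reportId;
        }
    }

    // Assumption: stands in for a broker queue shared between the pods.
    private final Queue<JobRequest> queue = new ConcurrentLinkedQueue<>();

    // Frontend side: enqueue a request and return immediately; nothing runs here.
    public void requestReport(String id) {
        queue.add(new JobRequest("reportJob", id));
    }

    // Worker side: hand every pending request to the launcher callback.
    // Returns the number of requests that were handed over.
    public int drain(Consumer<JobRequest> launcher) {
        int launched = 0;
        JobRequest request;
        while ((request = queue.poll()) != null) {
            launcher.accept(request);
            launched++;
        }
        return launched;
    }
}
```

In a real deployment the launcher callback would be the place where the batch pod starts the job, and the queue would need to survive pod restarts.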

configuring multiple versions of job in spring batch

Spring Batch seems to be lacking metadata for the job definition in the database.
In order to create a job instance in the database, the only things it considers are the job name and the job parameters: "JobInstance createJobInstance(String jobName, JobParameters jobParameters);"
But the object model of Job is rich enough to include steps and listeners. So if I create a new version of an existing job by adding a few additional steps, Spring Batch does not distinguish it from the previous version. Hence, if I ran the previous version today and then run the updated version, Spring Batch does not run the updated version, because it considers the previous run successful. At present, it seems that the version number of the job should be part of the name. Is this understanding correct?
You are correct that the framework identifies each job instance by a unique combination of job name and (identifying) job parameters.
In general, if a job fails, you should be able to re-run with the same parameters to restart the failed instance. However, you cannot restart a completed instance. From the documentation:
JobInstance can be restarted multiple times in case of execution failure and it's lifecycle ends with first successful execution. Trying to execute an existing JobIntance that has already completed successfully will result in error. Error will be raised also for an attempt to restart a failed JobInstance if the Job is not restartable.
So you're right that the same job name and identifying parameters cannot be run to completion more than once. The framework's design prevents this, regardless of which business steps the job performs. Again, ignoring what your job actually does, here's how it would work:
1) jobName=myJob, parm1=foo , parm2=bar -> runs and fails (assume some exception)
2) jobName=myJob, parm1=foo , parm2=bar -> restarts failed instance and completes
3) jobName=myJob, parm1=foo , parm2=bar -> fails on startup (as expected)
4) jobName=myJob, parm1=foobar, parm2=bar -> new params, runs and completes
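The four launches above can be modeled with a small plain-Java sketch. It does not use Spring Batch: JobInstanceIdentityDemo, Outcome and launch are hypothetical stand-ins that only mimic the rule "job name plus identifying parameters identifies the instance". In the real framework, the third launch would be rejected with a JobInstanceAlreadyCompleteException instead of returning a value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class JobInstanceIdentityDemo {

    public enum Outcome { STARTED_NEW, RESTARTED, REJECTED_ALREADY_COMPLETE }

    // Hypothetical stand-in for the job repository: instance key -> completed successfully?
    private final Map<String, Boolean> completedByKey = new HashMap<>();

    // The instance key is the job name plus the parameters in sorted order,
    // so the order in which parameters are supplied does not matter.
    public static String key(String jobName, Map<String, String> params) {
        return jobName + new TreeMap<String, String>(params);
    }

    // 'succeeds' simulates whether this execution finishes without an exception.
    public Outcome launch(String jobName, Map<String, String> params, boolean succeeds) {
        String key = key(jobName, params);
        Boolean completed = completedByKey.get(key);
        if (Boolean.TRUE.equals(completed)) {
            return Outcome.REJECTED_ALREADY_COMPLETE; // a completed instance cannot run again
        }
        completedByKey.put(key, succeeds);
        return completed == null ? Outcome.STARTED_NEW : Outcome.RESTARTED;
    }

    public static void main(String[] args) {
        JobInstanceIdentityDemo repo = new JobInstanceIdentityDemo();
        Map<String, String> fooBar = Map.of("parm1", "foo", "parm2", "bar");
        System.out.println(repo.launch("myJob", fooBar, false)); // 1) STARTED_NEW (then fails)
        System.out.println(repo.launch("myJob", fooBar, true));  // 2) RESTARTED (completes)
        System.out.println(repo.launch("myJob", fooBar, true));  // 3) REJECTED_ALREADY_COMPLETE
        System.out.println(repo.launch("myJob",
                Map.of("parm1", "foobar", "parm2", "bar"), true)); // 4) STARTED_NEW
    }
}
```

Note that nothing about the job's steps enters the key, which is why a changed job definition with the same name and parameters is treated as the same instance.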
The "best practices" we use are the following:
Each job instance (usually defined by the run date or the filename we are processing) must define a unique set of parameters (otherwise it will fail per the framework design)
Jobs that run multiple times a day but just scan a work table or something similar use an incrementer to pass an integer parameter, which we increase by 1 upon each successful completion
Any failed job instances must be either restarted or abandoned before pushing code changes that affect how the job will function
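The incrementer practice can be sketched in plain Java as follows. This is an assumption-laden illustration, not the poster's implementation: RunCounterSketch and next are hypothetical, and the parameter name "run.id" is borrowed from the default key used by Spring Batch's RunIdIncrementer. The counter only advances after a successful run, so a failed instance is relaunched with the same value and can be restarted.

```java
import java.util.HashMap;
import java.util.Map;

public class RunCounterSketch {

    // Assumption: name of the integer parameter that makes each instance unique.
    public static final String KEY = "run.id";

    // Computes the parameters for the next launch from those of the previous one.
    // First launch (no counter yet): the counter starts at 1.
    // Previous run succeeded: the counter is increased by 1 (a new job instance).
    // Previous run failed: the counter is kept, so the same instance is restarted.
    public static Map<String, Long> next(Map<String, Long> previous, boolean previousSucceeded) {
        Map<String, Long> next = new HashMap<>(previous);
        long current = previous.getOrDefault(KEY, 0L);
        boolean advance = previousSucceeded || current == 0L;
        next.put(KEY, advance ? current + 1 : current);
        return next;
    }
}
```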

Spring batch jobOperator - how are multiple concurrent instances of a job from the same XML file controlled?

When we run multiple concurrent jobs with different parameters, how can we control (stop, restart) the appropriate jobs? Our internal code provides the jobExecution object, but under the covers the JobOperator uses the job name to get the job instance.
In our case all of the jobs are from "do-stuff.xml" (okay, it's sanitized and not very original). After looking at the spring-batch source code, our concern is that if there is more than one job running and we stop a job, it will take the most recently submitted job and stop that one.
The JobOperator will allow you to fetch all running executions of the job using getRunningExecutions(String jobName). You should be able to iterate over that list to find the one you want. Then, just call stop(long executionId) on the one you want.
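A minimal sketch of that iteration, written so that it compiles without Spring on the classpath. The nested Operator interface is a hypothetical stand-in that mirrors three JobOperator method names (getRunningExecutions, getParameters, stop) and omits their checked exceptions; stopMatching and the parameter-string predicate are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class StopSelectedExecution {

    // Hypothetical minimal view of the operator; the real JobOperator methods
    // with these names also declare checked exceptions, which are left out here.
    public interface Operator {
        Set<Long> getRunningExecutions(String jobName);
        String getParameters(long executionId);
        boolean stop(long executionId);
    }

    // Stops only those running executions of jobName whose parameter string
    // satisfies 'wanted', and returns the ids for which stop() reported success.
    public static List<Long> stopMatching(Operator operator, String jobName,
                                          Predicate<String> wanted) {
        List<Long> stopped = new ArrayList<>();
        for (Long executionId : operator.getRunningExecutions(jobName)) {
            String parameters = operator.getParameters(executionId);
            if (wanted.test(parameters) && operator.stop(executionId)) {
                stopped.add(executionId);
            }
        }
        return stopped;
    }
}
```

Because the selection is made by execution id rather than by job name, concurrent executions of the same job with other parameters are left untouched.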
Alternatively, we've also implemented listeners (both at step and chunk level) that check an outage status table. When we want to impose a system-wide outage, we add the outage there and have our listener throw an exception to bring our jobs down. Once the outage is lifted, all "failed" executions may be restarted.
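The outage check itself can be sketched in plain Java. This is an illustration only: OutageCheck and failIfOutage are hypothetical names, and the BooleanSupplier stands in for whatever query reads the outage status table. The idea is that a step- or chunk-level listener callback would call failIfOutage() so that the execution fails and can be restarted later.

```java
import java.util.function.BooleanSupplier;

public class OutageCheck {

    // Assumption: returns true while an outage row is present in the status table.
    private final BooleanSupplier outageActive;

    public OutageCheck(BooleanSupplier outageActive) {
        this.outageActive = outageActive;
    }

    // Meant to be called from a listener callback before each step or chunk.
    // Throwing here makes the running execution fail instead of continuing.
    public void failIfOutage() {
        if (outageActive.getAsBoolean()) {
            throw new IllegalStateException(
                    "System-wide outage in effect; failing so the execution can be restarted later");
        }
    }
}
```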

Spring Batch Job Execution does not show in Executions Tab

I have a Spring Batch job set up using JavaConfig (entirely through Java code), which I deploy as a module in Spring XD. Typically, when you launch a job, you should see it in Spring XD's admin-ui under the Executions tab. Not mine, however, and I have no clue why. I've spent hours scouring the documentation and looking for an answer, but I can't find anything.
Am I missing something? Is there something I need to put in my Job to make this work? What could cause Spring XD to not display a job's execution under Executions?
If you need me to provide logs or something, let me know, although I am not seeing any error in the Spring XD console output.
EDIT: This is how the job is defined in the code:
    return jobBuilderFactory.get("Job")
            .incrementer(new RunIdIncrementer())
            .start(setupStep)
            .next(verifyStep)
            ...
            .next(zipFilesStep)
            .next(teardownStep)
            .build();
Does this happen only at the Admin UI? Can you see the job executions when you run the XD shell command job execution list?
If you are sure the batch job gets executed but is not listed, then it appears to be a bug. Do you see any stack trace in the admin log when the Executions tab is clicked?