Spring Batch: How to start a job without executing it, and execute it in another Java instance - spring-batch

I plan to use Spring Batch. We would like to initiate new job executions in the pods that answer the frontend requests.
Pseudo Code:
@PostMapping(path = "/request-report/{id}")
public void requestReport(@PathVariable String id) {
    this.jobOperator.start("reportJob", new Properties("1"));
}
But we don't want the job to be executed in the frontend pod. For that, we would like to build a separate microservice pod.
I see the following solutions:
Make a REST call from the frontend pod to the spring-batch pod and start the job there. I could do this, but if possible I would like to skip that step and integrate via the Spring Batch database.
In the frontend pod, I create a JobLauncher that has a SimpleAsyncTaskExecutor with size zero, so it will never execute a job.
https://docs.spring.io/spring-batch/docs/current/reference/html/job.html#configuringJobLauncher
@Bean
public JobLauncher jobLauncher() {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(jobRepository());
    jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
    jobLauncher.afterPropertiesSet();
    return jobLauncher;
}
In the frontend pod, I do not use the BatchAutoConfiguration but leave out some parts, but which ones?
I think I also have to write some software that scans the job table, checks whether a job that has not been started is present, and starts it in the spring-batch pod.
Thanks for your help!

If you configure your job launcher with a SimpleAsyncTaskExecutor, it will start jobs in separate threads in the same JVM (hence in the same container and pod).
So unless you provide a custom TaskExecutor that launches jobs in a separate JVM/container/pod, you would need to change your architecture to use a job queue. The section "Launching Batch Jobs through Messages" of the reference docs shows how to set up such a pattern.
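A minimal sketch of the job-queue idea, using only the JDK: an in-memory BlockingQueue stands in for the message broker that the two pods would share, and the names JobQueueSketch, LaunchRequest, requestReport and consumeOne are hypothetical. In a real setup the batch pod would hand the request to a JobLauncher, for example through the JobLaunchRequest and JobLaunchingGateway classes of spring-batch-integration; that wiring is not shown here.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class JobQueueSketch {

    /** Hypothetical message payload: the job to run and its parameters. */
    static final class LaunchRequest {
        final String jobName;
        final Map<String, String> parameters;

        LaunchRequest(String jobName, Map<String, String> parameters) {
            this.jobName = jobName;
            this.parameters = parameters;
        }
    }

    /** Stand-in for a broker queue; in production both pods would talk to a real broker. */
    static final BlockingQueue<LaunchRequest> QUEUE = new LinkedBlockingQueue<>();

    /** Frontend pod: only enqueues the request and returns immediately. */
    public static void requestReport(String id) {
        QUEUE.add(new LaunchRequest("reportJob", Map.of("id", id)));
    }

    /** Batch pod: takes at most one pending request and reports what it would launch. */
    public static String consumeOne() {
        LaunchRequest request = QUEUE.poll(); // non-blocking; null when the queue is empty
        if (request == null) {
            return "nothing to run";
        }
        // A real batch pod would call its JobLauncher here instead of building a string.
        return "launched " + request.jobName + " with id=" + request.parameters.get("id");
    }

    public static void main(String[] args) {
        requestReport("42");
        System.out.println(consumeOne()); // launched reportJob with id=42
        System.out.println(consumeOne()); // nothing to run
    }
}
```

The frontend never touches a JobLauncher in this arrangement, so nothing can run in its JVM; only the consumer side decides where and when a job executes.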

Related

Spring Batch fails with Transaction error when triggered by Spring Integration

I have a simple Spring Boot (2.6.4) app that uses Spring Batch and Spring Integration to read data from a file and emit Kafka events. I have configured Spring Integration to poll a configurable directory and trigger the Spring Batch Job as soon as a new file arrives.
As soon as the Job is "kicked" I get this error:
Existing transaction detected in JobRepository. Please fix this and try again (e.g. remove @Transactional annotations from client).
I assume this is triggered by the following Spring Integration configuration:
return IntegrationFlows.from(fileReadingMessageSource,
        c -> c.poller(Pollers.fixedDelay(period)
                .taskExecutor(taskExecutor)
                .maxMessagesPerPoll(maxMessagesPerPoll)
                .transactionSynchronizationFactory(transactionSynchronizationFactory())
                .transactional(new PseudoTransactionManager())))
The TransactionSynchronizationFactory is configured to move the incoming file into an error or success directory, based on the outcome of the job.
My understanding is that a Spring Batch Job does not like to be executed within the context of a transaction, since it needs to manage its own transactions to deal with failures in multiple steps.
I also understand that I could call setValidateTransactionState(false) on JobRepositoryFactoryBean, but I would rather not mess around with the Spring Batch internals.
As suggested elsewhere, I have also tried to use an Async Job Launcher, but no luck.
@Bean
public JobLauncher getJobLauncher() {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
    jobLauncher.setJobRepository(this.jobRepository);
    return jobLauncher;
}
The app is using an in-memory H2 data-source, but that doesn't seem to affect the error.
I'm unable to find a solution to this problem. Any recommendations?
Edit - full project here: https://github.com/luciano-fiandesio/spring-batch-and-integration-demo

Searching for actually running spring batch jobs

I'm trying to find a way to determine whether a JobExecution obtained from JobExplorer is actually running.
The problem is that the explorer checks for job executions by running a JDBC query. But in some cases (e.g. the container with the Spring Batch job was killed or restarted), jobs remain persisted in the DB in a running state even though they are orphaned and no processing is actually running.
A simple method like
private boolean isJobRunning(String jobName) {
    var activeExecutions = jobExplorer.findRunningJobExecutions(jobName);
    return activeExecutions
            .stream()
            .anyMatch(jobExecution -> jobExecution.getStatus().isRunning());
}
doesn't work as expected.
Is there any way to find actually active jobs?
Thanks in advance
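One possible heuristic, sketched here with plain JDK types rather than the Spring Batch API, is to treat an execution that is marked as running but has not been updated for a while as orphaned. The Execution class, its lastUpdated field and the five-minute threshold are assumptions for illustration; whether the timestamps in the job repository are refreshed often enough for this to be reliable depends on the job and would need checking.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

public class StaleExecutionSketch {

    /** Hypothetical view of a persisted execution: id, "running" flag, last update time. */
    static final class Execution {
        final long id;
        final boolean markedRunning;
        final Instant lastUpdated;

        Execution(long id, boolean markedRunning, Instant lastUpdated) {
            this.id = id;
            this.markedRunning = markedRunning;
            this.lastUpdated = lastUpdated;
        }
    }

    /** Returns the ids of executions marked running whose last update is at most maxSilence old. */
    public static List<Long> probablyAlive(List<Execution> executions, Instant now, Duration maxSilence) {
        return executions.stream()
                .filter(e -> e.markedRunning)
                .filter(e -> Duration.between(e.lastUpdated, now).compareTo(maxSilence) <= 0)
                .map(e -> e.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-01T12:00:00Z");
        List<Execution> executions = List.of(
                new Execution(1, true, now.minusSeconds(30)),    // updated recently
                new Execution(2, true, now.minusSeconds(3600)),  // silent for an hour: likely orphaned
                new Execution(3, false, now.minusSeconds(10)));  // not marked running
        System.out.println(probablyAlive(executions, now, Duration.ofMinutes(5))); // [1]
    }
}
```

This only narrows the candidates; a long step that legitimately writes nothing for longer than the threshold would be misclassified.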

How to explicitly configure TaskBatchExecutionListener in my spring batch application

My Spring Batch application is not inserting the relationship between task and job into the TASK_TASK_BATCH table.
The Spring docs say:
Associating A Job Execution To The Task In Which It Was Executed
Spring Boot provides facilities for the execution of batch jobs easily within an über-jar. Spring Boot’s support of this functionality allows for a developer to execute multiple batch jobs within that execution. Spring Cloud Task provides the ability to associate the execution of a job (a job execution) with a task’s execution so that one can be traced back to the other.
This functionality is accomplished by using the TaskBatchExecutionListener. By default, this listener is auto configured in any context that has both a Spring Batch Job configured (via having a bean of type Job defined in the context) and the spring-cloud-task-batch jar is available within the classpath. The listener will be injected into all jobs.
I have all the required jars on my classpath. It's just that I am creating jobs and tasklets dynamically, so I am not using any annotations. As per the docs, TaskBatchExecutionListener is responsible for creating the mapping in the TASK_TASK_BATCH table by calling taskBatchDao's saveRelationship method.
I just can't figure out how to configure TaskBatchExecutionListener explicitly in my Spring Batch application.
If you have the org.springframework.cloud:spring-cloud-task-batch dependency and the @EnableTask annotation is present, then your application context contains a TaskBatchExecutionListener bean that you can inject into the class that dynamically creates the jobs and tasklets.
That might look similar to this:
@Autowired
JobBuilderFactory jobBuilderFactory;
@Autowired
TaskBatchExecutionListener taskBatchExecutionListener;
Job createJob() throws Exception {
    return jobBuilderFactory
            .get("myJob")
            .start(createStep())
            .listener(taskBatchExecutionListener)
            .build();
}
I hope that helps. Otherwise please share some minimal code example to demonstrate what you're trying to do.

Java EE 7 schedule tasks in different environments/tiers

Is there a way to configure a specific scheduled task in different environments and schedules?
E.g. the same scheduled task 'MyTask' is supposed to run in Integration and Production. In Production 'MyTask' has to run every 24h and in Integration 'MyTask' must not run at all.
Currently we're focusing on the native Java EE 7 scheduling mechanism. Spring and Quartz are additional frameworks/libraries that we don't want to use (if possible).
If you are using Spring, try @Scheduled(cron = "${cronEx}"). You can provide a different configuration defining the cronEx value for each environment. For example, you could get the cronEx value via JNDI. More about it: SO-Q&A and SO-Q&A.
If you need something more sophisticated, have a look at the Quartz project: http://www.quartz-scheduler.org/ It is a library for scheduling jobs.
There are several ways of creating a scheduled task in Java EE. I think what fits best for you is using ManagedScheduledExecutorService.
@ApplicationScoped
public class PeriodicTask {
    @Resource
    ManagedScheduledExecutorService mses;
    @Inject
    @Config("period")
    private int period;
    public void startJobs() {
        mses.scheduleAtFixedRate(this::task, 0, period, TimeUnit.MINUTES);
    }
    private void task() {
        ...
    }
    ...
}
That way you can, for instance, inject the config value period depending on the running environment. If the task must not be scheduled in a specific environment, you can add another configuration parameter that prevents the call to scheduleAtFixedRate.
The only thing left to do is to call the startJobs method.
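The environment switch can be sketched with the JDK's ScheduledExecutorService standing in for ManagedScheduledExecutorService (which extends it). The class name PeriodicTaskSketch, the convention that a period of zero or less means "do not schedule", and passing the period as a method argument are assumptions for illustration; in a Java EE container the value would come from the injected configuration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicTaskSketch {

    private final ScheduledExecutorService scheduler;
    private final AtomicInteger runs = new AtomicInteger();

    public PeriodicTaskSketch(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    /** Schedules the task only when the configured period is positive; returns whether it did. */
    public boolean startJobs(long periodMinutes) {
        if (periodMinutes <= 0) {
            return false; // e.g. Integration: the task must not run at all
        }
        scheduler.scheduleAtFixedRate(this::task, 0, periodMinutes, TimeUnit.MINUTES);
        return true;
    }

    private void task() {
        runs.incrementAndGet(); // placeholder for the real work
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            // Integration: period 0 disables the task.
            System.out.println(new PeriodicTaskSketch(scheduler).startJobs(0));       // false
            // Production: run every 24 hours.
            System.out.println(new PeriodicTaskSketch(scheduler).startJobs(24 * 60)); // true
        } finally {
            scheduler.shutdownNow(); // stop the scheduler so the JVM can exit
        }
    }
}
```

Folding the on/off switch into the period value keeps a single configuration parameter per environment; a separate boolean flag would work equally well.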

Launching Spring Batch Task from Spring Cloud Data Flow

I have a web-based Spring Batch application. My batch job is kick-started by an API call. Here is my method, exposed as a web service.
@RequestMapping(value = "/v1/initiateEntityCreation", method = RequestMethod.GET)
public String initiateEntityCreation()
        throws JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException,
        JobParametersInvalidException, NoSuchJobException, JobInstanceAlreadyExistsException {
    JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
    jobParametersBuilder.addDate("Date", new Date());
    Long executionContext = jobOperator.start("InitiateEntityCreation", String.format("Date=%s", new Date()));
    return executionContext.toString();
}
My batch job is working fine and I have a MySQL instance as my job repository. I have integrated Spring Cloud Data Flow into my batch application. I have the @EnableTask annotation and all necessary dependencies. I have connected my Spring Cloud Data Flow local server to my Spring Batch job repository instance.
Here are my command-line arguments for SCDF:
java -jar spring-cloud-dataflow-server-local-1.2.3.RELEASE.jar --spring.datasource.url=jdbc:mysql://localhost:3306/springbatchdb --spring.datasource.username=root --spring.datasource.password=password --spring.datasource.driver-class-name=org.mariadb.jdbc.Driver
My local server is running and capturing all the job execution instances. I have registered my Spring Batch application with SCDF and defined a task for SCDF with that definition.
When I try to launch the job from SCDF, I get "Task successfully executed", but my job is not executed.
If I check the task executions, I see StartTime N/A and EndTime N/A, and if I drill down into the task execution, no batch jobs have been run. Please let me know how to launch a web-based Spring Batch job using Spring Cloud Data Flow.