Spring Batch fails with Transaction error when triggered by Spring Integration - spring-batch

I have a simple Spring Boot (2.6.4) app that uses Spring Batch and Spring Integration to read data from a file and emit Kafka events. I have configured Spring Integration to poll a configurable directory and trigger the Spring Batch Job as soon as a new file arrives.
As soon as the Job is kicked off, I get this error:
Existing transaction detected in JobRepository. Please fix this and try again (e.g. remove @Transactional annotations from client).
I assume this is triggered by the following Spring Integration configuration:
    return IntegrationFlows.from(fileReadingMessageSource,
            c -> c.poller(Pollers.fixedDelay(period)
                    .taskExecutor(taskExecutor)
                    .maxMessagesPerPoll(maxMessagesPerPoll)
                    .transactionSynchronizationFactory(transactionSynchronizationFactory())
                    .transactional(new PseudoTransactionManager())))
The TransactionSynchronizationFactory is configured to move the incoming file into an error or success directory, based on the outcome of the job.
My understanding is that a Spring Batch Job does not like to be executed within the context of a transaction, since it needs to manage its own transactions in order to handle failures across multiple steps.
I also understand that I could set the value of setValidateTransactionState of JobRepositoryFactoryBean to false, but I would rather not mess around with the Spring Batch internals.
As suggested elsewhere, I have also tried to use an Async Job Launcher, but no luck.
    @Bean
    public JobLauncher getJobLauncher() {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
        jobLauncher.setJobRepository(this.jobRepository);
        return jobLauncher;
    }
The app is using an in-memory H2 data-source, but that doesn't seem to affect the error.
I'm unable to find a solution to this problem. Any recommendations?
Edit - full project here: https://github.com/luciano-fiandesio/spring-batch-and-integration-demo
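One possible reason the asynchronous launcher makes no difference, offered as an assumption rather than a confirmed diagnosis: Spring binds transaction state to the calling thread, and if the launcher registers the job execution with the JobRepository on the caller's thread before handing the work to its TaskExecutor, the repository's check still sees the poller's transaction. The plain-Java model below illustrates only that ordering. CallerThreadCheckDemo, TX_ACTIVE, createExecution and launch are hypothetical names, not Spring classes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CallerThreadCheckDemo {

    // Hypothetical stand-in for a thread-bound "transaction active" flag.
    static final ThreadLocal<Boolean> TX_ACTIVE = ThreadLocal.withInitial(() -> false);

    // Hypothetical stand-in for the repository guard:
    // it refuses to create an execution while a transaction is active on this thread.
    static void createExecution() {
        if (TX_ACTIVE.get()) {
            throw new IllegalStateException("Existing transaction detected");
        }
    }

    // Models a launcher that records the execution on the caller's thread
    // and only afterwards hands the job body to the executor.
    static Future<?> launch(ExecutorService executor, Runnable job) {
        createExecution();            // runs on the calling (poller) thread
        return executor.submit(job);  // only the job body runs asynchronously
    }

    // Launches from a thread that has a simulated transaction open.
    static boolean rejectedInsideTransaction() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        TX_ACTIVE.set(true); // simulate the poller's transaction on this thread
        try {
            launch(executor, () -> { });
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            TX_ACTIVE.remove();
            executor.shutdown();
        }
    }

    // Launches from a thread without a transaction and waits for the job body.
    static boolean acceptedOutsideTransaction() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            launch(executor, () -> { }).get();
            return true;
        } catch (Exception e) {
            return false;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected inside transaction: " + rejectedInsideTransaction());
        System.out.println("accepted outside transaction: " + acceptedOutsideTransaction());
    }
}
```

If this model matches the real behaviour, the async executor alone would not be enough, because the check happens before any thread hand-off; the launch request itself would have to leave the transactional poller thread.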

Related

Spring Batch: How to start a job but not execute it, but execute it in an other java instance

I plan to use Spring Batch. We would like to initiate new job executions in the pods that answer the frontend requests.
Pseudo Code:
    @PostMapping(path = "/request-report/{id}")
    public void requestReport(@PathVariable String id) throws Exception {
        // JobOperator.start takes the job name and a "key=value" parameter string
        this.jobOperator.start("reportJob", "id=" + id);
    }
But we don't want the job to be executed in the frontend pod. For that, we would like to build a separate microservice pod.
I see the following solutions:
1. Do a REST call from the frontend pod to the spring-batch pod and start the job there. I could do this, but if possible I would like to skip that step and integrate via the Spring Batch database.
2. In the frontend pod, create a JobLauncher that has a SimpleAsyncTaskExecutor with size zero, so that it never executes a job.
   https://docs.spring.io/spring-batch/docs/current/reference/html/job.html#configuringJobLauncher

       @Bean
       public JobLauncher jobLauncher() {
           SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
           jobLauncher.setJobRepository(jobRepository());
           jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
           jobLauncher.afterPropertiesSet();
           return jobLauncher;
       }

3. In the frontend pod, do not use BatchAutoConfiguration and leave out some of its parts. But which ones?
I think I would also have to write some software that scans the job table, checks whether a job that has not been started is present, and starts it in the spring-batch pod.
Thanks for your help!
If you configure your job launcher with a SimpleAsyncTaskExecutor, it will start jobs in separate threads in the same JVM (hence in the same container and pod).
So unless you provide a custom TaskExecutor that launches jobs in a separate JVM/container/pod, you would need to change your architecture to use a job queue. The section "Launching Batch Jobs through Messages" of the reference docs shows how to set up such a pattern.
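The job-queue idea can be sketched in plain Java as follows. This is only a minimal in-JVM model under stated assumptions: JobQueueSketch, JobRequest, requestReport and drainAndLaunch are hypothetical names, and the BlockingQueue stands in for whatever broker or shared store would actually sit between the frontend pod and the batch pod.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class JobQueueSketch {

    // A launch request: the job name plus one parameter (hypothetical shape).
    static final class JobRequest {
        final String jobName;
        final String reportId;

        JobRequest(String jobName, String reportId) {
            this.jobName = jobName;
            this.reportId = reportId;
        }
    }

    // In-memory stand-in for the queue shared by the two pods.
    static final BlockingQueue<JobRequest> QUEUE = new LinkedBlockingQueue<>();

    // Frontend side: only records the request, never runs the job.
    static void requestReport(String id) {
        QUEUE.add(new JobRequest("reportJob", id));
    }

    // Worker side: takes every pending request and "launches" it.
    // A real worker would call its job launcher here instead of building a string.
    static List<String> drainAndLaunch() {
        List<String> launched = new ArrayList<>();
        JobRequest request;
        while ((request = QUEUE.poll()) != null) {
            launched.add(request.jobName + "[id=" + request.reportId + "]");
        }
        return launched;
    }

    public static void main(String[] args) {
        requestReport("42");
        requestReport("43");
        System.out.println(drainAndLaunch());
    }
}
```

The point of the split is that the frontend only ever writes requests, so the decision about where and when a job runs stays entirely with the worker.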

How can I use Spring Cloud Task with modular Spring Batch jobs?

I've developed a couple of Spring Batch jobs in a Spring Boot web application. For easier maintenance, I used @EnableBatchProcessing(modular = true) like this:
    @Configuration
    @EnableBatchProcessing(modular = true)
    public class BatchConfiguration {

        @Bean
        @ConditionalOnProperty(value = "first.batch.enabled", havingValue = "true")
        public ApplicationContextFactory firstJobs() {
            return new GenericApplicationContextFactory(FirstModule.class);
        }

        @Bean
        @ConditionalOnProperty(value = "second.batch.enabled", havingValue = "true")
        public ApplicationContextFactory secondJobs() {
            return new GenericApplicationContextFactory(SecondModule.class);
        }
        ...
    }
and I have @Configuration classes, one for every Job, defined in the base directories of the respective modules. Everything works fine, but now I want to set up the Spring Cloud Data Flow UI to have some basic monitoring of my Jobs.
The problem is that when I add the @EnableTask annotation to the BatchConfiguration class, Spring does not associate the job execution with the task execution. It only works when I run tests (@SpringBatchTest).
I also tried adding the @EnableTask annotation to the FirstJobConfiguration class instead, and adding it to both BatchConfiguration and FirstJobConfiguration, but with no effect. I also went through the official documentation, but found nothing.
Is it possible to use Spring Cloud Task with modular Spring Batch?
Thanks.
The Spring Batch application can be converted into a Spring Cloud Task application and be managed using Spring Cloud Data Flow.
You should be able to use @EnableTask in your Spring Boot application along with @EnableBatchProcessing in your configuration.
Please see the SCDF sample which demonstrates this scenario.

How to explicitly configure TaskBatchExecutionListener in my spring batch application

My Spring Batch application is not inserting the relationship between task and job into the TASK_TASK_BATCH table.
The Spring documentation says:
Associating A Job Execution To The Task In Which It Was Executed
Spring Boot provides facilities for the execution of batch jobs easily
within an über-jar. Spring Boot’s support of this functionality allows
for a developer to execute multiple batch jobs within that execution.
Spring Cloud Task provides the ability to associate the execution of a
job (a job execution) with a task’s execution so that one can be
traced back to the other.
This functionality is accomplished by using the TaskBatchExecutionListener. By default, this listener is auto configured in any context that has both a Spring Batch Job configured (via having a bean of type Job defined in the context) and the spring-cloud-task-batch jar is available within the classpath. The listener will be injected into all jobs.
I have all the required jars on my classpath. However, I am creating jobs and tasklets dynamically, so I am not using any annotations. As per the doc, TaskBatchExecutionListener is responsible for creating the mapping in the TASK_TASK_BATCH table by calling taskBatchDao's saveRelationship method.
I am just not able to figure out how to configure TaskBatchExecutionListener explicitly in my spring batch application.
If you have the org.springframework.cloud:spring-cloud-task-batch dependency, and the annotation @EnableTask is present, then your application context contains a TaskBatchExecutionListener bean that you can inject into your class that dynamically creates the jobs and tasklets.
That might look similar to this:
    @Autowired
    JobBuilderFactory jobBuilderFactory;

    @Autowired
    TaskBatchExecutionListener taskBatchExecutionListener;

    Job createJob() throws Exception {
        return jobBuilderFactory
                .get("myJob")
                .start(createStep())
                .listener(taskBatchExecutionListener)
                .build();
    }
I hope that helps. Otherwise please share some minimal code example to demonstrate what you're trying to do.

Batch Job exit status using Spring Cloud Task

I'm trying to set up a Spring Batch project to be deployed on a Spring Cloud Data Flow server, but first I must wrap it in a Spring Cloud Task application.
Spring Batch generates metadata (start/end, status, parameters, etc.) in the BATCH_ tables. Cloud Task does the same, but in the TASK_ tables.
The Spring Cloud Task documentation says that, in order to pass the batch information to the task, I must set
spring.cloud.task.batch.failOnJobFailure=true and also:
To have your task return the exit code based on the result of the
batch job execution, you will need to write your own
CommandLineRunner.
So, are there any pointers on how I should write my own CommandLineRunner?
For now, with only the property set, if I force the task to fail, I get: Failed to execute CommandLineRunner .... Job UsersJob failed during execution for jobId 3 with jobExecutionId of 6
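The core of such a runner is a mapping from the job's outcome to a process exit code. A minimal plain-Java sketch of that mapping follows. It is not Spring Cloud Task code: ExitCodeSketch, Outcome, exitCodeFor and runAndMap are hypothetical names, and the numeric codes are arbitrary assumptions chosen for illustration.

```java
import java.util.function.Supplier;

class ExitCodeSketch {

    // Hypothetical outcomes, loosely modelled on batch job statuses.
    enum Outcome { COMPLETED, STOPPED, FAILED, UNKNOWN }

    // Maps an outcome to an exit code; the values are assumptions, not a standard.
    static int exitCodeFor(Outcome outcome) {
        switch (outcome) {
            case COMPLETED:
                return 0;
            case FAILED:
                return 1;
            case STOPPED:
                return 2;
            default:
                return 3;
        }
    }

    // Runs the job and converts both its result and any runtime exception into a code,
    // so that a failing job produces a non-zero exit code instead of a stack trace only.
    static int runAndMap(Supplier<Outcome> job) {
        try {
            return exitCodeFor(job.get());
        } catch (RuntimeException e) {
            return exitCodeFor(Outcome.FAILED);
        }
    }

    public static void main(String[] args) {
        int code = runAndMap(() -> Outcome.COMPLETED);
        // A real runner would pass this value to System.exit; here it is only printed.
        System.out.println("exit code: " + code);
    }
}
```

In a Spring Boot application, a runner along these lines would typically hand the resulting code to System.exit, for example through SpringApplication.exit with an ExitCodeGenerator.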

Launching Spring Batch Task from Spring Cloud Data Flow

I have a web-based Spring Batch application. My batch job is kicked off by an API call. Here is the method exposed as a web service:
    @RequestMapping(value = "/v1/initiateEntityCreation", method = RequestMethod.GET)
    public String initiateEntityCreation()
            throws JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException,
            JobParametersInvalidException, NoSuchJobException, JobInstanceAlreadyExistsException {
        JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
        jobParametersBuilder.addDate("Date", new Date());
        Long executionContext = jobOperator.start("InitiateEntityCreation", String.format("Date=%s", new Date()));
        return executionContext.toString();
    }
My batch job is working fine, and I have a MySQL instance as my job repository. I have integrated Spring Cloud Data Flow with my batch application: I have the @EnableTask annotation and all the necessary dependencies, and I have connected my Spring Cloud Data Flow local server to my Spring Batch job repository instance.
Here are the command-line arguments I use for SCDF:
    java -jar spring-cloud-dataflow-server-local-1.2.3.RELEASE.jar \
        --spring.datasource.url=jdbc:mysql://localhost:3306/springbatchdb \
        --spring.datasource.username=root \
        --spring.datasource.password=password \
        --spring.datasource.driver-class-name=org.mariadb.jdbc.Driver
My local server is running and capturing all the job execution instances. I have registered my Spring Batch application with SCDF and defined a task for SCDF with the definition.
When I try to launch the job from SCDF, I get "Task successfully executed", but my job is not executed.
If I check the task executions, I see
StartTime N/A, EndTime N/A, and if I drill down into the task execution, no batch jobs have been run. Please let me know how to launch a web-based Spring Batch job using Spring Cloud Data Flow.