Spring Batch - How to run multiple jobs in parallel

I have multiple scheduled jobs in a BatchScheduler which run at particular times. Initially, the simple built-in JobLauncher, which is synchronous in nature, is used.
Now I want to run the jobs in parallel so that no job has to wait for another to finish.
I have tried the @Async annotation on my different jobs, but it did not work.
I have also tried creating a different JobLauncher object for each and every job, but that did not work either.
Then I tried setting jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor()).
But it did not work.
I have also tried
@Bean
public JobLauncher jobLauncher() {
    final SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(jobRepository);
    final SimpleAsyncTaskExecutor simpleAsyncTaskExecutor = new SimpleAsyncTaskExecutor();
    jobLauncher.setTaskExecutor(simpleAsyncTaskExecutor);
    return jobLauncher;
}
I have tried all the combinations given in different Stack Overflow answers, but none of them worked.
Actual:
None of this worked. When I check the start times of the batch jobs in the batch metadata tables, each job starts only after the previous one has finished.
Expected:
Jobs should run in parallel.

This configuration works for me:
import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;

@Configuration
public class TaskExecutorBatchConfigurer extends DefaultBatchConfigurer {

    @Bean
    public ThreadPoolTaskScheduler batchTaskScheduler() {
        ThreadPoolTaskScheduler threadPoolTaskScheduler = new ThreadPoolTaskScheduler();
        threadPoolTaskScheduler.setPoolSize(10);
        threadPoolTaskScheduler.afterPropertiesSet();
        return threadPoolTaskScheduler;
    }

    @Override
    protected JobLauncher createJobLauncher() throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(super.getJobRepository());
        jobLauncher.setTaskExecutor(batchTaskScheduler());
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }
}
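For completeness, here is a sketch of a scheduler that launches two jobs through this asynchronous launcher (the job bean names, cron expression, and parameters are made up). Because the launcher hands the job off to its task executor, run() returns immediately and both jobs proceed in parallel:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class BatchScheduler {

    @Autowired
    private JobLauncher jobLauncher; // the async launcher from the configuration above

    @Autowired
    @Qualifier("jobOne") // hypothetical job beans
    private Job jobOne;

    @Autowired
    @Qualifier("jobTwo")
    private Job jobTwo;

    @Scheduled(cron = "0 0 2 * * *") // both fire at 02:00
    public void launchJobOne() throws Exception {
        // unique parameters so each launch creates a new JobInstance
        jobLauncher.run(jobOne, new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis()).toJobParameters());
    }

    @Scheduled(cron = "0 0 2 * * *")
    public void launchJobTwo() throws Exception {
        jobLauncher.run(jobTwo, new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis()).toJobParameters());
    }
}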

Related

Should Job/Step/Reader/Writer all be beans?

As far as I can tell from the examples in the Spring Batch reference doc, objects like job/step/reader/writer are all marked as @Bean, like the following:
@Bean
public Job footballJob() {
    return this.jobBuilderFactory.get("footballJob")
            .listener(sampleListener())
            ...
            .build();
}

@Bean
public Step sampleStep(PlatformTransactionManager transactionManager) {
    return this.stepBuilderFactory.get("sampleStep")
            .transactionManager(transactionManager)
            .<String, String>chunk(10)
            .reader(itemReader())
            .writer(itemWriter())
            .build();
}
I have a scenario where the server side receives requests and runs jobs concurrently (different job names, or the same job name with different JobParameters). The idea is to create a new job object (including steps/readers/writers) in the concurrent threads, so I probably will not declare the job method as @Bean and will instead create a new job each time.
There is also a difference in how parameters are passed to objects like the reader. When using @Bean, parameters must be put into, e.g., JobParameters and late-bound into the object using @StepScope, like the following example:
@StepScope
@Bean
public FlatFileItemReader<Foo> flatFileItemReader(@Value("#{jobParameters['input.file.name']}") String name) {
    return new FlatFileItemReaderBuilder<Foo>()
            .name("flatFileItemReader")
            .resource(new FileSystemResource(name))
            .build();
}
If not using @Bean, I can just pass the parameter directly, with no need to put data into JobParameters, like the following:
public FlatFileItemReader<Foo> flatFileItemReader(String name) {
    return new FlatFileItemReaderBuilder<Foo>()
            .name("flatFileItemReader")
            .resource(new FileSystemResource(name))
            .build();
}
A simple test shows that it works without @Bean. But I want to confirm formally:
1. Is using @Bean on job/step/reader/writer mandatory or not?
2. If it is not mandatory, when I create an object like a reader manually, do I need to call afterPropertiesSet() myself?
Thanks!
1. Is using @Bean on job/step/reader/writer mandatory or not?
No, it is not mandatory to declare batch artefacts as beans. But you would want to at least declare the Job as a bean to benefit from Spring's dependency injection (like injecting the job repository reference into the job, etc.) and to be able to do something like:
ApplicationContext context = new AnnotationConfigApplicationContext(MyJobConfig.class);
Job job = context.getBean(Job.class);
JobLauncher jobLauncher = context.getBean(JobLauncher.class);
jobLauncher.run(job, new JobParameters());
2. If it is not mandatory, when I create an object like a reader manually, do I need to call afterPropertiesSet() myself?
I guess that by "create an object like a reader manually" you mean creating a new instance yourself. In that case yes: if the object is not managed by Spring, you need to call that method yourself. If the object is declared as a bean, Spring will call the afterPropertiesSet() method automatically. Here is a quick sample:
import org.springframework.beans.factory.InitializingBean;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TestAfterPropertiesSet {

    @Bean
    public MyBean myBean() {
        return new MyBean();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(TestAfterPropertiesSet.class);
        MyBean myBean = context.getBean(MyBean.class);
        myBean.sayHello();
    }

    static class MyBean implements InitializingBean {

        @Override
        public void afterPropertiesSet() throws Exception {
            System.out.println("MyBean.afterPropertiesSet");
        }

        public void sayHello() {
            System.out.println("Hello");
        }
    }
}
This prints:
MyBean.afterPropertiesSet
Hello
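For the reader case specifically, here is a minimal sketch of manual usage (a fragment; the Foo class, file name, and column names are hypothetical). When you create the reader yourself, you are also responsible for the ItemStream lifecycle that the step would normally drive:
FlatFileItemReader<Foo> reader = new FlatFileItemReaderBuilder<Foo>()
        .name("fooReader")
        .resource(new FileSystemResource("foos.csv"))
        .delimited()
        .names(new String[] { "field1", "field2" })
        .targetType(Foo.class)
        .build();
// not managed by Spring: call the lifecycle methods yourself
reader.afterPropertiesSet();         // what Spring would call for a bean
reader.open(new ExecutionContext()); // normally called by the step
Foo item = reader.read();
reader.close();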

Can a multi-job Spring Batch app load a minimum set of beans?

I have a Spring Batch project with multiple jobs (job A, job B, job C, ...). When I run a particular job A, the log shows that all the beans of jobs B, C, ... are created too. Is there any way to avoid creating the other beans when job A is launched?
I have tried to use the @Lazy annotation, but it does not seem to work.
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    @Qualifier("springDataSource")
    public DataSource springDataSource;

    @Autowired
    @Qualifier("batchJobDataSource")
    public DataSource batchJobDataSource;
}
@Configuration
@PropertySource("classpath:partner.properties")
public class B extends BatchConfiguration {

    @Value("${partnerId}")
    private String partnerId;

    @Lazy
    @Bean
    public Job processB(JobCompletionNotificationListener listener) {
        return jobBuilderFactory
                .get("ProcessB")
                .incrementer(new RunIdIncrementer())
                .listener(listener)
                .start(processStepB())
                .build();
    }

    @Lazy
    @Bean
    public Step processStepB() {
        return stepBuilderFactory
                .get("ProcessStepB")
                .<PartnerDTO, PartnerDTO>chunk(1)
                .reader(getPartner())
                .processor(process())
                .writer(saveTransaction())
                .build();
    }

    @Lazy
    @Bean(destroyMethod = "")
    public Reader getPartner() {
        return new Reader(batchJobDataSource, partnerId);
    }

    @Lazy
    @Bean
    public Processor process() {
        return new Processor();
    }

    @Lazy
    @Bean
    HistoryWriter historyWriter() {
        return new HistoryWriter(batchJobDataSource);
    }

    @Lazy
    @Bean
    UpdateWriter updateWriter() {
        return new UpdateWriter(batchJobDataSource);
    }

    @Lazy
    @Bean
    public CompositeItemWriter<PartnerDTO> saveTransaction() {
        List<ItemWriter<? super PartnerDTO>> delegates = new ArrayList<>();
        delegates.add(updateWriter());
        delegates.add(historyWriter());
        CompositeItemWriter<PartnerDTO> itemWriter = new CompositeItemWriter<>();
        itemWriter.setDelegates(delegates);
        return itemWriter;
    }
}
I have also put @Lazy on the @Configuration class, but that did not work either.
That should not be an issue. But here are a few ideas to try:
1) Use Spring profiles to isolate job beans, as shown in the sketch below.
2) If you use Spring Boot 2.2+, try to activate the lazy bean initialization mode.
3) Package each job in its own jar. This is the best option IMO.
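A minimal sketch of the first idea (the profile and class names are hypothetical): guard each job's configuration class with @Profile, then start the app with only the profile of the job you want.
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;

@Configuration
@Profile("jobA") // beans below are only created when the "jobA" profile is active
public class JobAConfiguration {
    // job A's @Bean definitions (job, steps, readers, writers) go here
}
You would then launch with --spring.profiles.active=jobA so that the beans of jobs B and C are never instantiated. For the second idea, Spring Boot 2.2+ supports spring.main.lazy-initialization=true to defer bean creation until first use.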

Spring boot reading CSV file using "FlatFileItemReader" in Service method

I am developing an application where I am trying to read a CSV file using Spring's FlatFileItemReader. I would like to use a service in which a method is called to execute the reading process, but I am not directly using configuration bean objects for FlatFileItemReader and the others. An example of my prototype is below.
public void executeCsvReading() {
    FlatFileItemReader<SomeModel> reader = new FlatFileItemReader<SomeModel>();
    reader.setResource(new ClassPathResource("someFile.csv"));
    reader.setLineMapper(new DefaultLineMapper<SomeModel>() {
        {
            setLineTokenizer(new DelimitedLineTokenizer() {
                {
                    setNames(new String[] { "somefield1", "somefield2" });
                }
            });
            setFieldSetMapper(new BeanWrapperFieldSetMapper<SomeModel>() {
                {
                    setTargetType(SomeModel.class);
                }
            });
        }
    });
    // the step must be defined before the job that references it
    Step step = stepBuilderFactory.get("step1").<SomeModel, SomeModel>chunk(5)
            .reader(reader)
            .writer(new WriteItemsOn()) // Call WriteItemsOn class constructor
            .build();
    // But how do I start this Job?
    Job job = jobBuilderFactory
            .get("readCSVFilesJob")
            .incrementer(new RunIdIncrementer())
            .start(step)
            .build();
}
public class WriteItemsOn implements ItemWriter<SomeModel> {

    @Override
    public void write(List<? extends SomeModel> items) throws Exception {
        for (SomeModel item : items) {
            System.out.println("Item is " + item.someMethod());
        }
    }
}
// But how do I start this Job?
To start the job, you need to use a JobLauncher. For example:
SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
//jobLauncher.setJobRepository(yourJobRepository);
//jobLauncher.setTaskExecutor(yourTaskExecutor);
jobLauncher.afterPropertiesSet();
jobLauncher.run(job, new JobParameters());
However, with this approach, you will need to configure the infrastructure beans required by Spring Batch (JobRepository, JobLauncher, etc.) yourself.
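For a quick test, a minimal sketch of that manual setup could use the in-memory job repository (MapJobRepositoryFactoryBean ships with Spring Batch 4.x, is intended for testing only, and is deprecated as of Spring Batch 5):
MapJobRepositoryFactoryBean factory = new MapJobRepositoryFactoryBean();
factory.afterPropertiesSet();
JobRepository jobRepository = factory.getObject();

SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
jobLauncher.setJobRepository(jobRepository);
jobLauncher.afterPropertiesSet(); // synchronous launcher by default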
I would recommend using a typical Spring Batch job configuration and running your job from the method executeCsvReading. Here is an example:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class MyJob {

    private final JobBuilderFactory jobBuilderFactory;
    private final StepBuilderFactory stepBuilderFactory;

    public MyJob(JobBuilderFactory jobBuilderFactory, StepBuilderFactory stepBuilderFactory) {
        this.jobBuilderFactory = jobBuilderFactory;
        this.stepBuilderFactory = stepBuilderFactory;
    }

    @Bean
    public Step step() {
        return stepBuilderFactory.get("step")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("hello world");
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    @Bean
    public Job job() {
        return jobBuilderFactory.get("job")
                .start(step())
                .build();
    }
}
With this job configuration in place, you can load the Spring application context and launch your job as follows:
public void executeCsvReading() {
    ApplicationContext context = new AnnotationConfigApplicationContext(MyJob.class);
    JobLauncher jobLauncher = context.getBean(JobLauncher.class);
    Job job = context.getBean(Job.class);
    jobLauncher.run(job, new JobParameters());
}
Note that loading the Spring application context can be done outside the method executeCsvReading so that it is not loaded each time this method is called. With this approach, you don't have to configure infrastructure beans required by Spring Batch yourself, they will be automatically created and added to the application context. It is of course possible to override them if needed.
Heads up: If you put the MyJob configuration class in the package of your Spring Boot app, Spring Boot will by default execute the job at startup. You can disable this behaviour by adding spring.batch.job.enabled=false to your application properties.
Hope this helps.

Spring Batch Integration using Java DSL / launching jobs

I have a working Spring Boot/Batch project containing 2 jobs.
I'm now trying to add Spring Integration to poll files from a remote SFTP server using only Java configuration / Java DSL, and then launch a job.
The file polling is working, but I have no idea how to launch a job in my flow, despite reading these links:
Spring Batch Integration config using Java DSL
and
Spring Batch Integration job-launching-gateway
Some code snippets:
@Bean
public SessionFactory sftpSessionFactory() {
    DefaultSftpSessionFactory sftpSessionFactory = new DefaultSftpSessionFactory();
    sftpSessionFactory.setHost("myip");
    sftpSessionFactory.setPort(22);
    sftpSessionFactory.setUser("user");
    sftpSessionFactory.setPrivateKey(new FileSystemResource("path to my key"));
    return sftpSessionFactory;
}

@Bean
public IntegrationFlow ftpInboundFlow() {
    return IntegrationFlows
            .from(Sftp.inboundAdapter(sftpSessionFactory())
                    .deleteRemoteFiles(Boolean.FALSE)
                    .preserveTimestamp(Boolean.TRUE)
                    .autoCreateLocalDirectory(Boolean.TRUE)
                    .remoteDirectory("remote dir")
                    .regexFilter(".*\\.txt$")
                    .localDirectory(new File("C:/sftp/")),
                    e -> e.id("sftpInboundAdapter").poller(Pollers.fixedRate(600000)))
            .handle("FileMessageToJobRequest", "toRequest")
            // what to put next to process the jobRequest?
For .handle("FileMessageToJobRequest","toRequest") I use the one described here http://docs.spring.io/spring-batch/trunk/reference/html/springBatchIntegration.html
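For reference, the transformer described there looks roughly like this (adapted from the linked reference documentation):
import java.io.File;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.integration.launch.JobLaunchRequest;
import org.springframework.integration.annotation.Transformer;
import org.springframework.messaging.Message;

public class FileMessageToJobRequest {

    private Job job;
    private String fileParameterName;

    public void setFileParameterName(String fileParameterName) {
        this.fileParameterName = fileParameterName;
    }

    public void setJob(Job job) {
        this.job = job;
    }

    @Transformer
    public JobLaunchRequest toRequest(Message<File> message) {
        // turn the polled file into a launch request carrying the file path as a job parameter
        JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
        jobParametersBuilder.addString(fileParameterName, message.getPayload().getAbsolutePath());
        return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
    }
}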
I would appreciate any help on that, many thanks.
EDIT after Gary's comment
I've added the following. It doesn't compile (of course) because I don't understand how the request is propagated:
        .handle("FileMessageToJobRequest", "toRequest")
        .handle(jobLaunchingGw())
        .get();
}

@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher());
}

@Autowired
private JobLauncher jobLauncher;

@Bean
public JobExecution jobLauncher(JobLaunchRequest req) throws JobExecutionException {
    JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
    return execution;
}
I've found a way to launch a job using a @ServiceActivator and adding this to my flow, but I'm not sure it's good practice:
.handle("lauchBatchService", "launch")

@Component("lauchBatchService")
public class LaunchBatchService {

    private static Logger log = LoggerFactory.getLogger(LaunchBatchService.class);

    @Autowired
    private JobLauncher jobLauncher;

    @ServiceActivator
    public JobExecution launch(JobLaunchRequest req) throws JobExecutionException {
        JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
        return execution;
    }
}
.handle(jobLaunchingGw())
// handle result
...

@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher());
}

where jobLauncher() is the JobLauncher bean.
EDIT
Your service activator is doing about the same as the JLG; it uses this code.
Your jobLauncher @Bean is wrong.
@Beans are definitions; they don't do runtime stuff like this:

@Bean
public JobExecution jobLauncher(JobLaunchRequest req) throws JobExecutionException {
    JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
    return execution;
}
Since you are already autowiring a JobLauncher, just use that.
@Autowired
private JobLauncher jobLauncher;

@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher);
}
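Putting it together, the tail of the flow from the question could then look roughly like this sketch (the adapter configuration is elided and assumed to be as in the question):
@Bean
public IntegrationFlow ftpInboundFlow() {
    return IntegrationFlows
            .from(Sftp.inboundAdapter(sftpSessionFactory()), // ...same adapter options as above...
                    e -> e.id("sftpInboundAdapter").poller(Pollers.fixedRate(600000)))
            .handle("FileMessageToJobRequest", "toRequest") // produces a JobLaunchRequest
            .handle(jobLaunchingGw())                       // replies with a JobExecution
            .channel("nullChannel")                         // discard, or route the JobExecution reply wherever needed
            .get();
}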

How to continually run a Spring Batch job

What is the best way to continually run a Spring Batch job? Do we need to write a shell script that loops and starts the job at predefined intervals? Or is there a way within Spring Batch itself to configure a job so that it repeats either
1) at pre-defined intervals, or
2) after the completion of each run?
Thanks
If you want to launch your jobs periodically, you can combine Spring Scheduler and Spring Batch. Here is a concrete example: Spring Scheduler + Batch Example.
If you want to relaunch your job continually (are you sure?), you can configure a JobExecutionListener on your job. Then, in the listener's afterJob(JobExecution jobExecution) method, you can relaunch the job, as in the sketch below.
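A sketch of that listener idea (the class name and the "time" parameter are made up; note that with a synchronous launcher each relaunch nests inside the previous afterJob call, so an asynchronous launcher is safer, and wiring the job into its own listener needs care, e.g. a lazy reference):
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.listener.JobExecutionListenerSupport;

public class RelaunchingJobListener extends JobExecutionListenerSupport {

    private final JobLauncher jobLauncher;
    private final Job job; // the job this listener is registered on

    public RelaunchingJobListener(JobLauncher jobLauncher, Job job) {
        this.jobLauncher = jobLauncher;
        this.job = job;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        try {
            // fresh parameters so the relaunch creates a new JobInstance
            jobLauncher.run(job, new JobParametersBuilder()
                    .addLong("time", System.currentTimeMillis()).toJobParameters());
        } catch (Exception e) {
            throw new IllegalStateException("Could not relaunch job", e);
        }
    }
}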
I did something like this for importing emails, where I have to check the mailbox periodically:
@SpringBootApplication
@EnableScheduling
public class ImportBillingFromEmailBatchRunner {

    private static final Logger LOG = LoggerFactory.getLogger(ImportBillingFromEmailBatchRunner.class);

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(ImportBillingFromEmailBatchRunner.class);
        app.run(args);
    }

    @Bean
    BillingEmailCronService billingEmailCronService() {
        return new BillingEmailCronService();
    }
}
So the BillingEmailCronService takes care of the continuation:
public class BillingEmailCronService {

    private static final Logger LOG = LoggerFactory.getLogger(BillingEmailCronService.class);

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private JobExplorer jobExplorer;

    @Autowired
    private JobRepository jobRepository;

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    @Qualifier(BillingBatchConfig.QUALIFIER)
    private Step fetchBillingFromEmailsStep;

    @Scheduled(fixedDelay = 5000)
    public void run() {
        LOG.info("Processing emails with invoices...");
        try {
            Job job = createNewJob();
            JobParameters jobParameters = new JobParameters();
            jobLauncher.run(job, jobParameters);
        } catch (Exception e) {
            // Handle each exception
        }
    }
}
Implement your createNewJob logic and try it out.
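A possible sketch of that createNewJob logic (an assumption, not the author's code: the unique name suffix is only there so that each scheduled run gets a fresh JobInstance even with empty JobParameters, and fetchBillingFromEmailsStep comes from the snippet above):
private Job createNewJob() {
    // a new job name per run yields a new JobInstance despite empty parameters
    return jobBuilderFactory
            .get("importBillingJob-" + System.currentTimeMillis())
            .start(fetchBillingFromEmailsStep)
            .build();
}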
One easy way would be to configure a cron job on Unix that runs the application at the specified interval.