How to move files to archive and error folders after processing - spring-batch

The job runs once and tries to process all the files available in a source folder in a single step. Afterwards it needs to move the processed (or tried-but-failed) files out of the source folder into subsequent folders (/_archived, /_failed). What is the best way to move successfully processed files into an archive folder and unsuccessful files into an error folder using Spring Batch?

You can add a separate tasklet, or use a JobExecutionListener.afterJob hook, to move the files.
Below is a sample of moving files with a tasklet.
Java config
@Autowired
private MoveFilesTasklet moveFilesTasklet;

@Bean
protected Step moveFiles() {
    return steps
            .get("moveFiles")
            .tasklet(moveFilesTasklet)
            .build();
}

@Bean
public Job job() {
    return jobs
            .get("taskletsJob")
            .start(processFiles())
            .next(moveFiles())
            .build();
}
Tasklet
@Component
public class MoveFilesTasklet implements Tasklet {

    private String filePath = "someFilePath";

    @Override
    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
        final File directory = new File(filePath);
        // move each matching file into the target folder, keeping its original name
        Arrays.stream(directory.listFiles((dir, name) -> name.matches("yourfilePrefix.*")))
              .forEach(singleFile -> singleFile.renameTo(new File("someNewFilePath", singleFile.getName())));
        return RepeatStatus.FINISHED;
    }
}
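Alternatively, the same move can be done once the whole job has finished, using the JobExecutionListener.afterJob hook mentioned above. Below is a minimal sketch of that approach; the folder names and the decision to route files purely on the job's overall status (rather than per file) are assumptions for illustration, not part of the original answer.
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.stereotype.Component;

@Component
public class MoveFilesJobListener implements JobExecutionListener {

    // hypothetical locations; replace with your real source/archive/error folders
    private final Path sourceDir = Paths.get("someFilePath");
    private final Path archiveDir = Paths.get("someFilePath/_archived");
    private final Path errorDir = Paths.get("someFilePath/_failed");

    @Override
    public void beforeJob(JobExecution jobExecution) {
        // nothing to do before the job
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // pick the target folder from the overall job outcome; a per-file split
        // would need statuses collected during the processing step
        Path target = jobExecution.getStatus() == BatchStatus.COMPLETED ? archiveDir : errorDir;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(sourceDir, "yourfilePrefix*")) {
            Files.createDirectories(target);
            for (Path file : files) {
                Files.move(file, target.resolve(file.getFileName()), StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new IllegalStateException("Could not move processed files", e);
        }
    }
}
Register the listener on the job, for example with .listener(moveFilesJobListener) in the job builder, so that afterJob runs whether the job completes or fails.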

Related

Store filenames in Spring Batch for sending an email

I'm writing an application in Spring Batch to do this:
1. Read the content of a folder, file by file.
2. Rename the files and move them to several folders.
3. Send two emails: one with the names of the files processed successfully and one with the names of the files that threw errors.
I've already got points 1 and 2 working, but I still need point 3. How can I store the file names that have been sent to the writer method in an elegant way with Spring Batch?
You can use the ExecutionContext to store the names of the files that were processed successfully and of those that failed with errors.
Keep a List (or a similar data structure) that holds the file names once the business logic has run. Below is a small snippet for reference that implements StepExecutionListener:
public class FileProcessor implements ItemWriter<TestData>, StepExecutionListener {

    private List<String> success = new ArrayList<>();
    private List<String> failed = new ArrayList<>();

    @Override
    public void beforeStep(StepExecution stepExecution) {
    }

    @Override
    public void write(List<? extends TestData> items) throws Exception {
        // business logic which adds the success and failure file names
        // to the lists after processing
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        stepExecution.getJobExecution().getExecutionContext()
                .put("fileProcessedSuccessfully", success);
        stepExecution.getJobExecution().getExecutionContext()
                .put("fileProcessedFailure", failed);
        return ExitStatus.COMPLETED;
    }
}
Now that the file names are stored in the job execution context, they can be used in the send-email step.
public class SendReport implements Tasklet, StepExecutionListener {

    private static final Logger logger = LoggerFactory.getLogger(SendReport.class);

    private List<String> success = new ArrayList<>();
    private List<String> failed = new ArrayList<>();

    @Override
    public void beforeStep(StepExecution stepExecution) {
        try {
            // fetch the lists of file names stored in the context by the previous step
            success = (List<String>) stepExecution.getJobExecution().getExecutionContext()
                    .get("fileProcessedSuccessfully");
            failed = (List<String>) stepExecution.getJobExecution()
                    .getExecutionContext().get("fileProcessedFailure");
        } catch (Exception e) {
            // keep the empty defaults if the context entries are missing
        }
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // business logic to send the email with the file names
        return RepeatStatus.FINISHED;
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        logger.debug("Email trigger step completed successfully!");
        return ExitStatus.COMPLETED;
    }
}
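For completeness, here is a minimal wiring sketch showing how the two steps could be chained so the ExecutionContext hand-off above works. The configuration class, the step and job names, and the reader bean are assumptions for illustration (FileProcessor and SendReport are assumed to be registered as beans); the listeners are registered explicitly on their steps so that beforeStep/afterStep are guaranteed to run.
@Configuration
@EnableBatchProcessing
public class ReportJobConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job reportJob(ItemReader<TestData> reader,   // hypothetical reader bean
                         FileProcessor fileProcessor,
                         SendReport sendReport) {
        Step processFiles = stepBuilderFactory.get("processFiles")
                .<TestData, TestData>chunk(10)
                .reader(reader)
                .writer(fileProcessor)
                .listener(fileProcessor)   // pushes the lists into the job context in afterStep
                .build();

        Step sendEmail = stepBuilderFactory.get("sendEmail")
                .tasklet(sendReport)
                .listener(sendReport)      // pulls the lists back out in beforeStep
                .build();

        return jobBuilderFactory.get("reportJob")
                .start(processFiles)
                .next(sendEmail)
                .build();
    }
}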

How to modify a job without affecting the other jobs deployed on Spring Cloud Data Flow

How can I modify and deploy one job (e.g. rebuild the jar file with changes to job A) on SCDF while the other jobs in that jar file keep running?
I'm setting up a Spring Batch Job on Spring Cloud Data Flow. There are multiple jobs (A,B,C,...) in my Spring Batch project. I have built a jar file from my project and deployed it on SCDF.
I have used --spring.batch.job.names=A/B/C/... when launching tasks to run each job separately.
I have tried creating a new jar and replacing the old one, but that doesn't work because the old jar is still running.
I have multiple classes, one per job, each extending CommonBatchConfiguration:
@Configuration
public class jobAclass extends CommonBatchConfiguration {

    @Bean
    public Job jobA() {
        return jobBuilderFactory
                .get("jobA")
                .incrementer(new RunIdIncrementer())
                .start(stepA1())
                .build();
    }

    @Bean
    public Step stepA1() {
        return stepBuilderFactory
                .get("stepA1")
                .tasklet(taskletA1())
                .build();
    }

    public Tasklet taskletA1() {
        return (contribution, chunkContext) -> {
            return RepeatStatus.FINISHED;
        };
    }
}
@Configuration
public class jobBclass extends CommonBatchConfiguration {

    @Bean
    public Job jobB() {
        return jobBuilderFactory
                .get("jobB")
                .incrementer(new RunIdIncrementer())
                .start(stepB1())
                .build();
    }

    @Bean
    public Step stepB1() {
        return stepBuilderFactory
                .get("stepB1")
                .tasklet(taskletB1())
                .build();
    }

    public Tasklet taskletB1() {
        return (contribution, chunkContext) -> {
            return RepeatStatus.FINISHED;
        };
    }
}
@EnableBatchProcessing
@Configuration
public class CommonBatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;
}
I expect to be able to modify one job in the jar file and deploy it without affecting the others.
It looks like you need composed tasks (configured as batch jobs) in your case, with each composed task deployed as an individual task (batch application). For more details on composed tasks, see here.
The ability to modify one job's version without affecting the other tasks is being addressed in SCDF 2.3.x; you can follow the epic here.

I am getting error Table 'test.batch_job_instance' doesn't exist

I am new to Spring Batch. I have configured my job with an in-memory repository, but it still seems to be using the database to persist the job metadata.
My Spring Batch configuration is:
@Configuration
public class BatchConfiguration {

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private JobBuilderFactory jobBuilder;

    @Bean
    public JobLauncher jobLauncher() throws Exception {
        SimpleJobLauncher job = new SimpleJobLauncher();
        job.setJobRepository(getJobRepo());
        job.afterPropertiesSet();
        return job;
    }

    @Bean
    public PlatformTransactionManager getTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public JobRepository getJobRepo() throws Exception {
        return new MapJobRepositoryFactoryBean(getTransactionManager()).getObject();
    }

    @Bean
    public Step step1(JdbcBatchItemWriter<Person> writer) throws Exception {
        return stepBuilderFactory.get("step1")
                .<Person, Person>chunk(10)
                .reader(reader())
                .processor(processor())
                .writer(writer).repository(getJobRepo())
                .build();
    }

    @Bean
    public Job job(@Qualifier("step1") Step step1) throws Exception {
        return jobBuilder.get("myJob").start(step1).repository(getJobRepo()).build();
    }
}
How do I resolve the above issue?
If you are using Spring Boot, a simple property in your application.properties will solve the issue:
spring.batch.initialize-schema=ALWAYS
For a non-Spring Boot setup: this error shows up when a DataSource bean is declared in the batch configuration. To work around the problem, I added an embedded datasource, since I didn't want to create those tables in the application database:
@Bean
public DataSource mysqlDataSource() {
    // create your application datasource here
}

@Bean
@Primary
public DataSource batchEmbeddedDatasource() {
    // in-memory datasource required by Spring Batch
    EmbeddedDatabaseBuilder builder = new EmbeddedDatabaseBuilder();
    return builder.setType(EmbeddedDatabaseType.H2)
            .addScript("classpath:schema-drop-h2.sql")
            .addScript("classpath:schema-h2.sql")
            .build();
}
The initialization scripts can be found inside spring-batch-core-xxx.jar under the org.springframework.batch.core package. Note that I used an in-memory database, but the solution is also valid for other database systems.
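The @Primary annotation is what makes the batch configuration inject the embedded database for its metadata tables rather than the application datasource. If you prefer to be explicit about it, a possible alternative (just a sketch, assuming Spring Batch 4.x where DefaultBatchConfigurer is available, and the bean names used above) is to extend DefaultBatchConfigurer and hand it the embedded datasource directly:
@Configuration
public class BatchMetadataConfiguration extends DefaultBatchConfigurer {

    // point Spring Batch's JobRepository at the embedded database only;
    // the application datasource stays untouched
    public BatchMetadataConfiguration(@Qualifier("batchEmbeddedDatasource") DataSource batchDataSource) {
        super(batchDataSource);
    }
}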
For those who face the same problem with a MySQL database on CentOS (or most Unix-based systems): table names are case-sensitive on Linux, so setting lower_case_table_names=1 solved the problem.
The official documentation can be found here.
For those using versions newer than Spring Boot 2.5, this worked inside application.properties:
spring.batch.jdbc.initialize-schema = ALWAYS
This solved my case:
spring.batch.jdbc.initialize-schema=ALWAYS

Spring Batch Integration using Java DSL / launching jobs

I have a working Spring Boot/Batch project containing 2 jobs.
I'm now trying to add Spring Integration to poll files from a remote SFTP server using only Java configuration / the Java DSL, and then launch a job.
The file polling is working, but I have no idea how to launch a Job in my flow, despite reading these links:
Spring Batch Integration config using Java DSL
and
Spring Batch Integration job-launching-gateway
some code snippets:
@Bean
public SessionFactory sftpSessionFactory() {
    DefaultSftpSessionFactory sftpSessionFactory = new DefaultSftpSessionFactory();
    sftpSessionFactory.setHost("myip");
    sftpSessionFactory.setPort(22);
    sftpSessionFactory.setUser("user");
    sftpSessionFactory.setPrivateKey(new FileSystemResource("path to my key"));
    return sftpSessionFactory;
}

@Bean
public IntegrationFlow ftpInboundFlow() {
    return IntegrationFlows
            .from(Sftp.inboundAdapter(sftpSessionFactory())
                            .deleteRemoteFiles(Boolean.FALSE)
                            .preserveTimestamp(Boolean.TRUE)
                            .autoCreateLocalDirectory(Boolean.TRUE)
                            .remoteDirectory("remote dir")
                            .regexFilter(".*\\.txt$")
                            .localDirectory(new File("C:/sftp/")),
                    e -> e.id("sftpInboundAdapter").poller(Pollers.fixedRate(600000)))
            .handle("FileMessageToJobRequest", "toRequest")
            // what to put next to process the jobRequest ?
For .handle("FileMessageToJobRequest","toRequest") I use the one described here http://docs.spring.io/spring-batch/trunk/reference/html/springBatchIntegration.html
I would appreciate any help on that, many thanks.
EDIT after Gary's comment
I've added the following; it doesn't compile (of course), because I don't understand how the request is propagated:
.handle("FileMessageToJobRequest","toRequest")
.handle(jobLaunchingGw())
.get();
}
#Bean
public MessageHandler jobLaunchingGw() {
return new JobLaunchingGateway(jobLauncher());
}
#Autowired
private JobLauncher jobLauncher;
#Bean
public JobExecution jobLauncher(JobLaunchRequest req) throws JobExecutionException {
JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
return execution;
}
I've found a way to launch a job using a @ServiceActivator and adding this to my flow, but I'm not sure it's good practice:
.handle("lauchBatchService", "launch")
#Component("lauchBatchService")
public class LaunchBatchService {
private static Logger log = LoggerFactory.getLogger(LaunchBatchService.class);
#Autowired
private JobLauncher jobLauncher;
#ServiceActivator
public JobExecution launch(JobLaunchRequest req) throws JobExecutionException {
JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
return execution;
}
}
.handle(jobLaunchingGw())
// handle result
...
@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher());
}
where jobLauncher() is the JobLauncher bean.
EDIT
Your service activator is doing about the same as the JLG; it uses this code.
Your jobLauncher @Bean is wrong.
@Bean methods are definitions; they don't do runtime work like this:
@Bean
public JobExecution jobLauncher(JobLaunchRequest req) throws JobExecutionException {
    JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
    return execution;
}
Since you are already autowiring a JobLauncher, just use that.
@Autowired
private JobLauncher jobLauncher;

@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher);
}
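Putting the pieces together, a complete flow might look roughly like the sketch below. This is an outline under assumptions rather than the poster's final code: the FileMessageToJobRequest transformer is the one from the Spring Batch reference documentation linked above, and the terminal handler that simply prints the resulting JobExecution is illustrative.
@Bean
public IntegrationFlow ftpInboundFlow(SessionFactory sftpSessionFactory,
                                      FileMessageToJobRequest toJobRequest,
                                      JobLaunchingGateway jobLaunchingGateway) {
    return IntegrationFlows
            .from(Sftp.inboundAdapter(sftpSessionFactory)
                            .deleteRemoteFiles(Boolean.FALSE)
                            .preserveTimestamp(Boolean.TRUE)
                            .autoCreateLocalDirectory(Boolean.TRUE)
                            .remoteDirectory("remote dir")
                            .regexFilter(".*\\.txt$")
                            .localDirectory(new File("C:/sftp/")),
                    e -> e.id("sftpInboundAdapter").poller(Pollers.fixedRate(600000)))
            // turn the polled File message into a JobLaunchRequest
            .handle(toJobRequest, "toRequest")
            // launch the job; the gateway replies with the JobExecution
            .handle(jobLaunchingGateway)
            // consume the reply, e.g. just log it
            .handle(message -> System.out.println("Launched: " + message.getPayload()))
            .get();
}

@Bean
public JobLaunchingGateway jobLaunchingGateway(JobLauncher jobLauncher) {
    return new JobLaunchingGateway(jobLauncher);
}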

How do I set JobParameters in spring batch with spring-boot

I followed the guide at http://spring.io/guides/gs/batch-processing/ but it describes a job with no configurable parameters. I'm using Maven to build my project.
I'm porting an existing job that I have defined in XML and would like to pass-in the jobParameters through the command.
I tried the following:
@Configuration
@EnableBatchProcessing
public class MyBatchConfiguration {

    // other beans omitted

    @Bean
    public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
        return new FileSystemResource(dest);
    }
}
Then I compile my project using:
mvn clean package
Then I try to launch the program like this:
java -jar my-jarfile.jar dest=/tmp/foo
And I get an exception saying:
[...]
Caused by: org.springframework.expression.spel.SpelEvaluationException:
EL1008E:(pos 0): Field or property 'jobParameters' cannot be found on object of
type 'org.springframework.beans.factory.config.BeanExpressionContext'
Thanks!
Parse the job parameters from the command line, then create and populate a JobParameters object:
public JobParameters getJobParameters() {
    JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
    jobParametersBuilder.addString("dest", <dest_from_cmd_line>);
    jobParametersBuilder.addDate("date", <date_from_cmd_line>);
    return jobParametersBuilder.toJobParameters();
}
Pass them to your job via JobLauncher -
JobLauncher jobLauncher = context.getBean(JobLauncher.class);
JobExecution jobExecution = jobLauncher.run(job, jobParameters);
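To tie this together, here is a minimal sketch of a launcher class under assumed names (MyBatchConfiguration from the question, and a single Job bean in the context). It parses key=value arguments such as dest=/tmp/foo into JobParameters and launches the job itself; it assumes spring.batch.job.enabled=false is set so that Spring Boot's own runner does not also launch the job.
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.boot.SpringApplication;
import org.springframework.context.ConfigurableApplicationContext;

public class ManualJobRunner {

    public static void main(String[] args) throws Exception {
        // boot the application context containing the batch configuration
        ConfigurableApplicationContext context =
                SpringApplication.run(MyBatchConfiguration.class, args);

        // turn "key=value" command-line arguments into JobParameters
        JobParametersBuilder builder = new JobParametersBuilder();
        for (String arg : args) {
            String[] pair = arg.split("=", 2);
            if (pair.length == 2) {
                builder.addString(pair[0], pair[1]);
            }
        }

        // launch the job with the parsed parameters
        JobLauncher jobLauncher = context.getBean(JobLauncher.class);
        Job job = context.getBean(Job.class);
        jobLauncher.run(job, builder.toJobParameters());
    }
}
Note that Spring Boot's auto-configured runner can also turn name=value command-line arguments into job parameters on its own, as long as the beans that consume them are step-scoped, as shown with @StepScope below.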
Now you can access them using code like -
@Bean
@StepScope
public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
    return new FileSystemResource(dest);
}
Or in a @Configuration class that configures Spring Batch job artifacts such as an ItemReader, ItemWriter, etc.:
@Bean
@StepScope
public JdbcCursorItemReader<MyPojo> reader(@Value("#{jobParameters}") Map jobParameters) {
    return MyReaderHelper.getReader(jobParameters);
}
I managed to get this working by simply annotating my bean as follows; jobParameters are only resolvable while a step is running, so the bean has to be step-scoped:
@Bean
@StepScope
public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
    return new FileSystemResource(dest);
}