X-Ray configuration for a Spring Batch job

X-Ray is integrated into my service, and everything works fine when endpoints are triggered from other services.
A Spring Batch job is used to process some data and push part of it to an SNS topic. This job is launched via a SimpleJobLauncher.
The issue is that while pushing to SNS from my Spring Batch job, the following exception is thrown: SegmentNotFoundException: No segment in progress.
Based on the documentation, it looks like I need to pass the trace ID to the job:
https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-java-multithreading.html
Does anyone know the best way to integrate X-Ray with Spring Batch? And what would be the cleanest solution?

I solved this issue in the following way:
I passed the name, trace ID, and parent ID to my job via job parameters when launching it:
Entity segment = AWSXRay.getGlobalRecorder().getTraceEntity();
asyncJobLauncher.run(
    myJob,
    new JobParametersBuilder()
        .addLong(JOB_UNIQUENESS_KEY, System.nanoTime())
        .addString(X_RAY_NAME_ID_KEY, segment.getName())
        .addString(X_RAY_TRACE_ID_KEY, segment.getTraceId().toString())
        .addString(X_RAY_PARENT_ID_KEY, segment.getParentId())
        .toJobParameters()
);
I implemented a job listener that creates a new X-Ray segment when the job starts:
@Slf4j
@Component
@RequiredArgsConstructor
public class XRayJobListener implements JobExecutionListener {

    @Value("${spring.application.name}")
    private String appName;

    @Override
    public void beforeJob(@NonNull JobExecution jobExecution) {
        AWSXRayRecorder recorder = AWSXRay.getGlobalRecorder();
        String name = Objects.requireNonNullElse(
            jobExecution.getJobParameters().getString(X_RAY_NAME_ID_KEY),
            appName
        );
        Optional<String> traceIdOpt =
            Optional.ofNullable(jobExecution.getJobParameters().getString(X_RAY_TRACE_ID_KEY));
        TraceID traceID =
            traceIdOpt
                .map(TraceID::fromString)
                .orElseGet(TraceID::create);
        String parentId = jobExecution.getJobParameters().getString(X_RAY_PARENT_ID_KEY);
        recorder.beginSegment(name, traceID, parentId);
    }

    @Override
    public void afterJob(@NonNull JobExecution jobExecution) {
        AWSXRay.getGlobalRecorder().endSegment();
    }
}
And this listener is added to the configuration of my job:
@Bean
public Job myJob(
    JobBuilderFactory jobBuilderFactory,
    Step myStep1,
    Step myStep2,
    XRayJobListener xRayJobListener
) {
    return
        jobBuilderFactory
            .get("myJob")
            .incrementer(new RunIdIncrementer())
            .listener(xRayJobListener)
            .start(myStep1)
            .next(myStep2)
            .build();
}

Related

How to modify a job without affecting the other jobs deployed on Spring Cloud Data Flow

How can I modify and deploy one job (e.g., rebuild the jar file after changing job A) on SCDF while the other jobs in that jar file keep running?
I'm setting up Spring Batch jobs on Spring Cloud Data Flow. There are multiple jobs (A, B, C, ...) in my Spring Batch project. I have built a jar file from my project and deployed it on SCDF.
I have used --spring.batch.job.names=A/B/C/... when launching tasks to run each job separately.
I have tried creating a new jar and replacing the old one, but it doesn't work because the old jar is still running.
I have multiple classes, one per job, each extending CommonBatchConfiguration:
@Configuration
public class jobAclass extends CommonBatchConfiguration {
    @Bean
    public Job jobA() {
        return jobBuilderFactory
            .get("jobA")
            .incrementer(new RunIdIncrementer())
            .start(stepA1())
            .build();
    }
    @Bean
    public Step stepA1() {
        return stepBuilderFactory
            .get("stepA1")
            .tasklet(taskletA1())
            .build();
    }
    public Tasklet taskletA1() {
        return (contribution, chunkContext) -> {
            return RepeatStatus.FINISHED;
        };
    }
}
@Configuration
public class jobBclass extends CommonBatchConfiguration {
    @Bean
    public Job jobB() {
        return jobBuilderFactory
            .get("jobB")
            .incrementer(new RunIdIncrementer())
            .start(stepB1())
            .build();
    }
    @Bean
    public Step stepB1() {
        return stepBuilderFactory
            .get("stepB1")
            .tasklet(taskletB1())
            .build();
    }
    public Tasklet taskletB1() {
        return (contribution, chunkContext) -> {
            return RepeatStatus.FINISHED;
        };
    }
}
@EnableBatchProcessing
@Configuration
public class CommonBatchConfiguration {
    @Autowired
    public JobBuilderFactory jobBuilderFactory;
    @Autowired
    public StepBuilderFactory stepBuilderFactory;
}
I expect to be able to modify one job in the jar file and deploy it without affecting the others.
It looks like you need composed tasks (configured as batch jobs) in your case, and you can have the composed tasks deployed as individual tasks (batch applications). For more details on composed tasks, you can see here.
The feature of modifying one job's version without affecting the other tasks is being addressed in 2.3.x of SCDF; you can watch the epic here.
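For illustration, registering and launching a composed task from the SCDF shell looks roughly like this (the task names are hypothetical, and each job is assumed to be registered as its own task application):

task create my-composed-task --definition "job-a-task && job-b-task"
task launch my-composed-task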

How to pass JobParameters to myBatisPagingItemReader without using @StepScope

I am using Spring Batch's restart functionality so that it reads from the last failed point forward. My restart works fine as long as I don't add the @StepScope annotation to my myBatisPagingItemReader bean method.
I have to use @StepScope so that I can do late binding to get the jobParameters via an input parameter on my myBatisPagingItemReader bean method:
@Value("#{jobParameters['run-date']}")
If I use @StepScope, the restart does not work.
I tried adding the listener new JobParameterExecutionContextCopyListener() to copy JobParameters to the ExecutionContext.
But how will I get access to the ExecutionContext inside myBatisPagingItemReader, given that I don't have the ItemReader's open method?
I am not sure how I can get access to jobParameters when running myBatisPagingItemReader without using @StepScope. Any input is appreciated.
I am also not sure whether my understanding of Spring Batch restart is correct regarding the new (stateful) instance that is created when using @StepScope.
@Configuration
@EnableBatchProcessing
public class BatchConfig {
    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory,
                      ItemReader<Model> myBatisPagingItemReader,
                      ItemProcessor<Model, Model> itemProcessor,
                      ItemWriter<Model> itemWriter) {
        return stepBuilderFactory.get("data-load")
            .<Model, Model>chunk(10)
            .reader(myBatisPagingItemReader)
            .processor(itemProcessor)
            .writer(itemWriter)
            .listener(itemReadListener())
            .listener(new JobParameterExecutionContextCopyListener())
            .build();
    }
    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory, @Qualifier("step1") Step step1) {
        return jobBuilderFactory.get("load-job")
            .incrementer(new RunIdIncrementer())
            .start(step1)
            .listener(jobExecutionListener())
            .build();
    }
}
@Component
public class BatchInputReader {
    @Bean
    //@StepScope
    public ItemReader<Model> myBatisPagingItemReader(
            SqlSessionFactory sqlSessionFactory) {
        MyBatisPagingItemReader<Model> reader = new MyBatisPagingItemReader<>();
        Map<String, Object> parameterValues = new HashMap<>();
        // populate parameterValues from jobParameters ??
        reader.setSqlSessionFactory(sqlSessionFactory);
        reader.setParameterValues(parameterValues);
        reader.setQueryId("query");
        return reader;
    }
}
You are declaring a Spring bean (myBatisPagingItemReader) in a class annotated with @Component (BatchInputReader). This is not correct.
What you need to do is declare the MyBatis reader as a bean in your configuration class BatchConfig. Once this is done and the bean is annotated with @StepScope, you can pass job parameters as follows:
@Configuration
@EnableBatchProcessing
public class BatchConfig {
    @Bean
    @StepScope
    public ItemReader<Model> myBatisPagingItemReader(
            SqlSessionFactory sqlSessionFactory,
            @Value("#{jobParameters['param1']}") String param1,
            @Value("#{jobParameters['param2']}") String param2) {
        MyBatisPagingItemReader<Model> reader = new MyBatisPagingItemReader<>();
        Map<String, Object> parameterValues = new HashMap<>();
        // populate parameterValues from jobParameters ?? => those can now be read from the method parameters
        reader.setSqlSessionFactory(sqlSessionFactory);
        reader.setParameterValues(parameterValues);
        reader.setQueryId("query");
        return reader;
    }
    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory,
                      ItemReader<Model> myBatisPagingItemReader,
                      ItemProcessor<Model, Model> itemProcessor,
                      ItemWriter<Model> itemWriter) {
        return stepBuilderFactory.get("data-load")
            .<Model, Model>chunk(10)
            .reader(myBatisPagingItemReader)
            .processor(itemProcessor)
            .writer(itemWriter)
            .listener(itemReadListener())
            .listener(new JobParameterExecutionContextCopyListener())
            .build();
    }
    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory, @Qualifier("step1") Step step1) {
        return jobBuilderFactory.get("load-job")
            .incrementer(new RunIdIncrementer())
            .start(step1)
            .listener(jobExecutionListener())
            .build();
    }
}
More details about this are in the Late Binding of Job and Step Attributes section of the reference documentation. BatchInputReader will be left empty and is no longer needed. Less is more! :-)
Hope this helps.
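For completeness, a minimal sketch of launching the job so the reader's bindings resolve; the parameter names match the @Value expressions above, and the values are placeholders:

JobParameters jobParameters = new JobParametersBuilder()
        .addString("param1", "value1")
        .addString("param2", "value2")
        .toJobParameters();
jobLauncher.run(job, jobParameters);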
Adding to my question: I have added myBatisPagingItemReader() to my configuration-annotated class as suggested.
Restart example: when I use the @StepScope annotation on myBatisPagingItemReader(), the reader fetches 5 records, and I have the chunk size (commit-interval) set to 3.
Job Instance - 01 - Job Parameter - 01/02/2019.
chunk-1:
- process record-1
- process record-2
- process record-3
- writer writes all 3 records
- chunk-1 commit successful
chunk-2:
- process record-4
- process record-5 - throws an exception
Job completes and is set to 'FAILED' status.
Now the job is restarted using the same job parameters.
Job Instance - 01 - Job Parameter - 01/02/2019.
chunk-1:
- process record-1
- process record-2
- process record-3
- writer writes all 3 records
- chunk-1 commit successful
chunk-2:
- process record-4
- process record-5 - throws an exception
Job completes and is set to 'FAILED' status.
Please note: because I am using the @StepScope annotation on the myBatisPagingItemReader() bean method, the job creates a new instance; see the log messages below.
Creating object in scope=step, name=scopedTarget.myBatisPagingItemReader
Registered destruction callback in scope=step, name=scopedTarget.myBatisPagingItemReader
As it is a new instance, it starts processing from the beginning instead of resuming from chunk-2.
If I don't use @StepScope, it restarts from chunk-2, because the restarted job step sets MyBatisPagingItemReader.read.count=3.
I would like to use @StepScope for the late binding. If I use @StepScope, is it possible for myBatisPagingItemReader to set read.count from the last failure so that restart works?
Or
If I don't use @StepScope, is there a way to get job parameters inside myBatisPagingItemReader?

Spring Batch Integration using Java DSL / launching jobs

I have a working Spring Boot/Batch project containing 2 jobs.
I'm now trying to add Spring Integration to poll files from a remote SFTP server using only Java configuration / the Java DSL, and then launch a job.
The file polling is working, but I have no idea how to launch a job in my flow, despite reading these links:
Spring Batch Integration config using Java DSL
and
Spring Batch Integration job-launching-gateway
Some code snippets:
@Bean
public SessionFactory sftpSessionFactory() {
    DefaultSftpSessionFactory sftpSessionFactory = new DefaultSftpSessionFactory();
    sftpSessionFactory.setHost("myip");
    sftpSessionFactory.setPort(22);
    sftpSessionFactory.setUser("user");
    sftpSessionFactory.setPrivateKey(new FileSystemResource("path to my key"));
    return sftpSessionFactory;
}

@Bean
public IntegrationFlow ftpInboundFlow() {
    return IntegrationFlows
        .from(Sftp.inboundAdapter(sftpSessionFactory())
                .deleteRemoteFiles(Boolean.FALSE)
                .preserveTimestamp(Boolean.TRUE)
                .autoCreateLocalDirectory(Boolean.TRUE)
                .remoteDirectory("remote dir")
                .regexFilter(".*\\.txt$")
                .localDirectory(new File("C:/sftp/")),
            e -> e.id("sftpInboundAdapter").poller(Pollers.fixedRate(600000)))
        .handle("FileMessageToJobRequest", "toRequest")
        // what to put next to process the jobRequest ?
For .handle("FileMessageToJobRequest", "toRequest") I use the FileMessageToJobRequest transformer described here: http://docs.spring.io/spring-batch/trunk/reference/html/springBatchIntegration.html
I would appreciate any help with this, many thanks.
EDIT after Gary's comment:
I've added the following; it doesn't compile, of course, because I don't understand how the request is propagated:
.handle("FileMessageToJobRequest","toRequest")
.handle(jobLaunchingGw())
.get();
}
@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher());
}

@Autowired
private JobLauncher jobLauncher;

@Bean
public JobExecution jobLauncher(JobLaunchRequest req) throws JobExecutionException {
    JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
    return execution;
}
I've found a way to launch a job using a @ServiceActivator and adding this to my flow, but I'm not sure it's good practice:
.handle("launchBatchService", "launch")

@Component("launchBatchService")
public class LaunchBatchService {

    private static Logger log = LoggerFactory.getLogger(LaunchBatchService.class);

    @Autowired
    private JobLauncher jobLauncher;

    @ServiceActivator
    public JobExecution launch(JobLaunchRequest req) throws JobExecutionException {
        JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
        return execution;
    }
}
.handle(jobLaunchingGw())
// handle result
...

@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher());
}

where jobLauncher() is the JobLauncher bean.
EDIT
Your service activator is doing about the same as the JLG; it uses this code.
Your jobLauncher @Bean is wrong.
@Beans are definitions; they don't do runtime stuff like this:

@Bean
public JobExecution jobLauncher(JobLaunchRequest req) throws JobExecutionException {
    JobExecution execution = jobLauncher.run(req.getJob(), req.getJobParameters());
    return execution;
}
Since you are already autowiring a JobLauncher, just use that.
@Autowired
private JobLauncher jobLauncher;

@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher);
}
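Putting the pieces together, here is a minimal sketch of the flow with the gateway wired in. It reuses the FileMessageToJobRequest transformer and adapter options from the question; the terminal handler at the end is illustrative only:

@Bean
public IntegrationFlow ftpInboundFlow() {
    return IntegrationFlows
        .from(Sftp.inboundAdapter(sftpSessionFactory())
                .remoteDirectory("remote dir")
                .localDirectory(new File("C:/sftp/")),
            e -> e.id("sftpInboundAdapter").poller(Pollers.fixedRate(600000)))
        // transform the File payload into a JobLaunchRequest
        .handle("FileMessageToJobRequest", "toRequest")
        // launch the job; the reply payload is the resulting JobExecution
        .handle(jobLaunchingGw())
        // consume the JobExecution reply (illustrative only)
        .handle(message -> System.out.println(message.getPayload()))
        .get();
}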

How to continually run a Spring Batch job

What is the best way to continually run a Spring Batch job? Do we need to write a shell script that loops and starts the job at predefined intervals? Or is there a way within Spring Batch itself to configure a job so that it repeats either
1) at pre-defined intervals, or
2) after the completion of each run?
Thanks
If you want to launch your jobs periodically, you can combine Spring Scheduler and Spring Batch. Here is a concrete example: Spring Scheduler + Batch Example.
If you want to relaunch your job continually (are you sure?!), you can configure a JobExecutionListener on your job. Then, in the listener's afterJob(JobExecution jobExecution) method, you can relaunch the job, as sketched below.
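A minimal sketch of such a listener, assuming the job should only be relaunched after a successful run. Two caveats: Spring Batch refuses to re-run a completed job instance with identical parameters, so each relaunch needs fresh JobParameters; and with a synchronous JobLauncher the relaunch stays on the same thread, so a scheduler (as in the next answer) is usually the safer choice.

public class RelaunchJobListener implements JobExecutionListener {

    private final JobLauncher jobLauncher;
    private final JobLocator jobLocator; // avoids a circular reference to the job being listened to

    public RelaunchJobListener(JobLauncher jobLauncher, JobLocator jobLocator) {
        this.jobLauncher = jobLauncher;
        this.jobLocator = jobLocator;
    }

    @Override
    public void beforeJob(JobExecution jobExecution) {
        // nothing to do before the job starts
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() != BatchStatus.COMPLETED) {
            return; // don't blindly relaunch a failed run
        }
        try {
            Job job = jobLocator.getJob(jobExecution.getJobInstance().getJobName());
            // a fresh timestamp parameter makes each relaunch a new job instance
            JobParameters params = new JobParametersBuilder()
                    .addLong("run.id", System.currentTimeMillis())
                    .toJobParameters();
            jobLauncher.run(job, params);
        } catch (JobExecutionException e) {
            // log and stop relaunching
        }
    }
}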
I did something like this for importing emails, which I have to check periodically:
@SpringBootApplication
@EnableScheduling
public class ImportBillingFromEmailBatchRunner {

    private static final Logger LOG = LoggerFactory.getLogger(ImportBillingFromEmailBatchRunner.class);

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(ImportBillingFromEmailBatchRunner.class);
        app.run(args);
    }

    @Bean
    BillingEmailCronService billingEmailCronService() {
        return new BillingEmailCronService();
    }
}
So the BillingEmailCronService takes care of the continuation:
public class BillingEmailCronService {

    private static final Logger LOG = LoggerFactory.getLogger(BillingEmailCronService.class);

    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    private JobExplorer jobExplorer;
    @Autowired
    private JobRepository jobRepository;
    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    @Qualifier(BillingBatchConfig.QUALIFIER)
    private Step fetchBillingFromEmailsStep;

    @Scheduled(fixedDelay = 5000)
    public void run() {
        LOG.info("Processing emails with invoices...");
        try {
            Job job = createNewJob();
            JobParameters jobParameters = new JobParameters();
            jobLauncher.run(job, jobParameters);
        } catch (...) {
            // Handle each exception
        }
    }
}
Implement your createNewJob logic and try it out.
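createNewJob() is not shown above; a minimal sketch of what it might look like, using the autowired jobBuilderFactory and step (the job name is hypothetical). Note that jobLauncher.run() with an empty JobParameters will be rejected after the first successful run of a given job name, so in practice each launch should get unique parameters (e.g. a timestamp):

private Job createNewJob() {
    return jobBuilderFactory
            .get("importBillingFromEmailJob") // hypothetical job name
            .start(fetchBillingFromEmailsStep)
            .build();
}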
One easy way would be to configure a cron job on Unix that runs the application at a specified interval.
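For example, a crontab entry along these lines would launch the batch application at the top of every hour (the jar path is hypothetical):
0 * * * * java -jar /opt/batch/my-batch-app.jar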

How do I set JobParameters in spring batch with spring-boot

I followed the guide at http://spring.io/guides/gs/batch-processing/ but it describes a job with no configurable parameters. I'm using Maven to build my project.
I'm porting an existing job that I have defined in XML and would like to pass in the jobParameters through the command line.
I tried the following:
@Configuration
@EnableBatchProcessing
public class MyBatchConfiguration {

    // other beans omitted

    @Bean
    public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
        return new FileSystemResource(dest);
    }
}
Then I compile my project using:
mvn clean package
Then I try to launch the program like this:
java -jar my-jarfile.jar dest=/tmp/foo
And I get an exception saying :
[...]
Caused by: org.springframework.expression.spel.SpelEvaluationException:
EL1008E:(pos 0): Field or property 'jobParameters' cannot be found on object of
type 'org.springframework.beans.factory.config.BeanExpressionContext'
Thanks!
Parse the job parameters from the command line, then create and populate a JobParameters object:
public JobParameters getJobParameters() {
    JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
    jobParametersBuilder.addString("dest", <dest_from_cmd_line>);
    jobParametersBuilder.addDate("date", <date_from_cmd_line>);
    return jobParametersBuilder.toJobParameters();
}
Pass them to your job via JobLauncher -
JobLauncher jobLauncher = context.getBean(JobLauncher.class);
JobExecution jobExecution = jobLauncher.run(job, jobParameters);
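To show where these snippets fit, here is a minimal sketch of a main method tying the steps together. The class and bean names are assumptions, the key=value parsing is deliberately naive, and with Spring Boot's automatic job launching enabled you may also need spring.batch.job.enabled=false to avoid a duplicate run:

public static void main(String[] args) throws Exception {
    ConfigurableApplicationContext context =
            SpringApplication.run(MyBatchConfiguration.class, args);

    // naive parsing of key=value pairs such as dest=/tmp/foo
    JobParametersBuilder builder = new JobParametersBuilder();
    for (String arg : args) {
        String[] pair = arg.split("=", 2);
        if (pair.length == 2) {
            builder.addString(pair[0], pair[1]);
        }
    }

    JobLauncher jobLauncher = context.getBean(JobLauncher.class);
    Job job = context.getBean(Job.class); // assumes a single Job bean
    jobLauncher.run(job, builder.toJobParameters());
}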
Now you can access them using code like -
@Bean
@StepScope
public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
    return new FileSystemResource(dest);
}
Or in a @Configuration class that configures Spring Batch job artifacts such as ItemReader, ItemWriter, etc.:
@Bean
@StepScope
public JdbcCursorItemReader<MyPojo> reader(@Value("#{jobParameters}") Map jobParameters) {
    return MyReaderHelper.getReader(jobParameters);
}
I managed to get this working by simply annotating my bean as follows:
@Bean
@StepScope
public Resource destFile(@Value("#{jobParameters[dest]}") String dest) {
    return new FileSystemResource(dest);
}