Get last completed job's execution context in Spring Batch - spring-batch

I am trying to fetch the execution context of the last completed job. Below is the code:
@Component
public class MyJobDataReader implements ItemReader<MyDbEntity>, StepExecutionListener {

    @Autowired
    @Qualifier("jobService")
    JobService jobService;

    @Override
    public void beforeStep(StepExecution stepExecution) {
        Collection<JobExecution> jobExecutions = null;
        try {
            jobExecutions = jobService.listJobExecutionsForJob("myjob", 0, jobService.countJobExecutionsForJob("myjob"));
        } catch (NoSuchJobException e) {
            //log
        }
        JobExecution lastCompletedJobExecution =
                jobExecutions
                        .stream()
                        .filter(param -> param.getExitStatus().getExitCode().equals(ExitStatus.COMPLETED.getExitCode()))
                        .findFirst()
                        .orElse(null);
        Long myExecutionContextParam =
                lastCompletedJobExecution
                        .getExecutionContext()
                        .getLong(contextParam);
    }
}
But myExecutionContextParam comes back null, even though I can see the value in the BATCH_JOB_EXECUTION_CONTEXT table for the fetched job execution.

The issue was that jobService.listJobExecutionsForJob was returning JobExecutions with only a few fields populated; I then had to call jobService.getJobExecution(jobExecutionId) to get a fully enriched JobExecution.
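A minimal sketch of that extra lookup inside beforeStep, assuming the same jobService bean and contextParam key as in the snippet above:

// Sketch: re-fetch the execution by id so its execution context is fully populated.
if (lastCompletedJobExecution != null) {
    try {
        JobExecution enriched = jobService.getJobExecution(lastCompletedJobExecution.getId());
        Long myExecutionContextParam = enriched.getExecutionContext().getLong(contextParam);
    } catch (NoSuchJobExecutionException e) {
        //log
    }
}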


Reset scheduled job after completion

I have a scheduled job implemented with Spring Batch. Right now, when it finishes it doesn't start again because it is detected as completed. Is it possible to reset its state after completion?
@Component
class JobScheduler {

    @Autowired
    private Job job1;

    @Autowired
    private JobLauncher jobLauncher;

    @Scheduled(cron = "0 0/15 * * * ?")
    public void launchJob1() throws Exception {
        this.jobLauncher.run(this.job1, new JobParameters());
    }
}
@Configuration
public class Job1Configuration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job job1() {
        return this.jobBuilderFactory.get("job1")
                .start(this.step1()).on(STEP1_STATUS.NOT_READY.get()).end()
                .from(this.step1()).on(STEP1_STATUS.READY.get()).to(this.step2())
                .next(this.step3())
                .end()
                .build();
    }
}
I know I can set a job parameter with the time or the id, but this will launch a new execution every 15 minutes. I want to repeat the same execution until it completes without errors, and only then execute a new one.
You can't restart your job because you're setting the job status to COMPLETE by calling end() in .start(this.step1()).on(STEP1_STATUS.NOT_READY.get()).end().
You should instead either fail the job by calling .start(this.step1()).on(STEP1_STATUS.NOT_READY.get()).fail()
or stop the job by calling .start(this.step1()).on(STEP1_STATUS.NOT_READY.get()).stopAndRestart(step1())
Those options will mean the job status is either FAILED or STOPPED instead of COMPLETE, which means that if you launch the job with the same JobParameters, it will restart the previous job execution.
See https://docs.spring.io/spring-batch/docs/current/reference/html/step.html#configuringForStop
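As a sketch, the job definition from the question with the end() transition replaced by fail() would look like this (everything else unchanged):

// Sketch: NOT_READY now fails the job instead of completing it, so the same
// JobParameters can be used to restart the previous execution.
@Bean
public Job job1() {
    return this.jobBuilderFactory.get("job1")
            .start(this.step1()).on(STEP1_STATUS.NOT_READY.get()).fail()
            .from(this.step1()).on(STEP1_STATUS.READY.get()).to(this.step2())
            .next(this.step3())
            .end()
            .build();
}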
To launch the job in a way that handles restarting previous instances or starting a new instance, you could look at how the SimpleJobService in spring-batch-admin does it and modify the launch method slightly for your purposes. This requires you to specify an incremental job parameter that is used to launch new instances of your job.
https://github.com/spring-attic/spring-batch-admin/blob/master/spring-batch-admin-manager/src/main/java/org/springframework/batch/admin/service/SimpleJobService.java#L250
@Override
public JobExecution launch(String jobName, JobParameters jobParameters) throws NoSuchJobException,
        JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException,
        JobParametersInvalidException {
    JobExecution jobExecution = null;
    if (jobLocator.getJobNames().contains(jobName)) {
        Job job = jobLocator.getJob(jobName);
        JobExecution lastJobExecution = jobRepository.getLastJobExecution(jobName, jobParameters);
        boolean restart = false;
        if (lastJobExecution != null) {
            BatchStatus status = lastJobExecution.getStatus();
            if (status.isUnsuccessful() && status != BatchStatus.ABANDONED) {
                restart = true;
            }
        }
        if (job.getJobParametersIncrementer() != null && !restart) {
            jobParameters = job.getJobParametersIncrementer().getNext(jobParameters);
        }
        jobExecution = jobLauncher.run(job, jobParameters);
        if (jobExecution.isRunning()) {
            activeExecutions.add(jobExecution);
        }
    } else {
        if (jsrJobOperator != null) {
            // jobExecution = this.jobExecutionDao
            // .getJobExecution(jsrJobOperator.start(jobName, jobParameters.toProperties()));
            jobExecution = new JobExecution(jsrJobOperator.start(jobName, jobParameters.toProperties()));
        } else {
            throw new NoSuchJobException(String.format("Unable to find job %s to launch",
                    String.valueOf(jobName)));
        }
    }
    return jobExecution;
}
I think the difficulty here comes from mixing scheduling with restartability. I would make each schedule execute a distinct job instance (for example by adding the run time as an identifying job parameter).
Now if a given schedule fails, it could be restarted separately until completion without affecting subsequent schedules. This can be done manually or programmatically in another scheduled method.
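A minimal sketch of that suggestion, assuming the same job1 and jobLauncher beans as in the question:

// Sketch: the run time is an identifying parameter, so every schedule gets its
// own job instance; a failed instance can then be restarted separately.
@Scheduled(cron = "0 0/15 * * * ?")
public void launchJob1() throws Exception {
    JobParameters jobParameters = new JobParametersBuilder()
            .addDate("runTime", new Date())
            .toJobParameters();
    this.jobLauncher.run(this.job1, jobParameters);
}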
This is the solution I came up with after all the comments:
@Component
class JobScheduler extends JobSchedulerLauncher {

    @Autowired
    private Job job1;

    @Scheduled(cron = "0 0/15 * * * ?")
    public void launchJob1() throws Exception {
        this.launch(this.job1);
    }
}

public abstract class JobSchedulerLauncher {

    @Autowired
    private JobOperator jobOperator;

    @Autowired
    private JobExplorer jobExplorer;

    public void launch(Job job) throws JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException,
            JobParametersInvalidException, NoSuchJobException, NoSuchJobExecutionException, JobExecutionNotRunningException, JobParametersNotFoundException, UnexpectedJobExecutionException {
        // Get the last instance
        final List<JobInstance> jobInstances = this.jobExplorer.findJobInstancesByJobName(job.getName(), 0, 1);
        if (CollectionUtils.isNotEmpty(jobInstances)) {
            // Get the last executions
            final List<JobExecution> jobExecutions = this.jobExplorer.getJobExecutions(jobInstances.get(0));
            if (CollectionUtils.isNotEmpty(jobExecutions)) {
                final JobExecution lastJobExecution = jobExecutions.get(0);
                if (lastJobExecution.isRunning()) {
                    this.jobOperator.stop(lastJobExecution.getId().longValue());
                    this.jobOperator.abandon(lastJobExecution.getId().longValue());
                } else if (lastJobExecution.getExitStatus().equals(ExitStatus.FAILED) || lastJobExecution.getExitStatus().equals(ExitStatus.STOPPED)) {
                    this.jobOperator.restart(lastJobExecution.getId().longValue());
                    return;
                }
            }
        }
        this.jobOperator.startNextInstance(job.getName());
    }
}
My job now uses an incrementer, based on this one https://docs.spring.io/spring-batch/docs/current/reference/html/job.html#JobParametersIncrementer:
@Bean
public Job job1() {
    return this.jobBuilderFactory.get("job1")
            .incrementer(new CustomJobParameterIncrementor())
            .start(this.step1()).on(STEP1_STATUS.NOT_READY.get()).end()
            .from(this.step1()).on(STEP1_STATUS.READY.get()).to(this.step2())
            .next(this.step3())
            .end()
            .build();
}
In my case the scheduler won't start two instances of the same job at the same time, so if I detect a running job in this code it means the server restarted and left the job with status STARTED; that's why I stop it and abandon it.
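The CustomJobParameterIncrementor itself is not shown here; a minimal sketch of what it might look like (an assumption, closely modelled on the RunIdIncrementer from the linked documentation):

// Sketch (assumed implementation): bump a numeric "run.id" parameter so that
// jobOperator.startNextInstance(...) creates a new job instance each time.
public class CustomJobParameterIncrementor implements JobParametersIncrementer {

    private static final String RUN_ID_KEY = "run.id";

    @Override
    public JobParameters getNext(JobParameters parameters) {
        JobParameters params = (parameters == null) ? new JobParameters() : parameters;
        JobParameter runId = params.getParameters().get(RUN_ID_KEY);
        long next = (runId == null) ? 1L : ((Long) runId.getValue()) + 1L;
        return new JobParametersBuilder(params)
                .addLong(RUN_ID_KEY, next)
                .toJobParameters();
    }
}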

How to commit the offsets when using KafkaItemReader in spring batch job, once all the messages are processed and written to the .dat file?

I have developed a Spring Batch job which reads from a Kafka topic using the KafkaItemReader class. I want to commit the offsets only when the messages read in a defined chunk are processed and written successfully to an output .dat file.
@Bean
public Job kafkaEventReformatjob(
        @Qualifier("MaintStep") Step MainStep,
        @Qualifier("moveFileToFolder") Step moveFileToFolder,
        @Qualifier("compressFile") Step compressFile,
        JobExecutionListener listener)
{
    return jobBuilderFactory.get("kafkaEventReformatJob")
            .listener(listener)
            .incrementer(new RunIdIncrementer())
            .flow(MainStep)
            .next(moveFileToFolder)
            .next(compressFile)
            .end()
            .build();
}
@Bean
Step MainStep(
        ItemProcessor<IncomingRecord, List<Record>> flatFileItemProcessor,
        ItemWriter<List<Record>> flatFileWriter)
{
    return stepBuilderFactory.get("mainStep")
            .<IncomingRecord, List<Record>> chunk(5000)
            .reader(kafkaItemReader())
            .processor(flatFileItemProcessor)
            .writer(writer())
            .listener(basicStepListener)
            .build();
}
//Reader reads all the messages from the Kafka topic and sends them back in the form of IncomingRecord.
@Bean
KafkaItemReader<String, IncomingRecord> kafkaItemReader() {
    Properties props = new Properties();
    props.putAll(this.properties.buildConsumerProperties());
    List<Integer> partitions = new ArrayList<>();
    partitions.add(0);
    partitions.add(1);
    return new KafkaItemReaderBuilder<String, IncomingRecord>()
            .partitions(partitions)
            .consumerProperties(props)
            .name("records")
            .saveState(true)
            .topic(topic)
            .pollTimeout(Duration.ofSeconds(40L))
            .build();
}
@Bean
public ItemWriter<List<Record>> writer() {
    ListUnpackingItemWriter<Record> listUnpackingItemWriter = new ListUnpackingItemWriter<>();
    listUnpackingItemWriter.setDelegate(flatWriter());
    return listUnpackingItemWriter;
}

public ItemWriter<Record> flatWriter() {
    FlatFileItemWriter<Record> fileWriter = new FlatFileItemWriter<>();
    String tempFileName = "abc";
    LOGGER.info("Output File name " + tempFileName + " is in working directory ");
    String workingDir = service.getWorkingDir().toAbsolutePath().toString();
    Path outputFile = Paths.get(workingDir, tempFileName);
    fileWriter.setName("fileWriter");
    fileWriter.setResource(new FileSystemResource(outputFile.toString()));
    fileWriter.setLineAggregator(lineAggregator());
    fileWriter.setForceSync(true);
    fileWriter.setFooterCallback(customFooterCallback());
    fileWriter.close();
    LOGGER.info("Successfully created the file writer");
    return fileWriter;
}

@StepScope
@Bean
public TransformProcessor processor() {
    return new TransformProcessor();
}
==============================================================================
Writer Class
@BeforeStep
public void beforeStep(StepExecution stepExecution) {
    this.stepExecution = stepExecution;
}

@AfterStep
public void afterStep(StepExecution stepExecution) {
    this.stepExecution.setWriteCount(count);
}

@Override
public void write(final List<? extends List<Record>> lists) throws Exception {
    List<Record> consolidatedList = new ArrayList<>();
    for (List<Record> list : lists) {
        if (list != null && !list.isEmpty())
            consolidatedList.addAll(list);
    }
    delegate.write(consolidatedList);
    count += consolidatedList.size(); // to count Trailer record count
}
===============================================================
Item Processor
@Override
public List<Record> process(IncomingRecord record) {
    List<Record> recordList = new ArrayList<>();
    if (null != record.getEventName() /* and a few other conditions inside this section */) {
        // setting values of the Record class by extracting from the IncomingRecord,
        // then adding the valid records which match the condition to recordList
        return recordList;
    } else {
        return null;
    }
}
Synchronizing a read operation and a write operation between two transactional resources (a queue and a database for instance)
is possible by using a JTA transaction manager that coordinates both transaction managers (2PC protocol).
However, this approach is not possible if one of the resources is not transactional (like the majority of file systems). So unless you use
a transactional file system and a JTA transaction manager that coordinates a Kafka transaction manager and a file system transaction manager,
you need another approach, like the Compensating Transaction pattern. In your case, the "undo" operation (compensating action) would be rewinding the offset to where it was before the failed chunk.
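Outside of Spring Batch, the same compensating-action idea can be sketched with a plain KafkaConsumer and manual commits (enable.auto.commit=false). The topic name and the writeChunkToFile helper below are assumptions for illustration only; this is not how KafkaItemReader works internally:

// Sketch: poll a batch, remember the starting offset per partition, write the
// batch to the file, and commit only on success; on failure, seek back (the "undo").
void consumeAndWrite(Properties consumerProps) {
    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
        consumer.subscribe(Collections.singletonList("incoming-topic")); // assumed topic name
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            if (records.isEmpty()) {
                continue;
            }
            Map<TopicPartition, Long> startOffsets = new HashMap<>();
            for (TopicPartition tp : records.partitions()) {
                startOffsets.put(tp, records.records(tp).get(0).offset());
            }
            try {
                writeChunkToFile(records); // assumed helper that appends the chunk to the .dat file
                consumer.commitSync();     // commit offsets only after a successful write
            } catch (Exception e) {
                // Compensating action: rewind each partition to where the failed chunk started.
                startOffsets.forEach(consumer::seek);
            }
        }
    }
}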

Kafka: Consumer API: Regression test fails if run in a group (sequentially)

I have implemented a Kafka application using the consumer API, and I have 2 regression tests implemented with the stream API:
To test the happy path: by producing data from the test (into the input topic that the application is listening to) that will be consumed by the application; the application will then produce data (into the output topic) that the test will consume and validate against the expected output data.
To test the error path: the behavior is the same as above, although this time the application will produce data into the output topic and the test will consume from the application's error topic and validate it against the expected error output.
My code and the regression-test code reside in the same project, under the expected directory structure. In both tests, the data should have been picked up by the same listener on the application side.
The problem is:
When I execute the tests individually (manually), each test passes. However, if I execute them together but sequentially (for example: gradle clean build), only the first test passes. The 2nd test fails after the test-side consumer polls for data; after some time it gives up without finding any data.
Observation:
From debugging, it looks like the 1st time everything works perfectly (test-side and application-side producers and consumers). However, during the 2nd test it seems that the application-side consumer is not receiving any data (the test-side producer appears to be producing data, but I cannot say that for sure), and hence no data is being produced into the error topic.
What I have tried so far:
After investigating, my understanding is that we are getting into race conditions, and to avoid that I found suggestions like:
use @DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
Tear down the broker after each test (please see the .destroy() call on the brokers)
use different topic names for each test
I applied all of them and still could not recover from my issue.
I am providing the code here for perusal. Any insight is appreciated.
Code for 1st test (Testing error path):
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
@EmbeddedKafka(
        partitions = 1,
        controlledShutdown = false,
        topics = {
                AdapterStreamProperties.Constants.INPUT_TOPIC,
                AdapterStreamProperties.Constants.ERROR_TOPIC
        },
        brokerProperties = {
                "listeners=PLAINTEXT://localhost:9092",
                "port=9092",
                "log.dir=/tmp/data/logs",
                "auto.create.topics.enable=true",
                "delete.topic.enable=true"
        }
)
public class AbstractIntegrationFailurePathTest {

    private final int retryLimit = 0;

    @Autowired
    protected EmbeddedKafkaBroker embeddedFailurePathKafkaBroker;

    //To produce data
    @Autowired
    protected KafkaTemplate<PreferredMediaMsgKey, SendEmailCmd> inputProducerTemplate;

    //To read from output error
    @Autowired
    protected Consumer<PreferredMediaMsgKey, ErrorCmd> outputErrorConsumer;

    //Service to execute notification-preference
    @Autowired
    protected AdapterStreamProperties projectProerties;

    protected void subscribe(Consumer consumer, String topic, int attempt) {
        try {
            embeddedFailurePathKafkaBroker.consumeFromAnEmbeddedTopic(consumer, topic);
        } catch (ComparisonFailure ex) {
            if (attempt < retryLimit) {
                subscribe(consumer, topic, attempt + 1);
            }
        }
    }
}
.
@TestConfiguration
public class AdapterStreamFailurePathTestConfig {

    @Autowired
    private EmbeddedKafkaBroker embeddedKafkaBroker;

    @Value("${spring.kafka.adapter.application-id}")
    private String applicationId;

    @Value("${spring.kafka.adapter.group-id}")
    private String groupId;

    //Producer of records that the program consumes
    @Bean
    public Map<String, Object> sendEmailCmdProducerConfigs() {
        Map<String, Object> results = KafkaTestUtils.producerProps(embeddedKafkaBroker);
        results.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                AdapterStreamProperties.Constants.KEY_SERDE.serializer().getClass());
        results.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                AdapterStreamProperties.Constants.INPUT_VALUE_SERDE.serializer().getClass());
        return results;
    }

    @Bean
    public ProducerFactory<PreferredMediaMsgKey, SendEmailCmd> inputProducerFactory() {
        return new DefaultKafkaProducerFactory<>(sendEmailCmdProducerConfigs());
    }

    @Bean
    public KafkaTemplate<PreferredMediaMsgKey, SendEmailCmd> inputProducerTemplate() {
        return new KafkaTemplate<>(inputProducerFactory());
    }

    //Consumer of the error output, generated by the program
    @Bean
    public Map<String, Object> outputErrorConsumerConfig() {
        Map<String, Object> props = KafkaTestUtils.consumerProps(
                applicationId, Boolean.TRUE.toString(), embeddedKafkaBroker);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                AdapterStreamProperties.Constants.KEY_SERDE.deserializer().getClass()
                        .getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                AdapterStreamProperties.Constants.ERROR_VALUE_SERDE.deserializer().getClass()
                        .getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        return props;
    }

    @Bean
    public Consumer<PreferredMediaMsgKey, ErrorCmd> outputErrorConsumer() {
        DefaultKafkaConsumerFactory<PreferredMediaMsgKey, ErrorCmd> rpf =
                new DefaultKafkaConsumerFactory<>(outputErrorConsumerConfig());
        return rpf.createConsumer(groupId, "notification-failure");
    }
}
.
@RunWith(SpringRunner.class)
@SpringBootTest(classes = AdapterStreamFailurePathTestConfig.class)
@ActiveProfiles(profiles = "errtest")
public class ErrorPath400Test extends AbstractIntegrationFailurePathTest {

    @Autowired
    private DataGenaratorForErrorPath400Test datagen;

    @Mock
    private AdapterHttpClient httpClient;

    @Autowired
    private ErroredEmailCmdDeserializer erroredEmailCmdDeserializer;

    @Before
    public void setup() throws InterruptedException {
        Mockito.when(httpClient.callApi(Mockito.any()))
                .thenReturn(
                        new GenericResponse(
                                400,
                                TestConstants.ERROR_MSG_TO_CHK));
        Mockito.when(httpClient.createURI(Mockito.any(), Mockito.any(), Mockito.any())).thenCallRealMethod();
        inputProducerTemplate.send(
                projectProerties.getInputTopic(),
                datagen.getKey(),
                datagen.getEmailCmdToProduce());
        System.out.println("producer: " + projectProerties.getInputTopic());
        subscribe(outputErrorConsumer, projectProerties.getErrorTopic(), 0);
    }

    @Test
    public void testWithError() throws InterruptedException, InvalidProtocolBufferException, TextFormat.ParseException {
        ConsumerRecords<PreferredMediaMsgKeyBuf.PreferredMediaMsgKey, ErrorCommandBuf.ErrorCmd> records;
        List<ConsumerRecord<PreferredMediaMsgKeyBuf.PreferredMediaMsgKey, ErrorCommandBuf.ErrorCmd>> outputListOfErrors = new ArrayList<>();
        int attempt = 0;
        int expectedRecords = 1;
        do {
            records = KafkaTestUtils.getRecords(outputErrorConsumer);
            records.forEach(outputListOfErrors::add);
            attempt++;
        } while (attempt < expectedRecords && outputListOfErrors.size() < expectedRecords);
        //Verify the recipient event stream size
        Assert.assertEquals(expectedRecords, outputListOfErrors.size());
        //Validate output
    }

    @After
    public void tearDown() {
        outputErrorConsumer.close();
        embeddedFailurePathKafkaBroker.destroy();
    }
}
The 2nd test is almost the same in structure, although this time the test-side consumer is consuming from the application-side output topic (instead of the error topic). And I named the consumers, broker, producer and topics differently, like:
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
@EmbeddedKafka(
        partitions = 1,
        controlledShutdown = false,
        topics = {
                AdapterStreamProperties.Constants.INPUT_TOPIC,
                AdapterStreamProperties.Constants.OUTPUT_TOPIC
        },
        brokerProperties = {
                "listeners=PLAINTEXT://localhost:9092",
                "port=9092",
                "log.dir=/tmp/data/logs",
                "auto.create.topics.enable=true",
                "delete.topic.enable=true"
        }
)
public class AbstractIntegrationSuccessPathTest {

    private final int retryLimit = 0;

    @Autowired
    protected EmbeddedKafkaBroker embeddedKafkaBroker;

    //To produce data
    @Autowired
    protected KafkaTemplate<PreferredMediaMsgKey, SendEmailCmd> sendEmailCmdProducerTemplate;

    //To read from output regular topic
    @Autowired
    protected Consumer<PreferredMediaMsgKey, NotifiedEmailCmd> ouputConsumer;

    //Service to execute notification-preference
    @Autowired
    protected AdapterStreamProperties projectProerties;

    protected void subscribe(Consumer consumer, String topic, int attempt) {
        try {
            embeddedKafkaBroker.consumeFromAnEmbeddedTopic(consumer, topic);
        } catch (ComparisonFailure ex) {
            if (attempt < retryLimit) {
                subscribe(consumer, topic, attempt + 1);
            }
        }
    }
}
Please let me know if I should provide any more information.
"port=9092"
Don't use a fixed port; leave that out and the embedded broker will use a random port; the consumer configs are set up in KafkaTestUtils to point to the random port.
You shouldn't need to dirty the context after each test method - use a different group.id for each test and a different topic.
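For illustration, the abstract test class annotation without the fixed port could look roughly like this (a sketch; the bootstrap-servers property line assumes the application reads it from Spring Boot configuration, and @DirtiesContext is dropped per the advice above):

// Sketch: no "listeners"/"port" broker properties, so the embedded broker picks a random port;
// KafkaTestUtils.producerProps(...)/consumerProps(...) already point at that random port.
@EmbeddedKafka(
        partitions = 1,
        controlledShutdown = false,
        topics = {
                AdapterStreamProperties.Constants.INPUT_TOPIC,
                AdapterStreamProperties.Constants.ERROR_TOPIC
        },
        brokerProperties = {
                "log.dir=/tmp/data/logs",
                "auto.create.topics.enable=true",
                "delete.topic.enable=true"
        }
)
@TestPropertySource(properties = "spring.kafka.bootstrap-servers=${spring.embedded.kafka.brokers}")
public class AbstractIntegrationFailurePathTest {
    // ... same fields and subscribe(...) helper as shown above
}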
In my case the consumer was not closed properly. I had to do:
@After
public void tearDown() {
    // shutdown hook to correctly close the streams application
    Runtime.getRuntime().addShutdownHook(new Thread(ouputConsumer::close));
}
to resolve it.

Spring Batch 4.0 ~ Code under JdbcCursorItemReader method doesn't run with @StepScope defined

I have a SQL query defined in my batch job that needs to get input at runtime from the user.
I have the following item reader in my batch job, defined as follows:
@StepScope
@Bean
public JdbcCursorItemReader<QueryCount> queryCountItemReader() throws Exception {
    ListPreparedStatementSetter preparedStatementSetter = new ListPreparedStatementSetter() {
        @Override
        public void setValues(PreparedStatement pstmt) throws SQLException {
            pstmt.setString(1, "#{jobparameters[fromDate]}");
            pstmt.setString(2, "#{jobparameters[toDate]}");
            pstmt.setString(3, "#{jobparameters[fromDate]}");
            pstmt.setString(4, "#{jobparameters[toDate]}");
            pstmt.setString(5, "#{jobparameters[fromDate]}");
            pstmt.setString(6, "#{jobparameters[toDate]}");
            pstmt.setString(7, "#{jobparameters[eventType]}");
            pstmt.setString(8, "#{jobparameters[businessUnit]}");
            pstmt.setString(9, "#{jobparameters[deviceCategory]}");
            pstmt.setString(10, "#{jobparameters[numberOfSearchIds]}");
        }
    };
    JdbcCursorItemReader<QueryCount> queryCountJdbcCursorItemReader = new JdbcCursorItemReader<>();
    queryCountJdbcCursorItemReader.setDataSource(dataSource);
    queryCountJdbcCursorItemReader.setSql(sqlQuery);
    queryCountJdbcCursorItemReader.setRowMapper(new QueryCountMapper());
    queryCountJdbcCursorItemReader.setPreparedStatementSetter(preparedStatementSetter);
    int counter = 0;
    ExecutionContext executionContext = new ExecutionContext();
    queryCountJdbcCursorItemReader.open(executionContext);
    try {
        QueryCount queryCount;
        while ((queryCount = queryCountJdbcCursorItemReader.read()) != null) {
            System.out.println(queryCount.toString());
            counter++;
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        queryCountJdbcCursorItemReader.close();
    }
    return queryCountJdbcCursorItemReader;
}
I am sending in the job parameters from my application class as follows:
JobParameters jobParameters = new JobParametersBuilder()
        .addString("fromDate", "20180410")
        .addString("toDate", "20180410")
        .addString("eventType", "WEB")
        .addString("businessUnit", "UPT")
        .addString("numberOfSearchIds", "10")
        .toJobParameters();
JobExecution execution = jobLauncher.run(job, jobParameters);
The issue is, when I run my batch job, the code inside the queryCountItemReader() method is never executed and the job completes with no errors. Essentially the SQL query I am trying to run never executes. If I remove the @StepScope annotation the code will then run, but it fails with an error since it is unable to bind the parameters sent in from the application class to the SQL query. I realize that @StepScope is necessary to use job parameters, but why doesn't the code in my method execute?
Solved this by adding the @EnableBatchProcessing and @EnableAutoConfiguration annotations and changing the item reader method definition as follows:
@StepScope
@Bean
public JdbcCursorItemReader<QueryCount> queryCountItemReader(@Value("#{jobParameters['fromDate']}") String fromDate,
                                                             @Value("#{jobParameters['toDate']}") String toDate,
                                                             @Value("#{jobParameters['eventType']}") String eventType,
                                                             @Value("#{jobParameters['businessUnit']}") String businessUnit,
                                                             @Value("#{jobParameters['deviceCategory']}") String deviceCategory,
                                                             @Value("#{jobParameters['numberOfSearchIds']}") String numberOfSearchIds) throws Exception {

How to exclude job parameter from uniqueness in Spring Batch?

I am trying to launch a job in Spring Batch 2, and I need to pass some information in the job parameters, but I do not want it to count for the uniqueness of the job instance. For example, I'd want these two sets of parameters to be considered unique:
file=/my/file/path,session=1234
file=/my/file/path,session=5678
The idea is that there will be two different servers trying to start the same job, but with different sessions attached to them. I need that session number in both cases. Any ideas?
Thanks!
So, if 'file' is the only attribute that's supposed to be unique and 'session' is used by downstream code, then your problem matches almost exactly what I had. I had a JMSCorrelationId that I needed to store in the execution context for later use, and I didn't want it to play into the job parameters' uniqueness. Per Dave Syer, this really wasn't possible, so I took the route of creating the job with the parameters (not the 'session' in your case), and then adding the 'session' attribute to the execution context before anything actually runs.
This gave me access to 'session' downstream, but it was not in the job parameters, so it didn't affect uniqueness.
References
https://jira.springsource.org/browse/BATCH-1412
http://forum.springsource.org/showthread.php?104440-Non-Identity-Job-Parameters&highlight=
You'll see from this forum that there's no good way to do it (per Dave Syer), but I wrote my own launcher based on the SimpleJobLauncher (in fact I delegate to the SimpleJobLauncher if a non-overloaded method is called) that has an overloaded method for starting a job, which takes a callback interface that allows contributing parameters to the execution context while not being 'true' job parameters. You could do something very similar.
I think the applicable LOC for you is right here:
jobExecution = jobRepository.createJobExecution(job.getName(),
        jobParameters);
if (contributor != null) {
    if (contributor.contributeTo(jobExecution.getExecutionContext())) {
        jobRepository.updateExecutionContext(jobExecution);
    }
}
which is where, after the job execution is created, the execution context gets contributed to. Hopefully this helps you in your implementation.
public class ControlMJobLauncher implements JobLauncher, InitializingBean {

    private JobRepository jobRepository;
    private TaskExecutor taskExecutor;
    private SimpleJobLauncher simpleLauncher;
    private JobFilter jobFilter;

    // Logger used below (its declaration was not shown in the original post)
    private static final Log logger = LogFactory.getLog(ControlMJobLauncher.class);

    public void setJobRepository(JobRepository jobRepository) {
        this.jobRepository = jobRepository;
    }

    public void setTaskExecutor(TaskExecutor taskExecutor) {
        this.taskExecutor = taskExecutor;
    }

    /**
     * Optional filter to prevent job launching based on some specific criteria.
     * Jobs that are filtered out will return success to ControlM, but will not run
     */
    public void setJobFilter(JobFilter jobFilter) {
        this.jobFilter = jobFilter;
    }

    public JobExecution run(final Job job, final JobParameters jobParameters, ExecutionContextContributor contributor)
            throws JobExecutionAlreadyRunningException, JobRestartException,
            JobInstanceAlreadyCompleteException, JobParametersInvalidException, JobFilteredException {

        Assert.notNull(job, "The Job must not be null.");
        Assert.notNull(jobParameters, "The JobParameters must not be null.");

        //See if job is filtered
        if (this.jobFilter != null && !jobFilter.launchJob(job, jobParameters)) {
            throw new JobFilteredException(String.format("Job has been filtered by the filter: %s", jobFilter.getFilterName()));
        }

        final JobExecution jobExecution;

        JobExecution lastExecution = jobRepository.getLastJobExecution(job.getName(), jobParameters);
        if (lastExecution != null) {
            if (!job.isRestartable()) {
                throw new JobRestartException("JobInstance already exists and is not restartable");
            }
            logger.info(String.format("Restarting job %s instance %d", job.getName(), lastExecution.getId()));
        }

        // Check the validity of the parameters before creating anything
        // in the repository...
        job.getJobParametersValidator().validate(jobParameters);

        /*
         * There is a very small probability that a non-restartable job can be
         * restarted, but only if another process or thread manages to launch
         * <i>and</i> fail a job execution for this instance between the last
         * assertion and the next method returning successfully.
         */
        jobExecution = jobRepository.createJobExecution(job.getName(),
                jobParameters);

        if (contributor != null) {
            if (contributor.contributeTo(jobExecution.getExecutionContext())) {
                jobRepository.updateExecutionContext(jobExecution);
            }
        }

        try {
            taskExecutor.execute(new Runnable() {

                public void run() {
                    try {
                        logger.info("Job: [" + job
                                + "] launched with the following parameters: ["
                                + jobParameters + "]");
                        job.execute(jobExecution);
                        logger.info("Job: ["
                                + job
                                + "] completed with the following parameters: ["
                                + jobParameters
                                + "] and the following status: ["
                                + jobExecution.getStatus() + "]");
                    } catch (Throwable t) {
                        logger.warn(
                                "Job: ["
                                        + job
                                        + "] failed unexpectedly and fatally with the following parameters: ["
                                        + jobParameters + "]", t);
                        rethrow(t);
                    }
                }

                private void rethrow(Throwable t) {
                    if (t instanceof RuntimeException) {
                        throw (RuntimeException) t;
                    } else if (t instanceof Error) {
                        throw (Error) t;
                    }
                    throw new IllegalStateException(t);
                }
            });
        } catch (TaskRejectedException e) {
            jobExecution.upgradeStatus(BatchStatus.FAILED);
            if (jobExecution.getExitStatus().equals(ExitStatus.UNKNOWN)) {
                jobExecution.setExitStatus(ExitStatus.FAILED
                        .addExitDescription(e));
            }
            jobRepository.update(jobExecution);
        }

        return jobExecution;
    }

    static interface ExecutionContextContributor {
        boolean CONTRIBUTED_SOMETHING = true;
        boolean CONTRIBUTED_NOTHING = false;

        /**
         * @param executionContext
         * @return true if the execution context was contributed to
         */
        public boolean contributeTo(ExecutionContext executionContext);
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.state(jobRepository != null, "A JobRepository has not been set.");
        if (taskExecutor == null) {
            logger.info("No TaskExecutor has been set, defaulting to synchronous executor.");
            taskExecutor = new SyncTaskExecutor();
        }
        this.simpleLauncher = new SimpleJobLauncher();
        this.simpleLauncher.setJobRepository(jobRepository);
        this.simpleLauncher.setTaskExecutor(taskExecutor);
        this.simpleLauncher.afterPropertiesSet();
    }

    @Override
    public JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
            JobInstanceAlreadyCompleteException, JobParametersInvalidException {
        return simpleLauncher.run(job, jobParameters);
    }
}
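For completeness, a usage sketch (names and values assumed): launch with only the identifying parameter and contribute the 'session' value through the callback, so it lands in the execution context rather than in the job parameters:

// Sketch (assumed usage of the launcher above): "file" identifies the instance,
// "session" is contributed to the execution context and so never affects uniqueness.
JobParameters identifyingParams = new JobParametersBuilder()
        .addString("file", "/my/file/path")
        .toJobParameters();

JobExecution execution = controlMJobLauncher.run(job, identifyingParams, executionContext -> {
    executionContext.putString("session", "1234");
    return ExecutionContextContributor.CONTRIBUTED_SOMETHING;
});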
Starting from Spring Batch 2.2.x, there is support for non-identifying parameters. If you are using CommandLineJobRunner, you can specify non-identifying parameters with a '-' prefix.
For example:
java org.springframework.batch.core.launch.support.CommandLineJobRunner file=/my/file/path -session=5678
If you are using an older version of Spring Batch, you need to migrate your database schema. See the 'Migrating to 2.x.x' section at http://docs.spring.io/spring-batch/getting-started.html.
This is the Jira page of the feature https://jira.springsource.org/browse/BATCH-1412, and here is the change that implements it https://fisheye.springsource.org/changelog/spring-batch?cs=557515df45c0f596588418d53c3f2bae3781c1c3
In more recent versions of Spring Batch (I am using spring-batch-core:4.3.3), you can use the JobParametersBuilder to specify whether a parameter is identifying or not. For example:
new JobParametersBuilder()
        .addString("identifying-param-name", paramValue1)
        .addString("non-identifying-param-name", paramValue2, false)
        .toJobParameters();
The 'false' in the third argument makes the parameter non-identifying.