Skip step based on job parameter - spring-batch

I've read through the Spring Batch docs a few times and searched for a way to skip a job step based on job parameters.
For example, say I have this job:
<batch:job id="job" restartable="true"
xmlns="http://www.springframework.org/schema/batch">
<batch:step id="step1-partitioned-export-master">
<batch:partition handler="partitionHandler"
partitioner="partitioner" />
<batch:next on="COMPLETED" to="step2-join" />
</batch:step>
<batch:step id="step2-join">
<batch:tasklet>
<batch:chunk reader="xmlMultiResourceReader" writer="joinXmlItemWriter"
commit-interval="1000">
</batch:chunk>
</batch:tasklet>
<batch:next on="COMPLETED" to="step3-zipFile" />
</batch:step>
<batch:step id="step3-zipFile">
<batch:tasklet ref="zipFileTasklet" />
<!-- <batch:next on="COMPLETED" to="step4-fileCleanUp" /> -->
</batch:step>
<!-- <batch:step id="step4-fileCleanUp">
<batch:tasklet ref="fileCleanUpTasklet" />
</batch:step> -->
</batch:job>
I want to be able to skip step 4 if desired, by specifying it in the job parameters.
The only somewhat related question I could find was "how to select which spring batch job to run based on application argument - spring boot java config", which seems to indicate that two distinct job contexts should be created and the decision made outside the batch step definition.
I have already followed this pattern, since I had a CSV export as well as an XML one, as in the example. I split the two jobs into separate spring-context.xml files, one for each export type, even though there were not many differences.
At that point I thought it was perhaps cleaner, since I could find no examples of alternatives.
But having to create four separate context files just to make it possible to include step 4 or not for each export case seems a bit crazy.
I must be missing something here.

Can't you do that with a decider? http://docs.spring.io/spring-batch/reference/html/configureStep.html (chapter 5.3.4 Programmatic Flow Decisions)
EDIT: link to the updated URL:
https://docs.spring.io/spring-batch/trunk/reference/html/configureStep.html#programmaticFlowDecisions
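For illustration, a minimal sketch of such a decider (the parameter name skipFileCleanup and the SKIP/CLEANUP statuses are illustrative assumptions, not from the post):

// Minimal sketch, assuming a job parameter named "skipFileCleanup" (illustrative name):
// the decider inspects the job parameters and returns a status the job flow can branch on.
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;

public class FileCleanUpDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        // Job parameters are available from the JobExecution.
        String skip = jobExecution.getJobParameters().getString("skipFileCleanup");
        return Boolean.parseBoolean(skip)
                ? new FlowExecutionStatus("SKIP")
                : new FlowExecutionStatus("CLEANUP");
    }
}

The decider bean would then be wired into the flow after step3-zipFile with a <batch:decision decider="..."> element, roughly with a <batch:next on="CLEANUP" to="step4-fileCleanUp"/> transition and a <batch:end on="SKIP"/> transition, so a single job definition covers both cases.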

Related

Spring: How to restart transaction in a for-loop?

I have a Spring Batch app that, during the write step, loops through records to be inserted into a Postgres database. Every now and again we get a DuplicateKeyException in the loop, but don't want the whole job to fail. We log that record and want to continue inserting the following records.
But upon getting an exception, the transaction becomes "bad" and Postgres won't accept any more commands, as described in this excellent post. So my question is: what's the best way to restart the transaction? Again, I'm not retrying the record that failed - I just want to continue in my loop with the next record.
This is part of my job config xml:
<batch:job id="portHoldingsJob">
<batch:step id="holdingsStep">
<tasklet throttle-limit="10">
<chunk reader="PortHoldingsReader" processor="PortHoldingsProcessor" writer="PortHoldingsWriter" commit-interval="1" />
</tasklet>
</batch:step>
<batch:listeners>
<batch:listener ref="JobExecutionListener"/>
</batch:listeners>
</batch:job>
Thanks for any input!
Not sure if you are using the Spring transaction annotations to manage the transactions or not... if so, perhaps you can try:
@Transactional(noRollbackFor = DuplicateKeyException.class)
Hope that helps.
No-rollback exceptions in Spring Batch are apparently designated like this:
<batch:tasklet>
    <batch:chunk ... />
    <batch:no-rollback-exception-classes>
        <batch:include class="MyRuntimeException"/>
    </batch:no-rollback-exception-classes>
</batch:tasklet>
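In the XML above, the equivalent for this question would be to list org.springframework.dao.DuplicateKeyException in no-rollback-exception-classes. As a hedged sketch only, and assuming the Java-config step builders are an option, the same behaviour looks roughly like this (reader/processor/writer beans are the ones referenced in the poster's config and are assumed to exist):

// Hedged sketch, not from the thread: a fault-tolerant step that marks
// DuplicateKeyException as a no-rollback exception, so the chunk transaction
// stays usable when one insert hits a duplicate key.
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.dao.DuplicateKeyException;

@Configuration
public class HoldingsStepConfig {

    @Bean
    public Step holdingsStep(StepBuilderFactory steps,
                             ItemReader<Object> portHoldingsReader,
                             ItemProcessor<Object, Object> portHoldingsProcessor,
                             ItemWriter<Object> portHoldingsWriter) {
        return steps.get("holdingsStep")
                .<Object, Object>chunk(1)                       // commit-interval="1"
                .reader(portHoldingsReader)
                .processor(portHoldingsProcessor)
                .writer(portHoldingsWriter)
                .faultTolerant()
                .noRollback(DuplicateKeyException.class)        // don't roll back the chunk for duplicates
                .build();
    }
}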

Spring Batch SkipPolicy not used

I use the setup below in a project for a job definition.
In this project the batch jobs are defined in a database; the XML job definition below serves as a template for creating all these batch jobs at runtime.
This works fine, except in the case of a BeanCreationException in the dataProcessor. When this exception occurs, the skip policy is never called and the batch ends immediately instead.
What could be the reason for that? What do I have to do so that every exception in the dataProcessor goes through the SkipPolicy?
Thanks a lot in advance
Christian
Version: spring-batch 3.0.7
<batch:job id="MassenGevoJob" restartable="true">
<batch:step id="selectDataStep" parent="selectForMassenGeVoStep" next="executeProcessorStep" />
<batch:step id="executeProcessorStep"
allow-start-if-complete="true" next="decideExitStatus" >
<batch:tasklet>
<batch:chunk reader="dataReader" processor="dataProcessor"
writer="dataItemWriter" commit-interval="10"
skip-policy="batchSkipPolicy">
</batch:chunk>
<batch:listeners>
<batch:listener ref="batchItemListener" />
<batch:listener ref="batchSkipListener" />
<batch:listener ref="batchChunkListener" />
</batch:listeners>
</batch:tasklet>
</batch:step>
<batch:decision decider="failOnPendingObjectsDecider"
id="decideExitStatus">
<batch:fail on="FAILED_PENDING_OBJECTS" exit-code="FAILED_PENDING_OBJECTS" />
<batch:next on="*" to="endFlowStep" />
</batch:decision>
<batch:step id="endFlowStep">
<batch:tasklet ref="noopTasklet"></batch:tasklet>
</batch:step>
<batch:validator ref="batchParameterValidator" />
<batch:listeners>
<batch:listener ref="batchJobListener" />
</batch:listeners>
</batch:job>
A BeanCreationException isn't really skippable because it usually happens before Spring Batch starts. It's also typically a fatal error for your application (Spring couldn't create a component you've defined as being critical to your application). If the creation of that bean is subject to issues and not having it is OK, I'd suggest wrapping its creation in a factory so that you can control any exceptions that come out of the creation of that bean. For example, if you can't create your custom ItemProcessor, your FactoryBean could return a PassThroughItemProcessor if that's OK.
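To make that suggestion concrete, here is a hedged sketch (class and parameter names are illustrative, not from the answer) of a FactoryBean that tries to build the real processor and falls back to PassThroughItemProcessor when construction fails:

// Hedged sketch: a FactoryBean that attempts to create the real ItemProcessor and
// degrades to a PassThroughItemProcessor when construction fails, so the context
// still starts. The Supplier is just an illustrative way to plug in the fragile
// construction logic that currently throws the BeanCreationException.
import java.util.function.Supplier;

import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.support.PassThroughItemProcessor;
import org.springframework.beans.factory.FactoryBean;

public class FallbackItemProcessorFactoryBean<T> implements FactoryBean<ItemProcessor<T, T>> {

    private final Supplier<ItemProcessor<T, T>> delegateFactory;

    public FallbackItemProcessorFactoryBean(Supplier<ItemProcessor<T, T>> delegateFactory) {
        this.delegateFactory = delegateFactory;
    }

    @Override
    public ItemProcessor<T, T> getObject() {
        try {
            // Try to build the "real" processor.
            return delegateFactory.get();
        } catch (RuntimeException e) {
            // Couldn't build it: fall back to a pass-through processor instead of failing startup.
            return new PassThroughItemProcessor<T>();
        }
    }

    @Override
    public Class<?> getObjectType() {
        return ItemProcessor.class;
    }

    @Override
    public boolean isSingleton() {
        return true;
    }
}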

Can you configure an ItemReader for a Partitioner in Spring Batch?

I have the following requirement: a CSV input file contains lines where one of the fields is an ID. There can be several lines with the same ID. The lines should be processed grouped by ID (meaning, if one line fails validation, then all lines with that same ID should fail to process). The groups of lines can be processed in parallel.
I have an implementation that works fine, but it reads the CSV input file using my own code in a Partitioner implementation. It would be nicer if I could use an out-of-the-box implementation for that (e.g. FlatFileItemReader) and just configure it like you would for a chunk step.
To clarify, my job config is like this:
<batch:job id="job">
<batch:step id="partitionStep">
<batch:partition step="chunkStep" partitioner="partitioner">
<batch:handler grid-size="10" task-executor="taskExecutor" />
</batch:partition>
</batch:step>
</batch:job>
<batch:step id="chunkStep">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="reader" processor="processor" writer="writer" chunk-completion-policy="completionPolicy">
.. skip and retry policies omitted for brevity
</batch:chunk>
</batch:tasklet>
</batch:step>
<bean id="partitioner" class="com.acme.InputFilePartitioner" scope="step">
<property name="inputFileName" value="src/main/resources/input/example.csv" />
</bean>
<bean id="reader" class="org.springframework.batch.item.support.ListItemReader" scope="step">
<constructor-arg value="#{stepExecutionContext['key']}"/>
</bean>
where the Partitioner implementation reads the input file, "manually" parses the lines to get the ID field, groups them by that ID, puts them in Lists, and creates ExecutionContexts that each get one of those Lists.
It would be great if I could replace that "manual" code in the Partitioner with a configuration of FlatFileItemReader and an ObjectMapper. (I hope I express myself clearly.)
Is it possible?
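For reference, a hedged sketch (not from the post) of what such a Partitioner can look like when the line parsing is delegated to an injected FlatFileItemReader; the ID being the first column and the "key" context entry mirror the configuration above, everything else is an assumption:

// Hedged sketch: a Partitioner that delegates parsing to a FlatFileItemReader<FieldSet>,
// groups the parsed lines by an ID column, and exposes each group under the "key"
// entry consumed by the ListItemReader in the configuration above.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.transform.FieldSet;

public class GroupingPartitioner implements Partitioner {

    private final FlatFileItemReader<FieldSet> reader;

    public GroupingPartitioner(FlatFileItemReader<FieldSet> reader) {
        this.reader = reader;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        // Read the whole file up front and group the lines by the ID column.
        Map<String, List<FieldSet>> groups = new LinkedHashMap<>();
        reader.open(new ExecutionContext());
        try {
            FieldSet line;
            while ((line = reader.read()) != null) {
                String id = line.readString(0);                  // assumption: ID is the first column
                groups.computeIfAbsent(id, k -> new ArrayList<>()).add(line);
            }
        } catch (Exception e) {
            throw new IllegalStateException("Failed to read input file for partitioning", e);
        } finally {
            reader.close();
        }

        // One partition (and hence one step execution) per ID group.
        Map<String, ExecutionContext> partitions = new HashMap<>();
        int i = 0;
        for (Map.Entry<String, List<FieldSet>> group : groups.entrySet()) {
            ExecutionContext context = new ExecutionContext();
            context.put("key", group.getValue());                // consumed by the ListItemReader above
            partitions.put("partition-" + i++, context);
        }
        return partitions;
    }
}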

Spring Batch: Child step depends on multiple parent steps

In Spring Batch we can create one step which is dependent on another, like:
<batch:job id="firstJob">
<batch:step id="firstStep" next="secondStep">
<batch:tasklet ref="firstTasklet"/>
</batch:step>
<batch:step id="secondStep">
<batch:tasklet ref="secondTasklet"/>
</batch:step>
</batch:job>
In my case we have a dependency as shown below: task C (child) needs to be executed only when A (parent) and B (parent) have both completed.
Is there any way in Spring Batch where we can say something like:
<batch:job id="firstJob">
<batch:step id="A,B" next="C">
<batch:tasklet ref="firstTasklet"/>
</batch:step>
...
</batch:job>
What I thought of is using a listener on A and B, and keeping track of both listeners in the database. When both listeners have executed, task C can be invoked.
Please help.
Note: I am using Spring Batch version 2.1.9.RELEASE; if the above requirement is available in higher releases, I can update the version as well.
You can use "next" tag as many times as you want to define a chain so:
<batch:step id="A" next="B">
<batch:tasklet ref="firstTasklet"/>
</batch:step>
<batch:step id="B" next="C">
<batch:tasklet ref="secondTasklet"/>
</batch:step>
<batch:step id="C">
<batch:tasklet ref="thirdTasklet"/>
</batch:step>
The chain is A -> B -> C, so step C will be executed after step B.
Probably not useful anymore but:
<job id="job1">
<split id="split1" task-executor="taskExecutor" next="stepC">
<flow>
<step id="stepA" parent="stepA" />
</flow>
<flow>
<step id="stepB" parent="stepB"/>
</flow>
</split>
<step id="stepC" parent="stepC"/>
</job>
So C will execute once A and B have executed.
http://docs.spring.io/spring-batch/trunk/reference/html/scalability.html#scalabilityParallelSteps

two jobs in one single spring batch

This is my problem:
I want two jobs configured in the same Spring Batch configuration. There are two totally different tasks (jobs: read-process-write) that I want to perform, based on the argument that I pass from the command line.
a) Is it possible to have something like this in the same batch config file?
<batch:job id="job1">
<batch:tasklet>
<batch:chunk reader="reader1" writer="writer1"
processor="processor1" commit-interval="1">
</batch:chunk>
</batch:tasklet>
</batch:job>
<batch:job id="job2">
<batch:tasklet>
<batch:chunk reader="reader2" writer="writer2"
processor="processor2" commit-interval="1">
</batch:chunk>
</batch:tasklet>
</batch:job>
b) If yes, how? Because when I try that, this is what I get:
Exception in thread "main" org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException: Line 48 in XML document from class path resource [hefo-job.xml] is invalid; nested exception is org.xml.sax.SAXParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element 'batch:tasklet'.
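For what it's worth, this particular schema violation usually means the <batch:tasklet> is not nested inside a <batch:step> element; two <batch:job> definitions can otherwise live in the same file. As a hedged sketch only, and assuming the reader/processor/writer beans named above exist, the same pair of independent jobs can also be declared side by side with the Java builders:

// Hedged sketch, not from the thread: two independent jobs in one configuration class,
// each with a single chunk-oriented step (bean names borrowed from the poster's XML).
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class TwoJobsConfig {

    @Bean
    public Job job1(JobBuilderFactory jobs, StepBuilderFactory steps,
                    ItemReader<Object> reader1,
                    ItemProcessor<Object, Object> processor1,
                    ItemWriter<Object> writer1) {
        Step step1 = steps.get("job1Step")
                .<Object, Object>chunk(1)
                .reader(reader1).processor(processor1).writer(writer1)
                .build();
        return jobs.get("job1").start(step1).build();
    }

    @Bean
    public Job job2(JobBuilderFactory jobs, StepBuilderFactory steps,
                    ItemReader<Object> reader2,
                    ItemProcessor<Object, Object> processor2,
                    ItemWriter<Object> writer2) {
        Step step2 = steps.get("job2Step")
                .<Object, Object>chunk(1)
                .reader(reader2).processor(processor2).writer(writer2)
                .build();
        return jobs.get("job2").start(step2).build();
    }
}

Which of the two jobs to launch can then be decided by the command-line argument passed to the launcher.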