Accessing Spring Batch Job/Step details via the generic query provider - spring-batch

I'm setting up a batch job that moves data between three databases. I plan to use the out-of-the-box Spring Batch classes to handle the query against the first database, but I want to include details of the current job/step in the extract. The example Spring config might look like this:
<bean id="jdbcPagingItemReader" class="org.springframework.batch.item.database.JdbcPagingItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="pageSize" value="1000"/>
<property name="fetchSize" value="100"/>
<property name="queryProvider">
<bean class="org.springframework.batch.item.database.support.HsqlPagingQueryProvider">
<property name="selectClause" value="select id, bar"/>
<property name="fromClause" value="foo"/>
<property name="sortKeys">
Is there a way via Groovy or SpEL to access the current JobExecution? I found this thread on access-spring-batch-job-definition, but it assumes custom code.

Your configuration is cut off at the sortKeys entry so I'm not 100% sure what you are attempting to accomplish. That being said, using step scope, you can inject the StepExecution which has a reference to the JobExecution. Getting the JobExecution would look something like this:
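<bean id="jdbcPagingItemReader" class="org.springframework.batch.item.database.JdbcPagingItemReader" scope="step">
…
<!-- "jobExecution" is an illustrative property name; bind the expression to whichever property needs it -->
<property name="jobExecution" value="#{stepExecution.jobExecution}"/>
</bean>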

The write count lives on the StepExecution (StepExecution.getWriteCount()), so you first need to promote it to the job's ExecutionContext.
I suggest using a step listener with the @AfterStep annotation for this:
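import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepListener;
import org.springframework.batch.core.annotation.AfterStep;
import org.springframework.stereotype.Component;

@Component
public class PromoteWriteCountToJobContextListener implements StepListener {
    // Copies the step's write count into the job-level ExecutionContext under "<stepName>.writeCount"
    @AfterStep
    public ExitStatus afterStep(StepExecution stepExecution) {
        int writeCount = stepExecution.getWriteCount();
        stepExecution.getJobExecution().getExecutionContext()
            .put(stepExecution.getStepName() + ".writeCount", writeCount);
        return stepExecution.getExitStatus();
    }
}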
Add this listener to every step that performs the inserts:
<batch:step id="readFromDB1WrietToDB2" next="readFromDB2WrietToDB3">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="reader" writer="writter" />
</batch:tasklet>
<batch:listeners>
<batch:listener ref="promoteWriteCountToJobContextListener"/>
</batch:listeners>
</batch:step>
<batch:step id="readFromDB2WrietToDB3" next="summerizeWrites">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="reader" writer="writter" />
</batch:tasklet>
<batch:listeners>
<batch:listener ref="promoteWriteCountToJobContextListener"/>
</batch:listeners>
</batch:step>
You can use log4j to write the results to the log from within the step listener, or you can use the saved values in a later step. To read them there, use a SpEL expression; for the expression to be resolved, the writer/tasklet bean must be declared with scope="step":
<batch:step id="summerizeWrites">
<batch:tasklet id="summerizeCopyWritesTaskelt">
<bean class="Tasklet" scope="step">
<property name="writeCountsList">
<list>
<value>#{jobExecutionContext['readFromDB1WrietToDB2.writeCount']}</value>
<value>#{jobExecutionContext['readFromDB2WrietToDB3.writeCount']}</value>
</list>
</property>
</bean>
</batch:tasklet>
</batch:step>

Related

RECORDS OMITTED FROM QUERY USING JPAPAGINGITEMREADER while reading

<batch:job id="xyzJob" job-repository="jobRepository"
incrementer="jobParametersIncrementerImpl" restartable="false">
<batch:step id="feeStep">
<batch:tasklet transaction-manager="transactionManager" allow-start-if-complete="true">
<batch:chunk reader="xyzReader" processor="xyzProcessor"
writer="xyzWriter" commit-interval="4" >
<batch:streams>
<batch:stream ref="fileWriter"/>
</batch:streams>
</batch:chunk>
<batch:listeners>
<batch:listener ref="stepExecutionListener"/>
</batch:listeners>
</batch:tasklet>
</batch:step>
<batch:listeners>
<batch:listener ref="xyzJobListener" />
</batch:listeners>
</batch:job>
<bean id="xyzProcessor" class="com.batch.core.processor.XYZProcessor" scope="step">
<property name="fundDAO" ref="fundDAO"/>
<property name="loanDAO" ref="loanDAO"/>
<property name="aumBlnceDAO" ref="aumBalanceDAO"/>
---
--
</bean>
<bean id="xyzWriter" class="org.springframework.batch.item.support.CompositeItemWriter">
<property name="delegates">
<list>
<bean class="com.batch.core.writer.xyzWriter">
<property name="xyzDetailsDomain" ref="xyzDetailsDomain" />
<property name="xyzHistoryDomain" ref="xyzHistoryDomain"></property>
</bean>
<ref bean="fileWriter"/>
</list>
</property>
</bean>
<bean id="xyzReader" class="org.springframework.batch.item.database.JpaPagingItemReader" scope="step">
<property name="entityManagerFactory" ref="batchEntityManagerFactoryBean"/>
<property name="queryString">
<value><![CDATA[
SELECT pe, pap
FROM A pe, B pap
WHERE pap.userID =pe.userID and pe.status = 'E'
and pe.startDT <= '#{jobExecutionContext[previousQuarterEndDate]}'
and (pe.endDT is null
or pe.endDT > '#{jobExecutionContext[previousQuarterEndDate]}')]]>
</value>
</property>
<property name="pageSize" value="1000"/>
<property name="saveState" value="false" />
</bean>
When this Spring Batch job runs, I have noticed that it sometimes skips some records while reading data from the tables. The issue is not consistent; it occurs roughly 1 time out of 10.
Please help me resolve this. Thanks in advance!
The JpaPagingItemReader does not skip records. What is probably happening in your case is that some records corresponding to your search criteria are added while your reader is reading data. This may lead to wrong page calculation and some items may seem to be skipped but it is not the case.
A batch processing job is by definition acting on a fixed data set. If your query returns a fixed data set while your job is running, the same pages should be returned for each run of your job. If on the other hand another process inserts data that can be returned by your query, then your issue can happen.
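The effect described above can be reproduced with a small plain-Java model. This is only a sketch under stated assumptions: it assumes page n is fetched at offset n * pageSize, and it uses a hypothetical in-memory list in place of the tables. The class and method names are made up for illustration and are not part of Spring Batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OffsetPagingDrift {
    // Returns one page using plain offset arithmetic (page * pageSize).
    static List<Integer> readPage(List<Integer> table, int page, int pageSize) {
        int from = Math.min(page * pageSize, table.size());
        int to = Math.min(from + pageSize, table.size());
        return new ArrayList<>(table.subList(from, to));
    }

    // Reads all pages; afterFirstPage simulates another process touching the data mid-run.
    static List<Integer> readAll(List<Integer> table, int pageSize, Runnable afterFirstPage) {
        List<Integer> seen = new ArrayList<>();
        int page = 0;
        while (true) {
            List<Integer> items = readPage(table, page, pageSize);
            if (items.isEmpty()) {
                break;
            }
            seen.addAll(items);
            if (page == 0) {
                afterFirstPage.run();
            }
            page++;
        }
        return seen;
    }

    public static void main(String[] args) {
        // Fixed data set: every row is read exactly once.
        List<Integer> fixed = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        System.out.println(readAll(fixed, 2, () -> { }));

        // Row 1 stops matching the query after page 0 was read:
        // the remaining rows shift left and row 3 is never returned.
        List<Integer> changing = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        System.out.println(readAll(changing, 2, () -> changing.remove(Integer.valueOf(1))));
    }
}
```

In this model a row that enters the result set ahead of the current offset has the opposite effect: an already-read row is returned a second time. Either way, the reader itself is not dropping anything; the offsets simply no longer line up with the data.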
Hope this helps.

Spring Batch persist data from step1 to step2

I have thoroughly searched the Spring docs and supporting sites, but have not found an answer to this question: if I want to access and store some values in the ExecutionContext, do I have to write a custom databaseItemReader and ItemWriter that implement ItemStream, or can I use the out-of-the-box readers and writers and edit the beans in the spring-batch-context.xml file to do this? Any code examples would be greatly appreciated. Thanks!
http://www.springframework.org/schema/batch/spring-batch-3.0.xsd
http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-4.0.xsd">
<import resource="classpath:context-datasource.xml" />
<!-- JobRepository and JobLauncher are configuration/setup classes -->
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />
<bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
<!-- ItemReader which reads from database and returns the row mapped by
rowMapper -->
<bean id="databaseItemReader"
class="org.springframework.batch.item.database.JdbcCursorItemReader">
<property name="dataSource" ref="dataSource" />
<property name="sql"
value="SELECT PartnerID, ftpUserName, ftpPassword, ftpPath, jobRunTime, jobFrequency FROM tblRosterJobParams" />
<property name="rowMapper">
<bean class="com.explorelearning.batch.ParamResultRowMapper" />
</property>
</bean>
<!-- This was supposed to change to a SavingItemWriter that persists these values to the Step ExecutionContext -->
<bean id="flatFileItemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter"
scope="step">
<property name="resource" value="file:csv/ParamResult.txt" />
<property name="lineAggregator">
<!--An Aggregator which converts an object into delimited list of strings -->
<bean
class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
<property name="delimiter" value="," />
<property name="fieldExtractor">
<!-- Extractor which returns the value of beans property through reflection -->
<bean
class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="PartnerID" />
</bean>
</property>
</bean>
</property>
</bean>
<!-- Optional JobExecutionListener to perform business logic before and after the job -->
<bean id="jobListener" class="com.explorelearning.batch.RosterBatchJobListener" />
<!-- Optional StepExecutionListener to perform business logic before and after the job -->
<bean id="stepExecutionListener" class="com.explorelearning.batch.ParamResultStepExecutionListener" />
<!-- Optional ItemProcessor to perform business logic/filtering on the input records -->
<bean id="itemProcessor" class="com.explorelearning.batch.ParamResultItemProcessor" />
<!-- Step will need a transaction manager -->
<bean id="transactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<!-- Actual Job -->
<batch:job-repository id="jobRepository" data-source="dataSource" table-prefix="BATCH_"
transaction-manager="transactionManager" isolation-level-for-create="SERIALIZABLE" />
<batch:job id="RosterBatchJob" job-repository="jobRepository">
<batch:step id="readParams" >
<batch:tasklet transaction-manager="transactionManager" allow-start-if-complete="true">
<batch:chunk reader="databaseItemReader" writer="flatFileItemWriter"
processor="itemProcessor" commit-interval="10" />
</batch:tasklet>
</batch:step>
<!--<batch:step id="grabCSVs" next="validateCSVs">
</batch:step>
<batch:step id="validateCSVs" next="filterRecords">
</step>
<batch:step id="filterRecords" next="determineActions">
</batch:step>
<batch:step id="determineActions" next="executeActions">
</batch:step>
<batch:step id="executeActions" next="">
</batch:step> -->
</batch:job>
Updated now that the question has more detail...
Okay, the way I'm reading your context file, you need to:
1. Get some FTP login info from the database
2. Download some CSV files
3. Validate the files (file-level? or just record-level validation?)
4. Filter out some garbage records
5. Determine some "actions"
6. Execute some "actions"
The best way to accomplish this (in my opinion) is to create the following steps:
Step 1: A simple Tasklet that queries the database for FTP login info and then downloads the CSV files to a local folder
Step 2: Partitioned step that creates one partition per CSV in the folder
  - Each partition will have a reader (FlatFileItemReader) that reads out records.
  - Records will then go to an ItemProcessor that returns null if the record is garbage.
  - Valid items will be written to some DB staging table for further action.
  - Optionally you can use a ClassifierCompositeItemWriter to do something with the junk records.
Step 3: The next step reads the valid data in the staging table and does "Actions"
Step 4: Maybe another step to do something with the junk records
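The "processor returns null for garbage" idea in Step 2 can be modeled in a few lines of plain Java. This is a sketch of the filtering contract only, not Spring Batch code: the validation rule (a record must be non-blank and contain a comma) and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class FilteringChunkSketch {
    // Hypothetical validation rule: blank records or records without a comma are garbage.
    static String process(String record) {
        if (record == null || record.trim().isEmpty() || !record.contains(",")) {
            return null; // null signals "filter this record out"
        }
        return record.trim();
    }

    // Models one chunk: only items for which the processor returned non-null reach the writer.
    static List<String> runChunk(List<String> records, Function<String, String> processor) {
        List<String> toWrite = new ArrayList<>();
        for (String record : records) {
            String processed = processor.apply(record);
            if (processed != null) {
                toWrite.add(processed);
            }
        }
        return toWrite;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1,alice", "", "garbage", " 2,bob ");
        // Only the two well-formed records survive the processor.
        System.out.println(runChunk(input, FilteringChunkSketch::process));
    }
}
```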

Does TaskExecutorPartitionHandler use a data grid?

I am using step partitioning in my batch job, which is deployed in a distributed Spring XD environment. I would like to know whether TaskExecutorPartitionHandler uses the data transport, which in our case is RabbitMQ.
<bean id="itemReader"
class="sample.ItemReader"
scope="step" >
<property name="requestList"
value="#{stepExecutionContext[test]}" />
</bean>
<bean id="taskExecutor"
class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="10" />
<property name="maxPoolSize" value="15" />
<property name="allowCoreThreadTimeOut" value="true" />
</bean>
<batch:job id='partitionJob' restartable="false"
incrementer="jobParametersIncrementerImpl" >
<batch:step id="startLoopStep">
<batch:tasklet ref="initTasklet" />
</batch:step>
<batch:step id='partitionerStep'>
<batch:partition step="slave" partitioner="rangePartitioner">
<batch:handler grid-size="${gridSize}" task-executor="taskExecutor" />
</batch:partition>
</batch:step>
</batch:job>
<batch:step id="slave">
<batch:tasklet>
<batch:chunk reader="itemReader" writer="itemWriter"
commit-interval="1" retry-limit="3" >
</batch:chunk>
</batch:tasklet>
</batch:step>
No. The TaskExecutorPartitionHandler uses local threads to do the partitioning. The MessageChannelPartitionHandler is what you want for distributed partitioning. Spring XD comes with a context file that you can import for adding partitioning to single step jobs easily. Most of the out of the box jobs within Spring XD utilize this functionality and can serve as a reference.
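To make "local threads" concrete, here is a rough plain-Java analogy, assuming each partition is simply a task submitted to an in-JVM thread pool. It is not the TaskExecutorPartitionHandler source; the class name, method name, and result strings are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LocalPartitionSketch {
    // Runs gridSize partitions on a local pool and reports which thread handled each one.
    static List<String> runPartitions(int gridSize, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < gridSize; i++) {
                final int partition = i;
                futures.add(pool.submit(
                    () -> "partition" + partition + " ran on " + Thread.currentThread().getName()));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // wait for every partition, like the handler waits for its workers
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Every partition reports a thread of this JVM; no message broker is involved.
        for (String line : runPartitions(4, 2)) {
            System.out.println(line);
        }
    }
}
```

All work stays inside one process here, which is why no transport such as RabbitMQ comes into play with this style of handler.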

Spring Batch- Ibatis Batch Item Writer - Null Pointer Exception

I was trying to copy data from one data source to another, using the IbatisBatchItemWriter class to do so. The records were inserted into the target database, but at the end of the batch I got a NullPointerException, as below:
java.lang.NullPointerException
at org.springframework.batch.item.database.IbatisBatchItemWriter.write(IbatisBatchItemWriter.java:142)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.writeItems(SimpleChunkProcessor.java:175)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.doWrite(SimpleChunkProcessor.java:151)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.write(SimpleChunkProcessor.java:274)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.process(SimpleChunkProcessor.java:199)
at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:75)
After adding the property assertUpdates = false, the error went away and the data was still copied. But I am not convinced by the NullPointerException; it looks like I am missing something in my config.
I use spring infra 2.2.4 and iBATIS version 2.3.0.
<bean id="targetWriterDepAcct03"
class="org.springframework.batch.item.database.IbatisBatchItemWriter">
<property name="sqlMapClient" ref="targetDatabaseMap" />
<property name="statementId" value="DepositAccountSqlMap.updtDepositAccount" />
<property name="assertUpdates" value="false" />
</bean>
<batch:job id="baseJob" abstract="true" restartable="true"
job-repository="jobRepository" />
<batch:job id="TboltSyncBatchJob">
<batch:step id="CheckForConfigFileStep">
<batch:tasklet ref="CheckForConfigFile" />
<batch:next on="COMPLETED" to="SyncDataDepAcct03" />
<batch:end on="FAILED" />
</batch:step>
<batch:step id="SyncDataDepAcct03">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="sourceReaderForDepAcct03" writer="targetWriterDepAcct03"
commit-interval="1000" />
</batch:tasklet>
</batch:step>
</batch:job>
Any thoughts?
That NPE is due to the fact that zero results were returned by the SqlMapClient. If you don't need the number of records to be checked, you can turn that off. If you need them checked, you'll want to look into why that query isn't returning any results. You can see the code for the IbatisBatchItemWriter here: IbatisBatchItemWriter.

Passing data through two steps - Custom Field Mapper and Custom Tasklet

This is my job configuration:
<batch:job id="clientesJob" job-repository="jobRepository">
<batch:step id="step1" next="renameFiles">
<tasklet>
<chunk reader="multiResourceReader" writer="sqlWriter"
commit-interval="1" />
</tasklet>
</batch:step>
<batch:step id="renameFiles">
<tasklet ref="fileRenamingTasklet" />
</batch:step>
</batch:job>
<bean id="multiResourceReader"
class="org.springframework.batch.item.file.MultiResourceItemReader">
<property name="resources" value="file:c:/cvs/basecli*" />
<property name="delegate" ref="flatFileItemReader" />
</bean>
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper" ref="clienteMapper" />
<property name="lineTokenizer" ref="tickerLineTokenizer" />
</bean>
</property>
</bean>
<bean name="tickerLineTokenizer"
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer" />
<bean id="clienteMapper" class="com.bind.mapper.ClienteFieldSetMapper">
</bean>
<bean id="fileRenamingTasklet" class="com.bind.tasklet.FileRenamingTasklet">
<property name="directory" value="file:c:/cvs/" />
</bean>
In the first step I read the folder with a MultiResourceItemReader and write the data to SQL Server.
The second step renames the files to "PROCESSFILE-{originalname}".
What I want to achieve: if there was a problem in the first step, rename the file differently, e.g. "PROCESSERROR-{originalname}".
So I need to know the status of the first step in my FileRenamingTasklet.
I read about putting the data in the stepExecutionContext, but I can't access it in ClienteFieldSetMapper.
I also tried using listeners, but I can't pass the data through there.
For further processing I need the file name and the status.
Any ideas?
Make your fileRenamingTasklet a StepExecutionListener and listen for the afterStep result of step1: in StepExecutionListener.afterStep(StepExecution stepExecution), check stepExecution.getExitStatus() and you will be able to rename your files correctly.
To add the listener, modify your XML as follows:
<batch:step id="step1" next="renameFiles">
<tasklet>
<chunk reader="multiResourceReader" writer="sqlWriter" commit-interval="1" />
</tasklet>
<listeners>
<listener ref="fileRenamingTasklet" />
</listeners>
</batch:step>
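The renaming decision inside afterStep could look roughly like the following. This is a minimal sketch that takes the exit code as a plain string (the value of stepExecution.getExitStatus().getExitCode(), which is "COMPLETED" for a successful step); the class and method names are hypothetical, and treating every non-COMPLETED code as an error is an assumption.

```java
class RenamePrefixSketch {
    // Picks the target file name from the step's exit code.
    // Assumption: anything other than "COMPLETED" counts as an error.
    static String targetName(String exitCode, String originalName) {
        String prefix = "COMPLETED".equals(exitCode) ? "PROCESSFILE-" : "PROCESSERROR-";
        return prefix + originalName;
    }

    public static void main(String[] args) {
        System.out.println(targetName("COMPLETED", "basecli01.csv")); // PROCESSFILE-basecli01.csv
        System.out.println(targetName("FAILED", "basecli01.csv"));    // PROCESSERROR-basecli01.csv
    }
}
```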