I am using step partitioning in my batch job, which is deployed in a distributed Spring XD environment. I would like to know whether TaskExecutorPartitionHandler uses the data transport, which in our case is RabbitMQ.
<bean id="itemReader"
class="sample.ItemReader"
scope="step" >
<property name="requestList"
value="#{stepExecutionContext[test]}" />
</bean>
<bean id="taskExecutor"
class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="10" />
<property name="maxPoolSize" value="15" />
<property name="allowCoreThreadTimeOut" value="true" />
</bean>
<batch:job id='partitionJob' restartable="false"
incrementer="jobParametersIncrementerImpl" >
<batch:step id="startLoopStep">
<batch:tasklet ref="initTasklet" />
</batch:step>
<batch:step id='partitionerStep'>
<batch:partition step="slave" partitioner="rangePartitioner">
<batch:handler grid-size="${gridSize}" task-executor="taskExecutor" />
</batch:partition>
</batch:step>
</batch:job>
<batch:step id="slave">
<batch:tasklet>
<batch:chunk reader="itemReader" writer="itemWriter"
commit-interval="1" retry-limit="3" >
</batch:chunk>
</batch:tasklet>
</batch:step>
No. The TaskExecutorPartitionHandler uses local threads to execute the partitions, so no data transport is involved. The MessageChannelPartitionHandler is what you want for distributed partitioning. Spring XD comes with a context file that you can import to add partitioning to single-step jobs easily. Most of the out-of-the-box jobs in Spring XD use this functionality and can serve as a reference.
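For orientation, here is a minimal sketch of what the master side of a MessageChannelPartitionHandler setup can look like outside of the Spring XD import. The channel bean names (outboundRequests, inboundReplies) and the timeout are assumptions made for illustration; the channels themselves and the worker side that receives the step execution requests are not shown.

```xml
<!-- Sketch only: outboundRequests and inboundReplies are hypothetical channel beans -->
<bean id="partitionHandler"
      class="org.springframework.batch.integration.partition.MessageChannelPartitionHandler">
    <!-- Name of the step that the workers execute for each partition -->
    <property name="stepName" value="slave" />
    <property name="gridSize" value="${gridSize}" />
    <property name="replyChannel" ref="inboundReplies" />
    <property name="messagingOperations">
        <bean class="org.springframework.integration.core.MessagingTemplate">
            <property name="defaultChannel" ref="outboundRequests" />
            <property name="receiveTimeout" value="100000" />
        </bean>
    </property>
</bean>

<batch:step id="partitionerStep">
    <!-- The handler attribute replaces the nested <batch:handler task-executor="..."/> element -->
    <batch:partition partitioner="rangePartitioner" handler="partitionHandler" />
</batch:step>
```

With this wiring the partition requests travel over the messaging middleware instead of being handed to local threads.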
<batch:job id="xyzJob" job-repository="jobRepository"
incrementer="jobParametersIncrementerImpl" restartable="false">
<batch:step id="feeStep">
<batch:tasklet transaction-manager="transactionManager" allow-start-if-complete="true">
<batch:chunk reader="xyzReader" processor="xyzProcessor"
writer="xyzWriter" commit-interval="4" >
<batch:streams>
<batch:stream ref="fileWriter"/>
</batch:streams>
</batch:chunk>
<batch:listeners>
<batch:listener ref="stepExecutionListener"/>
</batch:listeners>
</batch:tasklet>
</batch:step>
<batch:listeners>
<batch:listener ref="xyzJobListener" />
</batch:listeners>
</batch:job>
<bean id="xyzProcessor" class="com.batch.core.processor.XYZProcessor" scope="step">
<property name="fundDAO" ref="fundDAO"/>
<property name="loanDAO" ref="loanDAO"/>
<property name="aumBlnceDAO" ref="aumBalanceDAO"/>
---
--
</bean>
<bean id="xyzWriter" class="org.springframework.batch.item.support.CompositeItemWriter">
<property name="delegates">
<list>
<bean class="com.batch.core.writer.xyzWriter">
<property name="xyzDetailsDomain" ref="xyzDetailsDomain" />
<property name="xyzHistoryDomain" ref="xyzHistoryDomain"></property>
</bean>
<ref bean="fileWriter"/>
</list>
</property>
</bean>
<bean id="xyzReader" class="org.springframework.batch.item.database.JpaPagingItemReader" scope="step">
<property name="entityManagerFactory" ref="batchEntityManagerFactoryBean"/>
<property name="queryString">
<value><![CDATA[
SELECT pe, pap
FROM A pe, B pap
WHERE pap.userID =pe.userID and pe.status = 'E'
and pe.startDT <= '#{jobExecutionContext[previousQuarterEndDate]}'
and (pe.endDT is null
or pe.endDT > '#{jobExecutionContext[previousQuarterEndDate]}')]]>
</value>
</property>
<property name="pageSize" value="1000"/>
<property name="saveState" value="false" />
</bean>
When this Spring Batch job runs, I have noticed that it sometimes skips some records while reading data from the tables. The issue is not consistent: it occurs in roughly 1 run out of 10.
Please help me resolve this. Thanks in advance!
The JpaPagingItemReader does not skip records. What is probably happening in your case is that some records matching your search criteria are added while your reader is reading data. This can shift the page boundaries, so some items may appear to be skipped even though the reader itself did not skip them.
A batch processing job is by definition acting on a fixed data set. If your query returns a fixed data set while your job is running, the same pages should be returned for each run of your job. If on the other hand another process inserts data that can be returned by your query, then your issue can happen.
Hope this helps.
I have implemented my own ItemStreamReader to make reading synchronized. However, I don't know how to use it in my step.
SynchronizedItemReader<T> implements ItemStreamReader<T>
That's my custom class. My XML configuration for the reader used in the step is as follows:
<bean id="xmlItemReaderStep2" class="org.springframework.batch.item.xml.StaxEventItemReader">
<property name="resource" value="classpath:report.xml" />
<property name="fragmentRootElementName" value="class" />
<property name="unmarshaller">
<bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.model.ClassNode</value>
</list>
</property>
</bean>
</property>
</bean>
How do I use my SynchronizedItemReader<T> in the step?
Job config below
<batch:step id="step2" next="step3">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="xmlItemReaderStep2" writer="testResultsWriter"
processor="itemProcessor2" commit-interval="500" />
<batch:listeners>
<batch:listener ref="promotionListener" />
<batch:listener ref="jobListener3" />
</batch:listeners>
</batch:tasklet>
</batch:step>
Declare your custom reader as a bean in your XML configuration, named itemStreamReader for example, and reference it from the reader attribute of the chunk in your step:
<batch:step id="step2" next="step3">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="itemStreamReader" writer="testResultsWriter"
processor="itemProcessor2" commit-interval="500" />
<batch:listeners>
<batch:listener ref="promotionListener" />
<batch:listener ref="jobListener3" />
</batch:listeners>
</batch:tasklet>
</batch:step>
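The bean definition itself could look like the sketch below. The package name com.example and the delegate property are assumptions about the custom class, which the question does not show; adjust them to match your SynchronizedItemReader.

```xml
<!-- Hypothetical: com.example and the "delegate" property are assumptions about the custom class -->
<bean id="itemStreamReader" class="com.example.SynchronizedItemReader">
    <!-- Wraps the existing StAX reader so that calls to it go through the synchronized wrapper -->
    <property name="delegate" ref="xmlItemReaderStep2" />
</bean>
```

The step then only refers to itemStreamReader, and the StaxEventItemReader is reached through the wrapper.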
I am trying to configure my first multi-threaded job. We have a master file of about 200,000 records which we need to process. I want to break the file down into 10 files and process them. The split-file tasklet works fine and splits the big file into 10 smaller files. I pass the path of the files to the job as a job parameter, with key "urlFilesPath" and value "file:/scraper/spliturl*". I verified that the files are being read correctly in the MultiResourcePartitioner bean.
The master step is running in my configuration but the slave step does not run. Below is my configuration.
Partitioner:
<bean id="partitioner"
class="org.springframework.batch.core.partition.support.MultiResourcePartitioner"
scope="step">
<property name="resources" value="#{jobParameters['urlFilesPath']}" />
</bean>
MultiResourceItemReader:
<bean id="multiResourceItemReader"
class="org.springframework.batch.item.file.MultiResourceItemReader"
scope="step">
<property name="resources" value="#{jobParameters['urlFilesPath']}" />
<property name="delegate" ref="urlFileItemReader" />
<property name="strict" value="true" />
<property name="saveState" value="false" />
</bean>
FlatFileItemReader:
<bean id="urlFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader"
scope="step">
<property name="lineMapper" ref="passThroughLineMapper" />
<property name="resource" value="#{stepExecutionContext['fileName']}" />
<property name="saveState" value="true" />
</bean>
Job Configuration:
<batch:job id="importJob" job-repository="jobRepository">
<batch:step id="fileSplitter" next="readURLFileRunner">
<batch:tasklet ref="fileSplittingTasklet"
transaction-manager="transactionManager" />
</batch:step>
<batch:step id="readURLFileRunner">
<batch:partition step="readURLFile" partitioner="partitioner">
<batch:handler grid-size="10" task-executor="taskExecutor" />
</batch:partition>
</batch:step>
</batch:job>
Slave Step Configuration:
<batch:step id="readURLFile">
<batch:tasklet transaction-manager="transactionManager"
task-executor="taskExecutor" throttle-limit="10">
<batch:chunk reader="multiResourceItemReader" processor="urlFileItemProcessor"
writer="validURLItemWriter" commit-interval="200" skip-limit="100">
<batch:skippable-exception-classes>
<batch:include class="java.net.MalformedURLException" />
<batch:include class="java.net.URISyntaxException" />
<batch:include class="java.net.UnknownHostException" />
</batch:skippable-exception-classes>
</batch:chunk>
<batch:listeners>
<batch:listener ref="malformedURLExceptionListener" />
<batch:listener ref="uriSyntaxExceptionListener" />
<batch:listener ref="unknownHostExceptionListener" />
</batch:listeners>
</batch:tasklet>
<batch:end on="COMPLETED" />
</batch:step>
Please advise what I am doing incorrectly. I do not see the processor urlFileItemProcessor & the writer validURLItemWriter being executed.
Update
I followed the answer given by @dimzak, but I still do not see the step readURLFile being executed, and the log statements from urlFileItemProcessor and validURLItemWriter are not printed to the console. The job hangs after the following log line:
org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate - Starting repeat context.
I have doubts about the way I have configured the step scope.
StepScope Configuration
<bean class="org.springframework.batch.core.scope.StepScope">
<property name="autoProxy" value="true"/>
<property name="proxyTargetClass" value="true"/>
</bean>
On the Spring forums, I have read that the delegate property of multiResourceItemReader need not be step scoped. When I removed the scope="step" from the urlFileItemReader, I get the below exception.
INFO : 26 Sep 2014 00:10:58,811 - org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor - Initializing ExecutorService 'taskExecutor'
INFO : 26 Sep 2014 00:10:59,066 - org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor - Shutting down ExecutorService 'taskExecutor'
Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'urlFileItemReader' defined in class path resource [beansBatchService.xml]: Initialization of bean failed; nested exception is org.springframework.beans.factory.BeanExpressionException: Expression parsing failed; nested exception is org.springframework.expression.spel.SpelEvaluationException: EL1008E:(pos 0): Field or property 'stepExecutionContext' cannot be found on object of type 'org.springframework.beans.factory.config.BeanExpressionContext'
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:547)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:475)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:304)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:228)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:300)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:195)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:703)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:760)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:482)
at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:139)
at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:83)
at com.chw.hma.service.batch.jobrunner.MainJobRunner.run(MainJobRunner.java:29)
at com.chw.hma.service.batch.jobrunner.MainJobRunner.main(MainJobRunner.java:21)
Caused by: org.springframework.beans.factory.BeanExpressionException: Expression parsing failed; nested exception is org.springframework.expression.spel.SpelEvaluationException: EL1008E:(pos 0): Field or property 'stepExecutionContext' cannot be found on object of type 'org.springframework.beans.factory.config.BeanExpressionContext'
at org.springframework.context.expression.StandardBeanExpressionResolver.evaluate(StandardBeanExpressionResolver.java:146)
at org.springframework.beans.factory.support.AbstractBeanFactory.evaluateBeanDefinitionString(AbstractBeanFactory.java:1364)
at org.springframework.beans.factory.support.BeanDefinitionValueResolver.evaluate(BeanDefinitionValueResolver.java:214)
at org.springframework.beans.factory.support.BeanDefinitionValueResolver.resolveValueIfNecessary(BeanDefinitionValueResolver.java:186)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyPropertyValues(AbstractAutowireCapableBeanFactory.java:1456)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1197)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:537)
... 12 more
Caused by: org.springframework.expression.spel.SpelEvaluationException: EL1008E:(pos 0): Field or property 'stepExecutionContext' cannot be found on object of type 'org.springframework.beans.factory.config.BeanExpressionContext'
at org.springframework.expression.spel.ast.PropertyOrFieldReference.readProperty(PropertyOrFieldReference.java:217)
at org.springframework.expression.spel.ast.PropertyOrFieldReference.getValueInternal(PropertyOrFieldReference.java:85)
at org.springframework.expression.spel.ast.PropertyOrFieldReference.getValueInternal(PropertyOrFieldReference.java:78)
at org.springframework.expression.spel.ast.CompoundExpression.getValueRef(CompoundExpression.java:49)
at org.springframework.expression.spel.ast.CompoundExpression.getValueInternal(CompoundExpression.java:85)
at org.springframework.expression.spel.ast.SpelNodeImpl.getValue(SpelNodeImpl.java:102)
at org.springframework.expression.spel.standard.SpelExpression.getValue(SpelExpression.java:94)
at org.springframework.context.expression.StandardBeanExpressionResolver.evaluate(StandardBeanExpressionResolver.java:143)
... 18 more
Please advise.
Update Sep 27, 2014
Posting the task executor etc., as advised by @dimzak:
<bean id="taskExecutor"
class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="10" />
<property name="maxPoolSize" value="10" />
</bean>
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
<property name="transactionManager" ref="transactionManager" />
<property name="dataSource" ref="dataSource" />
<property name="databaseType" value="mySQL" />
</bean>
<bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
<property name="taskExecutor" ref="taskExecutor" />
</bean>
MultiResourcePartitioner takes file:/scrapper/spliturl* and creates one partition, and therefore one slave step execution, per matching file, such as spliturl1 and spliturl2.
Each slave step execution then only has to read the one file assigned to it.
So your slave step should use urlFileItemReader directly, without a MultiResourceItemReader:
<batch:step id="readURLFile">
<batch:tasklet transaction-manager="transactionManager"
task-executor="taskExecutor" throttle-limit="10">
<batch:chunk reader="urlFileItemReader" processor="urlFileItemProcessor"
writer="validURLItemWriter" commit-interval="200" skip-limit="100">
<batch:skippable-exception-classes>
<batch:include class="java.net.MalformedURLException" />
<batch:include class="java.net.URISyntaxException" />
<batch:include class="java.net.UnknownHostException" />
</batch:skippable-exception-classes>
</batch:chunk>
<batch:listeners>
<batch:listener ref="malformedURLExceptionListener" />
<batch:listener ref="uriSyntaxExceptionListener" />
<batch:listener ref="unknownHostExceptionListener" />
</batch:listeners>
</batch:tasklet>
<batch:end on="COMPLETED" />
</batch:step>
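As a sketch of how the two beans connect: MultiResourcePartitioner puts each file's URL into the partition's step execution context under a key that defaults to fileName, and the step-scoped reader picks it up from there. The keyName property is set explicitly below only to make the link visible; the bean names are the ones from the question.

```xml
<bean id="partitioner"
      class="org.springframework.batch.core.partition.support.MultiResourcePartitioner"
      scope="step">
    <property name="resources" value="#{jobParameters['urlFilesPath']}" />
    <!-- "fileName" is already the default key; shown here for clarity -->
    <property name="keyName" value="fileName" />
</bean>

<bean id="urlFileItemReader"
      class="org.springframework.batch.item.file.FlatFileItemReader"
      scope="step">
    <property name="lineMapper" ref="passThroughLineMapper" />
    <!-- Resolved per partition, so each slave step execution reads exactly one file -->
    <property name="resource" value="#{stepExecutionContext['fileName']}" />
</bean>
```

The reader has to stay step scoped, because stepExecutionContext can only be resolved while a step is running.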
I'm setting up a batch job that moves data between three databases. I am planning on using the out-of-the-box Spring Batch classes to handle the query from the first database, but I want to include details of the current job/step in the extract. The example Spring config might look like this:
<bean id="jdbcPagingItemReader" class="org.springframework.batch.item.database.JdbcPagingItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="pageSize" value="1000"/>
<property name="fetchSize" value="100"/>
<property name="queryProvider">
<bean class="org.springframework.batch.item.database.support.HsqlPagingQueryProvider">
<property name="selectClause" value="select id, bar"/>
<property name="fromClause" value="foo"/>
<property name="sortKeys">
Is there a way via Groovy or SpEL to access the current JobExecution? I found this thread on access-spring-batch-job-definition, but it assumes custom code.
Your configuration is cut off at the sortKeys entry so I'm not 100% sure what you are attempting to accomplish. That being said, using step scope, you can inject the StepExecution which has a reference to the JobExecution. Getting the JobExecution would look something like this:
<bean id="jdbcPagingItemReader" class="org.springframework.batch.item.database.JdbcPagingItemReader" scope="step">
…
<property name="jobExecution" value="#{stepExecution.jobExecution}"/>
</bean>
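If the goal is to carry a job detail into the extracted rows, one possible approach is to inline a value from the JobExecution into the select clause with a SpEL template. This is only a sketch: the column alias job_execution_id and the sort key id are assumptions, since the original configuration is cut off before its sortKeys entry.

```xml
<bean id="jdbcPagingItemReader"
      class="org.springframework.batch.item.database.JdbcPagingItemReader"
      scope="step">
    <property name="dataSource" ref="dataSource" />
    <property name="pageSize" value="1000" />
    <property name="fetchSize" value="100" />
    <property name="queryProvider">
        <bean class="org.springframework.batch.item.database.support.HsqlPagingQueryProvider">
            <!-- The SpEL expression is evaluated when the step starts; the alias is hypothetical -->
            <property name="selectClause"
                      value="select id, bar, #{stepExecution.jobExecution.id} as job_execution_id" />
            <property name="fromClause" value="foo" />
            <!-- Assumed sort key; a paging query needs a stable ordering -->
            <property name="sortKeys">
                <map>
                    <entry key="id" value="ASCENDING" />
                </map>
            </property>
        </bean>
    </property>
    <!-- A rowMapper is also required; it is omitted because the question does not show one -->
</bean>
```

The scope="step" attribute is what makes stepExecution available to the expression; without it the expression is evaluated at application startup and fails.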
The write count exists on the step context (StepExecution.getWriteCount()) so first you need to promote it to the JobExecutionContext.
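I suggest using a step listener with the @AfterStep annotation for this:
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepListener;
import org.springframework.batch.core.annotation.AfterStep;
import org.springframework.stereotype.Component;

@Component
public class PromoteWriteCountToJobContextListener implements StepListener {
    // Copies the step's write count into the job execution context under "<stepName>.writeCount"
    @AfterStep
    public ExitStatus afterStep(StepExecution stepExecution) {
        int writeCount = stepExecution.getWriteCount();
        stepExecution.getJobExecution().getExecutionContext()
            .put(stepExecution.getStepName() + ".writeCount", writeCount);
        return stepExecution.getExitStatus();
    }
}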
Add this listener to every step that performs inserts:
<batch:step id="readFromDB1WrietToDB2" next="readFromDB2WrietToDB3">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="reader" writer="writter" />
</batch:tasklet>
<batch:listeners>
<batch:listener ref="promoteWriteCountToJobContextListener"/>
</batch:listeners>
</batch:step>
<batch:step id="readFromDB2WrietToDB3" next="summerizeWrites">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="reader" writer="writter" />
</batch:tasklet>
<batch:listeners>
<batch:listener ref="promoteWriteCountToJobContextListener"/>
</batch:listeners>
</batch:step>
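The listener ref in these steps has to resolve to a bean named promoteWriteCountToJobContextListener. With component scanning enabled, the @Component annotation produces that name by default. If component scanning is not enabled, a minimal sketch of declaring the bean explicitly follows; the package com.example is an assumption.

```xml
<!-- Only needed when component scanning is not enabled; com.example is a hypothetical package -->
<bean id="promoteWriteCountToJobContextListener"
      class="com.example.PromoteWriteCountToJobContextListener" />
```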
You can use log4j to write the results to the log from within the step listener, or you can use the saved values in a later step. To read them you need a SpEL expression, and for the expression to be resolved at run time the writer or tasklet bean must be declared with scope="step":
<batch:step id="summerizeWrites">
<batch:tasklet id="summerizeCopyWritesTaskelt">
<bean class="Tasklet" scope="step">
<property name="writeCountsList">
<list>
<value>#{jobExecutionContext['readFromDB1WrietToDB2.writeCount']}</value>
<value>#{jobExecutionContext['readFromDB2WrietToDB3.writeCount']}</value>
</list>
</property>
</bean>
</batch:tasklet>
</batch:step>
This is my job configuration:
<batch:job id="clientesJob" job-repository="jobRepository">
<batch:step id="step1" next="renameFiles">
<tasklet>
<chunk reader="multiResourceReader" writer="sqlWriter"
commit-interval="1" />
</tasklet>
</batch:step>
<batch:step id="renameFiles">
<tasklet ref="fileRenamingTasklet" />
</batch:step>
</batch:job>
<bean id="multiResourceReader"
class="org.springframework.batch.item.file.MultiResourceItemReader">
<property name="resources" value="file:c:/cvs/basecli*" />
<property name="delegate" ref="flatFileItemReader" />
</bean>
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper" ref="clienteMapper" />
<property name="lineTokenizer" ref="tickerLineTokenizer" />
</bean>
</property>
</bean>
<bean name="tickerLineTokenizer"
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer" />
<bean id="clienteMapper" class="com.bind.mapper.ClienteFieldSetMapper">
</bean>
<bean id="fileRenamingTasklet" class="com.bind.tasklet.FileRenamingTasklet">
<property name="directory" value="file:c:/cvs/" />
</bean>
In the first step I read the folder with a MultiResourceItemReader and write the data to SQL Server.
The second step renames the files to "PROCESSFILE-{originalname}".
What I want to achieve is this: if there was a problem in the first step, rename the file differently, for example to "PROCESSERROR-{originalname}".
So I need to know the status of the first step in my FileRenamingTasklet.
I read about putting the data in the stepExecutionContext, but I can't access it in ClienteFieldSetMapper.
I also tried using listeners, but I can't pass the data through there.
For further processing I need the file name and the status.
Any ideas?
Make your fileRenamingTasklet a StepExecutionListener and have it listen for the result of step1: in StepExecutionListener.afterStep(StepExecution stepExecution), check stepExecution.getExitStatus() and rename your files accordingly.
To add the listener, modify your XML as follows:
<batch:step id="step1" next="renameFiles">
<tasklet>
<chunk reader="multiResourceReader" writer="sqlWriter" commit-interval="1" />
</tasklet>
<listeners>
<listener ref="fileRenamingTasklet" />
</listeners>
</batch:step>