JdbcCursorItemReader how to read different JobParameters other than String - spring-batch

I have a requirement with a dynamic query where JobParameters are built using JobParametersBuilder with String, Date, Long, etc., but I am able to read only String values in JdbcCursorItemReader. How can we read types other than String in JdbcCursorItemReader so they can be set in the PreparedStatement query? Thanks in advance

With a step-scoped bean and a SpEL expression you can inject and use any type of parameter in your query. Here is an example with a non-String parameter from one of the samples:
<bean id="itemReader" scope="step" autowire-candidate="false" class="org.springframework.batch.item.database.JdbcPagingItemReader">
<property name="dataSource" ref="dataSource" />
<property name="rowMapper">
<bean class="org.springframework.batch.sample.domain.trade.internal.CustomerCreditRowMapper" />
</property>
<property name="queryProvider">
<bean class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean">
<property name="dataSource" ref="dataSource"/>
<property name="fromClause" value="CUSTOMER"/>
<property name="selectClause" value="ID,NAME,CREDIT"/>
<property name="sortKeys">
<map>
<entry key="ID" value="ASCENDING"/>
</map>
</property>
<property name="whereClause" value="ID &gt;= :minId and ID &lt;= :maxId"/>
</bean>
</property>
<property name="parameterValues">
<map>
<entry key="minId" value="#{stepExecutionContext[minValue]}"/>
<entry key="maxId" value="#{stepExecutionContext[maxValue]}"/>
</map>
</property>
</bean>
EDIT: Here is an example in the Java configuration style:
@Bean
@StepScope
public JdbcCursorItemReader<Person> personReader(@Value("#{jobParameters['id']}") Long id) {
    JdbcCursorItemReader<Person> itemReader = new JdbcCursorItemReader<>();
    itemReader.setSql("select * from person where id = ?");
    // bind the typed job parameter instead of concatenating it into the SQL
    itemReader.setPreparedStatementSetter(ps -> ps.setLong(1, id));
    // set other properties on the reader
    return itemReader;
}
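For context, the typed parameters those SpEL expressions read are created on the launching side with JobParametersBuilder; a minimal sketch (the `job` and `jobLauncher` variables are assumed to exist):

```java
// Each addXxx call records the parameter with its type, so
// #{jobParameters['id']} can be injected as a Long rather than a String.
JobParameters params = new JobParametersBuilder()
        .addLong("id", 42L)
        .addDate("runDate", new Date())
        .addString("status", "OK")
        .toJobParameters();

jobLauncher.run(job, params);
```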

Related

How to specify fragmentRootElementNames in StaxEventItemReader for nested elements?

Working with Spring Batch 4.1.2 to consume a complex XML file and write to a database.
Using StaxEventItemReader for reading the XML data.
I am facing an issue with nested elements when specifying fragmentRootElementNames.
<bean id="productFileItemReader" class="org.springframework.batch.item.xml.StaxEventItemReader" scope="step">
<property name="resource" ref="inputResource" />
<property name="fragmentRootElementNames" value="ProductFeatures,ProductFeaturesDetail,SomeOtherNon-NestedFragmentNames..." />
<property name="unmarshaller" ref="productMarshaller" />
</bean>
<bean id="productMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.xyz.jaxb.ProductFeaturesType</value>
<value>com.xyz.jaxb.ProductFeaturesDetailType</value>
...Some more types
</list>
</property>
</bean>
<bean id="productFileItemProcessor" class="org.springframework.batch.item.support.ClassifierCompositeItemProcessor"
scope="step">
<property name="classifier" ref="prodItemProcessorclassifier" />
</bean>
<bean id="prodItemProcessorclassifier" class="org.springframework.classify.BackToBackPatternClassifier">
<property name="routerDelegate">
<bean class="com.xyz.batch.CustomProdTypeClassifier" />
</property>
<property name="matcherMap">
<map>
<entry key="ProductFeatures" value-ref="prodFeaturesProcessor" />
<entry key="ProductFeaturesDetail" value-ref="prodFeaturesDetailProcessor" />
</map>
</property>
</bean>
<bean id="prodItemWriter" class="org.springframework.batch.item.support.ClassifierCompositeItemWriter"
scope="step">
<property name="classifier" ref="prodItemWriterclassifier" />
</bean>
<bean id="prodItemWriterclassifier" class="org.springframework.classify.BackToBackPatternClassifier">
<property name="routerDelegate">
<bean class="com.xyz.batch.CustomProdTypeClassifier" />
</property>
<property name="matcherMap">
<map>
<entry key="ProductFeaturesDetail" value-ref="prodHibernateItemWriter" />
<entry key="ProductParameters" value-ref="myListItemWriter" />
</map>
</property>
</bean>
<bean id="myListItemWriter" class="com.xyz.writer.ListDelegateItemWriter">
<property name="delegate" ref="prodHibernateItemWriter" />
</bean>
<bean id="prodHibernateItemWriter" class="org.springframework.batch.item.database.HibernateItemWriter">
<property name="sessionFactory" ref="mySessionFactory" />
</bean>
Here are the signatures of the ItemProcessors:
public class ProductFeaturesProcessor implements ItemProcessor<ProductFeaturesType, List<ProductParamEntity>>{}
public class ProductFeaturesDetailProcessor implements ItemProcessor<ProductFeaturesDetailType, ProductFeaturesDetailEntity> {}
XML structure is as follows -
<ProductFeatures>
<Version>12</Version>
<MessageEN>Welcome</MessageEN>
<MessageFR>Bienvenue</MessageFR>
<ProductFeaturesDetail>
<!-- Some elements here -->
</ProductFeaturesDetail>
<ProductFeaturesDetail>
<!-- Some elements here -->
</ProductFeaturesDetail>
<ProductFeaturesDetail>
<!-- Some elements here -->
</ProductFeaturesDetail>
<ProductFeaturesDetail>
<!-- Some elements here -->
</ProductFeaturesDetail>
</ProductFeatures>
Here, ProductFeatures has minOccurs=1, maxOccurs=1.
ProductFeaturesDetail is unbounded.
Each of Version, MessageEN, MessageFR needs to be persisted as a record in Table-A, which has 2 columns ParamName|ParamValue. Each ProductFeaturesDetail is a record to be persisted in Table-B.
i.e. the 3 elements are each mapped to ProductParamEntity (with
ParamName=Version, ParamValue=12; ParamName=MessageEN,
ParamValue=Welcome, etc.)
Logically, ProductFeatures is NOT mapped to a table directly, but its data
is stored in 2 tables as List<ProductParamEntity> and
List<ProductFeaturesDetailEntity>, in tables PRODUCT_PARAM and
PRODUCT_FEATURES_DETAIL respectively.
Here are the Objects - JAXB and Corresponding Entities
@XmlRootElement(name = "ProductFeatures")
public class ProductFeaturesType {
    @XmlElement(name = "Version")
    protected String version;
    @XmlElement(name = "MessageEN")
    protected String messageEN;
    @XmlElement(name = "MessageFR")
    protected String messageFR;
    @XmlElement(name = "ProductFeaturesDetail")
    protected List<ProductFeaturesDetailType> prodFeaturesDetail;
}
@Entity
@Table(name="PRODUCT_PARAM")
public class ProductParamsEntity extends BaseEntity implements Serializable {
    @Id
    @Column(name="PARAM_NAME")
    private String paramName;
    @Column(name="PARAM_VALUE")
    private String paramValue;
}
@Entity
@Table(name="PRODUCT_FEATURES_DETAIL")
public class ProductFeaturesDetailEntity extends BaseEntity implements Serializable {
    @Id
    @Column(name="PROD_CATEGORY")
    private String prodCategory;
    //---More attributes ---
}
Here is my Job Config :
<batch:job id="consumeProductFileJob" job-repository="jobRepository" restartable="true">
<batch:step id="validateFile" parent="validateXMLSchema">
<batch:next on="COMPLETED" to="persistData" />
<batch:next on="FAILED" to="notifyException" />
</batch:step>
<batch:step id="persistData">
<batch:tasklet transaction-manager="prodHibernateTransactionManager">
<batch:chunk reader="productFileItemReader" processor="productFileItemProcessor" writer="prodItemWriter" commit-interval="500" >
</batch:chunk>
<batch:transaction-attributes isolation="DEFAULT" propagation="REQUIRED" />
</batch:tasklet>
</batch:step>
<batch:step id="notifyException">
-- Do something---
</batch:step>
</batch:job>
Issue: If I specify both as fragmentRootElementNames, only the ProductFeatures elements Version, MessageEN, MessageFR are persisted
in Table-A; ProductFeaturesDetail is ignored.
If I specify just ProductFeaturesDetail, it works fine, but I don't get
the individual elements within ProductFeatures.
I want the data from both elements. What is the way to achieve this?
P.S. I am using HibernateItemWriter for persistence.
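One approach worth considering for nested fragments like this: declare only ProductFeatures as the fragment root and split each fragment in the processor, so both the parameter records and the detail records come out of a single read. A rough sketch, assuming getters on the JAXB type above and a hypothetical `toDetailEntity` mapping helper:

```java
public class ProductFeaturesSplitProcessor
        implements ItemProcessor<ProductFeaturesType, List<Object>> {

    @Override
    public List<Object> process(ProductFeaturesType item) {
        List<Object> entities = new ArrayList<>();
        // Version / MessageEN / MessageFR -> ProductParamsEntity rows (Table-A)
        entities.add(param("Version", item.getVersion()));
        entities.add(param("MessageEN", item.getMessageEN()));
        entities.add(param("MessageFR", item.getMessageFR()));
        // each nested detail -> ProductFeaturesDetailEntity row (Table-B)
        for (ProductFeaturesDetailType detail : item.getProdFeaturesDetail()) {
            entities.add(toDetailEntity(detail)); // hypothetical mapping helper
        }
        return entities;
    }

    private ProductParamsEntity param(String name, String value) {
        ProductParamsEntity entity = new ProductParamsEntity();
        // populate paramName/paramValue via setters (assumed to exist)
        return entity;
    }
}
```

The resulting list would then go through a list-unwrapping writer such as the ListDelegateItemWriter already shown in the config.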

maxItemCount property not working for JdbcPagingItemReader

I am setting the maxItemCount property of JdbcPagingItemReader to 200, but I am getting read/processed/written counts of 203 or 205.
Most of the time I get 200, but counts slightly above 200 are common.
Why is this happening?
I've checked, and there are no duplicate timestamp values for the sort key among the 203-205 processed rows, and the max.item.count field is not present in the batch_execution_context entry in the database table.
There is a JdbcPagingItemReader.read.count.max field, but it is set to 200.
I am using Oracle.
<bean id="batchReader" class="org.springframework.batch.item.database.JdbcPagingItemReader" scope="step">
<property name="dataSource" ref="myDataSource"/>
<property name="queryProvider">
<bean class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean">
<property name="dataSource" ref="myDataSource"/>
<property name="selectClause" value="select *" />
<property name="fromClause" value="from TRANSACTION" />
<property name="whereClause" value="where STATUS = 'OK' and TYPE = 200 " />
<property name="sortKey" value="TRANSACTION_TIMESTAMP"/>
</bean>
</property>
<!-- Inject via the ExecutionContext in rangePartitioner -->
<property name="parameterValues">
<map>
</map>
</property>
<property name="maxItemCount" value="200"/>
<property name="pageSize" value="50"/>
<property name="rowMapper">
<bean class="com.mappers.TransactionMapper" scope="step"/>
</property>
</bean>
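One thing worth verifying, since the reference documentation requires the paging sort key to be unique: if TRANSACTION_TIMESTAMP repeats across a page boundary, pages can overlap and show up as duplicated or extra items. A tie-breaking column can be added via sortKeys (this sketch assumes the table has a unique ID column):

```xml
<property name="sortKeys">
    <map>
        <entry key="TRANSACTION_TIMESTAMP" value="ASCENDING"/>
        <entry key="ID" value="ASCENDING"/>
    </map>
</property>
```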

Processing a large file using spring batch

I have a large file which may contain 100K to 500K records. I am planning to use chunk-oriented processing, and my thought is:
1) Split the large file into smaller files based on count, say 10K records in each file.
2) If there are 100K records, I will get 10 files, each containing 10K records.
3) I would like to partition these 10 files and process them using 5 threads. I am thinking of using a custom MultiResourcePartitioner.
4) The 5 threads should process all the 10 files created in the split process.
5) I don't want to create the same number of threads as files, as in that case I may face memory issues. Whatever the number of files, I would like to process them using only 5 threads (which I can increase based on my requirements).
Experts, could you let me know if this can be achieved using Spring Batch? If yes, could you please share pointers or reference implementations?
Thanks in advance
The working job-config xml
<description>Spring Batch File Chunk Processing</description>
<import resource="../config/batch-context.xml" />
<batch:job id="file-partition-batch" job-repository="jobRepository" restartable="false">
<batch:step id="master">
<batch:partition partitioner="partitioner" handler="partitionHandler" />
</batch:step>
</batch:job>
<batch:step id="slave">
<batch:tasklet>
<batch:chunk reader="reader" processor="compositeProcessor"
writer="compositeWriter" commit-interval="5">
</batch:chunk>
</batch:tasklet>
</batch:step>
<bean id="partitionHandler" class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
<property name="taskExecutor" ref="taskExecutor"/>
<property name="step" ref="slave" />
<property name="gridSize" value="5" />
</bean>
<bean id="partitioner" class="com.poc.partitioner.FileMultiResourcePartitioner">
<property name="resources" value="file:/Users/anupghosh/Documents/Spring_Batch/FilePartitionBatch/*.txt" />
<property name="threadName" value="feed-processor" />
</bean>
<bean id="taskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="5" />
<property name="maxPoolSize" value="5" />
</bean>
<bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="#{stepExecutionContext['fileName']}" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|"/>
<property name="names" value="key,docName,docTypCD,itemType,itemNum,launchDate,status" />
</bean>
</property>
<property name="fieldSetMapper">
<bean class="com.poc.mapper.FileRowMapper" />
</property>
</bean>
</property>
</bean>
<bean id="validatingProcessor" class="org.springframework.batch.item.validator.ValidatingItemProcessor">
<constructor-arg ref="feedRowValidator" />
</bean>
<bean id="feedProcesor" class="com.poc.processor.FeedProcessor" />
<bean id="compositeProcessor" class="org.springframework.batch.item.support.CompositeItemProcessor" scope="step">
<property name="delegates">
<list>
<ref bean="validatingProcessor" />
<ref bean="feedProcesor" />
</list>
</property>
</bean>
<bean id="recordDecWriter" class="com.poc.writer.RecordDecWriter" />
<bean id="reconFlatFileCustomWriter" class="com.poc.writer.ReconFileWriter">
<property name="reconFlatFileWriter" ref="reconFlatFileWriter" />
</bean>
<bean id="reconFlatFileWriter" class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
<property name="resource" value="file:/Users/anupghosh/Documents/Spring_Batch/recon-#{stepExecutionContext[threadName]}.txt" />
<property name="shouldDeleteIfExists" value="true" />
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
<property name="delimiter" value="|" />
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="validationError" />
</bean>
</property>
</bean>
</property>
</bean>
<bean id="compositeWriter" class="org.springframework.batch.item.support.CompositeItemWriter">
<property name="delegates">
<list>
<ref bean="recordDecWriter" />
<ref bean="reconFlatFileCustomWriter" />
</list>
</property>
</bean>
<bean id="feedRowValidator" class="org.springframework.batch.item.validator.SpringValidator">
<property name="validator">
<bean class="com.poc.validator.FeedRowValidator"/>
</property>
</bean>
I was able to solve this using MultiResourcePartitioner. Below is the Java config:
@Bean
public Partitioner partitioner() throws IOException {
    MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
    ClassLoader cl = this.getClass().getClassLoader();
    ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver(cl);
    Resource[] resources = resolver.getResources("file:" + filePath + "/" + "*.csv");
    partitioner.setResources(resources);
    partitioner.partition(10);
    return partitioner;
}
@Bean
public TaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.setMaxPoolSize(4);
    taskExecutor.afterPropertiesSet();
    return taskExecutor;
}
@Bean
@Qualifier("masterStep")
public Step masterStep() throws Exception {
    return stepBuilderFactory.get("masterStep")
            .partitioner("processData", partitioner())
            .step(processData())
            .taskExecutor(taskExecutor())
            .listener(pcStressStepListener)
            .build();
}
@Bean
@Qualifier("processData")
public Step processData() {
    return stepBuilderFactory.get("processData")
            .<pojo, pojo> chunk(5000)
            .reader(reader)
            .processor(processor())
            .writer(writer)
            .build();
}
@Bean(name="reader")
@StepScope
public FlatFileItemReader<pojo> reader(@Value("#{stepExecutionContext['fileName']}") String filename) throws MalformedURLException {
    FlatFileItemReader<pojo> reader = new FlatFileItemReader<>();
    reader.setResource(new UrlResource(filename));
    reader.setLineMapper(new DefaultLineMapper<pojo>() {
        {
            setLineTokenizer(new DelimitedLineTokenizer() {
                {
                    setNames(FILE_HEADER); // placeholder for the actual column names
                }
            });
            setFieldSetMapper(new BeanWrapperFieldSetMapper<pojo>() {
                {
                    setTargetType(pojo.class);
                }
            });
        }
    });
    return reader;
}

Spring Batch Database Dependency pass parameters to SQL statement

I want to separate the SQL from the batch.xml file, so I defined the SQL statement in a properties file. Inside batch.xml I declared a property-placeholder bean pointing to the properties file.
For a simple select statement this is not a problem. But if I want to pass a parameter as a where-clause condition, is it possible to do that?
<context:property-placeholder
location="classpath:batch-sql.properties" />
<bean id="secondReader"
class="org.springframework.batch.item.database.JdbcCursorItemReader"
scope="step">
<property name="dataSource" ref="dataSource" />
<property name="sql" value="${sql1}" />
<property name="rowMapper">
<bean class="com.test.batchjob.process.TestPersonMapper" />
</property>
</bean>
This is my SQL statement in the properties file:
SELECT * FROM Person WHERE id = ?
Can the id be passed from a job parameter?
To set the parameters of the query in a JdbcPagingItemReader, you have to use the property parameterValues. This property takes a Map<String,Object> where the key is either the named parameter or the index of the parameter (if you use ?).
<bean id="secondReader"
class="org.springframework.batch.item.database.JdbcPagingItemReader"
scope="step">
<property name="queryProvider">
<bean class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="selectClause" value="select *" />
<property name="fromClause" value="from persons" />
<property name="whereClause" value="where id = ?" />
</bean>
</property>
<property name="parameterValues">
<map>
<entry key="1" value="#{jobParameters['id']}" />
</map>
</property>
<property name="rowMapper">
<bean class="com.test.batchjob.process.TestPersonMapper" />
</property>
</bean>
See documentation : JdbcPagingItemReader
UPDATE
You have to use a QueryProvider instead of the sql and dataSource properties.
You can replace the text of the query with values from the properties file.
To set the parameters of the query in a JdbcCursorItemReader, you have to use the preparedStatementSetter property. This property takes a PreparedStatementSetter, which you have to implement yourself to set either named or index-based parameters.
<bean id="secondReader"
class="org.springframework.batch.item.database.JdbcCursorItemReader"
scope="step">
<property name="sql" value="${sql1}" />
<property name="preparedStatementSetter">
<bean class="xx.xx.xx.YourPreparedStatementSetter">
<property name="id" value="#{jobParameters['id']}" />
</bean>
</property>
<property name="rowMapper">
<bean class="com.test.batchjob.process.TestPersonMapper" />
</property>
</bean>
An example implementation of a PreparedStatementSetter :
public class YourPreparedStatementSetter implements PreparedStatementSetter {

    private String id;

    @Override
    public void setValues(PreparedStatement ps) throws SQLException {
        // use the setter matching the parameter's type (e.g. setLong for a Long)
        ps.setString(1, this.id);
    }

    public void setId(String id) {
        this.id = id;
    }
}
For JdbcCursorItemReader with named parameters, you can take a look at the answer here: Using Spring Batch JdbcCursorItemReader with NamedParameters

Spring Batch Item Reader passing file object instead of resource name?

Generally, an ItemReader has a resource name as an attribute. Can we pass a file object to any of the implementations of ItemReader?
I am using version 3 of the Spring Batch API.
UPDATED:
<bean id="cvsFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<!-- Read a csv file -->
<property name="resource" value="classpath:cvs/I_10000_3ColRem_input_File.csv" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<!-- split it -->
<property name="lineTokenizer">
<bean
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="names" value="customerId,year,month,numPurchases,sow,purchaseAmt,cm,mc,multiChannel,loyalty,productReturn,relationDur,cb" />
</bean>
</property>
<property name="fieldSetMapper">
<!-- return back to reader, rather than a mapped object. -->
<!-- map to an object -->
<bean
class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="testCLV" />
<property name="customEditors">
<map>
<entry key="java.lang.Double">
<ref local="doubleEditor" />
</entry>
</map>
</property>
</bean>
</property>
</bean>
</property>
</bean>
My App.java looks like
ApplicationContext context =
new ClassPathXmlApplicationContext(springConfig);
JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
Job job = (Job) context.getBean("reportJob");
try {
long a, b;
a = System.currentTimeMillis();
JobExecution execution = jobLauncher.run(job, new JobParameters());
b = System.currentTimeMillis();
System.out.println("Exit Status : " + execution.getStatus());
System.out.println("jobLauncher.run "+(b-a)+"mil to execute. ("+((b-a)/1000)+" seconds)");
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("Done");
My requirement is as follows:
Through a web application, a user will upload a file, from which I extract an InputStream.
Say I have an InputStream object 'streamInput'; how could I inject this into the resource of the ItemReader and run my job?
You can create a Resource from an InputStream (InputStreamResource), which you could get from a File object. You can read more about InputStreamResource here: http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/core/io/InputStreamResource.html
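A rough sketch of that approach (the `streamInput` name follows the question; the item type `MyItem` is illustrative):

```java
// Wrap the uploaded stream in a Spring Resource and hand it to the reader.
// Note: InputStreamResource can be read only once, so it suits a single
// job run; for restartable jobs, copy the upload to a temp file and use
// a FileSystemResource instead.
Resource resource = new InputStreamResource(streamInput);

FlatFileItemReader<MyItem> reader = new FlatFileItemReader<>();
reader.setResource(resource);
// configure the lineMapper as in the XML config above, then launch the job
```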