Spring Batch MultiResourceItemWriter doesn't write data to files correctly - spring-batch

This is my Spring Batch Maven dependency:
<dependency>
<groupId>org.springframework.batch</groupId>
<artifactId>spring-batch-core</artifactId>
<version>2.2.0.RELEASE</version>
</dependency>
Below is my job.xml file
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-2.2.xsd
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">
<import resource="../config/launch-context.xml" />
<bean id="inputFileForMultiResource" class="org.springframework.core.io.FileSystemResource" scope="step">
<constructor-arg value="src/main/resources/files/customerInputValidation.txt"/>
</bean>
<bean id="outputFileForMultiResource" class="org.springframework.core.io.FileSystemResource" scope="step">
<constructor-arg value="src/main/resources/files/xml/customerOutput.xml"/>
</bean>
<bean id="readerForMultiResource" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" ref="inputFileForMultiResource" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="names" value="firstName,middleInitial,lastName,address,city,state,zip" />
<property name="delimiter" value="," />
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="customer5TO" />
</bean>
</property>
</bean>
</property>
</bean>
<bean id="customer5TO" class="net.gsd.group.spring.batch.tutorial.to.Customer5TO"></bean>
<bean id="xmlOutputWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" ref="outputFileForMultiResource" />
<property name="marshaller" ref="customerMarshallerJMSJob" />
<property name="rootTagName" value="customers" />
</bean>
<bean id="customerMarshallerJMSJob" class="org.springframework.oxm.xstream.XStreamMarshaller">
<property name="aliases">
<map>
<entry key="customer" value="net.gsd.group.spring.batch.tutorial.to.Customer5TO"></entry>
</map>
</property>
</bean>
<bean id="multiResourceItemWriter" class="org.springframework.batch.item.file.MultiResourceItemWriter">
<property name="resource" ref="customerOutputXmlFile"/>
<property name="delegate" ref="xmlOutputWriter"/>
<property name="itemCountLimitPerResource" value="3"/>
<property name="resourceSuffixCreator" ref="suffix"/>
</bean>
<bean id="suffix" class="net.gsd.group.spring.batch.tutorial.util.CustomerOutputFileSuffixCreator"></bean>
<batch:step id="multiResourceWriterParentStep">
<batch:tasklet>
<batch:chunk
reader="readerForMultiResource"
writer="multiResourceItemWriter"
commit-interval="3">
</batch:chunk>
</batch:tasklet>
</batch:step>
<batch:job id="multiResourceWriterJob">
<batch:step id="multiResourceStep" parent="multiResourceWriterParentStep"/>
</batch:job>
</beans>
Basically I have one input file and several output files. I read the data from the input file (which contains 10 rows) and I want to write it in chunks of 3 into more than one output file (in my case there will be 4 files: 3/3/3/1). For each chunk there will be one file.
The job runs correctly, but the content of the output files is wrong.
Each of the files contains only the last element that was read in the current chunk.
Let's say the first chunk reads A, B and C. When this chunk is written to the file, only C is written, and it is written 3 times.
From my tests it works correctly only when the chunk commit-interval is 1.
Is this the correct behaviour?
What am I doing wrong?
Thank you in advance

Your customer5TO bean should have scope prototype. Currently it is a singleton, so the bean properties will always be overwritten with the last item.
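A minimal sketch of the fix, reusing the bean definition from the question with only the scope attribute added:
<!-- prototype scope: BeanWrapperFieldSetMapper gets a fresh Customer5TO for every line it maps -->
<bean id="customer5TO" class="net.gsd.group.spring.batch.tutorial.to.Customer5TO" scope="prototype"/>
With the singleton, BeanWrapperFieldSetMapper kept repopulating the one shared instance, so all three item references in a chunk pointed to the same object holding the last row's values.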

Related

Spring Batch - create a new file each time instead of overwriting it when transferring data from CSV to XML

I am new to Spring Batch. I was trying to move data from a CSV file to an XML file and was able to do it successfully. But each time I run the code, my XML output file gets overwritten, which I don't want; instead I want a new output file created for each run (the old output files should remain, as they are required for data-tracking purposes). How can I do that?
Here is my code. What do I need to change in the file below? Let me know if you need more code from my side.
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd
http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
<!-- JobRepository and JobLauncher are configuration/setup classes -->
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
<!-- ============= ItemReader reads a complete line one by one from input file ============ -->
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<!-- Get the Resource file -->
<property name="resource" value="classpath:ExamResult.txt" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper">
<!-- Mapper which maps each individual items in a record to properties in POJO -->
<bean class="com.websystique.springbatch.mapper.ExamResultFieldSetMapper" />
</property>
<property name="lineTokenizer">
<!-- A tokenizer class to be used when items in input record are separated by specific characters -->
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|" />
</bean>
</property>
</bean>
</property>
</bean>
<!-- ======== XML ItemWriter which writes the data in XML format =========== -->
<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="file:xml/ExamResult.xml" />
<property name="rootTagName" value="UniversityExamResultList" />
<property name="marshaller">
<bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.websystique.springbatch.model.ExamResult</value>
</list>
</property>
</bean>
</property>
</bean>
<!-- Optional ItemProcessor to perform business logic/filtering on the input records -->
<bean id="itemProcessor" class="com.websystique.springbatch.processor.ExamResultItemProcessor" />
<!-- Optional JobExecutionListener to perform business logic before and after the job -->
<bean id="jobListener" class="com.websystique.springbatch.listener.ExamResultJobListener" />
<!-- Step will need a transaction manager -->
<bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<!-- ==================== Actual Job =================== -->
<batch:job id="examResultJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="flatFileItemReader" writer="xmlItemWriter" processor="itemProcessor" commit-interval="10" />
</batch:tasklet>
</batch:step>
<batch:listeners>
<batch:listener ref="jobListener" />
</batch:listeners>
</batch:job>
</beans>
Try using the Spring Expression Language (SpEL) to append a date and time to the output file name. Note the single quotes inside the attribute value (nested double quotes would break the XML) and HH for a 24-hour clock, so morning and evening runs don't collide. Because the expression is evaluated when the bean is created, each run of the application gets a new file name. Something like:
<property name="resource"
value="file:xml/ExamResult-#{new java.text.SimpleDateFormat('MMddyyyyHHmmss').format(new java.util.GregorianCalendar().getTime())}.xml" />

Retry logic using AOP is not working for MongoItemWriter, but the same is working for my custom readers and writers

Retry logic using AOP is not working for MongoItemWriter, but the same is working for my custom readers and writers.
Is there anything that I'm doing wrong here?
Below is the code snippet.
<bean id="retryAdvice"
class="org.springframework.retry.interceptor.RetryOperationsInterceptor">
<property name="retryOperations" ref="taskBatchRetryTemplate" />
</bean>
<bean id="taskBatchRetryTemplate" class="org.springframework.retry.support.RetryTemplate">
<property name="retryPolicy" ref="genericRetryPolicy" />
<property name="backOffPolicy">
<bean class="org.springframework.retry.backoff.ExponentialBackOffPolicy">
<property name="initialInterval" value="${mongoloader.backOffPeriod.initialInterval}"/>
<property name="maxInterval" value="${mongoloader.backOffPeriod.maxInterval}"/>
<property name="multiplier" value="${mongoloader.backOffPeriod.multiplier}"/>
</bean>
</property>
<property name="listeners">
<bean class="com.company.ens.myload.job.StepRetryListener"/>
</property>
</bean>
<bean id="genericRetryPolicy" class="org.springframework.retry.policy.SimpleRetryPolicy" >
<constructor-arg index="0" value="${mongoloader.retry.limit}"/>
<constructor-arg index="1">
<map>
<entry key="org.springframework.data.mongodb.CannotGetMongoDbConnectionException" value="true"/>
<entry key="org.springframework.jdbc.CannotGetJdbcConnectionException" value="true"/>
<!-- Just included the exception below for testing purposes; needs to be removed -->
<entry key="java.io.FileNotFoundException" value="true"/>
<entry key="org.springframework.dao.DuplicateKeyException" value="true"/>
</map>
</constructor-arg>
</bean>
<!-- Including only my step configuration here -->
<batch:step id="Step4a-MainFlow_TranslateRawFedObjectsToFilingModel"
allow-start-if-complete="false">
<batch:tasklet>
<batch:chunk reader="rawFedObjectMongoReader"
processor="translatingProcessor" writer="SctModelMongoWriter"
commit-interval="100"/>
</batch:tasklet>
<batch:listeners>
<batch:listener ref="step4aListener"/>
</batch:listeners>
</batch:step>
Did not find any posts related to my question. Any help would be appreciated.

Use named parameters in queries in Spring Batch

I have a Spring Batch job in which a step is as follows:
<bean id="abstractReader" class="org.springframework.batch.item.database.JdbcCursorItemReader" abstract="true">
<property name="fetchSize" value="1000"/>
<property name="verifyCursorPosition" value="true"/>
<property name="rowMapper">
<bean class="org.springframework.jdbc.core.ColumnMapRowMapper"/>
</property>
</bean>
<bean id="masterReader" parent="abstractReader" abstract="true">
<property name="fetchSize" value="1000"/>
<property name="dataSource" ref="masterDataSource"/>
</bean>
<bean id="abstractWriter" class="org.springframework.batch.item.database.JdbcBatchItemWriter" abstract="true">
<property name="assertUpdates" value="false"/>
<property name="itemPreparedStatementSetter">
<bean class="org.springframework.batch.item.database.support.ColumnMapItemPreparedStatementSetter"/>
</property>
</bean>
<bean id="masterWriter" parent="abstractWriter" abstract="true">
<property name="dataSource" ref="masterDataSource"/>
</bean>
<bean id="tempWriter" parent="masterWriter" scope="step">
<property name="sql" value="${insert_query}"/>
</bean>
<bean id="tempReader" parent="masterReader" scope="step">
<property name="sql" value="${select_query}"/>
</bean>
<batch:step id="tempStep">
<batch:tasklet>
<batch:chunk commit-interval="100"
reader="tempReader"
writer="tempWriter"/>
</batch:tasklet>
</batch:step>
Is there a way to bring named-parameter support into the queries? Currently JdbcCursorItemReader is using a PreparedStatement. (There are too many ?s in the queries now.)
There isn't a way with the JdbcCursorItemReader; however, you can do it with the JdbcPagingItemReader. You can read more about that reader in the documentation here: https://docs.spring.io/spring-batch/apidocs/org/springframework/batch/item/database/JdbcPagingItemReader.html
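For illustration, here is a minimal sketch of a JdbcPagingItemReader using a named parameter. The CUSTOMER table, its columns, and the :status value are assumptions for the example; masterDataSource and the ColumnMapRowMapper are taken from the question:
<bean id="pagingReader" class="org.springframework.batch.item.database.JdbcPagingItemReader" scope="step">
<property name="dataSource" ref="masterDataSource"/>
<property name="pageSize" value="1000"/>
<property name="queryProvider">
<bean class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean">
<property name="dataSource" ref="masterDataSource"/>
<!-- named parameters in the where clause are resolved from the parameterValues map -->
<property name="selectClause" value="select ID, NAME, STATUS"/>
<property name="fromClause" value="from CUSTOMER"/>
<property name="whereClause" value="where STATUS = :status"/>
<property name="sortKey" value="ID"/>
</bean>
</property>
<property name="parameterValues">
<map>
<entry key="status" value="NEW"/>
</map>
</property>
<property name="rowMapper">
<bean class="org.springframework.jdbc.core.ColumnMapRowMapper"/>
</property>
</bean>
Note that a unique sortKey is required: the paging reader re-issues the query for every page instead of holding a cursor open, which is also what makes it restartable.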

File to file basics

Being new to Spring Batch, I wanted to begin with something simple: reading a CSV file and writing the same objects (records) to another one. Simple, isn't it? But I was unable to find a working sample. After some time researching, I found some things that almost work... The file I want to write to is always empty. Is it because I use a ResourcelessTransactionManager? Do I need to declare some optional property somewhere to flush the thing to my hard drive?
By the way, I find the documentation on the subject very light and confusing for a beginner. Maybe it's because one must first learn Spring Batch...
Here's the evil but very simple code that's driving me crazy.
TIA.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:aop="http://www.springframework.org/schema/aop"
xmlns:tx="http://www.springframework.org/schema/tx"
xmlns:p="http://www.springframework.org/schema/p"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-2.0.xsd
http://www.springframework.org/schema/aop
http://www.springframework.org/schema/aop/spring-aop-2.5.xsd
http://www.springframework.org/schema/tx
http://www.springframework.org/schema/tx/spring-tx-2.1.xsd">
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.SimpleJobRepository">
<constructor-arg>
<bean
class="org.springframework.batch.core.repository.dao.MapJobInstanceDao"/>
</constructor-arg>
<constructor-arg>
<bean
class="org.springframework.batch.core.repository.dao.MapJobExecutionDao"/>
</constructor-arg>
<constructor-arg>
<bean
class="org.springframework.batch.core.repository.dao.MapStepExecutionDao"/>
</constructor-arg>
<constructor-arg>
<bean
class="org.springframework.batch.core.repository.dao.MapExecutionContextDao"/>
</constructor-arg>
</bean>
<bean id="jobRepository-transactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.SimpleJobLauncher"
p:jobRepository-ref="jobRepository"/>
<!-- The project beans -->
<bean id="csvReader"
class="org.springframework.batch.item.file.FlatFileItemReader"
p:resource="file:c:\AppPerso\csvManager\client.csv">
<property name="lineMapper">
<bean
class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer"
p:delimiter=";"
p:names="client_id;client_status;client_name;client_surname;client_dnais;client_addr1;client_addr2;client_addr3;client_addr4"/>
</property>
<property name="fieldSetMapper">
<bean
class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper"
p:targetType="com.bigmac.spring.batch.csv.Client"/>
</property>
</bean>
</property>
</bean>
<bean id="csvWriter"
class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="file:c:\AppPerso\csvManager\client-reformat.csv"/>
<property name="shouldDeleteIfExists" value="true"/>
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
<property name="delimiter" value=";"/>
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="client_id;client_status;client_name;client_surname;client_dnais;client_addr1;client_addr2;client_addr3;client_addr4"/>
</bean>
</property>
</bean>
</property>
</bean>
<batch:job id="CSVManager" job-repository="jobRepository">
<batch:step id="step1">
<batch:tasklet
transaction-manager="jobRepository-transactionManager">
<batch:chunk
reader="csvReader"
writer="csvWriter"
commit-interval="10"/>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
Sorry guys, my mistake. I was getting confused with delimiters: the names lists of the DelimitedLineTokenizer and of the BeanWrapperFieldExtractor must be comma-separated; only the field delimiter itself is a semicolon, but I had used semicolons to separate the names as well. The lack of a properly configured log4j wasn't helping either.
I put the working configuration here.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:aop="http://www.springframework.org/schema/aop"
xmlns:tx="http://www.springframework.org/schema/tx"
xmlns:p="http://www.springframework.org/schema/p"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-2.0.xsd
http://www.springframework.org/schema/aop
http://www.springframework.org/schema/aop/spring-aop-2.5.xsd
http://www.springframework.org/schema/tx
http://www.springframework.org/schema/tx/spring-tx-2.1.xsd">
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.SimpleJobRepository">
<constructor-arg>
<bean
class="org.springframework.batch.core.repository.dao.MapJobInstanceDao"/>
</constructor-arg>
<constructor-arg>
<bean
class="org.springframework.batch.core.repository.dao.MapJobExecutionDao"/>
</constructor-arg>
<constructor-arg>
<bean
class="org.springframework.batch.core.repository.dao.MapStepExecutionDao"/>
</constructor-arg>
<constructor-arg>
<bean
class="org.springframework.batch.core.repository.dao.MapExecutionContextDao"/>
</constructor-arg>
</bean>
<bean id="jobRepository-transactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.SimpleJobLauncher"
p:jobRepository-ref="jobRepository"/>
<!-- The project beans -->
<bean id="csvReader"
class="org.springframework.batch.item.file.FlatFileItemReader"
p:resource="file:c:\AppPerso\springBatch\csvManager\client.csv">
<property name="lineMapper">
<bean
class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer"
p:delimiter=";"
p:names="client_id,client_status,client_name,client_surname,client_dnais,client_addr1,client_addr2,client_addr3,client_addr4"/>
</property>
<property name="fieldSetMapper">
<bean
class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper"
p:targetType="com.bigmac.spring.batch.csv.Client"/>
</property>
</bean>
</property>
</bean>
<bean id="csvWriter"
class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="file:c:\AppPerso\springBatch\csvManager\client-reformat.csv"/>
<property name="shouldDeleteIfExists" value="true"/>
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
<property name="delimiter" value=";"/>
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="client_id,client_status,client_name,client_surname,client_dnais,client_addr1,client_addr2,client_addr3,client_addr4"/>
</bean>
</property>
</bean>
</property>
</bean>
<batch:job id="CSVManager" job-repository="jobRepository">
<batch:step id="step1">
<batch:tasklet
transaction-manager="jobRepository-transactionManager">
<batch:chunk
reader="csvReader"
writer="csvWriter"
commit-interval="10"/>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>

Making Spring Batch ItemReader restartable

The Spring Batch program I am working on reads data from a table using the org.springframework.batch.item.database.JdbcCursorItemReader ItemReader. Earlier the plan was to alter the table, add a PROCESSED_INDICATOR flag, and prepopulate it with the status 'PENDING'. Once a record was processed, the writer would update the PROCESSED_INDICATOR flag to 'PROCESSED'. This was meant to support restartability: for example, if the batch picks up 1 million records and dies at half a million, then when I restart the batch it should start where it left off.
But unfortunately, management didn't approve this solution. I am digging into ways to make the ItemReader restartable. As per the Spring documentation, "Most ItemReaders have much more sophisticated restart logic. The JdbcCursorItemReader, for example, stores the row id of the last processed row in the Cursor."
Does anyone have a sample of such a custom reader which extends JdbcCursorItemReader and stores the last processed row in the cursor?
https://docs.spring.io/spring-batch/trunk/reference/html/readersAndWriters.html
==FULL XML CONFIGURATION==
<import resource="classpath:/batch/utility/skip/batch_skip.xml" />
<import resource="classpath:/batch/config/context-postgres.xml" />
<import resource="classpath:/batch/config/oracle-database.xml" />
<context:property-placeholder
location="classpath:/batch/jobs/TPF-1001-DD-01/TPF-1001-DD-01.properties" />
<bean id="gridSizePartitioner"
class="com.tpf.partitioner.GridSizePartitioner" />
<task:executor id="taskExecutor" pool-size="${pool.size}" />
<batch:job id="XYZJob" job-repository="jobRepository"
restartable="true">
<batch:step id="XYZSTEP">
<batch:description>Convert TIF files to PDF</batch:description>
<batch:partition partitioner="gridSizePartitioner">
<batch:handler task-executor="taskExecutor"
grid-size="${pool.size}" />
<batch:step>
<batch:tasklet allow-start-if-complete="true">
<batch:chunk commit-interval="${commit.interval}"
skip-limit="${job.skip.limit}">
<batch:reader>
<bean id="timeReader"
class="org.springframework.batch.item.database.JdbcCursorItemReader"
scope="step">
<property name="dataSource" ref="oracledataSource" />
<property name="sql">
<value>
select TIME_ID as timesheetId,count(*),max(CREATION_DATETIME) as creationDateTime , ILN_NUMBER as ilnNumber
from TS_FAKE_NAME
where creation_datetime >= '#{jobParameters['creation_start_date1']} 12.00.00.000000000 AM'
and creation_datetime &lt; '#{jobParameters['creation_start_date2']} 11.59.59.999999999 PM'
and mod(time_id,${pool.size})=#{stepExecutionContext['partition.id']}
group by time_id ,ILN_NUMBER
</value>
</property>
<property name="rowMapper">
<bean
class="org.springframework.jdbc.core.BeanPropertyRowMapper">
<property name="mappedClass"
value="com.tpf.model.Time" />
</bean>
</property>
</bean>
</batch:reader>
<batch:processor>
<bean id="compositeItemProcessor"
class="org.springframework.batch.item.support.CompositeItemProcessor">
<property name="delegates">
<list>
<ref bean="timeProcessor" />
</list>
</property>
</bean>
</batch:processor>
<batch:writer>
<bean id="compositeItemWriter"
class="org.springframework.batch.item.support.CompositeItemWriter">
<property name="delegates">
<list>
<ref bean="timeWriter" />
</list>
</property>
</bean>
</batch:writer>
<batch:skippable-exception-classes>
<batch:include
class="com.utility.skip.BatchSkipException" />
</batch:skippable-exception-classes>
<batch:listeners>
<batch:listener ref="batchSkipListener" />
</batch:listeners>
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:partition>
</batch:step>
<batch:validator>
<bean
class="org.springframework.batch.core.job.DefaultJobParametersValidator">
<property name="requiredKeys">
<list>
<value>batchRunNumber</value>
<value>creation_start_date1</value>
<value>creation_start_date2</value>
</list>
</property>
</bean>
</batch:validator>
</batch:job>
<bean id="timesheetWriter" class="com.tpf.writer.TimeWriter"
scope="step">
<property name="dataSource" ref="dataSource" />
</bean>
<bean id="timeProcessor"
class="com.tpf.processor.TimeProcessor" scope="step">
<property name="dataSource" ref="oracledataSource" />
</bean>
Does anyone have a sample of such a custom reader which extends JdbcCursorItemReader and stores the last processed row in the cursor?
The JdbcCursorItemReader already does that; see the Javadoc, here is an excerpt:
ExecutionContext: The current row is returned as restart data,
and when restored from that same data, the cursor is opened and the current row
set to the value within the restart data.
So you don't need a custom reader.
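For reference, a minimal sketch of the properties that matter for restart, based on the timeReader from the question (the SQL is simplified here; saveState is true by default and is spelled out only for emphasis):
<!-- saveState persists the current row count to the ExecutionContext at each commit;
on restart the cursor is reopened and positioned after the last committed row -->
<bean id="timeReader" class="org.springframework.batch.item.database.JdbcCursorItemReader" scope="step">
<property name="name" value="timeReader"/>
<property name="saveState" value="true"/>
<property name="dataSource" ref="oracledataSource"/>
<property name="sql" value="select TIME_ID, CREATION_DATETIME, ILN_NUMBER from TS_FAKE_NAME order by TIME_ID"/>
<property name="rowMapper">
<bean class="org.springframework.jdbc.core.BeanPropertyRowMapper">
<property name="mappedClass" value="com.tpf.model.Time"/>
</bean>
</property>
</bean>
For this to work the job must be restartable (yours is), and the query should return rows in a deterministic order; otherwise the saved position may point at different data after a restart.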