Spring Batch Writing as complex XML output - spring-batch

I am new to Spring Batch and I have to design a task that reads from a database and writes the data to multiple XML files. The output format is as follows:
<Records xmlns="somevalue" ...>
<Version>1.0</Version>
<SequenceNo>1</SequenceNo>
<Date>12/12/2012 12:12:12 PM</Date>
<RecordCount>100</RecordCount><!--This is total number of Update and Insert txns-->
<SenderEmail>asds@asds.com</SenderEmail>
<Transaction type="Update">
<TxnNo>1</TxnNo>
<Details>
<MoreDetails>
</MoreDetails>
</Details>
</Transaction>
<Transaction type="Insert">
<TxnNo>2</TxnNo>
<Details>
<MoreDetails>
</MoreDetails>
</Details>
</Transaction>
<Transaction type="Update">
</Transaction>
<Transaction type="Update">
</Transaction>
</Records>
Please suggest which marshaller I should use and how to get started. Later I will have to make the job multithreaded for better performance.

There is no need to write your own writer. Spring Batch includes a MultiResourceItemWriter that writes your items to multiple XML files.
I'm using a Jaxb2Marshaller to write my complex XML.
<bean id="multiItemWriter" class="org.springframework.batch.item.file.MultiResourceItemWriter">
<property name="resource" value="file:data/output/output.xml"/>
<!-- <property name="resourceSuffixCreator" ref="resourceSuffixCreator"/> -->
<property name="saveState" value="true"/>
<property name="itemCountLimitPerResource" value="10"/>
<property name="delegate" ref="itemWriter" />
</bean>
<bean id="itemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<!-- <property name="resource" value="file:data/output/output.xml" /> -->
<property name="marshaller" ref="customVrdbMarshaller" />
<property name="rootTagName" value="recordings" />
<property name="overwriteOutput" value="true" />
</bean>
<bean id="customVrdbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>your.model.model.Albums</value>
</list>
</property>
</bean>

You should code a Writer that writes the XML files: choose an XML library and use it inside the Writer.
Be careful to write thread-safe code, given your planned multithreading optimization.
There is an example in the Spring Batch samples: XML Processing
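As a rough illustration of "choose a lib and use it in a Writer", here is a minimal sketch that produces the header elements and the Transaction elements from the question with the JDK's StAX API. The class `RecordsXmlSketch` and the `Txn` type are hypothetical names introduced for this example only, and the code is not wired into Spring Batch.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class RecordsXmlSketch {

    // Hypothetical value type; type is "Update" or "Insert".
    static final class Txn {
        final String type;
        final int txnNo;

        Txn(String type, int txnNo) {
            this.type = type;
            this.txnNo = txnNo;
        }
    }

    // Writes the header elements first, then one Transaction element per item.
    static String write(List<Txn> txns) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("Records");
            simple(xml, "Version", "1.0");
            // The count is known up front here because the whole list is in memory.
            simple(xml, "RecordCount", String.valueOf(txns.size()));
            for (Txn txn : txns) {
                xml.writeStartElement("Transaction");
                xml.writeAttribute("type", txn.type);
                simple(xml, "TxnNo", String.valueOf(txn.txnNo));
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    // Writes <name>text</name>.
    private static void simple(XMLStreamWriter xml, String name, String text)
            throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(text);
        xml.writeEndElement();
    }
}
```

Each call creates its own XMLStreamWriter, so nothing mutable is shared between threads. A writer that keeps a single stream open across chunks would need its own synchronization.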

Related

Spring Batch - create a new file each time instead of overriding it for transferring data from CSV to XML

I am new to Spring Batch. I am transferring data from a CSV file to an XML file, and that works. However, each time I run the code my XML output file gets overwritten, which I don't want. Instead, I want to create a new output file for each run and keep the old output files, which are needed for data tracking. How can I do that?
Here is my code. What do I need to change in the file below? Let me know if you need more of my code.
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd
http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
<!-- JobRepository and JobLauncher are configuration/setup classes -->
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
<!-- ============= ItemReader reads a complete line one by one from input file ============ -->
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<!-- Get the Resource file -->
<property name="resource" value="classpath:ExamResult.txt" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper">
<!-- Mapper which maps each individual items in a record to properties in POJO -->
<bean class="com.websystique.springbatch.mapper.ExamResultFieldSetMapper" />
</property>
<property name="lineTokenizer">
<!-- A tokenizer class to be used when items in input record are separated by specific characters -->
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|" />
</bean>
</property>
</bean>
</property>
</bean>
<!-- ======== XML ItemWriter which writes the data in XML format =========== -->
<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="file:xml/ExamResult.xml" />
<property name="rootTagName" value="UniversityExamResultList" />
<property name="marshaller">
<bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.websystique.springbatch.model.ExamResult</value>
</list>
</property>
</bean>
</property>
</bean>
<!-- Optional ItemProcessor to perform business logic/filtering on the input records -->
<bean id="itemProcessor" class="com.websystique.springbatch.processor.ExamResultItemProcessor" />
<!-- Optional JobExecutionListener to perform business logic before and after the job -->
<bean id="jobListener" class="com.websystique.springbatch.listener.ExamResultJobListener" />
<!-- Step will need a transaction manager -->
<bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<!-- ==================== Actual Job =================== -->
<batch:job id="examResultJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="flatFileItemReader" writer="xmlItemWriter" processor="itemProcessor" commit-interval="10" />
</batch:tasklet>
</batch:step>
<batch:listeners>
<batch:listener ref="jobListener" />
</batch:listeners>
</batch:job>
</beans>
Try using the Spring Expression Language (SpEL) to append a date and time to the output file name. Something like:
<property name="resource"
value="file:xml/ExamResult-#{new java.text.SimpleDateFormat('MMddyyyyHHmmss').format(new java.util.GregorianCalendar().getTime())}.xml" />
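The file name that such an expression yields can be checked in plain Java first. This is a minimal sketch, assuming a hypothetical helper class `TimestampedFileName` and the pattern `MMddyyyyHHmmss` (zero-padded month, 24-hour clock), so that two runs twelve hours apart on the same day cannot produce the same name.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

final class TimestampedFileName {

    // Hypothetical helper: inserts a timestamp between prefix and suffix.
    static String build(String prefix, String suffix, Date now) {
        // SimpleDateFormat is not thread safe, so a new instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat("MMddyyyyHHmmss", Locale.ROOT);
        return prefix + format.format(now) + suffix;
    }

    public static void main(String[] args) {
        // Prints a name of the form xml/ExamResult-<14 digits>.xml
        System.out.println(build("xml/ExamResult-", ".xml", new Date()));
    }
}
```

Two runs started within the same second would still collide. If that can happen, a job parameter such as a run id is a safer source for the suffix.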

Using optional fields with StaxEventItemReader

I have a Spring Batch application and I'm using the StaxEventItemReader as my ItemReader. By default, XStream requires a declared property for each possible XML tag, or else it throws an UnknownFieldException. There are ways to code around this in plain Java, but with Spring Batch the ItemReader doesn't seem to offer a way to change this behavior. Is there a way to flag fields as optional in the XML?
My bean is configured basically like this
<job id="synchronizecustomerData" xmlns="http://www.springframework.org/schema/batch">
<step id="readWritecustomers">
<tasklet>
<chunk reader="customerReader"
processor="customerProcessor"
writer="customerSyncWriter"
commit-interval="1"
skip-policy="alwaysSkip" >
</chunk>
</tasklet>
</step>
</job>
<bean id="customerReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
<property name="fragmentRootElementName" value="customer" />
<property name="resource" ref="inputResource" />
<property name="unmarshaller" ref="customerMarshaller" />
</bean>
<bean id="inputResource" class="org.springframework.core.io.FileSystemResource">
<constructor-arg value="c:/sf/data.xml" />
</bean>
<bean id="customerMarshaller" class="org.springframework.oxm.xstream.XStreamMarshaller">
<property name="aliases">
<util:map id="aliases">
<entry key="customer" value="com.company.batchmaster.sf.beans.customer" />
<entry key="name" value="java.lang.String" />
</util:map>
</property>
</bean>
<bean id="customerProcessor" class="org.springframework.batch.item.support.CompositeItemProcessor">
<property name="delegates">
<list>
<ref bean="customerTransformer" />
</list>
</property>
</bean>
<bean id="customerTransformer" class="com.company.batchmaster.sf.chunk.customerTransformer" />
<bean id="customerSyncWriter" class="com.company.batchmaster.sf.chunk.customerSyncWriter" />
My import file looks like this, just getting it up and running
<?xml version="1.0" encoding="UTF-8"?>
<records>
<customer xmlns="http://springframework.org/batch/sample/io/oxm/domain">
<name>ABC Dealer</name>
<types>CR</types>
</customer>
</records>
Thanks for any help.
I am assuming the Customer class has the properties name and type.
Annotate the optional property with @XmlElement and set its defaultValue(), as described in the JAXB guide.
The alias <entry key="name" value="java.lang.String" /> is not needed, because you are unmarshalling a complete Customer object from the node specified by <property name="fragmentRootElementName" value="customer" />.

In-memory Job-Explorer definition in Spring batch

I am trying to share my in-memory jobRepository with the jobExplorer, but it throws this error:
Nested exception is
org.springframework.beans.ConversionNotSupportedException:
Failed to convert property value of type '$Proxy1 implementing
org.springframework.batch.core.repository.JobRepository,org.
springframework.aop.SpringProxy,org.springframework.aop.framework.Advised'
to required type
I even tried putting an '&' sign before jobRepository when passing it to the jobExplorer, but the attempt was in vain.
I am using Spring Batch 2.2.1.
Does the jobExplorer only work with a database-backed repository, and not with an in-memory one?
The definition is:
<bean id="jobRepository"
class="com.test.repository.BatchRepositoryFactoryBean">
<property name="cache" ref="cache" />
<property name="transactionManager" ref="transactionManager" />
</bean>
<bean id="jobOperator" class="test.batch.LauncherTest.TestBatchOperator">
<property name="jobExplorer" ref="jobExplorer" />
<property name="jobRepository" ref="jobRepository" />
<property name="jobRegistry" ref="jobRegistry" />
<property name="jobLauncher" ref="jobLauncher" />
</bean>
<bean id="jobExplorer" class="test.batch.LauncherTest.TestBatchExplorerFactoryBean">
<property name="repositoryFactory" ref="&amp;jobRepository" />
</bean>
<bean id="transactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<bean id="jobLauncher" class="com.scb.smartbatch.core.BatchLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
<!-- To store Batch details -->
<bean id="jobRegistry" class="com.scb.smartbatch.repository.SmartBatchRegistry" />
<bean id="jobRegistryBeanPostProcessor"
class="org.springframework.batch.core.configuration.support.JobRegistryBeanPostProcessor">
<property name="jobRegistry" ref="jobRegistry" />
</bean>
<!--Runtime cache of batch executions -->
<bean id="cache" class="com.scb.cache.TCRuntimeCache" />
Thanks for your valuable input.
I used '&' before the job repository reference (written as &amp; in XML), which let me pass the factory bean itself to my job explorer as a shared resource.
Problem solved.
Kudos.
Usually you have to wire against the interface instead of the implementation.
Otherwise, you probably have to add <aop:config proxy-target-class="true"> to create a CGLIB-based proxy instead of the standard JDK dynamic proxy.
Read the official Spring documentation on this.
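The reason behind "wire the interface" can be reproduced with the JDK alone. This is a minimal sketch, with `Repository` and `InMemoryRepository` as hypothetical stand-ins for a repository interface and its implementation: a JDK dynamic proxy implements the interface but is not an instance of the concrete class, so it cannot be injected where the concrete type is required.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

final class ProxyTypeDemo {

    // Hypothetical stand-ins for a repository interface and its implementation.
    interface Repository {
        String name();
    }

    static final class InMemoryRepository implements Repository {
        @Override
        public String name() {
            return "in-memory";
        }
    }

    // Wraps the target in a JDK dynamic proxy, as interface-based AOP does.
    static Repository proxyOf(Repository target) {
        InvocationHandler handler = (proxy, method, args) -> method.invoke(target, args);
        return (Repository) Proxy.newProxyInstance(
                Repository.class.getClassLoader(),
                new Class<?>[] { Repository.class },
                handler);
    }

    public static void main(String[] args) {
        Repository proxied = proxyOf(new InMemoryRepository());
        System.out.println(proxied instanceof Repository);         // true
        System.out.println(proxied instanceof InMemoryRepository); // false
        System.out.println(proxied.name());                        // in-memory
    }
}
```

A CGLIB proxy subclasses the target class instead, which is why proxy-target-class="true" makes the proxy assignable to the concrete type.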

Usage of CustomEditor with BeanWrapperFieldExtractor just like with BeanWrapperFieldSetMapper

I have written a simple Spring Batch application that reads a CSV file, does some transforming and writes a modified CSV to the disk.
The reading of the file into domain objects works like a charm. I use DelimitedLineTokenizer to tokenize the lines and a BeanWrapperFieldSetMapper to feed the values into a bean:
<bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="#{jobParameters['inputResource']}" />
<property name="linesToSkip" value="1" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=";" />
<property name="names"
value="ID,NAME,DESCRIPTION,PRICE,DATE" />
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="targetType" value="myapp.MyDomainObject" />
<property name="customEditors">
<map>
<entry key="java.util.Date" value-ref="dateEditor" />
<entry key="java.math.BigDecimal" value-ref="numberEditor" />
</map>
</property>
</bean>
</property>
</bean>
</property>
</bean>
I especially like the features of BeanWrapperFieldSetMapper to "guess" the field names and the possibility to define CustomEditors which I use to define the special date and number formats used in the input file.
Now I would like to write the modified file in the same format as the input file.
I use the following configuration:
<bean id="writer" class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
<property name="resource" value="#{jobParameters['outputResource']}" />
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
<property name="delimiter" value=";" />
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="id,name,description,price,date" />
</bean>
</property>
</bean>
</property>
</bean>
There are two things I miss with this configuration:
1. BeanWrapperFieldSetMapper allows me to set CustomEditors, but BeanWrapperFieldExtractor has no such option. Is there a way to use them?
2. Is there a way to define the headings in the first line of the file? I have not found any way to write an initial line that is not a bean. It would be great to use the same names here as in BeanWrapperFieldSetMapper, so that BeanWrapperFieldExtractor writes the initial line and guesses the bean property names the way BeanWrapperFieldSetMapper does.
Loading files is so comfortable in Spring Batch. Why is writing files so different? Am I missing something?
I have to use Spring Batch 2.1.x because we are using Spring 3.0.x. Therefore an upgrade to 2.2.x is not an option.
What exactly do you need? To extract the field properties as text? You can:
use a FormatterLineAggregator if your needs are not too complicated
write your own CustomEditorsFieldExtractor (better)
generate a composite domain object made of the original domain object and a text-formatted copy, and pass the latter to the writer (but this breaks your current processor/writer)
For the header, use FlatFileItemWriter.headerCallback: if set, it lets you write a custom header.
Writing seems more painful than reading in your case only because Spring Batch's reading components happen to fit your needs. The standard components target the most common use cases and cover a lot of scenarios. Sometimes we just have to write a custom FieldExtractor! :)
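A custom extractor of the kind suggested above could look roughly like the following sketch. The nested `FieldExtractor` interface is a stand-in with the same shape as Spring Batch's `FieldExtractor<T>`, declared locally so the example compiles without the library. The date pattern `dd.MM.yyyy` and the German number format are assumptions, since the question does not state the file's formats.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

final class FormattingExtractorSketch {

    // Stand-in with the same shape as Spring Batch's FieldExtractor<T>.
    interface FieldExtractor<T> {
        Object[] extract(T item);
    }

    // Hypothetical domain object with the properties named in the question.
    static final class MyDomainObject {
        final long id;
        final String name;
        final String description;
        final BigDecimal price;
        final Date date;

        MyDomainObject(long id, String name, String description, BigDecimal price, Date date) {
            this.id = id;
            this.name = name;
            this.description = description;
            this.price = price;
            this.date = date;
        }
    }

    // Formats price and date as text; the other fields pass through unchanged.
    static final class FormattingFieldExtractor implements FieldExtractor<MyDomainObject> {
        @Override
        public Object[] extract(MyDomainObject item) {
            // Created per call because these formatters are not thread safe.
            DecimalFormat priceFormat =
                    new DecimalFormat("0.00", DecimalFormatSymbols.getInstance(Locale.GERMANY));
            SimpleDateFormat dateFormat = new SimpleDateFormat("dd.MM.yyyy", Locale.ROOT);
            return new Object[] {
                item.id, item.name, item.description,
                priceFormat.format(item.price), dateFormat.format(item.date)
            };
        }
    }

    // Header line matching the column names used by the reader's tokenizer.
    static String header() {
        return String.join(";", "ID", "NAME", "DESCRIPTION", "PRICE", "DATE");
    }
}
```

In a real job the extractor would implement Spring Batch's own interface and be set as the fieldExtractor of the DelimitedLineAggregator, while the header string would be written from a FlatFileHeaderCallback registered as the writer's headerCallback.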

Spring Batch File Writer Exception handling

I have a Spring Batch process which has following kind of code.
<step id="step1" xmlns="http://www.springframework.org/schema/batch">
<tasklet allow-start-if-complete="true">
<chunk reader="reader1" writer="writer1" commit-interval="10"/>
</tasklet>
</step>
<bean id="writer1" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" ref="resourceFlatFile" />
<property name="shouldDeleteIfExists" value="true" />
<property name="transactional" value="true" />
<property name = "lineAggregator">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator" >
<property name="delimiter" value=""/>
<property name ="fieldExtractor">
<bean class="com.path.MyExtractor" />
</property>
</bean>
</property>
</bean>
Basically, my reader returns a set of records from the database, and my writer (writer1) writes them to a flat file. If any kind of exception occurs while writing a record to the file, I would like to mark that record's status as failed in the database. How should I handle this kind of scenario? Any help is appreciated.
Thanks
I would recommend looking into an ItemWriteListener and updating the status of the failed records in its onWriteError implementation.
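The listener idea can be sketched as follows. `WriteErrorCallback` is a local stand-in that mirrors the `onWriteError(Exception, List)` method of `ItemWriteListener` in Spring Batch 2.x, and `DbRecord` and `StatusDao` are hypothetical names for the item type and the database access code. Note that the callback receives every item of the chunk being written, not only the one that caused the exception.

```java
import java.util.ArrayList;
import java.util.List;

final class WriteErrorSketch {

    // Stand-in for the error callback of Spring Batch's ItemWriteListener<S>.
    interface WriteErrorCallback<S> {
        void onWriteError(Exception exception, List<? extends S> items);
    }

    // Hypothetical item type carrying the database key.
    static final class DbRecord {
        final long id;

        DbRecord(long id) {
            this.id = id;
        }
    }

    // Hypothetical DAO that flags rows as failed.
    interface StatusDao {
        void markFailed(List<Long> ids);
    }

    // Marks every record of the failed chunk as failed.
    static final class FailedStatusListener implements WriteErrorCallback<DbRecord> {
        private final StatusDao dao;

        FailedStatusListener(StatusDao dao) {
            this.dao = dao;
        }

        @Override
        public void onWriteError(Exception exception, List<? extends DbRecord> items) {
            List<Long> ids = new ArrayList<>();
            for (DbRecord item : items) {
                ids.add(item.id);
            }
            dao.markFailed(ids);
        }
    }
}
```

With commit-interval="10", up to ten records are flagged per failure. The status update should also run outside the chunk transaction that is being rolled back, otherwise it may be rolled back with it.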