Write multiple record layouts to a fixed-length flat file using Spring Batch

The requirement is to fetch data from a database (see the sample file contents below) and write it to a fixed-length flat file using Spring Batch. According to the file specification, there are multiple record types that share the same alignment, such as 1-5, 4-7, 8-24, and so on. The problem is that, because the records share the same alignment, the data gets overwritten. I need a way to resolve this.
My sample code:
Here bean1 and bean2 are the mapped bean objects for the table.
<!-- ======================================================= -->
<!-- Jobs Definition -->
<!-- ======================================================= -->
<!-- Main processes -->
<!-- Active -->
<job id="FixedLengthFlatFileGenerationJob" xmlns="http://www.springframework.org/schema/batch">
<step id="FixedLengthFlatFileGenerationStep">
<tasklet start-limit="1">
<chunk reader="SampleReader" writer="SampleWriter"
commit-interval="1000">
</chunk>
<listeners>
<listener ref="stepExecutionListener"/>
</listeners>
</tasklet>
<next on="COMPLETED" to="FileTransferStep"/>
<next on="STOPPED" to="SendMailOnFailure"/>
<fail on="*"/>
</step>
<step id="SendMailOnFailure">
<tasklet ref="OnFailureTasklet"/>
</step>
<step id="FileTransferStep">
<tasklet ref="FileTransferTasklet" />
</step>
</job>
<!-- ======================================================= -->
<!-- Readers -->
<!-- ======================================================= -->
<bean id="SampleReader"
class="org.springframework.batch.item.database.JdbcCursorItemReader">
<property name="dataSource" ref="SampleDataSource" />
<property name="sql">
<value>
SQL READ FROM TABLE
</value>
</property>
<property name="rowMapper" ref="SampleMapper" />
</bean>
<!-- ======================================================= -->
<!-- Writers -->
<!-- ======================================================= -->
<bean id="SampleWriter"
class="org.springframework.batch.item.support.CompositeItemWriter">
<property name="delegates">
<list>
<ref local="FileTransferWriter" />
</list>
</property>
</bean>
<bean id="FileTransferWriter" class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
<property name="resource" value="file:filelocation/file.txt" />
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.FormatterLineAggregator">
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="bean1,bean2" />
</bean>
</property>
<property name="format" value="%-20s%-30s" />
</bean>
</property>
</bean>
Sample Fixedlengthflatfile.txt
file.txt TM 45150605033000
UJK5457 0000000000HC605-B045285 D34151631115600 A
BHJ5457 724570420 34151631315600 77014 ct scan for therapy guide Physical Therapy 1S 001 002060 O AA
NTS5457 This is a test for Policy Number 5457
UJK8334 0000000000HC605-B045285 D34151631315600 A
QWS6334 724570420 34151631315600 72142 mri neck spine w/dye Occupational Therapy 2V001 002060 O AA
ETS4334 This is a test for Policy Number 4334.
RYT6313 0000000000HC216-B406574 D34151611115600 A
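One possible direction, sketched here in plain Java rather than taken from the thread: keep one format string per record type and pick the format when each line is built, instead of using a single `%-20s%-30s` for every record. The type names and column widths below are hypothetical, not the real file specification. In Spring Batch, logic like this could live in a custom `LineAggregator` implementation set on the `FlatFileItemWriter`.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch: choose a fixed-length layout per record type.
 * The type names and widths are hypothetical placeholders.
 */
final class MultiLayoutLineFormatter {

    // One format pattern per record type; %-Ns left-aligns and pads to N characters.
    private final Map<String, String> formatsByType = new HashMap<>();

    MultiLayoutLineFormatter() {
        formatsByType.put("HEADER", "%-3s%-7s%-15s");
        formatsByType.put("DETAIL", "%-3s%-7s%-12s%-8s");
        formatsByType.put("NOTE", "%-3s%-7s%-40s");
    }

    /** Formats one record with the layout registered for its type. */
    String format(String recordType, Object... fields) {
        String pattern = formatsByType.get(recordType);
        if (pattern == null) {
            throw new IllegalArgumentException("No layout for record type: " + recordType);
        }
        return String.format(pattern, fields);
    }

    public static void main(String[] args) {
        MultiLayoutLineFormatter formatter = new MultiLayoutLineFormatter();
        System.out.println(formatter.format("HEADER", "UJK", "5457", "HC605-B045285"));
        System.out.println(formatter.format("NOTE", "NTS", "5457", "This is a test"));
    }
}
```

Because every type has its own pattern, records that start at the same column positions no longer compete for a single layout.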

Related

Spring Batch update/insert on a heavy DB takes time

I have implemented a Spring Batch job that reads data from CSV files and inserts or updates rows based on the records.
The table XXX037 has 800k records, and inserting into and updating it takes too much time.
I have configured a commit-interval of 1000, but processing is still slow.
Is there any way I can improve performance?
Configuration:
<batch:job id="pcqJob">
<batch:step id="pcqStep">
<batch:tasklet>
<batch:chunk reader="pcqReader" writer="compositeWriter" commit-interval="1000">
<!-- <batch:skippable-exception-classes>
<batch:include class="javax.persistence.PersistenceException"/>
</batch:skippable-exception-classes> -->
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
<!-- <bean id="skipPolicy" class="com.test.domain.services.writer.SkipPolicy">
<property name="skipLimit" value="2500"/>
</bean> -->
<bean id="compositeWriter" class="org.springframework.batch.item.support.ClassifierCompositeItemWriter">
<property name="classifier">
<bean class="org.springframework.classify.BackToBackPatternClassifier">
<property name="routerDelegate">
<bean class="com.test.domain.services.writer.ItemCodeClassifier" />
</property>
<property name="matcherMap">
<map>
<entry key="*Doss*" value-ref="fileItemWriter1" />
<entry key="*Ldt*" value-ref="fileItemWriter2" />
<entry key="*Old*" value-ref="oldDossierfileItemWriter" />
<entry key="*Tpm*" value-ref="tpmfileItemWriter" />
<entry key="*Txm*" value-ref="tpxfileItemWriter" />
<entry key="*DoD*" value-ref="dossierDeletefileItemWriter" />
<entry key="*LdD*" value-ref="ldtDeletefileItemWriter" />
<entry key="*TpD*" value-ref="tpmDeletefileItemWriter" />
<entry key="*TxD*" value-ref="txmDeletefileItemWriter" />
</map>
</property>
</bean>
</property>
</bean>
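As a rough illustration of the routing step only (it does not address the performance question), the matcherMap above sends each item to a writer by matching a code against wildcard keys. The sketch below is a simplified stand-in in plain Java, not Spring's implementation: it assumes the router delegate returns a string code and it picks the first matching key in insertion order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Simplified wildcard routing: '*' matches any run of characters, '?' matches one character. */
final class WildcardRouter {

    // Insertion order is kept so that the first registered match wins (a simplification).
    private final Map<Pattern, String> routes = new LinkedHashMap<>();

    void addRoute(String wildcard, String writerName) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        routes.put(Pattern.compile(regex.toString()), writerName);
    }

    /** Returns the writer name for the first key that matches the whole code. */
    String route(String code) {
        for (Map.Entry<Pattern, String> entry : routes.entrySet()) {
            if (entry.getKey().matcher(code).matches()) {
                return entry.getValue();
            }
        }
        throw new IllegalStateException("No writer registered for code: " + code);
    }

    public static void main(String[] args) {
        WildcardRouter router = new WildcardRouter();
        router.addRoute("*Doss*", "fileItemWriter1");
        router.addRoute("*Ldt*", "fileItemWriter2");
        System.out.println(router.route("newDossier"));
    }
}
```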

Spring Batch - create a new file each time instead of overwriting it when transferring data from CSV to XML

I am new to Spring Batch. I am transferring data from a CSV file to an XML file, and that works. However, each time I run the code, the XML output file is overwritten, which I don't want. Instead, I want a new output file for each run; the old output files must be kept for data tracking purposes. How can I do that?
Here is my code. What do I need to change in the file below? Let me know if you need more of my code.
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd
http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
<!-- JobRepository and JobLauncher are configuration/setup classes -->
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
<!-- ============= ItemReader reads a complete line one by one from input file ============ -->
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<!-- Get the Resource file -->
<property name="resource" value="classpath:ExamResult.txt" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper">
<!-- Mapper which maps each individual items in a record to properties in POJO -->
<bean class="com.websystique.springbatch.mapper.ExamResultFieldSetMapper" />
</property>
<property name="lineTokenizer">
<!-- A tokenizer class to be used when items in input record are separated by specific characters -->
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|" />
</bean>
</property>
</bean>
</property>
</bean>
<!-- ======== XML ItemWriter which writes the data in XML format =========== -->
<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="file:xml/ExamResult.xml" />
<property name="rootTagName" value="UniversityExamResultList" />
<property name="marshaller">
<bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.websystique.springbatch.model.ExamResult</value>
</list>
</property>
</bean>
</property>
</bean>
<!-- Optional ItemProcessor to perform business logic/filtering on the input records -->
<bean id="itemProcessor" class="com.websystique.springbatch.processor.ExamResultItemProcessor" />
<!-- Optional JobExecutionListener to perform business logic before and after the job -->
<bean id="jobListener" class="com.websystique.springbatch.listener.ExamResultJobListener" />
<!-- Step will need a transaction manager -->
<bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<!-- ==================== Actual Job =================== -->
<batch:job id="examResultJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="flatFileItemReader" writer="xmlItemWriter" processor="itemProcessor" commit-interval="10" />
</batch:tasklet>
</batch:step>
<batch:listeners>
<batch:listener ref="jobListener" />
</batch:listeners>
</batch:job>
</beans>
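Try using the Spring Expression Language (SpEL) to add a date and time to the end of the output file name. The SpEL string literal uses single quotes, because double quotes would end the XML attribute value; MM and HH give a two-digit month and a 24-hour clock, so names stay unambiguous. Something like:
<property name="resource"
value="file:xml/ExamResult-#{new java.text.SimpleDateFormat('MMddyyyyHHmmss').format(new java.util.GregorianCalendar().getTime())}.xml" />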
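The same naming idea can be sketched in plain Java. This is only an illustration, assuming the pattern `MMddyyyyHHmmss` (two-digit month, 24-hour clock) and a hypothetical helper named `forRun`; it uses `java.time` rather than `SimpleDateFormat`.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Builds an output file name that carries the start time of the run. */
final class TimestampedFileName {

    // Two-digit month and 24-hour clock, so two different times never share a stamp.
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("MMddyyyyHHmmss");

    static String forRun(String prefix, String extension, LocalDateTime startedAt) {
        return prefix + "-" + STAMP.format(startedAt) + "." + extension;
    }

    public static void main(String[] args) {
        // For 5 January 2016, 14:03:09 this prints ExamResult-01052016140309.xml
        System.out.println(forRun("ExamResult", "xml", LocalDateTime.of(2016, 1, 5, 14, 3, 9)));
    }
}
```

Runs that start in the same second would still collide; if that can happen, a run id or job parameter in the name is a safer discriminator.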

Mapping JPA entity to more than one entityManagers with SpringBatch program

I have developed a Spring Batch application and deployed it as a web application in a WebSphere Liberty profile container. The batch program reads records from a table and invokes an HTTP service. Based on the service response, a column named status is updated to RECORD_SENT, COMPLETE, or ERROR.
The objective is to reuse the same program for multiple data sources. The data source is selected by the client type passed as a job parameter. The data sources are in different schemas but share the same data model.
Question: how can the transaction manager be applied at run time inside a job step or tasklet? I am seeking help in this regard.
Configuration:
<bean id="entityManagerFactory1"
class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="dataSource" ref="dataSource1" />
<property name="persistenceUnitName" value="user" />
<property name="jpaVendorAdapter">
<bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
<property name="showSql" value="false" />
</bean>
</property>
<property name="jpaDialect">
<bean class="org.springframework.orm.jpa.vendor.HibernateJpaDialect" />
</property>
</bean>
<bean id="entityManagerFactory2"
class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="dataSource" ref="dataSource2" />
<property name="persistenceUnitName" value="user" />
<property name="jpaVendorAdapter">
<bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
<property name="showSql" value="false" />
</bean>
</property>
<property name="jpaDialect">
<bean class="org.springframework.orm.jpa.vendor.HibernateJpaDialect" />
</property>
</bean>
<bean id="entityManagerSelector" class="com.spring.jpa.test.EntitymanagerSelector">
<property name="entityManagerFactory1" ref="entityManagerFactory1"></property>
<property name="entityManagerFactory2" ref="entityManagerFactory2"></property>
</bean>
job.xml snippet
<bean id="itemReader" class="org.springframework.batch.item.database.JpaPagingItemReader" scope="step">
<property name="entityManagerFactory" value="#{entityManagerSelector.getEntitymanagerForClient({jobParameters['client']})}" />
<property name="queryString" value="select u from User u where u.age > #{jobParameters['age']}" />
</bean>
Setting the job parameters at run time to identify the client:
JobParameters param = new JobParametersBuilder()
.addString("age", "20").addString("client", "client2")
.toJobParameters();
JobExecution execution = jobLauncher.run(job, param);
It is not possible to set the transaction-manager of a step or tasklet at run time. You are better off creating a separate job for each client, each using its own transaction manager in the tasklet.
<bean id="transactionManager1" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory1" />
</bean>
<bean id="transactionManager2" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory2" />
</bean>
Now use these transaction managers when creating the batch jobs:
<job id="testJob1" xmlns="http://www.springframework.org/schema/batch">
<step id="client1step1">
<tasklet transaction-manager="transactionManager1">
<chunk reader="itemReader" writer="itemWriter" commit-interval="1" />
</tasklet>
</step>
</job>
<job id="testJob2" xmlns="http://www.springframework.org/schema/batch">
<step id="client2step2">
<tasklet transaction-manager="transactionManager2">
<chunk reader="itemReader" writer="itemWriter" commit-interval="1" />
</tasklet>
</step>
</job>
Let me know if this works out.
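With one job per client, the launching code has to pick the job instead of passing the client as a parameter. A minimal sketch of that lookup, assuming the client keys client1 and client2 map to the job ids testJob1 and testJob2 (this mapping is hypothetical; the thread only shows client2 as a parameter value). In a real application, the returned name would then be used to fetch the job bean from the application context before calling jobLauncher.run.

```java
import java.util.HashMap;
import java.util.Map;

/** Maps a client key to the id of the job that is wired to that client's transaction manager. */
final class ClientJobSelector {

    // Hypothetical mapping; adjust to the real client keys and job ids.
    private final Map<String, String> jobNameByClient = new HashMap<>();

    ClientJobSelector() {
        jobNameByClient.put("client1", "testJob1");
        jobNameByClient.put("client2", "testJob2");
    }

    String jobNameFor(String client) {
        String jobName = jobNameByClient.get(client);
        if (jobName == null) {
            throw new IllegalArgumentException("Unknown client: " + client);
        }
        return jobName;
    }

    public static void main(String[] args) {
        System.out.println(new ClientJobSelector().jobNameFor("client2"));
    }
}
```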

Using optional fields with StaxEventItemReader

I have a Spring Batch application that uses StaxEventItemReader as its ItemReader. By default, XStream requires a property to be declared for every possible XML tag; otherwise it throws an UnknownFieldException. There are ways to code around this in Java, but with Spring Batch the item reader doesn't seem to offer a way to change this behavior. Is there a way to flag fields as optional in the XML?
My beans are configured basically like this:
<job id="synchronizecustomerData" xmlns="http://www.springframework.org/schema/batch">
<step id="readWritecustomers">
<tasklet>
<chunk reader="customerReader"
processor="customerProcessor"
writer="customerSyncWriter"
commit-interval="1"
skip-policy="alwaysSkip" >
</chunk>
</tasklet>
</step>
</job>
<bean id="customerReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
<property name="fragmentRootElementName" value="customer" />
<property name="resource" ref="inputResource" />
<property name="unmarshaller" ref="customerMarshaller" />
</bean>
<bean id="inputResource" class="org.springframework.core.io.FileSystemResource">
<constructor-arg value="c:/sf/data.xml" />
</bean>
<bean id="customerMarshaller" class="org.springframework.oxm.xstream.XStreamMarshaller">
<property name="aliases">
<util:map id="aliases">
<entry key="customer" value="com.company.batchmaster.sf.beans.customer" />
<entry key="name" value="java.lang.String" />
</util:map>
</property>
</bean>
<bean id="customerProcessor" class="org.springframework.batch.item.support.CompositeItemProcessor">
<property name="delegates">
<list>
<ref bean="customerTransformer" />
</list>
</property>
</bean>
<bean id="customerTransformer" class="com.company.batchmaster.sf.chunk.customerTransformer" />
<bean id="customerSyncWriter" class="com.company.batchmaster.sf.chunk.customerSyncWriter" />
My import file looks like this; I am just getting it up and running:
<?xml version="1.0" encoding="UTF-8"?>
<records>
<customer xmlns="http://springframework.org/batch/sample/io/oxm/domain">
<name>ABC Dealer</name>
<types>CR</types>
</customer>
</records>
Thanks for any help.
I am assuming the Customer class has the properties name and type.
Annotate them using XmlElement.defaultValue() as described in the JAXB guide (XmlAttribute has no defaultValue, and the sample file uses child elements).
The alias <entry key="name" value="java.lang.String" /> is not needed, because a complete Customer object is unmarshalled from each node specified with <property name="fragmentRootElementName" value="customer" />.
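To make "optional or unknown fields" concrete at the parsing level, here is a plain StAX sketch using only the JDK. It is a swapped-in illustration, not XStream and not JAXB: known tags are mapped, anything else inside a customer fragment is skipped instead of causing an error. The class and field names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Reads customer fragments and ignores elements it does not know about. */
final class LenientCustomerParser {

    static final class Customer {
        String name; // the only field this sketch maps
    }

    static List<Customer> parse(String xml) {
        List<Customer> customers = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            Customer current = null;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = reader.getLocalName();
                    if ("customer".equals(tag)) {
                        current = new Customer();
                    } else if (current != null && "name".equals(tag)) {
                        current.name = reader.getElementText();
                    }
                    // Any other tag (for example <types>) is silently skipped.
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "customer".equals(reader.getLocalName())) {
                    customers.add(current);
                    current = null;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return customers;
    }
}
```

Run against the import file from the question, this yields one customer named "ABC Dealer" and skips the undeclared types element.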

Spring Batch - Steps to Improve Performance

I am currently developing data loaders that read files and write to a database. I am using a partition handler to process multiple comma-separated files in 30 threads. I want to scale and increase throughput. Every day I receive 15,000 files (each with 1 million records), and I want the job to process them within a day. How do I scale this using Spring Batch? Is there an open-source grid computing framework that can do this, or are there simple fine-tuning steps?
The Spring Batch data loader runs standalone; no web container is involved. It runs on a single Solaris machine with 24 CPUs. The data is written to a single database, with the default isolation and propagation. The XML config is given below:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:p="http://www.springframework.org/schema/p"
xmlns:aop="http://www.springframework.org/schema/aop"
xmlns:context="http://www.springframework.org/schema/context"
xmlns:tx="http://www.springframework.org/schema/tx"
xmlns:task="http://www.springframework.org/schema/task"
xsi:schemaLocation="http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop.xsd
http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd
http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-3.0.xsd
http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task-3.0.xsd">
<!-- IMPORT DB CONFIG -->
<import resource="classpath:bom/bom/bomloader/job/DataSourcePoolConfig.xml" />
<!-- USE ANNOTATIONS TO CONFIGURE SPRING BEANS -->
<context:component-scan base-package="bom.bom.bom" />
<!-- INJECT THE PROCESS PARAMS HASHMAP BEFORE CONTEXT IS INITIALISED -->
<bean id="holder" class="bom.bom.bom.loader.util.PlaceHolderBean" >
<property name="beanName" value="holder"/>
</bean>
<bean id="logger" class="bom.bom.bom.loader.util.PlaceHolderBean" >
<property name="beanName" value="logger"/>
</bean>
<bean id="dataMap" class="java.util.concurrent.ConcurrentHashMap" />
<!-- JOB REPOSITORY - WE USE DATABASE REPOSITORY -->
<!-- <bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean" >-->
<!-- <property name="transactionManager" ref="frdtransactionManager" />-->
<!-- <property name="dataSource" ref="frddataSource" />-->
<!-- <property name="databaseType" value="oracle" />-->
<!-- <property name="tablePrefix" value="batch_"/> -->
<!-- </bean>-->
<!-- JOB REPOSITORY - WE IN MEMORY REPOSITORY -->
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="frdtransactionManager" />
</bean>
<!-- <bean id="jobExplorer" class="org.springframework.batch.core.explore.support.JobExplorerFactoryBean">-->
<!-- <property name="dataSource" ref="frddataSource" />-->
<!-- <property name="tablePrefix" value="batch_"/> -->
<!-- </bean>-->
<!-- LAUNCH JOBS FROM A REPOSITORY -->
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
<property name="taskExecutor">
<bean class="org.springframework.core.task.SyncTaskExecutor" />
</property>
</bean>
<!-- CONFIGURE SCHEDULING IN QUARTZ -->
<!-- <bean id="jobDetail" class="org.springframework.scheduling.quartz.JobDetailBean">-->
<!-- <property name="jobClass" value="bom.bom.bom.assurance.core.JobLauncherDetails" />-->
<!-- <property name="group" value="quartz-batch" />-->
<!-- <property name="jobDataAsMap">-->
<!-- <map>-->
<!-- <entry key="jobName" value="${jobname}"/>-->
<!-- <entry key="jobLocator" value-ref="jobRegistry"/>-->
<!-- <entry key="jobLauncher" value-ref="jobLauncher"/>-->
<!-- </map>-->
<!-- </property>-->
<!-- </bean>-->
<!-- RUN EVERY 2 HOURS -->
<!-- <bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">-->
<!-- <property name="triggers">-->
<!-- <bean id="cronTrigger" class="org.springframework.scheduling.quartz.CronTriggerBean">-->
<!-- <property name="jobDetail" ref="jobDetail" />-->
<!-- <property name="cronExpression" value="2/0 * * * * ?" />-->
<!-- </bean>-->
<!-- </property>-->
<!-- </bean>-->
<!-- -->
<!-- RUN STANDALONE -->
<bean id="jobRunner" class="bom.bom.bom.loader.core.DataLoaderJobRunner">
<constructor-arg value="${LOADER_NAME}" />
</bean>
<!-- Get all the files for the exchanges and feed as resource to the MultiResourcePartitioner -->
<bean id="fileresource" class="bom.bom.bom.loader.util.FiltersFoldersResourceFactory" p:dataMap-ref="dataMap">
<property name="filePath" value="${PARENT_PATH}" />
<property name="acceptedFolders" value="${EXCH_CODE}" />
<property name="logger" ref="logger" />
</bean>
<!-- The network Data Loading Configuration goes here -->
<job id="CDR_network _PARALLEL" xmlns="http://www.springframework.org/schema/batch" restartable="false" >
<step id="PREPARE_CLEAN" >
<flow parent="prepareCleanFlow" />
<next on="COMPLETED" to="LOAD_EXCHANGE_DATA" />
<fail on="FAILED" exit-code="Failed on cleaning error records."/>
</step>
<step id="LOAD_EXCHANGE_DATA" >
<tasklet ref="businessData" transaction-manager="ratransactionManager" />
<next on="COMPLETED" to="LOAD_CDR_FILES" />
<fail on="FAILED" exit-code="FAILED ON LOADING EXCHANGE INFORMATION FROM DB." />
</step>
<step id="LOAD_CDR_FILES" >
<tasklet ref="fileresource" transaction-manager="frdtransactionManager" />
<next on="COMPLETED" to="PROCESS_FILE_TO_STAGING_TABLE_PARALLEL" />
<fail on="FAILED" exit-code="FAILED ON LOADING CDR FILES." />
</step>
<step id="PROCESS_FILE_TO_STAGING_TABLE_PARALLEL" next="limitDecision" >
<partition step="filestep" partitioner="filepartitioner" >
<handler grid-size="100" task-executor="executorWithCallerRunsPolicy" />
</partition>
</step>
<decision id="limitDecision" decider="limitDecider">
<next on="COMPLETED" to="MOVE_RECS_STAGING_TO_MAIN_TABLE" />
<next on="CONTINUE" to="PROCESS_FILE_TO_STAGING_TABLE_PARALLEL" />
</decision>
<step id="MOVE_RECS_STAGING_TO_MAIN_TABLE" >
<tasklet ref="moveRecords" transaction-manager="ratransactionManager" >
<transaction-attributes isolation="SERIALIZABLE"/>
</tasklet>
<fail on="FAILED" exit-code="FAILED ON MOVING DATA TO THE MAIN TABLE." />
<next on="*" to="PREPARE_ARCHIVE"/>
</step>
<step id="PREPARE_ARCHIVE" >
<flow parent="prepareArchiveFlow" />
<fail on="FAILED" exit-code="FAILED ON Archiving files" />
<end on="*" />
</step>
</job>
<flow id="prepareCleanFlow" xmlns="http://www.springframework.org/schema/batch">
<step id="CLEAN_ERROR_RECORDS" next="archivefileExistsDecisionInFlow" >
<tasklet ref="houseKeeping" transaction-manager="ratransactionManager" />
</step>
<decision id="archivefileExistsDecisionInFlow" decider="archivefileExistsDecider">
<end on="NO_ARCHIVE_FILE" />
<next on="ARCHIVE_FILE_EXISTS" to="runprepareArchiveFlow" />
</decision>
<step id="runprepareArchiveFlow" >
<flow parent="prepareArchiveFlow" />
</step>
</flow>
<flow id="prepareArchiveFlow" xmlns="http://www.springframework.org/schema/batch" >
<step id="ARCHIVE_CDR_FILES" >
<tasklet ref="archiveFiles" transaction-manager="frdtransactionManager" />
</step>
</flow>
<bean id="archivefileExistsDecider" class="bom.bom.bom.loader.util.ArchiveFileExistsDecider" >
<property name="logger" ref="logger" />
<property name="frdjdbcTemplate" ref="frdjdbcTemplate" />
</bean>
<bean id="filepartitioner" class="org.springframework.batch.core.partition.support.MultiResourcePartitioner" scope="step" >
<property name="resources" value="#{dataMap[processFiles]}"/>
</bean>
<task:executor id="executorWithCallerRunsPolicy"
pool-size="90-95"
queue-capacity="6"
rejection-policy="CALLER_RUNS"/>
<!-- <bean id="dynamicJobParameters" class="bom.bom.bom.assurance.core.DynamicJobParameters" />-->
<bean id="houseKeeping" class="bom.bom.bom.loader.core.HousekeepingOperation">
<property name="logger" ref="logger" />
<property name="jdbcTemplate" ref="rajdbcTemplate" />
<property name="frdjdbcTemplate" ref="frdjdbcTemplate" />
</bean>
<bean id="businessData" class="bom.bom.bom.loader.core.BusinessValidatorData">
<property name="logger" ref="logger" />
<property name="jdbcTemplate" ref="NrajdbcTemplate" />
<property name="param" value="${EXCH_CODE}" />
<property name="sql" value="${LOOKUP_QUERY}" />
</bean>
<step id="filestep" xmlns="http://www.springframework.org/schema/batch">
<tasklet transaction-manager="ratransactionManager" allow-start-if-complete="true" >
<chunk writer="jdbcItenWriter" reader="fileItemReader" processor="itemProcessor" commit-interval="500" retry-limit="2">
<retryable-exception-classes>
<include class="org.springframework.dao.DeadlockLoserDataAccessException"/>
</retryable-exception-classes>
</chunk>
<listeners>
<listener ref="customStepExecutionListener">
</listener>
</listeners>
</tasklet>
</step>
<bean id="moveRecords" class="bom.bom.bom.loader.core.MoveDataFromStaging">
<property name="logger" ref="logger" />
<property name="jdbcTemplate" ref="rajdbcTemplate" />
</bean>
<bean id="archiveFiles" class="bom.bom.bom.loader.core.ArchiveCDRFile" >
<property name="logger" ref="logger" />
<property name="jdbcTemplate" ref="frdjdbcTemplate" />
<property name="archiveFlag" value="${ARCHIVE_FILE}" />
<property name="archiveDir" value="${ARCHIVE_LOCATION}" />
</bean>
<bean id="limitDecider" class="bom.bom.bom.loader.util.LimitDecider" p:dataMap-ref="dataMap">
<property name="logger" ref="logger" />
</bean>
<!-- <bean id="multifileReader" class="org.springframework.batch.item.file.MultiResourceItemReader" scope="step" >-->
<!-- <property name="resources" value="#{stepExecutionContext[fileName]}" />-->
<!-- <property name="delegate" ref="fileItemReader" />-->
<!-- </bean>-->
<!-- READ EACH FILE PARALLELY -->
<bean id="fileItemReader" scope="step" autowire-candidate="false" parent="itemReaderParent">
<property name="resource" value="#{stepExecutionContext[fileName]}" />
<property name="saveState" value="false" />
</bean>
<!-- LISTEN AT THE END OF EACH FILE TO DO POST PROCESSING -->
<bean id="customStepExecutionListener" class="bom.bom.bom.loader.core.StagingStepExecutionListener" scope="step">
<property name="logger" ref="logger" />
<property name="frdjdbcTemplate" ref="frdjdbcTemplate" />
<property name="jdbcTemplate" ref="rajdbcTemplate" />
<property name="sql" value="${INSERT_IA_QUERY_COLUMNS}" />
</bean>
<!-- CONFIGURE THE ITEM PROCESSOR TO DO BUSINESS LOGIC ON EACH ITEM -->
<bean id="itemProcessor" class="bom.bom.bom.loader.core.StagingLogicProcessor" scope="step">
<property name="logger" ref="logger" />
<property name="params" ref="businessData" />
</bean>
<!-- CONFIGURE THE JDBC ITEM WRITER TO WRITE IN TO DB -->
<bean id="jdbcItenWriter" class="org.springframework.batch.item.database.JdbcBatchItemWriter" scope="step">
<property name="dataSource" ref="radataSource"/>
<property name="sql">
<value>
<![CDATA[
${SQL1A}
]]>
</value>
</property>
<property name="itemSqlParameterSourceProvider">
<bean class="org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider">
</bean>
</property>
</bean>
<!-- <bean id="itemWriter" class="bom.bom.bom.assurance.core.LoaderDBWriter" scope="step">-->
<!-- <property name="sQL" value="${loader.sql}" />-->
<!-- <property name="jdbcTemplate" ref="NrajdbcTemplate" />-->
<!-- </bean>-->
<!-- CONFIGURE THE FLAT FILE ITEM READER TO READ INDIVIDUAL BATCH -->
<bean id="itemReaderParent" class="org.springframework.batch.item.file.FlatFileItemReader" abstract="true">
<property name="strict" value="false"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
<property name="names" value="${COLUMNS}" />
<property name="columns" value="${RANGE}" />
</bean>
</property>
<property name="fieldSetMapper">
<bean class="bom.bom.bom.loader.util.DataLoaderMapper">
<property name="params" value="${BEANPROPERTIES}"/>
</bean>
</property>
</bean>
</property>
</bean>
</beans>
Tried:
I could see that the ThreadPoolExecutor hangs after 3 hours. prstat on Solaris says the process is still working, but nothing is written to the log.
I tried a smaller chunk size (500) because of the memory footprint; no progress.
Since it inserts into a single database (30 pooled connections), is there anything I can do here?
Instances from VisualVM
Stack trace of the threads: all are locked at the connection level
Full thread dump Java HotSpot(TM) Server VM (11.3-b02 mixed mode):
"Attach Listener" daemon prio=3 tid=0x00bbf800 nid=0x26 waiting on condition [0x00000000..0x00000000]
java.lang.Thread.State: RUNNABLE
"executorWithCallerRunsPolicy-1" prio=3 tid=0x008a7000 nid=0x25 runnable [0xd5a7d000..0xd5a7fb70]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at oracle.net.ns.Packet.receive(Packet.java:240)
at oracle.net.ns.DataPacket.receive(DataPacket.java:92)
at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:172)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:117)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:92)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:77)
at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1034)
at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1010)
at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:588)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:194)
at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:953)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1222)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3387)
at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:3468)
- locked <0xdbdafa30> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeUpdate(OraclePreparedStatementWrapper.java:1350)
at org.springframework.jdbc.core.JdbcTemplate$2.doInPreparedStatement(JdbcTemplate.java:818)
at org.springframework.jdbc.core.JdbcTemplate$2.doInPreparedStatement(JdbcTemplate.java:1)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:587)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:812)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:868)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:876)
at
I would suggest lowering the chunk size to 50.
500 seems too big: you wait too long while talking to the DB.
At the same time, lower the TaskExecutor's pool size or increase your DB pool size.
You can choose which one by watching your DB host: if its CPU and I/O are not maxed out, increase your DB pool size to increase the DB load. If your DB CPU is already at its maximum, lower the TaskExecutor's pool size. The objective is to have a fluid process.
I think the DB will be your main limiting factor. So begin by adjusting the DB pool size according to the DB host's capacity. Once that is done, adjust your TaskExecutor's pool size according to the DB pool size (TE pool size = DB pool size * 1.5) and the batch host's capacity (CPU, memory, and I/O).
Splitting your incoming files across multiple hard drives may help too (if possible).
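The rule of thumb above (TE pool size = DB pool size * 1.5) can be written down as a tiny helper. Rounding up is an assumption made here; the answer does not say how to round. With the 30 pooled connections from the question, it gives 45 threads, well below the configured 90-95.

```java
/** Rule of thumb from the answer: executor pool size = DB pool size * 1.5. */
final class PoolSizing {

    static int executorPoolSize(int dbPoolSize) {
        if (dbPoolSize <= 0) {
            throw new IllegalArgumentException("dbPoolSize must be positive");
        }
        // Rounded up (assumption) so that an odd pool size does not lose half a thread.
        return (int) Math.ceil(dbPoolSize * 1.5);
    }

    public static void main(String[] args) {
        System.out.println(executorPoolSize(30)); // prints 45
    }
}
```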
I think the problem here is the million records per file. Since you have already reduced the chunk size, you should process smaller files. For testing, reduce the number of records in each file to 10k. My guess is that you are creating objects and doing some processing, and you are doing this for 1M records in a loop. Each thread holds its objects in memory until processing completes. Because of the volume of data, there may be too many objects in memory that are not garbage collected. If reducing the size helps, you can then try using lightweight objects in your code and setting each object to null at the end of processing.
Just a fix for your cron expression. This is the correct one for every 2 hours:
0 0 0/2 1/1 * ? *
Is your batch job relying on reflection (e.g., BeanPropertyRowMapper)? That can hamper performance.
If your database is causing problems, you may want to profile it. I don't have much concrete to offer here.
As already mentioned, drop that chunk size.
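For a sense of scale, the figures stated in the question (15,000 files a day, 1 million records each, 30 threads) can be turned into a required sustained rate. This is plain arithmetic under the assumption that the load is spread evenly over 24 hours; it is a target to compare measurements against, not a measurement.

```java
/** Back-of-the-envelope throughput needed to finish one day's files within one day. */
final class RequiredThroughput {

    private static final long SECONDS_PER_DAY = 24L * 60 * 60;

    static long recordsPerSecond(long filesPerDay, long recordsPerFile) {
        return (filesPerDay * recordsPerFile) / SECONDS_PER_DAY;
    }

    static long recordsPerSecondPerThread(long filesPerDay, long recordsPerFile, int threads) {
        return recordsPerSecond(filesPerDay, recordsPerFile) / threads;
    }

    public static void main(String[] args) {
        System.out.println(recordsPerSecond(15_000, 1_000_000));              // prints 173611
        System.out.println(recordsPerSecondPerThread(15_000, 1_000_000, 30)); // prints 5787
    }
}
```

Roughly 174,000 inserts per second into a single database is the number that the pool sizes and chunk size have to add up to.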