Spring Batch: create file by setting name programmatically

I have a Spring Batch job (defined in XML) which generates a CSV export.
Inside the FlatFileItemWriter bean I set the resource property, where the name of the file is set.
<bean id="customDataFileWriter" class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
<property name="resource" value="file:/tmp/export/custom-export.csv"/>
...
Now I need to set this file name according to certain logic, so I need to set it from some Java class. Any ideas?

Use the builder classes of Spring Batch (job builder, step builder, and so on) with Java configuration. Have a look at https://blog.codecentric.de/en/2013/06/spring-batch-2-2-javaconfig-part-1-a-comparison-to-xml/ to get an idea.
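For example, with Java config the writer can be a step-scoped bean whose resource is computed in plain Java. This is only a minimal sketch, assuming Spring Batch 4+ (FlatFileItemWriterBuilder) and a timestamp-based naming rule as a stand-in for your real logic:
// Minimal sketch, assuming Spring Batch 4+; the naming rule is illustrative.
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.builder.FlatFileItemWriterBuilder;
import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
public class ExportConfig {

    @Bean
    @StepScope
    public FlatFileItemWriter<String> customDataFileWriter() {
        // any naming logic you need can run here, in plain Java
        String fileName = "/tmp/export/custom-export-" + System.currentTimeMillis() + ".csv";
        return new FlatFileItemWriterBuilder<String>()
                .name("customDataFileWriter")
                .resource(new FileSystemResource(fileName))
                .lineAggregator(new PassThroughLineAggregator<>())
                .build();
    }
}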

You can extend FlatFileItemWriter and override its setResource method to add your own logic for renaming the file.
Here's an example implementation:
@Override
public void setResource(Resource resource) {
    if (resource instanceof ClassPathResource) {
        // Convert resource
        ClassPathResource res = (ClassPathResource) resource;
        try {
            String path = res.getPath();
            // Do something to "path" here
            File file = new File(path);
            // Check for permissions to write
            if (file.canWrite() || file.createNewFile()) {
                file.delete();
                // Call parent setter with new resource
                super.setResource(new FileSystemResource(file.getAbsolutePath()));
                return;
            }
        } catch (IOException e) {
            // File could not be read/written
        }
    }
    // If something went wrong or resource was delegated to MultiResourceItemWriter,
    // call parent setter with default resource
    super.setResource(resource);
}
Another possibility is to use jobParameters, if your logic can be applied before the job is launched. See section 5.4, "Late Binding", of the Spring Batch documentation.
Example:
<bean id="flatFileItemReader" scope="step" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="#{jobParameters['input.file.name']}" />
</bean>
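With that approach, the file name can be computed in any Java class and handed to the job as a parameter; a step-scoped writer then refers to it with #{jobParameters['output.file.name']}. A rough sketch (the job name and parameter key are illustrative, not from the question):
// Hedged sketch: compute the file name in Java and pass it as a job parameter.
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;

public class ExportLauncher {

    private final JobLauncher jobLauncher;
    private final Job exportJob;

    public ExportLauncher(JobLauncher jobLauncher, Job exportJob) {
        this.jobLauncher = jobLauncher;
        this.exportJob = exportJob;
    }

    public void run() throws Exception {
        // any naming logic can run here before the job starts
        String fileName = "file:/tmp/export/custom-export-" + System.currentTimeMillis() + ".csv";
        JobParameters params = new JobParametersBuilder()
                .addString("output.file.name", fileName)
                .toJobParameters();
        jobLauncher.run(exportJob, params);
    }
}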
You can also use a MultiResourceItemWriter with a custom ResourceSuffixCreator. That will let you create 1 to n files with a common filename pattern.
Here's an example of the getSuffix method of a custom ResourceSuffixCreator:
@Override
public String getSuffix(int index) {
    // Your logic
    if (true)
        return "XXX" + index;
    else
        return "";
}

Related

Create Log4net rolling file which is based on date

I need to create files for 45 separate locations (for example: Boston, London, etc.), and these file names have to be based on the date. I also need to set a maximum file size at which the files roll and a maximum number of rolled files to keep.
Basically a file name must look like: Info_Boston_(2019.02.25).txt
So far I have come up with the code below to roll by date, but I couldn't limit the file size to 1MB. The file grows beyond 1MB and a new rolling file is not created. Please assist.
<appender name="MyAppenderInfo" type="log4net.Appender.RollingFileAppender">
<param name="File" value="C:\\ProgramData\\Service\\Org\\Info"/>
<param name="RollingStyle" value="Date"/>
<param name="DatePattern" value="_(yyyy.MM.dd).\tx\t"/>
<param name="StaticLogFileName" value="false"/>
<maxSizeRollBackups value="10" />
<maximumFileSize value="1MB" />
<appendToFile value="true" />
<lockingModel type="log4net.Appender.FileAppender+MinimalLock" />
<layout type="log4net.Layout.PatternLayout">
<conversionPattern value="%date %message%n" />
</layout>
<filter type="log4net.Filter.LevelRangeFilter">
<levelMin value="DEBUG" />
<levelMax value="INFO" />
</filter>
</appender>
To address your specific post: I would not do this with a config-based approach, as I think it would get rather cumbersome to manage. A more programmatic approach is to generate the logging instances dynamically.
EDIT: I took down the original to post this reworked example based on this SO post: "log4net: different logs on different file appenders at runtime".
EDIT-2: I had to rework this again, as I realized I had omitted some required parts and had gotten some things wrong after the rework. This is tested and working. A few things to note: you will need to provide the using statements in the controller for the logging class you make, and you will need to inject your logging directories as I have done, or come up with another way of providing the list of log file outputs.
This will let you cleanly generate as many logging instances as you need, to as many independent locations as you like. I pulled this example from a project of mine and modified it a bit to fit your needs. Let me know if you have questions.
Create a DynamicLogger class which inherits from the base Logger in the hierarchy:
using log4net;
using log4net.Repository.Hierarchy;
public sealed class DynamicLogger : Logger
{
private const string REPOSITORY_NAME = "somename";
internal DynamicLogger(string name) : base(name)
{
try
{
// try and find an existing repository
base.Hierarchy = (log4net.Repository.Hierarchy.Hierarchy)LogManager.GetRepository(REPOSITORY_NAME);
} // try
catch
{
// it doesn't exist, make it.
base.Hierarchy = (log4net.Repository.Hierarchy.Hierarchy)LogManager.CreateRepository(REPOSITORY_NAME);
} // catch
} // ctor(string)
} // DynamicLogger
then, build out a class to manage the logging instances, and build the new loggers:
using log4net;
using log4net.Appender;
using log4net.Config;
using log4net.Core;
using log4net.Filter;
using log4net.Layout;
using log4net.Repository;
using Microsoft.Extensions.Options;
using System.Collections.Generic;
using System.Linq;
public class LogFactory
{
private const string REPOSITORY_NAME = "somename"; // must match the repository name used in DynamicLogger
private static List<ILog> _Loggers = new List<ILog>();
private static LoggingConfig _Settings;
private static ILoggerRepository _Repository;
public LogFactory(IOptions<LoggingConfig> configuration)
{
_Settings = configuration.Value;
ConfigureRepository(REPOSITORY_NAME);
} // ctor(IOptions<LoggingConfig>)
/// <summary>
/// Configures the primary logging repository.
/// </summary>
/// <param name="repositoryName">The name of the repository.</param>
private void ConfigureRepository(string repositoryName)
{
if(_Repository == null)
{
try
{
_Repository = LogManager.CreateRepository(repositoryName);
}
catch
{
// repository already exists.
_Repository = LogManager.GetRepository(repositoryName);
} // catch
} // if
} // ConfigureRepository(string)
/// <summary>
/// Gets a named logging instance if it exists, and creates it if it doesn't.
/// </summary>
/// <param name="name"></param>
/// <returns></returns>
public ILog GetLogger(string name)
{
string filePath = string.Empty;
switch (name)
{
case "core":
filePath = _Settings.CoreLoggingDirectory;
break;
case "image":
filePath = _Settings.ImageProcessorLoggingDirectory;
break;
} // switch
if (_Loggers.SingleOrDefault(a => a.Logger.Name == name) == null)
{
BuildLogger(name, filePath);
} // if
return _Loggers.SingleOrDefault(a => a.Logger.Name == name);
} // GetLogger(string)
/// <summary>
/// Dynamically build a new logging instance.
/// </summary>
/// <param name="name">The name of the logger (Not file name)</param>
/// <param name="filePath">The file path you want to log to.</param>
/// <returns></returns>
private ILog BuildLogger(string name, string filePath)
{
// Create a new filter to include all logging levels, debug, info, error, etc.
var filter = new LevelMatchFilter();
filter.LevelToMatch = Level.All;
filter.ActivateOptions();
// Create a new pattern layout to determine the format of the log entry.
var pattern = new PatternLayout("%d %-5p %c %m%n");
pattern.ActivateOptions();
// Dynamic logger inherits from the hierarchy logger object, allowing us to create dynamically generated logging instances.
var logger = new DynamicLogger(name);
logger.Level = Level.All;
// Create a new rolling file appender
var rollingAppender = new RollingFileAppender();
// ensures it will not create a new file each time it is called.
rollingAppender.AppendToFile = true;
rollingAppender.Name = name;
rollingAppender.File = filePath;
rollingAppender.Layout = pattern;
rollingAppender.AddFilter(filter);
// allows us to dynamically generate the file name, ie C:\temp\log_{date}.log
rollingAppender.StaticLogFileName = false;
// ensures that the file extension is not lost in the renaming for the rolling file
rollingAppender.PreserveLogFileNameExtension = true;
rollingAppender.DatePattern = "yyyy-MM-dd";
rollingAppender.RollingStyle = RollingFileAppender.RollingMode.Date;
// must be called on all attached objects before the logger can use it.
rollingAppender.ActivateOptions();
logger.AddAppender(rollingAppender);
// Sets the logger to not inherit old appenders, or the core appender.
logger.Additivity = false;
// sets the loggers effective level, determining what level it will catch log requests for and log them appropriately.
logger.Level = Level.Info;
// ensures the new logger does not inherit the appenders of the previous loggers.
logger.Additivity = false;
// The very last thing that we need to do is tell the repository it is configured, so it can bind the values.
_Repository.Configured = true;
// bind the values.
BasicConfigurator.Configure(_Repository, rollingAppender);
LogImpl newLog = new LogImpl(logger);
_Loggers.Add(newLog);
return newLog;
} // BuildLogger(string, string)
} // LogFactory
Then, in your Dependency Injection you can inject your log factory. You can do that with something like this:
services.AddSingleton<LogFactory>();
Then in your controller, or any constructor really, you can just do something like this:
private LogFactory _LogFactory;
public HomeController(LogFactory logFactory){
_LogFactory = logFactory;
}
public async Task<IActionResult> Index()
{
ILog logger1 = _LogFactory.GetLogger("core");
ILog logger2 = _LogFactory.GetLogger("image");
logger1.Info("SomethingHappened on logger 1");
logger2.Info("SomethingHappened on logger 2");
return View();
}
This example will output:
2019-03-07 10:41:21,338 INFO core SomethingHappened on logger 1
in its own file called Core_2019-03-07.log
and also:
2019-03-07 11:06:29,155 INFO image SomethingHappened on logger 2
in its own file called Image_2019-03-07
Hope that makes more sense!

Scheduler Spring boot

@Bean
public LockProvider lockProvider(DataSource dataSource) {
    return new JdbcTemplateLockProvider(dataSource);
}

@Bean
public ScheduledLockConfiguration taskScheduler(LockProvider lockProvider) {
    return ScheduledLockConfigurationBuilder
            .withLockProvider(lockProvider)
            .withPoolSize(10)
            .withDefaultLockAtMostFor(Duration.ofMinutes(10))
            .build();
}
My requirement is to run only a single scheduler on only one instance in a clustered environment. For this I am using ShedLock, but the problem is that at server startup I get the exception below: "java.lang.ClassCastException: net.javacrumbs.shedlock.spring.SpringLockableTaskSchedulerFactoryBean cannot be cast to org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler"
Please help me with this.
You can easily do this with dlock. You simply do the following and add the registrar to your XML config.
Java Code
@TryLock(name = "doSomeWork", owner = "serviceA", lockFor = ONE_MINUTE)
public void doSomeWork() {
    //...
}
XML Config
<!-- A bean for the lock implementation. Note that there should be only one global implementation-->
<bean id="postgresLock" class="com.yusufaytas.dlock.jdbc.PostgresIntervalLock">
<constructor-arg type="javax.sql.DataSource" ref="lockDataSource"/>
</bean>
<!-- The lock gets auto-registered to the registrar -->
<bean id="lockRegistrar" class="com.yusufaytas.dlock.spring.IntervalLockRegistrar"/>

spring receive emails without xml (using annotations only)

I need to periodically check about 30 mailboxes and want to do this with annotations only. I know how to do it with XML config; it looks like this:
<mail:inbound-channel-adapter id="ImapAdapter"
store-uri="imaps://${login}:${pass}@${host}:993/inbox"
channel="testReceiveEmailChannel"
should-delete-messages="false"
should-mark-messages-as-read="true"
auto-startup="true"
java-mail-properties="javaMailProperties">
<int:poller fixed-delay="200"
time-unit="SECONDS"
task-executor="asyncTaskExecutor"/>
</mail:inbound-channel-adapter>
<int:channel id="testReceiveEmailChannel">
<int:interceptors>
<int:wire-tap channel="logger"/>
</int:interceptors>
</int:channel>
<int:service-activator input-channel="testReceiveEmailChannel"
ref="testMailReceiverService"
method="receive"/>
<bean id="testMailReceiverService" class="com.myproject.email.EmailReceiverService">
<property name="mailBox" value="${login}"/>
</bean>
<int:logging-channel-adapter id="logger" level="DEBUG"/>
I know that Spring 4+ has @InboundChannelAdapter, but I don't know how to use it. Actually I am new to Spring, so any help is very much appreciated!
You are looking in the right direction with @InboundChannelAdapter. If you take a proper look at the documentation, you'll see something like this:
@Bean
@InboundChannelAdapter(value = "testReceiveEmailChannel", poller = @Poller(fixedDelay = "200000", taskExecutor = "asyncTaskExecutor"))
public MessageSource<javax.mail.Message> mailMessageSource(MailReceiver mailReceiver) {
    MailReceivingMessageSource mailReceivingMessageSource = new MailReceivingMessageSource(mailReceiver);
    // other setters here
    return mailReceivingMessageSource;
}
Where MailReceiver is something like this:
@Bean
public MailReceiver imapMailReceiver(@Value("imaps://${login}:${pass}@${host}:993/inbox") String storeUrl) {
    ImapMailReceiver imapMailReceiver = new ImapMailReceiver(storeUrl);
    // other setters here
    return imapMailReceiver;
}
and so on with the other @Beans for the MessageChannel and a @ServiceActivator for your EmailReceiverService, for example:
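A rough sketch of those remaining beans (placed in the same @Configuration class; it assumes your EmailReceiverService#receive accepts the javax.mail.Message payload, which may differ from your actual signature):
// Rough sketch only: channel plus service activator mirroring the XML sample above.
import org.springframework.context.annotation.Bean;
import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.integration.channel.DirectChannel;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.MessageHandler;

@Bean
public MessageChannel testReceiveEmailChannel() {
    return new DirectChannel();
}

@Bean
@ServiceActivator(inputChannel = "testReceiveEmailChannel")
public MessageHandler emailMessageHandler(EmailReceiverService testMailReceiverService) {
    // hands the mail payload to your existing service, like the XML service-activator did
    return message -> testMailReceiverService.receive((javax.mail.Message) message.getPayload());
}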
Also consider the Spring Integration Java DSL as a tool for Java configuration.

Write a Spring batch custom item writer

I need to write a Spring batch custom item writer that uses a footer, but I can't use the delegate pattern.
Is there another way to write a Spring batch custom item writer?
Thank you in advance.
Create a custom ItemWriter that implements ItemStream (to manage restartability and footer writing) and override the following methods (a bare skeleton follows the list):
ItemWriter.write(List<? extends T> items): write the items and, while writing, perform the data calculation needed for the footer
ItemStream.update(ExecutionContext): save the footer data calculated in the write() method
ItemStream.open(ExecutionContext): restore previously saved footer data (after a restart)
ItemStream.close(): do the real footer writing (directly in your own writer or using a callback)
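A bare skeleton of that approach might look like this. It is only a sketch assuming Spring Batch 4-style interfaces; the generic item type and the counter used as footer data are placeholders, not code from the answer:
// Skeleton of an ItemStream-aware writer that carries footer data across restarts.
import java.util.List;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStream;
import org.springframework.batch.item.ItemWriter;

public class FooterAwareItemWriter<T> implements ItemWriter<T>, ItemStream {

    private long writtenCount; // footer data accumulated while writing

    @Override
    public void write(List<? extends T> items) throws Exception {
        // write each item to your output here, then update the footer data
        writtenCount += items.size();
    }

    @Override
    public void open(ExecutionContext executionContext) {
        // restore footer data after a restart
        if (executionContext.containsKey("writtenCount")) {
            writtenCount = executionContext.getLong("writtenCount");
        }
    }

    @Override
    public void update(ExecutionContext executionContext) {
        // save footer data at every commit
        executionContext.putLong("writtenCount", writtenCount);
    }

    @Override
    public void close() {
        // write the footer here (directly or via a callback), e.g. a trailer line with writtenCount
    }
}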
Basically you need to create a class that implements ItemWriter<YourModel> and FlatFileFooterCallback.
In the write method, define how the data will be written; in writeFooter, write the footer of the file.
Then declare your class as a bean and set it as the writer in your job, for example:
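A compact sketch of that shape, assuming a simple item count is all the footer needs (the class name, generic type, and footer text are illustrative):
// Sketch of a writer that also produces the file footer via FlatFileFooterCallback.
import java.io.IOException;
import java.io.Writer;
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileFooterCallback;

public class FooterCallbackItemWriter<T> implements ItemWriter<T>, FlatFileFooterCallback {

    private int itemCount;

    @Override
    public void write(List<? extends T> items) throws Exception {
        // write the items however your format requires, then update the footer data
        itemCount += items.size();
    }

    @Override
    public void writeFooter(Writer writer) throws IOException {
        // called once at the end of the file
        writer.write("TOTAL ITEMS: " + itemCount);
    }
}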
I've found a solution. I couldn't write a custom ItemWriter, but I created an output bean and overrode its toString method; in that method I build the file output exactly as needed. Then I created a PassThroughLineAggregator-based ItemWriter, which calls the toString method of the output bean. And that's all!
Here's the code:
MOH_Diaria_Bean_Out.java:
package es.lac.absis.batch.app.percai.domain;
import java.util.ArrayList;
import java.util.List;
public class MOH_Diaria_Bean_Out {
List<MOH_Diaria_Bean> listaBeans = new ArrayList<MOH_Diaria_Bean>();
public List<MOH_Diaria_Bean> getListaBeans() {
return listaBeans;
}
public void setListaBeans(List<MOH_Diaria_Bean> listaBeans) {
this.listaBeans = listaBeans;
}
public void add (MOH_Diaria_Bean bean){
listaBeans.add(bean);
}
@Override
public String toString() {
String salida="";
for (int j=0; j<listaBeans.size(); j++) {
MOH_Diaria_Bean bean = listaBeans.get(j);
salida = salida + bean.toString();
if (j<(listaBeans.size()-1)) {
salida = salida + "\n";
}
}
return salida;
}
}
ItemWriter:
<bean id="MOH_FusionadoFicheros_Writer" class="es.lac.absis.batch.arch.internal.writer.AbsisFlatFileItemWriter">
<property name="resource">
<bean class="es.lac.absis.batch.arch.internal.util.AbsisFileSystemResource">
<constructor-arg ref="filePCA00020"></constructor-arg>
</bean>
</property>
<property name="encoding" value="ISO8859_1"></property>
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.PassThroughLineAggregator">
</bean>
</property>
</bean>

How to process logically related rows after ItemReader in SpringBatch?

Scenario
To make it simple, let's suppose I have an ItemReader that returns me 25 rows.
The first 10 rows belong to student A
The next 5 belong to student B
and the 10 remaining belong to student C
I want to aggregate them together logically say by studentId and flatten them to end up with one row per student.
Problem
If I understand correctly, setting the commit interval to 5 will do the following:
Send 5 rows to the processor (which will aggregate them or apply any business logic I tell it to).
After processing, it will write those 5 rows.
Then it will do the same for the next 5 rows, and so on.
If that is true, then for the next five I will have to check the already written ones, read them back out, aggregate them with the ones I am currently processing, and write them again.
I personally do not like that.
What is the best practice to handle a situation like this in Spring Batch?
Alternative
Sometimes I feel that it is much easier to write a regular Spring JDBC main program, so that I have full control over what I want to do. However, I wanted to take advantage of the job repository, the state monitoring of the job, the ability to restart and skip, and the job and step listeners.
My Spring Batch Code
My module-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:batch="http://www.springframework.org/schema/batch"
xsi:schemaLocation="http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">
<description>Example job to get you started. It provides a skeleton for a typical batch application.</description>
<batch:job id="job1">
<batch:step id="step1" >
<batch:tasklet transaction-manager="transactionManager" start-limit="100" >
<batch:chunk reader="attendanceItemReader"
processor="attendanceProcessor"
writer="attendanceItemWriter"
commit-interval="10"
/>
</batch:tasklet>
</batch:step>
</batch:job>
<bean id="attendanceItemReader" class="org.springframework.batch.item.database.JdbcCursorItemReader">
<property name="dataSource">
<ref bean="sourceDataSource"/>
</property>
<property name="sql"
value="select s.student_name ,s.student_id ,fas.attendance_days ,fas.attendance_value from K12INTEL_DW.ftbl_attendance_stumonabssum fas inner join k12intel_dw.dtbl_students s on fas.student_key = s.student_key inner join K12INTEL_DW.dtbl_schools ds on fas.school_key = ds.school_key inner join k12intel_dw.dtbl_school_dates dsd on fas.school_dates_key = dsd.school_dates_key where dsd.rolling_local_school_yr_number = 0 and ds.school_code = ? and s.student_activity_indicator = 'Active' and fas.LOCAL_GRADING_PERIOD = 'G1' and s.student_current_grade_level = 'Gr 9' order by s.student_id"/>
<property name="preparedStatementSetter" ref="attendanceStatementSetter"/>
<property name="rowMapper" ref="attendanceRowMapper"/>
</bean>
<bean id="attendanceStatementSetter" class="edu.kdc.visioncards.preparedstatements.AttendanceStatementSetter"/>
<bean id="attendanceRowMapper" class="edu.kdc.visioncards.rowmapper.AttendanceRowMapper"/>
<bean id="attendanceProcessor" class="edu.kdc.visioncards.AttendanceProcessor" />
<bean id="attendanceItemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="file:target/outputs/passthrough.txt"/>
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.PassThroughLineAggregator" />
</property>
</bean>
</beans>
My supporting classes for the Reader.
A PreparedStatementSetter
package edu.kdc.visioncards.preparedstatements;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.springframework.jdbc.core.PreparedStatementSetter;
public class AttendanceStatementSetter implements PreparedStatementSetter {
public void setValues(PreparedStatement ps) throws SQLException {
ps.setInt(1, 7);
}
}
and a RowMapper
package edu.kdc.visioncards.rowmapper;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.springframework.jdbc.core.RowMapper;
import edu.kdc.visioncards.dto.AttendanceDTO;
public class AttendanceRowMapper<T> implements RowMapper<AttendanceDTO> {
public static final String STUDENT_NAME = "STUDENT_NAME";
public static final String STUDENT_ID = "STUDENT_ID";
public static final String ATTENDANCE_DAYS = "ATTENDANCE_DAYS";
public static final String ATTENDANCE_VALUE = "ATTENDANCE_VALUE";
public AttendanceDTO mapRow(ResultSet rs, int rowNum) throws SQLException {
AttendanceDTO dto = new AttendanceDTO();
dto.setStudentId(rs.getString(STUDENT_ID));
dto.setStudentName(rs.getString(STUDENT_NAME));
dto.setAttDays(rs.getInt(ATTENDANCE_DAYS));
dto.setAttValue(rs.getInt(ATTENDANCE_VALUE));
return dto;
}
}
My processor
package edu.kdc.visioncards;
import java.util.HashMap;
import java.util.Map;
import org.springframework.batch.item.ItemProcessor;
import edu.kdc.visioncards.dto.AttendanceDTO;
public class AttendanceProcessor implements ItemProcessor<AttendanceDTO, Map<Integer, AttendanceDTO>> {
private Map<Integer, AttendanceDTO> map = new HashMap<Integer, AttendanceDTO>();
public Map<Integer, AttendanceDTO> process(AttendanceDTO dto) throws Exception {
if(map.containsKey(new Integer(dto.getStudentId()))){
AttendanceDTO attDto = (AttendanceDTO)map.get(new Integer(dto.getStudentId()));
attDto.setAttDays(attDto.getAttDays() + dto.getAttDays());
attDto.setAttValue(attDto.getAttValue() + dto.getAttValue());
}else{
map.put(new Integer(dto.getStudentId()), dto);
}
return map;
}
}
My concerns about the code above
In the processor, I create a HashMap, and as I process the rows I check whether I already have that student in the map; if not, I add it. If it is already there, I grab it, take the values I am interested in, and add them to the row I am currently processing.
After that, the Spring Batch framework writes to a file according to my configuration.
My question is as follows:
I do not want it to go to the writer yet; I want to process all the remaining rows first. How do I keep this Map that I have created in memory for the next set of rows that need to go through this same processor? Every time a row is processed through AttendanceProcessor the Map is initialized. Should I put the Map initialization in a static block?
In my application I created a CollectingJdbcCursorItemReader that extends the standard JdbcCursorItemReader and does exactly what you need. Internally it uses my CollectingRowMapper: an extension of the standard RowMapper that maps multiple related rows to one object.
Here is the code of the ItemReader; the code of the CollectingRowMapper interface, together with an abstract implementation of it, is available in another answer of mine.
import java.sql.ResultSet;
import java.sql.SQLException;
import org.springframework.batch.item.ReaderNotOpenException;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.jdbc.core.RowMapper;
/**
 * A JdbcCursorItemReader that uses a {@link CollectingRowMapper}.
 * Like the superclass, this reader is not thread-safe.
 *
 * @author Pino Navato
 **/
public class CollectingJdbcCursorItemReader<T> extends JdbcCursorItemReader<T> {
private CollectingRowMapper<T> rowMapper;
private boolean firstRead = true;
/**
 * Accepts a {@link CollectingRowMapper} only.
 **/
@Override
public void setRowMapper(RowMapper<T> rowMapper) {
this.rowMapper = (CollectingRowMapper<T>)rowMapper;
super.setRowMapper(rowMapper);
}
/**
* Read next row and map it to item.
**/
@Override
protected T doRead() throws Exception {
if (rs == null) {
throw new ReaderNotOpenException("Reader must be open before it can be read.");
}
try {
if (firstRead) {
if (!rs.next()) { //Subsequent calls to next() will be executed by rowMapper
return null;
}
firstRead = false;
} else if (!rowMapper.hasNext()) {
return null;
}
T item = readCursor(rs, getCurrentItemCount());
return item;
}
catch (SQLException se) {
throw getExceptionTranslator().translate("Attempt to process next row failed", getSql(), se);
}
}
@Override
protected T readCursor(ResultSet rs, int currentRow) throws SQLException {
T result = super.readCursor(rs, currentRow);
setCurrentItemCount(rs.getRow());
return result;
}
}
You can use it just like the classic JdbcCursorItemReader: the only requirement is that you provide it a CollectingRowMapper instead of the classic RowMapper.
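The CollectingRowMapper code itself lives in the linked answer and is not reproduced here; judging only from the calls the reader above makes (rowMapper.hasNext()), its contract looks roughly like the following. This is an inference, not the author's actual interface:
// Inferred shape only: a RowMapper that consumes several related rows per mapRow()
// call and reports whether the cursor already holds the start of the next object.
import org.springframework.jdbc.core.RowMapper;

public interface CollectingRowMapper<T> extends RowMapper<T> {

    boolean hasNext();
}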
I always follow this pattern (a rough sketch follows the list):
I make my reader scope "step", and in @PostConstruct I fetch the results and put them in a Map.
In the processor, I convert the associated collection into a writable list and send the writable list.
In the ItemWriter, I persist the writable item(s), depending on the case.
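As a loose sketch of that pattern (StudentRecord and StudentDao are hypothetical stand-ins for your model and data access, not classes from the question):
// Loose sketch: prefetch and group everything up front, hand out one group per read().
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.annotation.PostConstruct;

import org.springframework.batch.item.ItemReader;

public class PrefetchingStudentReader implements ItemReader<List<StudentRecord>> {

    private final StudentDao studentDao;   // hypothetical DAO, not from the question
    private Iterator<List<StudentRecord>> groups;

    public PrefetchingStudentReader(StudentDao studentDao) {
        this.studentDao = studentDao;
    }

    @PostConstruct
    public void fetch() {
        // load everything once and group it by student id
        Map<String, List<StudentRecord>> byStudent = new LinkedHashMap<>();
        for (StudentRecord record : studentDao.findAll()) {
            byStudent.computeIfAbsent(record.getStudentId(), id -> new ArrayList<>()).add(record);
        }
        groups = byStudent.values().iterator();
    }

    @Override
    public List<StudentRecord> read() {
        // one item per student; the processor can flatten the list into a single row
        return groups.hasNext() ? groups.next() : null;
    }
}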
Because you changed your question, I am adding a new answer.
If the students are ordered, there is no need for a list/map: you can keep exactly one student object in the processor as the "current" one and aggregate on it until a new one arrives (read: the id changes).
If the students are not ordered, you will never know when a specific student is "finished", and you would have to keep all students in a map which can't be written until the end of the complete read sequence.
Beware:
the processor needs to know when the reader is exhausted
it's hard to get this working with an arbitrary commit rate and "id" concept; if you aggregate items that are somehow identical, the processor just can't know whether the currently processed item is the last one
Basically the use case is either solved completely at the reader level or at the writer level (see the other answer).
private SimpleItem currentItem;
private StepExecution stepExecution;
@Override
public SimpleItem process(SimpleItem newItem) throws Exception {
SimpleItem returnItem = null;
if (currentItem == null) {
currentItem = new SimpleItem(newItem.getId(), newItem.getValue());
} else if (currentItem.getId() == newItem.getId()) {
// aggregate somehow
String value = currentItem.getValue() + newItem.getValue();
currentItem.setValue(value);
} else {
// "clone"/copy currentItem
returnItem = new SimpleItem(currentItem.getId(), currentItem.getValue());
// replace currentItem
currentItem = newItem;
}
// reader exhausted?
if(stepExecution.getExecutionContext().containsKey("readerExhausted")
&& (Boolean)stepExecution.getExecutionContext().get("readerExhausted")
&& currentItem.getId() == stepExecution.getExecutionContext().getInt("lastItemId")) {
returnItem = new SimpleItem(currentItem.getId(), currentItem.getValue());
}
return returnItem;
}
Basically you are talking about batch processing with changing IDs (1), where the batch has to keep track of the change.
For Spring/Spring Batch this means:
an ItemWriter which checks the list of items for an id change
before the change, the items are stored in a temporary datastore (2) (List, Map, whatever) and are not written out
when the id changes, the aggregating/flattening business code runs on the items in the datastore and one item should be written; the datastore can then be reused for the next items with the next id
this concept needs a reader which tells the step "I'm exhausted" so the temporary datastore can be flushed properly once the items (file/database) run out
Here is a rough and simple code example:
@Override
public void write(List<? extends SimpleItem> items) throws Exception {
// setup with first sharedId at startup
if (currentId == null){
currentId = items.get(0).getSharedId();
}
// check for change of sharedId in input
// keep items in temporary dataStore until id change of input
// call delegate if there is an id change or if the reader is exhausted
for (SimpleItem item : items) {
// already known sharedId, add to tempData
if (item.getSharedId() == currentId) {
tempData.add(item);
} else {
// or new sharedId, write tempData, empty it, keep new id
// the delegate does the flattening/aggregating
delegate.write(tempData);
tempData.clear();
currentId = item.getSharedId();
tempData.add(item);
}
}
// check if reader is exhausted, flush tempData
if ((Boolean) stepExecution.getExecutionContext().get("readerExhausted")
&& tempData.size() > 0) {
delegate.write(tempData);
// optional delegate.clear();
}
}
(1) assuming the items are ordered by an ID (which can be composite too)
(2) a HashMap Spring bean, for thread safety
Use a StepExecutionListener and store the records as a map in the StepExecutionContext; you can then group them in the writer or a writer listener and write them all at once.
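A hedged sketch of that idea, with an illustrative key name; the processor and writer would look the map up through the step's ExecutionContext during the step:
// Sketch only: seed a shared map before the step, flush/group it after the reader is exhausted.
import java.util.HashMap;

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;

public class AggregatingStepListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // shared map the processor/writer can look up and fill during the step
        stepExecution.getExecutionContext().put("studentAggregates", new HashMap<String, Object>());
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        // by now the reader is exhausted: group/flush the collected records here
        // (or let a writer listener do it) and keep the step's exit status unchanged
        return stepExecution.getExitStatus();
    }
}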