I am creating a fixed-length file and have to append the number of records that were read to the footer. I need to access the StepExecution to get the write count, so I followed FlatFileFooterCallback - how to get access to StepExecution For Count, but the StepExecution is null. Why?
FlatFileFooterCallback
public class LexisNexisRequestFileFooter implements FlatFileFooterCallback {
@Value("#{StepExecution}")
private StepExecution stepExecution;
int totalItemsWritten = 0;
@Override
public void writeFooter(Writer writer) throws IOException {
System.out.println(stepExecution.getWriteCount());
String julianDate = createJulianDate();
String SAT = "##!!SAT#"+julianDate+totalItemsWritten+" \r\n";
String SIT = "##!!SIT#"+julianDate+totalItemsWritten+" \r\n";
String footer = SAT+SIT;
writer.write(footer);
}
}
Configuration file
@Bean
@StepScope
public FlatFileFooterCallback customFooterCallback() {
return new LexisNexisRequestFileFooter();
}
Writer file
// Create writer instance
FlatFileItemWriter<LexisNexisRequestRecord> writer = new FlatFileItemWriter<>();
LexisNexisRequestFileFooter lexisNexisRequestFileFooter = new LexisNexisRequestFileFooter();
writer.setFooterCallback(lexisNexisRequestFileFooter);
// Set output file location
writer.setResource(new FileSystemResource("homeData.txt"));
// All job repetitions should append to the same output file
writer.setAppendAllowed(true);
writer.setEncoding("ascii");
In your writer configuration, you are creating the footer callback manually here:
LexisNexisRequestFileFooter lexisNexisRequestFileFooter = new LexisNexisRequestFileFooter();
writer.setFooterCallback(lexisNexisRequestFileFooter);
and not injecting the step-scoped bean, so Spring never processes the @Value annotation and the stepExecution field stays null. Your item writer bean definition method should be something like:
@Bean
public FlatFileItemWriter<LexisNexisRequestRecord> writer() {
// Create writer instance
FlatFileItemWriter<LexisNexisRequestRecord> writer = new FlatFileItemWriter<>();
writer.setFooterCallback(customFooterCallback());
// Set output file location
writer.setResource(new FileSystemResource("homeData.txt"));
// All job repetitions should append to the same output file
writer.setAppendAllowed(true);
writer.setEncoding("ascii");
return writer;
}
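Note that the late-binding expression in the callback is normally written with a lowercase s, i.e. #{stepExecution}. Once the step-scoped bean is injected as above, Spring resolves that expression when the step runs. A minimal sketch of the callback (reusing the createJulianDate() helper and the record layout from the question):

public class LexisNexisRequestFileFooter implements FlatFileFooterCallback {

    // populated by Spring because the bean is declared @StepScope
    @Value("#{stepExecution}")
    private StepExecution stepExecution;

    @Override
    public void writeFooter(Writer writer) throws IOException {
        long totalItemsWritten = stepExecution.getWriteCount();
        String julianDate = createJulianDate();
        writer.write("##!!SAT#" + julianDate + totalItemsWritten + " \r\n");
        writer.write("##!!SIT#" + julianDate + totalItemsWritten + " \r\n");
    }
}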
Related
My Spring Batch configuration class has the following job:
public Job job(JobBuilderFactory jobBuilderFactory, StepBuilderFactory stepBuilderFactory) {
Step step = stepBuilderFactory.get("sampleSetp")
.<String, String>chunk(10)
.reader(new Reader())
.writer(new Writer())
.build();
}
And my Reader class is:
public class Reader implements ItemReader<String> {
@Override
public String read() {
FlatFileItemReader<String> reader = new FlatFileItemReader<String>();
reader.setResource(someResource);
reader.setLineMapper(lineMapper());
return reader; // This should be a String, but I have a FlatFileItemReader<String>.
}
}
I am doing some work on the reader (FlatFileItemReader) and want to return this reader to the job, but read() has a return type of String.
How should I send this reader to the job defined in BatchConfig?
NOTE: If I do this FlatFileItemReader setup in the Spring config file and pass it to the job there, it works.
ex:
Step step = stepBuilderFactory.get("sampleSetp")
.<String, String>chunk(10)
.reader(flatFileReader())
.writer(...)
.build();
where flatFileReader() is :
public FlatFileItemReader<String> flatFileReader(){
// ... build and return a FlatFileItemReader<String>
}
The read() method in my Spring Batch ItemReader class has a return type of String, but I want to return a FlatFileItemReader. How should I do it?
The read method should return the actual item (a String in your case) and not another reader. If you want a custom reader that delegates to a FlatFileItemReader, you can do something like:
public class Reader implements ItemReader<String> {
private FlatFileItemReader<String> delegate;
public Reader(FlatFileItemReader<String> delegate) {
this.delegate = delegate;
}
@Override
public String read() throws Exception {
return delegate.read();
}
}
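One caveat with this delegation approach: FlatFileItemReader is also an ItemStream, so something still has to open and close it. If the delegate is not registered on the step separately, one option is to implement ItemStreamReader and forward the lifecycle calls. A sketch, assuming the delegate is fully configured elsewhere:

public class Reader implements ItemStreamReader<String> {

    private final FlatFileItemReader<String> delegate;

    public Reader(FlatFileItemReader<String> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void open(ExecutionContext executionContext) {
        delegate.open(executionContext);   // opens the underlying file
    }

    @Override
    public void update(ExecutionContext executionContext) {
        delegate.update(executionContext); // lets the delegate save restart state
    }

    @Override
    public void close() {
        delegate.close();
    }

    @Override
    public String read() throws Exception {
        return delegate.read();
    }
}

Alternatively, keep the plain ItemReader shown above and register the delegate on the step with .stream(delegate) so that Spring Batch manages its lifecycle.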
In Spring Batch, I am trying to figure out how to generate the footer record that contains a count of the records written. I have two input files and they are aggregated into a single output file. Note that I am processing the input files in separate steps to filter out duplicates.
I got it to work with a custom FlatFileFooterCallback.
public class FooterCallback extends StepExecutionListenerSupport implements FlatFileFooterCallback {
private StepExecution stepExecution;
static private int totalCount = 0;
public void writeFooter(Writer writer) throws IOException {
int count = stepExecution.getWriteCount();
if (stepExecution.getStepName().equals("step1")) {
totalCount += count;
}
else { // last step
writer.write("T|" + (totalCount + count));
}
}
@Override
@BeforeStep
public void beforeStep(StepExecution stepExecution) {
this.stepExecution = stepExecution;
}
}
Then I added a call to setFooterCallback() on the writer.
@Bean
public ItemWriter<OutputDetailRecord> firstFileItemWriter() {
FlatFileItemWriter<OutputDetailRecord> itemWriter = new FlatFileItemWriter<>();
HeaderWriterCallback headerWriterCallback = new HeaderWriterCallback();
itemWriter.setHeaderCallback(headerWriterCallback);
itemWriter.setFooterCallback(footerCallback);
itemWriter.setResource(new FileSystemResource("/data/outputFile.txt"));
DelimitedLineAggregator<OutputDetailRecord> delimitedLineAggregator = new DelimitedLineAggregator<>();
delimitedLineAggregator.setDelimiter("|");
BeanWrapperFieldExtractor<OutputDetailRecord> extractor = new BeanWrapperFieldExtractor<>();
extractor.setNames(new String[] {
...
});
delimitedLineAggregator.setFieldExtractor(extractor);
itemWriter.setLineAggregator(delimitedLineAggregator);
return itemWriter;
}
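Here footerCallback is assumed to be a field of the configuration class, injected so that the writer and the step (below) share the same instance, for example:

@Autowired
private FooterCallback footerCallback;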
Then I added the listener() call to the step.
@Bean
public Step step1() {
return stepBuilderFactory.get("step1")
.<InputRecord1, OutputDetailRecord>chunk(10)
.listener(footerCallback)
.reader(firstFileItemReader())
.processor(firstFileItemProcessor())
.writer(firstFileItemWriter())
.build();
}
The second step in the job looks similar to the step above.
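For completeness, a sketch of that second step (the reader, processor, writer, and item-type names here are hypothetical) registers the same callback so its write count is added to the total when the footer is finally written:

@Bean
public Step step2() {
    return stepBuilderFactory.get("step2")
            .<InputRecord2, OutputDetailRecord>chunk(10)
            .listener(footerCallback)
            .reader(secondFileItemReader())
            .processor(secondFileItemProcessor())
            .writer(secondFileItemWriter())
            .build();
}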
Based on my research, I know that Spring Batch provides APIs for handling many different kinds of data file formats.
But I need clarification on how to supply multiple files of different formats to one chunk step / tasklet.
I know that MultiResourceItemReader can process multiple files, but AFAIK all of the files have to be of the same format and data structure.
So the question is: how can we supply multiple files of different data formats as input to one step?
Asoub is right: there is no out-of-the-box Spring Batch reader that "reads it all". However, with a handful of fairly simple and straightforward classes you can build a Java-config Spring Batch application that goes through different files with different file formats.
For one of my applications I had a similar use case, and I wrote a few simple implementations and extensions of the Spring Batch framework to create what I call a "generic" reader. To answer your question: below is the code I used to go through different kinds of file formats with Spring Batch. It is a stripped-down implementation, but it should get you going in the right direction.
One line is represented by a Record:
public class Record {
private Object[] columns;
public void setColumnByIndex(Object candidate, int index) {
columns[index] = candidate;
}
public Object getColumnByIndex(int index){
return columns[index];
}
public void setColumns(Object[] columns) {
this.columns = columns;
}
}
Each line contains multiple columns and the columns are separated by a delimiter. It does not matter if file1 contains 10 columns and/or if file2 only contains 3 columns.
The following reader simply maps each line to a record:
@Component
public class GenericReader {
@Autowired
private GenericLineMapper genericLineMapper;
@SuppressWarnings({ "unchecked", "rawtypes" })
public FlatFileItemReader reader(File file) {
FlatFileItemReader<Record> reader = new FlatFileItemReader();
reader.setResource(new FileSystemResource(file));
reader.setLineMapper((LineMapper) genericLineMapper.defaultLineMapper());
return reader;
}
}
The mapper takes a line and converts it to an array of objects:
@Component
public class GenericLineMapper {
@Autowired
private ApplicationConfiguration applicationConfiguration;
@SuppressWarnings({ "unchecked", "rawtypes" })
public DefaultLineMapper defaultLineMapper() {
DefaultLineMapper lineMapper = new DefaultLineMapper();
lineMapper.setLineTokenizer(tokenizer());
lineMapper.setFieldSetMapper(new CustomFieldSetMapper());
return lineMapper;
}
private DelimitedLineTokenizer tokenizer() {
DelimitedLineTokenizer tokenize = new DelimitedLineTokenizer();
tokenize.setDelimiter(Character.toString(applicationConfiguration.getDelimiter()));
tokenize.setQuoteCharacter(applicationConfiguration.getQuote());
return tokenize;
}
}
The "magic" of converting the columns to the record happens in the FieldSetMapper:
@Component
public class CustomFieldSetMapper implements FieldSetMapper<Record> {
@Override
public Record mapFieldSet(FieldSet fieldSet) throws BindException {
Record record = new Record();
Object[] row = new Object[fieldSet.getValues().length];
for (int i = 0; i < fieldSet.getValues().length; i++) {
row[i] = fieldSet.getValues()[i];
}
record.setColumns(row);
return record;
}
}
Using yaml configuration, the user provides an input directory and a list of file names, and of course the appropriate delimiter and the character used to quote a column if the column contains the delimiter. These settings are bound to the following properties class:
@Component
@ConfigurationProperties
public class ApplicationConfiguration {
private String inputDir;
private List<String> fileNames;
private char delimiter;
private char quote;
// getters and setters omitted
}
And then the application.yml:
input-dir: src/main/resources/
file-names: [yourfile1.csv, yourfile2.csv, yourfile3.csv]
delimiter: "|"
quote: "\""
And last but not least, putting it all together:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
private GenericReader genericReader;
@Autowired
private NoOpWriter noOpWriter;
@Autowired
private ApplicationConfiguration applicationConfiguration;
@Bean
public Job yourJobName() {
List<Step> steps = new ArrayList<>();
applicationConfiguration.getFileNames().forEach(f -> steps.add(loadStep(new File(applicationConfiguration.getInputDir() + f))));
return jobBuilderFactory.get("yourjobName")
.start(createParallelFlow(steps))
.end()
.build();
}
@SuppressWarnings("unchecked")
public Step loadStep(File file) {
return stepBuilderFactory.get("step-" + file.getName())
.<Record, Record> chunk(10)
.reader(genericReader.reader(file))
.writer(noOpWriter)
.build();
}
private Flow createParallelFlow(List<Step> steps) {
SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
// max multithreading = -1, no multithreading = 1, smart size = steps.size()
taskExecutor.setConcurrencyLimit(1);
List<Flow> flows = steps.stream()
.map(step -> new FlowBuilder<Flow>("flow_" + step.getName()).start(step).build())
.collect(Collectors.toList());
return new FlowBuilder<SimpleFlow>("parallelStepsFlow")
.split(taskExecutor)
.add(flows.toArray(new Flow[flows.size()]))
.build();
}
}
For demonstration purposes you can just put all the classes in one package. The NoOpWriter simply logs the 2nd column of my test files.
@Component
public class NoOpWriter implements ItemWriter<Record> {
@Override
public void write(List<? extends Record> items) throws Exception {
items.forEach(i -> System.out.println(i.getColumnByIndex(1)));
// NO - OP
}
}
Good luck :-)
I don't think there is an out-of-the-box Spring Batch reader for multiple input formats.
You'll have to build your own. Of course, you can reuse existing readers (such as FlatFileItemReader) as delegates in your custom file reader and, for each file type/format, use the right one.
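As a rough illustration of that delegation idea (all class and variable names here are made up, not an existing Spring Batch API), a custom reader could hold one pre-configured delegate per file and read them one after another:

public class MultiFormatItemReader implements ItemStreamReader<Object> {

    // delegates in the order the files should be read; each one is already
    // configured for its own file format (give each a distinct name via
    // setName(..) so their restart keys do not clash)
    private final List<ItemStreamReader<?>> delegates;
    private int current = 0;

    public MultiFormatItemReader(List<ItemStreamReader<?>> delegates) {
        this.delegates = delegates;
    }

    @Override
    public void open(ExecutionContext ctx) {
        delegates.forEach(d -> d.open(ctx));
    }

    @Override
    public void update(ExecutionContext ctx) {
        delegates.get(current).update(ctx);
    }

    @Override
    public void close() {
        delegates.forEach(ItemStream::close);
    }

    @Override
    public Object read() throws Exception {
        while (current < delegates.size()) {
            Object item = delegates.get(current).read();
            if (item != null) {
                return item;
            }
            current++; // current delegate is exhausted, move on to the next file
        }
        return null; // no more files, end of input
    }
}

Making this restartable (for example, storing the current index in the ExecutionContext) is left out of the sketch.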
I am trying to create an application that uses the spring-batch-excel extension to read Excel files uploaded through a web interface by its users, in order to parse the Excel file for addresses.
When the code runs, there is no error, but all I get is the following in my log, even though I have log/syso statements throughout my Processor and Writer. These are never called, so all I can imagine is that the file is not being read properly and no data is returned to process/write. And yes, the file has data, several thousand records in fact.
Job: [FlowJob: [name=excelFileJob]] launched with the following parameters: [{file=Book1.xlsx}]
Executing step: [excelFileStep]
Job: [FlowJob: [name=excelFileJob]] completed with the following parameters: [{file=Book1.xlsx}] and the following status: [COMPLETED]
Below is my JobConfig
@Configuration
@EnableBatchProcessing
public class AddressExcelJobConfig {
@Bean
public BatchConfigurer configurer(EntityManagerFactory entityManagerFactory) {
return new CustomBatchConfigurer(entityManagerFactory);
}
@Bean
Step excelFileStep(ItemReader<AddressExcel> excelAddressReader,
ItemProcessor<AddressExcel, AddressExcel> excelAddressProcessor,
ItemWriter<AddressExcel> excelAddressWriter,
StepBuilderFactory stepBuilderFactory) {
return stepBuilderFactory.get("excelFileStep")
.<AddressExcel, AddressExcel>chunk(1)
.reader(excelAddressReader)
.processor(excelAddressProcessor)
.writer(excelAddressWriter)
.build();
}
@Bean
Job excelFileJob(JobBuilderFactory jobBuilderFactory,
@Qualifier("excelFileStep") Step excelAddressStep) {
return jobBuilderFactory.get("excelFileJob")
.incrementer(new RunIdIncrementer())
.flow(excelAddressStep)
.end()
.build();
}
}
Below is my AddressExcelReader
The late binding works fine, there is no error. I have tried loading the resource given the file name, in addition to creating a new ClassPathResource and FileSystemResource. All are giving me the same results.
@Component
@StepScope
public class AddressExcelReader implements ItemReader<AddressExcel> {
private PoiItemReader<AddressExcel> itemReader = new PoiItemReader<AddressExcel>();
@Override
public AddressExcel read()
throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
return itemReader.read();
}
public AddressExcelReader(@Value("#{jobParameters['file']}") String file, StorageService storageService) {
//Resource resource = storageService.loadAsResource(file);
//Resource testResource = new FileSystemResource("upload-dir/Book1.xlsx");
itemReader.setResource(new ClassPathResource("/upload-dir/Book1.xlsx"));
itemReader.setLinesToSkip(1);
itemReader.setStrict(true);
itemReader.setRowMapper(excelRowMapper());
}
public RowMapper<AddressExcel> excelRowMapper() {
BeanWrapperRowMapper<AddressExcel> rowMapper = new BeanWrapperRowMapper<>();
rowMapper.setTargetType(AddressExcel.class);
return rowMapper;
}
}
Below is my AddressExcelProcessor
@Component
public class AddressExcelProcessor implements ItemProcessor<AddressExcel, AddressExcel> {
private static final Logger log = LoggerFactory.getLogger(AddressExcelProcessor.class);
@Override
public AddressExcel process(AddressExcel item) throws Exception {
System.out.println("Converting " + item);
log.info("Convert {}", item);
return item;
}
}
Again, this never comes into play (no logs are generated). And if it matters, this is how I'm launching my job from a FileUploadController, in a @PostMapping("/") handler that first stores the uploaded file and then runs the job:
@PostMapping("/")
public String handleFileUpload(@RequestParam("file") MultipartFile file, RedirectAttributes redirectAttributes) {
storageService.store(file);
try {
JobParameters jobParameters = new JobParametersBuilder()
.addString("file", file.getOriginalFilename().toString()).toJobParameters();
jobLauncher.run(job, jobParameters);
} catch (JobExecutionAlreadyRunningException | JobRestartException | JobInstanceAlreadyCompleteException
| JobParametersInvalidException e) {
e.printStackTrace();
}
redirectAttributes.addFlashAttribute("message",
"You successfully uploaded " + file.getOriginalFilename() + "!");
return "redirect:/";
}
And last but not least, here is my AddressExcel POJO:
import lombok.Data;
@Data
public class AddressExcel {
private String address1;
private String address2;
private String city;
private String state;
private String zip;
public AddressExcel() {}
}
UPDATE (10/13/2016)
Following Nghia Do's comments, I also created my own RowMapper instead of using the BeanWrapperRowMapper, to see if that was the issue. Still the same results.
public class AddressExcelRowMapper implements RowMapper<AddressExcel> {
@Override
public AddressExcel mapRow(RowSet rs) throws Exception {
AddressExcel temp = new AddressExcel();
temp.setAddress1(rs.getColumnValue(0));
temp.setAddress2(rs.getColumnValue(1));
temp.setCity(rs.getColumnValue(2));
temp.setState(rs.getColumnValue(3));
temp.setZip(rs.getColumnValue(4));
return temp;
}
}
It seems all I needed was to add the following to my ItemReader:
itemReader.afterPropertiesSet();
itemReader.open(new ExecutionContext());
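In the reader from the question, those two calls would go at the end of the constructor, roughly like this (a sketch that also restores the storageService lookup that was commented out; a cleaner long-term option is to implement ItemStreamReader and let the step open the delegate for you):

public AddressExcelReader(@Value("#{jobParameters['file']}") String file, StorageService storageService) {
    itemReader.setResource(storageService.loadAsResource(file));
    itemReader.setLinesToSkip(1);
    itemReader.setStrict(true);
    itemReader.setRowMapper(excelRowMapper());
    try {
        itemReader.afterPropertiesSet();          // validate the reader configuration
        itemReader.open(new ExecutionContext());  // open the workbook so read() returns rows
    } catch (Exception e) {
        throw new IllegalStateException("Could not open Excel file: " + file, e);
    }
}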
I want to read a text file to build a map and place it into the ExecutionContext for later reference.
I thought I would start out using chunk processing to read the file and then process it, but I don't need the FlatFileItemWriter to write to a file. However, bean initialization requires that I set a resource on the writer.
Am I going about this wrong? Is chunk processing the wrong approach? Creating a tasklet may be wiser, but I liked that Spring Batch would read my file for me. With a tasklet, I'd have to write the code to open and process the text file myself. Right?
Advice on how to proceed would be greatly appreciated.
What I wound up doing (I'm new) was to create a Tasklet and have it also implement the StepExecutionListener interface. Worked like a charm. It reads a comma-delimited file line by line, plucking out the second column. I created an enum for my ExecutionContext map keys. Basically, this:
public class ProcessTabcPermitsTasklet implements Tasklet, StepExecutionListener {
private Resource resource;
private int linesToSkip;
private Set<String> permits = new TreeSet<String>();
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
try (BufferedReader reader = new BufferedReader(new FileReader(resource.getFile()))) {
String line = null;
int lines = 0;
while ((line = reader.readLine()) != null) {
if (++lines <= linesToSkip)
continue;
String[] s = StringUtils.commaDelimitedListToStringArray(line);
permits.add(s[TABC_COLUMNS.PERMIT.ordinal()]);
}
}
return RepeatStatus.FINISHED;
}
/**
* @param resource
* the resource to set
*/
public void setResource(Resource resource) {
this.resource = resource;
}
/**
* @param linesToSkip
* the linesToSkip to set
*/
public void setLinesToSkip(int linesToSkip) {
this.linesToSkip = linesToSkip;
}
public ExitStatus afterStep(StepExecution stepExecution) {
stepExecution.getExecutionContext().put(EXECUTION_CONTEXT.TABC_PERMITS.toString(), permits);
return ExitStatus.COMPLETED;
}
}
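One detail worth noting: afterStep() puts the set into the step's ExecutionContext, so a later step will only see it once it has been promoted to the job's ExecutionContext. The usual way to do that is an ExecutionContextPromotionListener registered on the tasklet's step. A sketch in Java config (the step and tasklet bean names are hypothetical; the key reuses the enum above):

@Bean
public ExecutionContextPromotionListener promotionListener() {
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    listener.setKeys(new String[] { EXECUTION_CONTEXT.TABC_PERMITS.toString() });
    return listener;
}

@Bean
public Step processTabcPermitsStep(ProcessTabcPermitsTasklet tasklet) {
    return stepBuilderFactory.get("processTabcPermitsStep")
            .tasklet(tasklet)
            .listener(tasklet)              // so the tasklet's afterStep() is invoked
            .listener(promotionListener())  // promotes the permits set to the job context
            .build();
}

A later step can then read the set back from stepExecution.getJobExecution().getExecutionContext() under the same key.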