Reading from streams instead of files in spring batch itemReader - spring-batch

I am getting a CSV file from a web service call which needs to be loaded. Right now I am saving it in a temp directory so I can pass it to the reader via setResource.
Is there a way to provide the stream (byte[]) as-is instead of saving the file first?

The setResource method of the ItemReader takes an org.springframework.core.io.Resource as a parameter. This interface has a few out-of-the-box implementations, among which you can find org.springframework.core.io.InputStreamResource. Its constructor takes a java.io.InputStream, and java.io.ByteArrayInputStream is one such implementation.
So technically, yes, you can consume a byte[] parameter in an ItemReader.
Now, for how to actually do that, here are a few ideas:
1) Create your own FlatFileItemReader (since CSV is a flat file) and make it implement StepExecutionListener
public class CustomFlatFileItemReader<T> extends FlatFileItemReader<T> implements StepExecutionListener {
}
2) Override the beforeStep method, make your web service call within it and save the result in a field
private byte[] stream;

@Override
public void beforeStep(StepExecution stepExecution) {
    // your web service logic
    stream = yourWebservice.results();
}
3) Override the setResource method to pass this stream as the actual resource.
@Override
public void setResource(Resource resource) {
    // Convert the byte array to an input stream
    InputStream is = new ByteArrayInputStream(stream);
    // Wrap it in a Spring Resource
    InputStreamResource res = new InputStreamResource(is);
    // Set the resource
    super.setResource(res);
}
Also, if you don't want to call your web service within the ItemReader, you can simply store the byte array in the JobExecutionContext and retrieve it in the beforeStep method with stepExecution.getJobExecution().getExecutionContext().get("key");
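Putting those pieces together, a minimal sketch of that execution-context variant could look like the following (the "csvBytes" key is a placeholder; whatever step or launcher performs the web service call would have to store the byte array under that key first, and a LineMapper still needs to be configured on the reader as usual):
import java.io.ByteArrayInputStream;

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.core.io.InputStreamResource;

public class InMemoryCsvItemReader<T> extends FlatFileItemReader<T> implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // Fetch the raw CSV that an earlier step stored in the job execution context
        byte[] csvBytes = (byte[]) stepExecution.getJobExecution()
                .getExecutionContext()
                .get("csvBytes");
        // Expose it to the reader as an in-memory Resource, no temp file needed
        super.setResource(new InputStreamResource(new ByteArrayInputStream(csvBytes)));
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        return stepExecution.getExitStatus();
    }
}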

I am doing this right now with a FlatFileItemReader, reading a file from Google Storage. No need to extend anything:
@Bean
@StepScope
public FlatFileItemReader<MyDTO> itemReader(@Value("#{jobParameters['filename']}") String filename) {
    InputStream stream = googleStorageService.getInputStream(GoogleStorage.UPLOADS, filename);
    return new FlatFileItemReaderBuilder<MyDTO>()
            .name("myItemReader")
            .resource(new InputStreamResource(stream)) // InputStream here
            .delimited()
            .names(FIELDS)
            .lineMapper(lineMapper()) // Here it is mapped like a normal file
            .fieldSetMapper(new BeanWrapperFieldSetMapper<MyDTO>() {{
                setTargetType(MyDTO.class);
            }})
            .build();
}

Related

Spring Batch with ItemReader for WebService data input (Chunk Approach)

I have created a simple Spring Batch job and I have a chunk implemented as follows:
public class ChunksConfig {
    .......
    .......
    @Override
    protected ItemWriter<List<Pratica>> getItemWriter() {
        return praticaWriter;
    }

    @Override
    protected ItemReader<List<GaranziaRichiesta>> getItemReader() {
        return praticheReader;
    }

    @Override
    protected ItemProcessor<List<GaranziaRichiesta>, List<Pratica>> getItemProcessor() {
        return praticaProcessor;
    }
}
Now my reader should retrieve data from a SOAP web service and, on each call, retrieve 100 elements. I tried to implement my reader as follows:
@Component
public class PraticaReader implements ItemReader<List<GaranziaRichiesta>> {

    @Autowired
    SOAPServices soapServices;

    @Override
    public List<GaranziaRichiesta> read()
            throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        GaranzieRichiesteRequest request = new GaranzieRichiesteRequest();
        GaranzieRichiesteResponse response = soapServices.garanzieRichieste(request);
        return response.getGaranzie(); // this contains exactly 100 elements
    }
}
Now the problem is that I don't know how to set the chunk size and how to let Spring understand how many elements it is processing after the read operation.
In fact, if I set the chunk size to 1, the flow passes through reader -> processor -> writer correctly (but the job doesn't stop: after performing the write operation it starts reading again). Instead, if I set the chunk size to 100, the flow performs 100 read operations before arriving at the processor.
I am absolutely new to Spring Batch; I read the docs but 90% of the examples are about reading data from files. Can you suggest whether I can do a compliant chunk implementation for this particular case, or should I try another way?
Thank you

Reading file dynamically in spring-batch

I am trying to transfer files (video, txt, etc.) between different endpoints (PC, S3, Dropbox, Google Drive) over a network using spring-batch. For that, I am getting a JSON file containing a list of file locations (URLs) to be transferred (assume I can access those locations).
So, how do I tell the reader to read the input once my controller is hit (in which the job is created) and not at the time the Spring Boot application starts?
I have tried adding "spring.batch.job.enabled=false", which stops spring-batch from starting the job automatically, but my concern is where I should set the resource that will be provided to the ItemReader:
FlatFileItemReader<String> reader = new FlatFileItemReader<String>();
reader.setResource(someResource);
Because while setting the resource I am getting a NullPointerException.
The Running Jobs from within a Web Container section explains that with a code example. Here is an excerpt:
@Controller
public class JobLauncherController {
    @Autowired
    JobLauncher jobLauncher;
    @Autowired
    Job job;

    @RequestMapping("/jobLauncher.html")
    public void handle() throws Exception {
        jobLauncher.run(job, new JobParameters());
    }
}
In your case, you need to extract the file name from the request and pass it as a job parameter, something like:
@RequestMapping("/jobLauncher.html")
public void handle() throws Exception {
    URL url = // extract url from request
    JobParameters parameters = new JobParametersBuilder()
            .addString("url", url.toString())
            .toJobParameters();
    jobLauncher.run(job, parameters);
}
Then make your reader step-scoped and dynamically extract the file from job parameters:
@StepScope
@Bean
public FlatFileItemReader<String> flatFileItemReader(@Value("#{jobParameters['url']}") URL url) {
    return new FlatFileItemReaderBuilder<String>()
            .name("flatFileItemReader")
            .resource(new UrlResource(url))
            // set other properties
            .build();
}
This is explained in the Late Binding of Job and Step Attributes section.

FlatFileItemWriter write header only in case when data is present

I have a task to write a header to a file only if some data exists; in other words, if the reader returns nothing, the file created by the writer should be empty.
Unfortunately the FlatFileItemWriter implementation, in version 3.0.7, has only private fields and methods and a nested class that stores all the info about the writing process, so I cannot just override the write() method. I would need to copy-paste almost all of the content of FlatFileItemWriter to add a small piece of new functionality.
Any idea how to achieve this more elegantly in Spring Batch?
So, I finally found a more or less elegant solution.
The solution is to use LineAggregators; it seems that in the current implementation of FlatFileItemWriter this is the only approach you can use safely when inheriting from this class.
I use a separate line aggregator only for the header, but the solution can be extended to use multiple aggregators.
Also, in my case the header is just a predefined string, so by default I use a PassThroughLineAggregator that simply returns my string to the FlatFileItemWriter.
public class FlatFileItemWriterWithHeaderOnData extends FlatFileItemWriter {

    private LineAggregator lineAggregator;
    private LineAggregator headerLineAggregator = new PassThroughLineAggregator();
    private boolean applyHeaderAggregator = true;

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(headerLineAggregator, "A HeaderLineAggregator must be provided.");
        super.afterPropertiesSet();
    }

    @Override
    public void setLineAggregator(LineAggregator lineAggregator) {
        this.lineAggregator = lineAggregator;
        super.setLineAggregator(lineAggregator);
    }

    public void setHeaderLineAggregator(LineAggregator headerLineAggregator) {
        this.headerLineAggregator = headerLineAggregator;
    }

    @Override
    public void write(List items) throws Exception {
        if (applyHeaderAggregator) {
            LineAggregator initialLineAggregator = lineAggregator;
            super.setLineAggregator(headerLineAggregator);
            super.write(getHeaderItems());
            super.setLineAggregator(initialLineAggregator);
            applyHeaderAggregator = false;
        }
        super.write(items);
    }

    private List<String> getHeaderItems() throws ItemStreamException {
        // your actual implementation goes here
        return Arrays.asList("Id,Name,Details");
    }
}
PS. This solution assumes that if the write() method is called then some data exists.
Try this in your writer
writer.setShouldDeleteIfEmpty(true);
If you have no data, there is no file.
Otherwise, you write your header and your items.
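For reference, a minimal configuration sketch of this approach (the class name, resource path and header string below are illustrative, not from the question):
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.core.io.FileSystemResource;

public class HeaderOnlyWhenDataWriterConfig {

    public FlatFileItemWriter<String> reportWriter() {
        FlatFileItemWriter<String> writer = new FlatFileItemWriter<>();
        writer.setResource(new FileSystemResource("out/report.csv")); // illustrative path
        writer.setLineAggregator(new PassThroughLineAggregator<>());
        // The header is registered up front...
        writer.setHeaderCallback(w -> w.write("Id,Name,Details"));
        // ...but on close the file (header included) is deleted if no items were written
        writer.setShouldDeleteIfEmpty(true);
        return writer;
    }
}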
I'm thinking of an approach like the one below.
In beforeStep() (or a Tasklet), if there is no data at all, you set a flag such as "noData" to 'true'. Otherwise it will be 'false'.
And you have two writers, one with a header and another one without. In this case you can have a base writer that acts as a parent and then two writers that inherit from it. The only difference between them is that one has a header callback and the other doesn't.
Based on the flag, you can switch to either the 'writer with header' or the 'writer without header'; a sketch follows.
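A rough sketch of that switch, condensed into a single step-scoped bean instead of two subclasses (the "noData" key, bean name, path and header string are assumptions for illustration; an earlier step would have to store the flag in the job execution context):
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
public class ConditionalHeaderWriterConfig {

    @Bean
    @StepScope
    public FlatFileItemWriter<String> reportWriter(
            @Value("#{jobExecutionContext['noData']}") Boolean noData) {
        FlatFileItemWriter<String> writer = new FlatFileItemWriter<>();
        writer.setResource(new FileSystemResource("out/report.csv")); // illustrative path
        writer.setLineAggregator(new PassThroughLineAggregator<>());
        if (!Boolean.TRUE.equals(noData)) {
            // Only attach the header callback when the earlier step found data
            writer.setHeaderCallback(w -> w.write("Id,Name,Details"));
        }
        return writer;
    }
}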
Thanks,
Nghia

How to use BeanWrapperFieldSetMapper to map a subset of fields?

I have a Spring Batch application where BeanWrapperFieldSetMapper is used to map fields using a prototype object. However, the CSV file that is being read (via a FlatFileItemReader) contains one (indicator) field that determines the mapping of another field. If the indicator field has a value of Y, then the value of the other field should be mapped to property foo; otherwise it should be mapped to property bar.
I know that I can use a custom FieldSetMapper to do this, but then I have to code the mapping of all the other fields (of which there are quite a few). Alternatively, I could do this after reading via an ItemProcessor, but then my domain (prototype) object would have to have a property representing the indicator field (which I prefer not to do since it is not really part of the business domain).
Is it possible to perhaps use a custom FieldSetMapper to only map these custom fields and delegate the other mappings to BeanWrapperFieldSetMapper? Or is there some other, better way to solve this?
Here is my current attempt to use a custom FieldSetMapper and delegate to BeanWrapperFieldSetMapper:
public class DelegatedFieldSetMapper extends BeanWrapperFieldSetMapper<MyProtoClass> {

    @Override
    public MyProtoClass mapFieldSet(FieldSet fieldSet) throws BindException {
        String indicator = fieldSet.readString("indicator");
        Properties fieldProperties = fieldSet.getProperties();
        if (indicator.equalsIgnoreCase("y")) {
            fieldProperties.put("test.foo", fieldSet.readString("value"));
        } else {
            fieldProperties.put("test.bar", fieldSet.readString("value"));
        }
        fieldProperties.remove("indicator");
        Set<Object> keys = fieldProperties.keySet();
        List<String> names = new ArrayList<String>();
        List<String> values = new ArrayList<String>();
        for (Object key : keys) {
            names.add((String) key);
            values.add(fieldProperties.getProperty((String) key));
        }
        DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(names.toArray(new String[names.size()]), values.toArray(new String[values.size()]));
        return super.mapFieldSet(domainObjectFieldSet);
    }
}
However, a FlatFileParseException is thrown. The relevant parts of the batch config class are as follows:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Value("${file}")
    private File file;

    @Bean
    @Scope("prototype")
    public MyProtoClass myProtoClass() {
        return new MyProtoClass();
    }

    @Bean
    public ItemReader<MyProtoClass> reader(LineMapper<MyProtoClass> lineMapper) {
        FlatFileItemReader<MyProtoClass> flatFileItemReader = new FlatFileItemReader<MyProtoClass>();
        flatFileItemReader.setResource(new FileSystemResource(file));
        final int NUMBER_OF_HEADER_LINES = 1;
        flatFileItemReader.setLinesToSkip(NUMBER_OF_HEADER_LINES);
        flatFileItemReader.setLineMapper(lineMapper);
        return flatFileItemReader;
    }

    @Bean
    public LineMapper<MyProtoClass> lineMapper(LineTokenizer lineTokenizer, FieldSetMapper<MyProtoClass> fieldSetMapper) {
        DefaultLineMapper<MyProtoClass> lineMapper = new DefaultLineMapper<MyProtoClass>();
        lineMapper.setLineTokenizer(lineTokenizer);
        lineMapper.setFieldSetMapper(fieldSetMapper);
        return lineMapper;
    }

    @Bean
    public LineTokenizer lineTokenizer() {
        DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
        lineTokenizer.setNames(new String[] {"value", "test.bar", "test.foo", "indicator"});
        return lineTokenizer;
    }

    @Bean
    public FieldSetMapper<MyProtoClass> fieldSetMapper(PropertyEditor emptyStringToNullPropertyEditor) {
        BeanWrapperFieldSetMapper<MyProtoClass> fieldSetMapper = new DelegatedFieldSetMapper();
        fieldSetMapper.setPrototypeBeanName("myProtoClass");
        Map<Class<String>, PropertyEditor> customEditors = new HashMap<Class<String>, PropertyEditor>();
        customEditors.put(String.class, emptyStringToNullPropertyEditor);
        fieldSetMapper.setCustomEditors(customEditors);
        return fieldSetMapper;
    }
}
Finally, the CSV flat file looks like this:
value,bar,foo,indicator
abc,,,y
xyz,,,n
Let's say that BatchWorkObject is the class to be mapped.
Here's sample code in Spring Boot style that only needs your custom logic to be added.
new BeanWrapperFieldSetMapper<BatchWorkObject>() {
    {
        this.setTargetType(BatchWorkObject.class);
    }

    @Override
    public BatchWorkObject mapFieldSet(FieldSet fs) throws BindException {
        BatchWorkObject tmp = super.mapFieldSet(fs);
        // your custom code here
        return tmp;
    }
};
The code actually accomplishes what is desired except for one issue that results in the FlatFileParseException. The DelegatedFieldSetMapper contains the issue as follows:
DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(names.toArray(new String[names.size()]), values.toArray(new String[values.size()]));
To resolve, change to:
DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(values.toArray(new String[values.size()]), names.toArray(new String[names.size()]));
Write your own FieldSetMapper with a set of prepared delegates inside.
Those delegates are pre-built for every different kind of field mapping.
In your mapper, route to the correct delegate based on the indicator field (with a Classifier, for example); see the sketch below.
I can't see any other way, but this solution is quite easy and straightforward to maintain.
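A minimal sketch of that routing idea, using a plain conditional in place of a Classifier and assuming two pre-built delegates (the class name and delegate fields are hypothetical; MyProtoClass and the "indicator" field come from the question, and the delegate setup itself is left out):
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class IndicatorRoutingFieldSetMapper implements FieldSetMapper<MyProtoClass> {

    // One pre-built delegate per mapping variant, wired in from the batch configuration
    private final FieldSetMapper<MyProtoClass> fooMapper;
    private final FieldSetMapper<MyProtoClass> barMapper;

    public IndicatorRoutingFieldSetMapper(FieldSetMapper<MyProtoClass> fooMapper,
                                          FieldSetMapper<MyProtoClass> barMapper) {
        this.fooMapper = fooMapper;
        this.barMapper = barMapper;
    }

    @Override
    public MyProtoClass mapFieldSet(FieldSet fieldSet) throws BindException {
        // Route the whole record to the delegate that knows this mapping
        return "y".equalsIgnoreCase(fieldSet.readString("indicator"))
                ? fooMapper.mapFieldSet(fieldSet)
                : barMapper.mapFieldSet(fieldSet);
    }
}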
Processing based on the input format/data can be done using a custom implementation of ItemProcessor which either changes values in the same entity (that was populated by the ItemReader) or creates a new output entity.

Spring Batch Custom Writer with FooterCallback and Header callback

I am trying to add a header and footer in a custom writer by implementing the header callback and footer callback in my custom writer class.
The write method is successful, but writeHeader and writeFooter are not called.
public class CustomOAFileItemWriter extends StepExecutionListenerSupport implements ItemWriter<OAExtract>, FlatFileHeaderCallback, FlatFileFooterCallback {

    public void write(List<? extends OAExtract> oaExtractList) throws Exception {
        FileOutputStream fs = new FileOutputStream("C:\\archivedFiles\\out.bin");
    }

    public void writeHeader(Writer writer) throws IOException {
        System.out.println("Writing Header record");
    }

    public void writeFooter(Writer writer) throws IOException {
        System.out.println("Writing Footer record");
    }
}
Can someone with Spring batch experience help me with this?
Thanks,
Rai
Your solution is the opposite of the Spring Batch philosophy: reuse and delegation. And you are using neither of them.
You don't need a custom ItemWriter; instead:
Create a FlatFileItemWriter with your custom header/footer callbacks (see the sketch below)
Create the listener you want (I see you extend StepExecutionListenerSupport) and attach it to your step.
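A minimal sketch of the first point (the class name, PassThroughLineAggregator and header/footer strings are placeholders; OAExtract and the output path come from the question):
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.core.io.FileSystemResource;

public class OaExtractWriterConfig {

    public FlatFileItemWriter<OAExtract> oaFileItemWriter() {
        FlatFileItemWriter<OAExtract> writer = new FlatFileItemWriter<>();
        writer.setResource(new FileSystemResource("C:/archivedFiles/out.bin"));
        writer.setLineAggregator(new PassThroughLineAggregator<>());
        // Invoked once when the writer is opened
        writer.setHeaderCallback(w -> w.write("Writing Header record"));
        // Invoked once just before the writer is closed
        writer.setFooterCallback(w -> w.write("Writing Footer record"));
        return writer;
    }
}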
If you look at the source code of FlatFileItemWriter you will see that it calls the header callback method in doOpen() and the footer callback method in doClose(). Since you're not making use of the standard FlatFileItemWriter, you would have to write explicit code just like that in the FlatFileItemWriter.
http://grepcode.com/file/repo1.maven.org/maven2/org.springframework.batch/spring-batch-infrastructure/3.0.1.RELEASE/org/springframework/batch/item/file/FlatFileItemWriter.java#FlatFileItemWriter