Spring Batch reading from AWS S3: stepExecutionContext is null

I am trying to read files from AWS S3 using Spring Batch, but the file name becomes null in the stepExecutionContext. The same code worked when I read the files from a Windows mount, but after migrating the code to read from S3, the file name is null.
@Bean
@JobScope
public CustomMultiResourcePartitioner partitioner() {
    CustomMultiResourcePartitioner partitioner = new CustomMultiResourcePartitioner();
    Set<String> filesToProcess = fileRepository.findAllFilesByFileState("NEW");
    List<Resource> resourceList = new ArrayList<>();
    for (String file : filesToProcess) {
        Resource resource = getS3Resource(file);
        resourceList.add(resource);
        log.info("resourceList size: " + resourceList.size());
    }
    if (!resourceList.isEmpty()) {
        // 'resources' is a field of the enclosing configuration class
        resources = resourceList.stream().toArray(Resource[]::new);
        // note: this local ExecutionContext is never handed to anything
        ExecutionContext executionContext = new ExecutionContext();
        executionContext.put("FILE_NAME", filesToProcess);
    } else {
        resources = new Resource[0];
    }
    partitioner.setResources(resources);
    return partitioner;
}
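For context, a CustomMultiResourcePartitioner along these lines creates one partition per resource and stores each resource's filename in the step execution context under the fileName key. The class is not shown in the question, so this sketch is an assumption based on the common Spring Batch partitioning example, but it shows exactly where a null getFilename() would surface:
import java.util.HashMap;
import java.util.Map;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.core.io.Resource;

public class CustomMultiResourcePartitioner implements Partitioner {
    private static final String KEY_NAME = "fileName";
    private Resource[] resources = new Resource[0];

    public void setResources(Resource[] resources) {
        this.resources = resources;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> map = new HashMap<>(gridSize);
        int i = 0;
        for (Resource resource : resources) {
            ExecutionContext context = new ExecutionContext();
            // If resource.getFilename() returns null, the step-scoped reader sees null.
            context.putString(KEY_NAME, resource.getFilename());
            map.put("partition" + i, context);
            i++;
        }
        return map;
    }
}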
@Bean
@StepScope
public FlatFileItemReader<RosterInput> itemReader(@Value("#{stepExecutionContext[fileName]}") String filename) throws UnexpectedInputException, ParseException {
    FlatFileItemReader<RosterInput> reader = new FlatFileItemReader<RosterInput>();
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setStrict(false);
    Resource resource = getS3Resource(filename);
    reader.setResource(resource);
    // ... line mapper configuration omitted in the original snippet
    return reader;
}

I was using a ByteArrayResource in my getS3Resource() method, and that was causing the file name to be null (ByteArrayResource does not override getFilename(), which returns null by default). After modifying the code to use the class below, the problem is solved.
public class FileNameAwareByteArrayResource extends ByteArrayResource {
    private String fileName;

    public FileNameAwareByteArrayResource(String fileName, byte[] byteArray) {
        super(byteArray);
        this.fileName = fileName;
    }

    @Override
    public String getFilename() {
        return fileName;
    }
}
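For completeness, a minimal sketch of what getS3Resource() could look like with this wrapper. The AmazonS3 client (AWS SDK v1) and the bucketName field are assumptions, since the question does not show the method; IOUtils here is com.amazonaws.util.IOUtils:
private Resource getS3Resource(String key) {
    // Read the object fully and wrap the bytes so getFilename() returns the S3 key.
    try (S3Object object = amazonS3.getObject(bucketName, key)) {
        byte[] content = IOUtils.toByteArray(object.getObjectContent());
        return new FileNameAwareByteArrayResource(key, content);
    } catch (IOException e) {
        throw new UncheckedIOException("Failed to read s3://" + bucketName + "/" + key, e);
    }
}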

Related

Spring Batch MultiResourceItemReader with non-FlatFileItemReader

Current flow:
1. BatchItemReader implements ItemReader<List<SingleJsonRowInput>>
2. BatchItemProcessor implements ItemProcessor<List<SingleJsonRowInput>, List<String>>
3. BatchItemWriter implements ItemWriter<List<String>>
The input is a text file where each row represents a JSON document. Currently the program runs well with a single file. I would like to use MultiResourceItemReader, but since my reader doesn't implement ResourceAwareItemReaderItemStream, it cannot be used as the delegate of a MultiResourceItemReader. I tried:
1. Implementing ResourceAwareItemReaderItemStream
2. Changing my reader to be a FlatFileItemReader, as advised here:
Spring Batch: How to setup a FlatFileItemReader to read a json file?
but failed to do so.
Reader:
public class BatchItemReader implements ItemReader<List<SingleJsonRowInput>> {
    private int count = 0;
    private FileManager fileManager;
    private Gson gson = new Gson();

    public BatchItemReader(FileManager fileManager) {
        this.fileManager = fileManager;
    }

    public List<SingleJsonRowInput> read() {
        return readLine();
    }

    private List<SingleJsonRowInput> readLine() {
        List<String> result = fileManager.readTextJsonFile("C:\\Users\\orenl\\Desktop\\small.json");
        List<SingleJsonRowInput> singles = new LinkedList<>();
        SingleJsonRowInput singleJsonRowInput = null;
        for (String line : result) {
            System.out.println("#### Reading line: " + line);
            singleJsonRowInput = gson.fromJson(line, SingleJsonRowInput.class);
            singles.add(singleJsonRowInput);
        }
        if (count > 5) {
            return null;
        }
        count++;
        return singles;
    }
}
MultiResourceItemReader:
@Bean
public MultiResourceItemReader<SingleJsonRowInput> multiResourceItemReader() {
    Resource[] resources = new Resource[]{ new FileSystemResource("small.json") };
    MultiResourceItemReader<SingleJsonRowInput> multiResourceItemReader = new MultiResourceItemReader<>();
    multiResourceItemReader.setResources(resources);
    multiResourceItemReader.setDelegate(new FlatFileItemReader<>());
    return multiResourceItemReader;
}
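For context, MultiResourceItemReader hands each Resource to its delegate through the ResourceAwareItemReaderItemStream interface, and FlatFileItemReader already implements it. Below is a minimal sketch (not a tested fix) of a delegate that parses one JSON document per line, reusing the SingleJsonRowInput and Gson names from the question:
import com.google.gson.Gson;
import org.springframework.batch.item.file.FlatFileItemReader;

public class JsonLineItemReader extends FlatFileItemReader<SingleJsonRowInput> {
    public JsonLineItemReader() {
        Gson gson = new Gson();
        // Each line of the input file is one complete JSON document; map it to the bean.
        setLineMapper((line, lineNumber) -> gson.fromJson(line, SingleJsonRowInput.class));
    }
}
Passing new JsonLineItemReader() to setDelegate() would then let MultiResourceItemReader open each resource in turn; note the items become single SingleJsonRowInput objects rather than lists, so the processor and writer generics would need to change accordingly.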

How to get the original message after an error handler runs and write it to a file

I've been building a Spring Integration email service using the Java DSL.
The service must have a recovery policy that retries sending the emails, but I'm not having any success.
A brief story: the application receives a Payload and Headers and tries to send them to the email server. It tries 3 times and, in case of failure, creates a new file with the Header and Body of the message.
How could I get the original Message (Headers and Payload) and write the pair to a JSON file when sending the email fails?
Thanks.
These are my beans and the service:
/**
 * #################
 * MESSAGE ENDPOINTS
 * #################
 */
@Bean(name = PollerMetadata.DEFAULT_POLLER)
public PollerMetadata poller() {
    return Pollers
            .fixedRate(NumberUtils.createLong(QUEUE_RATE))
            .maxMessagesPerPoll(NumberUtils.createLong(QUEUE_CAPACITY))
            .errorHandler(e -> LOG.error("Exception : " + e.getMessage()))
            .get();
}

@Bean
public MessageChannel recoveryChannel() {
    return MessageChannels.direct().get();
}

@MessagingGateway
public static interface MailService {
    @Gateway(requestChannel = "mail.input")
    void sendMail(String body, @Headers Map<String, String> headers);
}

@Bean
public RetryPolicy retryPolicy() {
    final Map<Class<? extends Throwable>, Boolean> map =
            new HashMap<Class<? extends Throwable>, Boolean>() {
                {
                    put(MailSendException.class, true);
                    put(RuntimeException.class, true);
                }
                private static final long serialVersionUID = -1L;
            };
    final RetryPolicy ret = new SimpleRetryPolicy(3, map, true);
    return ret;
}

@Bean
public RetryTemplate retryTemplate() {
    final RetryTemplate ret = new RetryTemplate();
    ret.setRetryPolicy(retryPolicy());
    ret.setThrowLastExceptionOnExhausted(false);
    return ret;
}

@Bean
public Advice retryAdvice() {
    final RequestHandlerRetryAdvice advice = new RequestHandlerRetryAdvice();
    advice.setRetryTemplate(retryTemplate());
    RecoveryCallback<Object> recoveryCallBack = new ErrorMessageSendingRecoverer(recoveryChannel());
    advice.setRecoveryCallback(recoveryCallBack);
    return advice;
}

private MailSendingMessageHandlerSpec mailOutboundAdapter() {
    MailSendingMessageHandlerSpec msmhs =
            Mail.outboundAdapter(emailServerHost())
                    .port(serverPort())
                    .credentials(MAIL_USER_NAME, MAIL_PASSWORD)
                    .protocol(emailProtocol())
                    .javaMailProperties(p -> p
                            .put("mail.debug", "true")
                            .put("mail.smtp.ssl.enable", enableSSL())
                            .put("mail.smtp.connectiontimeout", 5000)
                            .put("mail.smtp.timeout", 5000));
    return msmhs;
}

@Bean
public FileWritingMessageHandler fileOutboundAdapter() {
    FileWritingMessageHandler fwmhs = Files
            .outboundAdapter(new File("logs/errors/"))
            .autoCreateDirectory(true)
            .get();
    return fwmhs;
}

/**
 * ################
 * FLOWS
 * ################
 */
@Bean
public IntegrationFlow smtp() {
    return IntegrationFlows.from("mail.input")
            .channel(MessageChannels.queue())
            .handle(this.mailOutboundAdapter(),
                    e -> e.id("smtpOut")
                            .advice(retryAdvice()))
            .get();
}

@Bean
public IntegrationFlow errorFlow() {
    return IntegrationFlows.from(recoveryChannel())
            .transform(Transformers.toJson())
            .enrichHeaders(c -> c.header(FileHeaders.FILENAME, "emailErrors"))
            .handle(this.fileOutboundAdapter())
            .get();
}
}
The error message has a MessagingException payload, which has two properties: cause and failedMessage.
The failed message is the message at the point of failure, with its headers and payload.
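Building on that answer, the errorFlow above could unwrap the failed message before serializing, so the written JSON carries both the original headers and payload. A minimal sketch under the question's bean names; the Map wrapper and its keys are illustrative, not an established API:
@Bean
public IntegrationFlow errorFlow() {
    return IntegrationFlows.from(recoveryChannel())
            // The ErrorMessage payload is a MessagingException; pull out the original message.
            .transform(MessagingException.class, ex -> {
                Message<?> failed = ex.getFailedMessage();
                Map<String, Object> wrapper = new HashMap<>();
                wrapper.put("headers", failed.getHeaders());
                wrapper.put("payload", failed.getPayload());
                return wrapper;
            })
            .transform(Transformers.toJson())
            .enrichHeaders(c -> c.header(FileHeaders.FILENAME, "emailErrors"))
            .handle(this.fileOutboundAdapter())
            .get();
}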

Spring Bean Scope for StringRedisConnection

I have the following two bean definitions for Spring Data Redis. I can't seem to find the relevant documentation to determine the appropriate scopes (singleton, request, or session) of these beans for a web app.
@Bean
public StringRedisTemplate redisTemplate() throws Exception {
    StringRedisTemplate redisTemplate = new StringRedisTemplate();
    redisTemplate.setConnectionFactory(jedisConnectionFactory());
    return redisTemplate;
}

@Bean
public StringRedisConnection stringRedisConnection() throws Exception {
    return new DefaultStringRedisConnection(redisTemplate().getConnectionFactory().getConnection());
}
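For what it's worth, RedisTemplate (and therefore StringRedisTemplate) is documented as thread-safe once configured, so singleton scope is the usual choice for the template. The connection bean is the questionable one: it checks out a single native connection at startup and shares it for the life of the context. Letting the singleton template borrow a connection per operation sidesteps the scoping question entirely; a minimal sketch:
// The template is a thread-safe singleton; each execute() call borrows a
// connection from the factory and releases it when the callback returns.
String pong = redisTemplate().execute((RedisConnection connection) -> connection.ping(), true);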
Thanks to @Christoph Strobl's recommendation, here is the implementation I am currently using:
public List<String> testAutoComplete(String key, String query, int limitCount) {
    StringRedisSerializer serializer = new StringRedisSerializer();
    RedisZSetCommands.Range range = Range.range();
    range.gt(query);
    RedisZSetCommands.Limit limit = new RedisZSetCommands.Limit();
    limit.count(limitCount);
    return template.execute(new RedisCallback<List<String>>() {
        public List<String> doInRedis(RedisConnection connection) {
            Set<byte[]> results = connection.zRangeByLex(serializer.serialize(key), range, limit);
            List<String> resultAsString = new ArrayList<String>();
            for (byte[] result : results) {
                resultAsString.add(serializer.deserialize(result));
            }
            return resultAsString;
        }
    }, false);
}
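For illustration, a usage sketch of the method above; the key and query values are made up, and it assumes the sorted set's members share the same score so ZRANGEBYLEX semantics apply:
// Return at most 10 entries lexicographically greater than "par".
List<String> suggestions = testAutoComplete("autocomplete:cities", "par", 10);
suggestions.forEach(System.out::println);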

Spring Batch: FlatFileItemWriter header never called

I have a weird issue with my FlatFileItemWriter callbacks.
I have a custom ItemWriter implementing both FlatFileFooterCallback and FlatFileHeaderCallback. Consequently, I set the header and footer callbacks on my FlatFileItemWriter like this:
ItemWriter Bean
@Bean
@StepScope
public ItemWriter<CityItem> writer(FlatFileItemWriter<CityProcessed> flatWriter, @Value("#{jobExecutionContext[inputFile]}") String inputFile) {
    CityItemWriter itemWriter = new CityItemWriter();
    flatWriter.setHeaderCallback(itemWriter);
    flatWriter.setFooterCallback(itemWriter);
    itemWriter.setDelegate(flatWriter);
    itemWriter.setInputFileName(inputFile);
    return itemWriter;
}
FlatFileItemWriter Bean
@Bean
@StepScope
public FlatFileItemWriter<CityProcessed> flatFileWriterArchive(@Value("#{jobExecutionContext[outputFileArchive]}") String outputFile) {
    FlatFileItemWriter<CityProcessed> flatWriter = new FlatFileItemWriter<CityProcessed>();
    FileSystemResource isr;
    isr = new FileSystemResource(new File(outputFile));
    flatWriter.setResource(isr);
    DelimitedLineAggregator<CityProcessed> aggregator = new DelimitedLineAggregator<CityProcessed>();
    aggregator.setDelimiter(";");
    BeanWrapperFieldExtractor<CityProcessed> beanWrapper = new BeanWrapperFieldExtractor<CityProcessed>();
    beanWrapper.setNames(new String[]{
            "country", "name", "population", "popUnder25", "pop25To50", "pop50to75", "popMoreThan75"
    });
    aggregator.setFieldExtractor(beanWrapper);
    flatWriter.setLineAggregator(aggregator);
    flatWriter.setEncoding("ISO-8859-1");
    return flatWriter;
}
Step Bean
@Bean
public Step stepImport(StepBuilderFactory stepBuilderFactory, ItemReader<CityFile> reader, ItemWriter<CityItem> writer, ItemProcessor<CityFile, CityItem> processor,
        @Qualifier("flatFileWriterArchive") FlatFileItemWriter<CityProcessed> flatFileWriterArchive, ExecutionContextPromotionListener executionContextListener) {
    return stepBuilderFactory.get("stepImport").<CityFile, CityItem> chunk(10).reader(reader(null)).processor(processor).writer(writer).stream(flatFileWriterArchive)
            .listener(executionContextListener).build();
}
I have the classic content in my writeFooter, writeHeader and write methods.
ItemWriter code
public class CityItemWriter implements ItemWriter<CityItem>, FlatFileFooterCallback, FlatFileHeaderCallback, ItemStream {

    private FlatFileItemWriter<CityProcessed> writer;
    // Renamed from totalUnknown/totalSup10000/totalInf10000 so the footer code below compiles.
    private static int nbUnknown = 0;
    private static int nbSup10000 = 0;
    private static int nbInf10000 = 0;
    private String inputFileName = "-";

    public void setDelegate(FlatFileItemWriter<CityProcessed> delegate) {
        writer = delegate;
    }

    public void setInputFileName(String name) {
        inputFileName = name;
    }

    private Predicate<String> isNullValue() {
        return p -> p == null;
    }

    @Override
    public void write(List<? extends CityItem> cities) throws Exception {
        List<CityProcessed> citiesCSV = new ArrayList<>();
        for (CityItem item : cities) {
            String populationAsString = "";
            String less25AsString = "";
            String more25AsString = "";
            /*
             * Some processing to get total Unknown/Sup 10000/Inf 10000
             * and other data
             */
            // Write in CSV file
            CityProcessed cre = new CityProcessed();
            cre.setCountry(item.getCountry());
            cre.setName(item.getName());
            cre.setPopulation(populationAsString);
            cre.setLess25(less25AsString);
            cre.setMore25(more25AsString);
            citiesCSV.add(cre);
        }
        writer.write(citiesCSV);
    }

    @Override
    public void writeFooter(Writer fileWriter) throws IOException {
        String newLine = "\r\n";
        String totalUnknown = "Subtotal:;Unknown;" + String.valueOf(nbUnknown) + newLine;
        String totalSup10000 = ";Sum Sup 10000;" + String.valueOf(nbSup10000) + newLine;
        String totalInf10000 = ";Sum Inf 10000;" + String.valueOf(nbInf10000) + newLine;
        String total = "Total:;;" + String.valueOf(nbSup10000 + nbInf10000 + nbUnknown) + newLine;
        fileWriter.write(newLine);
        fileWriter.write(totalUnknown);
        fileWriter.write(totalSup10000);
        fileWriter.write(totalInf10000);
        fileWriter.write(total);
    }

    @Override
    public void writeHeader(Writer fileWriter) throws IOException {
        String newLine = "\r\n";
        String firstLine = "FILE PROCESSED ON: ;" + new SimpleDateFormat("MM/dd/yyyy").format(new Date()) + newLine;
        String secondLine = "Filename: ;" + inputFileName + newLine;
        String colNames = "Country;Name;Population...;...having less than 25;...having more than 25";
        fileWriter.write(firstLine);
        fileWriter.write(secondLine);
        fileWriter.write(newLine);
        fileWriter.write(colNames);
    }

    @Override
    public void close() throws ItemStreamException {
        writer.close();
    }

    @Override
    public void open(ExecutionContext context) throws ItemStreamException {
        writer.open(context);
    }

    @Override
    public void update(ExecutionContext context) throws ItemStreamException {
        writer.update(context);
    }
}
When I run my batch, I only get the data for each city (the write method part) and the footer lines. If I comment out the whole content of the write method and the footer callback, I still don't get the header lines. I added a System.out.println() to my header callback; it looks like it's never called.
Here is an example of the CSV file produced by my batch:
France;Paris;2240621;Unknown;Unknown
France;Toulouse;439553;Unknown;Unknown
Spain;Barcelona;1620943;Unknown;Unknown
Spain;Madrid;3207247;Unknown;Unknown
[...]
Subtotal:;Unknown;2
;Sum Sup 10000;81
;Sum Inf 10000;17
Total:;;100
What is weird is that my header used to work, back when I first added both footer and header callbacks. I didn't change them, and I don't see what I've done in my code to break my header callback... And of course, I have no saved copy of my first version. Because I only noticed now that my header has disappeared (I checked my last few output files, and it looks like the header has been missing for some time), I can't simply roll back my modifications to see when and why it happened.
Do you have any idea how to solve this problem?
Thanks
When using Java config as you are, it's best to return the most specific type possible (the opposite of what you're normally told to do in Java programming). In this case, your writer bean returns ItemWriter but is step scoped. Because of this, a proxy is created that can only see the type your Java config returns, which in this case is ItemWriter, so the proxy does not expose the methods of the ItemStream interface. If you return CityItemWriter, I'd expect things to work.
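Applied to the writer bean from the question, only the declared return type changes; a sketch of the suggested fix:
@Bean
@StepScope
public CityItemWriter writer(FlatFileItemWriter<CityProcessed> flatWriter,
        @Value("#{jobExecutionContext[inputFile]}") String inputFile) {
    // Returning the concrete type lets the step-scoped proxy expose ItemStream's
    // open()/update()/close(), so the delegate gets opened and the header written.
    CityItemWriter itemWriter = new CityItemWriter();
    flatWriter.setHeaderCallback(itemWriter);
    flatWriter.setFooterCallback(itemWriter);
    itemWriter.setDelegate(flatWriter);
    itemWriter.setInputFileName(inputFile);
    return itemWriter;
}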

CAS consumer not working as expected

I have a CAS consumer AE which is expected to iterate over the CAS objects in a pipeline, serialize them, and add the serialized CASs to an XML file.
public class DataWriter extends JCasConsumer_ImplBase {

    private File outputDirectory;

    public static final String PARAM_OUTPUT_DIRECTORY = "outputDir";
    @ConfigurationParameter(name = PARAM_OUTPUT_DIRECTORY, defaultValue = ".")
    private String outputDir;

    CasToInlineXml cas2xml;

    public void initialize(UimaContext context) throws ResourceInitializationException {
        super.initialize(context);
        ConfigurationParameterInitializer.initialize(this, context);
        outputDirectory = new File(outputDir);
        if (!outputDirectory.exists()) {
            outputDirectory.mkdirs();
        }
    }

    @Override
    public void process(JCas jCas) throws AnalysisEngineProcessException {
        String file = fileCollectionReader.fileName;
        File outFile = new File(outputDirectory, file + ".xmi");
        FileOutputStream out = null;
        try {
            out = new FileOutputStream(outFile);
            String xmlAnnotations = cas2xml.generateXML(jCas.getCas());
            out.write(xmlAnnotations.getBytes("UTF-8"));
            /* XmiCasSerializer ser = new XmiCasSerializer(jCas.getCas().getTypeSystem());
            XMLSerializer xmlSer = new XMLSerializer(out, false);
            ser.serialize(jCas.getCas(), xmlSer.getContentHandler()); */
            if (out != null) {
                out.close();
            }
        }
        catch (IOException e) {
            throw new AnalysisEngineProcessException(e);
        }
        catch (CASException e) {
            throw new AnalysisEngineProcessException(e);
        }
    }
}
I am using it in a pipeline after all my annotators, but it can't read the CAS objects (I am getting a NullPointerException at jCas.getCas()). It looks like I don't understand the proper usage of a CAS consumer. I would appreciate any suggestions.
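One thing that stands out in the code above: cas2xml is declared but never assigned, so cas2xml.generateXML(...) will throw a NullPointerException on its first call; for a live JCas, jCas.getCas() itself should not return null, so the stack trace very likely points at that expression rather than at getCas(). A minimal sketch of initializing the field (CasToInlineXml has a no-arg constructor):
public void initialize(UimaContext context) throws ResourceInitializationException {
    super.initialize(context);
    ConfigurationParameterInitializer.initialize(this, context);
    outputDirectory = new File(outputDir);
    if (!outputDirectory.exists()) {
        outputDirectory.mkdirs();
    }
    // Create the serializer once per component instead of leaving the field null.
    cas2xml = new CasToInlineXml();
}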