I have a requirement to implement in Spring Batch: I need to read from a file and from a DB, and the data needs to be processed and written to an email.
I have gone through the Spring Batch documentation but was unable to find a chunk-oriented tasklet that would read data from multiple readers.
So essentially I have to read from two different sources of data (one from a file and another from a DB), and each will need its own mapper.
Regards
Tar
I see two options depending on how the data is structured:
Spring Batch relies heavily on composition when building batch components. One option would be to create a custom composite ItemReader that delegates to other readers (ones Spring Batch provides or otherwise) and provides the logic to assemble a single object based on the results of those delegated ItemReaders.
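A minimal sketch of such a composite reader, assuming hypothetical Customer and CustomerDetail types and that both delegate readers return items in the same order:

```java
import org.springframework.batch.item.ItemReader;

// Hypothetical item types, for illustration only.
class CustomerDetail { }

class Customer {
    private CustomerDetail detail;
    public void setDetail(CustomerDetail detail) { this.detail = detail; }
}

// Composite reader that merges a DB-backed reader and a file-backed reader
// into a single item per read() call. Assumes both sources are sorted the
// same way, so item N of each delegate describes the same entity.
public class CompositeCustomerReader implements ItemReader<Customer> {

    private final ItemReader<Customer> dbReader;         // e.g. a JdbcCursorItemReader
    private final ItemReader<CustomerDetail> fileReader; // e.g. a FlatFileItemReader

    public CompositeCustomerReader(ItemReader<Customer> dbReader,
                                   ItemReader<CustomerDetail> fileReader) {
        this.dbReader = dbReader;
        this.fileReader = fileReader;
    }

    @Override
    public Customer read() throws Exception {
        Customer customer = dbReader.read();
        if (customer == null) {
            return null; // no more input; this ends the step
        }
        customer.setDetail(fileReader.read()); // assemble one object from both sources
        return customer;
    }
}
```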
You could use an ItemReader to provide the base information (say, from a database) and use an ItemProcessor to enrich the item (say, by reading from a file).
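A sketch of this second option, reusing the hypothetical Customer/CustomerDetail types from the previous sketch (plus an assumed getId() accessor) and assuming the file has already been parsed into a lookup map keyed by customer id:

```java
import java.util.Map;
import org.springframework.batch.item.ItemProcessor;

// The reader supplies the base Customer from the database; this processor
// enriches it with the detail parsed from the file.
public class CustomerEnrichmentProcessor implements ItemProcessor<Customer, Customer> {

    private final Map<String, CustomerDetail> detailsByCustomerId;

    public CustomerEnrichmentProcessor(Map<String, CustomerDetail> detailsByCustomerId) {
        this.detailsByCustomerId = detailsByCustomerId;
    }

    @Override
    public Customer process(Customer customer) {
        // look up the file-sourced detail for this database item
        customer.setDetail(detailsByCustomerId.get(customer.getId()));
        return customer;
    }
}
```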
Either of the above is a normal way to handle this type of input scenario.
I'm working on a batch job using Spring Batch with one reader, one writer, and one processor. I have a CSV file as the input to my reader.
I wanted to use OpenCSV to convert each line to a bean, but what I see from the documentation is that OpenCSV takes a whole file and uses CsvToBeanBuilder to map all the lines of the file to a list of objects.
I saw this post: Configuring openCSV instead of FlatFileItemReader in spring batch step
but there is no explanation of how to map a single String line to a bean object using OpenCSV. Does someone know if it's possible? Thanks.
The explanation is in the comments: OpenCSV does the reading and the mapping. If you want to use OpenCSV in your Spring Batch app with a FlatFileItemReader, you only need the mapping part, i.e. a LineMapper implementation based on OpenCSV.
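As a minimal sketch, one could wrap the single line in a StringReader and hand it to CsvToBeanBuilder. This assumes the target bean is annotated with OpenCSV's @CsvBindByPosition (or similar) so that no header row is required:

```java
import java.io.StringReader;
import java.util.List;
import com.opencsv.bean.CsvToBeanBuilder;
import org.springframework.batch.item.file.LineMapper;

// LineMapper that delegates the mapping of one line to OpenCSV.
public class OpenCsvLineMapper<T> implements LineMapper<T> {

    private final Class<T> targetType;

    public OpenCsvLineMapper(Class<T> targetType) {
        this.targetType = targetType;
    }

    @Override
    public T mapLine(String line, int lineNumber) {
        // Treat the single line as a one-line CSV "file"
        List<T> beans = new CsvToBeanBuilder<T>(new StringReader(line))
                .withType(targetType)
                .build()
                .parse();
        return beans.get(0);
    }
}
```

This mapper would then be set on the FlatFileItemReader in place of the usual DefaultLineMapper, so Spring Batch keeps doing the reading and restart bookkeeping while OpenCSV only does the mapping.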
Now if OpenCSV does not provide a way to map a single line to a POJO, then it is probably not suitable to be used in Spring Batch in that way. In that case, you need to implement a custom ItemReader based on OpenCSV that does the reading and the mapping.
Is it possible to implement batch inserts using spring-data-jdbc somehow? Or can I get access to the JdbcTemplate using this Spring Data implementation?
There is currently no support for batch operations.
There are two issues requesting that feature, which one might want to follow if one is interested in it: https://jira.spring.io/browse/DATAJDBC-328 and https://jira.spring.io/browse/DATAJDBC-314
If one is working with Spring Data JDBC, there will always be a NamedParameterJdbcTemplate in the application context, so one can get that injected in order to perform batch operations without any additional configuration.
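A minimal sketch of that approach; the Person type (with getFirstName/getLastName accessors), table, and column names are hypothetical:

```java
import java.util.List;
import org.springframework.jdbc.core.namedparam.BeanPropertySqlParameterSource;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.jdbc.core.namedparam.SqlParameterSource;
import org.springframework.stereotype.Repository;

// Issues one JDBC batch instead of one INSERT per row, using the
// NamedParameterJdbcTemplate that Spring Data JDBC already configures.
@Repository
public class PersonBatchRepository {

    private final NamedParameterJdbcTemplate template;

    public PersonBatchRepository(NamedParameterJdbcTemplate template) {
        this.template = template;
    }

    public int[] insertAll(List<Person> people) {
        // One parameter source per row; bean properties map to :firstName/:lastName
        SqlParameterSource[] batch = people.stream()
                .map(BeanPropertySqlParameterSource::new)
                .toArray(SqlParameterSource[]::new);
        return template.batchUpdate(
                "INSERT INTO person (first_name, last_name) VALUES (:firstName, :lastName)",
                batch);
    }
}
```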
I've been developing a REST API using Spring Boot and came across a situation where the API needs to maintain a reference data table, e.g. a consumer will send a request with a particular key, and that key will be interpreted inside the API by looking up the reference data. My question is: which of the methods below is the best way to handle this? If none of these is a proper method, what are the suggested ways?
1. Have the reference data in the database as key-value pairs - modification is the con
2. Have it inside the project as an XML data file - not centralized; modification happens in multiple places
3. Have it inside the JVM as static data loaded on app bootstrap
Further, I'm open to suggestions of any additional ways, or the most optimal way.
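For illustration, option 3 might look like the following minimal sketch; the ReferenceDataRepository and its key/value accessors are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

// Loads the reference data once at startup and serves lookups from memory.
@Component
public class ReferenceDataCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final ReferenceDataRepository repository; // hypothetical data source

    public ReferenceDataCache(ReferenceDataRepository repository) {
        this.repository = repository;
    }

    @EventListener(ApplicationReadyEvent.class)
    public void load() {
        // Populate the in-memory map once the application is ready
        repository.findAll()
                  .forEach(entry -> cache.put(entry.getKey(), entry.getValue()));
    }

    public String lookup(String key) {
        return cache.get(key);
    }
}
```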
I am just wondering whether it is possible to use a OneToManyResultSetExtractor or a ResultSetExtractor with Spring Batch's JdbcCursorItemReader?
The issue I have is that the expected RowMapper only deals with one object per row, and I have a join SQL query that returns many rows per object.
Out of the box, it does not support the use of a ResultSetExtractor. The reason for this is that the wrapping ItemReader is stateful and needs to be able to keep track of how many rows have been consumed (it wouldn't know otherwise). The way that type of functionality is typically done in Spring Batch is by using an ItemProcessor to enrich the object: your ItemReader returns the one (of the one-to-many), and then the ItemProcessor enriches the object with the many. This is a common pattern in batch processing called the driving query pattern. You can read more about it in the Spring Batch documentation here: http://docs.spring.io/spring-batch/trunk/reference/html/patterns.html
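A minimal sketch of that driving query pattern; the Order and OrderLine types, the query, and the column names are hypothetical:

```java
import java.util.List;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.jdbc.core.JdbcTemplate;

// The reader returns the "one" side (an Order); this processor loads the
// "many" side (its OrderLines) and attaches it to the item.
public class OrderLineEnrichmentProcessor implements ItemProcessor<Order, Order> {

    private final JdbcTemplate jdbcTemplate;

    public OrderLineEnrichmentProcessor(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public Order process(Order order) {
        List<OrderLine> lines = jdbcTemplate.query(
                "SELECT item, quantity FROM order_line WHERE order_id = ?",
                (rs, rowNum) -> new OrderLine(rs.getString("item"), rs.getInt("quantity")),
                order.getId());
        order.setLines(lines);
        return order;
    }
}
```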
That being said, you could also wrap the JdbcCursorItemReader with your own implementation that performs the logic of aggregation for you.
I've used both Spring Batch and Drools on previous projects, separately. In my current project, I have a design where I need to process up to 500k XML objects, convert them to JAXB objects, apply a rule to each object (the rule itself is fairly simple: compare properties and update two flags in a 'notification' object), and finally send an event so a Spring Web Flow view model (which can be a listener) will update itself. That's not a requirement of the design, but it's what I have implemented:
1) ItemReader (JAXB)
2) ItemProcessor: maps to a ksession (stateful) and fires rules based on a DRL file
3) ItemWriter: performs the necessary cleanup and raises the appropriate events
It seems to me that the logic itself is straightforward, but when I added all the glue code of the batch job (ItemReader, ItemProcessor, etc.), a simple rule didn't work. Also, after reading several forums, it seems the RETE algorithm isn't going to scale well in batch applications.
In summary, is Drools the best way to integrate a basic rules framework into Spring Batch, or are there any lightweight alternatives?
the rule itself is fairly simple: compare properties and update two flags in a 'notification' object
No need for any rules framework; that is what Spring Batch's ItemProcessor is for.
From the ItemProcessor JavaDocs:
"..an extension point which allows for the application of business logic in an item oriented processing scenario"
No need to complicate things with Drools or any other rules engine unless you really need it, e.g. you have dozens or hundreds of complex rules that are not trivial to code.
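For a rule of that size, a plain ItemProcessor is usually enough. A minimal sketch, with hypothetical Message/Notification types, property names, and flag semantics:

```java
import org.springframework.batch.item.ItemProcessor;

// "Compare properties and update two flags" expressed as straight Java
// instead of a DRL rule. All names here are made up for illustration.
public class NotificationFlagProcessor implements ItemProcessor<Message, Notification> {

    @Override
    public Notification process(Message message) {
        Notification notification = new Notification(message.getId());
        // flag 1: a value exceeds its configured threshold
        notification.setThresholdExceeded(message.getValue() > message.getThreshold());
        // flag 2: the message arrived after its deadline
        notification.setOverdue(message.getTimestamp().isAfter(message.getDeadline()));
        return notification;
    }
}
```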
Usually the RETE algorithm is not a problem; it is a huge advantage. You need to design your solution with the assumption that it will be a batch process, and it will work fine. You need to take into account that the big overhead in your scenario is creating all 500k objects from the XML. Once you have the objects, if you design your business rules correctly, it will perform well.