Clarification needed with Spring batch concepts - spring-batch

I am new to Spring Batch and I am having an issue implementing my business use case with it.
Basically, I am reading data from a database, i.e. a List of subscribers to a newsletter. I then need to send an email to each subscriber, as well as insert data into the database so that I know which subscribers the email was sent to.
I use an ItemProcessor implementation whose process method takes a subscriber as an argument and returns a MimeMessage; the writer associated with this processor is of type org.springframework.batch.item.mail.javamail.MimeMessageItemWriter.
The issue is that I need another writer for the database inserts (possibly combined via a CompositeItemWriter) that takes a List of subscribers as an argument, and all I have as input is the MimeMessage from the above ItemProcessor.
Can anyone please help?

From what you've said, using the ItemProcessor interface to save the message to the database is conceptually not right; you need an ItemWriter for that. You can implement writing to the DB as one ItemWriter and sending the mail message as another ItemWriter, then use a CompositeItemWriter to combine them.
The Subscriber is what gets passed to both of these item writers.
The transformation of Subscriber to MimeMessage is done internally by the second writer before it delegates to the MimeMessageItemWriter it wraps.
Sending the message to the subscriber should be done after saving to the DB, so that the DB insert can be rolled back if something goes wrong while sending the message (if you need that behavior), and your commit interval (batch size) should be 1; otherwise a rollback would wrongly discard notifications that have already been sent successfully.
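A minimal sketch of that arrangement, assuming a hypothetical Subscriber type with a getEmail() accessor, plus a pre-configured JavaMailSender and a databaseWriter bean (all names here are illustrative, not from the original post):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.mail.internet.MimeMessage;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.mail.javamail.MimeMessageItemWriter;
import org.springframework.batch.item.support.CompositeItemWriter;
import org.springframework.mail.javamail.JavaMailSender;
import org.springframework.mail.javamail.MimeMessageHelper;

public class SubscriberMailItemWriter implements ItemWriter<Subscriber> {

    private final JavaMailSender mailSender;
    private final MimeMessageItemWriter delegate = new MimeMessageItemWriter();

    public SubscriberMailItemWriter(JavaMailSender mailSender) {
        this.mailSender = mailSender;
        this.delegate.setJavaMailSender(mailSender);
    }

    @Override
    public void write(List<? extends Subscriber> subscribers) throws Exception {
        // Transform each Subscriber into a MimeMessage, then delegate the sending.
        List<MimeMessage> messages = new ArrayList<>();
        for (Subscriber subscriber : subscribers) {
            MimeMessage message = mailSender.createMimeMessage();
            MimeMessageHelper helper = new MimeMessageHelper(message);
            helper.setTo(subscriber.getEmail());   // assumed accessor
            helper.setSubject("Newsletter");
            helper.setText("...");
            messages.add(message);
        }
        delegate.write(messages);
    }
}

// Wiring (e.g., in a @Bean method): DB writer first, mail writer second,
// so a send failure rolls the insert back.
CompositeItemWriter<Subscriber> writer = new CompositeItemWriter<>();
writer.setDelegates(Arrays.asList(databaseWriter, new SubscriberMailItemWriter(mailSender)));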

Related

MongoDb, RabbitMq and outbox pattern

I have a question about MongoDB and the outbox pattern (I'm quite a newbie with Mongo).
I am working on an application that uses MongoDB as its primary database.
I have some use cases in which I need to save a document to the database and then publish an event to a message broker (RabbitMQ).
The saved document must be consistent with the published event. This means that if I save a document I MUST also send the message (the solution must be resilient to, for example, a server shutdown between the document save and the message send), so I decided to use the outbox pattern.
In a relational (SQL) database this problem is trivial: I start a new transaction, persist/change the object, persist some kind of database scheduler record (which sends the message to RabbitMQ after the transaction is committed), and then commit.
How do I achieve this in MongoDB (once again with emphasis on no data loss and no 'phantom' message sends)? Should I use MongoDB transactions (transactions are strongly discouraged in the Mongo community), or are there other, better solutions?
You can try MongoDB change streams at the collection level. They are similar to triggers in the RDBMS world. You can then listen on that stream for events of your choice, in your case an insert event, after which you can forward that event to downstream systems.
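A minimal sketch using the MongoDB Java driver (note that change streams require a replica set); the connection string, database/collection names, and the RabbitMQ publishing stub are placeholders:

import java.util.Collections;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import org.bson.Document;

public class OutboxChangeStreamListener {

    public static void main(String[] args) {
        MongoCollection<Document> outbox = MongoClients.create("mongodb://localhost:27017")
                .getDatabase("app").getCollection("outbox");

        // Watch only insert events on the outbox collection.
        try (MongoCursor<ChangeStreamDocument<Document>> cursor = outbox
                .watch(Collections.singletonList(
                        Aggregates.match(Filters.eq("operationType", "insert"))))
                .iterator()) {
            while (cursor.hasNext()) {
                // Hand the full inserted document to the broker publisher.
                publishToRabbit(cursor.next().getFullDocument());
            }
        }
    }

    private static void publishToRabbit(Document outboxEntry) {
        // Placeholder: publish the event to RabbitMQ here.
    }
}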

Sorting Service Bus Queue Messages

I was wondering if there is a way to attach metadata, or even multiple pieces of metadata, to a Service Bus queue message to be used later in an application for sorting, while still maintaining FIFO in the queue.
So in short, what I want to do is:
Maintain FIFO, that is a First-In-First-Out structure, in the queue; but since the messages are coming in and inserted into the queue from different sources, I want to be able to tell which source each message came from, for example via metadata.
I know this is possible with Topics, where you can add a property to the message, but I am unsure whether it is possible to attach multiple properties to a topic message.
I hope I made myself clear on what I am asking.
I assume you use the .NET API. If that is the case, you can use the Properties dictionary to write and read your custom metadata:
BrokeredMessage message = new BrokeredMessage(body);
message.Properties.Add("Source", mySource);
// and on the receiving side:
var source = (string)receivedMessage.Properties["Source"];
You are free to add multiple properties too. This works the same for both Queues and Topics/Subscriptions.
I was wondering if there is a way to attach metadata, or even multiple pieces of metadata, to a Service Bus queue message to be used later in an application for sorting, while still maintaining FIFO in the queue.
To maintain FIFO in the queue, you'd have to use Message Sessions; without message sessions you cannot maintain FIFO in the queue itself. You can set a custom property and use it in your application to sort messages once they are received out of order, but you won't receive messages in FIFO order as you were asking in your original question.
If you drop the requirement of having the order preserved on the queue, then the answer @Mikhail provided will be suitable for in-process sorting based on custom property(s). Just be aware that in-process sorting will not be a trivial task.

Passing data from itemreader to processor

How is the data that is read passed from the reader to the ItemProcessor in Spring Batch? Is there a queue it is put on by the ItemReader's read method and consumed by the ItemProcessor? I have to read 10 records at a time from a database and process 5 at a time in the ItemProcessor's process method. The ItemProcessor currently receives the records one by one, and I want to change this to 5 records at a time in the process method.
Every item that is returned from the read method of a reader will be forwarded to the processor as one item.
If you want to collect a group of items and pass them as a group to the processor, you need a reader that groups them.
You could implement something like a group-wrapper.
I explained such an approach in another answer: Spring Batch Processor
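A minimal sketch of such a group-wrapper, assuming a delegate reader; the group size is illustrative:

import java.util.ArrayList;
import java.util.List;
import org.springframework.batch.item.ItemReader;

public class GroupingItemReader<T> implements ItemReader<List<T>> {

    private final ItemReader<T> delegate;
    private final int groupSize;

    public GroupingItemReader(ItemReader<T> delegate, int groupSize) {
        this.delegate = delegate;
        this.groupSize = groupSize;
    }

    @Override
    public List<T> read() throws Exception {
        // Collect up to groupSize items from the delegate into one group.
        List<T> group = new ArrayList<>();
        T item;
        while (group.size() < groupSize && (item = delegate.read()) != null) {
            group.add(item);
        }
        // An empty group means the delegate is exhausted; null ends the step.
        return group.isEmpty() ? null : group;
    }
}

The processor is then declared as ItemProcessor<List<T>, ...>, so its process method receives up to 5 records per call when groupSize is 5.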

How to design spring batch to avoid long queue of request and restart failed job

I am writing a project that will generate reports. It reads all requests from a database by making a REST call; based on the type of each request it makes a REST call to an endpoint, and after getting the response it saves the response in an object and writes it back to the database by calling another endpoint.
I am using spring-batch to handle the batch work. So far what I came up with is a single job (reader, processor, writer) that does the whole thing. I am not sure if this is the correct design considering:
I do not want to queue up requests if some request is taking a long time to get a response back. [not sure yet]
I do not want to hold up saving responses until all the responses are received. [using commit-interval will help]
If the job crashes for some reason, how can I restart the job? [maybe using batch-admin will help, but what are my other options?]
With chunk-oriented processing, the Reader, Processor, and Writer are executed in order until the Reader has nothing left to return.
If you can read one item at a time, process it, and send it back to the endpoint that handles the persistence, this approach is handy.
If you must read ALL the information at once, the reader will return one big collection containing all items and pass it to the processor. The processor will process all the items and send the result to the writer. You cannot send just a few to the writer, so you would have to do the persistence directly from the processor, and that would be against the design.
So, as I understand this, you have two options:
Design a reader that can read one item at a time. Use the chunk-oriented processing that you already started: read one item, process it, and send it back for persistence. Have a look at how other readers are implemented (like JdbcCursorItemReader). A configuration sketch follows below.
Create a tasklet that reads the whole collection of items, processes them, and sends them back for persistence. You can break this up into different tasklets.
commit-interval only controls after how many items the transaction is committed, so it will not help you here, as all the processing and persistence is done by calling REST services.
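For the first option, a minimal sketch of a chunk-oriented step in Java config; the Request and Response types and the bean names are illustrative, not from the question:

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;

@Bean
public Step reportStep(StepBuilderFactory steps,
                       ItemReader<Request> reader,
                       ItemProcessor<Request, Response> processor,
                       ItemWriter<Response> writer) {
    // A chunk size (commit-interval) of 1 writes each response as soon
    // as it arrives, instead of holding them until all are received.
    return steps.get("reportStep")
                .<Request, Response>chunk(1)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
}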
I have figured out a design and I think it will work fine.
As for the questions that I asked, following are the answers:
Using asynchronous processors will help avoid any queue building up; a sketch follows after this list.
http://docs.spring.io/spring-batch/trunk/reference/html/springBatchIntegration.html#asynchronous-processors
Using commit-interval will solve it.
This thread has the answer - Spring batch :Restart a job and then start next job automatically
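For the first point, a minimal sketch of the AsyncItemProcessor/AsyncItemWriter pairing from spring-batch-integration; Request and Response are the same illustrative types as above:

import org.springframework.batch.integration.async.AsyncItemProcessor;
import org.springframework.batch.integration.async.AsyncItemWriter;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.core.task.SimpleAsyncTaskExecutor;

@Bean
public AsyncItemProcessor<Request, Response> asyncProcessor(ItemProcessor<Request, Response> delegate) {
    // Each item is processed on its own thread, so one slow REST call
    // does not block the others.
    AsyncItemProcessor<Request, Response> processor = new AsyncItemProcessor<>();
    processor.setDelegate(delegate);
    processor.setTaskExecutor(new SimpleAsyncTaskExecutor());
    return processor;
}

@Bean
public AsyncItemWriter<Response> asyncWriter(ItemWriter<Response> delegate) {
    // Unwraps the Futures produced by the AsyncItemProcessor before writing;
    // the step is then declared as chunk <Request, Future<Response>>.
    AsyncItemWriter<Response> writer = new AsyncItemWriter<>();
    writer.setDelegate(delegate);
    return writer;
}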

How to use Spring-batch to read messages from Spring-integration channel?

My existing spring-integration application puts POJO messages on a channel. How do I hook up spring-batch so that it reads messages from this channel in real time? Do I need to create a custom ItemReader, or is there something out of the box that I can use? A simple sample XML configuration would be helpful as well.
I am not aware of anything out of the box, but it would be trivial to wrap a PollableChannel (usually a QueueChannel) in an ItemReader: simply call channel.receive(timeout) in read().
When the timeout expires, the reader returns null, indicating the end of the batch.
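A minimal sketch of that wrapper, assuming a recent Spring Integration where the messaging types live in spring-messaging; the unchecked cast reflects the assumption that the channel carries a single POJO payload type:

import org.springframework.batch.item.ItemReader;
import org.springframework.messaging.Message;
import org.springframework.messaging.PollableChannel;

public class ChannelItemReader<T> implements ItemReader<T> {

    private final PollableChannel channel;
    private final long timeoutMillis;

    public ChannelItemReader(PollableChannel channel, long timeoutMillis) {
        this.channel = channel;
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T read() {
        // Block until a message arrives or the timeout elapses; a null
        // return tells Spring Batch the step's input is exhausted.
        Message<?> message = channel.receive(timeoutMillis);
        return message == null ? null : (T) message.getPayload();
    }
}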