Spring Batch skip exception and rollback in Tasklet - spring-batch

I want to achieve a tasklet that can skip exceptions and rollback the transaction properly and I don't see a way to accomplish both things.
My tasklet reads from a queue of ids that gets filled in the constructor. In each invocation of the execute method one id is processed and depending on if the queue still has elements to be processed or not a RepeatStatus.FINISHED or RepeatStatus.CONTINUABLE is returned. I am using a tasklet instead of a chunk because the processing of each element is fairly complicated and implies doing multiple queries, instantiation of a lot of objects that all gets written to the database later.
The main problem is if I define a try/catch block to wrap the implementation, I can skip exceptions without problems and still be able to re-execute the tasklet with the next element in the queue, but the problem is that everything gets saved in the database. On the other hand, even if the processing of an element is done correctly without problems, if the commit fails for whatever reason, as the error occurs outside the reach and control of my code, the exception is not caught by my code and the tasklet execution is finished without the possibility to skip and continue with the next element of the queue.
This is a simplified schema of my tasklet:
public MyTasklet() {
elementsIds = myRepo.findProcessableElements();
}
#Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
Long id = elementsIds.remove();
try {
// Business logic
} catch (Exception e) {
// is there a way to tell the framework to rollback ?
LOG.error("error ...", e);
}
if (elementsIds.isEmpty()) {
return RepeatStatus.FINISHED;
} else {
return RepeatStatus.CONTINUABLE;
}
}
Is there a way to achieve these two requirements with tasklets:
To be able to tell the framework to rollback the transaction if an exception is caught in the implementation of the execute method
To continue the execution (consecutive calls) of the tasklet if a commit fails

Related

Persistence of execution information in Axon Saga

We are using the Axon Framework to implement the Saga Pattern in Java. Axon uses two tables (ASSOCIATION_VALUE_ENTRY and SAGA_ENTRY) to store all the necessary information after each step of the saga. And at the end of the process (if it is correct, or, in case of error, all the compensations have been executed), it deletes the registers.
If for any reason, after an error, the compensations cannot be executed, we are able to resume the execution at the point where it failed, based on the stored information. Until here, everything is ok.
The issue came when we wanted to improve the resilience of the process and we checked what happened if the service died during the execution of a saga. According to the above, we expected the information of the execution to be persisted in the tables, but they were empty: the information only appeared when the process couldn't continue due to an error in a compensation (and no final delete action was executed).
Analyzing the source code of the Axon's JpaSagaStore class implementation, the interactions with the database (insert, update and delete) are persisted with a flush instead of a commit. The global commit is managed in the AbstractUnitOfWork class (as far as we understand). And here is where we have the doubts:
According to the literature, the flush writes in the database but the register is in a READ_UNCOMMITED state. The only way to see them in the database would be activating the READ_UNCOMMITED isolation level, with the problematic of the 'dirty reads', right? There would be any additional consideration/issue to have into account?
Does Axon have an alternative in order to ensure the persistence of the saga registers? Mainly if we couldn't activate the READ_UNCOMMITED mode (due to internal policies).
EDIT:
Summarizing it a lot, all starts with this method
public void startSaga(SagaWorkflow sagaWorkflow, Serializable sagaInput) {
StartSagaEvt startSagaEvt = StartSagaEvt.builder().sagaWorkflow(sagaWorkflow).sagaInput(sagaInput).build();
eventBus.publish(GenericEventMessage.asEventMessage(startSagaEvt));
}
Where:
eventBus is the Axon's internal one
sagaInput is simply a Serializable with some input values
SagaWorkflow is a Serializable that models the whole saga flow, whose main attribute is a LinkedList of nodes (the different steps of the saga, each one can have a different logic)
StartSagaEvt is just the POJO that models the event sent to the bus
After this, Axon performs all its 'magic' and finally arrives to the internal code:
AnnotatedSagaRepository.doCreateInstance --> AnnotatedSagaRepository.storeSaga --> [...] --> JpaSagaStore.insertSaga
public void insertSaga(Class<?> sagaType, String sagaIdentifier, Object saga, Set<AssociationValue> associationValues) {
EntityManager entityManager = entityManagerProvider.getEntityManager();
AbstractSagaEntry<?> entry = createSagaEntry(saga, sagaIdentifier, serializer);
entityManager.persist(entry);
for (AssociationValue associationValue : associationValues) {
storeAssociationValue(entityManager, sagaType, sagaIdentifier, associationValue);
}
if (logger.isDebugEnabled()) {
logger.debug("Storing saga id {} as {}", sagaIdentifier, serializedSagaAsString(entry));
}
if (useExplicitFlush) {
entityManager.flush();
}
}
The same applies for the update and delete phases. As far as I know, all the handle of the commit/rollback is performed in the class AbstractUnitOfWork, that intervenes just at the end of the complete saga flow.
This leads me to the following considerations/questions:
what sense has to keep the transaction open during the whole process instead of committing after each step? If for any reason the process fails, goes down, the database is not accessible,... all the saved information is lost.
There must be a design reason for this behavior, but I'm not able to see it. Or maybe there is a configuration to change it (hopefully, although I doubt it).
Thanks in advance for any comment!
EDIT 2
Effectively, we are using it as a kind of state machine, where the saga flow is a sequence of steps, each one with an action and a compensation, and we jump from one to another until reach an "END" status.
#Saga
class GenericSaga {
private EventBus eventBus;
private CustomCommandGateway commandGateway;
[...]
#StartSaga
#SagaEventHandler(associationProperty = "sagaId")
public void startStep(StartSagaEvt startSagaEvt) {
// Initializes de GenericSaga and associate several properties with SagaLifecycle.associateWith(key, value);
[...]
// Transit to the next (first) step
eventBus.publish(GenericEventMessage.asEventMessage(new StepSagaEvt(startSagaEvt)));
}
#SagaEventHandler(associationProperty = "sagaId")
public void nextStep(StepSagaEvt stepSagaEvt) {
// Identifies what is the next step in the defined flow, considering if it should be executed sequentially or concurrently, or if it is the end of the flow and then call the SagaLifecycle.end()
[...]
// Also checks if it has to execute the compensation logic of the step
[...]
// Execute
Serializable actionOutput = commandGateway.sendAndWaitEx(stepAction.getActionInput());
}
#SagaEventHandler(associationProperty = "sagaId")
public void resumeSaga(ResumeSagaEvt resumeSagaEvt) {
// Recover information from the execution that we want to resume
[...]
// Transit to the next step
eventBus.publish(GenericEventMessage.asEventMessage(new StepSagaEvt(resumeSagaEvt)));
}
}
As you can see, we don't have an endSaga annotation, and maybe that's the problem. But in our current situation we have kicked forward, and be have defined our custom implementation of the JpaSagaStore, in order to force a local transaction in the insertSaga and updateSaga methods.
Based on my understanding, I think you are somehow misusing the Saga component from Axon Framework. I assume from your question that you are trying to build a form of a 'state machine' using your own SagaWorkflow object. If that is the case, I have to say this is not how Axon intends the usage of Sagas.
To add to that, let me give you a pseudo-sample of what a Saga should look like.
#Saga
class SagaWorkflow {
private transient CommandGateway commandGateway;
#StartSaga
#SagaEventHandler(associationProperty = "yourProperty")
public void on(SagaInputEvent event) {
// validate, associate with another property and fire a command
SagaLifecycle.associateWith("associationPropertyKey", "associationPropertyValue");
commandGateway.send(new GivenCommand());
}
#SagaEventHandler(associationProperty = "associationPropertyValue")
public void on(AnotherEvent event) {
// validate and fire a command or finish the saga
SagaLifecycle.end();
}
#EndSaga
#SagaEventHandler(associationProperty = "anyProperty")
public void on(FinishSagaEvent event) {
// check if you need to fire extra commands to tell others it's finished or just do it silently
}
}
#Saga Annotation will make sure Axon Framework handles the whole Saga process for you, storing (serializing) it to the database when each (Saga)EventHandler is executed
#SagaEventHandler will make sure the 'Event Handling method' reacts to a given Event, only if it contains the associationProperty as part of the Event (to understand it better, I will share our docs link)
#EndSaga will tell Axon Framework to finalize the Saga after the execution of the method (finalizing means deleting it from the database)
SagaLifecycle provides several 'utilities' methods to interact with the Saga's lifecycle and associations
In the example, I made the CommandGateway transient because the Saga is serialized and stored on the database. You would not Axon to serializer any external component, like the gateway, as well
Of course, there is more to it.
You can check Axon's docs for that. But I hope this gives you enough material and ideas to use Sagas within Axon Framework better!
KR

In spring-batch, how can I get the exception when there is a chunk error?

Spring batch provides a listener for capturing when a chunk error has occurred (either the #AfterChunkError annotation or the ChunkListener.afterChunkError interface). Both receive the ChunkContext and the API says:
Parameters:
context - the chunk context containing the exception that caused the
underlying rollback.
However, I don't see anything on the ChunkContext interface that would get me to the exception. How do I get from the ChunkContext to the relevant exception?
Exception is located in ChunkListener.ROLLBACK_EXCEPTION_KEY attribute:
context.getAttribute(ChunkListener.ROLLBACK_EXCEPTION_KEY)
Instead of implementing a ChunkListener you could implement one or more of ItemReadListener, ItemProcessListener and ItemWriteListener (or more simply ItemListenerSupport).
They respectively give acces to :
void onReadError(java.lang.Exception ex)
void onProcessError(T item, java.lang.Exception e)
void onWriteError(java.lang.Exception exception, java.util.List<? extends S> items)
I know this forces you to implement multiple methods to manage every aspect of a chunk, but you can write a custom method which takes an Exception and call it from these 3 methods.

Spring Batch: Listener event when Tasklet throws an exception

I'm using a tasklet and a StepExecutionListener but it seems there's no listener callback for a case where my tasklet throws an exception. For various other listener types – ChunkListener, ItemProcessListener, etc. – there is but none of those listeners work with tasklets.
All I want is an event after my tasklet executes regardless of whether it threw an exception or not. Is it possible to do that? It doesn't appear to be supported in the API.
Edit: Responding to #danidemi I'm registering the listener and tasklet using the programmatic API like this:
steps.get(name)
.listener(listener)
.tasklet(tasklet)
.build()
Where steps is an instance of StepBuilderFactory.
You can
manage exception into tasklet
store error into execution-context/external bean
manage error from stepExecutionListener
or in StepExecutionListener.afterStep(StepExecution stepExecution) lookup into stepExecution.getFailureExceptions()
I guess its too late now. But I recently came across this problem. It is easier to handle exception with ChunkListener, however exception handling can be done in Tasklet's RepeatStatus execute(StepContribution s, ChunkContext chunkContext) method (well, it is the only method that Tasklet interface has ^^). What you need is a try/catch block to catch the exceptions. However you will need to throw the caught exception again in order to roll back the transaction.
Here is a code-snippet. In my case, I had to stop the job if some data could not be read due to Database server shutdown.
#Override
public RepeatStatus execute(StepContribution s, ChunkContext chunkContext){
try{
getAllUsersFromDb(); // some operation that could throw an exception
// doesn't hurt to put all suspicious codes in this block tbh
}catch(Exception e){
if(e instanceof NonSkippableReadException){
chunkContext.getStepContext().getStepExecution().getJobExecution().stop();
}
throw e;
}
return RepeatStatus.FINISHED;
}

How is the skipping implemented in Spring Batch?

I was wondering how I could determine in my ItemWriter, whether Spring Batch was currently in chunk-processing-mode or in the fallback single-item-processing-mode. In the first place I didn't find the information how this fallback mechanism is implemented anyway.
Even if I haven't found the solution to my actual problem yet, I'd like to share my knowledge about the fallback mechanism with you.
Feel free to add answers with additional information if I missed anything ;-)
The implementation of the skip mechanism can be found in the FaultTolerantChunkProcessor and in the RetryTemplate.
Let's assume you configured skippable exceptions but no retryable exceptions. And there is a failing item in your current chunk causing an exception.
Now, first of all the whole chunk shall be written. In the processor's write() method you can see, that a RetryTemplate is called. It also gets two references to a RetryCallback and a RecoveryCallback.
Switch over to the RetryTemplate. Find the following method:
protected <T> T doExecute(RetryCallback<T> retryCallback, RecoveryCallback<T> recoveryCallback, RetryState state)
There you can see that the RetryTemplate is retried as long as it's not exhausted (i.e. exactly once in our configuration). Such a retry will be caused by a retryable exception. Non-retryable exceptions will immediately abort the retry mechanism here.
After the retries are exhausted or aborted, the RecoveryCallback will be called:
e = handleRetryExhausted(recoveryCallback, context, state);
That's where the single-item-processing mode will kick-in now!
The RecoveryCallback (which was defined in the processor's write() method!) will put a lock on the input chunk (inputs.setBusy(true)) and run its scan() method. There you can see, that a single item is taken from the chunk:
List<O> items = Collections.singletonList(outputIterator.next());
If this single item can be processed by the ItemWriter correctly, than the chunk will be finished and the ChunkOrientedTasklet will run another chunk (for the next single items). This will cause a regular call to the RetryCallback, but since the chunk has been locked by the RecoveryTemplate, the scan() method will be called immediately:
if (!inputs.isBusy()) {
// ...
}
else {
scan(contribution, inputs, outputs, chunkMonitor);
}
So another single item will be processed and this is repeated, until the original chunk has been processed item-by-item:
if (outputs.isEmpty()) {
inputs.setBusy(false);
That's it. I hope you found this helpful. And I even more hope that you could find this easily via a search engine and didn't waste too much time, finding this out by yourself. ;-)
A possible approach to my original problem (the ItemWriter would like to know, whether it's in chunk or single-item mode) could be one of the following alternatives:
Only when the passed chunk is of size one, any further checks have to be done
When the passed chunk is a java.util.Collections.SingletonList, we would be quite sure, since the FaultTolerantChunkProcessor does the following:
List items = Collections.singletonList(outputIterator.next());
Unfortunately, this class is private and so we can't check it with instanceOf.
In reverse, if the chunk is an ArrayList we could also be quite sure, since the Spring Batch's Chunk class uses it:
private List items = new ArrayList();
One blurring left would be buffered items read from the execution context. But I'd expect those to be ArrayLists also.
Anyway, I still find this method too vague. I'd rather like to have this information provided by the framework.
An alternative would be to hook my ItemWriter in the framework execution. Maybe ItemWriteListener.onWriteError() is appropriate.
Update: The onWriteError() method will not be called if you're in single-item mode and throw an exception in the ItemWriter. I think that's a bug a filed it: https://jira.springsource.org/browse/BATCH-2027
So this alternative drops out.
Here's a snippet to do the same without any framework means directly in the writer
private int writeErrorCount = 0;
#Override
public void write(final List<? extends Long> items) throws Exception {
try {
writeWhatever(items);
} catch (final Exception e) {
if (this.writeErrorCount == 0) {
this.writeErrorCount = items.size();
} else {
this.writeErrorCount--;
}
throw e;
}
this.writeErrorCount--;
}
public boolean isWriterInSingleItemMode() {
return writeErrorCount != 0;
}
Attention: One should rather check for the skippable exceptions here and not for Exception in general.

GWT app getting java.util.ConcurrentModificationException from MVC pattern

I am getting this error everytime my Observers are traversed.
#Override
public void notifyObservers(ModelViewInterface model) {
for(Observer<ModelViewInterface> o : this.observers)
o.notify(model);
}
GWT does not have threads, so it is not a synchronization issue.
It seems to happen after I press a button, any ideas of how to avoid this error?
From the javadoc of ConcurrentModificationException:
Note that this exception does not always indicate that an object has been concurrently modified by a different thread. If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception. For example, if a thread modifies a collection directly while it is iterating over the collection with a fail-fast iterator, the iterator will throw this exception.
So in your case, it seems that o.notify(model) modifies this.observers - directly or indirectly. This is a common phenomenon when modifying the collection you're iterating over.
To avoid concurrent modification, you can operate on a copy of the collection like this:
for(Observer<ModelViewInterface> o :
new ArrayList<ModelViewInterface>(this.observers)) {
o.notify(model);
}
However, sometimes this is not what you want - the current behaviour of o.notify could also indicate a bug.