How is the skipping implemented in Spring Batch? - spring-batch

I was wondering how I could determine in my ItemWriter, whether Spring Batch was currently in chunk-processing-mode or in the fallback single-item-processing-mode. In the first place I didn't find the information how this fallback mechanism is implemented anyway.
Even if I haven't found the solution to my actual problem yet, I'd like to share my knowledge about the fallback mechanism with you.
Feel free to add answers with additional information if I missed anything ;-)

The implementation of the skip mechanism can be found in the FaultTolerantChunkProcessor and in the RetryTemplate.
Let's assume you configured skippable exceptions but no retryable exceptions. And there is a failing item in your current chunk causing an exception.
First, the whole chunk is written. In the processor's write() method you can see that a RetryTemplate is called, which receives a RetryCallback and a RecoveryCallback.
Switch over to the RetryTemplate. Find the following method:
protected <T> T doExecute(RetryCallback<T> retryCallback, RecoveryCallback<T> recoveryCallback, RetryState state)
There you can see that the RetryCallback is retried as long as the retry policy is not exhausted (i.e. exactly once in our configuration). A retry is only triggered by a retryable exception. Non-retryable exceptions immediately abort the retry mechanism here.
After the retries are exhausted or aborted, the RecoveryCallback will be called:
e = handleRetryExhausted(recoveryCallback, context, state);
That's where the single-item-processing mode will kick-in now!
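To make that control flow a bit more tangible, here is a rough, stateless sketch of the retry-then-recover idea in plain Java. It is not the RetryTemplate source: the class RetryThenRecoverSketch and all of its parameters are made up for illustration, and the real template additionally works with a RetryState and retry contexts, which this sketch ignores.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical, simplified model of "retry, then recover"; not Spring Retry code. */
final class RetryThenRecoverSketch {

    static <T> T execute(Supplier<T> retryCallback,
                         Function<RuntimeException, T> recoveryCallback,
                         Predicate<RuntimeException> isRetryable,
                         int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return retryCallback.get();      // success: no recovery needed
            } catch (RuntimeException e) {
                last = e;
                if (!isRetryable.test(e)) {
                    break;                       // non-retryable: abort the retry loop
                }
            }
        }
        // Retries exhausted or aborted: hand over to the recovery callback.
        return recoveryCallback.apply(last);
    }
}
```

With skippable but no retryable exceptions configured, isRetryable would always return false here, so the callback runs once and the recovery callback takes over right away.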
The RecoveryCallback (which was defined in the processor's write() method!) will put a lock on the input chunk (inputs.setBusy(true)) and run its scan() method. There you can see that a single item is taken from the chunk:
List<O> items = Collections.singletonList(outputIterator.next());
If this single item can be processed by the ItemWriter correctly, then the chunk will be finished and the ChunkOrientedTasklet will run another chunk (for the next single item). This will cause a regular call to the RetryCallback, but since the chunk has been locked by the RecoveryCallback, the scan() method will be called immediately:
if (!inputs.isBusy()) {
// ...
}
else {
scan(contribution, inputs, outputs, chunkMonitor);
}
So another single item will be processed and this is repeated, until the original chunk has been processed item-by-item:
if (outputs.isEmpty()) {
inputs.setBusy(false);
// ...
}
That's it. I hope you found this helpful. And I even more hope that you could find this easily via a search engine and didn't waste too much time, finding this out by yourself. ;-)
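For readers who prefer code over prose, the overall behaviour described above can be modelled roughly like this. It is a simplified sketch and not the FaultTolerantChunkProcessor source: ScanFallbackSketch and writeWithScan are hypothetical names, the writer is a plain Consumer, and transactions, rollbacks and the busy flag are left out completely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical model of "write the chunk, fall back to item-by-item on failure". */
final class ScanFallbackSketch {

    /** Returns the items that failed again when written on their own (the skipped ones). */
    static <T> List<T> writeWithScan(List<T> chunk, Consumer<List<T>> writer) {
        List<T> skipped = new ArrayList<>();
        try {
            writer.accept(chunk);                // chunk mode: one call with all items
            return skipped;
        } catch (RuntimeException chunkFailure) {
            // The chunk failed as a whole: continue in single-item ("scan") mode.
        }
        for (T item : chunk) {
            try {
                // Same shape as in scan(): a singleton list per item.
                writer.accept(Collections.singletonList(item));
            } catch (RuntimeException itemFailure) {
                skipped.add(item);               // this is the item that gets skipped
            }
        }
        return skipped;
    }
}
```

Note that in the real framework the single-item writes are spread over several tasklet invocations (one per item, as explained above), whereas the sketch simply loops.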

A possible approach to my original problem (the ItemWriter would like to know, whether it's in chunk or single-item mode) could be one of the following alternatives:
Only when the passed chunk is of size one, any further checks have to be done
When the passed chunk is a java.util.Collections.SingletonList, we would be quite sure, since the FaultTolerantChunkProcessor does the following:
List items = Collections.singletonList(outputIterator.next());
Unfortunately, this class is private and so we can't check it with instanceof.
In reverse, if the chunk is an ArrayList we could also be quite sure, since the Spring Batch's Chunk class uses it:
private List items = new ArrayList();
One blurring left would be buffered items read from the execution context. But I'd expect those to be ArrayLists also.
Anyway, I still find this method too vague. I'd rather like to have this information provided by the framework.
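Just to illustrate the class-based check discussed above: even though SingletonList is private, its runtime class can be compared without instanceof. This is only a sketch under the assumption that the writer still receives the list created by Collections.singletonList(...); SingletonListCheck and looksLikeSingleItemMode are hypothetical names, and the check relies on an implementation detail that may change.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: detects lists created by Collections.singletonList(...). */
final class SingletonListCheck {

    // Runtime class of the private java.util.Collections.SingletonList implementation.
    private static final Class<?> SINGLETON_LIST_CLASS =
            Collections.singletonList(null).getClass();

    static boolean looksLikeSingleItemMode(List<?> items) {
        return items.getClass() == SINGLETON_LIST_CLASS;
    }
}
```

A one-element ArrayList is not reported as single-item mode by this check, which matches the ArrayList reasoning above, but it stays as vague as the rest of this approach.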
An alternative would be to hook my ItemWriter in the framework execution. Maybe ItemWriteListener.onWriteError() is appropriate.
Update: The onWriteError() method will not be called if you're in single-item mode and throw an exception in the ItemWriter. I think that's a bug and I filed it: https://jira.springsource.org/browse/BATCH-2027
So this alternative drops out.
Here's a snippet to do the same without any framework means directly in the writer
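private int writeErrorCount = 0;

@Override
public void write(final List<? extends Long> items) throws Exception {
    try {
        writeWhatever(items);
    } catch (final Exception e) {
        if (this.writeErrorCount == 0) {
            // First failure: the framework will now replay the chunk item by item.
            this.writeErrorCount = items.size();
        } else {
            // Failure while replaying: one more item of the chunk is done.
            this.writeErrorCount--;
        }
        throw e;
    }
    // Count down only while replaying; otherwise a successful chunk write
    // would drive the counter negative and wrongly report single-item mode.
    if (this.writeErrorCount > 0) {
        this.writeErrorCount--;
    }
}

public boolean isWriterInSingleItemMode() {
    return writeErrorCount != 0;
}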
Attention: One should rather check for the skippable exceptions here and not for Exception in general.
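As a sketch of what that could look like, the counting logic can be pulled out into a small helper that only reacts to skippable exceptions. Everything here is hypothetical (SingleItemModeTracker, its methods, and the Predicate used to decide what counts as skippable); a writer would call onFailure(...) in its catch block and onSuccess() after a successful write.

```java
import java.util.function.Predicate;

/** Hypothetical helper that tracks whether a writer is being replayed item by item. */
final class SingleItemModeTracker {

    private final Predicate<Exception> isSkippable;
    private int remaining = 0;   // items of the failed chunk still to be replayed

    SingleItemModeTracker(Predicate<Exception> isSkippable) {
        this.isSkippable = isSkippable;
    }

    /** Call after a successful write. */
    void onSuccess() {
        if (remaining > 0) {
            remaining--;
        }
    }

    /** Call when a write of chunkSize items failed with the given exception. */
    void onFailure(Exception e, int chunkSize) {
        if (!isSkippable.test(e)) {
            return;              // not skippable: no single-item replay expected
        }
        if (remaining == 0) {
            remaining = chunkSize;   // chunk failed: replay of chunkSize items starts
        } else {
            remaining--;             // failure during replay: this item is done
        }
    }

    boolean inSingleItemMode() {
        return remaining != 0;
    }
}
```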

Persistence of execution information in Axon Saga

We are using the Axon Framework to implement the Saga Pattern in Java. Axon uses two tables (ASSOCIATION_VALUE_ENTRY and SAGA_ENTRY) to store all the necessary information after each step of the saga. At the end of the process (if it completed correctly or, in case of error, all the compensations have been executed), it deletes the records.
If for any reason, after an error, the compensations cannot be executed, we are able to resume the execution at the point where it failed, based on the stored information. Until here, everything is ok.
The issue came when we wanted to improve the resilience of the process and we checked what happened if the service died during the execution of a saga. According to the above, we expected the information of the execution to be persisted in the tables, but they were empty: the information only appeared when the process couldn't continue due to an error in a compensation (and no final delete action was executed).
Analyzing the source code of the Axon's JpaSagaStore class implementation, the interactions with the database (insert, update and delete) are persisted with a flush instead of a commit. The global commit is managed in the AbstractUnitOfWork class (as far as we understand). And here is where we have the doubts:
According to the literature, a flush writes to the database but the record remains uncommitted. The only way to see it from another connection would be to enable the READ_UNCOMMITTED isolation level, with the problem of 'dirty reads', right? Are there any additional considerations or issues to take into account?
Does Axon have an alternative to ensure the persistence of the saga records? Mainly if we can't enable READ_UNCOMMITTED (due to internal policies).
EDIT:
Summarizing it a lot, all starts with this method
public void startSaga(SagaWorkflow sagaWorkflow, Serializable sagaInput) {
StartSagaEvt startSagaEvt = StartSagaEvt.builder().sagaWorkflow(sagaWorkflow).sagaInput(sagaInput).build();
eventBus.publish(GenericEventMessage.asEventMessage(startSagaEvt));
}
Where:
eventBus is the Axon's internal one
sagaInput is simply a Serializable with some input values
SagaWorkflow is a Serializable that models the whole saga flow, whose main attribute is a LinkedList of nodes (the different steps of the saga, each one can have a different logic)
StartSagaEvt is just the POJO that models the event sent to the bus
After this, Axon performs all its 'magic' and finally arrives to the internal code:
AnnotatedSagaRepository.doCreateInstance --> AnnotatedSagaRepository.storeSaga --> [...] --> JpaSagaStore.insertSaga
public void insertSaga(Class<?> sagaType, String sagaIdentifier, Object saga, Set<AssociationValue> associationValues) {
EntityManager entityManager = entityManagerProvider.getEntityManager();
AbstractSagaEntry<?> entry = createSagaEntry(saga, sagaIdentifier, serializer);
entityManager.persist(entry);
for (AssociationValue associationValue : associationValues) {
storeAssociationValue(entityManager, sagaType, sagaIdentifier, associationValue);
}
if (logger.isDebugEnabled()) {
logger.debug("Storing saga id {} as {}", sagaIdentifier, serializedSagaAsString(entry));
}
if (useExplicitFlush) {
entityManager.flush();
}
}
The same applies for the update and delete phases. As far as I know, all the handling of commit/rollback is performed in the class AbstractUnitOfWork, which intervenes only at the end of the complete saga flow.
This leads me to the following considerations/questions:
What is the point of keeping the transaction open during the whole process instead of committing after each step? If for any reason the process fails, the service goes down, the database is not accessible, etc., all the saved information is lost.
There must be a design reason for this behavior, but I'm not able to see it. Or maybe there is a configuration to change it (hopefully, although I doubt it).
Thanks in advance for any comment!
EDIT 2
Effectively, we are using it as a kind of state machine, where the saga flow is a sequence of steps, each one with an action and a compensation, and we jump from one to another until we reach an "END" status.
@Saga
class GenericSaga {
private EventBus eventBus;
private CustomCommandGateway commandGateway;
[...]
@StartSaga
@SagaEventHandler(associationProperty = "sagaId")
public void startStep(StartSagaEvt startSagaEvt) {
// Initializes the GenericSaga and associates several properties with SagaLifecycle.associateWith(key, value);
[...]
// Transit to the next (first) step
eventBus.publish(GenericEventMessage.asEventMessage(new StepSagaEvt(startSagaEvt)));
}
@SagaEventHandler(associationProperty = "sagaId")
public void nextStep(StepSagaEvt stepSagaEvt) {
// Identifies what is the next step in the defined flow, considering if it should be executed sequentially or concurrently, or if it is the end of the flow and then call the SagaLifecycle.end()
[...]
// Also checks if it has to execute the compensation logic of the step
[...]
// Execute
Serializable actionOutput = commandGateway.sendAndWaitEx(stepAction.getActionInput());
}
@SagaEventHandler(associationProperty = "sagaId")
public void resumeSaga(ResumeSagaEvt resumeSagaEvt) {
// Recover information from the execution that we want to resume
[...]
// Transit to the next step
eventBus.publish(GenericEventMessage.asEventMessage(new StepSagaEvt(resumeSagaEvt)));
}
}
As you can see, we don't have an @EndSaga annotation, and maybe that's the problem. But in our current situation we have pushed forward and defined our own custom implementation of the JpaSagaStore, in order to force a local transaction in the insertSaga and updateSaga methods.
Based on my understanding, I think you are somehow misusing the Saga component from Axon Framework. I assume from your question that you are trying to build a form of a 'state machine' using your own SagaWorkflow object. If that is the case, I have to say this is not how Axon intends the usage of Sagas.
To add to that, let me give you a pseudo-sample of what a Saga should look like.
@Saga
class SagaWorkflow {
private transient CommandGateway commandGateway;
@StartSaga
@SagaEventHandler(associationProperty = "yourProperty")
public void on(SagaInputEvent event) {
// validate, associate with another property and fire a command
SagaLifecycle.associateWith("associationPropertyKey", "associationPropertyValue");
commandGateway.send(new GivenCommand());
}
@SagaEventHandler(associationProperty = "associationPropertyKey")
public void on(AnotherEvent event) {
// validate and fire a command or finish the saga
SagaLifecycle.end();
}
@EndSaga
@SagaEventHandler(associationProperty = "anyProperty")
public void on(FinishSagaEvent event) {
// check if you need to fire extra commands to tell others it's finished or just do it silently
}
}
The @Saga annotation will make sure Axon Framework handles the whole Saga process for you, storing (serializing) it to the database when each (Saga)EventHandler is executed
@SagaEventHandler will make sure the 'Event Handling method' reacts to a given Event, only if it contains the associationProperty as part of the Event (to understand it better, I will share our docs link)
@EndSaga will tell Axon Framework to finalize the Saga after the execution of the method (finalizing means deleting it from the database)
SagaLifecycle provides several utility methods to interact with the Saga's lifecycle and associations
In the example, I made the CommandGateway transient because the Saga is serialized and stored in the database. You would not want Axon to serialize any external component, like the gateway, as well
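To illustrate the point about transient fields, here is a small plain-Java sketch. It uses standard java.io serialization as a stand-in, which is an assumption on my side (the serializer Axon actually uses depends on your configuration), and the classes TransientFieldDemo, OrderSaga and Gateway are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical demo: a transient field is left out when the object is serialized. */
final class TransientFieldDemo {

    /** Stand-in for an infrastructure component; deliberately not Serializable. */
    static final class Gateway { }

    static final class OrderSaga implements Serializable {
        private static final long serialVersionUID = 1L;
        final String orderId;          // saga state: serialized
        transient Gateway gateway;     // infrastructure: skipped by serialization

        OrderSaga(String orderId, Gateway gateway) {
            this.orderId = orderId;
            this.gateway = gateway;
        }
    }

    /** Serializes the saga to bytes and reads it back, as a store would do. */
    static OrderSaga roundTrip(OrderSaga saga) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(saga);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (OrderSaga) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

After the round trip the saga state is back, but the gateway field is null, so something has to put the gateway back in when the saga is loaded again.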
Of course, there is more to it.
You can check Axon's docs for that. But I hope this gives you enough material and ideas to use Sagas within Axon Framework better!
KR

Async Issue for DbContext used in constructor of objects created via DI

I wonder if someone can clarify when to await and when not to. Consider this code
public Task<List<User>> GetUsersForParent(int someParentId)
{
var qry = Context.Users.Where(u=>u.parent == someParentId)
.OrderBy(u=>u.Surname);
return FilterActive(qry);
}
//Actually in a generic base class, but not important (I don't think)
protected Task<List<T>> FilterActive(IQueryable<T> query) where T: BaseEntity
{
return query.Where( q=>q.Active == true ).ToListAsync();
}
Then it is used like this
var users = await DbHandler.GetUsersForParent(1);
So the calling method is awaited, but the others are not. Is this correct?
Should the method calling the ToListAsync() be awaited? (this I assume is now doing the work)
My reason for this is I am getting the DbContext is being used by a second thread dreaded exception. I am running out of places to look. My understanding is the methods are building up the whole task which is executed, but could this be messing with the dbContext?
Edit re DbContext error
Having narrowed down the potential locations for the issue, via Debug.Print and SQL Query profiling (just in case that helps anyone else) I can see one statement being profiled (the next in profile is logging the exception) and I can see two methods being run via the debug print.
One of these methods is a PermissionsManager which, when constructed, initialises itself and loads the user data. This is constructed when requested via the DI framework.
The other method is the single query on the OnGet() method for the page. It is running a single query to get an entity by ID, it is awaited correctly.
My working theory at the moment is that the Thread running the DI construction and another thread running the Page initialise are colliding.
When I made the PermissionManager just _person = new Person() // await db.users.get(userid) the issue goes away. I could replicate the issue 1 in 2 or 3 times of refresh, and with that commented I could not replicate, despite refreshing the page 30+ times.
So my real question with async / await is probably more about DI injection and is that construction running on a different thread? if so, any best practice to avoid?
So the calling method is awaited, but the others are not. Is this correct?
I generally recommend using the async and await keywords, and only return the tasks directly if the method is extremely simple.
My reason for this is I am getting the DbContext is being used by a second thread dreaded exception. I am running out of places to look. My understanding is the methods are building up the whole task which is executed, but could this be messing with the dbContext?
No. At least, the code you posted cannot cause that exception. Whether the async/await keywords are used, or whether the tasks are returned directly, the methods are asynchronous and they do not attempt to do more than one thing on the dbcontext at once.
It's possible that your problem is further up the stack. Task.WhenAll is a good thing to search for when tracking this down.
Should the method calling the ToListAsync() be awaited? (this I assume is now doing the work)
If you await the contents of either method you will be returning the result type, not Task of result type which means the execution cannot be deferred.
Your error will come up because either you have multiple threads interacting with the same instance of DbContext (awaited or not, this would cause problems), or you have some code calling the ToListAsync()-containing method, or another async DbContext operation, without awaiting it.
Writing an EF data access layer that returns Task is fairly dangerous and can shoot you in the foot very easily.
Given your code structure I would recommend a couple small changes:
public async Task<List<User>> GetUsersForParent(int someParentId)
{
// Declared as IQueryable<User> so the result of FilterActive can be assigned back.
IQueryable<User> qry = Context.Users.Where(u=>u.parent == someParentId)
.OrderBy(u=>u.Surname);
qry = FilterActive(qry);
return await qry.ToListAsync();
}
protected IQueryable<T> FilterActive(IQueryable<T> query) where T: BaseEntity
{
return query.Where( q=> q.Active == true );
}
Notably here I would avoid returning Task to reduce risks of improper use and potentially intermittent bugs. The base-class method for FilterActive can return IQueryable<T> to apply the filter without triggering the execution of the operation. This way FilterActive can be applied whether you want a List, a Count, or simply do an Exists check.
Overall I would recommend exploring patterns that return IQueryable<TEntity> rather than List<TEntity> etc., as the latter either imposes a lot of limitations on performance and flexibility, or requires a lot of boilerplate code to handle things like:
Sorting,
Pagination,
Getting just a Count,
Performing an Exists check,
Configurable filtering,
Selectively eager loading related data, or
Projection to generate efficient queries
Doing this with methods that return List<TEntity> either results in very complex code to support some of the above considerations, has these operations applied post-execution leading to heavier queries than would otherwise be needed, or requires a lot of near-duplicate code to handle each scenario.
So the constructor thing was a red herring. It was a missing await after all, just not where expected and in code that was unchanged.
I tracked down the culprit. There was a method in the basePage which hooked into the Filter of MVC pages. It took the user and loaded their permissions, however, since this loading of user permissions was made async, this method did not get awaited (it didn't need it before as was synchronous). I moved it to one of the async events on the page life cycle and all seems happy now (with a suitable await!). So it was a missing await, but the moral of the story is any time you make a sync method async, check what the heck is actually using it!

Spring Batch skip exception and rollback in Tasklet

I want to achieve a tasklet that can skip exceptions and rollback the transaction properly and I don't see a way to accomplish both things.
My tasklet reads from a queue of ids that gets filled in the constructor. In each invocation of the execute method one id is processed and depending on if the queue still has elements to be processed or not a RepeatStatus.FINISHED or RepeatStatus.CONTINUABLE is returned. I am using a tasklet instead of a chunk because the processing of each element is fairly complicated and implies doing multiple queries, instantiation of a lot of objects that all gets written to the database later.
The main problem is if I define a try/catch block to wrap the implementation, I can skip exceptions without problems and still be able to re-execute the tasklet with the next element in the queue, but the problem is that everything gets saved in the database. On the other hand, even if the processing of an element is done correctly without problems, if the commit fails for whatever reason, as the error occurs outside the reach and control of my code, the exception is not caught by my code and the tasklet execution is finished without the possibility to skip and continue with the next element of the queue.
This is a simplified schema of my tasklet:
public MyTasklet() {
elementsIds = myRepo.findProcessableElements();
}
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
Long id = elementsIds.remove();
try {
// Business logic
} catch (Exception e) {
// is there a way to tell the framework to rollback ?
LOG.error("error ...", e);
}
if (elementsIds.isEmpty()) {
return RepeatStatus.FINISHED;
} else {
return RepeatStatus.CONTINUABLE;
}
}
Is there a way to achieve these two requirements with tasklets:
To be able to tell the framework to rollback the transaction if an exception is caught in the implementation of the execute method
To continue the execution (consecutive calls) of the tasklet if a commit fails

Testing GWTP presenter with asynchronous calls

I'm using GWTP, adding a Contract layer to abstract the knowledge between Presenter and View, and I'm pretty satisfied of the result with GWTP.
I'm testing my presenters with Mockito.
But as time passed, I found it was hard to maintain a clean presenter with its tests.
There are some refactoring stuff I did to improve that, but I was still not satisfied.
I found the following to be the heart of the matter :
My presenters often need asynchronous calls, or more generally calls to object methods that take a callback to continue the presenter flow (these are usually nested).
For example :
this.populationManager.populate(new PopulationCallback()
{
public void onPopulate()
{
doSomeStufWithTheView(populationManager.get());
}
});
In my tests, I ended up verifying the populate() call on the mocked PopulationManager object, and then creating another test for the doSomeStufWithTheView() method.
But I discovered rather quickly that it was bad design: any change or refactoring ended up breaking a lot of my tests and forced me to write new ones from scratch, even though the presenter functionality did not change!
Plus I didn't test whether the callback was actually what I wanted.
So I tried to use Mockito's doAnswer method so as not to break my presenter testing flow:
doAnswer(new Answer(){
public Object answer(InvocationOnMock invocation) throws Throwable
{
Object[] args = invocation.getArguments();
((PopulationCallback)args[0]).onPopulate();
return null;
}
}).when(this.populationManager).populate(any(PopulationCallback.class));
I factored the code to make it less verbose (and internally less dependent on the argument position):
doAnswer(new PopulationCallbackAnswer())
.when(this.populationManager).populate(any(PopulationCallback.class));
So while mocking the populationManager, I could still test the flow of my presenter, basically like that :
@Test
public void testSomeStuffAppends()
{
// Given
doAnswer(new PopulationCallbackAnswer())
.when(this.populationManager).populate(any(PopulationCallback.class));
// When
this.myPresenter.onReset();
// Then
verify(populationManager).populate(any(PopulationCallback.class)); // That was before
verify(this.myView).displaySomething(); // Now I can do that.
}
I am wondering whether this is a good use of the doAnswer method, or whether it is a code smell and a better design could be used?
Usually, my presenters tend to just use other objects (like some Mediator Pattern) and interact with the view. I have some presenters with several hundred (~400) lines of code.
Again, is that a sign of bad design, or is it normal for a presenter to be verbose (because it is using other objects)?
Has anyone heard of a project which uses GWTP and tests its presenters cleanly?
I hope I explained in a comprehensive way.
Thank you in advance.
PS : I'm pretty new to Stack Overflow, plus my English is still lacking, if my question needs something to be improved, please tell me.
You could use ArgumentCaptor:
Check out this blog post for more details.
If I understood correctly you are asking about design/architecture.
This shouldn't be counted as an answer, it's just my thoughts.
If I have the following code:
public void loadEmoticonPacks() {
executor.execute(new Runnable() {
public void run() {
pack = loadFromServer();
savePackForUsageAfter();
}
});
}
I usually don't test the executor itself and just check that the methods do their concrete job of loading and saving. So the executor here is just an instrument to keep long operations off the UI thread.
If I have something like:
accountManager.setListener(this);
....
public void onAccountEvent(AccountEvent event) {
....
}
I would first check that we subscribed for events (and unsubscribed on destruction), and I would also check that onAccountEvent handles the expected scenarios.
UPD1. Probably, in example 1, it would be better to extract a method loadFromServerAndSave and check that it's not executed on the UI thread, as well as check that it does everything as expected.
UPD2. It's better to use a framework like Guava's EventBus for event processing.
We are using this doAnswer pattern in our presenter tests as well and usually it works just fine. One caveat though: If you test it like this you are effectively removing the asynchronous nature of the call, that is the callback is executed immediately after the server call is initiated.
This can lead to undiscovered race conditions. To check for those, you could make this a two-step process: when calling the server, the answer method only saves the callback. Then, when it is appropriate in your test, you call something like flush() or onSuccess() on your answer (I would suggest making a utility class for this that can be reused in other circumstances), so that you can control when the callback for the result is really called.
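Such a utility could look roughly like the following sketch. It is plain Java without Mockito, so that it stands on its own; the name DeferredCallbacks and its methods are hypothetical. In a real test, the Answer passed to doAnswer(...) would call capture(...) with a Runnable that invokes the callback argument, instead of invoking it right away.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical test utility: stores callbacks and runs them only when asked to. */
final class DeferredCallbacks {

    private final Queue<Runnable> pending = new ArrayDeque<>();

    /** Step one: remember the callback instead of invoking it immediately. */
    void capture(Runnable callback) {
        pending.add(callback);
    }

    /** Step two: run everything captured so far and return how many callbacks ran. */
    int flush() {
        int executed = 0;
        Runnable next;
        while ((next = pending.poll()) != null) {
            next.run();
            executed++;
        }
        return executed;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

This way the test decides when the "server response" arrives, e.g. it can first assert the state of the view right after the call and only then flush.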

GWT app getting java.util.ConcurrentModificationException from MVC pattern

I am getting this error every time my Observers are traversed.
@Override
public void notifyObservers(ModelViewInterface model) {
for(Observer<ModelViewInterface> o : this.observers)
o.notify(model);
}
GWT does not have threads, so it is not a synchronization issue.
It seems to happen after I press a button, any ideas of how to avoid this error?
From the javadoc of ConcurrentModificationException:
Note that this exception does not always indicate that an object has been concurrently modified by a different thread. If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception. For example, if a thread modifies a collection directly while it is iterating over the collection with a fail-fast iterator, the iterator will throw this exception.
So in your case, it seems that o.notify(model) modifies this.observers - directly or indirectly. This is a common phenomenon when modifying the collection you're iterating over.
To avoid concurrent modification, you can operate on a copy of the collection like this:
for(Observer<ModelViewInterface> o :
new ArrayList<Observer<ModelViewInterface>>(this.observers)) {
o.notify(model);
}
However, sometimes this is not what you want - the current behaviour of o.notify could also indicate a bug.
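Here is a small self-contained sketch that reproduces the single-threaded case and shows the copy-based workaround. The names (ObserverIterationDemo, notifyChanged, and so on) are made up for the example and the Observer interface is simplified compared to the one in the question.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

/** Hypothetical demo: an observer that registers another observer while being notified. */
final class ObserverIterationDemo {

    interface Observer {
        void notifyChanged(List<Observer> registry);
    }

    /** Iterates over the live list; returns true if the iteration blew up. */
    static boolean failsWhenIteratingLiveList() {
        List<Observer> observers = new ArrayList<>();
        observers.add(registry -> registry.add(r -> { }));  // modifies the list during notify
        observers.add(registry -> { });
        try {
            for (Observer o : observers) {
                o.notifyChanged(observers);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;   // fail-fast iterator noticed the modification
        }
    }

    /** Iterates over a snapshot, so observers may change the original list safely. */
    static int notifyOnCopy(List<Observer> observers) {
        int notified = 0;
        for (Observer o : new ArrayList<>(observers)) {
            o.notifyChanged(observers);
            notified++;
        }
        return notified;
    }
}
```

Observers added during the notification are not notified in the same round when iterating over the copy, which may or may not be what you want.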