How do multi-threaded steps work internally while reading a flat file? - spring-batch

I need to read a flat file of about 10 GB. For that, I chose to use a ThreadPoolTaskExecutor to make my step multi-threaded.
I am wondering how the 4 worker threads work internally. How is it that one thread doesn't read data already read by another thread? If someone could explain how this works internally, that would be a great help.
@Bean
@StepScope
public FlatFileItemReader<Transaction> fileTransactionReader(@Value("#{jobParameters['inputFlatFile']}") Resource resource) {
return new FlatFileItemReaderBuilder<Transaction>()
.saveState(false)
.resource(resource)
.delimited()
.names(new String[] {"account", "amount", "timestamp"})
.fieldSetMapper(fieldSet -> {
Transaction transaction = new Transaction();
transaction.setAccount(fieldSet.readString("account"));
transaction.setAmount(fieldSet.readBigDecimal("amount"));
transaction.setTimestamp(fieldSet.readDate("timestamp", "yyyy-MM-dd HH:mm:ss"));
return transaction;
})
.build();
}
Code -
@Bean
public Job multithreadedJob() {
return this.jobBuilderFactory.get("multithreadedJob")
.start(step1())
.build();
}
@Bean
public Step step1() {
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setCorePoolSize(4);
taskExecutor.setMaxPoolSize(4);
taskExecutor.afterPropertiesSet();
return this.stepBuilderFactory.get("step1")
.<Transaction, Transaction>chunk(100)
.reader(fileTransactionReader(null))
.writer(writer(null))
.taskExecutor(taskExecutor)
.build();
}

FlatFileItemReader is not in itself thread-safe, as it extends AbstractItemCountingItemStreamItemReader, whose Javadoc states: "Subclasses are inherently not thread-safe." So strictly speaking, you should wrap it in a SynchronizedItemStreamReader. See also: Can I use FlatfileItemReader with Taskexecutor?
Having said that, if you
don't care about restartability,
don't care about the line numbers,
don't use a mapping that would require state,
set saveState to false,
and don't change the reader's default bufferedReaderFactory,
then the reader is just a thin wrapper around
a BufferedReader whose method readLine is called for each FlatFileItemReader::read,
and a LineMapper that maps each line to the target type
And BufferedReader is thread-safe which makes your reader effectively safe to call in a multi-threaded step.
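The claim about BufferedReader can be checked outside Spring Batch. The following is a minimal sketch using only the JDK; the class name SharedBufferedReaderDemo and the in-memory input are assumptions made for illustration and are not part of the question's code. Four workers pull lines from one shared BufferedReader, and every line ends up being handed to exactly one worker.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not Spring Batch code.
class SharedBufferedReaderDemo {

    /** Lets `threads` workers pull lines from ONE shared BufferedReader; returns the lines each worker got. */
    static List<List<String>> readConcurrently(String input, int threads) {
        BufferedReader shared = new BufferedReader(new StringReader(input));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(() -> {
                    List<String> mine = new ArrayList<>();
                    String line;
                    // readLine() is guarded by the reader's internal lock,
                    // so each call returns one complete line to one caller.
                    while ((line = shared.readLine()) != null) {
                        mine.add(line);
                    }
                    return mine;
                }));
            }
            List<List<String>> perWorker = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                perWorker.add(future.get());
            }
            return perWorker;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        StringBuilder input = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            input.append("line-").append(i).append('\n');
        }
        int total = 0;
        for (List<String> lines : readConcurrently(input.toString(), 4)) {
            total += lines.size();
        }
        System.out.println("total lines read: " + total);
    }
}
```

Note that which worker gets which line is not deterministic, which is why line numbers and restart state cannot be relied on in this setup.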
But beware: the Spring Batch API makes no promises about the thread-safety of the reader. Quite the opposite, actually. So the multi-threaded behavior is, at least in theory, subject to change in future versions. Furthermore, the conditions listed above are numerous, and some of them may one day no longer hold for your implementation. Thus, using a SynchronizedItemStreamReader is really recommended.
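To illustrate what such a synchronizing wrapper buys you, here is a minimal JDK-only sketch of the decorator idea. SimpleReader, IteratorReader and SynchronizedReader are hypothetical names invented for this example; this is not the source of Spring Batch's SynchronizedItemStreamReader, only the same pattern in miniature: every read() on the non-thread-safe delegate goes through one lock.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the "synchronized delegate" pattern.
class SynchronizedReaderSketch {

    /** Stand-in for an item reader: returns the next item, or null when exhausted. */
    interface SimpleReader<T> {
        T read();
    }

    /** Not thread-safe: hasNext()/next() on a plain iterator may interleave across threads. */
    static final class IteratorReader<T> implements SimpleReader<T> {
        private final Iterator<T> iterator;
        IteratorReader(Iterator<T> iterator) { this.iterator = iterator; }
        @Override public T read() { return iterator.hasNext() ? iterator.next() : null; }
    }

    /** Decorator that serializes every read() call on the delegate. */
    static final class SynchronizedReader<T> implements SimpleReader<T> {
        private final SimpleReader<T> delegate;
        SynchronizedReader(SimpleReader<T> delegate) { this.delegate = delegate; }
        @Override public synchronized T read() { return delegate.read(); }
    }

    /** Drains the reader from several threads and returns how many items were read in total. */
    static int drainConcurrently(SimpleReader<Integer> reader, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> counts = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                counts.add(pool.submit(() -> {
                    int n = 0;
                    while (reader.read() != null) {
                        n++;
                    }
                    return n;
                }));
            }
            int total = 0;
            for (Future<Integer> count : counts) {
                total += count.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            items.add(i);
        }
        SimpleReader<Integer> safe = new SynchronizedReader<>(new IteratorReader<>(items.iterator()));
        System.out.println("items read: " + drainConcurrently(safe, 4));
    }
}
```

With the wrapper in place, thread-safety no longer depends on the internals of the delegate, which is the point of the recommendation above.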
See also Can spring batch multi-threaded step be used safely if number of items in file are very less?

Related

Persistence of execution information in Axon Saga

We are using the Axon Framework to implement the Saga pattern in Java. Axon uses two tables (ASSOCIATION_VALUE_ENTRY and SAGA_ENTRY) to store all the necessary information after each step of the saga. At the end of the process (either on success or, in case of error, once all the compensations have been executed), it deletes the records.
If for any reason, after an error, the compensations cannot be executed, we are able to resume the execution at the point where it failed, based on the stored information. Until here, everything is ok.
The issue came when we wanted to improve the resilience of the process and we checked what happened if the service died during the execution of a saga. According to the above, we expected the information of the execution to be persisted in the tables, but they were empty: the information only appeared when the process couldn't continue due to an error in a compensation (and no final delete action was executed).
Looking at the source code of Axon's JpaSagaStore class, the interactions with the database (insert, update and delete) are followed by a flush instead of a commit. The global commit is managed in the AbstractUnitOfWork class (as far as we understand). This is where our doubts arise:
According to the literature, a flush writes to the database but the records remain uncommitted. The only way to see them in the database would be to enable the READ_UNCOMMITTED isolation level, with the associated problem of 'dirty reads', right? Are there any additional considerations or issues to take into account?
Does Axon offer an alternative to ensure the persistence of the saga records, especially if we cannot enable READ_UNCOMMITTED (due to internal policies)?
EDIT:
To summarize, it all starts with this method:
public void startSaga(SagaWorkflow sagaWorkflow, Serializable sagaInput) {
StartSagaEvt startSagaEvt = StartSagaEvt.builder().sagaWorkflow(sagaWorkflow).sagaInput(sagaInput).build();
eventBus.publish(GenericEventMessage.asEventMessage(startSagaEvt));
}
Where:
eventBus is the Axon's internal one
sagaInput is simply a Serializable with some input values
SagaWorkflow is a Serializable that models the whole saga flow, whose main attribute is a LinkedList of nodes (the different steps of the saga, each one can have a different logic)
StartSagaEvt is just the POJO that models the event sent to the bus
After this, Axon performs all its 'magic' and finally arrives at the internal code:
AnnotatedSagaRepository.doCreateInstance --> AnnotatedSagaRepository.storeSaga --> [...] --> JpaSagaStore.insertSaga
public void insertSaga(Class<?> sagaType, String sagaIdentifier, Object saga, Set<AssociationValue> associationValues) {
EntityManager entityManager = entityManagerProvider.getEntityManager();
AbstractSagaEntry<?> entry = createSagaEntry(saga, sagaIdentifier, serializer);
entityManager.persist(entry);
for (AssociationValue associationValue : associationValues) {
storeAssociationValue(entityManager, sagaType, sagaIdentifier, associationValue);
}
if (logger.isDebugEnabled()) {
logger.debug("Storing saga id {} as {}", sagaIdentifier, serializedSagaAsString(entry));
}
if (useExplicitFlush) {
entityManager.flush();
}
}
The same applies to the update and delete phases. As far as I know, all commit/rollback handling is performed in the AbstractUnitOfWork class, which intervenes only at the end of the complete saga flow.
This leads me to the following considerations/questions:
What is the point of keeping the transaction open during the whole process instead of committing after each step? If the process fails or goes down for any reason, or the database becomes inaccessible, all the saved information is lost.
There must be a design reason for this behavior, but I'm not able to see it. Or maybe there is a configuration to change it (hopefully, although I doubt it).
Thanks in advance for any comment!
EDIT 2
Effectively, we are using it as a kind of state machine, where the saga flow is a sequence of steps, each one with an action and a compensation, and we jump from one to the next until we reach an "END" status.
@Saga
class GenericSaga {
private EventBus eventBus;
private CustomCommandGateway commandGateway;
[...]
@StartSaga
@SagaEventHandler(associationProperty = "sagaId")
public void startStep(StartSagaEvt startSagaEvt) {
// Initializes the GenericSaga and associates several properties with SagaLifecycle.associateWith(key, value);
[...]
// Transit to the next (first) step
eventBus.publish(GenericEventMessage.asEventMessage(new StepSagaEvt(startSagaEvt)));
}
@SagaEventHandler(associationProperty = "sagaId")
public void nextStep(StepSagaEvt stepSagaEvt) {
// Identifies what is the next step in the defined flow, considering if it should be executed sequentially or concurrently, or if it is the end of the flow and then call the SagaLifecycle.end()
[...]
// Also checks if it has to execute the compensation logic of the step
[...]
// Execute
Serializable actionOutput = commandGateway.sendAndWaitEx(stepAction.getActionInput());
}
@SagaEventHandler(associationProperty = "sagaId")
public void resumeSaga(ResumeSagaEvt resumeSagaEvt) {
// Recover information from the execution that we want to resume
[...]
// Transit to the next step
eventBus.publish(GenericEventMessage.asEventMessage(new StepSagaEvt(resumeSagaEvt)));
}
}
As you can see, we don't have an @EndSaga annotation, and maybe that's the problem. But in our current situation we have pushed ahead and defined our own custom implementation of the JpaSagaStore, in order to force a local transaction in the insertSaga and updateSaga methods.
Based on my understanding, I think you are somehow misusing the Saga component from Axon Framework. I assume from your question that you are trying to build a form of a 'state machine' using your own SagaWorkflow object. If that is the case, I have to say this is not how Axon intends the usage of Sagas.
To add to that, let me give you a pseudo-sample of what a Saga should look like.
@Saga
class SagaWorkflow {
private transient CommandGateway commandGateway;
@StartSaga
@SagaEventHandler(associationProperty = "yourProperty")
public void on(SagaInputEvent event) {
// validate, associate with another property and fire a command
SagaLifecycle.associateWith("associationPropertyKey", "associationPropertyValue");
commandGateway.send(new GivenCommand());
}
@SagaEventHandler(associationProperty = "associationPropertyValue")
public void on(AnotherEvent event) {
// validate and fire a command or finish the saga
SagaLifecycle.end();
}
@EndSaga
@SagaEventHandler(associationProperty = "anyProperty")
public void on(FinishSagaEvent event) {
// check if you need to fire extra commands to tell others it's finished or just do it silently
}
}
The @Saga annotation makes sure Axon Framework handles the whole Saga process for you, storing (serializing) it to the database when each (Saga)EventHandler is executed
@SagaEventHandler makes sure the 'Event Handling method' reacts to a given Event, only if it contains the associationProperty as part of the Event (to understand it better, I will share our docs link)
@EndSaga tells Axon Framework to finalize the Saga after the execution of the method (finalizing means deleting it from the database)
SagaLifecycle provides several utility methods to interact with the Saga's lifecycle and associations
In the example, I made the CommandGateway transient because the Saga is serialized and stored in the database. You would not want Axon to serialize any external component, like the gateway, as well
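The effect of the transient keyword can be seen with plain Java serialization, used here only as a stand-in for whichever serializer Axon is configured with. OrderSagaState and FakeGateway are hypothetical names made up for this sketch; they are not Axon classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo, not Axon code.
class TransientFieldDemo {

    /** Stand-in for an infrastructure component such as a gateway; deliberately NOT Serializable. */
    static final class FakeGateway {
    }

    /** Stand-in for saga state that gets stored between event handler invocations. */
    static final class OrderSagaState implements Serializable {
        private static final long serialVersionUID = 1L;

        final String orderId;                 // business state: serialized
        final transient FakeGateway gateway;  // infrastructure: skipped by serialization

        OrderSagaState(String orderId, FakeGateway gateway) {
            this.orderId = orderId;
            this.gateway = gateway;
        }
    }

    /** Serializes the state to bytes and reads it back, as a store and load cycle would. */
    static OrderSagaState roundTrip(OrderSagaState state) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(state);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (OrderSagaState) in.readObject();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        OrderSagaState restored = roundTrip(new OrderSagaState("order-42", new FakeGateway()));
        System.out.println(restored.orderId + ", gateway restored: " + (restored.gateway != null));
        // prints "order-42, gateway restored: false"
    }
}
```

Without transient, writing the object would fail in this sketch because FakeGateway is not serializable; with it, only the business state travels to the store.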
Of course, there is more to it.
You can check Axon's docs for that. But I hope this gives you enough material and ideas to use Sagas within Axon Framework better!
KR

Sleuth tracing is not working for transactional Kafka producers

Currently, we are using transactional Kafka producers. What we have noticed is that the tracing aspect of Kafka is missing which means we don't get to see the instrumentation of Kafka producers thereby missing the b3 headers.
After going through the code, we found that the post processors are not invoked for transactional producers, which means the TracingProducer is never created by the TraceProducerPostProcessor. Is there a reason for that? Also, what is the workaround for enabling tracing for transactional producers? There does not seem to be a single, easy place to create a tracing producer (DefaultKafkaProducerFactory#doCreateTxProducer is private).
Screenshot attached (DefaultKafkaProducerFactory class). In the screenshot you can see that the post processors are invoked only for the raw producer, not for the transactional producer.
Your help will be much appreciated.
Thanks
DefaultKafkaProducerFactory#createRawProducer
??
createRawProducer() is called for both transactional and non-transactional producers:
Something else is going on.
EDIT
The problem is that Sleuth replaces the producer with a different one, but the factory discards that and uses the original.
https://github.com/spring-projects/spring-kafka/issues/1778
EDIT2
Actually, it's a good thing that we discard the tracing producer here; Sleuth also wraps the factory in a proxy and wraps the CloseSafeProducer in a TracingProducer; but I see the same result with both transactional and non-transactional producers...
@SpringBootApplication
public class So67194702Application {
public static void main(String[] args) {
SpringApplication.run(So67194702Application.class, args);
}
@Bean
public ApplicationRunner runner(ProducerFactory<String, String> pf) {
return args -> {
Producer<String, String> prod = pf.createProducer();
prod.close();
};
}
}
Putting a breakpoint on the close()...
Thanks, Gary Russell, for the very quick response. createRawProducer is effectively called for both transactional and non-transactional producers.
Sleuth uses the TraceProducerPostProcessor to wrap a Kafka producer in a TracingProducer. As the ProducerPostProcessor interface extends the Function interface, we may suppose that the result of the function could or should be used, but the createRawProducer method of the DefaultKafkaProducerFactory applies the post processors without using the returned value, which causes the issue in this specific case.
So, couldn't we modify the implementation of createRawProducer to assign the result of the post processor? If not, wouldn't it be better to have post processors extend Consumer instead of Function?
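The difference between ignoring and assigning the result of a Function-style post processor can be reproduced with plain JDK types. This is only a sketch of the mechanism described above: Sender, PlainSender and TracingSender are hypothetical stand-ins, not Spring Kafka or Sleuth classes.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical demo, not Spring Kafka code.
class PostProcessorDemo {

    /** Minimal stand-in for a producer. */
    interface Sender {
        String send(String message);
    }

    static final class PlainSender implements Sender {
        @Override public String send(String message) { return message; }
    }

    /** Wrapper that decorates what is sent, standing in for a tracing wrapper. */
    static final class TracingSender implements Sender {
        private final Sender delegate;
        TracingSender(Sender delegate) { this.delegate = delegate; }
        @Override public String send(String message) { return "[traced] " + delegate.send(message); }
    }

    /** Applies the post processors but ignores what they return: a wrapper is silently lost. */
    static Sender createIgnoringResult(List<Function<Sender, Sender>> postProcessors) {
        Sender sender = new PlainSender();
        for (Function<Sender, Sender> postProcessor : postProcessors) {
            postProcessor.apply(sender);
        }
        return sender;
    }

    /** Assigns the result of each post processor, so wrappers take effect. */
    static Sender createUsingResult(List<Function<Sender, Sender>> postProcessors) {
        Sender sender = new PlainSender();
        for (Function<Sender, Sender> postProcessor : postProcessors) {
            sender = postProcessor.apply(sender);
        }
        return sender;
    }

    public static void main(String[] args) {
        Function<Sender, Sender> tracing = TracingSender::new;
        List<Function<Sender, Sender>> postProcessors = List.of(tracing);
        System.out.println(createIgnoringResult(postProcessors).send("hello")); // prints "hello"
        System.out.println(createUsingResult(postProcessors).send("hello"));    // prints "[traced] hello"
    }
}
```

A post processor that only mutates the instance it is given works in both variants; one that returns a replacement only works in the second.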
Successful test made by overriding the createRawProducer method as follows:
@Override
protected Producer<K, V> createRawProducer(Map<String, Object> rawConfigs) {
Producer<K, V> kafkaProducer = new KafkaProducer<>(rawConfigs, getKeySerializerSupplier().get(), getValueSerializerSupplier().get());
for (ProducerPostProcessor<K, V> pp : getPostProcessors()) {
kafkaProducer = pp.apply(kafkaProducer);
}
return kafkaProducer;
}
Thank you for your help.

How to correctly use suspend functions with coroutines on webflux?

I'm new to reactive programming and because I've already used kotlin with spring-web in the past, I decided to go to spring-webflux on this new project I'm working on. Then I discovered Mono and Flux apis and decided to use spring-data-r2dbc to keep full reactive stack (I'm aware I don't know how far this new project could be from meeting all reactive expectations, I'm doing this to learn a new tool, not because this is the perfect scenario for this new tool)
then I noticed I could replace all reactive streams apis from webflux with kotlin's native coroutines. I also opted for coroutines simply to learn and have less 'external frameworky' code
my application is quite simple (it's an url shortener):
1. parse some url out of http request's body into 3 parts
2. exchange each part to its postgres id on each respective table
3. concat these 3 ids into a new url, sending a 200 http response with this new url
my reactive controller is
@Configuration
class UrlRouter {
@Bean
fun urlRoutes(
urlHandler: UrlHandler,
redirectHandler: RedirectHandler
) = coRouter {
POST("/e", urlHandler::encode)
GET("/{*url}", redirectHandler::redirect)
}
}
as you can imagine, UrlHandler is responsible for the steps numbered above and RedirectHandler does the opposite: receiving an encoded url, it redirects to the original url received in step 1.
question 1: checking on coRouter, I assumed that for each http call, spring will start a new coroutine to resolve that call (as opposed to a new thread on traditional spring-web), and each of these can create and depend on several other sub coroutines. Is this right? Does this hierarchy exist?
here's my UrlHandler fragment:
@Component
class UrlHandler(
private val cache: CacheService,
@Value("\${redirect-url-prefix}") private val prefix: String
) {
companion object {
val mapper = jacksonObjectMapper()
}
suspend fun encode(serverRequest: ServerRequest): ServerResponse =
try {
val bodyMap: Map<String, String> = mapper.readValue(serverRequest.awaitBody<String>())
// parseUrl being a string extension function just splitting
// that could throw IndexOutOfBoundsException
val (host, path, query) = bodyMap["url"]!!.parseUrl()
val hostId: Long = cache.findIdFromHost(host)
val pathId: Long? = cache.findIdFromPath(path)
val queryId: Long? = cache.findIdFromQuery(query)
val encodedUrl = "$prefix/${someOmmitedStringConcatenation(hostId, pathId, queryId)}"
ok().bodyValueAndAwait(mapOf("url" to encodedUrl))
} catch (e: IndexOutOfBoundsException) {
ServerResponse.badRequest().buildAndAwait()
}
all three findIdFrom*** calls try to retrieve an existing id and if it doesn't exist, save new entity and return new id from postgres sequence. This is done by CoroutineCrudRepository interfaces. Since my methods should always suspend, all 3 findIdFrom*** also suspend:
@Repository
interface HostUrlRepo : CoroutineCrudRepository<HostUrl, Long> {
suspend fun findByHost(host: String): HostUrl?
}
question 2: looking here I've found either invoke reactive query methods or have native suspended functions. Since I've read methods should always suspend, I've decided to keep myself using suspend. Is this bad/wrong in any way?
these 3 findIdFrom*** are independent and could be called to run in parallel and then only at someOmmitedStringConcatenation I should wait for any unfinished calls to actually build my encoded url
question 3: since every single method has the suspend modifier, will it run exactly as in the traditional imperative sequential paradigm (wasting any benefit from parallel programming)?
question 4: is this a valid scenario for coroutines usage? If so, how should I change my code to best fit the parallelism I want above?
possible solutions I've found for question 4:
question 4.1: source 1 inside each findIdFrom*** wrap it with withContext(Dispatchers.IO){ /*actual code here*/ } and then on encode function:
coroutineScope {
val hostIdDeferred = async { findIdFrom***() }
val pathIdDeferred = async { findIdFrom***() }
val queryIdDeferred = async { findIdFrom***() }
}
and when I want to use them, just use hostIdDeferred.await() to get the value. If I'm using Dispatchers.IO scope to run code inside new children coroutines, why coroutineScope is necessary? Is this the correct way, specifying a scope to the new coroutine child and then using coroutineScope to have a deferred val?
question 4.2: source 2 val resultOne = Async(Dispatchers.IO) { function1() } Intellij wasn't able to recognize/import any Async expression. How can I use this one and how it differs from previous one?
I'm open to improve and clarify any point on this question
I'll try to answer some of your questions:
q2: No, nothing wrong with it. Suspend methods can propagate all the way back to a controller. If your controllers are reactive, i.e. if you use RSocket with org.springframework.messaging.handler.annotation.MessageMapping, then even controller methods can be suspend.
q3: Right, but your source code is still much simpler.
q4.2: I wouldn't consider that website as a trustworthy source. There is an official documentation with examples: async

Can I use SpringData by itself [duplicate]

I'm trying to wire up Spring Data JPA objects manually so that I can generate DAO proxies (aka Repositories) - without using a Spring bean container.
Inevitably, I will be asked why I want to do this: it is because our project is already using Google Guice (and on the UI using Gin with GWT), and we don't want to maintain another IoC container configuration, or pull in all the resulting dependencies. I know we might be able to use Guice's SpringIntegration, but this would be a last resort.
It seems that everything is available to wire the objects up manually, but since it's not well documented, I'm having a difficult time.
According to the Spring Data user's guide, using repository factories standalone is possible. Unfortunately, the example shows RepositoryFactorySupport which is an abstract class. After some searching I managed to find JpaRepositoryFactory
JpaRepositoryFactory actually works fairly well, except it does not automatically create transactions. Transactions must be managed manually, or nothing will get persisted to the database:
entityManager.getTransaction().begin();
repositoryInstance.save(someJpaObject);
entityManager.getTransaction().commit();
The problem turned out to be that @Transactional annotations are not used automatically, and need the help of a TransactionInterceptor
Thankfully, the JpaRepositoryFactory can take a callback to add more AOP advice to the generated Repository proxy before returning:
final JpaTransactionManager xactManager = new JpaTransactionManager(emf);
final JpaRepositoryFactory factory = new JpaRepositoryFactory(emf.createEntityManager());
factory.addRepositoryProxyPostProcessor(new RepositoryProxyPostProcessor() {
@Override
public void postProcess(ProxyFactory factory) {
factory.addAdvice(new TransactionInterceptor(xactManager, new AnnotationTransactionAttributeSource()));
}
});
This is where things are not working out so well. Stepping through the debugger in the code, the TransactionInterceptor is indeed creating a transaction - but on the wrong EntityManager. Spring manages the active EntityManager by looking at the currently executing thread. The TransactionInterceptor does this and sees there is no active EntityManager bound to the thread, and decides to create a new one.
However, this new EntityManager is not the same instance that was created and passed into the JpaRepositoryFactory constructor, which requires an EntityManager. The question is, how do I make the TransactionInterceptor and the JpaRepositoryFactory use the same EntityManager?
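The idea of looking up a resource via the currently executing thread can be sketched with a plain ThreadLocal. This is a heavily simplified, hypothetical model (ThreadBoundResources is a made-up class), not the implementation of Spring's TransactionSynchronizationManager; it only shows why a resource bound on one thread is found there and nowhere else.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a thread-bound resource registry.
class ThreadBoundResources {

    // One private key -> resource map per thread.
    private static final ThreadLocal<Map<Object, Object>> RESOURCES =
            ThreadLocal.withInitial(HashMap::new);

    /** Binds a resource for the current thread; refuses to overwrite an existing binding. */
    static void bind(Object key, Object value) {
        if (RESOURCES.get().putIfAbsent(key, value) != null) {
            throw new IllegalStateException("Already a value bound for key " + key);
        }
    }

    /** Returns the resource bound to the current thread, or null if there is none. */
    static Object get(Object key) {
        return RESOURCES.get().get(key);
    }

    /** Removes and returns the binding of the current thread. */
    static Object unbind(Object key) {
        Object value = RESOURCES.get().remove(key);
        if (value == null) {
            throw new IllegalStateException("No value bound for key " + key);
        }
        return value;
    }

    /** Performs the same lookup from a different thread and returns what that thread sees. */
    static Object lookupFromOtherThread(Object key) {
        Object[] seen = new Object[1];
        Thread other = new Thread(() -> seen[0] = get(key));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        Object factory = new Object(); // stands in for the EntityManagerFactory (the key)
        Object manager = new Object(); // stands in for the EntityManager (the bound resource)
        bind(factory, manager);
        try {
            System.out.println("same thread sees bound instance: " + (get(factory) == manager)); // true
            System.out.println("other thread sees: " + lookupFromOtherThread(factory));          // null
        } finally {
            unbind(factory);
        }
    }
}
```

In this model, code that finds nothing bound for its key has no choice but to create its own resource, which matches the behavior observed in the debugger.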
Update:
While writing this up, I found out how to solve the problem, but it still may not be the ideal solution. I will post this solution as a separate answer. I would be happy to hear any suggestions on a better way to use Spring Data JPA standalone than how I've solved it.
The general principle behind the design of JpaRepositoryFactory and the corresponding Spring integration JpaRepositoryFactoryBean is the following:
We're assuming you run your application inside a managed JPA runtime environment, not caring about which one.
That's the reason we rely on an injected EntityManager rather than an EntityManagerFactory. By definition the EntityManager is not thread safe. So if we dealt with an EntityManagerFactory directly, we would have to rewrite all the resource-managing code that a managed runtime environment (just like Spring or EJB) would provide you.
To integrate with the Spring transaction management we use Spring's SharedEntityManagerCreator that actually does the transaction resource binding magic you've implemented manually. So you probably want to use that one to create EntityManager instances from your EntityManagerFactory. If you want to activate the transactionality at the repository beans directly (so that a call to e.g. repo.save(…) creates a transaction if none is already active) have a look at the TransactionalRepositoryProxyPostProcessor implementation in Spring Data Commons. It actually activates transactions when Spring Data repositories are used directly (e.g. for repo.save(…)) and slightly customizes the transaction configuration lookup to prefer interfaces over implementation classes to allow repository interfaces to override transaction configuration defined in SimpleJpaRepository.
I solved this by manually binding the EntityManager and EntityManagerFactory to the executing thread, before creating repositories with the JpaRepositoryFactory. This is accomplished using the TransactionSynchronizationManager.bindResource method:
EntityManagerFactory emf = Persistence.createEntityManagerFactory("com.foo.model", properties);
EntityManager em = emf.createEntityManager();
// Create your transaction manager and RepositoryFactory
final JpaTransactionManager xactManager = new JpaTransactionManager(emf);
final JpaRepositoryFactory factory = new JpaRepositoryFactory(em);
// Make sure calls to the repository instance are intercepted for annotated transactions
factory.addRepositoryProxyPostProcessor(new RepositoryProxyPostProcessor() {
@Override
public void postProcess(ProxyFactory factory) {
factory.addAdvice(new TransactionInterceptor(xactManager, new MatchAlwaysTransactionAttributeSource()));
}
});
// Create your repository proxy instance
FooRepository repository = factory.getRepository(FooRepository.class);
// Bind the same EntityManager used to create the Repository to the thread
TransactionSynchronizationManager.bindResource(emf, new EntityManagerHolder(em));
try {
repository.save(someInstance); // Done in a transaction using 1 EntityManager
} finally {
// Make sure to unbind when done with the repository instance (same key as in bindResource)
TransactionSynchronizationManager.unbindResource(emf);
}
There must be a better way though. It seems strange that the RepositoryFactory was designed to use an EntityManager instead of an EntityManagerFactory. I would expect that it would first look to see if an EntityManager is bound to the thread and then either create a new one and bind it, or use the existing one.
Basically, I would want to inject the repository proxies, and expect on every call they internally create a new EntityManager, so that calls are thread safe.

How is the skipping implemented in Spring Batch?

I was wondering how I could determine in my ItemWriter whether Spring Batch is currently in chunk-processing mode or in the fallback single-item-processing mode. To begin with, I couldn't find any information on how this fallback mechanism is implemented.
Even if I haven't found the solution to my actual problem yet, I'd like to share my knowledge about the fallback mechanism with you.
Feel free to add answers with additional information if I missed anything ;-)
The implementation of the skip mechanism can be found in the FaultTolerantChunkProcessor and in the RetryTemplate.
Let's assume you configured skippable exceptions but no retryable exceptions. And there is a failing item in your current chunk causing an exception.
Now, first of all, the whole chunk is to be written. In the processor's write() method you can see that a RetryTemplate is called. It is also passed references to a RetryCallback and a RecoveryCallback.
Switch over to the RetryTemplate. Find the following method:
protected <T> T doExecute(RetryCallback<T> retryCallback, RecoveryCallback<T> recoveryCallback, RetryState state)
There you can see that the RetryCallback is retried as long as the retry policy is not exhausted (i.e. it is executed exactly once in our configuration). Such a retry is triggered by a retryable exception. Non-retryable exceptions immediately abort the retry mechanism here.
After the retries are exhausted or aborted, the RecoveryCallback will be called:
e = handleRetryExhausted(recoveryCallback, context, state);
That's where the single-item-processing mode kicks in!
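The control flow described so far (run the callback, retry only on retryable exceptions, then hand over to a recovery callback) can be sketched in a few lines of plain Java. This is a simplified model under my own assumptions, not the RetryTemplate source; RetryThenRecoverSketch and its execute signature are hypothetical.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical, simplified model of "retry, then recover".
class RetryThenRecoverSketch {

    /**
     * Runs the callback up to maxAttempts times. A non-retryable exception aborts the loop at once.
     * When the attempts are exhausted or aborted, the recovery function decides the result.
     */
    static <T> T execute(Supplier<T> callback,
                         Function<RuntimeException, T> recovery,
                         int maxAttempts,
                         Predicate<RuntimeException> retryable) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return callback.get();
            } catch (RuntimeException e) {
                last = e;
                if (!retryable.test(e)) {
                    break; // abort the retry loop immediately
                }
            }
        }
        return recovery.apply(last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Skippable but not retryable: the callback runs once, then recovery takes over.
        String outcome = RetryThenRecoverSketch.<String>execute(
                () -> { calls[0]++; throw new IllegalStateException("write failed"); },
                e -> "recovered after " + calls[0] + " attempt(s)",
                3,
                e -> false);
        System.out.println(outcome); // prints "recovered after 1 attempt(s)"
    }
}
```

In the configuration discussed here (skippable but no retryable exceptions), the callback therefore runs once and the recovery path is entered right after the first failure.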
The RecoveryCallback (which was defined in the processor's write() method!) will put a lock on the input chunk (inputs.setBusy(true)) and run its scan() method. There you can see, that a single item is taken from the chunk:
List<O> items = Collections.singletonList(outputIterator.next());
If this single item can be processed by the ItemWriter correctly, then the chunk will be finished and the ChunkOrientedTasklet will run another chunk (for the next single items). This will cause a regular call to the RetryCallback, but since the chunk has been locked by the RecoveryCallback, the scan() method will be called immediately:
if (!inputs.isBusy()) {
// ...
}
else {
scan(contribution, inputs, outputs, chunkMonitor);
}
So another single item will be processed and this is repeated, until the original chunk has been processed item-by-item:
if (outputs.isEmpty()) {
inputs.setBusy(false);
That's it. I hope you found this helpful. And I hope even more that you found this easily via a search engine and didn't waste too much time finding this out by yourself. ;-)
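As a rough mental model of the fallback described above, the following JDK-only sketch writes a whole chunk first and, if that fails, replays it one item at a time and skips the items that fail again. ScanFallbackSketch is a hypothetical class; it deliberately leaves out what the framework really does around this (transactions, rollback, one single-item chunk per tasklet iteration, the busy flag).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, simplified model of the chunk -> single-item fallback.
class ScanFallbackSketch {

    /** Outcome of processing one chunk. */
    static final class Result {
        final List<Integer> written = new ArrayList<>();
        final List<Integer> skipped = new ArrayList<>();
    }

    /** Tries the whole chunk first; on failure, replays it item by item and skips the failing ones. */
    static Result writeWithScan(List<Integer> chunk, Consumer<List<Integer>> writer) {
        Result result = new Result();
        try {
            writer.accept(chunk); // chunk-processing mode
            result.written.addAll(chunk);
        } catch (RuntimeException chunkFailure) {
            for (Integer item : chunk) { // single-item mode ("scan")
                try {
                    writer.accept(Collections.singletonList(item));
                    result.written.add(item);
                } catch (RuntimeException itemFailure) {
                    result.skipped.add(item);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Writer that cannot handle the item 3.
        Consumer<List<Integer>> writer = items -> {
            if (items.contains(3)) {
                throw new IllegalArgumentException("cannot write 3");
            }
        };
        Result result = writeWithScan(List.of(1, 2, 3, 4, 5), writer);
        System.out.println("written=" + result.written + ", skipped=" + result.skipped);
        // prints "written=[1, 2, 4, 5], skipped=[3]"
    }
}
```

Note that the writer is called with a singleton list in single-item mode, which is exactly the observation the approaches below try to exploit.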
A possible approach to my original problem (the ItemWriter would like to know, whether it's in chunk or single-item mode) could be one of the following alternatives:
Only when the passed chunk is of size one, any further checks have to be done
When the passed chunk is a java.util.Collections.SingletonList, we would be quite sure, since the FaultTolerantChunkProcessor does the following:
List items = Collections.singletonList(outputIterator.next());
Unfortunately, this class is private, so we can't check for it with instanceof.
In reverse, if the chunk is an ArrayList we could also be quite sure, since the Spring Batch's Chunk class uses it:
private List items = new ArrayList();
One remaining uncertainty would be buffered items read from the execution context. But I'd expect those to be ArrayLists as well.
Anyway, I still find this method too vague. I'd rather like to have this information provided by the framework.
An alternative would be to hook my ItemWriter in the framework execution. Maybe ItemWriteListener.onWriteError() is appropriate.
Update: The onWriteError() method will not be called if you're in single-item mode and throw an exception in the ItemWriter. I think that's a bug and I filed it: https://jira.springsource.org/browse/BATCH-2027
So this alternative drops out.
Here's a snippet to do the same without any framework means, directly in the writer:
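// Number of items of the failed chunk that still have to be replayed one by one
private int writeErrorCount = 0;

@Override
public void write(final List<? extends Long> items) throws Exception {
    try {
        writeWhatever(items);
    } catch (final Exception e) {
        if (this.writeErrorCount == 0) {
            // First failure: the chunk will now be replayed item by item
            this.writeErrorCount = items.size();
        } else {
            // Failure of a single item during the replay
            this.writeErrorCount--;
        }
        throw e;
    }
    // Count down only during the replay, so a normal successful chunk leaves the counter at 0
    if (this.writeErrorCount > 0) {
        this.writeErrorCount--;
    }
}

public boolean isWriterInSingleItemMode() {
    return writeErrorCount != 0;
}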
Attention: One should rather check for the skippable exceptions here and not for Exception in general.