I have an ItemProcessor that writes to the database and an ItemWriter that writes to a file. I want to be able to roll back the database work done in the ItemProcessor and still have the item go through the ItemWriter.
To be more specific, my logic takes the object received from a FlatFileItemReader and uses it to do some maintenance on the database; if everything goes well, some properties are set on the object, and if I hit any database problem, I catch the exception and set some other properties on the object. The processed object is later written to a file through a FlatFileItemWriter. I tried to extend the FlatFileItemWriter to throw an Exception after writing in order to trigger a rollback, but doing so prevents anything from reaching the file.
I have found that the FlatFileItemWriter works with a TransactionAwareBufferedWriter, so there is no way to roll back the transaction and still write to the file on disk: the TransactionAwareBufferedWriter only flushes to disk if the transaction completes without problems. So I implemented a plain writer that implements the ItemWriter interface and throws an Exception right after writing to the file, flushing and closing the stream. It works: it rolls back the transaction and at the same time writes to the file on disk.
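A minimal sketch of that approach, assuming String items and an append-to-file strategy; the class name and the exception thrown are illustrative:

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.util.List;

    import org.springframework.batch.item.ItemWriter;

    public class NonTransactionalFileItemWriter implements ItemWriter<String> {

        private final String path;

        public NonTransactionalFileItemWriter(String path) {
            this.path = path;
        }

        @Override
        public void write(List<? extends String> items) throws Exception {
            // Plain writer: the lines reach the disk regardless of the transaction.
            try (BufferedWriter writer = new BufferedWriter(new FileWriter(path, true))) {
                for (String item : items) {
                    writer.write(item);
                    writer.newLine();
                }
            }
            // Throwing here rolls back the database work done in the processor,
            // while the file content written above is already flushed and closed.
            throw new IllegalStateException("forcing rollback after writing to the file");
        }
    }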
I will not accept my answer for the moment, to see if someone can come up with a better approach, maybe without losing the benefits of having a FlatFileItemWriter.
I'm using EclipseLink (JPA). I've got a service that renames some files and calls a DAO which updates database tables accordingly. This happens in a transaction, and there might be a rollback. So that the DB stays in sync with the file system, I want to at least attempt to revert the file rename, and log an error when that can't be done. Is there some way to register a Runnable (or similar) with the transaction so that it executes on rollback?
EclipseLink has SessionEventListener, which can respond to pre-rollback (preRollbackTransaction) and post-rollback events, and I can get the transaction from the EntityManager. So I was thinking I could create some kind of global registry. However, I can't figure out how to match up the SessionEvent available in the listener with the EntityTransaction available from EntityManager.getTransaction(). Is there some way to do this, or is there a more straightforward way to provide a callback on rollback?
I could simply use a try-catch and execute the revert code in the catch, but it is possible the rollback will occur after the service method has returned (i.e. higher up the stack).
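For illustration, a minimal sketch of the listener-registration part described above, assuming EclipseLink's SessionEventAdapter; RenameRollbackListener and revertAction are hypothetical names, and note that this registers on the whole session rather than on one EntityTransaction, which is exactly the matching problem in question:

    import javax.persistence.EntityManager;

    import org.eclipse.persistence.sessions.Session;
    import org.eclipse.persistence.sessions.SessionEvent;
    import org.eclipse.persistence.sessions.SessionEventAdapter;

    public class RenameRollbackListener extends SessionEventAdapter {

        private final Runnable revertAction;

        public RenameRollbackListener(Runnable revertAction) {
            this.revertAction = revertAction;
        }

        @Override
        public void preRollbackTransaction(SessionEvent event) {
            // Called by EclipseLink just before the transaction is rolled back.
            revertAction.run();
        }

        public static void registerWith(EntityManager em, Runnable revertAction) {
            Session session = em.unwrap(Session.class);
            session.getEventManager().addListener(new RenameRollbackListener(revertAction));
        }
    }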
I have two transactions.
In the first, I select an entity, do validations, upload the file provided by the client to S3, and then update the entity with info about the S3 file.
The second transaction simply deletes this entity.
Now, assume that someone calls the first transaction and, immediately after, the second. The second one finishes faster, and the first one throws DbUpdateConcurrencyException, as the selected entity no longer exists when the update query runs.
I get DbUpdateConcurrencyException when my transaction has IsolationLevel.ReadCommitted. But if I set IsolationLevel.Serializable, it throws InvalidOperationException with Postgres error code 40001. Could someone explain why I get different errors? It seems to me the outcome should be the same, as both errors are caused by updating a non-existent entity.
The 40001 error corresponds to the SQLSTATE serialization_failure (see the table of error codes).
It's generated by the database engine in serializable isolation level when it detects that there are concurrent transactions and this transaction may have produced a result that could not have been obtained if the concurrent transactions had been run serially.
When using IsolationLevel.ReadCommitted, it's impossible to get this error, because choosing that isolation level precisely means the client does not want the database to perform these isolation checks.
On the other hand, the DbUpdateConcurrencyException is probably not generated by the database engine. It's generated by Entity Framework. The database itself is fine with an UPDATE updating zero rows; it's not an error at the SQL level.
I think you get the serialization failure when the database errors out first, and the DbUpdateConcurrencyException when the database doesn't error out but the next layer up (EF) does.
The typical way to deal with serialization failures, at the serializable isolation level, is for the client-side to retry the transaction when it gets a 40001 error. The retried transaction will have a fresh view of the data and hopefully will pass (otherwise, loop on retrying).
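A hedged sketch of that retry loop, shown here with plain JDBC rather than Entity Framework since the pattern is the same; SerializableRetry, TransactionalWork and MAX_ATTEMPTS are illustrative names:

    import java.sql.Connection;
    import java.sql.SQLException;

    import javax.sql.DataSource;

    public class SerializableRetry {

        private static final String SERIALIZATION_FAILURE = "40001";
        private static final int MAX_ATTEMPTS = 5; // arbitrary illustrative limit

        public static void runWithRetry(DataSource dataSource, TransactionalWork work) throws SQLException {
            for (int attempt = 1; ; attempt++) {
                try (Connection con = dataSource.getConnection()) {
                    con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
                    con.setAutoCommit(false);
                    try {
                        work.run(con);
                        con.commit();
                        return;
                    } catch (SQLException e) {
                        con.rollback();
                        if (!SERIALIZATION_FAILURE.equals(e.getSQLState()) || attempt >= MAX_ATTEMPTS) {
                            throw e; // not a serialization failure, or retries exhausted
                        }
                        // otherwise loop: the retried transaction sees a fresh view of the data
                    }
                }
            }
        }

        @FunctionalInterface
        public interface TransactionalWork {
            void run(Connection con) throws SQLException;
        }
    }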
The typical way to deal with concurrency at lesser isolation levels like Read Committed is to explicitly lock objects before accessing them, to force the serialization of concurrent transactions.
I found the following information in the spec, but it's not clear enough for me as a non-native English speaker.
The PostPersist and PostRemove callback methods are invoked for an entity after the entity has been made persistent or removed. These callbacks will also be invoked on all entities to which these operations are cascaded. The PostPersist and PostRemove methods will be invoked after the database insert and delete operations respectively. These database operations may occur directly after the persist, merge, or remove operations have been invoked or they may occur directly after a flush operation has occurred (which may be at the end of the transaction). Generated primary key values are available in the PostPersist method.
My question is: can transaction-related work still be rolled back after @PostRemove?
Let's say my entity deletes some offline files in @PostRemove:
    @Entity
    public class MyEntity {

        @PostRemove
        private void onPostRemove() {
            // delete offline files related to this entity
            // not restorable!
        }
    }
Is it possible that those offline files are deleted from storage while the entity is still left in the database (because of a rollback)?
Yes, it is possible that your files are deleted and your entities are still left in the database after a rollback, because @PostRemove runs inside the transaction.
If you want to be absolutely sure that your files are deleted if and only if the transaction completes successfully, then you should delete the files after commit() succeeds, not in the callback methods. But if you also need to be sure that the entity is removed if and only if the file is deleted, then you have a problem: you need a transactional way of accessing the file system.
For a simple solution, move your files into a to_be_deleted folder during the DB transaction; for that you can use the callback methods. The files are finally deleted when commit() succeeds and restored on failure.
If you want to elaborate it a bit more and your application is running in a Java EE container, then you might want to look at CDI events or even at a JCA adapter. If you are using Spring, you can register a TransactionSynchronizationAdapter, see this answer.
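For the Spring case, a minimal sketch of registering such a synchronization so the physical delete happens only after the commit; DeleteFileAfterCommit is an illustrative name, and the call must be made while transaction synchronization is active:

    import java.nio.file.Files;
    import java.nio.file.Path;

    import org.springframework.transaction.support.TransactionSynchronizationAdapter;
    import org.springframework.transaction.support.TransactionSynchronizationManager;

    public final class DeleteFileAfterCommit {

        public static void schedule(Path file) {
            TransactionSynchronizationManager.registerSynchronization(
                new TransactionSynchronizationAdapter() {
                    @Override
                    public void afterCommit() {
                        try {
                            // The transaction has committed; the file can now go for good.
                            Files.deleteIfExists(file);
                        } catch (Exception e) {
                            // Commit already happened, so at least log the leftover file.
                            System.err.println("Could not delete " + file + ": " + e);
                        }
                    }
                });
        }
    }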
It depends.
If you're using multiple flushes (EntityManager#flush()), the transaction could still be rolled back. Otherwise, any callbacks prefixed with Post are executed after the database transaction is complete.
From the Spring Guides:
For starters, the @EnableBatchProcessing annotation adds many critical beans that support jobs and saves you a lot of leg work. This example uses a memory-based database (provided by @EnableBatchProcessing), meaning that when it's done, the data is gone.
How can I make the execution state backed by a database (or some other persistent record) so that, in case the application crashes, the job is resumed from the previous state?
My solution, until now, is having my ItemReader be a JdbcCursorItemReader which reads records from a table whose column X is NULL, and my ItemWriter be a JdbcBatchItemWriter which updates the record with data in column X, making it non-null (so that it won't be picked up on the next execution). However, this seems really hackish and I believe there's a more elegant way. Can anyone please shed some light?
When using the @EnableBatchProcessing annotation, if you provide a DataSource bean definition called dataSource, Spring Batch will use that database for the job repository store instead of the in-memory map. You can read more about this functionality in the documentation here: http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/core/configuration/annotation/EnableBatchProcessing.html
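A minimal sketch of such a configuration, assuming a PostgreSQL database and placeholder credentials; with a dataSource bean like this in place, @EnableBatchProcessing builds its JobRepository on that database instead of the in-memory map:

    import javax.sql.DataSource;

    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jdbc.datasource.DriverManagerDataSource;

    @Configuration
    @EnableBatchProcessing
    public class BatchRepositoryConfig {

        @Bean
        public DataSource dataSource() {
            // Placeholder connection settings; point this at any database that
            // holds the Spring Batch metadata tables (BATCH_JOB_INSTANCE, ...).
            DriverManagerDataSource ds = new DriverManagerDataSource();
            ds.setDriverClassName("org.postgresql.Driver");
            ds.setUrl("jdbc:postgresql://localhost:5432/batch");
            ds.setUsername("batch");
            ds.setPassword("secret");
            return ds;
        }
    }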
We have implemented the ItemProcessListener and the SkipListener in a batch job that uses Spring Batch.
We are able to log the skipped items in the database without creating a separate transaction. But when the onProcessError method is invoked in the ItemProcessListener, the transaction is rolled back due to the corresponding RuntimeException.
We used @Transactional with propagation REQUIRES_NEW on the service interface for the DB update, but the transaction was still rolled back.
Our objective is to log the exception details in the database whenever there is an error in the processor or writer components and the batch fails. As explained above, the logging does not work when we fire a DB insert from the onProcessError or onWriteError method in the overridden listener: the transaction is rolled back.
We tried creating a new transaction via an annotation on onProcessError, but it failed. Kindly provide some input.
Hope this makes the problem clear.
The Spring configuration requires us to enable the annotations.
The annotations can be enabled by using the tx schema in the applicationContext.xml.
As per the Spring documentation (http://docs.spring.io/spring/docs/2.5.6/reference/transaction.html#transaction-declarative-annotation), we must include the following namespace and schema location:

    xmlns:tx="http://www.springframework.org/schema/tx"
    http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-2.5.xsd
Since the ItemProcessListener's onProcessError method is executed in the same transaction as the chunk being processed, the method is invoked before the rollback. Handling the transaction with @Transactional(propagation = Propagation.REQUIRES_NEW) causes a new transaction to be created, and the data is persisted in the database.
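A hedged sketch of that arrangement, assuming the logging happens in a separate Spring bean so that the REQUIRES_NEW proxy is actually applied; ErrorLogService and LoggingProcessListener are illustrative names:

    import org.springframework.batch.core.ItemProcessListener;
    import org.springframework.stereotype.Component;
    import org.springframework.stereotype.Service;
    import org.springframework.transaction.annotation.Propagation;
    import org.springframework.transaction.annotation.Transactional;

    @Service
    class ErrorLogService {

        @Transactional(propagation = Propagation.REQUIRES_NEW)
        public void logProcessError(Object item, Exception e) {
            // Insert the failure details here (e.g. with a JdbcTemplate or a repository);
            // this runs in its own transaction, so it survives the chunk rollback.
        }
    }

    @Component
    class LoggingProcessListener implements ItemProcessListener<Object, Object> {

        private final ErrorLogService errorLogService;

        LoggingProcessListener(ErrorLogService errorLogService) {
            this.errorLogService = errorLogService;
        }

        @Override
        public void beforeProcess(Object item) { }

        @Override
        public void afterProcess(Object item, Object result) { }

        @Override
        public void onProcessError(Object item, Exception e) {
            // Delegate to the proxied service so the REQUIRES_NEW annotation takes effect.
            errorLogService.logProcessError(item, e);
        }
    }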