I am working on a program that reads from a file and inserts the lines one by one into an Oracle 11g database, using JTA/EclipseLink 2.3.x JPA with container-managed transactions.
I've developed the code below, but I'm bothered by the fact that failed lines have to be identified and fixed manually.
public class CreateAccount {

    @PersistenceContext(unitName = "filereader")
    private EntityManager em;

    private ArrayList<String> unprocessed;

    public void upload() {
        // read the file into unprocessed
        for (String s : unprocessed) {
            this.process(s);
        }
    }

    private void process(String s) {
        // build the Account entity from the line and validate it
        em.persist(account);
    }
}
This first version takes a few seconds to commit 5000 rows to the database, apparently because it benefits from prepared-statement caching. It works fine when all entities to persist are valid. However, I'm concerned that even a validated entity can still fail for various unexpected reasons, and when any entity throws an exception during commit I cannot find the particular record that caused it, and all entities are rolled back.
I tried another approach that begins and commits a new transaction for each line, without using the managed transaction:
for (String s : unprocessedLines) {
    try {
        em.getTransaction().begin();
        this.process(s);
        em.getTransaction().commit();
    } catch (Exception e) {
        // Any exception that a line caused can be caught here
        e.printStackTrace();
    }
}
The second version works well for logging erroneous lines, since the exception caused by an individual line is caught and handled, but it takes over 300 s to commit the same 5000 lines. That is not reasonable when a large file is being processed.
Is there any workaround that lets me insert records quickly while still being notified of any failed lines?
This is more of a guess, but why not keep the single transaction and commit in one batch? That way you keep the rollback exception while keeping the speed:
EntityTransaction tx = em.getTransaction();
try {
    tx.begin();
    for (String s : unprocessedLines) {
        this.process(s);
    }
    tx.commit();
} catch (RollbackException exc) {
    // here you have your rollback reason
} finally {
    if (tx.isActive()) {
        tx.rollback();
    }
}
My solution turned out to be a binary search, starting with a block of reasonable size, e.g. last = first + 1023, to minimize the depth of the tree.
Note, however, that this works only if the error is deterministic, and it is worse than committing each record individually if the error rate is very high.
private void batchProcess(int first, int last) {
    try {
        em.getTransaction().begin();
        for (int i = first; i <= last; i++) {
            this.process(unprocessedLines.get(i));
        }
        em.getTransaction().commit();
    } catch (Exception e) {
        e.printStackTrace();
        if (em.getTransaction().isActive()) {
            em.getTransaction().rollback();
        }
        if (first == last) {
            failedLines.add(unprocessedLines.get(first));
        } else {
            int mid = (first + last) / 2 + 1;
            batchProcess(first, mid - 1);
            batchProcess(mid, last);
        }
    }
}
With a container-managed transaction, the binary search may need to run outside the context of the transaction; otherwise a RollbackException is thrown, because the container has already decided to roll the transaction back.
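The bisection idea can be sketched without any JPA at all. The block below is a plain-Java simulation (all names are hypothetical; there is no EntityManager) that treats a range as failed whenever it contains a bad record, and recurses until each bad record is isolated:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class BatchBisector {
    // Rows that deterministically "fail", standing in for records
    // the database rejects; in the real code this is the commit throwing.
    private final Set<Integer> badRows;
    final List<Integer> failedLines = new ArrayList<>();

    BatchBisector(Set<Integer> badRows) {
        this.badRows = badRows;
    }

    // Mirrors batchProcess(first, last): try the whole range in one
    // "transaction"; on failure, split and recurse until the offending
    // single rows are isolated.
    void batchProcess(int first, int last) {
        boolean failed = false;
        for (int i = first; i <= last; i++) {
            if (badRows.contains(i)) { failed = true; break; } // "commit" fails
        }
        if (!failed) return; // the whole range committed in one go
        if (first == last) {
            failedLines.add(first); // isolated one offending row
        } else {
            int mid = (first + last) / 2 + 1;
            batchProcess(first, mid - 1);
            batchProcess(mid, last);
        }
    }
}
```

With two bad rows out of 1024, only about 2 × log2(1024) ranges fail before both rows are pinpointed, which is why the wall-clock cost stays close to the single-batch version when errors are rare.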
Related
I'm using Spring Batch to write a batch process and I'm having issues handling exceptions.
I have a reader that fetches items in a specific state from a database. The reader passes each item to the processor step, which can throw MyException. When this exception is thrown I want to skip the item that caused it and continue reading the next one.
The issue is that I need to change the state of that item in the database so it isn't fetched again by the reader.
This is what I tried:
return this.stepBuilderFactory.get("name")
        .<Input, Output>chunk(1)
        .reader(reader())
        .processor(processor())
        .faultTolerant()
        .skipPolicy(skipPolicy())
        .writer(writer())
        .build();
In my SkipPolicy class I have the next code:
public boolean shouldSkip(Throwable throwable, int skipCount) throws SkipLimitExceededException {
    if (throwable instanceof MyException) {
        // log the issue
        // update the item that caused the exception in the database so the reader doesn't return it again
        return true;
    }
    return false;
}
With this code the exception is skipped and my reader is called again; however, the SkipPolicy didn't commit my change (or rolled it back), so the reader fetches the same item and tries to process it again.
I also tried with an ExceptionHandler:
return this.stepBuilderFactory.get("name")
        .<Input, Output>chunk(1)
        .reader(reader())
        .processor(processor())
        .faultTolerant()
        .skip(MyException.class)
        .exceptionHandler(myExceptionHandler())
        .writer(writer())
        .build();
In my ExceptionHandler class I have the next code:
public void handleException(RepeatContext context, Throwable throwable) throws Throwable {
    if (throwable.getCause() instanceof MyException) {
        // log the issue
        // update the item that caused the exception in the database so the reader doesn't return it again
    } else {
        throw throwable;
    }
}
With this solution the state is changed in the database, but the reader isn't called; instead, the processor's process method is called again, resulting in an infinite loop.
I imagine I could use a listener in my step to handle the exceptions, but I don't like that solution because I would have to duplicate a lot of code, assuming this exception could be thrown in different steps/processors.
What am I doing wrong?
EDIT: After a lot of tests and trying different listeners such as SkipListener, I couldn't achieve what I wanted; Spring Batch is always rolling back my UPDATE.
Debugging this is what I found:
Once my listener is invoked and I update my item, the program enters the write method of the class FaultTolerantChunkProcessor (line #327).
That method runs the following code (copied from GitHub):
try {
doWrite(outputs.getItems());
} catch (Exception e) {
status = BatchMetrics.STATUS_FAILURE;
if (rollbackClassifier.classify(e)) {
throw e;
}
/*
* If the exception is marked as no-rollback, we need to
* override that, otherwise there's no way to write the
* rest of the chunk or to honour the skip listener
* contract.
*/
throw new ForceRollbackForWriteSkipException(
"Force rollback on skippable exception so that skipped item can be located.", e);
}
The doWrite method (line #151) in the class SimpleChunkProcessor tries to write the list of output items; in my case, however, the list is empty, so line #159 (method writeItems) throws an IndexOutOfBoundsException, causing the ForceRollbackForWriteSkipException and the rollback I'm suffering.
If I override the class FaultTolerantChunkProcessor and avoid writing the items when the list is empty, then everything works as intended: the update is committed and the program skips the error and calls the reader again.
I don't know if this is actually a bug or it's caused by something I'm doing wrong in my code.
A SkipListener is better suited to your use case than an ExceptionHandler, in my opinion, as it gives you access to the item that caused the exception. With the exception handler, you need to carry the item in the exception or the repeat context.
Moreover, the skip listener lets you know in which phase the exception happened (i.e. in read, process or write), while with the exception handler you need to find a way to detect that yourself. If the skipping code is the same for all phases, you can call the same method that updates the item's status from all the methods of the listener.
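As a sketch of that idea, the block below mimics the org.springframework.batch.core.SkipListener contract with a local interface, so it runs without Spring on the classpath; the callback names follow the real API, but everything else (the status map, the item type) is hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local stand-in for org.springframework.batch.core.SkipListener<T, S>;
// same callback names, trimmed to what this example needs.
interface ItemSkipListener<T> {
    void onSkipInRead(Throwable t);
    void onSkipInProcess(T item, Throwable t);
    void onSkipInWrite(T item, Throwable t);
}

class MyException extends RuntimeException {
    MyException(String msg) { super(msg); }
}

// Marks skipped items as FAILED so a state-filtering reader will not
// fetch them again; "statusStore" stands in for the repository UPDATE
// used in the real job.
class StatusUpdatingSkipListener implements ItemSkipListener<String> {
    final Map<String, String> statusStore = new ConcurrentHashMap<>();

    @Override public void onSkipInRead(Throwable t) {
        // nothing to update: no item was read
    }

    @Override public void onSkipInProcess(String item, Throwable t) {
        if (t instanceof MyException) {
            statusStore.put(item, "FAILED"); // the UPDATE in the real code
        }
    }

    @Override public void onSkipInWrite(String item, Throwable t) {
        statusStore.put(item, "FAILED");
    }
}
```

Since all three callbacks can delegate to the same status-updating method, the duplication the question worries about stays confined to one class.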
Using a JpaRepository, we are trying to persist department and student details if they do not already exist. This works fine in a single-threaded environment, but fails with multiple threads:
Caused by: java.sql.SQLIntegrityConstraintViolationException: Duplicate entry 'DEP12' for key 'departmentId'
Code Snippet :
@Transactional
public void persistDetails(String departmentName, String studentName)
{
    Department dep = departmentRepository.findByDepartmentName(departmentName);
    if (dep == null) {
        dep = createDepartmentObject(departmentName);
        departmentRepository.save(dep);
    }
    ...
}
How can we achieve this in a multi-threaded environment? The operation shouldn't fail; it should use the existing record and perform the other operations.
We also tried catching the exception and re-running the select query inside the catch block, but in that case the entity is fetched from the cache, not from the DB.
Catching the exception, code snippet:
@Transactional
public void persistDetails(String departmentName, String studentName)
{
    Department dep = departmentRepository.findByDepartmentName(departmentName);
    try {
        if (dep == null) {
            dep = createDepartmentObject(departmentName);
            departmentRepository.save(dep);
        }
    }
    catch (Exception e)
    {
        dep = departmentRepository.findByDepartmentName(departmentName);
    }
    ...
}
Implement departmentRepository.save in such a way that it uses saveOrUpdate (if you are using Hibernate directly) or merge (if you are using the JPA API).
You are also catching the exception in the wrong place. This kind of catch should be done outside the transaction; only then can you be sure you have consistent entities in the session.
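The shape of that fix can be sketched without Spring. The block below fakes the unique constraint with a map (the repository, exception, and method names are all hypothetical) and puts the catch around the whole "transaction", followed by a fresh lookup rather than a read from the stale session:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class DuplicateKeyException extends RuntimeException {}

// Fake repository: putIfAbsent plays the role of the DB unique
// constraint on departmentName.
class DepartmentRepo {
    private final Map<String, String> table = new ConcurrentHashMap<>();

    String findByName(String name) {
        return table.get(name);
    }

    void save(String name, String row) {
        if (table.putIfAbsent(name, row) != null) {
            throw new DuplicateKeyException(); // constraint violation
        }
    }
}

class DepartmentService {
    private final DepartmentRepo repo;

    DepartmentService(DepartmentRepo repo) {
        this.repo = repo;
    }

    // The whole "transaction" sits inside the try; the catch and the
    // re-read happen after it, as the answer recommends, so the second
    // lookup is not served from a stale persistence context.
    String getOrCreate(String name) {
        try {
            String dep = repo.findByName(name);
            if (dep == null) {
                repo.save(name, "row:" + name);
            }
        } catch (DuplicateKeyException e) {
            // another thread won the race; fall through and re-read
        }
        return repo.findByName(name);
    }
}
```

Whichever thread loses the insert race simply observes the row the winner created, so no caller ever fails on the duplicate key.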
I am using Spring Data JPA and inserting into two tables. When something goes wrong while inserting into the second table, the first insert is not rolled back; it is committed immediately:
@Override
@Transactional(propagation = Propagation.REQUIRED, rollbackFor = Exception.class)
public void addVehicleType(Map<String, Object> model) throws Exception {
    VehicleType vehicleType = null;
    VehicleStatus vehicleStatus = null;
    try {
        vehicleType = (VehicleType) model.get("vehicleType");
        vehicleStatus = (VehicleStatus) model.get("vehicleStatus");
        vehicleStatusRepository.save(vehicleStatus);
        vehicleTypeRepository.save(vehicleType);
    } catch (Exception e) {
        throw e;
    }
}

VehicleTypeRepository.java

public interface VehicleTypeRepository extends JpaRepository<VehicleType, Long> {
    @Override
    void delete(VehicleType role);

    long count();
}
If you use MySQL, your tables must use the InnoDB engine; MyISAM does not support transactions.
Second, the problem can occur if you are testing on a local PC: check whether my.ini has default_tmp_storage_engine=MYISAM uncommented, as below.
; The default storage engine that will be used when create new tables
; default-storage-engine=MYISAM
; New for MySQL 5.6 default_tmp_storage_engine if skip-innodb enable
default_tmp_storage_engine=MYISAM
By default, the only exceptions that mark a transaction for rollback are the unchecked exceptions (like RuntimeException). From the Spring documentation:
Please note that the Spring Framework's transaction infrastructure code will, by default, only mark a transaction for rollback in the case of runtime, unchecked exceptions; that is, when the thrown exception is an instance or subclass of RuntimeException. (Errors will also - by default - result in a rollback.) Checked exceptions that are thrown from a transactional method will not result in the transaction being rolled back.
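Spring encodes that rule in DefaultTransactionAttribute.rollbackOn (in org.springframework.transaction.interceptor). The one-liner below reproduces its default logic in plain Java; the method name mirrors the real API, while the class around it exists only for illustration:

```java
class RollbackRule {
    // Mirrors DefaultTransactionAttribute#rollbackOn: roll back on
    // unchecked exceptions and Errors, commit on checked exceptions.
    static boolean rollbackOn(Throwable ex) {
        return (ex instanceof RuntimeException || ex instanceof Error);
    }
}
```

This is why a checked exception such as SQLException escaping a @Transactional method would normally commit, and why declaring rollbackFor = Exception.class, as the question's code already does, is the right way to widen the rule.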
I have an SQLite database mapped in an Entity Framework context.
I write to this database from several threads (a bad idea, I know). However, I tried using an application-wide global lock like this:
partial class MyDataContext : ObjectContext
{
public new int SaveChanges()
{
lock (GlobalWriteLock.Lock)
{
try
{
int result = base.SaveChanges();
Log.InfoFormat("fff Save changes performed for {0} entries", result);
return result;
}
catch (UpdateException e)
{
throw e;
}
}
}
}
Still, I get the "database file is locked" exception from SQLite itself. How can this be possible?
The only explanation I can see is that base.SaveChanges returns before the database is unlocked and continues working asynchronously after returning.
Is this the case? If yes, how can I overcome this issue?
Note: my commits are usually updates of 1-100 entries and/or inserts of about 1-100 entries at a time.
I'm using the LINQ Entity Framework and I've come across a scenario where I need to access the newly inserted identity record before performing multiple operations via a stored procedure.
Here is the code snippet:
public void SaveQuote(Domain.Quote currentQuote)
{
try
{
int newQuoteId;
//Add quote and quoteline details to db
if (currentQuote != null)
{
using (QuoteContainer quoteContainer = new QuoteContainer())
{
quoteContainer.AddToQuote(currentQuote);
newQuoteId = currentQuote.QuoteId;
}
}
else return;
// Execution of some stored Procedure by using above newly generated QuoteId
}
catch (Exception ex)
{
throw ex;
}
}
In the next function, quoteContainer.SaveChanges(); is called to commit the DB changes.
Can anyone suggest whether the above approach is correct?
Correct so far.
Remember: you cannot get the IDENTITY until the insert has occurred! On an update, your entity already holds the IDENTITY (usually the PK).