When and why does ScalarDB throw UnknownTransactionStatusException?

According to the example at https://github.com/indetail-blockchain/getting-started-with-scalardb, transaction.commit can throw two exceptions:
• CommitException, which indicates that a commit has failed. In that case, it is recommended to roll back the transaction using transaction.abort().
• UnknownTransactionStatusException, which indicates that the transaction commit is in an unknown status. It may or may not have been committed.
Does abort guarantee rollback?
In what cases is UnknownTransactionStatusException thrown?
What is the remedy if UnknownTransactionStatusException is thrown? Shall I call abort? Would that guarantee rollback to previous consistent state?

Does abort guarantee rollback?
Actually, in the current implementation, which is based on Snapshot Isolation,
abort() is not really needed. As you can see in the code, abort() does nothing.
https://github.com/scalar-labs/scalardb/blob/master/src/main/java/com/scalar/db/transaction/consensuscommit/ConsensusCommit.java#L126
To abort, the only thing you need to do is throw away the transaction object before calling commit().
In that case nothing happens in the storage, so the transaction is effectively rolled back.
If you have already called commit(), whether it actually commits or aborts depends on the transaction's mutations and on storage availability.
If everything goes well, it is committed. If it runs into conflicts or failures, it is aborted.
In any case, it will eventually be either committed (rolled forward) or aborted (rolled back).
In what cases is UnknownTransactionStatusException thrown?
There are cases where it cannot be determined whether a transaction was committed or aborted, for example because of a catastrophic failure in the system. UnknownTransactionStatusException is thrown in such cases.
What is the remedy if UnknownTransactionStatusException is thrown? Shall I call abort? Would that guarantee rollback to previous consistent state?
When UnknownTransactionStatusException is thrown, there is nothing you can do except wait until the status is settled.
You can call TransactionService.getState() to check whether the transaction was eventually committed or aborted.
https://github.com/scalar-labs/scalardb/blob/master/src/main/java/com/scalar/db/service/TransactionService.java#L70
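The flow described in this answer could be sketched roughly as follows. This is not ScalarDB's API: `Tx`, `TxState`, `CommitOutcome`, and the `stateLookup` supplier are hypothetical stand-ins, and the two exception classes are local declarations that only reuse the names from this thread. In real code the supplier would wrap whatever state check your ScalarDB version provides, such as the getState() call linked above.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins, NOT ScalarDB classes; they only mirror names used in this thread.
enum TxState { COMMITTED, ABORTED, UNKNOWN }

class CommitException extends Exception {}

class UnknownTransactionStatusException extends Exception {}

interface Tx {
  void commit() throws CommitException, UnknownTransactionStatusException;
}

final class CommitOutcome {
  /**
   * Commits and reports the outcome. If the status is unknown, polls stateLookup
   * (an assumed wrapper around a state check) up to maxPolls times, sleeping
   * pauseMillis after each unsettled answer.
   */
  static TxState commitAndResolve(Tx tx, Supplier<TxState> stateLookup,
                                  int maxPolls, long pauseMillis) {
    try {
      tx.commit();
      return TxState.COMMITTED;
    } catch (CommitException e) {
      return TxState.ABORTED; // the commit failed; treated here as aborted
    } catch (UnknownTransactionStatusException e) {
      for (int i = 0; i < maxPolls; i++) {
        TxState state = stateLookup.get();
        if (state != TxState.UNKNOWN) {
          return state; // the status has settled one way or the other
        }
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException interrupted) {
          Thread.currentThread().interrupt(); // preserve the interrupt for the caller
          return TxState.UNKNOWN;
        }
      }
      return TxState.UNKNOWN; // still unsettled; do not assume either outcome
    }
  }
}
```

Note that the sketch never calls abort() after commit() has been invoked, in line with the answer above: at that point the only option is to wait for the status to settle.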

Related

Is it possible that both transactions roll back during a deadlock or serialization error?

In PostgreSQL (and other MVCC databases), transactions can roll back due to a deadlock or serialization error. Assuming two transactions are currently running, is it ever possible that both transactions, instead of just one, will fail due to this kind of error?
The reason why I am asking is that I am writing a retry implementation. If both transactions can fail, we might end up in a never-ending loop of retries if both retry immediately. If only one transaction can fail, I don't see any harm in retrying as soon as possible.
Yes. A deadlock can involve more than two transactions. In this case more than one may be terminated. But this is an extremely rare condition.
Normally, if just two transactions deadlock, one survives. The manual:
PostgreSQL automatically detects deadlock situations and resolves them by aborting one of the transactions involved, allowing the other(s) to complete.
Serialization failures only happen in REPEATABLE READ or SERIALIZABLE transaction isolation. I wouldn't know of any particular limit to how many serialization failures can happen concurrently. But I also never heard of any necessity to delay retrying.
I would retry as soon as possible either way.
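A minimal sketch of such an immediate retry, using only JDK classes. `TxRetry`, `runWithRetry`, and the `maxAttempts` cap are names and choices made up for this illustration. The two SQLSTATE values are PostgreSQL's codes for serialization_failure (40001) and deadlock_detected (40P01).

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

final class TxRetry {
  /** True for PostgreSQL serialization_failure (40001) and deadlock_detected (40P01). */
  static boolean isRetryable(SQLException e) {
    String state = e.getSQLState();
    return "40001".equals(state) || "40P01".equals(state);
  }

  /**
   * Runs work and retries it immediately on a retryable failure, up to maxAttempts
   * attempts in total. The work is assumed to begin, execute, and commit one whole
   * transaction per call, so every retry starts from a fresh transaction.
   */
  static <T> T runWithRetry(Callable<T> work, int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    for (int attempt = 1; ; attempt++) {
      try {
        return work.call();
      } catch (SQLException e) {
        if (!isRetryable(e) || attempt >= maxAttempts) {
          throw e; // not a concurrency failure, or out of attempts
        }
        // otherwise fall through and retry without any delay
      }
    }
  }
}
```

The attempt cap is not something the answer asks for. It is only a cheap guard against the never-ending loop the question worries about.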

Bitronix transaction appears to be committing prematurely

We have a spring-batch process that uses the bitronix transaction manager. On the first pass of a particular step, we see the expected commit behavior - data is only committed to the target database when the transaction boundary is reached.
However, on the second and subsequent passes, rows are committed as soon as they are written. That is, they do not wait for the commit point.
We have confirmed that the bitronix commit is only called at the expected points.
Has anyone experienced this behavior before? What kind of bug am I looking for?
Java XA is designed in such a way that connections cannot be reused across transactions. Once the transaction is committed, the connection property is changed to autocommit=true, and the connection cannot be used in another transaction until it is returned to the connection pool and retrieved by the XA code again.

Can committing a transaction in PostgreSQL fail?

If I execute some SQL inside a transaction successfully, can it happen that the commit will fail? And what are possible causes? Can it fail related to the executed queries, or just due to some DB side issues?
The question comes up because I need to judge if it makes sense to commit transactions inside tests or if it is "safe enough" to just rollback after each test case.
If I execute some SQL inside a transaction successfully, can it happen that the commit will fail?
Yes.
And what are possible causes?
• DEFERRABLE constraints with SET CONSTRAINTS DEFERRED or in a one-statement autocommit transaction. (Can't happen unless you use DEFERRABLE constraints)
• SERIALIZABLE transaction with serialization failure detected at commit time. (Can't happen unless you use SERIALIZABLE transactions)
• Asynchronous commit where the DB crashes or is shut down. (Can't happen if synchronous_commit = on, the default)
• Disk I/O error, filesystem error, etc
• Out-of-memory error
• Network error leading to session disconnect after you send the commit but before you get confirmation of success. In this case you don't know for sure if it committed or not.
• ... probably more
Can it fail related to the executed queries, or just due to some DB side issues?
Either. A serialization failure, for example, is definitely related to the queries run.
If you're using READ COMMITTED isolation with no deferred constraints then commits are only likely to fail due to underlying system errors.
The question comes up because I need to judge if it makes sense to commit transactions inside tests or if it is "safe enough" to just rollback after each test case.
Any sensible test suite has to cover multiple concurrent transactions interacting, committing in different orders, etc.
If all you test is single standalone transactions you're not testing the real system.
So the question is IMO moot, because a decent suite of tests has to commit anyway.

Does MongoDB fail silently if I don't check error codes?

I'm wondering if any persistence failure will go undetected if I don't check error codes? If so, what's the right way to write fast (asynchronously) while still detecting errors?
If you don't check for errors, your update is only fireAndForget. You'll indeed miss all errors which could arise. Please see MongoDB WriteConcerns for the available write modes in MongoDB (sorry I always fail to find the official, non driver related documentation, I really should bookmark it).
So with NORMAL you'll get at least connectivity errors, with NONE no exceptions at all. If you want to be informed of exceptions you have to use one of the other modes, which differ only in the persistence guarantee they give you.
You can't detect errors when running asynchronously, as this defeats the purpose. The connection that sent the write operation may already be closed or reused, so the error can't be sent back through that connection. Furthermore, only your own code knows what to do if the write fails. As MongoDB doesn't offer a remote procedure call to inform you of updates asynchronously, you'll have to wait until the write has reached a given stage.
So the fastest but most unreliable mode is SAFE, where the write has only happened in memory. JOURNAL gives you the assurance that it was at least written to disk. With FSYNC those changes are persisted in your database on disk. REPLICA means that at least two replicas have written it, and MAJORITY that more than half of your replicas have written it (with three replicas, which should be the default, these two do not differ).
The only way I see to get something like asynchronous behavior is a separate thread that performs all write operations synchronously. You would hand this thread the actual update together with a class that is called in case of a failure and performs whatever is needed to handle it. But I don't think that this is good application design.
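For illustration only, the separate-thread idea could look roughly like this with plain JDK classes. Nothing here is a MongoDB driver API: `BackgroundWriter` and `onFailure` are made-up names, and the `Runnable` stands for whatever blocking, acknowledged write call your driver provides.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

final class BackgroundWriter implements AutoCloseable {
  // One thread performs every write synchronously, in submission order.
  private final ExecutorService writerThread = Executors.newSingleThreadExecutor();
  private final Consumer<Throwable> onFailure;

  BackgroundWriter(Consumer<Throwable> onFailure) {
    this.onFailure = onFailure;
  }

  /** Queues a blocking write and returns at once; a failure is passed to onFailure. */
  CompletableFuture<Void> submit(Runnable synchronousWrite) {
    return CompletableFuture.runAsync(synchronousWrite, writerThread)
        .whenComplete((ignored, error) -> {
          if (error == null) {
            return;
          }
          // Async stages may wrap the original exception in a CompletionException.
          Throwable cause = (error instanceof CompletionException && error.getCause() != null)
              ? error.getCause()
              : error;
          onFailure.accept(cause);
        });
  }

  @Override
  public void close() {
    writerThread.shutdown(); // already queued writes still run
  }
}
```

The caller never blocks, but it also only learns about a failure after it has moved on, which is the design concern raised above.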
Yes, depending on the error, it can fail silently if you don't check the returned error code. It's necessary to wait for error checking. Your only other option would be for your app to occasionally tell the user "oops, remember when I acted like I saved your data a moment ago? Well, not really."

Unexpected rollbackCount and calls of shouldSkip() during item write

Spring documentation (Pg. 46, Section: 5.1.7) says:
By default, regardless of retry or skip, any exceptions thrown from the ItemWriter will cause the transaction controlled by the Step to rollback. If skip is configured as described above, exceptions thrown from the ItemReader will not cause a rollback.
My commit interval is set to 10. So my understanding of the above paragraph is that if there is an error in reading the 7th record of a chunk of 10, that item will be skipped and the 9 correct records will be passed on by the ItemReader.
However, if the 7th record fails during writing, none of the 10 records will be written and a rollback will happen.
Yet when I include the thrown error in my skipPolicy, the ItemWriter IS writing the remaining 9 records to the database and skipping the failed one. This contradicts what is stated above.
Can any one please explain the concept of "skip during item writing".
Also even though single error is thrown I am getting the following:
SkipCount as -1 twice, then as 0 once, and again -1 once in my shouldSkip(Object, Throwable) method. -- I am not getting this behavior.
Also rollback count is 2 -- what does it mean ? why it is 2 ?
#michael Would it be possible for you to explain the behavior using some scenario!!
like "i am reading 20 records from a file and writing to a database after some processing. I have a skip policy set for some exception. what will happen if the exception occurrs during - read, process, write -- how the chunks will be committed, how default retry works, how the counts will be updated, etc. etc...."
It will really be a big help for me, as I am still confused with the behavior..
From your use case description, it seems that you are mixing different concepts.
You describe a skip scenario, but you seem to expect skip to work like a no-rollback scenario.
From the Spring Batch documentation:
skip:
errors encountered while processing should not result in Step failure,
but should be skipped instead
vs no-rollback:
If skip is configured as described above, exceptions thrown from the
ItemReader will not cause a rollback.
In my own words, skip means:
If the step encounters an error during read/process/write, the current chunk is rolled back and each item of the chunk is read/processed/written individually, without the bad item. Basically, Spring Batch falls back to commit-rate 1 for the bad chunk and goes back to the specified commit-rate after the bad chunk.
Also rollback count is 2 -- what does it mean ? why it is 2 ?
from B.5. BATCH_STEP_EXECUTION
ROLLBACK_COUNT: The number of rollbacks during this execution. Note
that this count includes each time rollback occurs,
including rollbacks for retry and those in the skip recovery procedure.
(emphasis mine)
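To make the counting concrete, here is a small stand-alone model of the behavior described above for a failure during write. It is a simulation written for this explanation, not Spring Batch code. `SkipSimulation` and its fields are made-up names, and the real framework does considerably more (retry, listeners, re-processing).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class SkipSimulation {
  final List<String> written = new ArrayList<>();
  final List<String> skipped = new ArrayList<>();
  int rollbackCount;
  int commitCount;

  /** Writes one chunk; isBad marks the items whose write would throw. */
  void writeChunk(List<String> chunk, Predicate<String> isBad) {
    if (chunk.stream().noneMatch(isBad)) {
      written.addAll(chunk); // normal path: one transaction for the whole chunk
      commitCount++;
      return;
    }
    rollbackCount++; // the chunk write failed, so the whole chunk is rolled back
    for (String item : chunk) { // recovery: each item gets its own transaction
      if (isBad.test(item)) {
        rollbackCount++; // the bad item fails again, this time on its own
        skipped.add(item);
      } else {
        written.add(item);
        commitCount++;
      }
    }
  }
}
```

With one bad item in a chunk of ten, this model ends with nine written items, one skipped item, and a rollback count of 2: one rollback for the failed chunk and one for the bad item during the item-by-item pass. That matches the numbers in the question.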
Also even though single error is thrown I am getting the following:
SkipCount as -1 twice, then as 0 once, and again -1 once in my
shouldSkip(Object, Throwable) method. -- I am not getting this
behavior.
I tried a simple skip job with both configuration styles, skip-policy and skip-limit with skippable-exception. Both behaved identically with respect to rollback and skip counts
(the step metadata is all right, but shouldSkip(...) seems to be called a lot more often than expected).
I'd like to explain one issue you mentioned:
SkipCount as -1 twice, then as 0 once, and again -1 once in my shouldSkip(Object, Throwable) method. -- I am not getting this behavior.
I don't know which signature of the shouldSkip() method you are referring to, but in my SkipPolicy interface there is only one method, with the following signature:
boolean shouldSkip(Throwable t, int skipCount) throws SkipLimitExceededException;
This method decides whether the exception t should be skipped, given the current skipCount.
Unfortunately, the programmers of Spring Batch misuse this method to test whether an exception is skippable in general, regardless of the current skip count. That's why there are several calls to this method with the skipCount parameter set to -1.
So don't be surprised by the behaviour you saw.
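If you write your own policy, it helps to treat a negative skipCount as such a probe. A minimal sketch, assuming the signature quoted above: the `SkipPolicy` interface and `SkipLimitExceededException` class below are local stand-ins so that the example compiles without Spring Batch on the classpath (the real exception has a different constructor), and `LimitedSkipPolicy` is a made-up name.

```java
// Local stand-ins mirroring the signature quoted above; NOT the Spring Batch classes.
class SkipLimitExceededException extends RuntimeException {
  SkipLimitExceededException(String message) {
    super(message);
  }
}

interface SkipPolicy {
  boolean shouldSkip(Throwable t, int skipCount) throws SkipLimitExceededException;
}

final class LimitedSkipPolicy implements SkipPolicy {
  private final Class<? extends Throwable> skippable;
  private final int skipLimit;

  LimitedSkipPolicy(Class<? extends Throwable> skippable, int skipLimit) {
    this.skippable = skippable;
    this.skipLimit = skipLimit;
  }

  @Override
  public boolean shouldSkip(Throwable t, int skipCount) {
    if (!skippable.isInstance(t)) {
      return false; // never skip exception types we were not configured for
    }
    if (skipCount < 0) {
      return true; // probe call: "is this exception skippable at all?"
    }
    if (skipCount >= skipLimit) {
      throw new SkipLimitExceededException("skip limit of " + skipLimit + " exceeded");
    }
    return true;
  }
}
```

Counting or logging inside shouldSkip() only when skipCount is zero or greater keeps the probe calls from distorting your own bookkeeping.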