PostgreSQL: update operation synchronization

For example, I have this query:
UPDATE foo_table SET viewed_count = coalesce(viewed_count, 0) + 1
Suppose 100 clients execute this query at the same time.
Is there any guarantee that viewed_count will be incremented by 100?

Yes.
After obtaining the lock on the row, a transaction in READ COMMITTED isolation level will re-read the current version of the row.
See the documentation:
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row.
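You can watch this happen in two psql sessions. A minimal sketch, using the statement from the question and assuming foo_table already contains a row:
-- Session 1:
BEGIN;
UPDATE foo_table SET viewed_count = coalesce(viewed_count, 0) + 1;  -- locks the updated row(s)
-- Session 2 (same statement; blocks on the row lock held by session 1):
UPDATE foo_table SET viewed_count = coalesce(viewed_count, 0) + 1;
-- Session 1:
COMMIT;
-- Session 2 now re-reads the freshly committed row version and applies its
-- increment on top of it, so the two updates together add 2, not 1.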

Related

Postgres Lock(For Update) row level locking returns value post lock

I am trying to implement a row-level lock using the query below:
BEGIN;
SELECT * FROM public.student WHERE student_code = 1 FOR UPDATE;
I ran the above query in pgAdmin.
Now, if I do a SELECT through the application, it returns a value. Ideally it should not.
What am I doing wrong here?
This is a feature, not a bug; see the documentation:
Read Committed Isolation Level
[...]
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row. In the case of SELECT FOR UPDATE and SELECT FOR SHARE, this means it is the updated version of the row that is locked and returned to the client.
After all, that is the meaning of “READ COMMITTED” – you see the latest committed version of the row.
If you want to avoid that, you can use the REPEATABLE READ isolation level. Then you get read stability, so you see the same state of the database for the whole transaction. In that case, you will receive a serialization error, since the row version that can be locked (the latest one) is not the one you see.
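A minimal sketch of that serialization error in two psql sessions, using the student table from the question (the name column is an assumption):
-- Session 1:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM public.student WHERE student_code = 1;   -- first statement takes the snapshot
-- Session 2 (autocommit):
UPDATE public.student SET name = 'changed' WHERE student_code = 1;
-- Session 1:
SELECT * FROM public.student WHERE student_code = 1 FOR UPDATE;
-- ERROR:  could not serialize access due to concurrent update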

Firebird - deadlock update conflicts with concurrent update

I'm maintaining an old piece of software (Firebird 2.5 and C#.NET). Recently we have been getting a lot of "deadlock update conflicts with concurrent update" errors. I checked the transaction settings; they don't set the wait timeout option:
public override IDbTransaction BeginTransaction(IDbConnection conn)
{
    if (conn.State != ConnectionState.Open)
        conn.Open();

    // Read committed with record versions; no wait timeout is configured
    FbTransactionOptions op = new FbTransactionOptions();
    op.TransactionBehavior = FbTransactionBehavior.ReadCommitted | FbTransactionBehavior.RecVersion;

    FbTransaction trans = ((FbConnection)conn).BeginTransaction(op);
    return trans;
}
So why do we get a timeout? Shouldn't it wait for one transaction to be committed before committing the next one?
A "deadlock update conflicts with concurrent update" happens when multiple transaction want to modify the same row. Only one updater can really change the row and commit. As long as the first transaction hasn't committed, the update in the second transaction will wait (indefinitely or until the configured timeout). As soon as the first transaction commits, the update in the second transaction will end with this error (if instead the first transaction had rolled back, the second would have continued).
If this started happening recently, you need to identify what changed. Did another tool also start writing to the database, did the number of users increase, did you upgrade something (eg Firebird, or the Firebird ado.net provider version, etc), did you make a change that resulted in long running transactions performing updates?
Your application code will need to be changed to automatically retry on this error. Also make sure that your transactions aren't too long time-wise (the longer transactions are, the greater the chance of these errors occurring). In addition, you could try changing the transaction behavior from FbTransactionBehavior.RecVersion to FbTransactionBehavior.NoRecVersion, but that can introduce waiting when reading records that are currently being updated by concurrent transactions, and it may actually increase the occurrence of update conflicts if the record was updated (and committed) by a transaction with a newer transaction id.
See also http://www.firebirdfaq.org/faq151/, "Transactions in Firebird: ACID, Isolation levels, Deadlocks, and Resolution of update conflicts", and "Transaction Statements".

Suspend transaction in Postgres

I have seen another database system that offers the ability to suspend a transaction. The current transaction is kept intact but put on hold while your code is allowed to work with the database to make immediate, permanent changes to rows. Then you can resume the transaction, continuing where you left off with the same locks and other transaction protections in place, as if you'd never interrupted it.
For example, say a customer is placing an order, in a transaction. During that transaction, the customer notices their phone number needs to be updated, so we change that data. Next, the customer decides to cancel the not-yet-completed order. A rollback of the order has the unintended consequence of also undoing the phone number change. So it would be nice if we could:
Suspend the transaction for the order.
Update the phone number, committed to the database.
Resume the transaction for the order.
Is there some way to suspend a transaction in Postgres? In JDBC?
If a transaction cannot continue, it must roll back.
If your transaction has a point at which you don't know how to carry on, then your transaction logic is flawed and you need to reorganize it: either split it into multiple transactions (or sub-transactions, a.k.a. savepoints), or take out the parts that do not belong in the transaction logic.
Is there some way to suspend a transaction in Postgres?
No, no such thing. And the data integrity principle is unconditional as to time.
No.
The closest things are
prepared transactions: this allows (with some conditions) for a transaction to be saved, and then later rolled back or committed.
savepoints: this allows for "nested transactions", where portions of a transaction can be rolled back.
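For example, a savepoint lets you undo part of a transaction without losing the rest (table names here are made up):
BEGIN;
INSERT INTO orders (customer_id, status) VALUES (1, 'draft');
SAVEPOINT phone_update;
UPDATE customers SET phone = '555-0100' WHERE customer_id = 1;
-- undo only the phone change; the order insert is still part of the transaction
ROLLBACK TO SAVEPOINT phone_update;
COMMIT;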
Neither of these fits exactly what you are looking for. It seems that your example has two operations that do not need to be part of the same transaction at all, since the phone number update appears to be unrelated to the success of the order. (Also, a long-running transaction is a bad idea... your order should probably be a state machine implemented without a long-running transaction.)
Workaround – open a second connection
In JDBC, you could just open a second connection to the database.
Do your separate work on that second connection and then close it. The first connection is still open and remains in the same state; any active transaction on that first connection remains in place.

What will happen if two processes modify data in two transactions at the same time and there is a unique constraint on the table?

I am thinking about a race condition in a production system I am working on. The database is PostgreSQL. The application is written in Java, but this is not relevant.
There is a table called "versions", which contains columns "entity_ID" and "version" (and some other fields). This table contains versions of a certain entity.
There is an application where user can modify those entities.
Every modification of an entity creates a new version in the table "versions" (using a trigger). This trigger finds the last version in the same table "versions" and inserts a new row with the same entity_ID, but version = (last version + 1).
There is a nightly job that runs in PostgreSQL at 4:00 every night that also changes those entities and therefore updates data in the table "versions". This procedure was designed to finish its work by the morning (before users start to use the application), but unfortunately it runs into the day. As this procedure is run in a function, it is one big transaction. Therefore the changes done by it are not visible to the application.
The nightly job uses the following workflow:
Set "failed_counter" = 0
Iterate over entities that need to be modified
Do modifications to the entity inside a BEGIN .. EXCEPTION .. END block
If there is an EXCEPTION, increase the "failed_counter". Log the exception and the failed entity to a log table.
If "failed_counter" > 10, cancel work.
End work
This has caused the following race condition to happen a few times (let's assume that X is the last version of entity A):
Nightly job starts
Nightly job modifies entity A, creating version X+1
The application is used to also modify entity A, also creating version X+1 (because the nightly job's transaction has not COMMITted, so its version X+1 is not visible to the application)
Nightly job ends, causing COMMIT
There are now two versions with version number X+1, which causes application to break.
I thought that I could solve the problem by adding a UNIQUE CONSTRAINT over the fields (entity_ID, version). I thought that it would cause the application to receive an error (due to violating the UNIQUE CONSTRAINT) at race condition step 3. But I am not sure how the unique constraint works in this situation. At race condition step 3, when the application adds a version, does the database check the UNIQUE CONSTRAINT? I suppose not, since the transaction of the nightly process has not yet completed. If I am correct and the UNIQUE CONSTRAINT is checked only at race condition step 4, when the COMMIT is made, then this causes the whole nightly procedure to fail, which is not the desired result.
So, the question is the following.
When is the UNIQUE CONSTRAINT checked: at race condition step 3 or step 4?
If the answer to the last question is "step 4", then how could I change the design of the system to avoid the above-mentioned problems?
By default, unique constraints in PostgreSQL are checked at the end of each statement. It's easy to test the behavior using psql.
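A sketch of that test, mirroring steps 3 and 4 above (the constraint name in the error message is whatever you named it):
-- Session 1 (the "nightly job"):
BEGIN;
INSERT INTO versions (entity_ID, version) VALUES (42, 11);   -- version X+1, not yet committed
-- Session 2 (the "application") runs the same insert at step 3:
INSERT INTO versions (entity_ID, version) VALUES (42, 11);
-- ...and blocks here, waiting to see whether session 1 commits or rolls back
-- Session 1 commits at step 4:
COMMIT;
-- Session 2 immediately fails:
-- ERROR:  duplicate key value violates unique constraint "versions_entity_id_version_key"
In other words, with the constraint in place it is the application's INSERT, not the nightly job, that reports the violation.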
Some big, red flags...
As this procedure is run in a function, then it is one big transaction.
It's not one big transaction because you're running a function. It's one big transaction because you haven't run the function several times over smaller subsets of the data. Whether you can run the function over subsets is application-dependent.
Iterate over entities that need to be modified
Rough rule of thumb for SQL databases: iteration is always a mistake.
SQL is a set-oriented language. Dealing with sets is usually faster than iteration, and often by several orders of magnitude.
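For instance, the per-entity loop could often be replaced by one set-based statement along these lines (every name other than "versions" is a guess):
-- Version every entity that needs modification in a single statement,
-- instead of looping row by row:
INSERT INTO versions (entity_ID, version)
SELECT e.entity_ID,
       coalesce(max(v.version), 0) + 1
FROM   entities e
LEFT   JOIN versions v ON v.entity_ID = e.entity_ID
WHERE  e.needs_update                      -- stand-in for "entities that need to be modified"
GROUP  BY e.entity_ID;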
If "failed_counter" > 10, cancel work.
This looks suspicious. Why are nine failures ok? Why are any failures ok?
I thought that I could just solve the problem by using an UNIQUE CONSTRAINT over fields (entity_ID, version).
That you don't already have a unique constraint on those two columns is a big, waving red flag. Fix this first.
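Adding it is a single statement (pick whatever constraint name you like):
ALTER TABLE versions
    ADD CONSTRAINT versions_entity_id_version_key UNIQUE (entity_ID, version);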
The fact that an application should apparently be waiting for a batch job to finish, but isn't waiting, might or might not be a system design issue. (It smells like a system design issue.)
There is a nightly job that is run in PostgreSQL every 4:00 ...
Did you think of starting at 3:00?
Test this, but not on your production server.
Drop the trigger.
Add a column of type timestamp with time zone.
Set that column's default value. Most applications will use current_timestamp, but you might want clock_timestamp() instead. See the documentation.
Add a unique constraint on {entity_id, new timestamp column}.
Eliminating the trigger might speed things up enough for you.
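A sketch of the DDL for the steps above (column and constraint names are only examples; drop the existing trigger by whatever name it has):
ALTER TABLE versions
    ADD COLUMN created_at timestamp with time zone NOT NULL DEFAULT clock_timestamp();
-- clock_timestamp() differs per row even inside one transaction;
-- current_timestamp is fixed for the whole transaction
ALTER TABLE versions
    ADD CONSTRAINT versions_entity_id_created_at_key UNIQUE (entity_ID, created_at);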

Transaction & Locks Problem

Within a DO TRANSACTION block, I defined a label, and inside that label I am accessing a table with EXCLUSIVE-LOCK. At the end of the label I have made all my changes to that table, but I am still within the transaction block.
Now I tried to access the same table in another session, and it shows an error: table used by another user. So is it possible to release the table within the transaction, so another user can access it?
For example:
Session 1)
DO TRANSACTION:
    ---
    ---
    loopb:
    REPEAT:
        --
        --
        ---------------------> control is here right now.
    END. /*repeat*/
    --
    --
END. /*do transaction*/
Session 2)
I tried to access the same table, but it shows an error that the table is locked by another user.
All those records you touched in the loop using EXCLUSIVE-LOCK will not be available to be locked by another user until the TRANSACTION is complete. There is no getting around this. If the second process needs to lock those records, then all you can do is decrease your TRANSACTION scope in the first process. This is a safety feature so that if an error happens later on in the TRANSACTION, all the changes made during the TRANSACTION will be rolled back. Another way to look at it is if you could release some record locks during a TRANSACTION, you would lose the atomicity (all-or-nothingness) that is part of the definition of a TRANSACTION.
It should be noted that if you don't really need to lock those records in the second process but just need to see their updated values, that is possible. Once the updated records are no longer in the record buffer (or the record lock status is downgraded to a NO-LOCK within the TRANSACTION), they will become limbo locks and you can view their updated values using a NO-LOCK. To make the last record in the loop become a limbo lock, you can either do this:
FIND CURRENT tablerecord NO-LOCK.
Or this, if you do not need to access the record buffer any longer:
RELEASE tablerecord.
Other sessions can do a "dirty read" of the record using NO-LOCK. But they will not be able to lock it or update it until the transaction is committed (or rolled back). And that won't happen until the repeat block iterates or you leave it.