Subtransactions in functions that can commit - PostgreSQL

I read that I can use a BEGIN-EXCEPTION block to have a subtransaction in a FUNCTION that can be rolled back. But why is it not possible to commit this subtransaction?
How can I circumvent the "all-or-nothing" transaction behavior of functions written in PL/pgSQL? Is it possible to have the function make commits using subtransactions while the outer transaction could be rolled back?

You can't circumvent it at the time of writing (PG 9.3).
Or more precisely, not directly. You can mimic autonomous subtransactions by using dblink, but be warned that doing so opens a can of worms: what is supposed to happen, for instance, if your outer transaction is rolled back?
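A minimal sketch of the dblink workaround, assuming the dblink extension is available; the audit_log table and the loopback connection string are hypothetical:

CREATE EXTENSION IF NOT EXISTS dblink;

CREATE FUNCTION log_autonomously(msg text) RETURNS void AS $$
BEGIN
    -- dblink opens a second session with its own transaction, so this
    -- INSERT commits even if the caller's transaction later rolls back.
    PERFORM dblink_exec('dbname=' || current_database(),
                        format('INSERT INTO audit_log (message) VALUES (%L)', msg));
END;
$$ LANGUAGE plpgsql;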
For background and references to discussions related to the topic on the PG-Hackers list, see:
http://wiki.postgresql.org/wiki/Autonomous_subtransactions
http://www.postgresql.org/message-id/20111218082812.GA14355#leggeri.gi.lan

Related

Is there a way to SELECT and read from the table such that no COMMIT or ROLLBACK is necessary?

In fact, the PostgreSQL documentation states that all interactions with the database are performed within a transaction.
PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not issue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT wrapped around it. A group of statements surrounded by BEGIN and COMMIT is sometimes called a transaction block.
Given this, even something like SELECT name FROM users will initiate a COMMIT. Is there a way to avoid this? In other words, is there a means of looking at the data in a table without issuing a COMMIT or a ROLLBACK? For statements whose sole purpose is to fetch some data, it seems like superfluous overhead.
I read this and recognize that having SELECT statements be within a transaction is important; it allows one to take a snapshot of the data and thus remain consistent about what rows are where and what data they contain. But then, is there a way to end a transaction without the overhead of COMMIT or ROLLBACK (in the case where neither is actually necessary)?
I recognize that having SELECT statements be within a transaction is important; it allows one to take a snapshot of the data and thus remain consistent
Good.
But then, is there a way to end a transaction without the overhead of COMMIT or ROLLBACK?
Committing a transaction that only read data has no real overhead. All that needs to happen is that the transaction handle and the resources allocated for it are dropped.
The "implicit COMMIT" just means that the transaction is closed/exited/completed, with or without actually writing anything. You cannot have a transaction without ending it.

Does a SQL statement ensure atomicity in Postgres?

I have a simple bug in my program that supports multiple users. I'm using knex to build SQL queries, and I have pseudocode that depicts the scenario:
const value = queryBuilder().readDataFromTheDatabase(); // executes this
// do some other work and derive the updated value
queryBuilder().writeValueToTheDatabase(updateValue(value));
This piece of code is used in a sort of middleware function. As you can see, there is a possible race condition: when multiple users hit it at roughly the same time, one of them can end up working with a stale value.
My solution
So, I was thinking a possible solution would be to create a single queryBuilder statement:
queryBuilder().readAndUpdateValueInTheDatabase();
So I'll probably have to use a little bit of PL/pgSQL. I was wondering if this solution will be sufficient. Will the statement be executed atomically? That is, when one request has read but not yet finished its write, does another request wait to both read and write, or does it wait only to write and read the stale value in the meantime?
I think what you are looking for here is isolation, not atomicity. You could set all transactions to the highest isolation level, serializable (which is higher than the usual default level). With that level, if data that a transaction read (and presumably relied upon) is changed, then when it tries to commit it might get a serialization failure error. I say "might", because the system could conclude the situation would be consistent with the data change having happened after the commit, in which case the commit is allowed to stand.
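For illustration, the isolation level can be raised per transaction or made the default; both statements below are standard PostgreSQL:

BEGIN ISOLATION LEVEL SERIALIZABLE;
-- ... dependent reads and writes ...
COMMIT;  -- may fail with a serialization failure (SQLSTATE 40001)

SET default_transaction_isolation = 'serializable';  -- session-wide default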
To avoid a race condition with such a setup, you must run both the read and the write in the same database transaction.
There are two ways to do that:
Use the default READ COMMITTED isolation level and lock the rows when you read them:
SELECT ... FROM ... FOR NO KEY UPDATE;
That locks the rows against concurrent modifications, and the lock is held until the end of the transaction.
Use the REPEATABLE READ isolation level and don't lock anything. Then your UPDATE will receive a serialization error (SQLSTATE 40001) if somebody modified the row concurrently. In that case, you roll the transaction back and try again in a new REPEATABLE READ transaction.
The first solution is usually better if you expect conflicts frequently, while the second is better if conflicts are rare.
Note that you should keep the database transaction as short as possible in both cases to keep the risk of conflicts low.
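A minimal sketch of the first approach, using a hypothetical table counters(id, value) in place of the asker's schema:

BEGIN;
SELECT value FROM counters WHERE id = 1 FOR NO KEY UPDATE;  -- blocks concurrent writers to this row
-- ... the application computes the new value from what it just read ...
UPDATE counters SET value = 42 WHERE id = 1;  -- 42 stands in for the computed value
COMMIT;  -- releases the row lock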
Transactions in PostgreSQL use an optimistic locking model when accessing tables, while some other DBMSs use pessimistic locking (IBM Db2) or both models (MS SQL Server).
Optimistic locking takes a snapshot of the data you are working on, and modifications are made against that snapshot until the transaction ends. When the transaction finishes, the snapshot's modifications are applied to the real database (the table rows), but if some other user made a change between the moment of the snapshot capture and the commit, the commit cannot be applied and is rejected as a ROLLBACK.
You can try raising the isolation level (REPEATABLE READ or SERIALIZABLE) to avoid the trouble.
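A sketch of the retry pattern at the raised isolation level, reusing the hypothetical counters table from above:

BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT value FROM counters WHERE id = 1;
UPDATE counters SET value = 42 WHERE id = 1;  -- fails with SQLSTATE 40001 if the row changed concurrently
COMMIT;
-- On a 40001 error, the application rolls back and reruns the whole transaction.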

Why is it not possible to use COMMIT and ROLLBACK in a PostgreSQL procedure?

I am creating procedures on a PostgreSQL database. I read that it is not possible to use ROLLBACK inside these procedures.
Why?
Is it possible to use commit?
I guess this is related to the ACID properties, but what if we have two INSERT operations in a procedure and the second one fails? Do the inserts get rolled back?
Thank you.
The Postgres overview gives a hint by explaining how its functions are different from traditional stored procedures:
Functions created with PL/pgSQL can be used anywhere that built-in functions could be used. For example, it is possible to create complex conditional computation functions and later use them to define operators or use them in index expressions.
This makes it awkward to support transactions within functions for every possible situation. From the docs:
Functions and trigger procedures are always executed within a transaction established by an outer query... However, a block containing an EXCEPTION clause effectively forms a subtransaction that can be rolled back without affecting the outer transaction.
If you were to have two INSERTs in a function, the first can be wrapped in an EXCEPTION block to catch any errors and decide if the second should be executed.
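A minimal sketch of that pattern, with two hypothetical tables a and b:

CREATE FUNCTION insert_both() RETURNS void AS $$
BEGIN
    BEGIN  -- inner block: forms a subtransaction
        INSERT INTO a (val) VALUES (1);
    EXCEPTION WHEN OTHERS THEN
        RAISE NOTICE 'first INSERT failed: %', SQLERRM;
        -- the inner block's changes are rolled back; execution continues here
    END;
    INSERT INTO b (val) VALUES (2);  -- runs whether or not the first INSERT succeeded
END;
$$ LANGUAGE plpgsql;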
You are correct. You cannot roll back transactions that were started prior to the procedure from within the procedure. In addition, a transaction cannot be created within the procedure either. You would receive this error:
ERROR: cannot begin/end transactions in PL/pgSQL
SQL state: 0A000
Hint: Use a BEGIN block with an EXCEPTION clause instead.
As this error states, and as Matt mentioned, you could use an exception block to essentially perform a rollback. From the help:
When an error is caught by an EXCEPTION clause, the local variables of the PL/pgSQL function remain as they were when the error occurred, but all changes to persistent database state within the block are rolled back.
Alternatively, you could call the procedure from within a transaction, and roll that back if necessary.
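Sketched with a hypothetical function name:

BEGIN;
SELECT my_procedure();  -- a PL/pgSQL function standing in for the "procedure"
-- inspect the results; if they are not what you want:
ROLLBACK;  -- undoes everything the function did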

Modify Trigger in PostgreSQL

I need to modify a trigger (which uses a particular FUNCTION) that is already defined and in use. If I modify it using CREATE OR REPLACE FUNCTION, what is the behaviour of Postgres? Will it "pause" the old trigger while it is updating the function? As far as I know, Postgres should execute the whole REPLACE FUNCTION in one transaction (so the tables are locked while the trigger's function is being modified, and the next transactions, once unblocked, will use the new FUNCTION, not the old one). Is that correct?
Yes. According to the documentation:
http://www.postgresql.org/docs/9.0/static/explicit-locking.html
Also, most PostgreSQL commands automatically acquire locks of appropriate modes to ensure that referenced tables are not dropped or modified in incompatible ways while the command executes. (For example, ALTER TABLE cannot safely be executed concurrently with other operations on the same table, so it obtains an exclusive lock on the table to enforce that.)
will it "pause" the old trigger while it is updating the function?
It should continue executing the old trigger functions when calls are in progress (depending on the isolation level, subsequent calls in the same transaction should use the old definition too; I'm not 100% sure the default level would do so, however), block new transactions that try to call the function while it's being updated, and execute the new function once it's replaced.
As far as I know, Postgres should execute the whole REPLACE FUNCTION in one transaction (so the tables are locked while the trigger's function is being modified, and the next transactions, once unblocked, will use the new FUNCTION, not the old one). Is that correct?
Best I'm aware, the function associated with the trigger doesn't lock the table when it's updated.
Please take this with a grain of salt, though: the two statements above amount to what I'd intuitively expect MVCC to do, rather than knowledge of this area of Postgres' source code. (A few core contributors periodically come to SO, and might eventually chime in with a more precise answer.)
That being said, note that this is relatively straightforward to test: open two psql sessions, start two transactions, and see what happens, as in the sketch below...
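A sketch of such a test, with hypothetical names t and trg_fn:

-- Session 1:
BEGIN;
INSERT INTO t (val) VALUES (1);  -- fires the current trigger function
-- leave the transaction open

-- Session 2:
CREATE OR REPLACE FUNCTION trg_fn() RETURNS trigger AS $$
BEGIN
    RAISE NOTICE 'new version';
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;  -- does this block until session 1 finishes?

-- Session 1 again:
INSERT INTO t (val) VALUES (2);  -- which version fires?
COMMIT;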

How to make a SELECT wait until a pending INSERT commits?

I'm using PostgreSQL 9.2 in a Windows environment.
I'm in a 2PC (2 phase commit) environment using MSDTC.
I have a client application that starts a transaction at the SERIALIZABLE isolation level, inserts a new row of data into a table for a specific foreign key value (there is an index on the column), and votes for completion of the transaction (the transaction is PREPARED). The transaction will be COMMITTED by the transaction coordinator.
Immediately after that, outside of a transaction, the same client requests all the rows for this same specific foreign key value.
Because there may be a delay before the previous transaction is really committed, the SELECT may return a previous snapshot of the data. In fact, this does happen sometimes, and it is problematic. Of course the application could be redesigned, but until then I'm looking for a locking solution. An advisory lock?
I already solved the problem for the UPDATE of specific rows by using SELECT...FOR SHARE, and it works well: the SELECT waits until the transaction commits and then returns the old and new rows.
Now I'm trying to solve it for INSERT.
SELECT...FOR SHARE does not block and returns immediately.
There is no concurrency issue here as only one client deals with a specific set of rows. I already know about MVCC.
Any help appreciated.
To wait for a not-yet-committed INSERT you'd need to take a predicate lock. There's limited predicate locking in PostgreSQL for the serializable support, but it's not exposed directly to the user.
Simple SERIALIZABLE isolation won't help you here, because SERIALIZABLE only requires that there be an order in which the transactions could've occurred to produce a consistent result. In your case this ordering is SELECT followed by INSERT.
The only option I can think of is to take an ACCESS EXCLUSIVE lock on the table before INSERTing. This will only get released at COMMIT PREPARED or ROLLBACK PREPARED time, and in the meantime any other queries will wait for the lock. You can enforce this via a BEFORE trigger to avoid the need to change the app. You'll probably get the odd deadlock and rollback if you do it that way, though, because the INSERT takes a lower-level lock first and the trigger then attempts lock promotion. If possible it's better to run the LOCK TABLE ... IN ACCESS EXCLUSIVE MODE command before the INSERT.
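A sketch of that pattern, with a hypothetical table and transaction identifier (this assumes max_prepared_transactions is set, as it must be in any 2PC setup):

BEGIN;
LOCK TABLE my_table IN ACCESS EXCLUSIVE MODE;  -- held until COMMIT PREPARED / ROLLBACK PREPARED
INSERT INTO my_table (fk_col, data) VALUES (42, 'some data');
PREPARE TRANSACTION 'tx1';
-- The transaction coordinator later issues: COMMIT PREPARED 'tx1';
-- until then, any SELECT on my_table waits on the lock.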
As you've alluded to, this is mostly an application mis-design problem. Expecting to see not-yet-committed rows doesn't really make any sense.