PostgreSQL BEFORE INSERT trigger locking behavior in a concurrent environment

I have a general function that can manipulate the sequence of any table (why is irrelevant to my question). It reads the current value, works out the new value, sets it, and returns its calculation, which is what's inserted. This is obviously a multi-step process.
I call it from a BEFORE INSERT trigger on tables where I need it.
All I need to know is am I guaranteed that the function will be called by only one caller at a time in a multi-user environment?
Specifically, does the BEFORE INSERT trigger have to complete before it is called again by another caller?
Logically, I would assume yes, but one never knows what may be going on under the hood.
If the answer is no, what minimal locking would I need on the function to guarantee I can read and write the sequence in a "thread-safe" manner?
I'm using PG 10.
EDIT
Here is the function updated with a lock:
CREATE OR REPLACE FUNCTION public.uts_set()
RETURNS TRIGGER AS
$$
DECLARE
    sv  int8;
    seq text := format('%I.%I_uts_seq', tg_table_schema, tg_table_name);
BEGIN
    EXECUTE format('LOCK TABLE %I IN ROW EXCLUSIVE MODE;', tg_table_name);
    EXECUTE 'SELECT last_value + 1 FROM ' || seq INTO sv;  -- currval(seq) isn't usable
    PERFORM setval(seq, GREATEST(sv, (EXTRACT(epoch FROM localtimestamp) * 1000000)::int8), false);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
However, an INSERT already acquires ROW EXCLUSIVE (and ROW EXCLUSIVE does not conflict with itself), so this statement may be redundant and a stronger lock may be needed. Or, conversely, it may mean no lock is needed.
UPDATE
If I am reading this SO question correctly, my original version without the LOCK should work since the trigger acquires the same lock my updated function is redundantly taking.

All I need to know is am I guaranteed that the function will be called by only one caller at a time in a multi-user environment?
No. There is no such guarantee for function calls as such, but you can achieve this behaviour with the SERIALIZABLE transaction isolation level:
This level emulates serial transaction execution for all committed transactions; as if transactions had been executed one after another, serially, rather than concurrently
But this approach would introduce several trade-offs, such as preparing your application to retry transactions that fail with a serialization error.
Maybe I missed something, but I really believe you just need NEXTVAL, something like below:
CREATE OR REPLACE FUNCTION public.uts_set()
RETURNS TRIGGER AS
$$
DECLARE
    sv int8;
    -- First, append the suffix before formatting, so the %I wildcard
    -- quotes the whole sequence name as a single identifier
    seq text := format('%I.%I', tg_table_schema, tg_table_name || '_uts_seq');
BEGIN
    -- Second, you can't call CURRVAL in a session
    -- that hasn't issued NEXTVAL before
    sv := NEXTVAL(seq);
    -- Do your logic here...
    -- Result is ignored since this is a STATEMENT trigger
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
Remember that CURRVAL acts on session-local scope and NEXTVAL on global scope, so you have a reliable thread-safe mechanism in hand.
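A quick illustration of that session-local behaviour (demo_seq is a hypothetical sequence created just for this example):
-- Session A
CREATE SEQUENCE demo_seq;
SELECT nextval('demo_seq');  -- returns 1 and advances the sequence globally
SELECT currval('demo_seq');  -- returns 1: the last value obtained in THIS session

-- Session B, running concurrently
SELECT nextval('demo_seq');  -- returns 2: nextval never hands out duplicates
-- Calling currval('demo_seq') here before any nextval in this session raises:
-- ERROR: currval of sequence "demo_seq" is not yet defined in this session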

The sequence itself handles thread safety with concurrent sessions. So it really comes down to the code that is interacting with the sequence. The following code is thread-safe:
SELECT nextval('myseq');
If you are doing much fancier things with the sequence, like setval and currval, I would be more worried about that being done in a high-transaction/multi-user environment. Even so, the sequence itself is locked against other queries while it is being manipulated.

Related

Can postgres insert triggers and/or checks be run without inserting

I would love to be able to validate objects representing table rows using the database's existing constraints (triggers that raise exceptions and checks) without actually inserting them into the database.
Is there currently a way one could do this in Postgres? At least with BEFORE INSERT triggers and CHECK constraints; I assume it makes no sense with AFTER INSERT triggers.
The easiest way I can think of right now would be to:
Lock the table
Insert a new row
If an exception is raised, report it to the API / else DELETE the row and call it valid
Unlock
But I can see several issues with this.
A simpler way is to insert within a transaction and not commit:
BEGIN;
INSERT INTO tbl(...) VALUES (...);
-- see effects ...
ROLLBACK;
No need for additional locking. The row is never visible to any other transaction with the default transaction isolation level READ COMMITTED. (You might be stalling concurrent writes that conflict with the tested row.)
Notable side-effect: Sequences of serial or IDENTITY columns are advanced even if the INSERT is never committed. But gaps in sequential numbers are to be expected anyway and nothing to worry about.
Be wary of triggers with side-effects. All "transactional" SQL effects are rolled back, even most DDL commands. But some special operations (like advancing sequences) are never rolled back.
Also, DEFERRED constraints do not kick in. The manual:
DEFERRED constraints are not checked until transaction commit.
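One known workaround is to have deferred constraints checked early, inside the test transaction, with SET CONSTRAINTS (a sketch, reusing the tbl skeleton from above):
BEGIN;
INSERT INTO tbl(...) VALUES (...);
SET CONSTRAINTS ALL IMMEDIATE;  -- deferred constraints are checked here instead of at COMMIT
ROLLBACK;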
If you need this a lot, work with a copy of your table, or even your database.
Strictly speaking, while any trigger / constraint / concurrent event is allowed, there is no other way to "validate objects" than to insert them into the actual target table in the actual target database at the actual point in time. Triggers, constraints, even default values, can interact with the current state of the whole DB. The more possibilities are ruled out and requirements are reduced, the more options we might have to emulate the test.
CREATE FUNCTION validate_function()
RETURNS trigger LANGUAGE plpgsql
AS $function$
DECLARE
    valid_flag boolean := true;
BEGIN
    -- Validation code: set valid_flag to false when the row fails
    IF NOT valid_flag THEN
        RAISE EXCEPTION 'This record is not valid id %', NEW.id
            USING HINT = 'Please enter valid record';
        RETURN NULL;  -- likely unreachable: RAISE EXCEPTION aborts first
    ELSE
        RETURN NEW;
    END IF;
END;
$function$;

CREATE TRIGGER validate_rec BEFORE INSERT OR UPDATE ON some_tbl
FOR EACH ROW EXECUTE FUNCTION validate_function();
With this function and trigger you validate inside the trigger. If the new record fails validation, you set valid_flag to false and then use that to raise an exception. The RETURN NULL; is probably redundant and I am not sure it will be reached, but if it is, it will also abort the INSERT or UPDATE. If the record is valid, you RETURN NEW and the insert/update completes.
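The two answers combine nicely: you can exercise such a trigger in a throwaway transaction without persisting anything (a sketch, reusing the hypothetical some_tbl from above):
BEGIN;
INSERT INTO some_tbl (id) VALUES (123);  -- fires validate_rec; raises an exception if invalid
ROLLBACK;  -- a valid row is discarded again; nothing is persisted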

Which explicit lock to use for a trigger?

I am trying to understand which type of lock to use for a trigger function.
Simplified function:
CREATE OR REPLACE FUNCTION max_count() RETURNS TRIGGER AS
$$
DECLARE
    max_row INTEGER := 6;
    association_count INTEGER := 0;
BEGIN
    LOCK TABLE my_table IN ROW EXCLUSIVE MODE;
    SELECT INTO association_count COUNT(*) FROM my_table WHERE user_id = NEW.user_id;
    IF association_count > max_row THEN
        RAISE EXCEPTION 'Too many rows';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER my_max_count
AFTER INSERT OR UPDATE ON my_table
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE max_count();
I initially was planning to use EXCLUSIVE, but it feels too heavy. What I really want is to ensure that during this function's execution no new rows are added to the table with the concerned user_id.
If you want to prevent concurrent transactions from modifying the table, a SHARE lock would be correct. But that could lead to a deadlock if two such transactions run at the same time — each has modified some rows and is blocked by the other one when it tries to escalate the table lock.
Moreover, all table locks that conflict with SHARE UPDATE EXCLUSIVE will lead to autovacuum cancellation, which will cause table bloat when it happens too often.
So stay away from table locks, they are usually the wrong thing.
The better way to go about this is to use no explicit locking at all, but to use the SERIALIZABLE isolation level for all transactions that access this table.
Then you can simply use your trigger (without lock), and no anomalies can occur. If you get a serialization error, repeat the transaction.
This comes with a certain performance penalty, but allows more concurrency than a table lock. It also avoids the problems described in the beginning.
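A sketch of what that looks like in practice, using the same my_table (the user_id value is arbitrary; the retry loop lives in the application):
-- Each transaction that touches my_table runs serializable:
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO my_table (user_id) VALUES (42);  -- trigger counts rows; no LOCK TABLE needed
COMMIT;
-- If COMMIT (or any statement) fails with SQLSTATE 40001 (serialization_failure),
-- the application simply repeats the whole transaction.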

How to lock (exclusive) a record during a function call in pgsql?

In pgsql, how can I lock a record during a function run? Consider the following function.
create or replace function foo.bar_func(int) returns int as $$
    with s as (
        select * from foo.bar where id = $1  -- <-- lock the fetched row [BEGIN]
    )
    /*
    Some query, update, insert, ...
    */
    select coalesce(s.id, -1) from s;  -- return something
    -- <-- release the locked row [END]
$$ language sql;
I'd like to lock the row (if found) at the beginning of the function until it finishes its work.
How does pg_advisory_lock(bigint) work? Does it help here? What is the difference from SELECT ... FOR UPDATE?
SELECT … FOR UPDATE does what you expect, namely locks the returned rows exclusively until the end of the current transaction (see here).
Advisory locks, on the other hand, are application defined. They are held either until the end of the current transaction or until the end of the current session (see here). Thus, they need to be checked manually and you may need to release them manually.
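For completeness, there is also a transaction-scoped advisory variant that is released automatically, so no manual unlock is needed (the key 42 here is an arbitrary application-chosen value):
SELECT pg_advisory_xact_lock(42);  -- blocks until key 42 is free; released at COMMIT/ROLLBACK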
If you want to use variables (like s in your sample code), you have to use PL/pgSQL. However, there doesn't seem to be a way to make your function transactional. Instead, it will always be executed in the context of the surrounding transaction. Adding an EXCEPTION clause to your function causes the function to be wrapped in a subtransaction (see here), but locks acquired by your function will still be held until the end of the surrounding transaction. I tested with PG 9.3.
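A minimal PL/pgSQL sketch of the SELECT ... FOR UPDATE approach, assuming the foo.bar table from the question:
create or replace function foo.bar_func(_id int) returns int as $$
declare
    s foo.bar%rowtype;
begin
    -- lock the row (if found) until the surrounding transaction ends
    select * into s from foo.bar where id = _id for update;
    -- Some query, update, insert, ...
    return coalesce(s.id, -1);  -- s.id is NULL when no row was found
end;
$$ language plpgsql;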

Table locking in a plpgsql function

Let's say I've written a plpgsql function that does the following:
CREATE OR REPLACE FUNCTION foobar(_foo_data_id bigint)
RETURNS bigint AS $$
BEGIN
    DROP TABLE IF EXISTS tmp_foobar;
    CREATE TEMP TABLE tmp_foobar AS
    SELECT *
    FROM foo_table ft
    WHERE ft.foo_data_id = _foo_data_id;
    -- more SELECT queries on unrelated tables
    -- a final SELECT query that reads tmp_foobar and supplies the RETURN value
END;
$$ LANGUAGE plpgsql;
First question:
If I simultaneously invoked this function twice, is it possible for the second invocation of foobar() to drop the tmp_foobar table while the first invocation of foobar() is still running?
I understand that SELECT statements create an ACCESS SHARE lock, but will that lock persist until the SELECT statement completes or until the implied COMMIT at the end of the function?
Second question:
If the latter is true, will the second invocation of foobar() indefinitely re-try DROP TABLE IF EXISTS tmp_foobar; until the lock is dropped or will it fail at some point?
If you simultaneously invoke a function twice, it means you're using two separate sessions to do so. Temporary tables are not shared between sessions, so the second session would not "see" tmp_foobar from the first session, and there would be no interaction. See http://www.postgresql.org/docs/9.2/static/sql-createtable.html#AEN70605 ("Temporary tables").
Locks persist until the end of the transaction (regardless of how you acquire them; exception are advisory locks, but that's not what you're doing.)
The second question does not need an answer, because the premise is false.
One more thing. It might be useful to create indexes on that temporary table of yours, and ANALYZE it; that might cause the final query to be faster.
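A quick demonstration of that session isolation (two separate psql sessions):
-- Session A
CREATE TEMP TABLE tmp_foobar (x int);
-- Session B, at the same time, succeeds and gets its own private table:
CREATE TEMP TABLE tmp_foobar (x int);
DROP TABLE tmp_foobar;  -- drops only session B's copy; session A's is untouched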

Are PostgreSQL functions transactional?

Is a PostgreSQL function such as the following automatically transactional?
CREATE OR REPLACE FUNCTION refresh_materialized_view(name)
RETURNS integer AS
$BODY$
DECLARE
    _table_name ALIAS FOR $1;
    _entry materialized_views%ROWTYPE;
    _result INT;
BEGIN
    EXECUTE 'TRUNCATE TABLE ' || _table_name;
    UPDATE materialized_views
    SET    last_refresh = CURRENT_TIMESTAMP
    WHERE  table_name = _table_name;
    RETURN 1;
END
$BODY$
LANGUAGE plpgsql VOLATILE SECURITY DEFINER;
In other words, if an error occurs during the execution of the function, will any changes be rolled back? If this isn't the default behavior, how can I make the function transactional?
PostgreSQL 11 update: there is limited support for top-level PROCEDUREs that can do transaction control. You still cannot manage transactions in regular SQL-callable functions, so the below remains true except when using the new top-level procedures.
Functions are part of the transaction they're called from. Their effects are rolled back if the transaction rolls back. Their work commits if the transaction commits. Any BEGIN ... EXCEPTION blocks within the function operate like (and under the hood use) savepoints, in the manner of the SAVEPOINT and ROLLBACK TO SAVEPOINT SQL statements.
The function either succeeds in its entirety or fails in its entirety, barring BEGIN ... EXCEPTION error handling. If an error is raised within the function and not handled, the transaction calling the function is aborted. Aborted transactions cannot commit, and if they try to commit the COMMIT is treated as ROLLBACK, same as for any other transaction in error. Observe:
regress=# BEGIN;
BEGIN
regress=# SELECT 1/0;
ERROR: division by zero
regress=# COMMIT;
ROLLBACK
See how the transaction, which is in the error state due to the zero division, rolls back on COMMIT?
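And a minimal sketch of the BEGIN ... EXCEPTION savepoint behaviour mentioned above (some_tbl is a hypothetical table with a unique constraint on id):
CREATE OR REPLACE FUNCTION try_insert() RETURNS void AS $$
BEGIN
    BEGIN  -- inner block with EXCEPTION clause = implicit savepoint
        INSERT INTO some_tbl (id) VALUES (1);
    EXCEPTION WHEN unique_violation THEN
        -- only the inner block's work is rolled back;
        -- the outer transaction continues normally
        RAISE NOTICE 'duplicate skipped';
    END;
END;
$$ LANGUAGE plpgsql;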
If you call a function without an explicit surrounding transaction, the rules are exactly the same as for any other Pg statement:
BEGIN;
SELECT refresh_materialized_view(name);
COMMIT;
(where COMMIT will fail if the SELECT raised an error).
PostgreSQL does not (yet) support autonomous transactions in functions, where the procedure/function could commit/rollback independently of the calling transaction. This can be simulated using a new session via dblink.
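A sketch of that dblink workaround (assumes the dblink extension is installed; audit_log is a hypothetical table):
-- Runs in its own connection and commits immediately,
-- even if the calling transaction later rolls back:
SELECT dblink_exec('dbname=' || current_database(),
                   'INSERT INTO audit_log(msg) VALUES (''kept regardless'')');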
BUT, things that aren't transactional or are imperfectly transactional exist in PostgreSQL. If it has non-transactional behaviour in a normal BEGIN; do stuff; COMMIT; block, it has non-transactional behaviour in a function too. For example, nextval and setval, TRUNCATE, etc.
As my knowledge of PostgreSQL is less deep than Craig Ringer's, I will try to give a shorter answer: yes.
If you execute a function that has an error in it, none of its steps will have an impact on the database.
Also, if you execute a query in pgAdmin, the same happens.
For example, if you execute in a query:
update your_table yt set column1 = 10 where yt.id=20;
select anything_that_do_not_exists;
The update to the row with id = 20 of your_table will not be saved in the database.
UPDATE Sep 2018
To clarify the concept, I have made a little example with the non-transactional function nextval.
First, let's create a sequence:
create sequence test_sequence start 100;
Then, let's execute:
update your_table yt set column1 = 10 where yt.id=20;
select nextval('test_sequence');
select anything_that_do_not_exists;
Now, if we open another query and execute
select nextval('test_sequence');
We will get 101, because the first value (100) was consumed by the earlier query (sequences are not transactional), even though the update was never committed.
https://www.postgresql.org/docs/current/static/plpgsql-structure.html
It is important not to confuse the use of BEGIN/END for grouping statements in PL/pgSQL with the similarly-named SQL commands for transaction control. PL/pgSQL's BEGIN/END are only for grouping; they do not start or end a transaction. Functions and trigger procedures are always executed within a transaction established by an outer query — they cannot start or commit that transaction, since there would be no context for them to execute in. However, a block containing an EXCEPTION clause effectively forms a subtransaction that can be rolled back without affecting the outer transaction. For more about that see Section 39.6.6.
A function is not a transaction of its own. You call it using
select schemaName.functionName()
and that statement runs as a single transaction (call it T1, under the default autocommit behaviour), so all the statements inside the function belong to T1. In this way, the function executes within a single transaction.
Postgres 14 update: all statements written between the BEGIN and END block of a procedure/function are executed in a single transaction, so any error raised while executing the block causes an automatic rollback of the transaction.
Additionally, this ATOMIC transaction behaviour covers triggers as well.
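For reference, a minimal sketch of that Postgres 14 SQL-standard (BEGIN ATOMIC) function body syntax:
CREATE FUNCTION add_one(i int) RETURNS int
LANGUAGE sql
BEGIN ATOMIC
    SELECT i + 1;  -- the body is parsed at creation time and runs atomically
END;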