Which explicit lock to use for a trigger? - postgresql

I am trying to understand which type of a lock to use for a trigger function.
Simplified function:
CREATE OR REPLACE FUNCTION max_count() RETURNS TRIGGER AS
$$
DECLARE
max_row INTEGER := 6;
association_count INTEGER := 0;
BEGIN
LOCK TABLE my_table IN ROW EXCLUSIVE MODE;
SELECT INTO association_count COUNT(*) FROM my_table WHERE user_id = NEW.user_id;
IF association_count > max_row THEN
RAISE EXCEPTION 'Too many rows';
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE CONSTRAINT TRIGGER my_max_count
AFTER INSERT OR UPDATE ON my_table
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE max_count();
I initially was planning to use EXCLUSIVE but it feels too heavy. What I really want is to ensure that during this function execution no new rows are added to the table with concerned user_id.

If you want to prevent concurrent transactions from modifying the table, a SHARE lock would be correct. But that could lead to a deadlock if two such transactions run at the same time: each has already modified some rows (and so holds a ROW EXCLUSIVE lock), and each is blocked by the other when it tries to upgrade to the SHARE table lock.
Moreover, all table locks that conflict with SHARE UPDATE EXCLUSIVE will lead to autovacuum cancelation, which will cause table bloat when it happens too often.
So stay away from table locks, they are usually the wrong thing.
The better way to go about this is to use no explicit locking at all, but to use the SERIALIZABLE isolation level for all transactions that access this table.
Then you can simply use your trigger (without lock), and no anomalies can occur. If you get a serialization error, repeat the transaction.
This comes with a certain performance penalty, but allows more concurrency than a table lock. It also avoids the problems described in the beginning.
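A minimal sketch of what that could look like, assuming my_table has a user_id column and that any other columns have defaults; the value 42 is made up for illustration. The retry itself has to happen in the client application.

```sql
-- Trigger function from the question, minus the LOCK TABLE statement.
CREATE OR REPLACE FUNCTION max_count() RETURNS trigger AS
$$
DECLARE
    max_row           integer := 6;
    association_count integer;
BEGIN
    SELECT count(*) INTO association_count
    FROM   my_table
    WHERE  user_id = NEW.user_id;

    IF association_count > max_row THEN
        RAISE EXCEPTION 'Too many rows';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Every transaction that touches my_table would run like this.
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO my_table (user_id) VALUES (42);  -- 42 is a made-up value
COMMIT;  -- the deferred constraint trigger fires here

-- If any statement (including COMMIT) fails with SQLSTATE 40001
-- (serialization_failure), issue ROLLBACK and rerun the whole transaction.
```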

Related

Can Postgres insert triggers and/or checks be run without inserting?

I would love to be able to validate objects representing table rows using the database's existing constraints (triggers that raise exceptions and checks) without actually inserting them into the database.
Is there currently a way one could do this in postgres? At least with BEFORE INSERT triggers and CHECK, I assume it makes no sense with AFTER INSERT triggers.
The easiest way I can think of right now would be to:
Lock the table
Insert a new row
If an exception is raised, pass it on to the API; otherwise DELETE the row and call it valid
Unlock
But I can see several issues with this.
A simpler way is to insert within a transaction and not commit:
BEGIN;
INSERT INTO tbl(...) VALUES (...);
-- see effects ...
ROLLBACK;
No need for additional locking. The row is never visible to any other transaction with the default transaction isolation level READ COMMITTED. (You might be stalling concurrent writes that conflict with the tested row.)
Notable side-effect: Sequences of serial or IDENTITY columns are advanced even if the INSERT is never committed. But gaps in sequential numbers are to be expected anyway and nothing to worry about.
Be wary of triggers with side-effects. All "transactional" SQL effects are rolled back, even most DDL commands. But some special operations (like advancing sequences) are never rolled back.
Also, DEFERRED constraints do not kick in. The manual:
DEFERRED constraints are not checked until transaction commit.
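If the deferred checks matter for the test, one option is to force them inside the transaction with SET CONSTRAINTS before rolling back. This is only a sketch: tbl and its columns (id, name) are placeholders, and it covers deferrable constraints and constraint triggers only.

```sql
BEGIN;
-- tbl and its columns are placeholders for your own table.
INSERT INTO tbl (id, name) VALUES (1, 'candidate row');
-- Check all deferred constraints now instead of at commit; this raises
-- an error here if the new row violates one of them.
SET CONSTRAINTS ALL IMMEDIATE;
ROLLBACK;
```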
If you need this a lot, work with a copy of your table, or even your database.
Strictly speaking, while any trigger / constraint / concurrent event is allowed, there is no other way to "validate objects" than to insert them into the actual target table in the actual target database at the actual point in time. Triggers, constraints, even default values, can interact with the current state of the whole DB. The more possibilities are ruled out and requirements are reduced, the more options we might have to emulate the test.
CREATE FUNCTION validate_function()
RETURNS trigger LANGUAGE plpgsql
AS $function$
DECLARE
valid_flag boolean := 't';
BEGIN
-- Validation code: set valid_flag to 'f' if NEW fails a check
if valid_flag = 'f' then
-- assumes some_tbl has an id column
RAISE EXCEPTION 'This record is not valid, id %', NEW.id
USING HINT = 'Please enter valid record';
RETURN NULL;
else
RETURN NEW;
end if;
END;
$function$;
CREATE TRIGGER validate_rec BEFORE INSERT OR UPDATE ON some_tbl
FOR EACH ROW EXECUTE FUNCTION validate_function();
With this function and trigger you validate inside the trigger. If the new record fails validation, you set valid_flag to false and then use that to raise an exception. The RETURN NULL; is redundant: RAISE EXCEPTION aborts the function, so that line is never reached. If the record is valid, you RETURN NEW and the insert/update completes.

Postgres Locking a table inside the function is not working?

CREATE OR REPLACE FUNCTION()
RETURND VOID AS
BEGIN
FOR I IN 1..5
LOOP
LOCK TABLE tbl_Employee1 IN EXCLUSIVE MODE;
INSERT INTO tbl_Employee1
VALUES
(i,'test');
END LOOP;
COMMIT;
END;
$$ LANGUAGE PLPGSQL
When I select from the table, it goes into an infinite loop, which means the transaction is not complete. Please help me out.
Your code has been stripped down so much that it doesn't really make sense any more.
However, you should only lock the table once, not in each iteration of the loop. Plus you can't use commit in a function in Postgres, so you have to remove that as well. It's also bad coding style (in Postgres and Oracle) to not provide column names for the insert statement.
Immediate solution:
CREATE OR REPLACE FUNCTION ...
RETURNS VOID AS
$$
BEGIN
LOCK TABLE tbl_Employee1 IN EXCLUSIVE MODE;
FOR i IN 1..5 LOOP
INSERT INTO tbl_Employee1 (id, name)
VALUES (i, 'test');
END LOOP;
-- no commit here!
END;
$$ LANGUAGE PLPGSQL;
The above is needlessly complicated in Postgres and can be implemented much more efficiently without a loop:
CREATE OR REPLACE FUNCTION ....
RETURNS VOID AS
$$
BEGIN
LOCK TABLE tbl_Employee1 IN EXCLUSIVE MODE;
INSERT INTO tbl_Employee1 (id, name)
SELECT i, 'test'
FROM generate_series(1,5) AS t(i);
END;
$$ LANGUAGE PLPGSQL;
Locking a table in exclusive mode seems like a bad idea to begin with. In Oracle as well, but in Postgres this might have more severe implications. If you want to prevent duplicates in the table, create a unique index (or constraint) and deal with errors. Or use insert ... on conflict in Postgres. That will be much more efficient (and scalable) than locking a complete table.
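Additionally: be careful which lock mode you actually take in Postgres. With LOCK TABLE ... IN EXCLUSIVE MODE, read-only queries on the table are still allowed, as in Oracle: the mode blocks all writes and SELECT ... FOR UPDATE / FOR SHARE, but not plain SELECT statements. Only ACCESS EXCLUSIVE mode, which is the default when LOCK TABLE is given no mode, blocks every access to the table, including SELECT.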
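As a rough sketch of the unique index plus insert ... on conflict suggestion (Postgres 9.5 or later): the table definition below is hypothetical, since the question does not show the real one.

```sql
-- Hypothetical table definition; the question does not show the real one.
CREATE TABLE IF NOT EXISTS tbl_Employee1 (
    id   integer PRIMARY KEY,   -- the primary key provides the unique index
    name text
);

-- Rows whose id already exists are skipped instead of raising an error,
-- so no table lock is needed to avoid duplicates.
INSERT INTO tbl_Employee1 (id, name)
SELECT i, 'test'
FROM   generate_series(1, 5) AS t(i)
ON CONFLICT (id) DO NOTHING;
```

Running the INSERT twice leaves five rows in the table; the second run inserts nothing.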

PostgreSQL BEFORE INSERT trigger locking behavior in a concurrent environment

I have a general function that can manipulate the sequence of any table (why is irrelevant to my question). It reads the current value, works out the new value, sets it, and returns its calculation, which is what's inserted. This is obviously a multi-step process.
I call it from a BEFORE INSERT trigger on tables where I need it.
All I need to know is am I guaranteed that the function will be called by only one caller at a time in a multi-user environment?
Specifically, does the BEFORE INSERT trigger have to complete before it is called again by another caller?
Logically, I would assume yes, but one never knows what may be going on under the hood.
If the answer is no, what minimal locking would I need on the function to guarantee I can read and write the sequence in a "thread-safe" manner?
I'm using PG 10.
EDIT
Here is the function updated with a lock:
CREATE OR REPLACE FUNCTION public.uts_set()
RETURNS TRIGGER AS
$$
DECLARE
sv int8;
seq text := format('%I.%I_uts_seq', tg_table_schema, tg_table_name);
BEGIN
EXECUTE format('LOCK TABLE %I IN ROW EXCLUSIVE MODE;', tg_table_name);
EXECUTE 'SELECT last_value+1 FROM ' || seq INTO sv; -- currval(seq) isn't useable
PERFORM setval(seq, GREATEST(sv, (EXTRACT(epoch FROM localtimestamp) * 1000000)::int8), false);
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
However, an INSERT already acquires ROW EXCLUSIVE on the table, so this statement may be redundant and a stronger lock may be needed. Or, conversely, it may mean no lock is needed.
UPDATE
If I am reading this SO question correctly, my original version without the LOCK should work since the trigger acquires the same lock my updated function is redundantly taking.
All I need to know is am I guaranteed that the function will be called by only one caller at a time in a multi-user environment?
No. Not related to calling functions itself, but you can achieve this behaviour with SERIALIZABLE transaction isolation level:
This level emulates serial transaction execution for all committed
transactions; as if transactions had been executed one after another,
serially, rather than concurrently
But this approach would introduce several tradeoffs, such as preparing your application to retry transactions that fail with a serialization failure.
Maybe I missed something, but I really believe that you just need NEXTVAL, something like below:
CREATE OR REPLACE FUNCTION public.uts_set()
RETURNS TRIGGER AS
$$
DECLARE
sv int8;
-- First, use the %I placeholder for identifiers instead of %s
seq text := format('%I.%I', tg_table_schema, tg_table_name || '_uts_seq');
BEGIN
-- Second, you cannot call CURRVAL in a session
-- that has not called NEXTVAL on the sequence first
sv := NEXTVAL(seq);
-- Do your logic here...
-- Result is ignored since this is a statement-level trigger
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
Remember that CURRVAL acts on session local scope and NEXTVAL on global scope, so you have a reliable thread-safe mechanism in hands.
The sequence itself handles thread safety with concurrent sessions. So it really comes down to the code that is interacting with the sequence. The following code is thread safe:
SELECT nextval('myseq');
If the sequence is doing much fancier things like setval and currval, I would be more worried about that being done in a high transaction/multi-user environment. Even so, the sequence itself should be locked from other queries while the sequence is being manipulated.
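To illustrate the difference in scope between nextval and currval, here is a small sketch; demo_uts_seq is a hypothetical sequence created only for this example.

```sql
CREATE SEQUENCE demo_uts_seq;      -- hypothetical sequence name

SELECT nextval('demo_uts_seq');    -- returns 1 on a fresh sequence
SELECT currval('demo_uts_seq');    -- returns the value this session last obtained

-- In a different session that has not called nextval on the sequence yet,
-- the following would fail, because currval is not yet defined there:
-- SELECT currval('demo_uts_seq');
```

Each nextval call hands out a distinct value even under concurrency, which is why it needs no extra locking.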

PostgreSQL concurrent update selects

I am attempting to build some sort of update-select for a job queue. I need it to support concurrent processes working on the same table or database. This server will be used only for the queue, so a database per queue is acceptable. Originally I was thinking about something like the following:
UPDATE queue SET state=1, ts=NOW() WHERE id IN (SELECT id FROM queue WHERE state=0 LIMIT X) RETURNING *
I have been reading that this will cause a race condition. I also read that there is an option for the SELECT subquery to use FOR UPDATE, but that will lock the rows and concurrent calls will block, whereas I would prefer that they skip over to the next unlocked row.
So what I am asking for is the best way to build a FIFO system in Postgres that requires as little locking of the entire database as possible.
The typical way to do this is to wrap it in a PLPGSQL function, select FOR UPDATE NOWAIT, and then use exception handling to skip the locked rows.
This does place some additional overhead on the function because exception handling requires additional processor cycles to manage even if there are no exceptions.
As a very simple example:
CREATE OR REPLACE FUNCTION get_all_unlocked_customers() RETURNS SETOF customer
LANGUAGE PLPGSQL AS
$$
BEGIN
RETURN QUERY SELECT * FROM customer FOR UPDATE NOWAIT;
EXCEPTION
WHEN LOCK_NOT_AVAILABLE THEN
-- NO NEED TO DO ANYTHING
NULL;
END;
$$;
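Note that NOWAIT raises an error as soon as the query hits a locked row. On Postgres 9.5 or later, a different technique, FOR UPDATE SKIP LOCKED, skips locked rows without any exception handling. The following is a sketch only: the queue table definition is a guess based on the columns in the question, and the batch size of 10 is arbitrary.

```sql
-- Hypothetical queue table matching the columns used in the question.
CREATE TABLE IF NOT EXISTS queue (
    id    bigserial PRIMARY KEY,
    state integer NOT NULL DEFAULT 0,
    ts    timestamptz
);

-- Claim up to 10 pending jobs in FIFO order; rows currently locked by
-- another worker are skipped instead of waited for.
UPDATE queue
SET    state = 1, ts = now()
WHERE  id IN (
    SELECT id
    FROM   queue
    WHERE  state = 0
    ORDER  BY id
    LIMIT  10
    FOR    UPDATE SKIP LOCKED
)
RETURNING *;
```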

Table locking in a plpgsql function

Let's say I've written plpgsql function that does the following:
CREATE OR REPLACE FUNCTION foobar (_foo_data_id bigint)
RETURNS bigint AS $$
BEGIN
DROP TABLE IF EXISTS tmp_foobar;
CREATE TEMP TABLE tmp_foobar AS
SELECT *
FROM foo_table ft
WHERE ft.foo_data_id = _foo_data_id;
-- more SELECT queries on unrelated tables
-- a final SELECT query that invokes tmp_foobar
END;
$$ LANGUAGE plpgsql;
First question:
If I simultaneously invoked this function twice, is it possible for the second invocation of foobar() to drop the tmp_foobar table while the first invocation of foobar() is still running?
I understand that SELECT statements create an ACCESS SHARE lock, but will that lock persist until the SELECT statement completes or until the implied COMMIT at the end of the function?
Second question:
If the latter is true, will the second invocation of foobar() indefinitely re-try DROP TABLE IF EXISTS tmp_foobar; until the lock is dropped or will it fail at some point?
If you simultaneously invoke a function twice, it means you're using two separate sessions to do so. Temporary tables are not shared between sessions, so the second session would not "see" tmp_foobar from the first session, and there would be no interaction. See http://www.postgresql.org/docs/9.2/static/sql-createtable.html#AEN70605 ("Temporary tables").
Locks persist until the end of the transaction, regardless of how you acquire them. The exception is advisory locks, but that's not what you're doing.
The second question does not need an answer, because the premise is false.
One more thing. It might be useful to create indexes on that temporary table of yours, and ANALYZE it; that might cause the final query to be faster.
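A sketch of that last suggestion, using a stand-in temporary table so it runs on its own; some_join_col and payload are hypothetical columns, and in foobar() the two statements would go right after the CREATE TEMP TABLE.

```sql
-- Stand-in for the temp table built inside foobar().
CREATE TEMP TABLE tmp_foobar AS
SELECT g AS some_join_col, g * 2 AS payload
FROM   generate_series(1, 1000) AS g;

-- Index the column the final query joins or filters on.
CREATE INDEX ON tmp_foobar (some_join_col);

-- Autovacuum cannot process temporary tables, so collect planner
-- statistics explicitly.
ANALYZE tmp_foobar;
```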