FOR EACH STATEMENT and FOR EACH ROW triggers order of execution - postgresql

In a postgres database, say i have 3 tables, tb1, tb2 and tb3.
tb2 gets updated after insert on tb1 for each row using trigger T1, and tb3 gets updated after insert on tb1 for each statement using trigger T2.
my situation is I don't want tb3 to get updated until tb1 and tb2 finish updating, because she uses both.
now if I execute a query that inserts 10k lines on tb1 after the last line is inserted in tb1 the statement will end. and trigger T2 will fire. My question is, will T2 fire before tb2 gets its last 10k-th line or after?
If before, can you propose a solution so that tb3 doesn't get updated until after the two tables get both all the inserts finished?

The documentation has the desired information:
Statement-level BEFORE triggers naturally fire before the statement starts to do anything, while statement-level AFTER triggers fire at the very end of the statement. These types of triggers may be defined on tables, views, or foreign tables. Row-level BEFORE triggers fire immediately before a particular row is operated on, while row-level AFTER triggers fire at the end of the statement (but before any statement-level AFTER triggers).
[...]
If more than one trigger is defined for the same event on the same relation, the triggers will be fired in alphabetical order by trigger name.
So you can rely on t2 running after t1.

Related

How to avoid deadlock when delete/update the same record in the Postgres

I have a scenario when I play with Postgres.
We have one table with primary key, and there are two concurrent process, the one can update record, another process can delete record.
Now we are facing deadlock, when two processes play with update/delete the same record in the table.
I google how to avoid deadlock, someone says to use "SELECT FOR UPDATE".
Suppose there are two statements as following
update table_A set name='aaaa' where cid=1;
delete table_A where cid=1;
My question is,
(1) Do I need to add "SELECT FOR UPDATE" to both statements or just one statement in order to avoid deadlock?
(2) Could you give a complete example how to add "SELECT FOR UPDATE" ? I mean, what does it look like after you add "SELECT FOR UPDATE"? I never do it before, I want to learn how to add it.
SELECT ... FOR UPDATE locks the selected rows so that any other transaction can neither perform an update nor a SELECT ... FOR UPDATE on these rows. These transactions must wait until the transaction with the first SELECT ... FOR UPDATE releases the lock on the rows again.
If SELECT ... FOR UPDATE is the first statement in all transactions, no deadlock can occur. Because no transaction can lock other lines, which could be used in the further course of other transactions.
So your two transactions should look like this:
BEGIN;
SELECT * FROM table_A WHERE cid = 1 FOR UPDATE;
-- some other statements
UPDATE table_A SET name = 'aaaa' WHERE cid = 1;
END;
and:
BEGIN;
SELECT * FROM table_A WHERE cid = 1 FOR UPDATE;
-- some other statements
DELETE FROM table_A WHERE cid = 1;
END;

The order of executing triggers on bulk inserts

Assume we are inserting 1000 (from 1 to 1000) rows in one statement into one table.
The table has one before and one after trigger.
What the order of processing of these rows?
At this moment I suppose:
Before trigger execution for 1st row
insert the 1st row
Before trigger execution for 2nd row
insert the 2nd row
......
Then at the end the after triggers will be called in an undefined order.
Am I right? Where can I find the proofs of my assumptions?
It depends your after insert trigger is row-level triggers or statement-level triggers.
Statement triggers are triggered once after each statement
...
FOR EACH STATEMENT
EXECUTE PROCEDURE xyz();
And if you want the trigger to execute for each affected row and that means you want a row-level trigger.
...
FOR EACH ROW
EXECUTE PROCEDURE xyz();

DB2 Triggers execution order

Given....
CREATE PROCEDURE NotaRandomUpdate (#table1_id INT, #new_value INT)
AS
BEGIN
begin transaction
UPDATE Table1
SET field1 = #new_value
WHERE id = #table1_id
INSERT INTO Table2 VALUE(#new_value)
end transaction
END
In the above (very) simplified situation, if there are 2 seperate TRIGGERS, one on each of Table1 & Table2, which trigger would execute 1st?
I'm looking to take the combined result of the full transaction (with information not referenced in the transaction itself) and save that combined result eleswhere - so I need to bring data from the join of Table1=>Table2 out.
If Table1-Trigger executes 1st, then I'm faced with not having data needed (at that instance) from Table2.
If Table2-Trigger executes 1st, then I'm faced with not having data needed (at that instance) from Table1.
I presume the triggers only execute during/after the commit phase....or are they executed immediately the Table1-update & Table-insert statements are executed and thus the overall database updates are wrapped up inside the full transaction?
This is due to happen in a DB2 database.
Is a solution possible?.
Or am I faced with running a "some time later" activity (like pre-EOD) which executes a query which joins the 2 tables after all relevent updates (for that day) have been completed, providing of course that each of Table1 & Table2 have some timestamp columns that can be tracked.
end
Any relevant triggers for Table1 will fire before any relevant triggers on Table2 , assuming no rollback.
Db2 triggers execute with Insert or Update or Delete statements, whether per-row or per-statement. Hence the statements inside trigger body will only run (assuming trigger is valid) during execution of the triggering statement. Commit will not invoke trigger logic.
Each of your Insert/Update/Delete statements that executes will execute any relevant valid triggers during execution of that statement before execution of the next statement will begin.

some questions about "select for update"?

Here is Pseudo code using libpq.so;but it does not go as what I think.
transaction begin
re1 = [select ics_time from table1 where c1=c11, c2=c22, c3=c33, c4=c44 for update];
if(re1 satisfies the condition)
{
re2 = [select id where c1=c11, c2=c22, c3=c33, c4=c44 for update];
delete from table1 where id = re2;
delete from table2 where id = re2;
delete from table3 where id = re3;
insert a new record into table1,table2,table3 with the c1,c2,c3,c4 as primary keys;
]
commit or rollback
Note that c1,c2,c3,c4 are all set as the primary key in the database, so it is only one row with these keys in the database.
What confuses me is as follows:
There are two "select for update" which will lock the same row. In
this code, does the second SQL statement wait for the exclusive lock
blocked by the first statement? But, the actual situation is that it
does not happen.
Something occurs beyond my expectation. In the log, I see a large
number of duplicate insert errors. In my opinion that the "select
for update " locks the row with the unique for keys, two processes
go serially. The insert operation goes after a delete. How can these
duplicate insertation occur? Doesn't the "select for update" add an
exclusive lock to the row, which blocks all other processes that
want to lock the same row?
Regarding your first point: Locks are not held by the statement, locks are held by the surrounding transaction. Your pseudo-code seems to use one connections with one transaction which in turn uses several statements. So the second SELECT FOR UPDATE is not blocked by the first. Read the docs about locking for this:
[...]An exclusive row-level lock on a specific row is automatically acquired when the row is updated or deleted. The lock is held until the transaction commits or rolls back, just like table-level locks. Row-level locks do not affect data querying; they block only writers to the same row.
Otherwise it would be very funny, if a transaction could block itself so easily.
Regarding your second point: I cannot answer this because a) your pseudo code is to pseudo for this problem and b) I don't understand what you mean by "processes" and the exact usecase.

Is it possible to dynamically loop through a table's columns?

I have a trigger function for a table test which has the following code snippet:
IF TG_OP='UPDATE' THEN
IF OLD.locked > 0 AND
( OLD.org_id <> NEW.org_id OR
OLD.document_code <> NEW.document_code OR
-- other columns ...
)
THEN
RAISE EXCEPTION 'Message';
-- more code
So I am statically checking all the column's new value with its previous value to ensure integrity. Now every time my business logic changes and I have to add new columns into that table, I will have to modify this trigger each time. I thought it would be better if somehow I could dynamically check all the columns of that table, without explicitly typing their name.
How can it be done?
From 9.0 beta2 documentation about WHEN clause in triggers, which might be able to be used in earlier versions within the trigger body:
OLD.* IS DISTINCT FROM NEW.*
or possibly (from 8.2 release notes)
IF row(new.*) IS DISTINCT FROM row(old.*)
Take a look at the information_schema, there is a view "columns". Execute a query to get all current columnnames from the table that fired the trigger:
SELECT
column_name
FROM
information_schema.columns
WHERE
table_schema = TG_TABLE_SCHEMA
AND
table_name = TG_TABLE_NAME;
Loop through the result and there you go!
More information can be found in the fine manual.
In Postgres 9.0 or later add a WHEN clause to your trigger definition (CREATE TRIGGER statement):
CREATE TRIGGER foo
BEFORE UPDATE
FOR EACH ROW
WHEN (OLD IS DISTINCT FROM NEW) -- parentheses required!
EXECUTE PROCEDURE ...;
Only possible for triggers BEFORE / AFTER UPDATE, where both OLD and NEW are defined. You'd get an exception trying to use this WHEN clause with INSERT or DELETE triggers.
And radically simplify the trigger function accordingly:
...
IF OLD.locked > 0 THEN
RAISE EXCEPTION 'Message';
END IF;
...
No need to test IF TG_OP='UPDATE' ... since this trigger only works for UPDATE anyway.
Or move that condition in the WHEN clause, too:
CREATE TRIGGER foo
BEFORE UPDATE
FOR EACH ROW
WHEN (OLD.locked > 0
AND OLD IS DISTINCT FROM NEW)
EXECUTE PROCEDURE ...;
Leaving only an unconditional RAISE EXCEPTION in your trigger function, which is only called when needed to begin with.
Read the fine print:
In a BEFORE trigger, the WHEN condition is evaluated just before the
function is or would be executed, so using WHEN is not materially
different from testing the same condition at the beginning of the
trigger function. Note in particular that the NEW row seen by the
condition is the current value, as possibly modified by earlier
triggers. Also, a BEFORE trigger's WHEN condition is not allowed to
examine the system columns of the NEW row (such as oid), because those
won't have been set yet.
In an AFTER trigger, the WHEN condition is evaluated just after the
row update occurs, and it determines whether an event is queued to
fire the trigger at the end of statement. So when an AFTER trigger's
WHEN condition does not return true, it is not necessary to queue an
event nor to re-fetch the row at end of statement. This can result in
significant speedups in statements that modify many rows, if the
trigger only needs to be fired for a few of the rows.
Related:
Fire trigger on update of columnA or ColumnB or ColumnC
To also address the question title
Is it possible to dynamically loop through a table's columns?
Yes. Examples:
Handle result when dynamic SQL is in a loop
Removing all columns with given name
Iteration over RECORD variable inside trigger
Use pl/perl or pl/python. They are much better suited for such tasks. much better.
You can also install hstore-new, and use it's row->hstore semantics, but that's definitely not a good idea when using normal datatypes.