The order of executing triggers on bulk inserts - postgresql

Assume we are inserting 1000 (from 1 to 1000) rows in one statement into one table.
The table has one before and one after trigger.
What the order of processing of these rows?
At this moment I suppose:
Before trigger execution for 1st row
insert the 1st row
Before trigger execution for 2nd row
insert the 2nd row
......
Then at the end the after triggers will be called in an undefined order.
Am I right? Where can I find the proofs of my assumptions?

It depends your after insert trigger is row-level triggers or statement-level triggers.
Statement triggers are triggered once after each statement
...
FOR EACH STATEMENT
EXECUTE PROCEDURE xyz();
And if you want the trigger to execute for each affected row and that means you want a row-level trigger.
...
FOR EACH ROW
EXECUTE PROCEDURE xyz();

Related

DB2 Triggers execution order

Given....
CREATE PROCEDURE NotaRandomUpdate (#table1_id INT, #new_value INT)
AS
BEGIN
begin transaction
UPDATE Table1
SET field1 = #new_value
WHERE id = #table1_id
INSERT INTO Table2 VALUE(#new_value)
end transaction
END
In the above (very) simplified situation, if there are 2 seperate TRIGGERS, one on each of Table1 & Table2, which trigger would execute 1st?
I'm looking to take the combined result of the full transaction (with information not referenced in the transaction itself) and save that combined result eleswhere - so I need to bring data from the join of Table1=>Table2 out.
If Table1-Trigger executes 1st, then I'm faced with not having data needed (at that instance) from Table2.
If Table2-Trigger executes 1st, then I'm faced with not having data needed (at that instance) from Table1.
I presume the triggers only execute during/after the commit phase....or are they executed immediately the Table1-update & Table-insert statements are executed and thus the overall database updates are wrapped up inside the full transaction?
This is due to happen in a DB2 database.
Is a solution possible?.
Or am I faced with running a "some time later" activity (like pre-EOD) which executes a query which joins the 2 tables after all relevent updates (for that day) have been completed, providing of course that each of Table1 & Table2 have some timestamp columns that can be tracked.
end
Any relevant triggers for Table1 will fire before any relevant triggers on Table2 , assuming no rollback.
Db2 triggers execute with Insert or Update or Delete statements, whether per-row or per-statement. Hence the statements inside trigger body will only run (assuming trigger is valid) during execution of the triggering statement. Commit will not invoke trigger logic.
Each of your Insert/Update/Delete statements that executes will execute any relevant valid triggers during execution of that statement before execution of the next statement will begin.

How to update column value based on other column change within same table without any primary key column in table

I have a table called 'custom_manual_edit' with columns 'name', 'builder' and 'flag' in which there is no column with primary key.I have written a trigger when user update any change in builder column and that trigger will invoke a function that should update a flag column value to 10 for record for which builder value is changed
below is my trigger
CREATE TRIGGER builder_update_trigger_manual_custom_edits
AFTER UPDATE
ON edmonton.custom_manual_edit
FOR EACH ROW
WHEN (((old.builder)::text IS DISTINCT FROM (new.builder)::text))
EXECUTE PROCEDURE
edmonton.automated_builder_update_trigger_manual_custom_edits();
and my function
CREATE OR REPLACE FUNCTION
edmonton.automated_builder_update_trigger_manual_custom_edits()
RETURNS trigger AS
$BODY$
DECLARE
e record;
BEGIN
IF NEW.builder <> OLD.builder THEN
EXECUTE FORMAT('UPDATE edmonton.custom_manual_edit set builder_edit_flag = 10;
END IF;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
I know this will update entire table flag column to 10 but how to update flag value for records for which builder value is changed.
Please check the documentation: 36.1. Overview of Trigger Behavior
Trigger functions invoked by per-statement triggers should always
return NULL. Trigger functions invoked by per-row triggers can return
a table row (a value of type HeapTuple) to the calling executor, if
they choose. A row-level trigger fired before an operation has the
following choices:
It can return NULL to skip the operation for the current row. This
instructs the executor to not perform the row-level operation that
invoked the trigger (the insertion, modification, or deletion of a
particular table row).
For row-level INSERT and UPDATE triggers only, the returned row
becomes the row that will be inserted or will replace the row being
updated. This allows the trigger function to modify the row being
inserted or updated.
A row-level BEFORE trigger that does not intend to cause either of
these behaviors must be careful to return as its result the same row
that was passed in (that is, the NEW row for INSERT and UPDATE
triggers, the OLD row for DELETE triggers).
According to the above you must:
declare the trigger as BEFORE UPDATE, not AFTER UPDATE
changebuilder_edit_flag column value directly in NEW row instead of firing UPDATE statement
CREATE TRIGGER builder_update_trigger_manual_custom_edits
BEFORE UPDATE
ON edmonton.custom_manual_edit
FOR EACH ROW
.....
.....
CREATE OR REPLACE FUNCTION
edmonton.automated_builder_update_trigger_manual_custom_edits()
.....
.....
BEGIN
IF NEW.builder <> OLD.builder THEN
NEW.builder_edit_flag = 10;
END IF;
RETURN NEW;
.....
.....

How to optimize pgplsql function?

I have function which is triggered for each row. It has next statements:
SELECT DocDate
FROM Operation
WHERE Operation.ID = New.OperationID
INTO _DocDate;
EXECUTE PROCEDURE ChangeSaldo( _DocDate, ... );
Each row belongs to parent table Operation where we fetch DocDate.
We need run ChangeSaldo for each row with its own _DocDate
That is guaranteed that all rows belongs to same row in Operation table
Will Postgres execute this SELECT for each row or this statement will be cached?
If it is not cached is there a way to optimize SELECT so it will be executed only once?
I plan to create stable function and put there that SELECT.
Because of nature of stable functions the result will be cached

Triggers in DB to be executed after a “set” of insert statements

I’ve got a table that stores events.
However, the way the system has been designed, events are usually recorded in batches. What I mean by this is that a set of events (10 or so) are usually recorded together, rather than just single events.
We can assume that: there is a column called “batch_no” in the events table, so we know which events belong to which batch no.
The question: What I am trying to do is execute a trigger function, every time a batch of events have finished loading to the table. However, the problem is I can’t think of how the trigger will know this, and not just call the function for every row.
The solutions I’ve been thinking about involves something like: (a) define a trigger for each row; (b) on condition that calculate count(select * from events, where NEW.batchNO = events.batchNO); delay some time; calculate again the same count, and if they are equal, we know the batch has finished loading, and we call the trigger.
Although, clearly the solution above sounds complicated? Is there a more better or simpler solution? (Or if not, any help for how I could implement what I described?)
You can pass parameters to a trigger function but only in the CREATE TRIGGER statement, which helps to use the same trigger function for multiple triggers, but does not help with your situation.
You need the trigger to fire on a condition that is not known at the time of trigger creation. I see basically three possibilities:
1) Statement-level trigger
Using the FOR EACH STATEMENT clause. The manual:
A trigger that is marked FOR EACH ROW is called once for every row
that the operation modifies. For example, a DELETE that affects 10
rows will cause any ON DELETE triggers on the target relation to be
called 10 separate times, once for each deleted row. In contrast, a
trigger that is marked FOR EACH STATEMENT only executes once for any
given operation, regardless of how many rows it modifies (in
particular, an operation that modifies zero rows will still result in
the execution of any applicable FOR EACH STATEMENT triggers).
Only applicable if you insert all your batches with a single INSERT command (multiple rows), but not more than one batch at a time.
2) WHEN condition for a row-level trigger.
Using the FOR EACH ROW clause plus a WHEN condition. You need version 9.0+ for that.
If you can tell from a single inserted row, which is the last one of a batch then your trigger definition could look like this:
CREATE TRIGGER insert_after_batch
AFTER INSERT ON tbl
FOR EACH ROW
WHEN (NEW.batch_last) -- any expression identifying the last
EXECUTE PROCEDURE trg_tbl_insert_after_batch();
This assumes a column batch_last boolean in your table, where you flag the last row of a batch. Any expression based on column values is possible.
This way the trigger only fires for the last row of each batch. Make it an AFTER trigger, so all rows are already visible in the table and you can query them together for whatever you want to do in your trigger. Probably the way to go.
3) Conditional code inside your trigger
That's basically the fallback for versions before 9.0 without the WHEN clause. Do the same check inside the trigger before executing the payload. More expensive than 2).

Is it possible to dynamically loop through a table's columns?

I have a trigger function for a table test which has the following code snippet:
IF TG_OP='UPDATE' THEN
IF OLD.locked > 0 AND
( OLD.org_id <> NEW.org_id OR
OLD.document_code <> NEW.document_code OR
-- other columns ...
)
THEN
RAISE EXCEPTION 'Message';
-- more code
So I am statically checking all the column's new value with its previous value to ensure integrity. Now every time my business logic changes and I have to add new columns into that table, I will have to modify this trigger each time. I thought it would be better if somehow I could dynamically check all the columns of that table, without explicitly typing their name.
How can it be done?
From 9.0 beta2 documentation about WHEN clause in triggers, which might be able to be used in earlier versions within the trigger body:
OLD.* IS DISTINCT FROM NEW.*
or possibly (from 8.2 release notes)
IF row(new.*) IS DISTINCT FROM row(old.*)
Take a look at the information_schema, there is a view "columns". Execute a query to get all current columnnames from the table that fired the trigger:
SELECT
column_name
FROM
information_schema.columns
WHERE
table_schema = TG_TABLE_SCHEMA
AND
table_name = TG_TABLE_NAME;
Loop through the result and there you go!
More information can be found in the fine manual.
In Postgres 9.0 or later add a WHEN clause to your trigger definition (CREATE TRIGGER statement):
CREATE TRIGGER foo
BEFORE UPDATE
FOR EACH ROW
WHEN (OLD IS DISTINCT FROM NEW) -- parentheses required!
EXECUTE PROCEDURE ...;
Only possible for triggers BEFORE / AFTER UPDATE, where both OLD and NEW are defined. You'd get an exception trying to use this WHEN clause with INSERT or DELETE triggers.
And radically simplify the trigger function accordingly:
...
IF OLD.locked > 0 THEN
RAISE EXCEPTION 'Message';
END IF;
...
No need to test IF TG_OP='UPDATE' ... since this trigger only works for UPDATE anyway.
Or move that condition in the WHEN clause, too:
CREATE TRIGGER foo
BEFORE UPDATE
FOR EACH ROW
WHEN (OLD.locked > 0
AND OLD IS DISTINCT FROM NEW)
EXECUTE PROCEDURE ...;
Leaving only an unconditional RAISE EXCEPTION in your trigger function, which is only called when needed to begin with.
Read the fine print:
In a BEFORE trigger, the WHEN condition is evaluated just before the
function is or would be executed, so using WHEN is not materially
different from testing the same condition at the beginning of the
trigger function. Note in particular that the NEW row seen by the
condition is the current value, as possibly modified by earlier
triggers. Also, a BEFORE trigger's WHEN condition is not allowed to
examine the system columns of the NEW row (such as oid), because those
won't have been set yet.
In an AFTER trigger, the WHEN condition is evaluated just after the
row update occurs, and it determines whether an event is queued to
fire the trigger at the end of statement. So when an AFTER trigger's
WHEN condition does not return true, it is not necessary to queue an
event nor to re-fetch the row at end of statement. This can result in
significant speedups in statements that modify many rows, if the
trigger only needs to be fired for a few of the rows.
Related:
Fire trigger on update of columnA or ColumnB or ColumnC
To also address the question title
Is it possible to dynamically loop through a table's columns?
Yes. Examples:
Handle result when dynamic SQL is in a loop
Removing all columns with given name
Iteration over RECORD variable inside trigger
Use pl/perl or pl/python. They are much better suited for such tasks. much better.
You can also install hstore-new, and use it's row->hstore semantics, but that's definitely not a good idea when using normal datatypes.