How can you write a trigger in PostgreSQL that is called only once, at the end of a transaction, as soon as the table x it is defined on has been changed? The trigger should fire neither for every row nor for every statement, but exactly once, after all statements of the transaction have been executed.
(The background: I would like to store a timestamp for certain tables recording when they were last changed.)
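One approach that comes close is a deferred constraint trigger. Constraint triggers must be row-level, so it still fires once per changed row, but only at commit time; for a last-changed timestamp the repeated write is harmless because it is idempotent within the transaction. A minimal sketch, assuming PostgreSQL 9.5+ for ON CONFLICT (all table and function names are placeholders):
CREATE TABLE table_change_log (
    table_name text PRIMARY KEY,
    changed_at timestamptz NOT NULL
);

CREATE OR REPLACE FUNCTION log_last_change() RETURNS trigger AS $$
BEGIN
    -- Idempotent within a transaction, so firing once per row is harmless.
    INSERT INTO table_change_log (table_name, changed_at)
    VALUES (TG_TABLE_NAME, now())
    ON CONFLICT (table_name) DO UPDATE SET changed_at = now();
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER x_last_change
    AFTER INSERT OR UPDATE OR DELETE ON x
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW EXECUTE PROCEDURE log_last_change();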
I have a database where data from different source tables are processed and stored in a materialized view.
I chose to store it as a MV because the query that processes this data takes a while (about 3 seconds) and needs to be run all the time.
So, I created a trigger to refresh the MV every time the source table is modified (INSERT, DELETE or UPDATE).
The problem is, the trigger function seems to wait for the materialized view to finish refreshing before it returns, and I don't want this.
I want the insert operation to return as fast as possible, and the MV to refresh in parallel.
My function:
CREATE OR REPLACE FUNCTION "MOBILIDADE".atualizar_mv_solicitacao()
RETURNS TRIGGER AS
$$
BEGIN
REFRESH MATERIALIZED VIEW CONCURRENTLY "MOBILIDADE"."MV_SOLICITACAO";
RETURN NULL;
END
$$ LANGUAGE plpgsql;
CREATE TRIGGER solicitacao_atualizar_mv_solicitacao
AFTER INSERT OR DELETE OR UPDATE ON "MOBILIDADE"."GESTAOPROJETOS_SOLICITACAO"
FOR EACH STATEMENT
EXECUTE PROCEDURE "MOBILIDADE".atualizar_mv_solicitacao();
When I run an INSERT with the trigger enabled, it takes about 3 seconds to finish, while with the trigger disabled it takes only 0.07 seconds.
INSERT INTO "MOBILIDADE"."GESTAOPROJETOS_SOLICITACAO" (documento_tipo,documento_numero,documento_sigla,documento_ano,requerente,solicitacao,data,data_recebimento_semob,categorias,geom,endereco_regiao,endereco_bairro,endereco_logradouro,anexo,created_by,created_at,acao) VALUES('Indicação',12345,'TESTE',2022,'TESTE','TESTE','2022-09-15','2022-09-15','{"Barreiras físicas" , "Pavimentação"}',ST_Transform(ST_SetSRID(ST_MakePoint(-45.888675631640105,-23.236909838714148),4326),4326),'Sul','Bosque dos Eucaliptos','Rua Lima Duarte',false,1,NOW(),1) RETURNING id
This is the wrong way to go about it. If refreshing the materialized view takes long and you modify the table often, then you cannot refresh the materialized view on every data change. Even if the refresh runs asynchronously (which is not possible with a trigger), it will still put a lot of load on your system.
Consider alternative solutions:
Refresh the materialized view every five minutes or so.
Don't use a materialized view, but a regular table that contains the aggregates and update that table from a trigger whenever the underlying data change. That will only work if the "materialized view" is simple enough.
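For the second alternative, here is a minimal sketch with a hypothetical per-category count standing in for the real view logic (all names are placeholders; assumes PostgreSQL 9.5+ for ON CONFLICT). Each row change then costs two cheap index operations instead of a multi-second refresh:
CREATE TABLE solicitacao_counts (
    categoria text PRIMARY KEY,
    total     bigint NOT NULL
);

CREATE OR REPLACE FUNCTION bump_counts() RETURNS trigger AS $$
BEGIN
    -- An UPDATE runs both branches: +1 for the new value, -1 for the old one.
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        INSERT INTO solicitacao_counts (categoria, total)
        VALUES (NEW.categoria, 1)
        ON CONFLICT (categoria) DO UPDATE
            SET total = solicitacao_counts.total + 1;
    END IF;
    IF TG_OP IN ('DELETE', 'UPDATE') THEN
        UPDATE solicitacao_counts
           SET total = total - 1
         WHERE categoria = OLD.categoria;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER keep_counts
    AFTER INSERT OR DELETE OR UPDATE ON source_table
    FOR EACH ROW EXECUTE PROCEDURE bump_counts();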
Given....
CREATE PROCEDURE NotaRandomUpdate (@table1_id INT, @new_value INT)
AS
BEGIN
    BEGIN TRANSACTION
    UPDATE Table1
       SET field1 = @new_value
     WHERE id = @table1_id

    INSERT INTO Table2 VALUES (@new_value)
    COMMIT TRANSACTION
END
In the above (very) simplified situation, if there are two separate triggers, one on each of Table1 and Table2, which trigger would execute first?
I'm looking to take the combined result of the full transaction (including information not referenced in the transaction itself) and save it elsewhere, so I need to pull data out of the join of Table1 => Table2.
If Table1-Trigger executes 1st, then I'm faced with not having data needed (at that instance) from Table2.
If Table2-Trigger executes 1st, then I'm faced with not having data needed (at that instance) from Table1.
I presume the triggers only execute during/after the commit phase... or are they executed immediately when the Table1 UPDATE and Table2 INSERT statements run, with the overall database updates thus wrapped up inside the full transaction?
This is due to happen in a DB2 database.
Is a solution possible?
Or am I faced with running a "some time later" activity (like pre-EOD) that executes a query joining the two tables after all relevant updates for that day have been completed, provided of course that Table1 and Table2 each have timestamp columns that can be tracked?
Any relevant triggers for Table1 will fire before any relevant triggers on Table2, assuming no rollback.
Db2 triggers execute as part of the Insert, Update, or Delete statement itself, whether per-row or per-statement. Hence the statements inside the trigger body will only run (assuming the trigger is valid) during execution of the triggering statement. Commit will not invoke trigger logic.
Each Insert/Update/Delete statement that executes will run any relevant valid triggers during execution of that statement, before execution of the next statement begins.
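That ordering is what helps you here: by the time a trigger on Table2 fires, the Table1 UPDATE earlier in the procedure has already executed within the same (uncommitted) transaction, so that trigger can join both tables. A DB2-style sketch, with a hypothetical target table and join condition:
-- Fires while the INSERT into Table2 executes; Table1 was already
-- updated earlier in the same transaction, so the join sees both changes.
CREATE TRIGGER table2_after_insert
    AFTER INSERT ON Table2
    REFERENCING NEW AS n
    FOR EACH ROW
BEGIN ATOMIC
    INSERT INTO CombinedResults (table1_field, table2_value)
        SELECT t1.field1, n.value
        FROM Table1 t1
        WHERE t1.field1 = n.value;   -- hypothetical join condition
END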
I have a function in PG which has an if-then block, something like this:
if <condition> then
    ...
    update mytab set col1 = true;
    <some long-running operation>
end if;
When I call the function, I expect that once mytab is updated and the long operation is still running, other sessions will see that mytab.col1 is true (I want to use it as a flag).
But what happens is that mytab is updated, yet the change is visible only to this session; other sessions still see mytab.col1 as false.
Only after the long operation (and the whole if-then block) finishes do other sessions see that mytab.col1 is true.
How can I make this update immediately visible? (In Oracle, for instance, a commit would do the trick.)
You cannot do that, since the complete body of a function runs within one transaction.
The only workaround is to open a second database connection with dblink and execute (and commit) your update there.
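A minimal sketch of that workaround from inside the plpgsql function, assuming the dblink extension is installed (connection name, connection string, and statement are placeholders). The statement runs in a separate session that commits immediately, so other sessions see the flag right away:
PERFORM dblink_connect('flag_conn', 'dbname=mydb');
-- Executes in the second session and autocommits there:
PERFORM dblink_exec('flag_conn', 'UPDATE mytab SET col1 = true');
PERFORM dblink_disconnect('flag_conn');
-- ... the long-running operation continues here ...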
I have a column called updated which is intended to show the last time that row was altered.
My first attempt at this was to create a trigger that changed the updated column to the value returned by now(). But because this trigger happens on an update event, it caused an infinite loop (updating the updated column causes the trigger to fire).
I also tried implementing a rule to do this with similar effects.
I can't imagine that this is something I am forced to do in the application layer whenever I call an update function. So how can I update that row's updated column without causing infinite loops?
Use a trigger like this:
CREATE OR REPLACE FUNCTION set_updated() RETURNS trigger AS $$
BEGIN
    -- Only touch the timestamp when the caller didn't change it explicitly;
    -- IS NOT DISTINCT FROM also handles NULLs, unlike plain equality.
    IF NEW.updated IS NOT DISTINCT FROM OLD.updated THEN
        NEW.updated := NOW();
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER table_update
    BEFORE UPDATE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE set_updated();
This way you aren't in a loop--you only update the value once (before the UPDATE is executed), and you also don't clobber the value, if for some reason you want to set updated explicitly (as in when importing old data from backup, for instance).
I have solved this in other applications by checking the field I am changing: if nothing has changed, I skip the update. If you can check the updated column and skip the update when it is within the last N seconds, that should stop the infinite loop. Pick a value for N so that you know the update timestamp is always within N seconds.
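A sketch of that guard with an AFTER trigger that writes the timestamp itself (N = 5; table, column, and function names are hypothetical). The BEFORE-trigger version above avoids the extra UPDATE entirely, so this is mainly useful where a BEFORE trigger is not an option:
CREATE OR REPLACE FUNCTION touch_updated() RETURNS trigger AS $$
BEGIN
    -- Skip when the timestamp is already fresh; the nested invocation
    -- caused by the UPDATE below sees the new value and stops recursing.
    IF NEW.updated IS NULL
       OR NEW.updated < clock_timestamp() - interval '5 seconds' THEN
        UPDATE mytable SET updated = clock_timestamp() WHERE id = NEW.id;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_touch
    AFTER UPDATE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE touch_updated();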
I’ve got a table that stores events.
However, the way the system has been designed, events are usually recorded in batches. What I mean by this is that a set of events (10 or so) are usually recorded together, rather than just single events.
We can assume that: there is a column called “batch_no” in the events table, so we know which events belong to which batch no.
The question: what I am trying to do is execute a trigger function every time a batch of events has finished loading into the table. The problem is, I can't think of how the trigger could know this, rather than just calling the function for every row.
The solution I’ve been thinking about involves something like: (a) define a trigger for each row; (b) as a condition, calculate count(*) FROM events WHERE NEW.batchNO = events.batchNO, wait some time, calculate the same count again, and if the two are equal, conclude the batch has finished loading and call the function.
Clearly, the solution above sounds complicated. Is there a better or simpler solution? (Or if not, any help with how I could implement what I described?)
You can pass parameters to a trigger function but only in the CREATE TRIGGER statement, which helps to use the same trigger function for multiple triggers, but does not help with your situation.
You need the trigger to fire on a condition that is not known at the time of trigger creation. I see basically three possibilities:
1) Statement-level trigger
Using the FOR EACH STATEMENT clause. The manual:
A trigger that is marked FOR EACH ROW is called once for every row that the operation modifies. For example, a DELETE that affects 10 rows will cause any ON DELETE triggers on the target relation to be called 10 separate times, once for each deleted row. In contrast, a trigger that is marked FOR EACH STATEMENT only executes once for any given operation, regardless of how many rows it modifies (in particular, an operation that modifies zero rows will still result in the execution of any applicable FOR EACH STATEMENT triggers).
Only applicable if you insert all your batches with a single INSERT command (multiple rows), but not more than one batch at a time.
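If that fits, the sketch is simple (table and column names assumed from the question). Note that a plain statement-level trigger does not see the inserted rows, so the trigger body has to look the batch up itself; transition tables only arrived in later versions:
CREATE TRIGGER batch_loaded
    AFTER INSERT ON events
    FOR EACH STATEMENT
    EXECUTE PROCEDURE trg_events_batch_loaded();

-- One multi-row INSERT per batch, so the trigger runs exactly once:
INSERT INTO events (batch_no, payload) VALUES
    (42, 'first event'),
    (42, 'second event'),
    (42, 'third event');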
2) WHEN condition for a row-level trigger.
Using the FOR EACH ROW clause plus a WHEN condition. You need version 9.0+ for that.
If you can tell from a single inserted row, which is the last one of a batch then your trigger definition could look like this:
CREATE TRIGGER insert_after_batch
AFTER INSERT ON tbl
FOR EACH ROW
WHEN (NEW.batch_last) -- any expression identifying the last
EXECUTE PROCEDURE trg_tbl_insert_after_batch();
This assumes a column batch_last boolean in your table, where you flag the last row of a batch. Any expression based on column values is possible.
This way the trigger only fires for the last row of each batch. Make it an AFTER trigger, so all rows are already visible in the table and you can query them together for whatever you want to do in your trigger. Probably the way to go.
3) Conditional code inside your trigger
That's basically the fallback for versions before 9.0 without the WHEN clause. Do the same check inside the trigger before executing the payload. More expensive than 2).
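A sketch of that fallback, reusing the batch_last flag from option 2 (the payload is a placeholder):
CREATE OR REPLACE FUNCTION trg_tbl_insert_after_batch() RETURNS trigger AS $$
BEGIN
    -- Same test as the WHEN clause in option 2, just done in code:
    IF NEW.batch_last THEN
        -- ... payload: the batch is complete, process it here ...
        RAISE NOTICE 'batch finished loading';  -- placeholder action
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_after_batch
    AFTER INSERT ON tbl
    FOR EACH ROW EXECUTE PROCEDURE trg_tbl_insert_after_batch();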