Execute deferred trigger only once per row in PostgreSQL

I have a deferred AFTER UPDATE trigger on a table, set to fire when a certain column is updated. It's an integer type I'm using as a counter.
I'm not 100% certain but it looks like if I increment that particular column 100 times during a transaction, the trigger is queued up and executed 100 times at the end of the transaction.
I would like the trigger to only be scheduled once per row no matter how many times I've incremented that column.
Can I do that somehow?
Alternatively, if fired triggers must queue up even when they are duplicates, can I clear this queue during the first run of the trigger?
Version of Postgres is 9.1. Here's what I got:
CREATE CONSTRAINT TRIGGER counter_change
AFTER UPDATE OF "Counter" ON "table"
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE counter_change();
CREATE OR REPLACE FUNCTION counter_change()
RETURNS trigger
LANGUAGE plpgsql
AS $$
DECLARE
BEGIN
PERFORM some_expensive_procedure(NEW."id");
RETURN NEW;
END;$$;

This is a tricky problem. But it can be done with per-column triggers and conditional trigger execution introduced in PostgreSQL 9.0.
You need an "updated" flag per row for this solution. Use a boolean column in the same table for simplicity. But it could be in another table or even a temporary table per transaction.
The expensive payload is executed once per row where the counter is updated (once or multiple times).
This should also perform well, because ...
... it avoids multiple calls of triggers at the root (scales well)
... does not change additional rows (minimize table bloat)
... does not need expensive exception handling.
Consider the following
Demo
Tested in PostgreSQL 9.1 with a separate schema x as test environment.
Tables and dummy rows
-- DROP SCHEMA x;
CREATE SCHEMA x;
CREATE TABLE x.tbl (
id int
,counter int
,trig_exec_count integer -- for monitoring payload execution.
,updated bool);
Insert two rows to demonstrate it works with multiple rows:
INSERT INTO x.tbl VALUES
(1, 0, 0, NULL)
,(2, 0, 0, NULL);
Trigger functions and Triggers
1.) Execute expensive payload
CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_1()
RETURNS trigger AS
$BODY$
BEGIN
-- PERFORM some_expensive_procedure(NEW.id);
-- Update trig_exec_count to count execution of expensive payload.
-- Could be in another table, for simplicity, I use the same:
UPDATE x.tbl t
SET trig_exec_count = trig_exec_count + 1
WHERE t.id = NEW.id;
RETURN NULL; -- RETURN value of AFTER trigger is ignored anyway
END;
$BODY$ LANGUAGE plpgsql;
2.) Flag row as updated.
CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_2()
RETURNS trigger AS
$BODY$
BEGIN
UPDATE x.tbl
SET updated = TRUE
WHERE id = NEW.id;
RETURN NULL;
END;
$BODY$ LANGUAGE plpgsql;
3.) Reset "updated" flag.
CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_3()
RETURNS trigger AS
$BODY$
BEGIN
UPDATE x.tbl
SET updated = NULL
WHERE id = NEW.id;
RETURN NULL;
END;
$BODY$ LANGUAGE plpgsql;
Trigger names are relevant! Triggers defined for the same event are executed in alphabetical order of their names.
1.) Payload, only if not "updated" yet:
CREATE CONSTRAINT TRIGGER upaft_counter_change_1
AFTER UPDATE OF counter ON x.tbl
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
WHEN (NEW.updated IS NULL)
EXECUTE PROCEDURE x.trg_upaft_counter_change_1();
2.) Flag row as updated, only if not "updated" yet:
CREATE TRIGGER upaft_counter_change_2 -- not deferred!
AFTER UPDATE OF counter ON x.tbl
FOR EACH ROW
WHEN (NEW.updated IS NULL)
EXECUTE PROCEDURE x.trg_upaft_counter_change_2();
3.) Reset Flag. No endless loop because of trigger condition.
CREATE CONSTRAINT TRIGGER upaft_counter_change_3
AFTER UPDATE OF updated ON x.tbl
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
WHEN (NEW.updated)
EXECUTE PROCEDURE x.trg_upaft_counter_change_3();
Test
Run UPDATE & SELECT separately to see the deferred effect. If executed together (in one transaction), the SELECT will show the new tbl.counter but the old tbl.trig_exec_count.
UPDATE x.tbl SET counter = counter + 1;
SELECT * FROM x.tbl;
Now, update the counter multiple times (in one transaction). The payload will only be executed once. Voilà!
BEGIN; -- make the single transaction explicit
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
COMMIT; -- deferred triggers fire here
SELECT * FROM x.tbl;

I don't know of a way to collapse trigger execution to once per (updated) row per transaction, but you can emulate this with a TEMPORARY ON COMMIT DROP table which tracks those modified rows and performs your expensive operation only once per row per tx:
CREATE OR REPLACE FUNCTION counter_change() RETURNS TRIGGER
AS $$
BEGIN
-- If we're the first invocation of this trigger in this tx,
-- make our scratch table. Create unique index separately to
-- avoid NOTICEs without fiddling with log_min_messages
BEGIN
CREATE LOCAL TEMPORARY TABLE tbl_counter_tx_once
("id" AS_APPROPRIATE NOT NULL)
ON COMMIT DROP;
CREATE UNIQUE INDEX ON tbl_counter_tx_once ("id");
EXCEPTION WHEN duplicate_table THEN
NULL;
END;
-- If we're the first invocation in this tx *for this row*,
-- then do our expensive operation.
BEGIN
INSERT INTO tbl_counter_tx_once ("id") VALUES (NEW."id");
PERFORM SOME_EXPENSIVE_OPERATION_HERE(NEW."id");
EXCEPTION WHEN unique_violation THEN
NULL;
END;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
There's of course a risk of name collision with that temporary table, so choose judiciously.

Related

org.postgresql.util.PSQLException: ERROR: column "row_count" does not exist

I tried to execute a Trigger function which checks the value of ROW_COUNT and performs the necessary operation. The trigger function is being called by a trigger.
The trigger function definition is:
create function MB_MDDeleteDefinitionPGSQL() returns trigger language plpgsql as $$
declare integer_var bigint ;
begin
GET DIAGNOSTICS integer_var = ROW_COUNT;
if ROW_COUNT > 0 then
delete from MB_MDFieldProps where propId = OLD.commonPropsId;
delete from MB_MDCustomFieldProps where customId = old.customId and old.reusableId is null;
end if;
end $$;
And the trigger which calls the above function is
create trigger MB_MDDeleteDefinition before delete on MB_MDDefinition for each row
execute procedure MB_MDDeleteDefinitionPGSQL();
You just don't need that ROW_COUNT logic. The trigger fires for each row that is deleted; if no row is deleted, then it does not fire at all.
Note, however, that it would be much simpler (and more efficient) to set foreign key constraints on the dependent tables, which would simply avoid the need for a trigger.
For example:
create table mb_mdfieldprops (
...
propId int
references mb_mddefinition(commonPropsId)
on delete cascade
);
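If you keep the trigger instead, the function from the question with the ROW_COUNT check removed might look like the sketch below. It reuses the names from the question. The `return OLD` line is an addition not present in the original function: a BEFORE DELETE row trigger has to return a non-null row for the delete to go ahead.

```sql
-- Sketch only: same function as in the question, minus the ROW_COUNT logic.
create or replace function MB_MDDeleteDefinitionPGSQL() returns trigger language plpgsql as $$
begin
    -- The trigger fires once per deleted row, so no row-count check is needed.
    delete from MB_MDFieldProps where propId = OLD.commonPropsId;
    delete from MB_MDCustomFieldProps where customId = OLD.customId and OLD.reusableId is null;
    -- Added: returning OLD lets the DELETE on MB_MDDefinition proceed.
    return OLD;
end $$;
```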

How to use the same trigger function for insert/update/delete triggers avoiding the problem with new and old objects

I am looking for an elegant solution to this situation:
I have created a trigger function that updates the table supply with the sum of some detail rows, whenever a row is inserted or updated on warehouse_supplies.
PostgreSQL insert or update syntax allowed me to share the same function sync_supply_stock() for the insert and update conditions.
However, when I try to wire the after delete condition to the function it cannot be reused (although it is logically valid), for the returning object must be old instead of new.
-- The function I want to use for the 3 conditions (insert, update, delete)
create or replace function sync_supply_stock ()
returns trigger
as $$
begin
-- update the supply whose stock just changed in warehouse_supply with
-- the sum its stocks on all the warehouses.
update supply
set stock = (select sum(stock) from warehouse_supplies where supply_id = new.supply_id)
where supply_id = new.supply_id;
return new;
end;
$$ language plpgsql;
-- The (probably) unnecessary copy of the previous function, this time returning old.
create or replace function sync_supply_stock2 ()
returns trigger
as $$
begin
-- update the supply whose stock just changed in warehouse_supply with
-- the sum its stocks on all the warehouses.
update supply
set stock = (select sum(stock) from warehouse_supplies where supply_id = old.supply_id)
where supply_id = old.supply_id;
return old;
end;
$$ language plpgsql;
-- The after insert/update trigger
create trigger on_warehouse_supplies__after_upsert after insert or update
on warehouse_supplies for each row
execute procedure sync_supply_stock ();
-- The after delete trigger
create trigger on_warehouse_supplies__after_delete after delete
on warehouse_supplies for each row
execute procedure sync_supply_stock2 ();
Am I missing something, or is there any way to avoid duplicating sync_supply_stock() as sync_supply_stock2()?
EDIT
For the benefit of future readers, following @bergi's answer and discussion, this is a possible factored-out solution:
create or replace function sync_supply_stock ()
returns trigger
as $$
declare
_supply_id int;
begin
-- read the supply_id column from `new` on insert/update conditions and from `old` on delete conditions
_supply_id = coalesce(new.supply_id, old.supply_id);
-- update the supply whose stock just changed in of_warehouse_supply with
-- the sum its stocks on all the warehouses.
update of_supply
set stock = (select sum(stock) from of_warehouse_supplies where supply_id = _supply_id)
where supply_id = _supply_id;
-- returns `new` on insert/update conditions and `old` on delete conditions
return coalesce(new, old);
end;
$$ language plpgsql;
create trigger on_warehouse_supplies__after_upsert after insert or update
on of_warehouse_supplies for each row
execute procedure sync_supply_stock ();
create trigger on_warehouse_supplies__after_delete after delete
on of_warehouse_supplies for each row
execute procedure sync_supply_stock ();
"for the returning object must be old instead of new."
No. The return value is only relevant for BEFORE ROW or INSTEAD OF triggers. From the docs: "The return value of a row-level trigger fired AFTER or a statement-level trigger fired BEFORE or AFTER is always ignored; it might as well be null".
So you can just make your sync_supply_stock trigger function RETURN NULL and it can be used on all operations.
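As a rough sketch of that suggestion, the shared function could pick the row by operation and return NULL. It uses the table and column names from the question. Two things here are assumptions rather than part of the original: the `int` type of `supply_id` (taken from the asker's edit) and the `coalesce(..., 0)` around the sum, added so that a supply with no remaining warehouse rows gets a stock of 0 instead of NULL. Branching on `TG_OP` avoids touching `new` in a DELETE trigger, which older PostgreSQL versions reject as an unassigned record.

```sql
-- Sketch only: one function for the insert, update and delete triggers.
create or replace function sync_supply_stock ()
returns trigger
as $$
declare
    _supply_id int;  -- assumed type, as in the asker's edit
begin
    -- Read supply_id from old on delete, from new otherwise.
    if tg_op = 'DELETE' then
        _supply_id := old.supply_id;
    else
        _supply_id := new.supply_id;
    end if;
    update supply
    set stock = (select coalesce(sum(stock), 0)  -- coalesce is an addition
                 from warehouse_supplies
                 where supply_id = _supply_id)
    where supply_id = _supply_id;
    -- The return value of an AFTER row trigger is ignored.
    return null;
end;
$$ language plpgsql;
```

Both triggers from the question can then point at this one function.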

How to implement a purge of soft-deleted records in Postgres?

In my Postgres 9.4 database, I have the following trigger / function, which implements a "soft-delete" functionality:
ALTER TABLE my_schema.my_table
ADD COLUMN delete_ind integer;
CREATE OR REPLACE FUNCTION trigger_mytable_soft_delete()
RETURNS trigger AS $$
DECLARE
command text := ' SET delete_ind = 1 WHERE uuid_col = $1';
BEGIN
EXECUTE 'UPDATE "my_schema"."my_table"' || TG_TABLE_NAME || command USING OLD.uuid_col;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER my_table_soft_delete_trigger
BEFORE DELETE ON "my_schema"."my_table"
FOR EACH ROW EXECUTE PROCEDURE trigger_mytable_soft_delete();
The code above gives me the "soft-delete" functionality, but, it also prevents me from actually deleting / purging those rows, which are already marked as deleted.
The new desired behavior, is to have this delete function to examine the value of the delete_ind field, and, if it is already set to 1, to actually purge that row for good.
What is the correct conditional syntax, which would either set the value of delete_ind, or actually delete the row in question, based on the current value of the delete_ind column?
It can be done with a relatively minor modification to your function:
CREATE OR REPLACE FUNCTION trigger_mytable_soft_delete()
RETURNS trigger AS
$$
BEGIN
if OLD.delete_ind = 1 then
/* The delete proceeds; you just need to *do nothing*
except return the OLD row as it was */
RETURN OLD ;
else
/* instead of deleting, set a flag */
UPDATE my_schema.my_table
SET delete_ind = 1
WHERE uuid_col = old.uuid_col ;
/* Returning NULL skips the delete for this row.
It also prevents subsequent triggers from firing, and the row is
not counted in the rows-affected count. If more triggers need
to be processed, make sure this is the last in the chain.
*/
RETURN NULL ;
end if ;
END;
$$
LANGUAGE plpgsql;
(If your function is used by only one table, you can hard-code the table name and don't need dynamic SQL.)
Side note: If the column delete_ind will only be used as a flag, its meaning would be best conveyed by declaring it as boolean not null instead of integer.
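With the function above in place, usage might look like the following sketch. The UUID literal is a made-up placeholder, and the example assumes delete_ind stays an integer as in the question.

```sql
-- First delete: the trigger sets delete_ind = 1 and cancels the delete.
DELETE FROM my_schema.my_table
WHERE uuid_col = '00000000-0000-0000-0000-000000000001';  -- hypothetical value

-- Second delete of the same row: delete_ind is already 1, so the row is purged.
DELETE FROM my_schema.my_table
WHERE uuid_col = '00000000-0000-0000-0000-000000000001';

-- Purge every row that is already soft-deleted in one statement.
DELETE FROM my_schema.my_table
WHERE delete_ind = 1;
```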

PSQL Add value from row to another value in the same row using triggers

I have a test table with three columns (file, qty, qty_total). I will insert multiple rows like this, for example: insert into test_table (file, qty) VALUES ('A', 5);. What I want is for a trigger, on commit, to take the value from qty and add it to qty_total, because qty will later be updated, as in this example: update test_table set qty = 10 where file = 'A'; so that qty_total is then 15. Thanks
Managed to solve this myself. I created a trigger function:
CREATE FUNCTION public.qty_total()
RETURNS trigger
LANGUAGE 'plpgsql'
COST 100.0
VOLATILE NOT LEAKPROOF
AS $BODY$
BEGIN
IF TG_OP = 'UPDATE' THEN
NEW."total" := (OLD.total + NEW.col2);
RETURN NEW;
ELSE
NEW."total" := NEW.col2;
RETURN NEW;
END IF;
END;
$BODY$;
ALTER FUNCTION public.qty_total()
OWNER TO postgres;
This is called by a trigger:
CREATE TRIGGER qty_trigger
BEFORE INSERT OR UPDATE
ON public.test
FOR EACH ROW
EXECUTE PROCEDURE qty_total();
Now when I insert a new code and value, the value is copied to the total; when it is updated, the new value is added to the total and I have my new qty_total. This may not have the best error catching in it, but since I am passing the data from PHP, I am happy to make sure the errors are caught and removed.

postgres count from table efficient way

In my application we are using PostgreSQL; the summary table now has one million records.
When I run the following query, it takes 80,927 ms:
SELECT COUNT(*) AS count
FROM summary_views
GROUP BY question_id,category_type_id
Is there any efficient way to do this?
COUNT(*) in PostgreSQL tends to be slow. It's a consequence of MVCC: row visibility has to be checked for the current transaction, so the rows must actually be scanned. One workaround is a row-counting trigger with a helper table:
create table table_count(
table_count_id text primary key,
rows int default 0
);
CREATE OR REPLACE FUNCTION table_count_update()
RETURNS trigger AS
$BODY$
begin
if tg_op = 'INSERT' then
update table_count set rows = rows + 1
where table_count_id = TG_TABLE_NAME;
elsif tg_op = 'DELETE' then
update table_count set rows = rows - 1
where table_count_id = TG_TABLE_NAME;
end if;
return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
The next step is to add a trigger declaration for each table you'd like to use it with. For example, for table tab_name:
begin;
insert into table_count values
('tab_name',(select count(*) from tab_name));
create trigger tab_name_table_count after insert or delete
on tab_name for each row execute procedure table_count_update();
commit;
It is important to run this in a transaction block to keep the actual count and the helper table in sync in case of a delete or insert between the initial count and the trigger creation. The transaction guarantees this. From now on, to get the current count instantly, just run:
select rows from table_count where table_count_id = 'tab_name';
Edit: For your GROUP BY clause, you'll need a more sophisticated trigger function and count table.
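One possible shape for such a per-group counter is sketched below. Everything named summary_views_count is hypothetical, and the `int` types and non-null group columns are assumptions, since the question does not show the table definition. The sketch ignores UPDATEs that move a row between groups, and the update-then-insert step can hit a unique violation if two sessions insert the first row of a new group at the same time.

```sql
-- Hypothetical helper table: one counter row per (question_id, category_type_id).
create table summary_views_count (
    question_id      int,     -- assumed type
    category_type_id int,     -- assumed type
    rows             bigint not null default 0,
    primary key (question_id, category_type_id)
);

create or replace function summary_views_count_update()
returns trigger as
$BODY$
begin
    if tg_op = 'INSERT' then
        update summary_views_count set rows = rows + 1
        where question_id = new.question_id
          and category_type_id = new.category_type_id;
        -- First row of a new group: create its counter.
        if not found then
            insert into summary_views_count (question_id, category_type_id, rows)
            values (new.question_id, new.category_type_id, 1);
        end if;
    elsif tg_op = 'DELETE' then
        update summary_views_count set rows = rows - 1
        where question_id = old.question_id
          and category_type_id = old.category_type_id;
    end if;
    return null;
end;
$BODY$
language plpgsql;

begin;
-- Seed the counters from the existing data, then attach the trigger.
insert into summary_views_count (question_id, category_type_id, rows)
select question_id, category_type_id, count(*)
from summary_views
group by question_id, category_type_id;
create trigger summary_views_count_trg after insert or delete
on summary_views for each row execute procedure summary_views_count_update();
commit;
```

The per-group counts can then be read with a plain select from summary_views_count instead of the GROUP BY query.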