Get values from varying columns in a generic trigger - postgresql

I am new to PostgreSQL and found a trigger which serves my purpose completely, except for one little thing. The trigger is quite generic and runs across different tables and logs different field changes. I found it here.
What I now need to do is test for a specific field whose name changes with the table the trigger fires on. I thought of using substr, as all the columns have the same name format, e.g. XXX_cust_no, but the XXX prefix can be 2 or 4 characters. I need to log the value in the XXX_cust_no field with every record that is written to the history / audit table. Using a bunch of IF / ELSE statements to accomplish this is not something I would like to do.
The trigger as it now works logs the table_name, column_name, old_value and new_value. I however need to log the XXX_cust_no of the record that was changed as well.

Basically you need dynamic SQL for dynamic column names. format() helps to safely build the DML command. Pass values from NEW and OLD with the USING clause.
Given these tables:
CREATE TABLE tbl (
t_id serial PRIMARY KEY
,abc_cust_no text
);
CREATE TABLE log (
id int
,table_name text
,column_name text
,old_value text
,new_value text
);
It could work like this:
CREATE OR REPLACE FUNCTION trg_demo()
  RETURNS trigger AS
$func$
BEGIN
   EXECUTE format('
      INSERT INTO log(id, table_name, column_name, old_value, new_value)
      SELECT ($2).t_id
           , $3
           , $4
           , ($1).%1$I
           , ($2).%1$I', TG_ARGV[0])
   USING OLD, NEW, TG_RELNAME, TG_ARGV[0];

   RETURN NEW;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER demo
BEFORE UPDATE ON tbl
FOR EACH ROW EXECUTE PROCEDURE trg_demo('abc_cust_no'); -- col name here.
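A quick smoke test (illustrative values; the comment shows the log row the trigger above should produce):
INSERT INTO tbl (abc_cust_no) VALUES ('A-001');
UPDATE tbl SET abc_cust_no = 'A-002' WHERE t_id = 1;
TABLE log;  -- 1 | tbl | abc_cust_no | A-001 | A-002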
SQL Fiddle.
Related answer on dba.SE:
How to access NEW or OLD field given only the field's name?
List of special variables visible in plpgsql trigger functions in the manual.


Declare and return value for DELETE and INSERT

I am trying to remove duplicated data from some of our databases based upon unique id's. All deleted data should be stored in a separate table for auditing purposes. Since it concerns quite a few databases and different schemas and tables, I wanted to start using variables to reduce the chance of errors and the amount of work it will take me.
This is the best example query I could think off, but it doesn't work:
do $$
declare #source_schema varchar := 'my_source_schema';
declare #source_table varchar := 'my_source_table';
declare #target_table varchar := 'my_target_schema' || source_table || '_duplicates'; --target schema and appendix are always the same, source_table is a variable input.
declare #unique_keys varchar := ('1', '2', '3')
begin
select into #target_table
from #source_schema.#source_table
where id in (#unique_keys);
delete from #source_schema.#source_table where export_id in (#unique_keys);
end ;
$$;
The query syntax works with hard-coded values.
Most of the time my variables are perceived as columns or not recognized at all. :(
You need to create and then call a plpgsql procedure with input parameters:
CREATE OR REPLACE PROCEDURE duplicates_suppress
  (my_target_schema text, my_source_schema text, my_source_table text, unique_keys text[])
LANGUAGE plpgsql AS
$$
BEGIN
   EXECUTE format(
      'WITH list AS (INSERT INTO %1$I.%3$I_duplicates
                     SELECT * FROM %2$I.%3$I
                     WHERE array[id] <@ %4$L::integer[]
                     RETURNING id)
       DELETE FROM %2$I.%3$I AS t USING list AS l WHERE t.id = l.id'
    , my_target_schema, my_source_schema, my_source_table, unique_keys::text);
END;
$$;
The procedure duplicates_suppress inserts into my_target_schema.my_source_table || '_duplicates' the rows from my_source_schema.my_source_table whose id is in the array unique_keys, and then deletes those rows from the table my_source_schema.my_source_table.
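For reference, a call using the placeholder names from the question could look like this:
CALL duplicates_suppress('my_target_schema', 'my_source_schema', 'my_source_table', ARRAY['1','2','3']);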
See the test result in dbfiddle.
As has been commented, you need some kind of dynamic SQL, in a FUNCTION, a PROCEDURE or a DO statement, to do it on the server.
You should be comfortable with PL/pgSQL. Dynamic SQL is no beginners' toy.
Example with a PROCEDURE, like Edouard already suggested. You'll need a FUNCTION instead if you want to wrap it in an outer transaction (like you very well might). See:
When to use stored procedure / user-defined function?
CREATE OR REPLACE PROCEDURE pg_temp.f_archive_dupes(_source_schema text, _source_table text, _unique_keys int[], OUT _row_count int)
  LANGUAGE plpgsql AS
$proc$
-- target schema and appendix are always the same, source_table is a variable input
DECLARE
   _target_schema CONSTANT text := 's2';  -- hardcoded
   _target_table  text := _source_table || '_duplicates';
   _sql text := format(
      'WITH del AS (
          DELETE FROM %I.%I
          WHERE  id = ANY($1)
          RETURNING *
          )
       INSERT INTO %I.%I TABLE del'
    , _source_schema, _source_table
    , _target_schema, _target_table);
BEGIN
   RAISE NOTICE '%', _sql;           -- debug
   EXECUTE _sql USING _unique_keys;  -- execute
   GET DIAGNOSTICS _row_count = ROW_COUNT;
END
$proc$;
Call:
CALL pg_temp.f_archive_dupes('s1', 't1', '{1, 3}', 0);
db<>fiddle here
I made the procedure temporary, since I assume you don't need to keep it permanently. Create it once per database. See:
How to create a temporary function in PostgreSQL?
Passed schema and table names are case-sensitive strings! (Unlike unquoted identifiers in plain SQL.) Either way, be wary of SQL-injection when concatenating SQL dynamically. See:
Are PostgreSQL column names case-sensitive?
Table name as a PostgreSQL function parameter
Made _unique_keys type int[] (array of integer) since your sample values look like integers. Use the actual data type of your id columns!
The variable _sql holds the query string, so it can easily be inspected before actually executing; RAISE NOTICE '%', _sql; serves that purpose. I suggest commenting out the EXECUTE line until you are sure.
I made the PROCEDURE return the number of processed rows. You didn't ask for that, but it's typically convenient. At hardly any cost. See:
Dynamic SQL (EXECUTE) as condition for IF statement
Best way to get result count before LIMIT was applied
Last, but not least, use DELETE ... RETURNING * in a data-modifying CTE. Since it has to find the rows only once, it comes at about half the cost of a separate SELECT and DELETE. And it's perfectly safe: if anything goes wrong, the whole transaction is rolled back anyway.
Two separate commands can also run into concurrency issues or race conditions, which are ruled out this way, as DELETE implicitly locks the rows to delete. Example:
Replicating data between Postgres DBs
Or you can build the statements in a client program like psql, and use \gexec. Example:
Filter column names from existing table for SQL DDL statement
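A minimal sketch of that client-side approach, reusing the s1 / t1 / s2 names from above (psql prints the generated statement, then \gexec executes each row of the result as a command):
SELECT format(
   'WITH del AS (DELETE FROM %I.%I WHERE id = ANY(%L::int[]) RETURNING *)
    INSERT INTO %I.%I TABLE del'
 , 's1', 't1', '{1,3}', 's2', 't1_duplicates');
\gexec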
Based on Erwin's answer, minor optimization...
create or replace procedure pg_temp.p_archive_dump
   (_source_schema text, _source_table text,
    _unique_key int[], _target_schema text)
language plpgsql as
$$
declare
   _row_count bigint;
   _target_table text;
begin
   _target_table := quote_ident(_source_table) || '_' || array_to_string(_unique_key, '_');
   raise notice 'the deleted table records will be stored in %.%', _target_schema, _target_table;
   execute format('create table %I.%I as select * from %I.%I limit 0'
                , _target_schema, _target_table, _source_schema, _source_table);
   execute format('with mm as (delete from %I.%I where id = any (%L) returning *) insert into %I.%I table mm'
                , _source_schema, _source_table, _unique_key, _target_schema, _target_table);
   get diagnostics _row_count = row_count;
   raise notice 'rows affected: %', _row_count;
end
$$;
If your _unique_key array is small, this solution also creates the target table for you. Obviously you need to create the target schema yourself.
If the _unique_key array is long, you can customize the procedure to rename the dumped table properly.
Let's call it:
call pg_temp.p_archive_dump('s1','t1', '{1,2}','s2');
s1 is the source schema, t1 is the source table, {1,2} are the unique keys you want to extract to the new table, and s2 is the target schema.

Get data of multiple inserted rows in one object using trigger in postgres

I am trying to write a trigger which gets data from the attributes table, in which multiple rows are inserted for one actionId at a time, and groups all that data into one object:
Table schema: actionId, key, value
I am firing the trigger on row insertion, so how can I handle this multiple-row insertion and how can I collect all the data?
CREATE TRIGGER attribute_changes
AFTER INSERT
ON attributes
FOR EACH ROW
EXECUTE PROCEDURE log_attribute_changes();
and the function,
CREATE OR REPLACE FUNCTION wflowr222.log_task_extendedattribute_changes()
RETURNS trigger AS
$BODY$
DECLARE
_message json;
_extendedAttributes jsonb;
BEGIN
SELECT json_agg(tmp)
INTO _extendedAttributes
FROM (
-- your subquery goes here, for example:
SELECT attributes.key, attributes.value
FROM attributes
WHERE attributes.actionId=NEW.actionId
) tmp;
_message :=json_build_object('actionId',NEW.actionId,'extendedAttributes',_extendedAttributes);
INSERT INTO wflowr222.irisevents(message)
VALUES(_message );
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
and the data format is:
actionId | key    | value
2        | flag   | true
2        | image  | http:test.com/image
2        | status | New
I tried to do it via an INSERT trigger, but it fires for each row inserted.
Does anyone have any idea about this?
I expect that the problem is that you're using a FOR EACH ROW trigger; what you likely want is a FOR EACH STATEMENT trigger, i.e. one which only fires once for your multi-line INSERT statement. See the description at https://www.postgresql.org/docs/current/sql-createtrigger.html for a more thorough explanation.
AFAICT, you will also need to add REFERENCING NEW TABLE AS NEW in this mode to make the NEW reference available to the trigger function. So your CREATE TRIGGER syntax would need to be:
CREATE TRIGGER attribute_changes
AFTER INSERT
ON attributes
REFERENCING NEW TABLE AS NEW
FOR EACH STATEMENT
EXECUTE PROCEDURE log_attribute_changes();
I've read elsewhere that the required REFERENCING NEW TABLE ... syntax is only supported in PostgreSQL 10 and later.
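For completeness, an untested sketch of the trigger function adapted to statement level, using the question's table and column names. The transition table (aliased NEW above) can be queried like an ordinary table, so the per-row logic becomes one grouped query:
CREATE OR REPLACE FUNCTION log_attribute_changes()
RETURNS trigger AS $$
BEGIN
   INSERT INTO wflowr222.irisevents(message)
   SELECT json_build_object('actionId', n.actionId
                          , 'extendedAttributes'
                          , json_agg(json_build_object('key', n.key, 'value', n.value)))
   FROM NEW AS n           -- the transition table declared in CREATE TRIGGER
   GROUP BY n.actionId;    -- one event object per actionId in this statement
   RETURN NULL;
END;
$$ LANGUAGE plpgsql;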
Considering the version of Postgres you have, and keeping in mind that you can't use a trigger defined FOR EACH STATEMENT for your purpose, the only alternative I see is:
using an AFTER INSERT trigger in order to collect some information about the changes in a utility table
using a UNIX cron job that executes a plpgsql procedure that does the job on the data set
For example:
Your utility table
CREATE TABLE utility (
actionid integer,
createtime timestamp
);
You can define a trigger FOR EACH ROW with a body that does something like this:
INSERT INTO utility VALUES (NEW.actionid, current_timestamp);
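A fuller sketch of that trigger (function and trigger names are made up):
CREATE OR REPLACE FUNCTION note_action() RETURNS trigger AS $$
BEGIN
   INSERT INTO utility VALUES (NEW.actionid, current_timestamp);
   RETURN NULL;  -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER attributes_note_action
AFTER INSERT ON attributes
FOR EACH ROW EXECUTE PROCEDURE note_action();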
And, finally, have a UNIX crontab entry that executes a file or a procedure that does something like this:
SELECT a.* FROM utility u JOIN yourtable a ON a.actionid = u.actionid WHERE u.createtime < current_timestamp;
-- do something here with the records selected above
TRUNCATE TABLE utility;
If you had postgres 9.5 you could have used pg_cron instead of unix cron...

PostgreSQL select INTO function

I am writing a function which will select and SUM the resulting output into a new table, therefore I attempted to use SELECT INTO. However, my standalone code works, yet once placed into a function I get an error stating that the new SELECT INTO table is not a defined variable (perhaps I am missing something). Please see the code below:
CREATE OR REPLACE FUNCTION rev_1.calculate_costing_layer()
RETURNS trigger AS
$BODY$
BEGIN
   -- This will create an intersection between pipelines and sum the cost to a new table for output
   -- May need to create individual cost columns - will also keep infrastructure costing separated
   --DROP TABLE rev_1.costing_layer;
   SELECT inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
   INTO rev_1.costing_layer
   FROM rev_1.inyaninga_phases
   JOIN rev_1.catchment_e_gravity_lines
     ON ST_Intersects(catchment_e_gravity_lines.geom, inyaninga_phases.geom)
   GROUP BY catchment_e_gravity_lines.name, inyaninga_phases.geom;
   RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;
Per the documentation:
CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO is not available in ECPG or PL/pgSQL, because they interpret the INTO clause differently. Furthermore, CREATE TABLE AS offers a superset of the functionality provided by SELECT INTO.
Use CREATE TABLE AS.
Although SELECT ... INTO new_table is valid PostgreSQL, its use has been deprecated (or, at least, "unrecommended"). It doesn't work at all in PL/pgSQL, because there INTO is used to read results into variables.
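For illustration, a minimal sketch of how INTO behaves inside PL/pgSQL (the variable name is made up; the table is the one from the question):
DO $$
DECLARE
   _cnt bigint;
BEGIN
   SELECT count(*) INTO _cnt FROM rev_1.inyaninga_phases;  -- INTO targets the variable, not a new table
   RAISE NOTICE 'row count: %', _cnt;
END
$$;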
If you want to create a new table, you should use instead:
CREATE TABLE rev_1.costing_layer AS
SELECT
   inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
FROM
   rev_1.inyaninga_phases
   JOIN rev_1.catchment_e_gravity_lines
   ON ST_Intersects(catchment_e_gravity_lines.geom, inyaninga_phases.geom)
GROUP BY
   catchment_e_gravity_lines.name, inyaninga_phases.geom;
If the table has already been created and you just want to insert new rows into it, you should use:
INSERT INTO
   rev_1.costing_layer
   (geom, name, gravity_sum)
-- Same select as before
SELECT
   inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
FROM
   rev_1.inyaninga_phases
   JOIN rev_1.catchment_e_gravity_lines
   ON ST_Intersects(catchment_e_gravity_lines.geom, inyaninga_phases.geom)
GROUP BY
   catchment_e_gravity_lines.name, inyaninga_phases.geom;
In a trigger function, you're not very likely to create a new table every time, so, my guess is that you want to do the INSERT and not the CREATE TABLE ... AS.

Capture columns in plpgsql during UPDATE

I am writing a trigger in plpgsql for Postgres 9.1. I need to be able to capture the column names that were issued in the SET clause of an UPDATE so I can record the specified action in an audit table. The examples in the Postgres documentation are simple and inadequate for my needs. I have searched the internet for days and I am unable to find any other examples that try to achieve what I want to do here.
I am on a tight schedule to resolve this soon. I don't know Tcl so pl/Tcl is out of the question for me at this point. pl/Perl may work but I don't know where to start with it. Also I wanted to find a way to accomplish this in pl/pgsql if at all possible for portability and maintenance. If someone can recommend a pl/Perl solution to this I would be grateful.
Here is the table structure of the target table that will be audited:
Note: There are many other columns in the record table but I have not listed them here in order to keep things simple. But the trigger should be able to record changes to any of the columns in the row.
CREATE TABLE record (
record_id integer NOT NULL PRIMARY KEY,
lastname text,
frstname text,
dob date,
created timestamp default NOW(),
created_by integer,
inactive boolean default false
);
create sequence record_record_id_seq;
alter table record alter record_id set default nextval('record_record_id_seq');
Here is my audit table:
CREATE TABLE record_audit (
id integer NOT NULL PRIMARY KEY,
operation char(1) NOT NULL, -- U, I or D
source_column text,
source_id integer,
old_value text,
new_value text,
created_date timestamp default now(),
created_by integer
);
create sequence record_audit_id_seq;
alter table record_audit alter id set default nextval('record_audit_id_seq');
My goal is to record INSERTS and UPDATES to the record table in the record_audit table that will detail not only what the target record_id was (source_id) that was updated and what column was updated (source_column), but also the old_value and the new_value of the column.
I understand that the column values will have to be CAST() to a type of text. I believe I can access the old_value and new_value by accessing NEW and OLD but I am having difficulty figuring out how to obtain the column names used in the SET clause of the UPDATE query. I need the trigger to add a new record to the record_audit table for every column specified in the SET clause. Note, there are not DELETE actions as records are simply UPDATED to inactive = 't' (and thus recorded in the audit table)
Here is my trigger so far (obviously incomplete). Please forgive me, I am learning pl/pgsql as I go.
-- Trigger function for record_audit table
CREATE OR REPLACE FUNCTION audit_record() RETURNS TRIGGER AS $$
DECLARE
insert_table text;
ref_col text; --how to get the referenced column name??
BEGIN
--
-- Create a new row in record_audit depending on the operation (TG_OP)
--
IF (TG_OP = 'INSERT') THEN
-- old_value and new_value are meaningless for INSERTs. Just record the new ID.
INSERT INTO record_audit
(operation,source_id,created_by)
VALUES
('I', NEW.record_id, NEW.created_by);
ELSIF (TG_OP = 'UPDATE') THEN
FOR i in 1 .. TG_ARGV[0] LOOP
ref_col := TG_ARGV[i].column; -- I know .column doesn't exist but what to use?
INSERT INTO record_audit
(operation, source_column, source_id, old_value, new_value, created_by)
VALUES
('U', ref_col, NEW.record_id, OLD.ref_col, NEW.ref_col, NEW.created_by);
END LOOP;
END IF;
RETURN NULL; -- result is ignored anyway since this is an AFTER trigger
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER record_audit_trig
AFTER INSERT OR UPDATE on record
FOR EACH ROW EXECUTE PROCEDURE audit_record();
Thanks for reading this long and winding question!
You cannot get this information, not at the PL level; it is probably possible in C.
A good-enough solution is based on comparing the changed fields in the NEW and OLD records. You can get the list of columns from the system catalogs for the table the trigger is attached to.
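An untested sketch of that approach for the UPDATE branch of the question's function, assuming the hstore extension is installed. Note that it logs columns whose stored values actually changed, which is close to, but not exactly the same as, the columns named in the SET clause:
CREATE EXTENSION hstore;  -- once per database

CREATE OR REPLACE FUNCTION audit_record() RETURNS trigger AS $$
DECLARE
   _changed hstore;
   _col     text;
BEGIN
   IF TG_OP = 'UPDATE' THEN
      _changed := hstore(NEW) - hstore(OLD);  -- pairs in NEW that differ from OLD
      FOR _col IN SELECT skeys(_changed) LOOP
         INSERT INTO record_audit
                (operation, source_column, source_id, old_value, new_value, created_by)
         VALUES ('U', _col, NEW.record_id
               , hstore(OLD) -> _col, hstore(NEW) -> _col, NEW.created_by);
      END LOOP;
   END IF;
   RETURN NULL;  -- AFTER trigger, result is ignored
END;
$$ LANGUAGE plpgsql;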

Getting row values based on array of column names passed to PostgreSQL function

I'm trying to set up full text search in PostgreSQL 9.2. I created a new table to hold the content that I want to search (so that I can search across lots of different types of items), which looks like this:
CREATE TABLE search (
target_id bigint PRIMARY KEY,
target_type text,
fts tsvector
);
CREATE INDEX search_fts ON search USING gin(fts);
Every time a new item gets inserted (or updated) into one of the various tables I want to search across, it should automatically be added to the search table. Assuming that my table looks like the following:
CREATE TABLE item (id bigint PRIMARY KEY, name text NOT NULL, description text);
I created a trigger passing in the column names that I want to be able to search:
CREATE TRIGGER insert_item_search BEFORE INSERT
ON item FOR EACH ROW EXECUTE PROCEDURE
insert_search('{name, description}'::text[]);
Then created a new function insert_search as:
CREATE OR REPLACE FUNCTION insert_search(cols text[]) RETURNS TRIGGER AS $$
BEGIN
INSERT INTO search (target_id, target_type, fts) VALUES (
NEW.id, TG_TABLE_NAME, to_tsvector('english', 'foo')
);
RETURN NEW;
END;
$$ LANGUAGE PLPGSQL;
My question is, how do I pass in the table values based on cols to to_tsvector? Right now, the function is getting called and inserts the id and type correctly, but I don't know the right way to dynamically grab the other values based on the cols argument.
First, to pass arguments, just send them directly:
CREATE TRIGGER insert_item_search BEFORE INSERT
ON item FOR EACH ROW EXECUTE PROCEDURE
insert_search('name', 'description');
And, from PL/pgSQL you will get those arguments as an array, called TG_ARGV. But, the problem is that PL/pgSQL cannot get the values from NEW record based on their names. To do that you can either use a language that lets you do that (like PL/python or PL/perl) or use the hstore extension.
I'd stick with the latter and use hstore (unless you already use one of the other languages to create functions):
CREATE OR REPLACE FUNCTION insert_search() RETURNS TRIGGER AS $$
DECLARE
v_new hstore;
BEGIN
v_new = hstore(NEW); -- convert the record to hstore
FOR i IN 0..(TG_NARGS-1) LOOP
INSERT INTO search (target_id, target_type, fts) VALUES (
NEW.id, TG_TABLE_NAME, to_tsvector('english', v_new -> TG_ARGV[i])
);
END LOOP;
RETURN NEW;
END;
$$ LANGUAGE PLPGSQL;
As you can see above, I used the hstore's operator -> to get the value based on the name (on TG_ARGV[i]).
You can access the parameters specified in the trigger definition with the TG_ARGV variable. You can find documentation on that here. TG_ARGV is an array accessed by a 0-based index, so it would be something like TG_ARGV[0], TG_ARGV[1], and so on.
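For instance, a throwaway trigger function (the name show_args is made up) attached with EXECUTE PROCEDURE show_args('name', 'description') would see the arguments like this:
CREATE OR REPLACE FUNCTION show_args() RETURNS trigger AS $$
BEGIN
   -- prints: TG_NARGS = 2, TG_ARGV[0] = name, TG_ARGV[1] = description
   RAISE NOTICE 'TG_NARGS = %, TG_ARGV[0] = %, TG_ARGV[1] = %'
              , TG_NARGS, TG_ARGV[0], TG_ARGV[1];
   RETURN NEW;
END;
$$ LANGUAGE plpgsql;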