rails + psql with structure dump uses PROCEDURE over FUNCTION - postgresql

Every time I dump structure.sql in a Rails app, I get PROCEDURE instead of FUNCTION. FUNCTION is our default, so I have to commit the file in parts. That is annoying, and because structure.sql is rather big I sometimes miss lines, which is even worse.
git diff example:
-CREATE TRIGGER cache_comments_count AFTER INSERT OR DELETE OR UPDATE ON public.comments FOR EACH ROW EXECUTE PROCEDURE public.update_comments_counter();
+CREATE TRIGGER cache_comments_count AFTER INSERT OR DELETE OR UPDATE ON public.comments FOR EACH ROW EXECUTE FUNCTION public.update_comments_counter();
I'm sure there is a postgresql setting for this somewhere, but I can't find it.

Whether you use FUNCTION or PROCEDURE, you get exactly the same result. The documentation shows:
CREATE [ CONSTRAINT ] TRIGGER name...
EXECUTE { FUNCTION | PROCEDURE } function_name ( arguments )
This means you can use either keyword, FUNCTION or PROCEDURE; either way function_name is what gets called. In my demo I have separate triggers for insert and update: the insert trigger uses EXECUTE PROCEDURE and the update trigger uses EXECUTE FUNCTION. This cannot be changed in Postgres; it would have to be a Rails setting. NOTE: Prior to v11, Postgres only allowed EXECUTE PROCEDURE, even though what you had to create, and what was called, was a trigger function.
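The equivalence of the two keywords can be checked with a small sketch like the one below. It assumes Postgres 11 or later (so that EXECUTE FUNCTION is accepted), and the names demo_t, demo_trg, demo_ins and demo_upd are made up for illustration; they are not taken from the original demo.

```sql
-- Hypothetical table and trigger function, used only for this illustration.
CREATE TABLE demo_t (id integer, note text);

CREATE FUNCTION demo_trg() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Record which operation and which trigger fired.
    NEW.note := TG_OP || ' via ' || TG_NAME::text;
    RETURN NEW;
END;
$$;

-- Same trigger function, attached once with each keyword.
CREATE TRIGGER demo_ins BEFORE INSERT ON demo_t
FOR EACH ROW EXECUTE PROCEDURE demo_trg();

CREATE TRIGGER demo_upd BEFORE UPDATE ON demo_t
FOR EACH ROW EXECUTE FUNCTION demo_trg();

INSERT INTO demo_t (id) VALUES (1);
SELECT note FROM demo_t;   -- INSERT via demo_ins

UPDATE demo_t SET id = 2;
SELECT note FROM demo_t;   -- UPDATE via demo_upd
```

Both triggers end up calling the same function; only the keyword used in CREATE TRIGGER differs.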

The output of pg_get_triggerdef() changed between Postgres 11 and 12, following the introduction of real procedures in Postgres 11. Since Postgres 12 it always returns a syntax that uses EXECUTE FUNCTION, because what is called when the trigger fires is in reality a function, not a procedure.
So this code:
create table t1 (id int);
create function trg_func()
returns trigger
as
$$
begin
return new;
end;
$$
language plpgsql;
create trigger test_trigger
before insert or update
on t1
for each row
execute procedure trg_func();
select pg_get_triggerdef(oid)
from pg_trigger
where tgname = 'test_trigger';
returns the following in Postgres 11 and earlier:
CREATE TRIGGER test_trigger BEFORE INSERT OR UPDATE ON public.t1 FOR EACH ROW EXECUTE PROCEDURE trg_func()
and the following in Postgres 12 and later:
CREATE TRIGGER test_trigger BEFORE INSERT OR UPDATE ON public.t1 FOR EACH ROW EXECUTE FUNCTION trg_func()
I guess Rails uses pg_get_triggerdef() to obtain the trigger source. So there is nothing you can do. If you want a consistent result, you should use the same Postgres version everywhere.
The column action_statement in the view information_schema.triggers also reflects the change in naming.
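To see what the view reports on a given server, a query along these lines should work. It is only a sketch and assumes the test_trigger created in the example above still exists.

```sql
-- One row is returned per triggering event, so a trigger defined
-- for INSERT OR UPDATE shows up twice.
SELECT trigger_name, event_manipulation, action_statement
FROM information_schema.triggers
WHERE trigger_name = 'test_trigger';
```

Running the same query on a Postgres 11 and a Postgres 12 server is a quick way to compare the two spellings without producing a full dump.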

Related

Get data of multiple inserted rows in one object using trigger in postgres

I am trying to write a trigger which gets data from the table attribute in which multiple rows are inserted corresponding to one actionId at one time and group all that data into the one object:
Table Schema
actionId
key
value
I am firing the trigger on row insertion, so how can I handle this multi-row insertion and collect all the data?
CREATE TRIGGER attribute_changes
AFTER INSERT
ON attributes
FOR EACH ROW
EXECUTE PROCEDURE log_attribute_changes();
and the function,
CREATE OR REPLACE FUNCTION wflowr222.log_task_extendedattribute_changes()
RETURNS trigger AS
$BODY$
DECLARE
_message json;
_extendedAttributes jsonb;
BEGIN
SELECT json_agg(tmp)
INTO _extendedAttributes
FROM (
-- your subquery goes here, for example:
SELECT attributes.key, attributes.value
FROM attributes
WHERE attributes.actionId=NEW.actionId
) tmp;
_message :=json_build_object('actionId',NEW.actionId,'extendedAttributes',_extendedAttributes);
INSERT INTO wflowr222.irisevents(message)
VALUES(_message );
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
and data format is,
actionId key value
2 flag true
2 image http:test.com/image
2 status New
I tried to do it with an INSERT trigger, but it fires for each row inserted.
Does anyone have any idea how to do this?
I expect that the problem is that you're using a FOR EACH ROW trigger; what you likely want is a FOR EACH STATEMENT trigger, i.e. one which fires only once for your multi-row INSERT statement. See the description at https://www.postgresql.org/docs/current/sql-createtrigger.html for a more thorough explanation.
AFAICT, you will also need to add REFERENCING NEW TABLE AS NEW in this mode to make the inserted rows available to the trigger function. So your CREATE TRIGGER syntax would need to be:
CREATE TRIGGER attribute_changes
AFTER INSERT
ON attributes
REFERENCING NEW TABLE AS NEW
FOR EACH STATEMENT
EXECUTE PROCEDURE log_attribute_changes();
I've read elsewhere that the required REFERENCING NEW TABLE ... syntax is only supported in PostgreSQL 10 and later.
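As a rough sketch of how the statement-level approach could look end to end, assuming PostgreSQL 10 or later: in a statement-level trigger the inserted rows arrive as a transition table rather than as a NEW record, so the function has to select from that table. The names events, new_rows, log_attribute_changes_stmt and attribute_changes_stmt are invented for this illustration, and the attributes table is declared here only to make the example self-contained.

```sql
-- Hypothetical stand-ins for the tables in the question.
CREATE TABLE attributes (actionId integer, key text, value text);
CREATE TABLE events (message json);

CREATE FUNCTION log_attribute_changes_stmt()
RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- new_rows is the transition table named in CREATE TRIGGER below;
    -- it holds every row inserted by the triggering statement.
    INSERT INTO events (message)
    SELECT json_build_object(
               'actionId', n.actionId,
               'extendedAttributes',
               json_agg(json_build_object('key', n.key, 'value', n.value)))
    FROM new_rows AS n
    GROUP BY n.actionId;
    RETURN NULL;  -- the return value of an AFTER trigger is ignored
END;
$$;

CREATE TRIGGER attribute_changes_stmt
AFTER INSERT ON attributes
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT
EXECUTE PROCEDURE log_attribute_changes_stmt();

-- One multi-row INSERT fires the trigger once and writes one event per actionId.
INSERT INTO attributes (actionId, key, value) VALUES
    (2, 'flag', 'true'),
    (2, 'image', 'http:test.com/image'),
    (2, 'status', 'New');
```

Giving the transition table a name other than NEW avoids confusion with the row-level NEW record, which is not populated in statement-level triggers.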
Considering the version of Postgres you have, and keeping in mind that you therefore can't use a FOR EACH STATEMENT trigger for your purpose, the only alternative I see is:
using an AFTER INSERT trigger to collect some information about the changes in a utility table
using a Unix cron job that executes a PL/pgSQL routine to do the work on the collected data set
For example:
Your utility table
CREATE TABLE utility (
actionid integer,
createtime timestamp
);
You can define a FOR EACH ROW trigger with a body that does something like this:
INSERT INTO utility VALUES (NEW.actionid, current_timestamp);
And, finally, have a Unix crontab entry that executes a file or a procedure that does something like this:
SELECT a.* FROM utility u JOIN yourtable a ON a.actionid = u.actionid WHERE u.createtime < current_timestamp;
-- do something here with the records selected above
TRUNCATE table utility;
If you had postgres 9.5 you could have used pg_cron instead of unix cron...

Automatically DROP FUNCTION when DROP TABLE on POSTGRESQL 11.7

I'm trying to, somehow, trigger an automatic function drop when a table is dropped and I can't figure out how to do it.
TL;DR: Is there a way to trigger a function drop when a specific table is dropped? (POSTGRESQL 11.7)
Detailed explanation
I'll try to explain my problem using a simplified use case with dummy names.
I have three tables: sensor1, sensor2 and sumSensors;
A FUNCTION (sumdata) was created to INSERT data on sumSensors table. Inside this function I'll fetch data from sensor1 and sensor2 tables and insert its sum on table sumSensors;
A trigger was created for each sensor table, which looks like this:
CREATE TRIGGER trig1
AFTER INSERT ON sensor1
FOR EACH ROW EXECUTE
FUNCTION sumdata();
Now, when a new row is inserted on tables sensor1 OR sensor2, the function sumdata will be executed and insert the sum of last values from both on table sumSensors
If I wanted to DROP FUNCTION sumdata CASCADE;, the triggers would be automatically removed from tables sensor1 and sensor2. So far, everything is fine! But that's not what I want.
My problem is:
Q: And if I just DROP TABLE sumSensors CASCADE;? What would happen to the function which was meant to insert on this table?
A: As expected, since there's no association between the sumSensors table and the sumdata function, the function won't be dropped (it still exists)! The same goes for the triggers which use it (they still exist). This means that when a new row is inserted into the sensor tables, the function sumdata will still be executed and will fail, so even the INSERT which fired the trigger won't actually be performed.
Is there a way to trigger a function drop when a specific table is dropped?
Thank you in advance
PostgreSQL does not track dependencies on the objects referenced inside a function body (as of version 12).
You can use event triggers to maintain the dependencies yourself.
Full example follows.
More information: documentation of event triggers feature, support functions.
BEGIN;
CREATE TABLE _testtable ( id serial primary key, payload text );
INSERT INTO _testtable (payload) VALUES ('Test data');
CREATE FUNCTION _testfunc(integer) RETURNS integer
LANGUAGE SQL AS $$ SELECT $1 + count(*)::integer FROM _testtable; $$;
SELECT _testfunc(100);
CREATE FUNCTION trg_drop_dependent_functions()
RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
_dropped record;
BEGIN
FOR _dropped IN
SELECT schema_name, object_name
FROM pg_catalog.pg_event_trigger_dropped_objects()
WHERE object_type = 'table'
LOOP
IF _dropped.schema_name = 'public' AND _dropped.object_name = '_testtable' THEN
EXECUTE 'DROP FUNCTION IF EXISTS _testfunc(integer)';
END IF;
END LOOP;
END;
$$;
CREATE EVENT TRIGGER trg_drop_dependent_functions ON sql_drop
EXECUTE FUNCTION trg_drop_dependent_functions();
DROP TABLE _testtable;
ROLLBACK;
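Applied to the tables from the question, the same idea might look like the sketch below. It assumes that sumSensors was created without double quotes (so its stored name is the lower-case sumsensors), that it lives in the public schema, and that sumdata takes no arguments; the name drop_sumdata_with_table is made up here. Creating event triggers normally requires superuser privileges.

```sql
CREATE FUNCTION drop_sumdata_with_table()
RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Act only when the dropped objects include the summary table.
    IF EXISTS (
        SELECT 1
        FROM pg_event_trigger_dropped_objects()
        WHERE object_type = 'table'
          AND schema_name = 'public'
          AND object_name = 'sumsensors'
    ) THEN
        -- CASCADE also removes the triggers on sensor1 and sensor2
        -- that call sumdata().
        DROP FUNCTION IF EXISTS sumdata() CASCADE;
    END IF;
END;
$$;

CREATE EVENT TRIGGER drop_sumdata_with_table ON sql_drop
EXECUTE FUNCTION drop_sumdata_with_table();
```

With this in place, DROP TABLE sumSensors; should also remove sumdata() and the triggers that depend on it, so later inserts into the sensor tables no longer fail.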

Insert values in a loop and see the progress postgresql [duplicate]

I have a PostgreSQL function which has to INSERT about 1.5 million rows into a table. I want to see the table getting populated as each record is inserted. Currently, when I try with about 1000 records, the table gets populated only after the complete function has executed. If I stop the function halfway through, no data gets populated. How can I make the records committed even if I stop after a certain number of records have been inserted?
This can be done using dblink. I show an example with one insert being committed; you will need to add your while loop logic and commit on every iteration. See http://www.postgresql.org/docs/9.3/static/contrib-dblink-connect.html
CREATE OR REPLACE FUNCTION log_the_dancing(ip_dance_entry text)
RETURNS INT AS
$BODY$
DECLARE
BEGIN
PERFORM dblink_connect('dblink_trans','dbname=sandbox port=5433 user=postgres');
-- quote_literal() escapes embedded quotes in the value before it is spliced into the SQL text
PERFORM dblink('dblink_trans','INSERT INTO dance_log(dance_entry) SELECT ' || quote_literal(ip_dance_entry));
PERFORM dblink('dblink_trans','COMMIT;');
PERFORM dblink_disconnect('dblink_trans');
RETURN 0;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION log_the_dancing(ip_dance_entry text)
OWNER TO postgres;
BEGIN TRANSACTION;
select log_the_dancing('The Flamingo');
select log_the_dancing('Break Dance');
select log_the_dancing('Cha Cha');
ROLLBACK TRANSACTION;
--Show records committed even though we rolled back outer transaction
select *
from dance_log;
What you're asking for is generally called an autonomous transaction.
PostgreSQL does not support autonomous transactions at this time (9.4).
To properly support them it really needs stored procedures, not just the user-defined functions it currently supports. It's also very complicated to implement autonomous tx's in PostgreSQL for a variety of internal reasons related to its session and process model.
For now, use dblink as suggested by Bob.
If you have the flexibility to change from a function to a procedure: from PostgreSQL 11 onwards, a procedure invoked with the CALL command can commit internally, which a function cannot. Your function would therefore become a procedure invoked with CALL, e.g.:
CREATE PROCEDURE transaction_test2()
LANGUAGE plpgsql
AS $$
DECLARE
r RECORD;
BEGIN
FOR r IN SELECT * FROM test2 ORDER BY x LOOP
INSERT INTO test1 (a) VALUES (r.x);
COMMIT;
END LOOP;
END;
$$;
CALL transaction_test2();
More details about transaction management regarding Postgres are available here: https://www.postgresql.org/docs/12/plpgsql-transactions.html
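For a bulk load like the one in the question, committing after every row may be slower than necessary. A possible variation, shown here only as a sketch, commits once per batch. The table progress_demo, the procedure fill_progress_demo and the batch size are assumptions made for this illustration.

```sql
-- Hypothetical target table.
CREATE TABLE progress_demo (n integer);

CREATE PROCEDURE fill_progress_demo(total integer, batch_size integer)
LANGUAGE plpgsql
AS $$
BEGIN
    FOR i IN 1..total LOOP
        INSERT INTO progress_demo (n) VALUES (i);
        -- Make the rows inserted so far visible to other sessions.
        IF i % batch_size = 0 THEN
            COMMIT;
        END IF;
    END LOOP;
END;
$$;

-- Run outside an explicit BEGIN ... COMMIT block,
-- otherwise the COMMIT inside the procedure raises an error.
CALL fill_progress_demo(1500000, 10000);
```

While the call is running, a second session can run SELECT count(*) FROM progress_demo; to watch the progress, and rows from batches that were already committed remain in the table if the call is cancelled.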
For PostgreSQL 9.5 or newer you can use dynamic background workers provided by the pg_background extension. It creates an autonomous transaction. Please refer to the GitHub page of the extension. This solution is better than dblink. There is a complete guide on autonomous transaction support in PostgreSQL. There is a third way to start an autonomous transaction in Postgres, but it needs some patching. Please see Peter Eisentraut's patch proposal for OracleDB-style transactions.

Triggers in Postgresql/postgis

I have a shapefile loaded into a postgis database. This shapefile is frequently updated by the source and thus my current process is:
Use shp2pgsql with the -a option to generate insert statements.
Run the SQL generated in step 1 to append to database.
Of course, I end up with all the rows from both versions of the shapefile, and what I need is to get rid of all the previous rows and load only the rows from the updated shapefile.
I tried creating a trigger and trigger function in the database:
CREATE TRIGGER drop_all_rows_from_owner_table_trigger
BEFORE INSERT
ON owner_polygons_common_ownership_layer
FOR EACH STATEMENT
EXECUTE PROCEDURE drop_all_rows_from_owner_table();
Here's the trigger function:
CREATE OR REPLACE FUNCTION drop_all_rows_from_owner_table()
RETURNS trigger AS $$
BEGIN DELETE FROM owner_polygons_common_ownership_layer;
RETURN NEW;
END;
$$
LANGUAGE 'plpgsql';
I believe all I have accomplished is to delete all rows from the table, insert the new rows, then delete them again, because when I look at the table after the process ends I have zero rows. I used the FOR EACH STATEMENT clause because shp2pgsql created one INSERT statement.
My question is: Are triggers the way to go to accomplish this?
Your trigger function seems right.
However, I don't think this is the way to go: you cannot be sure that shp2pgsql produces a single statement.
If your shapefile grows, it could split your inserts in multiple statements.
So, if you can't use the -d option (which drops and recreates the table), I'd add a step to the process, between 1 and 2, to truncate the table.
You could also prepend the truncate statement in the generated sql file, or you can execute another psql command to truncate the table.

How to add a column to an existing table then use it in a single PostgreSQL function

I have a table being created in a PostgreSQL ( version 9 ) database by a third party product and I need to change that table to add a new column then set the column in question to a standard value.
I have the following in my function:
CREATE FUNCTION alterscorecolumns()
RETURNS void AS
$BODY$
ALTER TABLE "hi_scores" ADD "total_score" integer;
UPDATE "hi_scores" SET total_score = score1+score2+score3;
$BODY$
However, I'm not allowed to do this because it doesn't know that the total_score field exists. I just get the message ERROR: column "total_score" of relation "hi_scores" does not exist.
I am guessing there is some execution-plan related reason for this and that maybe I need to tell it to run the ALTER TABLE before it tries to perform the update, but I can't seem to figure out what I need to do.
You can't do it that way. The SQL in the function is parsed when you create the function. At the time of the creation of the function the column is not there, so you get the error message.
You will need to use dynamic SQL to run the UPDATE statement.
Something like:
CREATE FUNCTION alterscorecolumns()
RETURNS void AS
$BODY$
begin
execute 'ALTER TABLE hi_scores ADD total_score integer';
execute 'UPDATE hi_scores SET total_score = score1+score2+score3';
end;
$BODY$
language plpgsql;
(Not tested, so there might be syntax errors in there)
Just add DEFAULT to your statement like this:
ALTER TABLE "hi_scores" ADD "total_score" integer DEFAULT 0;
@mu already explained: if you want to save this procedure as a function, you have to use dynamic SQL with EXECUTE. But only for the UPDATE. The ALTER TABLE statement works just fine.
As this is obviously a one-time operation (can't add the same column twice), it hardly makes sense to persist a function for the purpose. You could use a DO statement instead:
DO
$BODY$
BEGIN
ALTER TABLE hi_scores ADD total_score integer;
EXECUTE 'UPDATE hi_scores SET total_score = score1+score2+score3';
END;
$BODY$;
But then again, keep it simple: just execute two SQL statements. As soon as the ALTER TABLE is done, the UPDATE will just work normally. Inside a transaction or not, it doesn't matter, as long as you execute them in order.
ALTER TABLE hi_scores ADD total_score integer;
UPDATE hi_scores SET total_score = score1+score2+score3;