I have a shapefile loaded into a postgis database. This shapefile is frequently updated by the source and thus my current process is:
Use shp2pgql with -a option to generate insert statements.
Run the SQL generated in step 1 to append to database.
Of course, I end up with all the rows from both versions of the shapefile, and what I need is to get rid of all the previous rows and load only the rows from the updated shapefile.
I tried creating a trigger and trigger function in the database:
CREATE TRIGGER drop_all_rows_from_owner_table_trigger
BEFORE INSERT
ON owner_polygons_common_ownership_layer
FOR EACH STATEMENT
EXECUTE PROCEDURE drop_all_rows_from_owner_table();
Here's the trigger function:
CREATE OR REPLACE FUNCTION drop_all_rows_from_owner_table()
RETURNS trigger AS $$
BEGIN DELETE FROM owner_polygons_common_ownership_layer;
RETURN NEW;
END;
$$
LANGUAGE 'plpgsql';
I believe all I have accomplished is to delete all rows from the table, insert the new rows, then delete them again, because when I look at the table after the process ends I have zero rows. I used the FOR EACH STATEMENT clause because shp2sql created one INSERT statement.
My question is: Are triggers the way to go to accomplish this?
Your trigger function seems right.
However, I don't think this is the way to go: you cannot be sure that shp2pgsql produces a single statement.
If your shapefile grows, it could split your inserts in multiple statements.
So, if you can't use the -d option (that delete and recreate the table), I'd add a step to the process, between 1 and 2, to truncate the table.
You could also prepend the truncate statement in the generated sql file, or you can execute another psql command to truncate the table.
Related
Every time I dump my structure.sql on a rails app, I get PROCEDURE over FUNCTION. FUNCTION is our default and I have to commit the file in parts which is annoying and sometimes I miss lines which is even worse, as it is a rather big structure.sql file.
git diff example:
-CREATE TRIGGER cache_comments_count AFTER INSERT OR DELETE OR UPDATE ON public.comments FOR EACH ROW EXECUTE PROCEDURE public.update_comments_counter();
+CREATE TRIGGER cache_comments_count AFTER INSERT OR DELETE OR UPDATE ON public.comments FOR EACH ROW EXECUTE FUNCTION public.update_comments_counter();
I'm sure there is a postgresql setting for this somewhere, but I can't find it.
Whether you use Function or Procedure you get exactly the same. The documentation shows
CREATE [ CONSTRAINT ] TRIGGER name...
EXECUTE { FUNCTION | PROCEDURE } function_name ( arguments )
This means you can use either term FUNCTION or PROCEDURE but either way function_name is always called. See demo. For demo I have separate triggers for insert and update. Insert using execute procedure and update using execute function. This cannot be changed in Postgres it would have to be Rails setting. NOTE: Prior to v11 Postgres only allowed execute procedure even though you had to create a trigger function that was called.
The function pg_get_triggerdef() changed between Postgres 11 and 12 when Postgres introduced real procedures. Since Postgres 12 it always returns a syntax that uses EXECUTE FUNCTION as in reality it is a function that is called when the trigger fires, not a procedure.
So this code:
create table t1 (id int);
create function trg_func()
returns trigger
as
$$
begin
return new;
end;
$$
language plpgsql;
create trigger test_trigger
before insert or update
on t1
for each row
execute procedure trg_func();
select pg_get_triggerdef(oid)
from pg_trigger
where tgname = 'test_trigger';
returns the following in Postgres 11 and earlier:
CREATE TRIGGER test_trigger BEFORE INSERT OR UPDATE ON public.t1 FOR EACH ROW EXECUTE PROCEDURE trg_func()
and the following in Postgres 12 and later:
CREATE TRIGGER test_trigger BEFORE INSERT OR UPDATE ON public.t1 FOR EACH ROW EXECUTE FUNCTION trg_func()
I guess Rails uses pg_get_triggerdef() to obtain the trigger source. So there is nothing you can do. If you want a consistent result, you should use the same Postgres version everywhere.
The column action_statement in the view information_schema.triggers also reflects the change in naming.
Postgres 11 example
Postgres 12 example
I am trying to write a trigger which gets data from the table attribute in which multiple rows are inserted corresponding to one actionId at one time and group all that data into the one object:
Table Schema
actionId
key
value
I am firing trigger on rows insertion,SO how can I handle this multiple row insertion and how can I collect all the data.
CREATE TRIGGER attribute_changes
AFTER INSERT
ON attributes
FOR EACH ROW
EXECUTE PROCEDURE log_attribute_changes();
and the function,
CREATE OR REPLACE FUNCTION wflowr222.log_task_extendedattribute_changes()
RETURNS trigger AS
$BODY$
DECLARE
_message json;
_extendedAttributes jsonb;
BEGIN
SELECT json_agg(tmp)
INTO _extendedAttributes
FROM (
-- your subquery goes here, for example:
SELECT attributes.key, attributes.value
FROM attributes
WHERE attributes.actionId=NEW.actionId
) tmp;
_message :=json_build_object('actionId',NEW.actionId,'extendedAttributes',_extendedAttributes);
INSERT INTO wflowr222.irisevents(message)
VALUES(_message );
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
and data format is,
actionId key value
2 flag true
2 image http:test.com/image
2 status New
I tried to do it via Insert trigger, but it is firing on each row inserted.
If anyone has any idea about this?
I expect that the problem is that you're using a FOR EACH ROW trigger; what you likely want is a FOR EACH STATEMENT trigger - ie. which only fires once for your multi-line INSERT statement. See the description at https://www.postgresql.org/docs/current/sql-createtrigger.html for a more through explanation.
AFAICT, you will also need to add REFERENCING NEW TABLE AS NEW in this mode to make the NEW reference available to the trigger function. So your CREATE TRIGGER syntax would need to be:
CREATE TRIGGER attribute_changes
AFTER INSERT
ON attributes
REFERENCING NEW TABLE AS NEW
FOR EACH STATEMENT
EXECUTE PROCEDURE log_attribute_changes();
I've read elsewhere that the required REFERENCING NEW TABLE ... syntax is only supported in PostgreSQL 10 and later.
Considering the version of postgres you have, and therefore keeping in mind that you can't use a trigger defined FOR EACH STATEMENT for your purpose, the only alternative I see is
using a trigger after insert in order to collect some information about changes in a utility table
using a unix cron that execute a pl/sql that do the job on data set
For example:
Your utility table
CREATE TABLE utility (
actionid integer,
createtime timestamp
);
You can define a trigger FOR EACH ROW with a body that do something like this
INSERT INTO utilty values(NEW.actionid, curent_timestamp);
And, finally, have a crontab UNIX that execute a file or a procedure that to something like this:
SELECT a.* FROM utility u JOIN yourtable a ON a.actionid = u.actionid WHERE u.createtime < current_timestamp;
// do something here with records selected above
TRUNCATE table utility;
If you had postgres 9.5 you could have used pg_cron instead of unix cron...
My trigger is defined the following way:
CREATE TRIGGER update_contract_finished_at
AFTER INSERT OR DELETE OR UPDATE OF performed_on
ON task
FOR EACH ROW
EXECUTE PROCEDURE update_contract_finished_at_function();
I now want to evoke this trigger to set the variables which are updated by the trigger. How do I do that?
Something like
for each row in task
execute procedure update_contract_finished_at_function();
I know I can update with a standard update set statement. I also want to verifiy that my trigger works on all the data correctly.
I'd write a slightly modified copy of update_contract_finished_at_function that takes type task as input and returns void.
Then replace NEW in the trigger function with $1 and call the function like this:
SELECT copy_func(task) FROM task;
If the functions are almost identical, it should be good enough to test the trigget function.
The way to manually trigger your on update trigger once would be:
UPDATE task SET performed_on = performed_on
however depending on how complicated your logic is in there and how many rows you have in the table a separate query might be significantly faster for initializing a large number of rows.
Since you mentioned you want to test the behaviour of your trigger you can clone the table or do a table or database dump and restore the data afterwards. If this is a live system you should instead do a database dump, restore to another system, add your trigger, test it, repeat from restore until you nail it... and only after you're sure it does what you want update the live system with it.
I ended up writing a PL/pgSQL function that in a loop processes all events in chronological order and calling it:
create or replace function process_event_history()
returns void
language plpgsql
as
$$
declare
event record;
begin
for event in
select id, timestamp
from events
order by timestamp
loop
update events set timestamp = event.timestamp
where id = event.id;
end loop;
end;
$$;
--;;
-- Execute the above function causing the trigger to run for all events.
select process_event_history();
--;;
-- Remove the temporary processing function.
drop function process_event_history();
I'm trying to implement something similar to replication with python trigger procedures.
procedure
CREATE OR REPLACE FUNCTION foo.send_payload()
RETURNS trigger AS
$$
import json, zmq
try:
payload = json.dumps(TD)
ctx = zmq.Context()
socket = ctx.socket(zmq.PUSH)
socket.connect("ipc:///tmp/feeds/0")
socket.send(payload)
socket.close()
except:
pass
$$
LANGUAGE plpython VOLATILE;
trigger
CREATE TRIGGER foo.my_trigger
AFTER INSERT
ON foo.my_table
FOR EACH ROW
EXECUTE PROCEDURE foo.send_payload();
This does work, but it's not very efficient.
Rows are inserted in bulk and I want to reuse the socket to send all of them.
However, when I do a statement level trigger I don't have access to the rows.
I was thinking about defining a sequence which would be the last row id processed.
Then use that to grab all the data in the procedure with a SELECT inside the statement level trigger.
The problem is that there doesn't seem to be a way of getting a sequence value without incrementing it.
Any suggestions on how to approach this problem?
Use two triggers. "FOR EACH ROW" would stack the rows in some temporary place (maybe SD), and "FOR EACH STATEMENT" would get data from shared place, send, and clear the shared place.
Alternatively (and I think it's better idea), you can use LISTEN/NOTIFY, as I once described in my blog.
So, I think this should be fairly simple, but the documentation makes it seem somewhat more complicated. I've written an SQL function in PostgreSQL (8.1, for now) which does some cleanup on some string input. For what it's worth, the string is an LDAP distinguished name, and I want there to consistently be no spaces after the commas - and the function is clean_dn(), which returns the cleaned DN. I want to do the same thing to force all input to another couple of columns to lower case, etc - which should be easy once I figure this part out.
Anyway, I want this function to be run on the "dn" column of a table any time anyone attempts to insert to or update and modify that column. But all the rule examples I can find seem to make the assumption that all insert/update queries modify all the columns in a table all the time. In my situation, that is not the case. What I think I really want is a constraint which just changes the value rather than returning true or false, but that doesn't seem to make sense with the SQL idea of a constraint. Do I have my rule do an UPDATE into the NEW table? Do I have to create a new rule for every possible combination of NEW values? And if I add a column, do I have to go through and update all of my rule combinations to refelect every possible new combination of columns?
There has to be an easy way...
First, update to a current version of PostgreSQL. 8.1 is long dead and forgotten und unsupported and very, very old .. you get my point? Current version is PostgreSQL 9.2.
Then, use a trigger instead of a rule. It's simpler. It's the way most people go. I do.
For column col in table tbl ...
First, create a trigger function:
CREATE OR REPLACE FUNCTION trg_tbl_insupbef()
RETURNS trigger AS
$BODY$
BEGIN
NEW.col := f_myfunc(NEW.col); -- your function here, must return matching type
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Then use it in a trigger.
For ancient Postgres 8.1:
CREATE TRIGGER insupbef
BEFORE INSERT OR UPDATE
ON tbl
FOR EACH ROW
EXECUTE PROCEDURE trg_tbl_insupbef();
For modern day Postgres (9.0+)
CREATE TRIGGER insbef
BEFORE INSERT OR UPDATE OF col -- only call trigger, if column was updated
ON tbl
FOR EACH ROW
EXECUTE PROCEDURE trg_tbl_insupbef();
You could pack more stuff into one trigger, but then you can't condition the UPDATE trigger on just the one column ...