plpython get all rows on INSERT TRIGGER - postgresql

I'm trying to implement something similar to replication with python trigger procedures.
procedure
CREATE OR REPLACE FUNCTION foo.send_payload()
RETURNS trigger AS
$$
import json, zmq
try:
    payload = json.dumps(TD)
    ctx = zmq.Context()
    socket = ctx.socket(zmq.PUSH)
    socket.connect("ipc:///tmp/feeds/0")
    socket.send(payload)
    socket.close()
except:
    pass
$$
LANGUAGE plpythonu VOLATILE;
trigger
CREATE TRIGGER my_trigger  -- trigger names can't be schema-qualified; the trigger lives in its table's schema
AFTER INSERT
ON foo.my_table
FOR EACH ROW
EXECUTE PROCEDURE foo.send_payload();
This does work, but it's not very efficient.
Rows are inserted in bulk and I want to reuse the socket to send all of them.
However, when I do a statement level trigger I don't have access to the rows.
I was thinking about defining a sequence which would be the last row id processed.
Then use that to grab all the data in the procedure with a SELECT inside the statement level trigger.
The problem is that there doesn't seem to be a way of getting a sequence value without incrementing it.
Any suggestions on how to approach this problem?

Use two triggers: a "FOR EACH ROW" trigger that stacks the rows in some shared place, and a "FOR EACH STATEMENT" trigger that takes the data from that shared place, sends it, and clears it. Note that SD is private to each function, so for two different trigger functions the global dictionary GD is the shared place to use (a sketch follows below).
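A minimal sketch of that two-trigger approach, assuming PL/Python with GD as the shared place; the function and trigger names are illustrative, not from the question. The statement-level AFTER trigger fires only after all of the statement's row-level AFTER triggers, so the queue is complete by the time it runs:
CREATE OR REPLACE FUNCTION foo.queue_payload()
RETURNS trigger AS
$$
    # row-level: stash each new row in the global dictionary GD
    GD.setdefault('payload_queue', []).append(TD["new"])
$$
LANGUAGE plpythonu VOLATILE;

CREATE OR REPLACE FUNCTION foo.flush_payloads()
RETURNS trigger AS
$$
    # statement-level: send all stashed rows over one socket, then clear
    import json, zmq
    rows = GD.pop('payload_queue', [])
    if rows:
        ctx = zmq.Context()
        socket = ctx.socket(zmq.PUSH)
        socket.connect("ipc:///tmp/feeds/0")
        socket.send(json.dumps(rows))  # under Python 3, encode to bytes first
        socket.close()
$$
LANGUAGE plpythonu VOLATILE;

CREATE TRIGGER my_table_queue AFTER INSERT ON foo.my_table
FOR EACH ROW EXECUTE PROCEDURE foo.queue_payload();

CREATE TRIGGER my_table_flush AFTER INSERT ON foo.my_table
FOR EACH STATEMENT EXECUTE PROCEDURE foo.flush_payloads();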
Alternatively (and I think it's better idea), you can use LISTEN/NOTIFY, as I once described in my blog.

Related

Transaction Outbox Pattern with AWS Aurora Postgres and aws_lambda.invoke

I am working on a project made up of a couple of microservices. I am planning to use the Transaction Outbox Pattern by calling a Lambda function from a Postgres after-insert trigger.
I am thinking something like this
CREATE OR REPLACE FUNCTION tx_msg_func() RETURNS trigger AS
$$
DECLARE
    newRecord json;
BEGIN
    newRecord := row_to_json(NEW.*);
    PERFORM * FROM aws_lambda.invoke(
        aws_commons.create_lambda_function_arn('my_lambda_function'),
        newRecord,
        invocation_type := 'Event'  -- passed by name: the third positional parameter is the region
    );
    RETURN NEW;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER tx_msg_insert AFTER INSERT ON tx_outbox_table
FOR EACH ROW EXECUTE PROCEDURE tx_msg_func();
Here, the lambda function will receive the new record as JSON and will send an SQS message. After sending the message successfully, it will delete the record from tx_outbox_table
I am wondering if there is any downside here that I am missing. Do you think this is a production-ready solution? Is there anything I should be aware of?
Well, what about the transaction? It should be as short as possible, but an AFTER INSERT trigger executes inside the transaction, so the TCP call to Lambda goes inside the transaction too. What can be done about it? Here is an idea: exceptions are processed outside the current transaction, and a new transaction is lighter when nothing is written, so maybe that is the way to go?
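If keeping the transaction short is the main concern, a common variant of the outbox pattern skips the in-transaction Lambda call entirely: the insert only writes the outbox row, and a separate relay process polls the table and sends to SQS. A rough sketch of the relay's batch-claim query, assuming id and payload columns (not shown in the question):
-- Relay, in its own transaction: claim and delete a batch of pending rows,
-- skipping rows another relay instance has already locked.
WITH batch AS (
    SELECT id FROM tx_outbox_table
    ORDER BY id
    LIMIT 100
    FOR UPDATE SKIP LOCKED
)
DELETE FROM tx_outbox_table t
USING batch b
WHERE t.id = b.id
RETURNING t.id, t.payload;
If the SQS send then fails, the relay rolls back and the deleted rows reappear for the next attempt.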

How to invoke a trigger on all rows manually in postgres

My trigger is defined the following way:
CREATE TRIGGER update_contract_finished_at
AFTER INSERT OR DELETE OR UPDATE OF performed_on
ON task
FOR EACH ROW
EXECUTE PROCEDURE update_contract_finished_at_function();
I now want to invoke this trigger manually to set the values it maintains. How do I do that?
Something like
for each row in task
execute procedure update_contract_finished_at_function();
I know I can update with a standard UPDATE ... SET statement. I also want to verify that my trigger works correctly on all the data.
I'd write a slightly modified copy of update_contract_finished_at_function that takes type task as input and returns void.
Then replace NEW in the trigger function with $1 and call the function like this:
SELECT copy_func(task) FROM task;
If the functions are almost identical, that should be good enough to test the trigger function.
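A hedged sketch of that approach; the body here is hypothetical, since the question doesn't show update_contract_finished_at_function's actual logic:
CREATE OR REPLACE FUNCTION copy_func(t task)
RETURNS void AS
$$
BEGIN
    -- same body as the trigger function, with NEW replaced by t
    UPDATE contract c
    SET    finished_at = t.performed_on   -- hypothetical logic
    WHERE  c.id = t.contract_id;
END;
$$ LANGUAGE plpgsql;

-- run it once per existing row:
SELECT copy_func(task) FROM task;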
The way to manually trigger your on update trigger once would be:
UPDATE task SET performed_on = performed_on
however, depending on how complicated your logic is and how many rows the table holds, a separate query might be significantly faster for initializing a large number of rows.
Since you mentioned you want to test the behaviour of your trigger, you can clone the table, or do a table or database dump and restore the data afterwards. If this is a live system, you should instead dump the database, restore it to another system, add your trigger, test it, and repeat from the restore until you nail it. Only after you're sure it does what you want should you update the live system.
I ended up writing a PL/pgSQL function that processes all events in chronological order in a loop, and calling it:
create or replace function process_event_history()
returns void
language plpgsql
as
$$
declare
    event record;
begin
    for event in
        select id, timestamp
        from events
        order by timestamp
    loop
        -- no-op update: fires the ON UPDATE trigger once per row, in order
        update events set timestamp = event.timestamp
        where id = event.id;
    end loop;
end;
$$;
--;;
-- Execute the above function causing the trigger to run for all events.
select process_event_history();
--;;
-- Remove the temporary processing function.
drop function process_event_history();

PostgreSQL BEFORE INSERT trigger locking behavior in a concurrent environment

I have a general function that can manipulate the sequence of any table (why is irrelevant to my question). It reads the current value, works out the new value, sets it, and returns its calculation, which is what's inserted. This is obviously a multi-step process.
I call it from a BEFORE INSERT trigger on tables where I need it.
All I need to know is am I guaranteed that the function will be called by only one caller at a time in a multi-user environment?
Specifically, does the BEFORE INSERT trigger have to complete before it is called again by another caller?
Logically, I would assume yes, but one never knows what may be going on under the hood.
If the answer is no, what minimal locking would I need on the function to guarantee I can read and write the sequence in a "thread-safe" manner?
I'm using PG 10.
EDIT
Here is the function updated with a lock:
CREATE OR REPLACE FUNCTION public.uts_set()
RETURNS TRIGGER AS
$$
DECLARE
    sv int8;
    seq text := format('%I.%I_uts_seq', tg_table_schema, tg_table_name);
BEGIN
    EXECUTE format('LOCK TABLE %I IN ROW EXCLUSIVE MODE;', tg_table_name);
    EXECUTE 'SELECT last_value+1 FROM ' || seq INTO sv; -- currval(seq) isn't useable
    PERFORM setval(seq, GREATEST(sv, (EXTRACT(epoch FROM localtimestamp) * 1000000)::int8), false);
    RETURN NEW; -- returning NULL from a BEFORE ROW trigger would silently skip the insert
END;
$$ LANGUAGE plpgsql;
However, the INSERT that fires the trigger already acquires ROW EXCLUSIVE on the table, so this LOCK statement may be redundant and a stronger mode may be needed. Or, conversely, it may mean no explicit lock is needed.
UPDATE
If I am reading this SO question correctly, my original version without the LOCK should work since the trigger acquires the same lock my updated function is redundantly taking.
All I need to know is am I guaranteed that the function will be called by only one caller at a time in a multi-user environment?
No. There is no such guarantee for the function calls themselves, but you can achieve this behaviour with the SERIALIZABLE transaction isolation level:
This level emulates serial transaction execution for all committed
transactions; as if transactions had been executed one after another,
serially, rather than concurrently
But this approach would introduce several tradeoffs, such as preparing your application to retry transactions that fail with serialization errors.
Maybe I missed something, but I really believe that you just need NEXTVAL, something like below:
CREATE OR REPLACE FUNCTION public.uts_set()
RETURNS TRIGGER AS
$$
DECLARE
    sv int8;
    -- First, use the %I wildcard for identifiers instead of %s
    seq text := format('%I.%I', tg_table_schema, tg_table_name || '_uts_seq');
BEGIN
    -- Second, you can't call CURRVAL in a session
    -- that hasn't issued NEXTVAL first
    sv := NEXTVAL(seq);
    -- Do your logic here...
    -- Return NEW so the row is inserted; returning NULL
    -- from a BEFORE ROW trigger would skip it
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Remember that CURRVAL is session-local while NEXTVAL acts on the sequence globally, so you have a reliable thread-safe mechanism in hand.
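To illustrate the scoping (the sequence name is illustrative):
-- session A
SELECT nextval('public.tbl_uts_seq');  -- advances the sequence for all sessions
-- session B
SELECT nextval('public.tbl_uts_seq');  -- guaranteed a distinct, never-repeated value
SELECT currval('public.tbl_uts_seq');  -- session B's own last value, unaffected by A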
The sequence itself handles thread safety across concurrent sessions, so it really comes down to the code that is interacting with the sequence. The following code is thread safe:
SELECT nextval('myseq');
If the code is doing much fancier things with setval and currval, I would be more worried about that being done in a high-transaction, multi-user environment. Even so, the sequence itself should be locked against other sessions while it is being manipulated.

How can I send column values as the payload in a postgresql NOTIFY message?

If an entry in a table satisfies certain conditions, a NOTIFY is sent out. I want the payload to include the ID number and several other columns of information. Is there a postgres method to convert variables (OLD.ColumnID, etc) to strings?
using postgres 9.3
#klin is correct that NOTIFY doesn't support anything other than string literals. However, there is a function pg_notify() which takes normal arguments to deal with exactly this situation. It's been around since at least 9.0, and that link is to the official documentation - always worth reading carefully, as there is a wealth of information there.
My guess is that the notify has to be done within a trigger function. Use a dynamic query, e.g.
execute format('notify channel, ''id: %s''', old.id);
The solution was to upgrade Postgres to a version that supported JSON.
PostgreSQL 9.3 already supports JSON. You could have just used row_to_json(payload)::text
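A minimal sketch of that combination in a row-level UPDATE trigger function (channel and function name are illustrative; note that a NOTIFY payload must be shorter than 8000 bytes by default):
CREATE OR REPLACE FUNCTION notify_row() RETURNS trigger AS
$$
BEGIN
    -- serialize the whole old row to JSON text and use it as the payload (9.3+)
    PERFORM pg_notify('row_changes', row_to_json(OLD)::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;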
Sorry for the long answer, I just can't walk away without reacting to the other answers too.
The format version fails in many ways. Before EXECUTE, you should prepare the plan. The "pseudo command" does not fit the syntax of EXECUTE, which is
EXECUTE somepreparedplanname (parameter1, ...)
The %s in format is also bad: this way you invite SQL injection attacks. When constructing a query with format, you need to use %L for literals, %I for column/table/function/etc. identifiers, and %s almost never.
The other solution with the pg_notify function is correct. Try
LISTEN channel;
SELECT pg_notify('channel', 'Id: ' || pg_backend_pid());
in psql command line.
So back to the original question, sdemurjian:
It's not clarified in the question whether you want to use this notification in some trigger function, so here is an example (maybe not) for you (because I'm a little late; sorry for that too):
CREATE TABLE columns("columnID" oid, "columnData" text);
CREATE FUNCTION column_trigger_func() RETURNS TRIGGER AS
$$ BEGIN PERFORM pg_notify('columnchannel', 'Id: '||OLD."columnID");
RETURN NEW; END; $$ LANGUAGE plpgsql;
CREATE TRIGGER column_notify BEFORE UPDATE ON columns FOR EACH ROW
EXECUTE PROCEDURE column_trigger_func();
LISTEN columnchannel;
INSERT INTO columns VALUES(1,'testdata');
-- NOTIFY is transactional: the payload is delivered only on COMMIT...
BEGIN; UPDATE columns SET "columnData" = 'success'; END;
-- ...and dropped on ROLLBACK.
BEGIN; UPDATE columns SET "columnData" = 'fail'; ROLLBACK;
Please note that in early Postgres versions (anything before 9.0), the NOTIFY command does not accept a payload and there is no pg_notify function.
In 8.1 the trigger function still works if you define it like
CREATE FUNCTION column_trigger_func() RETURNS TRIGGER AS
$$ BEGIN NOTIFY columnchannel; RETURN NEW; END; $$ LANGUAGE plpgsql;

Sanitize input to a column in postgres

So, I think this should be fairly simple, but the documentation makes it seem somewhat more complicated. I've written an SQL function in PostgreSQL (8.1, for now) which does some cleanup on some string input. For what it's worth, the string is an LDAP distinguished name, and I want there to consistently be no spaces after the commas - and the function is clean_dn(), which returns the cleaned DN. I want to do the same thing to force all input to another couple of columns to lower case, etc - which should be easy once I figure this part out.
Anyway, I want this function to be run on the "dn" column of a table any time anyone attempts to insert into or update that column. But all the rule examples I can find seem to assume that insert/update queries always modify all the columns in a table, which in my situation is not the case. What I think I really want is a constraint that changes the value rather than returning true or false, but that doesn't seem to fit the SQL idea of a constraint. Do I have my rule do an UPDATE on the NEW table? Do I have to create a new rule for every possible combination of NEW values? And if I add a column, do I have to go through and update all of my rule combinations to reflect every possible new combination of columns?
There has to be an easy way...
First, update to a current version of PostgreSQL. 8.1 is long dead and forgotten and unsupported and very, very old .. you get my point? Current version is PostgreSQL 9.2.
Then, use a trigger instead of a rule. It's simpler. It's the way most people go. I do.
For column col in table tbl ...
First, create a trigger function:
CREATE OR REPLACE FUNCTION trg_tbl_insupbef()
RETURNS trigger AS
$BODY$
BEGIN
    NEW.col := f_myfunc(NEW.col); -- your function here, must return matching type
    RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Then use it in a trigger.
For ancient Postgres 8.1:
CREATE TRIGGER insupbef
BEFORE INSERT OR UPDATE
ON tbl
FOR EACH ROW
EXECUTE PROCEDURE trg_tbl_insupbef();
For modern day Postgres (9.0+)
CREATE TRIGGER insbef
BEFORE INSERT OR UPDATE OF col -- only call trigger, if column was updated
ON tbl
FOR EACH ROW
EXECUTE PROCEDURE trg_tbl_insupbef();
You could pack more stuff into one trigger, but then you can't condition the UPDATE trigger on just the one column ...
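For completeness, a hedged sketch of what the cleanup function itself could look like; the rules (no whitespace after the commas of an LDAP DN, lower case) are taken from the question, but the name and regex are assumptions:
CREATE OR REPLACE FUNCTION clean_dn(dn text)
RETURNS text AS
$$
    -- strip whitespace after each comma of the DN, then force lower case
    SELECT lower(regexp_replace($1, ',\s+', ',', 'g'));
$$ LANGUAGE sql IMMUTABLE;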