Postgres: raise exception from trigger if column is in INSERT or UPDATE statement - postgresql

I want to audit created_by, created_timestamp, modified_by, and modified_timestamp columns in my PostgreSQL table with triggers. Creating BEFORE INSERT and BEFORE UPDATE triggers to set these values to current_user and now() is reasonably straightforward.
However, if someone tries to do:
INSERT INTO SOMETABLE(someColumn, created_by) VALUES ('test', 'someOtherUser');
I'd rather throw an exception like 'Manually setting created_by in an INSERT query is not allowed.' instead of having the trigger silently change 'someOtherUser' to current_user.
I thought I could accomplish this in the trigger with:
if new.created_by is not null then
    raise exception 'Manually setting created_by in an INSERT query is not allowed.';
end if;
This works as expected for INSERT queries and triggers.
However, the same strategy is more difficult for UPDATE triggers, because the NEW record contains the unchanged values from the existing row in addition to the values changed by the UPDATE query. (At least, I think that's what's happening.)
I can compare new.created_by to old.created_by to ensure they're the same, thus preventing the query from changing the value, but even though the end result is similar (i.e. the value in the table doesn't get changed), this really isn't the same as disallowing the column from being in the UPDATE query at all.
Is there an elegant way to determine if a column is present in the INSERT or UPDATE query? I've seen some suggestions here to convert to JSON and test that way, but that seems to be a rather ugly solution to me.
Are there other solutions to ensure these columns (created_by, created_timestamp, etc.) are only set by the trigger functions and are not manually settable in INSERT and UPDATE queries?

Create a special trigger for UPDATE with a name that is early in the alphabet, so that it is called before your other trigger (triggers on the same table and event fire in alphabetical order by name):
CREATE FUNCTION yell() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   RAISE EXCEPTION 'direct update of "created_by" is forbidden';
END;$$;

CREATE TRIGGER aa_nosuchupdate
   BEFORE UPDATE OF created_by ON sometable
   FOR EACH ROW
   EXECUTE PROCEDURE yell();
The INSERT case can be handled in your other trigger.
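For illustration, a sketch of what that BEFORE INSERT trigger could look like, reusing SOMETABLE and the column names from the question (the function and trigger names here are made up for the example):

CREATE FUNCTION set_audit_columns() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   -- reject explicit values, then fill the audit columns ourselves
   IF NEW.created_by IS NOT NULL OR NEW.created_timestamp IS NOT NULL THEN
      RAISE EXCEPTION 'Manually setting audit columns in an INSERT query is not allowed.';
   END IF;
   NEW.created_by := current_user;
   NEW.created_timestamp := now();
   RETURN NEW;
END;$$;

CREATE TRIGGER set_audit_columns_on_insert
   BEFORE INSERT ON sometable
   FOR EACH ROW
   EXECUTE PROCEDURE set_audit_columns();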

Related

Trigger sometimes fails with duplicate key error

I'm using a PostgreSQL RDS instance in AWS. Basically, there is a query that inserts data into a first table, let's call it table. The data there can have duplicates in some fields (except for the primary key obviously).
Then there is the trigger that updates another table, infotable, allowing no duplicates.
The trigger:
CREATE TRIGGER insert_infotable AFTER INSERT ON table
FOR EACH ROW EXECUTE PROCEDURE insert_infotable();
The relevant part of the trigger function looks like this:
CREATE OR REPLACE FUNCTION insert_infotable() RETURNS trigger AS $insert_infotable$
BEGIN
--some irrelevant code
IF NOT EXISTS (SELECT * FROM infotable WHERE col1 = NEW.col1 AND col2 = NEW.col2) THEN
INSERT INTO infotable(col1, col2, col3, col4, col5, col6) values (--some values--);
END IF;
RETURN NEW;
END;
$insert_infotable$ LANGUAGE plpgsql;
The table infotable has a UNIQUE constraint on the columns col1 and col2.
In general everything works fine, but rarely, about once in 1,000 inserts, the trigger fails with the error 'duplicate key value violates unique constraint "unique_col1_and_col2"' on infotable, which shouldn't happen since the trigger function has the IF NOT EXISTS check.
The first question is: what might be the cause of this? The only thing I can think of is a race where two users insert the same info simultaneously: both fire the trigger, one of them updates the second table, and the second user gets the duplicate error. Because of that, that user's whole insert fails, including the insert into the main table.
If that's the case, what can I do about it? Is using a lock on insert a good idea for a table that is supposed to have 100+ users inserting data simultaneously?
And if yes, what type of lock should I use and what table should I lock -- the main table, or the second one, which gets modified by the trigger? (or I guess should I have the lock with my main insert statement or inside the trigger function?)
Yes, this is a race condition. Two such triggers running concurrently won't see each other's modifications, because the transactions are not yet committed.
Since you have a unique constraint on infotable, you can simply use
INSERT INTO infotable ...
ON CONFLICT (col1, col2) DO NOTHING;
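With the unique constraint doing the arbitration, the IF NOT EXISTS check can be dropped. A sketch of the adjusted trigger function, with the inserted values still elided as in the question:

CREATE OR REPLACE FUNCTION insert_infotable() RETURNS trigger AS $insert_infotable$
BEGIN
    --some irrelevant code
    -- the unique index resolves concurrent inserts; the losing row is simply skipped
    INSERT INTO infotable(col1, col2, col3, col4, col5, col6)
    VALUES (/* some values, as in the original function */)
    ON CONFLICT (col1, col2) DO NOTHING;
    RETURN NEW;
END;
$insert_infotable$ LANGUAGE plpgsql;

Unlike the SELECT-then-INSERT approach, the conflict is resolved by the unique index itself, so two concurrent transactions can no longer both pass the check and then collide.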

Postgres trigger that will in any case set a value on modification?

I have a web application that has a modified field in the important tables, so that I can track when any modification was done, e.g. (never mind the ;;, it is there because this PostgreSQL code is executed from a Scala framework that uses ; as a separator and ;; to escape it):
CREATE TABLE security_permission (
id BIGSERIAL,
value VARCHAR(255) NOT NULL,
modified TIMESTAMP DEFAULT now(),
PRIMARY KEY (id)
);
CREATE OR REPLACE FUNCTION update_modified()
RETURNS TRIGGER AS $$
BEGIN
NEW.modified = now();;
RETURN NEW;;
END;;
$$ language 'plpgsql';
CREATE TRIGGER update_modified_security_permission
BEFORE UPDATE ON security_permission
FOR EACH ROW EXECUTE PROCEDURE update_modified();
The problem is that this works only if the field is NOT specified in the insert/update statement. If the field is specified, even as NULL, then modified is not set. I do not have full control of the generated statements because an ORM framework generates them automatically, but I'd nevertheless like to always set the modified field. How can I do that?
I have tried using BEFORE INSERT OR UPDATE ON and AFTER INSERT OR UPDATE ON, but nothing seems to work if the field is present in the insert/update statement, even as NULL. How can I do this?
Define the trigger as before update or insert:
CREATE TRIGGER update_modified_security_permission
BEFORE UPDATE OR INSERT ON security_permission
FOR EACH ROW EXECUTE PROCEDURE update_modified();
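Because it is a BEFORE trigger, update_modified() overwrites whatever value the statement supplied (including NULL) before the row is written. A quick sketch to verify, reusing the table, function, and trigger definitions from the question and answer above (the test values are only illustrative):

-- even when the statement includes modified explicitly (or as NULL),
-- the BEFORE trigger replaces it with now() before the row is stored
INSERT INTO security_permission (value, modified) VALUES ('test', NULL);
UPDATE security_permission SET value = 'test2', modified = NULL WHERE id = 1;
SELECT id, value, modified FROM security_permission;  -- modified holds now()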

create trigger in PostgreSQL

So I found this example:
create function eager.account_insert() returns trigger
security definer
language plpgsql
as $$
begin
insert into eager.account_balances(name) values(new.name);
return new;
end;
$$;
create trigger account_insert after insert on accounts
for each row execute procedure eager.account_insert();
The thing I can't understand: the function eager.account_insert() does not take any arguments, yet it operates on the variable new. It returns it, but shouldn't it return a trigger?
Also, this: insert into eager.account_balances(name) does not pick any particular record, so what is it?
new (and old, when it's an update or delete statement) is the RECORD you're inserting or updating. You can read its columns and do whatever you want with them; BEFORE INSERT triggers often check for valid values, etc.
For BEFORE (and INSTEAD OF) row triggers, the function must return a record with the same columns as the table, or NULL if the operation should be skipped; for AFTER triggers the return value is ignored.
The insert statement is just a regular insert that specifies one column of the target table, with the value taken from the name column of the newly inserted record.
The documentation explains triggers very well.
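For instance, a minimal sketch of a BEFORE INSERT trigger that inspects NEW and skips the row by returning NULL (the function name and the blank-name rule are made up; the accounts table and name column are taken from the question):

CREATE FUNCTION skip_blank_names() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- NEW is the row about to be inserted; we can inspect or modify it
    IF NEW.name IS NULL OR NEW.name = '' THEN
        RETURN NULL;   -- returning NULL from a BEFORE trigger skips this insert
    END IF;
    RETURN NEW;        -- returning the (possibly modified) row lets it proceed
END;
$$;

CREATE TRIGGER accounts_skip_blank
BEFORE INSERT ON accounts
FOR EACH ROW EXECUTE PROCEDURE skip_blank_names();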

postgres autoincrement not updated on explicit id inserts

I have the following table in postgres:
CREATE TABLE "test" (
"id" serial NOT NULL PRIMARY KEY,
"value" text
)
I am doing following insertions:
insert into test (id, value) values (1, 'alpha')
insert into test (id, value) values (2, 'beta')
insert into test (value) values ('gamma')
In the first 2 inserts I am explicitly mentioning the id. However the table's auto increment pointer is not updated in this case. Hence in the 3rd insert I get the error:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(1) already exists.
I never faced this problem in MySQL with either the MyISAM or InnoDB engine. Explicit or not, MySQL always updates the autoincrement pointer based on the max row id.
What is the workaround for this problem in postgres? I need it because I want a tighter control for some ids in my table.
UPDATE:
I need it because for some values I need to have a fixed id. For other new entries I don't mind creating new ones.
I think it may be possible by manually incrementing the nextval pointer to max(id) + 1 whenever I am explicitly inserting the ids. But I am not sure how to do that.
That's how it's supposed to work: nextval('test_id_seq') is only called when the system needs a value for this column and you have not provided one. If you provide a value, no such call is performed, and consequently the sequence is not "updated".
You could work around this by manually setting the value of the sequence after your last insert with explicitly provided values:
SELECT setval('test_id_seq', (SELECT MAX(id) from "test"));
The name of the sequence is autogenerated and is always tablename_columnname_seq.
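If you would rather not hardcode the sequence name, pg_get_serial_sequence can look it up for you, e.g.:
SELECT setval(pg_get_serial_sequence('test', 'id'), (SELECT MAX(id) FROM "test"));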
In the recent version of Django, this topic is discussed in the documentation:
Django uses PostgreSQL’s SERIAL data type to store auto-incrementing primary keys. A SERIAL column is populated with values from a sequence that keeps track of the next available value. Manually assigning a value to an auto-incrementing field doesn’t update the field’s sequence, which might later cause a conflict.
Ref: https://docs.djangoproject.com/en/dev/ref/databases/#manually-specified-autoincrement-pk
There is also the management command manage.py sqlsequencereset app_label ..., which generates SQL statements for resetting sequences for the given app name(s).
Ref: https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-sqlsequencereset
For example these SQL statements were generated by manage.py sqlsequencereset my_app_in_my_project:
BEGIN;
SELECT setval(pg_get_serial_sequence('"my_project_aaa"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_aaa";
SELECT setval(pg_get_serial_sequence('"my_project_bbb"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_bbb";
SELECT setval(pg_get_serial_sequence('"my_project_ccc"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_ccc";
COMMIT;
It can be done automatically using a trigger. This way you are sure that the largest value is always used as the next default value.
CREATE OR REPLACE FUNCTION set_serial_id_seq()
RETURNS trigger AS
$BODY$
BEGIN
EXECUTE (FORMAT('SELECT setval(''%s_%s_seq'', (SELECT MAX(%s) from %s));',
TG_TABLE_NAME,
TG_ARGV[0],
TG_ARGV[0],
TG_TABLE_NAME));
RETURN OLD;
END;
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER set_mytable_id_seq
AFTER INSERT OR UPDATE OR DELETE
ON mytable
FOR EACH STATEMENT
EXECUTE PROCEDURE set_serial_id_seq('mytable_id');
The function can be reused for multiple tables. Change "mytable" to the table of interest.
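For the test table from the original question, the hookup would presumably look like this (the argument is the column name, from which the function derives the sequence name test_id_seq):

CREATE TRIGGER set_test_id_seq
AFTER INSERT OR UPDATE OR DELETE
ON test
FOR EACH STATEMENT
EXECUTE PROCEDURE set_serial_id_seq('id');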
For more info regarding triggers:
https://www.postgresql.org/docs/9.1/plpgsql-trigger.html
https://www.postgresql.org/docs/9.1/sql-createtrigger.html

Need help writing a PostgreSQL trigger function

I have two tables representing two different types of imagery. I am using PostGIS to represent the boundaries of those images. Here is a simplified example of those tables:
CREATE TABLE img_format_a (
id SERIAL PRIMARY KEY,
file_path VARCHAR(1000),
boundary GEOGRAPHY(POLYGON, 4326)
);
CREATE TABLE img_format_p (
id SERIAL PRIMARY KEY,
file_path VARCHAR(1000),
boundary GEOGRAPHY(POLYGON, 4326)
);
I also have a cross reference table, which I want to contain all the IDs of the images that overlap each other. Whenever an image of type "A" gets inserted into the database, I want to check to see whether it overlaps any of the existing imagery of type "P" (and vice versa) and insert corresponding entries into the img_a_img_p cross reference table. This table should represent a many-to-many relationship.
My first instinct is to write a trigger to manage this img_a_img_p table. I've never created a trigger before, so let me know if this is a silly thing to do, but it seems to make sense to me. So I create the following trigger:
CREATE TRIGGER update_a_p_cross_reference
AFTER INSERT OR DELETE OR UPDATE OF boundary
ON img_format_p FOR EACH ROW
EXECUTE PROCEDURE check_p_cross_reference();
The part where I am getting stuck is with writing the trigger function. My code is in Java and I see that there are tools like PL/pgSQL, but I'm not sure if that's what I should use or if I even need one of those special add-ons.
Essentially all I need the trigger to do is update the cross reference table each time a new image gets inserted into either img_format_a or img_format_p. When a new image is inserted, I would like to use a PostGIS function like ST_Intersects to determine whether the new image overlaps with any of the images in the other table. For each image pair where ST_Intersects returns true, I would like to insert a new entry into img_a_img_p with the IDs of both images. Can someone help me figure out how to write this trigger function? Here is some pseudocode:
SELECT * FROM img_format_p P
WHERE ST_Intersects(A.boundary, P.boundary);
for each match in selection {
INSERT INTO img_a_img_p VALUES (A.id, P.id);
}
You could wrap the usual INSERT ... SELECT idiom in a PL/pgSQL function sort of like this:
create function check_p_cross_reference() returns trigger as
$$
begin
    insert into img_a_img_p (img_a_id, img_p_id)
    select a.id, p.id
    from img_format_a a, img_format_p p
    where p.id = NEW.id
      and ST_Intersects(a.boundary, p.boundary);
    return null;
end;
$$ language plpgsql;
Triggers have two extra variables, NEW and OLD:
NEW
Data type RECORD; variable holding the new database row for INSERT/UPDATE operations in row-level triggers. This variable is NULL in statement-level triggers and for DELETE operations.
OLD
Data type RECORD; variable holding the old database row for UPDATE/DELETE operations in row-level triggers. This variable is NULL in statement-level triggers and for INSERT operations.
So you can use NEW.id to access the new img_format_p row that's going in. You (currently) can't use the plain SQL language for triggers:
It is not currently possible to write a trigger function in the plain SQL function language.
but PL/pgSQL is pretty close. This would make sense as an AFTER INSERT trigger:
CREATE TRIGGER update_a_p_cross_reference
AFTER INSERT
ON img_format_p FOR EACH ROW
EXECUTE PROCEDURE check_p_cross_reference();
Deletes could be handled with a foreign key on img_a_img_p and a cascading delete. You could use your trigger for UPDATEs as well:
CREATE TRIGGER update_a_p_cross_reference
AFTER INSERT OR UPDATE OF boundary
ON img_format_p FOR EACH ROW
EXECUTE PROCEDURE check_p_cross_reference();
but you'd probably want to clear out the old entries before inserting the new ones with something like:
delete from img_a_img_p where img_p_id = NEW.id;
before the INSERT...SELECT statement.
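Putting that together, a sketch of the function extended to cover UPDATEs as well (same table and column names as above; on a fresh INSERT the DELETE simply matches no rows):

create or replace function check_p_cross_reference() returns trigger as
$$
begin
    -- remove stale pairs for this image (a no-op on INSERT)
    delete from img_a_img_p where img_p_id = NEW.id;

    insert into img_a_img_p (img_a_id, img_p_id)
    select a.id, NEW.id
    from img_format_a a
    where ST_Intersects(a.boundary, NEW.boundary);

    return null;
end;
$$ language plpgsql;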