Currently running version 9.5.3. Update planned, of course.
I have a PostgreSQL database whose schema pre-dates table row-level security (i.e. CREATE POLICY ...). Row-level security was implemented using views. The security is done in the view by selecting only rows that have the ownername matching CURRENT_USER.
I'm trying to build an upsert query against such a view. The problem with ON CONFLICT ... DO UPDATE comes when I try to name the conflict_target, i.e. to tell PostgreSQL which constraint has been violated.
Toy Example
CREATE TABLE foo (id serial, num int, word text, data text, ownername varchar(64));
For each user, the combinations of word and num must be unique.
CREATE UNIQUE INDEX foo_num_word_owner_idx ON foo (num, word, ownername);
The row-level security is implemented using a view based on the current user name. Permission is granted on the view and revoked on the underlying table for ordinary users. security_barrier was added after v 9.5. Note that users don't see ownername.
CREATE VIEW foo_user WITH (security_barrier = True) AS
SELECT id, num, word, data FROM foo
WHERE foo.ownername = CURRENT_USER;
Now auto-set ownername:
CREATE OR REPLACE FUNCTION trf_set_owner() RETURNS trigger AS
$$
BEGIN
IF (TG_OP = 'INSERT') THEN
NEW.ownername = CURRENT_USER::varchar(64);
END IF;
IF (TG_OP = 'UPDATE') THEN
NEW.ownername = CURRENT_USER::varchar(64);
END IF;
RETURN NEW;
END;
$$
LANGUAGE 'plpgsql';
CREATE TRIGGER foo_row_owner
BEFORE INSERT OR UPDATE ON foo FOR EACH ROW
EXECUTE PROCEDURE trf_set_owner();
Note that the ownername column is not displayed in the view; the row security is invisible to the user.
Now add some data:
INSERT INTO foo_user (num, word, data) VALUES (1, 'asdf', 'cat'), (2, 'qwer', 'dog');
SELECT * FROM foo;
-- normally, this would give an error related to privileges,
-- because we don't allow users to query the underlying table.
-- bypassed here for demo purposes.
id | num | word | data | ownername
----+-----+------+------+-----------
1 | 1 | asdf | cat | admin
2 | 2 | qwer | dog | admin
(2 rows)
SELECT * FROM foo_user;
id | num | word | data
----+-----+------+------
1 | 1 | asdf | cat
2 | 2 | qwer | dog
(2 rows)
So far, so good.
What I've Tried
As stated above, for each user, num and word must be unique. There is no problem with different owners having the same num and word (in fact, we expect it).
I'm trying to take advantage of the ON CONFLICT clause in INSERT to create some back-end UPSERT-ish functionality. And it's falling down.
Simple example of error
First, a simple failed insert:
INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog');
ERROR: duplicate key value violates unique constraint "foo_num_word_owner_idx"
DETAIL: Key (num, word, ownername)=(2, qwer, admin) already exists.
Entirely expected. Nothing wrong with that.
ON CONFLICT, first attempt
Now we try to make the client experience a bit smoother:
INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog')
ON CONFLICT DO UPDATE
SET data = 'frog'
WHERE num = 2 AND word = 'qwer';
ERROR: ON CONFLICT DO UPDATE requires inference specification or constraint name
LINE 2: ON CONFLICT DO UPDATE
^
HINT: For example, ON CONFLICT (column_name).
Yep, just like the documentation says. It needs to know what rule it broke. No problem:
ON CONFLICT, second attempt
INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog')
ON CONFLICT (num, word, ownername) DO UPDATE
SET data = 'frog'
WHERE num = 2 AND word = 'qwer';
ERROR: column "ownername" does not exist
LINE 2: ON CONFLICT (num, word, ownername) DO UPDATE
True. Ownername does not exist in the view. We can't drop ownername from the unique index, because we fully expect different owners to have identical num and word values.
ON CONFLICT, third attempt
So I tried converting the index to a constraint, and naming the constraint:
ALTER TABLE foo
ADD CONSTRAINT foo_num_word_owner_crt UNIQUE
USING INDEX foo_num_word_owner_idx;
NOTICE: ALTER TABLE / ADD CONSTRAINT USING INDEX will rename index
"foo_num_word_owner_idx" to "foo_num_word_owner_crt"
Ok, now to test:
INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog')
ON CONFLICT ON CONSTRAINT foo_num_word_owner_crt DO UPDATE
SET data = 'frog'
WHERE num = 2 AND word = 'qwer';
ERROR: constraint "foo_num_word_owner_crt" for table "foo_user" does not exist
Well that makes sense: we're querying on the view but specifying a table constraint.
Conclusion
Now I'm out of ideas. How do we get ON CONFLICT to play nice with views like this? Or is it not possible?
I'm this close (holds up thumb and forefinger) to proposing we switch from views to tables with row-level security, but that's rather a lot of work (not necessarily an API-breaker, but still).
Any insights are much appreciated.
You can circumvent the issue by removing the ON CONFLICT clause and using an INSTEAD OF trigger that manually tests for any index conflict. Because an INSTEAD OF trigger replaces the operation on the view, the function has to perform the INSERT on foo itself:
CREATE OR REPLACE FUNCTION trf_set_num_word() RETURNS trigger AS $$
BEGIN
-- Check if (num, word, ownername) exists by trying an UPDATE
UPDATE foo SET data = NEW.data
WHERE num = NEW.num AND word = NEW.word AND ownername = CURRENT_USER::varchar(64);
IF NOT FOUND THEN
-- No existing row: do the INSERT (the trigger on foo sets ownername)
INSERT INTO foo (num, word, data) VALUES (NEW.num, NEW.word, NEW.data);
END IF;
RETURN NEW; -- a non-NULL result reports the row as processed
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER foo_user_num_word
INSTEAD OF INSERT ON foo_user FOR EACH ROW
EXECUTE PROCEDURE trf_set_num_word();
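As an alternative to the manual check (not something to install alongside the trigger above), the INSTEAD OF trigger could forward the row to the base table and let ON CONFLICT do the work there, since ownername is a real column on foo. This is only a sketch: the names trf_foo_user_upsert and foo_user_upsert are made up for illustration, and it assumes the role running the INSERT is allowed to write to foo from inside the trigger function, which the setup in the question does not grant by itself.

```sql
-- Hypothetical names: trf_foo_user_upsert, foo_user_upsert.
CREATE OR REPLACE FUNCTION trf_foo_user_upsert() RETURNS trigger AS $$
BEGIN
    -- On the base table the full unique key (num, word, ownername)
    -- can be named as the conflict target.
    INSERT INTO foo (num, word, data, ownername)
    VALUES (NEW.num, NEW.word, NEW.data, CURRENT_USER::varchar(64))
    ON CONFLICT (num, word, ownername) DO UPDATE
        SET data = EXCLUDED.data
    RETURNING id INTO NEW.id;
    RETURN NEW;  -- non-NULL: the row counts as processed
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER foo_user_upsert
    INSTEAD OF INSERT ON foo_user FOR EACH ROW
    EXECUTE PROCEDURE trf_foo_user_upsert();
```

With a trigger like this in place, a plain INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog'); would be expected to update the existing row instead of raising the duplicate key error.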
Related
I am trying to write a trigger with reference to the Postgres docs, but it will not even let me create a trigger based on TRUNCATE. I tried different approaches, but none of them worked.
CREATE TRIGGER delete_after_test
AFTER truncate
ON tableA
FOR EACH ROW
EXECUTE PROCEDURE delete_after_test3();
Function:
CREATE OR REPLACE FUNCTION econnect.delete_after_test3()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
declare
query text;
begin
insert into econnect.delete_after_test_2 (
"name",
age1,
log_time
)
values
(
old."name",
old.age1,
CURRENT_TIMESTAMP
)
;
return old;
END;
$function$
;
Reference: https://www.postgresql.org/docs/current/sql-createtrigger.html
"TRUNCATE will not fire any ON DELETE triggers that might exist for the tables. But it will fire ON TRUNCATE triggers. If ON TRUNCATE triggers are defined for any of the tables, then all BEFORE TRUNCATE triggers are fired before any truncation happens, and all AFTER TRUNCATE triggers are fired after the last truncation is performed and any sequences are reset. The triggers will fire in the order that the tables are to be processed (first those listed in the command, and then any that were added due to cascading)"
A solution using ON DELETE:
create table delete_test(id integer, fld1 varchar, fld2 boolean);
create table delete_test_save(id integer, fld1 varchar, fld2 boolean);
insert into delete_test values (1, 'test', 't'), (2, 'dog', 'f'), (3, 'cat', 't');
CREATE OR REPLACE FUNCTION public.delete_save()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
INSERT INTO delete_test_save SELECT * FROM del_recs;
RETURN NULL; -- the return value is ignored for statement-level triggers
END;
$function$;
CREATE TRIGGER trg_del_save
AFTER DELETE ON delete_test referencing OLD TABLE AS del_recs FOR EACH statement
EXECUTE FUNCTION delete_save ();
delete from delete_test;
DELETE 3
select * from delete_test;
id | fld1 | fld2
----+------+------
(0 rows)
select * from delete_test_save;
id | fld1 | fld2
----+------+------
1 | test | t
2 | dog | f
3 | cat | t
The example uses a transition relation (referencing OLD TABLE AS del_recs) to collect all the deleted records for use in the function. Then it is possible to do the INSERT INTO delete_test_save SELECT * FROM del_recs; to transfer the records to the other table. Transition relations will not work with a TRUNCATE trigger, however, so this approach requires DELETE rather than TRUNCATE.
Transition relations are explained here Create Trigger:
The REFERENCING option enables collection of transition relations, which are row sets that include all of the rows inserted, deleted, or modified by the current SQL statement. This feature lets the trigger see a global view of what the statement did, not just one row at a time. This option is only allowed for an AFTER trigger that is not a constraint trigger; also, if the trigger is an UPDATE trigger, it must not specify a column_name list. OLD TABLE may only be specified once, and only for a trigger that can fire on UPDATE or DELETE; it creates a transition relation containing the before-images of all rows updated or deleted by the statement. Similarly, NEW TABLE may only be specified once, and only for a trigger that can fire on UPDATE or INSERT; it creates a transition relation containing the after-images of all rows updated or inserted by the statement.
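If TRUNCATE itself has to be covered, one option is a statement-level BEFORE TRUNCATE trigger. TRUNCATE triggers cannot be FOR EACH ROW, and they have neither an OLD row nor a transition table to read from, so the function has to read the table directly. The sketch below reuses delete_test and delete_test_save from the example above; the names truncate_save and trg_truncate_save are made up for illustration, and it assumes that copying the whole table before every TRUNCATE is acceptable.

```sql
CREATE OR REPLACE FUNCTION truncate_save()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
    -- BEFORE TRUNCATE fires while the rows are still present,
    -- so the whole table can be copied in one statement.
    INSERT INTO delete_test_save SELECT * FROM delete_test;
    RETURN NULL;  -- ignored for statement-level triggers
END;
$function$;

CREATE TRIGGER trg_truncate_save
BEFORE TRUNCATE ON delete_test
FOR EACH STATEMENT
EXECUTE FUNCTION truncate_save();
```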
I've looked at a few methods of creating alphanumeric IDs on Stack Overflow, but they all had their weaknesses: some did not check for collisions, and others used sequences, which are not a good option when using logical replication.
After some Googling I found this website that has the following script which checks for collisions and does not use sequences. However this is done as a trigger when a row is inserted into the table.
-- Create a trigger function that takes no arguments.
-- Trigger functions automatically have OLD, NEW records
-- and TG_TABLE_NAME as well as others.
CREATE OR REPLACE FUNCTION unique_short_id()
RETURNS TRIGGER AS $$
-- Declare the variables we'll be using.
DECLARE
key TEXT;
qry TEXT;
found TEXT;
BEGIN
-- generate the first part of a query as a string with safely
-- escaped table name, using || to concat the parts
qry := 'SELECT id FROM ' || quote_ident(TG_TABLE_NAME) || ' WHERE id=';
-- This loop will probably only run once per call until we've generated
-- millions of ids.
LOOP
-- Generate our string bytes and re-encode as a base64 string.
-- gen_random_bytes() is provided by the pgcrypto extension.
key := encode(gen_random_bytes(6), 'base64');
-- Base64 encoding contains 2 URL unsafe characters by default.
-- The URL-safe version has these replacements.
key := replace(key, '/', '_'); -- url safe replacement
key := replace(key, '+', '-'); -- url safe replacement
-- Concat the generated key (safely quoted) with the generated query
-- and run it.
-- SELECT id FROM "test" WHERE id='blahblah' INTO found
-- Now "found" will be the duplicated id or NULL.
EXECUTE qry || quote_literal(key) INTO found;
-- Check to see if found is NULL.
-- Testing found = NULL would never succeed,
-- because (NULL = NULL) evaluates to NULL, not TRUE.
IF found IS NULL THEN
-- If we didn't find a collision then leave the LOOP.
EXIT;
END IF;
-- We haven't EXITed yet, so return to the top of the LOOP
-- and try again.
END LOOP;
-- NEW and OLD are available in TRIGGER PROCEDURES.
-- NEW is the mutated row that will actually be INSERTed.
-- We're replacing id, regardless of what it was before
-- with our key variable.
NEW.id = key;
-- The RECORD returned here is what will actually be INSERTed,
-- or what the next trigger will get if there is one.
RETURN NEW;
END;
$$ language 'plpgsql';
I have a table which already contains data, and I have added a new column called pid. Would it be possible to modify this and use the function call as the default, so that all my prior data gets a short id?
Suppose you have a table test:
DROP TABLE IF EXISTS test;
CREATE TABLE test (foo text, bar int);
INSERT INTO test (foo, bar) VALUES ('A', 1), ('B', 2);
You could add an id column to it:
ALTER TABLE test ADD COLUMN id text;
and attach the trigger:
DROP TRIGGER IF EXISTS unique_short_id_on_test ON test;
CREATE TRIGGER unique_short_id_on_test
BEFORE INSERT ON test
FOR EACH ROW EXECUTE PROCEDURE unique_short_id();
Now make a scratch table, temp (an ordinary table, despite its name), with the same structure as test (but with no data):
DROP TABLE IF EXISTS temp;
CREATE TABLE temp (LIKE test INCLUDING ALL);
CREATE TRIGGER unique_short_id_on_temp
BEFORE INSERT ON temp
FOR EACH ROW EXECUTE PROCEDURE unique_short_id();
Pouring test into temp:
INSERT INTO temp (foo, bar)
SELECT foo, bar
FROM test
RETURNING *
yields something like:
| foo | bar | id |
|------------+-----+----------|
| A | 1 | 9yt9XQwm |
| B | 2 | LCeiA-P8 |
If other tables have foreign key references on the test table or if test must remain online,
it may not be possible to drop test and rename temp to test.
Instead, it is safer to update test with the ids from temp.
Assuming test has a primary key (for concreteness, let's call it testid) and that testid was copied into temp along with foo and bar in the INSERT above, then
you could update test with the ids from temp using:
UPDATE test
SET id = temp.id
FROM temp
WHERE test.testid = temp.testid;
Then you could drop the temp table:
DROP TABLE temp;
I have a question about copying rows in PostgreSQL. My table hierarchy is quite complex, where many tables are linked to each other via foreign keys. For the sake of simplicity, I will explain my question with two tables, but please bear in mind that my actual case requires a lot more complexity.
Say I have the following two tables:
table A
(
integer identifier primary key
... -- other fields
);
table B
(
integer identifier primary key
integer a foreign key references A (identifier)
... -- other fields
);
Say A and B hold the following rows:
A(1)
B(1, 1)
B(2, 1)
My question is: I would like to create a copy of a row in A such that the related rows in B are also copied into a new row. This would give:
A(1) -- the old row
A(2) -- the new row
B(1, 1) -- the old row
B(2, 1) -- the old row
B(3, 2) -- the new row
B(4, 2) -- the new row
Basically I am looking for a COPY/INSERT CASCADE.
Is there a neat trick to achieve this more or less automatically? Maybe by using temporary tables?
I believe that if I have to write all the INSERT INTO ... FROM ... queries myself in the correct order and stuff, I might go mental.
update
Let's answer my own question ;)
I did some try-outs with the RULE mechanisms in PostgreSQL and this is what I came up with:
First, the table definitions:
drop table if exists A cascade;
drop table if exists B cascade;
create table A
(
identifier serial not null primary key,
name varchar not null
);
create table B
(
identifier serial not null primary key,
name varchar not null,
a integer not null references A (identifier)
);
Next, for each table, we create a function and corresponding rule which translates UPDATE into INSERT.
create function A(in A, in A) returns integer as
$$
declare
r integer;
begin
-- A
if ($1.identifier <> $2.identifier) then
insert into A (identifier, name) values ($2.identifier, $2.name) returning identifier into r;
else
insert into A (name) values ($2.name) returning identifier into r;
end if;
-- B
update B set a = r where a = $1.identifier;
return r;
end;
$$ language plpgsql;
create rule A as on update to A do instead select A(old, new);
create function B(in B, in B) returns integer as
$$
declare
r integer;
begin
if ($1.identifier <> $2.identifier) then
insert into B (identifier, name, a) values ($2.identifier, $2.name, $2.a) returning identifier into r;
else
insert into B (name, a) values ($2.name, $2.a) returning identifier into r;
end if;
return r;
end;
$$ language plpgsql;
create rule B as on update to B do instead select B(old, new);
Finally, some tests:
insert into A (name) values ('test_1');
insert into B (name, a) values ('test_1_child', (select identifier from a where name = 'test_1'));
update A set name = 'test_2', identifier = identifier + 50;
update A set name = 'test_3';
select * from A, B where B.a = A.identifier;
This seems to work quite fine. Any comments?
This will work. One thing I note you wisely avoided was DO ALSO rules on inserts and updates. DO ALSO with insert and update is pretty dangerous so avoid that at pretty much all cost.
On further reflection, however, triggers are not going to perform worse and offer fewer hard corners.
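For comparison, the explicit INSERT ... SELECT route that the question hoped to avoid is fairly compact for the two-table case when written as a data-modifying CTE. This is only a sketch: it assumes PostgreSQL 9.1 or later and the A and B tables defined above, and the literal 1 stands for the identifier of the row being copied. The ON UPDATE rules above do not come into play, since only INSERTs are issued.

```sql
-- Copy the A row with identifier 1 and every B row that references it.
WITH new_a AS (
    INSERT INTO A (name)
    SELECT name FROM A WHERE identifier = 1
    RETURNING identifier
)
INSERT INTO B (name, a)
SELECT B.name, new_a.identifier
FROM B
CROSS JOIN new_a
WHERE B.a = 1;
```

With deeper hierarchies, each level needs its own mapping from old keys to new keys, which this sketch does not attempt.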
I am currently using postgres 8.3. I have created a table that acts as a dirty flag table for members that exist in another table. I have applied triggers after insert or update on the members table that will insert/update a record on the modifications table with a value of true. The trigger seems to work, however I am noticing that something is flipping the boolean is_modified value. I have no idea how to go about trying to isolate what could be flipping it.
Trigger function:
BEGIN;
CREATE OR REPLACE FUNCTION set_member_as_modified() RETURNS TRIGGER AS $set_member_as_modified$
BEGIN
LOOP
-- first try to update the key
UPDATE member_modification SET is_modified = TRUE, updated = current_timestamp WHERE "memberID" = NEW."memberID";
IF FOUND THEN
RETURN NEW;
END IF;
--member doesn't exist in modification table, so insert them
-- if someone else inserts the same key concurrently, raise a unique-key failure
BEGIN
INSERT INTO member_modification("memberID",is_modified,updated) VALUES(NEW."memberID", TRUE,current_timestamp);
RETURN NEW;
EXCEPTION WHEN unique_violation THEN
-- do nothing, and loop to try the update again
END;
END LOOP;
END;
$set_member_as_modified$ LANGUAGE plpgsql;
COMMIT;
CREATE TRIGGER set_member_as_modified AFTER INSERT OR UPDATE ON members FOR EACH ROW EXECUTE PROCEDURE set_member_as_modified();
Here is the sql I run and the results:
$CREATE TRIGGER set_member_as_modified AFTER INSERT OR UPDATE ON members FOR EACH ROW EXECUTE PROCEDURE set_member_as_modified();
Results:
UPDATE 1
bluesky=# select * from member_modification;
-[ RECORD 1 ]---+---------------------------
modification_id | 14
is_modified | t
updated | 2011-05-26 09:49:47.992241
memberID | 182346
bluesky=# select * from member_modification;
-[ RECORD 1 ]---+---------------------------
modification_id | 14
is_modified | f
updated | 2011-05-26 09:49:47.992241
memberID | 182346
As you can see something flipped the is_modified value. Is there anything in postgres I can use to determine what queries/processes are acting on this table?
Are you sure you've posted everything needed? The two queries on member_modification suggest that a separate query is being run in between, which sets is_modified back to false.
You could add a text[] field to member_modification, e.g. query_trace text[] not null default '{}', and then add a before insert/update row-level trigger on that table which does something like:
NEW.query_trace := NEW.query_trace || current_query();
current_query() is not available in 8.3 (it was added in 8.4); on 8.3, see this:
http://www.postgresql.org/docs/8.3/static/monitoring-stats.html
SELECT pg_stat_get_backend_pid(s.backendid) AS procpid,
pg_stat_get_backend_activity(s.backendid) AS current_query
FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
You could then get the list of subsequent queries that affected it:
select query_trace[i]
from (select query_trace, generate_series(1, array_upper(query_trace, 1)) as i from member_modification) as t;
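Put together, the trace column and trigger described above might look roughly like this. It is a sketch under assumptions: the function and trigger name trace_member_modification is made up for illustration, and current_query() requires PostgreSQL 8.4 or later, so on 8.3 the assignment would have to be fed from the statistics functions instead.

```sql
ALTER TABLE member_modification
    ADD COLUMN query_trace text[] NOT NULL DEFAULT '{}';

-- Hypothetical name: trace_member_modification.
CREATE OR REPLACE FUNCTION trace_member_modification() RETURNS trigger AS $$
BEGIN
    -- Append the text of the statement that is touching this row.
    NEW.query_trace := NEW.query_trace || current_query();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trace_member_modification
    BEFORE INSERT OR UPDATE ON member_modification
    FOR EACH ROW EXECUTE PROCEDURE trace_member_modification();
```

Once is_modified flips again, the last element of query_trace should show which statement did it.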
I have two tables. Let's say tblA and tblB.
I need to insert a row in tblA and use the returned id as a value to be inserted as one of the columns in tblB.
I tried to find this in the documentation but could not. Is it possible to write a statement (intended to be used as a prepared statement) like
INSERT INTO tblB VALUES
(DEFAULT, (INSERT INTO tblA (DEFAULT, 'x') RETURNING id), 'y')
like we do for SELECT?
Or should I do this by creating a Stored Procedure?. I'm not sure if I can create a prepared statement out of a Stored Procedure.
Please advise.
Regards,
Mayank
You'll need to wait for PostgreSQL 9.1 for this:
with
ids as (
insert ...
returning id
)
insert ...
from ids;
In the meanwhile, you need to use plpgsql, a temporary table, or some extra logic in your app...
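Filled in for a concrete case, the 9.1 form could look like the sketch below. The column layout is an assumption, since the question does not show one: tblA (id, val) and tblB (id, a_id, val) are hypothetical definitions used only to make the example self-contained.

```sql
-- Hypothetical table definitions for illustration.
CREATE TABLE tblA (id serial PRIMARY KEY, val text);
CREATE TABLE tblB (id serial PRIMARY KEY,
                   a_id int REFERENCES tblA (id),
                   val text);

-- Insert into tblA and feed the generated id straight into tblB.
WITH ids AS (
    INSERT INTO tblA (val) VALUES ('x')
    RETURNING id
)
INSERT INTO tblB (a_id, val)
SELECT id, 'y' FROM ids;
```

Because this is a single statement, it can be prepared like any other INSERT, with 'x' and 'y' replaced by parameters.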
This is possible with 9.0 and the new DO for anonymous blocks:
do $$
declare
new_id integer;
begin
insert into foo1 (id) values (default) returning id into new_id;
insert into foo2 (id) values (new_id);
end$$;
This can be executed as a single statement. I haven't tried creating a PreparedStatement out of that though.
Edit
Another approach would be to simply do it in two steps, first run the insert into tableA using the returning clause, get the generated value through JDBC, then fire the second insert, something like this:
PreparedStatement stmt_1 = con.prepareStatement("INSERT INTO tblA VALUES (DEFAULT, ?) returning id");
stmt_1.setString(1, "x");
stmt_1.execute(); // important! Do not use executeUpdate()!
ResultSet rs = stmt_1.getResultSet();
long newId = -1;
if (rs.next()) {
newId = rs.getLong(1);
}
PreparedStatement stmt_2 = con.prepareStatement("INSERT INTO tblB VALUES (default,?,?)");
stmt_2.setLong(1, newId);
stmt_2.setString(2, "y");
stmt_2.executeUpdate();
You can do this in two inserts, using currval() to retrieve the foreign key (provided that key is serial):
create temporary table tb1a (id serial primary key, t text);
create temporary table tb1b (id serial primary key,
tb1a_id int references tb1a(id),
t text);
begin;
insert into tb1a values (DEFAULT, 'x');
insert into tb1b values (DEFAULT, currval('tb1a_id_seq'), 'y');
commit;
The result:
select * from tb1a;
id | t
----+---
3 | x
(1 row)
select * from tb1b;
id | tb1a_id | t
----+---------+---
2 | 3 | y
(1 row)
Using currval in this way is safe whether in or outside of a transaction. From the PostgreSQL 8.4 documentation:
currval
Return the value most recently obtained by nextval for this sequence in the current session. (An error is reported if nextval has never been called for this sequence in this session.) Because this is returning a session-local value, it gives a predictable answer whether or not other sessions have executed nextval since the current session did.
You may want to use AFTER INSERT trigger for that. Something along the lines of:
create function dostuff() returns trigger as $$
begin
insert into table_b(field_1, field_2) values ('foo', NEW.id);
return new; --values returned by after triggers are ignored, anyway
end;
$$ language 'plpgsql';
create trigger trdostuff after insert on table_name for each row execute procedure dostuff();
AFTER INSERT is needed because the id must already exist before it can be referenced. Hope this helps.
Edit
A trigger will be called in the same "block" as the command that triggered it, even if not using transactions. In other words, it effectively becomes part of that command. Therefore, there is no risk of something changing the referenced id between inserts.