I have a question about copying rows in PostgreSQL. My table hierarchy is quite complex: many tables are linked to each other via foreign keys. For the sake of simplicity, I will explain my question with two tables, but please bear in mind that my actual case is far more complex.
Say I have the following two tables:
table A
(
integer identifier primary key
... -- other fields
);
table B
(
integer identifier primary key
integer a foreign key references A (identifier)
... -- other fields
);
Say A and B hold the following rows:
A(1)
B(1, 1)
B(2, 1)
My question is: I would like to create a copy of a row in A such that the related rows in B are also copied, with the copies referencing the new row in A. This would give:
A(1) -- the old row
A(2) -- the new row
B(1, 1) -- the old row
B(2, 1) -- the old row
B(3, 2) -- the new row
B(4, 2) -- the new row
Basically I am looking for a COPY/INSERT CASCADE.
Is there a neat trick to achieve this more or less automatically? Maybe by using temporary tables?
I believe that if I have to write all the INSERT INTO ... FROM ... queries myself in the correct order and stuff, I might go mental.
Update
Let's answer my own question ;)
I did some try-outs with the RULE mechanisms in PostgreSQL and this is what I came up with:
First, the table definitions:
drop table if exists A cascade;
drop table if exists B cascade;
create table A
(
identifier serial not null primary key,
name varchar not null
);
create table B
(
identifier serial not null primary key,
name varchar not null,
a integer not null references A (identifier)
);
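Next, for each table, we create a function and a corresponding rule that translate UPDATE into INSERT. The function for A also updates the referencing rows in B, which fires B's rule in turn, so the child rows are copied as well.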
create function A(in A, in A) returns integer as
$$
declare
r integer;
begin
-- A
if ($1.identifier <> $2.identifier) then
insert into A (identifier, name) values ($2.identifier, $2.name) returning identifier into r;
else
insert into A (name) values ($2.name) returning identifier into r;
end if;
-- B
update B set a = r where a = $1.identifier;
return r;
end;
$$ language plpgsql;
create rule A as on update to A do instead select A(old, new);
create function B(in B, in B) returns integer as
$$
declare
r integer;
begin
if ($1.identifier <> $2.identifier) then
insert into B (identifier, name, a) values ($2.identifier, $2.name, $2.a) returning identifier into r;
else
insert into B (name, a) values ($2.name, $2.a) returning identifier into r;
end if;
return r;
end;
$$ language plpgsql;
create rule B as on update to B do instead select B(old, new);
Finally, some tests:
insert into A (name) values ('test_1');
insert into B (name, a) values ('test_1_child', (select identifier from a where name = 'test_1'));
update A set name = 'test_2', identifier = identifier + 50;
update A set name = 'test_3';
select * from A, B where B.a = A.identifier;
This seems to work quite fine. Any comments?
This will work. One thing I note you wisely avoided is DO ALSO rules on inserts and updates. DO ALSO with INSERT and UPDATE is pretty dangerous, so avoid it at pretty much all costs.
On further reflection, however, triggers will not perform worse and have fewer sharp corners.
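For the two-table case, the copy itself can also be written directly as one statement. The following is only a sketch under assumptions: it uses the A and B tables from the update above, relies on the serial defaults to hand out new identifiers, and hard-codes identifier 1 as the row to copy. It does not generalize automatically to deeper hierarchies; each extra level would need its own mapping from old to new keys.

```sql
-- Copy row 1 of A and all of its B rows in one statement.
-- Assumes A(identifier serial, name) and B(identifier serial, name, a).
WITH new_a AS (
    -- the copy of the parent row gets a fresh identifier from the sequence
    INSERT INTO A (name)
    SELECT name FROM A WHERE identifier = 1
    RETURNING identifier
)
INSERT INTO B (name, a)
SELECT child.name, new_a.identifier   -- copied children point at the new parent
FROM B AS child
CROSS JOIN new_a
WHERE child.a = 1;                    -- only children of the original row
```

The data-modifying CTE runs exactly once, and the outer INSERT reads the children from the state before the statement started, so the new rows are not copied again.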
Related
I'm working with a PostgreSQL 14 database where I have access to data through a view schema, and inserts and updates are performed with triggers. There is a unique constraint on the storage table, and I was wondering if there is a way to do upserts in this case.
This replicates the problem in a (much) smaller database:
schema:
CREATE SCHEMA storage;
-- unreachable storage schema
CREATE TABLE storage.data (
id SERIAL PRIMARY KEY,
a INTEGER,
b INTEGER,
c INTEGER,
CONSTRAINT pair UNIQUE(a, b)
);
-- reachable access schema
CREATE SCHEMA access;
CREATE VIEW access.data AS
SELECT id, a, b, c FROM storage.data;
-- data insertion trigger for the access view
CREATE FUNCTION data_insert()
RETURNS TRIGGER AS $data_insert$
BEGIN
INSERT INTO storage.data (a, b, c)
VALUES (NEW.a, NEW.b, NEW.c);
RETURN NEW;
END;
$data_insert$ LANGUAGE plpgsql;
CREATE TRIGGER data_insert_trigger
INSTEAD OF INSERT ON access.data
FOR EACH ROW EXECUTE PROCEDURE data_insert();
-- data update trigger for the access view
CREATE FUNCTION data_update()
RETURNS TRIGGER AS $data_update$
BEGIN
UPDATE storage.data SET
a = NEW.a,
b = NEW.b,
c = NEW.c
WHERE id = OLD.id;
RETURN NEW;
END;
$data_update$ LANGUAGE plpgsql;
CREATE TRIGGER data_update_trigger
INSTEAD OF UPDATE ON access.data
FOR EACH ROW EXECUTE PROCEDURE data_update();
What I would like to do is:
# INSERT INTO access.data(a,b,c) VALUES (1,2,3);
INSERT 0 1
# INSERT INTO access.data(a,b,c) VALUES (1,2,4)
ON CONFLICT ON CONSTRAINT pair
DO UPDATE SET c=EXCLUDED.c;
ERROR: constraint "pair" for table "data" does not exist
Is there any way to do an upsert query in this situation, or should I settle for doing a select followed by insert or update?
EDIT: I cannot modify the schemas or add functions; I can only run queries against the access schema.
Have you tried something like this?
CREATE OR REPLACE FUNCTION data_insert()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $data_insert$
BEGIN
INSERT INTO storage.data (a, b, c)
VALUES ( NEW.a, NEW.b, NEW.c )
ON CONFLICT ON CONSTRAINT pair DO UPDATE
SET c = EXCLUDED.c -- a and b already match the existing row
;
RETURN NEW;
END;
$data_insert$;
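If the trigger function cannot be replaced, as the question's edit states, the fallback is the two-step approach through the view. A minimal sketch, assuming the access.data view and its INSTEAD OF triggers from the question; the literal values are just the ones from the example. It is not safe against concurrent writers: two sessions can both find no row, and one INSERT then fails with a unique violation that the client has to catch and retry.

```sql
BEGIN;

-- Step 1: try to update an existing (a, b) pair through the view.
UPDATE access.data
SET    c = 4
WHERE  a = 1 AND b = 2;

-- Step 2: insert only if no such pair exists (a no-op after a successful update).
INSERT INTO access.data (a, b, c)
SELECT 1, 2, 4
WHERE  NOT EXISTS (
    SELECT 1 FROM access.data WHERE a = 1 AND b = 2
);

COMMIT;
```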
ALTER TABLE table_a
ADD CONSTRAINT fkey
FOREIGN KEY (f_id) REFERENCES table_b(id)
ON DELETE SET NULL;
This is a normal constraint: when a tuple is deleted from table_b, column f_id of the corresponding tuple in table_a is set to null.
Besides setting f_id to null, I also want to save the old value of f_id in column f_old_id. Is that possible?
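Here is a trigger-based solution. The trigger runs before the delete, so it still sees the old f_id values before the ON DELETE SET NULL action clears them.
create or replace function fkey_tf() returns trigger language plpgsql as
$$
begin
-- remember the old reference, then clear it
update table_a
set f_old_id = f_id, f_id = null
where f_id = OLD.id;
return OLD; -- a non-null result lets the delete proceed
end;
$$;
create trigger fkey_t
before delete on table_b
for each row execute procedure fkey_tf();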
Please note that an index on table_a.f_id is needed in order to not sacrifice performance. This is a good practice for foreign keys anyway.
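Such an index could be created as follows; the index name is a placeholder chosen for this example.

```sql
-- Speeds up the trigger's lookup as well as the foreign-key action itself.
CREATE INDEX table_a_f_id_idx ON table_a (f_id);
```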
Currently running version 9.5.3. Update planned, of course.
I have a PostgreSQL database whose schema pre-dates table row-level security (i.e. CREATE POLICY ...). Row-level security was implemented using views. The security is done in the view by selecting only rows that have the ownername matching CURRENT_USER.
I'm trying to build an upsert query using such a view. The problem comes when I try to name the conflict_target.
ON CONFLICT DO UPDATE has to be told which constraint was violated, and I cannot find a way to name it through the view.
Toy Example
CREATE TABLE foo (id serial, num int, word text, data text, ownername varchar(64));
For each user, the combinations of word and num must be unique.
CREATE UNIQUE INDEX foo_num_word_owner_idx ON foo (num, word, ownername);
The row-level security is implemented using a view based on the current user name. Permission is granted for the view, and removed for the underlying table for ordinary users. security_barrier was added after v 9.5. Note that users don't see ownername.
CREATE VIEW foo_user WITH (security_barrier = True) AS
SELECT id, num, word, data FROM foo
WHERE foo.ownername = CURRENT_USER;
Now auto-set ownername:
CREATE OR REPLACE FUNCTION trf_set_owner() RETURNS trigger AS
$$
BEGIN
IF (TG_OP = 'INSERT') THEN
NEW.ownername = CURRENT_USER::varchar(64);
END IF;
IF (TG_OP = 'UPDATE') THEN
NEW.ownername = CURRENT_USER::varchar(64);
END IF;
RETURN NEW;
END;
$$
LANGUAGE 'plpgsql';
CREATE TRIGGER foo_row_owner
BEFORE INSERT OR UPDATE ON foo FOR EACH ROW
EXECUTE PROCEDURE trf_set_owner();
Note that the ownername column is not displayed in the view; the row security is invisible to the user.
Now add some data:
INSERT INTO foo_user (num, word, data) VALUES (1, 'asdf', 'cat'), (2, 'qwer', 'dog');
SELECT * FROM foo;
-- normally, this would give an error related to privileges,
-- because we don't allow users to query the underlying table.
-- bypassed here for demo purposes.
id | num | word | data | ownername
----+-----+------+------+-----------
1 | 1 | asdf | cat | admin
2 | 2 | qwer | dog | admin
(2 rows)
SELECT * FROM foo_user;
id | num | word | data
----+-----+------+------
1 | 1 | asdf | cat
2 | 2 | qwer | dog
(2 rows)
So far, so good.
What I've Tried
As stated above, for each user, num and word must be unique. There is no problem with different owners having the same num and word (in fact, we expect it).
I'm trying to take advantage of the ON CONFLICT clause in INSERT to create some back-end UPSERT-ish functionality. And it's falling down.
Simple example of error
First, a simple failed insert:
INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog');
ERROR: duplicate key value violates unique constraint "foo_num_word_owner_idx"
DETAIL: Key (num, word, ownername)=(2, qwer, admin) already exists.
Entirely expected. Nothing wrong with that.
ON CONFLICT, first attempt
Now we try to make the client experience a bit smoother:
INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog')
ON CONFLICT DO UPDATE
SET data = 'frog'
WHERE num = 2 AND word = 'qwer';
ERROR: ON CONFLICT DO UPDATE requires inference specification or constraint name
LINE 2: ON CONFLICT DO UPDATE
^
HINT: For example, ON CONFLICT (column_name).
Yep, just like the documentation says. It needs to know what rule it broke. No problem:
ON CONFLICT, second attempt
INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog')
ON CONFLICT (num, word, ownername) DO UPDATE
SET data = 'frog'
WHERE num = 2 AND word = 'qwer';
ERROR: column "ownername" does not exist
LINE 2: ON CONFLICT (num, word, ownername) DO UPDATE
True. Ownername does not exist in the view. We can't drop ownername from the unique index, because we fully expect different owners to have identical num and word values.
ON CONFLICT, third attempt
So I tried converting the index to a constraint, and naming the constraint:
ALTER TABLE foo
ADD CONSTRAINT foo_num_word_owner_crt UNIQUE
USING INDEX foo_num_word_owner_idx;
NOTICE: ALTER TABLE / ADD CONSTRAINT USING INDEX will rename index
"foo_num_word_owner_idx" to "foo_num_word_owner_crt"
Ok, now to test:
INSERT INTO foo_user (num, word, data) VALUES (2, 'qwer', 'frog')
ON CONFLICT ON CONSTRAINT foo_num_word_owner_crt DO UPDATE
SET data = 'frog'
WHERE num = 2 AND word = 'qwer';
ERROR: constraint "foo_num_word_owner_crt" for table "foo_user" does not exist
Well that makes sense: we're querying on the view but specifying a table constraint.
Conclusion
Now I'm out of ideas. How do we get ON CONFLICT to play nice with views like this? Or is it not possible?
I'm this close (holds up thumb and forefinger) to proposing we switch from views to tables with row-level security, but that's rather a lot of work (not necessarily an API-breaker, but still).
Any insights are much appreciated.
You can circumvent the issue by removing the ON CONFLICT clause and using an INSTEAD OF trigger that tests for an existing row itself. An INSTEAD OF trigger replaces the view's automatic insert, so the function has to perform the INSERT on foo as well:
CREATE OR REPLACE FUNCTION trf_set_num_word() RETURNS trigger AS $$
BEGIN
-- Check if (num, word, ownername) exists by trying an UPDATE
UPDATE foo SET data = NEW.data
WHERE num = NEW.num AND word = NEW.word AND ownername = CURRENT_USER::varchar(64);
IF NOT FOUND THEN
-- No existing row: insert it; the foo_row_owner trigger fills in ownername
INSERT INTO foo (num, word, data) VALUES (NEW.num, NEW.word, NEW.data);
END IF;
RETURN NEW; -- a non-NULL result marks the row as handled
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER foo_user_num_word
INSTEAD OF INSERT ON foo_user FOR EACH ROW
EXECUTE PROCEDURE trf_set_num_word();
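For comparison, the row-level security route the question mentions might look roughly like this. It is a sketch, not a migration plan: the policy name foo_owner_only and the role app_user are made up for the example, the foo_row_owner trigger from the question is assumed to stay in place, and the unique index or constraint on (num, word, ownername) is assumed to exist. Because the statement targets the table directly, the conflict target can be named.

```sql
-- Restrict every user to their own rows (requires PostgreSQL 9.5 or later).
ALTER TABLE foo ENABLE ROW LEVEL SECURITY;

CREATE POLICY foo_owner_only ON foo
    USING (ownername = CURRENT_USER)        -- rows that can be seen and changed
    WITH CHECK (ownername = CURRENT_USER);  -- rows that may be written

-- app_user is a hypothetical role; table and sequence privileges are still required.
GRANT SELECT, INSERT, UPDATE ON foo TO app_user;
GRANT USAGE ON SEQUENCE foo_id_seq TO app_user;

-- ownername is filled in by the BEFORE INSERT trigger before conflicts are checked.
INSERT INTO foo (num, word, data) VALUES (2, 'qwer', 'frog')
ON CONFLICT (num, word, ownername) DO UPDATE
SET data = EXCLUDED.data;
```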
I have 8 similar PL/pgSQL functions; they are used as INSTEAD OF INSERT/UPDATE/DELETE triggers on views to make them writable. The views each combine columns of one generic table (called "things" in the example below) and one special table ("shaped_things" and "flavored_things" below). PostgreSQL's inheritance feature can't be used in our case, by the way.
The triggers have to insert/update rows in the generic table; these parts are identical across all 8 functions. Since the generic table has ~30 columns, I'm trying to use a helper function there, but I'm having trouble passing the view's NEW record to a function that needs a things record as input.
(Similar questions have been asked here and here, but I don't think I can apply the suggested solutions in my case.)
Simplified schema
CREATE TABLE things (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
-- (plus 30 more columns)
);
CREATE TABLE flavored_things (
thing_id INT PRIMARY KEY REFERENCES things (id) ON DELETE CASCADE,
flavor TEXT NOT NULL
);
CREATE TABLE shaped_things (
thing_id INT PRIMARY KEY REFERENCES things (id) ON DELETE CASCADE,
shape TEXT NOT NULL
);
-- etc...
Writable view implementation for flavored_things
CREATE VIEW flavored_view AS
SELECT t.*,
f.*
FROM things t
JOIN flavored_things f ON f.thing_id = t.id;
CREATE FUNCTION flavored_trig () RETURNS TRIGGER AS $fun$
DECLARE
inserted_id INT;
BEGIN
IF TG_OP = 'INSERT' THEN
INSERT INTO things VALUES ( -- (A)
DEFAULT,
NEW.name
-- (plus 30 more columns)
) RETURNING id INTO inserted_id;
INSERT INTO flavored_things VALUES (
inserted_id,
NEW.flavor
);
RETURN NEW;
ELSIF TG_OP = 'UPDATE' THEN
UPDATE things SET -- (B)
name = NEW.name
-- (plus 30 more columns)
WHERE id = OLD.id;
UPDATE flavored_things SET
flavor = NEW.flavor
WHERE thing_id = OLD.id;
RETURN NEW;
ELSIF TG_OP = 'DELETE' THEN
DELETE FROM flavored_things WHERE thing_id = OLD.id;
DELETE FROM things WHERE id = OLD.id;
RETURN OLD;
END IF;
END;
$fun$ LANGUAGE plpgsql;
CREATE TRIGGER write_flavored
INSTEAD OF INSERT OR UPDATE OR DELETE ON flavored_view
FOR EACH ROW EXECUTE PROCEDURE flavored_trig();
The statements marked "(A)" and "(B)" above are what I would like to replace with a call to a helper function.
Helper function for INSERT
My initial attempt was to replace statement "(A)" with
inserted_id = insert_thing(NEW);
using this function
CREATE FUNCTION insert_thing (new_thing RECORD) RETURNS INTEGER AS $fun$
DECLARE
inserted_id INT;
BEGIN
INSERT INTO things (name) VALUES (
new_thing.name
-- (plus 30 more columns)
) RETURNING id INTO inserted_id;
RETURN inserted_id;
END;
$fun$ LANGUAGE plpgsql;
This fails with the error message "PL/pgSQL functions cannot accept type record".
Giving the parameter the type things doesn't work when the function is called as insert_thing(NEW): "function insert_thing(flavored_view) does not exist".
Simple casting doesn't seem to be available here; insert_thing(NEW::things) produces "cannot cast type flavored_view to things". Writing a CAST function for each view would remove what we gained by using a helper function.
Any ideas?
There are various options, depending on the complete picture.
Basically, your insert function could work like this:
CREATE FUNCTION insert_thing (_thing flavored_view)
RETURNS int AS
$func$
INSERT INTO things (name) VALUES ($1.name) -- plus 30 more columns
RETURNING id;
$func$ LANGUAGE sql;
This uses the row type of the view, because NEW in your trigger is of this type.
A simple SQL function is enough here; no PL/pgSQL block is needed.
Demo call:
SELECT insert_thing('(1, foo, 1, bar)');
Inside your trigger flavored_trig ():
inserted_id := insert_thing(NEW);
Or, basically rewritten:
IF TG_OP = 'INSERT' THEN
INSERT INTO flavored_things(thing_id, flavor)
VALUES (insert_thing(NEW), NEW.flavor);
RETURN NEW;
ELSIF ...
record is not a valid type outside PL/pgSQL (it's just a generic placeholder for a yet unknown row type in PL/pgSQL), so you cannot use it for an input parameter in a function declaration.
For a more dynamic function accepting various row types you could use a polymorphic type. Examples:
How to return a table by rowtype in PL/pgSQL
Refactor a PL/pgSQL function to return the output of various SELECT queries
How to write a function that returns text or integer values?
Basically, you can convert the record to an hstore value and pass that to the function instead of the record. The conversion looks like this:
DECLARE r record; h hstore;
h = hstore(r);
Your helper function then has to change accordingly:
CREATE FUNCTION insert_thing (new_thing hstore) RETURNS INTEGER AS $fun$
DECLARE
inserted_id INT;
BEGIN
INSERT INTO things (name) VALUES (
new_thing -> 'name'
-- (plus 30 more columns)
) RETURNING id INTO inserted_id;
RETURN inserted_id;
END;
$fun$ LANGUAGE plpgsql;
And the call:
inserted_id = insert_thing(hstore(NEW));
hope it helps
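Two practical details of this approach, shown as a sketch: hstore is an extension that has to be installed once per database, and every value comes back as text, so non-text columns need an explicit cast. The query below runs against the simplified things table from the question.

```sql
-- hstore ships with PostgreSQL's contrib modules.
CREATE EXTENSION IF NOT EXISTS hstore;

-- hstore(record) maps column names to text values; cast where needed.
SELECT (h -> 'id')::int AS id,
       h -> 'name'      AS name
FROM  (SELECT hstore(t) AS h FROM things AS t) AS s;
```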
Composite types. PostgreSQL has documentation on this; you essentially need to use something like
'()' or ROW() to construct the composite value for a row to pass into a function.
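Applied to the question, that idea could look like the sketch below. It assumes the simplified things table with just id and name; with the real table, the ROW(...) list has to supply every column in table order, so some repetition remains in each trigger.

```sql
-- Helper that takes the row type of the generic table.
CREATE FUNCTION insert_thing (new_thing things) RETURNS integer AS $fun$
DECLARE
    inserted_id integer;
BEGIN
    INSERT INTO things (name) VALUES (new_thing.name)
    RETURNING id INTO inserted_id;
    RETURN inserted_id;
END;
$fun$ LANGUAGE plpgsql;

-- Inside the view trigger, build a things row from NEW explicitly:
--   inserted_id := insert_thing(ROW(NEW.id, NEW.name)::things);

-- Stand-alone demo call (the id field is ignored by the helper):
SELECT insert_thing(ROW(NULL, 'demo')::things);
```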
I have a web-based system with several tables (Postgres, PL/pgSQL) that hold many-to-many relationships, such as:
table x
column_id1 smallint FK
column_id2 smallint FK
In this scenario the update is made based on column_id2.
At first, to update these records, we would run the following function:
-- edited to protect the innocent
CREATE FUNCTION associate_id1_with_id2(integer[], integer) RETURNS integer
AS $_$
DECLARE
a alias for $1;
b alias for $2;
i integer;
BEGIN
delete from tablex where column_id2 = b;
FOR i IN array_lower(a,1) .. array_upper(a,1) LOOP
INSERT INTO tablex (
column_id2,
column_id1)
VALUES (
b,
a[i]);
end loop;
RETURN i;
END;
$_$
LANGUAGE plpgsql;
That seemed sloppy, and now with the addition of auditing it really shows.
What I am trying to do now is delete and insert only the necessary rows.
I have been trying various forms of the following, with no luck:
CREATE OR REPLACE FUNCTION associate_id1_with_id2(integer[], integer) RETURNS integer
AS $_$
DECLARE
a alias for $1;
b alias for $2;
c varchar;
i integer;
BEGIN
c = array_to_string($1,',');
INSERT INTO tablex (
column_id2,
column_id1)
(
SELECT column_id2, column_id1
FROM tablex
WHERE column_id2 = b
AND column_id1 NOT IN (c)
);
DELETE FROM tablex
WHERE column_id2 = b
AND column_id1 NOT IN (c);
RETURN i;
END;
$_$
LANGUAGE plpgsql;
Depending on the version of the function I'm attempting, there are various errors, such as one about explicit type casts for the current version (I'm guessing it doesn't like c being a varchar?).
First off, is my approach correct, or is there a more elegant solution, given that a couple of tables require this type of handling? If not, could you please point me in the right direction?
If this is the right approach, could you please assist with the array conversion for the NOT IN portion of the WHERE clause?
Instead of array_to_string, use unnest to transform the array into a set of rows (as if it was a table), and the problem can be solved with vanilla SQL:
INSERT INTO tablex(column_id1,column_id2)
select ai,b from unnest(a) as ai where not exists
(select 1 from tablex where column_id1=ai and column_id2=b);
DELETE FROM tablex
where column_id2=b and column_id1 not in
(select ai from unnest(a) as ai);
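Put together, the function might look like the sketch below. The function name sync_id1_for_id2 and the void return type are choices made for this example, not part of the original; the parameter names a and b follow the aliases in the question. It assumes the array contains no NULLs and no duplicates: a NULL element would make the NOT IN test delete nothing, and a duplicate would be inserted twice.

```sql
CREATE FUNCTION sync_id1_for_id2(a integer[], b integer) RETURNS void AS $_$
BEGIN
    -- add associations that are in the array but not yet in the table
    INSERT INTO tablex (column_id1, column_id2)
    SELECT ai, b
    FROM   unnest(a) AS ai
    WHERE  NOT EXISTS (
        SELECT 1 FROM tablex WHERE column_id1 = ai AND column_id2 = b
    );

    -- remove associations that are no longer in the array
    DELETE FROM tablex
    WHERE  column_id2 = b
    AND    column_id1 NOT IN (SELECT ai FROM unnest(a) AS ai);
END;
$_$ LANGUAGE plpgsql;

-- Example: id2 = 7 should end up associated with exactly 1, 2 and 3.
SELECT sync_id1_for_id2(ARRAY[1, 2, 3], 7);
```

Rows that already match are left untouched, so audit triggers on tablex only see the real changes.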