I have a situation with two tables where one has a foreign key pointing to the other table (simplified) schema:
CREATE TABLE table1 (
name VARCHAR(64) NOT NULL,
PRIMARY KEY(name)
);
CREATE TABLE table2 (
id SERIAL PRIMARY KEY,
table1_name VARCHAR(64) NOT NULL REFERENCES table1(name)
);
Now I regret using the name column as primary key in table1 - and would like to add integer serial key instead. Since I already have data in the database I guess I need to do this carefully. My current plan is as follows:
Drop the foreign key constraint: table2(name) with ALTER TABLE table2 DROP CONSTRAINT table2_table1_name_fkey;
Drop the primary key constraint on table1(name) with ALTER TABLE table1 DROP CONSTRAINT name_pkey;.
Add a unique constraint on table1(name) with ALTER TABLE table1 ADD UNIQUE(name);
Add a automatic primary key to table1 with ALTER TABLE table1 ADD COLUMN ID SERIAL PRIMARY KEY;.
Add a new column table1_id to table2 with ALTER TABLE table2 ADD COLUMN table1_id INT;
Update all rows in table2 - so that the new column (which will be promoted to a foreign key) gets the correct value - as inferred by the previous (still present) foreign key table1_name.
I have completed steps up to an including step 5, but the UPDATE (with JOIN?) required to complete 6 is beyond my SQL paygrade. My current (google based ...) attempt looks like:
UPDATE
table2
SET
table2.table1_id = t1.id
FROM
table1 t1
LEFT JOIN table2 t2
ON t2.table1_name = t1.name;
You do not need JOIN in UPDATE.
UPDATE
table2 t2
SET
table1_id = t1.id
FROM
table1 t1
WHERE
t2.table1_name = t1.name;
Good day, I have using Javascript in Mirth Connect to insert all raws that are allocated in a table of PostgreSql database directly in an other table , and in case of duplicate, update the row. I am trying with it,but it gives to me this error:
InsertIntoPatientMapping= dbConn.executeUpdate('insert into patient_mapping (select * from patient_mapping_test) ON DUPLICATE KEY UPDATE');
Wrapped org.postgresql.util.PSQLException: ERROR: syntax error at or near "DUPLICATE" Position: 69
What did I wrong?
Assuming you have a id column, for upserting considering only the key as duplicate, you can:
insert into patient_mapping (select * from patient_mapping_test) ON CONFLICT (id) DO UPDATE SET id=EXCLUDED.id
https://www.postgresql.org/docs/13/sql-insert.html
Edit
There are a couple of things you need to consider for this:
You need to specify which columns the conflict occurs;
Each column must have an index;
You need to specify how to replace the conflicted values.
For instance:
create table T1 (id integer unique);
create table T2 (id integer unique);
insert into T1 (id) values (1);
insert into T1 (id) values (2);
insert into T2 (id) values (1);
insert into T2 (select * from T1) on conflict (id) do update set id=EXCLUDED.id;
select * from T2;
This may help you: https://www.postgresqltutorial.com/postgresql-upsert/
I´ve the following table with two columns. The two columns are the combined index key of the table:
ID1 ID2
A X
A Y
A Z
B Z
Now I need to do the following:
UPDATE table SET ID2=X WHERE ID2=Z
As the table has a combined key (ID1 + ID2) this leads to a "duplicate key value violates unique constraint" exception cause there is already an "A-X" combination and the update for row "A-Z" violates that key.
So my question is:
Is there in Postgres an option to skip on duplicate key? Or do you have any other hint for me?
Final state of the table should be:
ID1 ID2
A X
A Y
B X
One solution I can think of, is to define the primary key as deferrable and then do the update and delete in a single statement:
create table t1 (
id1 text,
id2 text,
primary key (id1, id2) deferrable
);
Then you can do the following:
with changed as (
UPDATE t1
SET ID2='X'
WHERE ID2='Z'
returning id1, id2
)
delete from t1
where (id1, id2) in (select id1, id2 from changed)
The DELETE does not see the new row that has been created by the update, but it sees the "old" one, so only the conflicting one(s) are deleted.
Online example
I realize that under normal conditions we prefer a single statement, but at times 2 may be better. This may very well be one of those. In order to implement the solution by #a_horse_with_no_name it will be necessary to drop and recreate the PK. Not a desirable move. So a 2 statement solution which does not require playing around with the PK.
do $$
begin
delete
from tbl t1
where t1.id2 = 'Z'
and exists (
select null
from tbl t2
where t1.id1 = t2.id1
and t2.id2 = 'X'
);
update tbl
set id2 = 'X'
where id2 = 'Z';
end;
$$;
The advantage of the DO block being that if for some reason the second statement fails, the action of the first is rolled back. So the database remains in a consistent state.
Suppose you want to have a hierarchy of things where a thing can have multiple thing parents and children:
CREATE TABLE thing (
id SERIAL PRIMARY KEY
);
CREATE TABLE thing_association (
parent_id INTEGER NOT NULL,
child_id INTEGER NOT NULL,
PRIMARY KEY (parent_id, child_id),
CHECK (parent_id != child_id),
FOREIGN KEY (parent_id) REFERENCES thing(id),
FOREIGN KEY (child_id) REFERENCES thing(id)
);
The CHECK constraint prevents a thing from having a relationship with itself, and the PRIMARY KEY constraint prevents duplicate relationships, but can loops be prevented?
More precisely, if row (x, y) exists in the thing_association table, can the row (y, x) be prevented from being inserted?
Going a step further, if rows (x, y) and (y, z) exist in the thing_association, can the row (z, x) be prevented from being inserted?
I was hoping to accomplish this without triggers, but I'm not sure that's possible. I was able to accomplish this with a BEFORE INSERT trigger:
CREATE TABLE thing (
id SERIAL PRIMARY KEY
);
CREATE TABLE thing_association (
parent_id INTEGER NOT NULL,
child_id INTEGER NOT NULL,
PRIMARY KEY (parent_id, child_id),
CHECK (parent_id != child_id),
FOREIGN KEY (parent_id) REFERENCES thing(id),
FOREIGN KEY (child_id) REFERENCES thing(id)
);
/* maps every thing to all of it's parents */
CREATE VIEW thing_hierarchy AS
WITH RECURSIVE children AS (
SELECT
child_id,
parent_id
FROM thing_association
UNION SELECT
children.child_id,
parents.parent_id
FROM thing_association AS parents
INNER JOIN children
ON children.parent_id = parents.child_id
) SELECT * FROM children;
CREATE FUNCTION check_thing_association_loop() RETURNS TRIGGER AS $$
BEGIN
IF ((NEW.parent_id, NEW.child_id) in (SELECT child_id, parent_id FROM thing_hierarchy)) THEN
RAISE EXCEPTION 'Cannont create a hierarchy loop';
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER thing_association_insert_check
BEFORE INSERT ON thing_association
FOR EACH ROW EXECUTE FUNCTION check_thing_association_loop();
You could merge the view into the trigger function, but the view is useful on its own and it keeps things succint.
I have the following scenario in a Postgres 9.3 database:
Tables B and C reference Table A.
Table C has an optional field that references table B.
I would like to ensure that for each row of table C that references table B, c.b.a = c.a. That is, if C has a reference to B, both rows should point at the same row in table A.
I could refactor table C so that if c.b is specified, c.a is null but that would make queries joining tables A and C awkward.
I might also be able to make table B's primary key include its reference to table A and then make table C's foreign key to table B include table C's reference to table A but I think this adjustment would be too awkward to justify the benefit.
I think this can be done with a trigger that runs before insert/update on table C and rejects operations that violate the specified constraint.
Is there a better way to enforce data integrity in this situation?
There is a very simple, bullet-proof solution. Works for Postgres 9.3 - when the original question was asked. Works for the current Postgres 13 - when the question in the bounty was added:
Would like information on if this is possible to achieve without database triggers
FOREIGN KEY constraints can span multiple columns. Just include the ID of table A in the FK constraint from table C to table B. This enforces that linked rows in B and C always point to the same row in A. Like:
CREATE TABLE a (
a_id int PRIMARY KEY
);
CREATE TABLE b (
b_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, UNIQUE (a_id, b_id) -- redundant, but required for FK
);
CREATE TABLE c (
c_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, b_id int
, CONSTRAINT fk_simple_and_safe_solution
FOREIGN KEY (a_id, b_id) REFERENCES b(a_id, b_id) -- THIS !
);
Minimal sample data:
INSERT INTO a(a_id) VALUES
(1)
, (2);
INSERT INTO b(b_id, a_id) VALUES
(1, 1)
, (2, 2);
INSERT INTO c(c_id, a_id, b_id) VALUES
(1, 1, NULL) -- allowed
, (2, 2, 2); -- allowed
Disallowed as requested:
INSERT INTO c(c_id, a_id, b_id) VALUES (3,2,1);
ERROR: insert or update on table "c" violates foreign key constraint "fk_simple_and_safe_solution"
DETAIL: Key (a_id, b_id)=(2, 1) is not present in table "b".
db<>fiddle here
The default MATCH SIMPLE behavior of FK constraints works like this (quoting the manual):
MATCH SIMPLE allows any of the foreign key columns to be null; if any of them are null, the row is not required to have a match in the referenced table.
So NULL values in c(b_id) are still allowed (as requested: "optional field"). The FK constraint is "disabled" for this special case.
We need the logically redundant UNIQUE constraint on b(a_id, b_id) to allow the FK reference to it. But by making it out to be on (a_id, b_id) instead of (b_id, a_id), it is also useful in its own right, providing a useful index on b(a_id) to support the other FK constraint, among other things. See:
Is a composite index also good for queries on the first field?
(An additional index on c(a_id) is typically useful accordingly.)
Further reading:
Differences between MATCH FULL, MATCH SIMPLE, and MATCH PARTIAL?
Enforcing constraints “two tables away”
I ended up creating a trigger as follows:
create function "check C.A = C.B.A"()
returns trigger
as $$
begin
if NEW.b is not null then
if NEW.a != (select a from B where id = NEW.b) then
raise exception 'a != b.a';
end if;
end if;
return NEW;
end;
$$
language plpgsql;
create trigger "ensure C.A = C.B.A"
before insert or update on C
for each row
execute procedure "check C.A = C.B.A"();
Would like information on if this is possible to achieve without database triggers
Yes, it is possible. The mechanism is called ASSERTION and it is defined in SQL-92 Standard(though it is not implemented by any major RDBMS).
In short it allows to create multiple-row constraints or multi-table check constraints.
As for PostgreSQL it could be emulated by using view with WITH CHECK OPTION and performing operation on view instead of base table.
WITH CHECK OPTION
This option controls the behavior of automatically updatable views. When this option is specified, INSERT and UPDATE commands on the view will be checked to ensure that new rows satisfy the view-defining condition (that is, the new rows are checked to ensure that they are visible through the view). If they are not, the update will be rejected.
Example:
CREATE TABLE a(id INT PRIMARY KEY, cola VARCHAR(10));
CREATE TABLE b(id INT PRIMARY KEY, colb VARCHAR(10), a_id INT REFERENCES a(id) NOT NULL);
CREATE TABLE c(id INT PRIMARY KEY, colc VARCHAR(10),
a_id INT REFERENCES a(id) NOT NULL,
b_id INT REFERENCES b(id));
Sample inserts:
INSERT INTO a(id, cola) VALUES (1, 'A');
INSERT INTO a(id, cola) VALUES (2, 'A2');
INSERT INTO b(id, colb, a_id) VALUES (12, 'B', 1);
INSERT INTO c(id, colc, a_id) VALUES (15, 'C', 2);
Violating the condition(connecting C with B different a_id on both tables)
UPDATE c SET b_id = 12 WHERE id = 15;;
-- no issues whatsover
Creating view:
CREATE VIEW view_c
AS
SELECT *
FROM c
WHERE NOT EXISTS(SELECT 1
FROM b
WHERE c.b_id = b.id
AND c.a_id != b.a_id) -- here is the clue, we want a_id to be the same
WITH CHECK OPTION ;
Trying update second time(error):
UPDATE view_c SET b_id = 12 WHERE id = 15;
--ERROR: new row violates check option for view "view_c"
--DETAIL: Failing row contains (15, C, 2, 12).
Trying brand new inserts with incorrect data(also errors)
INSERT INTO b(id, colb, a_id) VALUES (20, 'B2', 2);
INSERT INTO view_c(id, colc, a_id, b_id) VALUES (30, 'C2', 1, 20);
--ERROR: new row violates check option for view "view_c"
--DETAIL: Failing row contains (30, C2, 1, 20)
db<>fiddle demo