Prevent loops in many-to-many self-referencing relationship (Postgres)

Suppose you want to have a hierarchy of things where a thing can have multiple thing parents and children:
CREATE TABLE thing (
id SERIAL PRIMARY KEY
);
CREATE TABLE thing_association (
parent_id INTEGER NOT NULL,
child_id INTEGER NOT NULL,
PRIMARY KEY (parent_id, child_id),
CHECK (parent_id != child_id),
FOREIGN KEY (parent_id) REFERENCES thing(id),
FOREIGN KEY (child_id) REFERENCES thing(id)
);
The CHECK constraint prevents a thing from having a relationship with itself, and the PRIMARY KEY constraint prevents duplicate relationships, but can loops be prevented?
More precisely, if row (x, y) exists in the thing_association table, can the row (y, x) be prevented from being inserted?
Going a step further, if rows (x, y) and (y, z) exist in the thing_association, can the row (z, x) be prevented from being inserted?

I was hoping to accomplish this without triggers, but I'm not sure that's possible. I was able to accomplish this with a BEFORE INSERT trigger:
CREATE TABLE thing (
id SERIAL PRIMARY KEY
);
CREATE TABLE thing_association (
parent_id INTEGER NOT NULL,
child_id INTEGER NOT NULL,
PRIMARY KEY (parent_id, child_id),
CHECK (parent_id != child_id),
FOREIGN KEY (parent_id) REFERENCES thing(id),
FOREIGN KEY (child_id) REFERENCES thing(id)
);
/* maps every thing to all of its ancestors (direct and indirect parents) */
CREATE VIEW thing_hierarchy AS
WITH RECURSIVE children AS (
SELECT
child_id,
parent_id
FROM thing_association
UNION SELECT
children.child_id,
parents.parent_id
FROM thing_association AS parents
INNER JOIN children
ON children.parent_id = parents.child_id
) SELECT * FROM children;
CREATE FUNCTION check_thing_association_loop() RETURNS TRIGGER AS $$
BEGIN
IF ((NEW.parent_id, NEW.child_id) in (SELECT child_id, parent_id FROM thing_hierarchy)) THEN
RAISE EXCEPTION 'Cannot create a hierarchy loop';
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER thing_association_insert_check
BEFORE INSERT ON thing_association
FOR EACH ROW EXECUTE FUNCTION check_thing_association_loop();
You could merge the view into the trigger function, but the view is useful on its own and it keeps the trigger function succinct.
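A short usage sketch may make the behaviour concrete. It assumes the tables, view and trigger above have just been created and are empty; the ids 1 to 3 are illustrative values only. Run the statements one at a time, since a failed statement aborts the surrounding transaction.

```sql
-- Illustrative data; assumes empty tables and the trigger defined above.
INSERT INTO thing (id) VALUES (1), (2), (3);

INSERT INTO thing_association (parent_id, child_id) VALUES (1, 2);
INSERT INTO thing_association (parent_id, child_id) VALUES (2, 3);

-- A second path to the same child is not a loop, so this succeeds:
INSERT INTO thing_association (parent_id, child_id) VALUES (1, 3);

-- Each of these would close a loop, so the trigger raises its exception:
INSERT INTO thing_association (parent_id, child_id) VALUES (2, 1);  -- reverses (1, 2)
INSERT INTO thing_association (parent_id, child_id) VALUES (3, 1);  -- closes 1 -> 2 -> 3 -> 1
```

Note that the trigger as written fires only on INSERT, so an UPDATE of an existing thing_association row is not checked.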

Related

PostgreSQL: Foreign key between composite type and independent columns

Minimal definitions:
CREATE TYPE GlobalId AS (
id1 BigInt,
id2 SmallInt
);
CREATE TABLE table1 (
id1 BigSerial NOT NULL,
id2 SmallInt NOT NULL,
PRIMARY KEY (id1, id2)
);
CREATE TABLE table2 (
global_id GlobalId NOT NULL,
FOREIGN KEY (global_id) REFERENCES table1 (id1, id2)
);
In short, I use a composite type for table2 (and many other tables), but for the primary table (table1), I don't directly use the composite type because composite types don't support the use of Serial.
The above produces the following error due to the ostensible mismatch between global_id and id1, id2: number of referencing and referenced columns for foreign key disagree.
Alternatively, if I define the foreign key as FOREIGN KEY (global_id.id1, global_id.id2) REFERENCES table1 (id1, id2), I get a syntax error on using an accessor on global_id.
Any ideas on how to define this foreign key relationship? Alternatively, if there's a way for table1 to use the GlobalId composite type while still getting serial/sequence behavior for id1, that works also.
You can define table1 using your composite type and fill the value using a BEFORE trigger:
CREATE TABLE table1 (id globalid PRIMARY KEY);
CREATE SEQUENCE s OWNED BY table1.id;
CREATE FUNCTION ins_trig() RETURNS trigger LANGUAGE plpgsql AS
$$BEGIN
NEW.id = (nextval('s'), (NEW.id).id2);
RETURN NEW;
END;$$;
CREATE TRIGGER ins_trig BEFORE INSERT ON table1 FOR EACH ROW
EXECUTE PROCEDURE ins_trig();
INSERT INTO table1 VALUES (ROW(NULL, 42));
SELECT * FROM table1;
id
--------
(1,42)
(1 row)

How to have parent record data available when a child one is deleted through cascade

Consider the following two tables:
CREATE TABLE public.parent
(
id bigint NOT NULL DEFAULT nextval('parent_id_seq'::regclass),
CONSTRAINT pk_parent PRIMARY KEY (id)
);
CREATE TABLE public.child
(
child_id bigint NOT NULL DEFAULT nextval('child_child_id_seq'::regclass),
parent_id bigint NOT NULL,
CONSTRAINT pk_child PRIMARY KEY (child_id),
CONSTRAINT inx_parent FOREIGN KEY (parent_id)
REFERENCES public.parent (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE INDEX fki_child
ON public.child
USING btree
(parent_id);
CREATE TRIGGER child_trg
BEFORE DELETE
ON public.child
FOR EACH ROW
EXECUTE PROCEDURE public.trg();
The trigger function trg is defined as:
CREATE OR REPLACE FUNCTION public.trg()
RETURNS trigger AS
$BODY$BEGIN
INSERT INTO temp
SELECT p.id
FROM parent p
WHERE
p.id = OLD.parent_id;
return OLD;
END;$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
To sum up: there are two tables in a simple parent-child relationship with a cascading foreign key, and a BEFORE DELETE trigger on child. When child rows are deleted because of the cascade from parent, I need to read the parent's data inside the trigger, but I cannot, because the parent row has already been deleted by then. Does anyone have an idea how to do this?
One solution would be to create a BEFORE DELETE trigger on parent instead, which can see all data.
CREATE OR REPLACE FUNCTION public.trg_parent()
RETURNS trigger AS
$func$
BEGIN
INSERT INTO some_tbl (id) -- use target list !!
VALUES (OLD.id); -- OLD is the parent row being deleted
RETURN OLD;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER parent_trg
BEFORE DELETE ON public.parent
FOR EACH ROW EXECUTE PROCEDURE public.trg_parent();

Foreign key constraints involving multiple tables

I have the following scenario in a Postgres 9.3 database:
Tables B and C reference Table A.
Table C has an optional field that references table B.
I would like to ensure that for each row of table C that references table B, c.b.a = c.a. That is, if C has a reference to B, both rows should point at the same row in table A.
I could refactor table C so that if c.b is specified, c.a is null but that would make queries joining tables A and C awkward.
I might also be able to make table B's primary key include its reference to table A and then make table C's foreign key to table B include table C's reference to table A but I think this adjustment would be too awkward to justify the benefit.
I think this can be done with a trigger that runs before insert/update on table C and rejects operations that violate the specified constraint.
Is there a better way to enforce data integrity in this situation?
There is a very simple, bullet-proof solution. It works in Postgres 9.3 (current when the question was asked) and still works in Postgres 13 (current when the bounty was added with this request):
Would like information on if this is possible to achieve without database triggers
FOREIGN KEY constraints can span multiple columns. Just include the ID of table A in the FK constraint from table C to table B. This enforces that linked rows in B and C always point to the same row in A. Like:
CREATE TABLE a (
a_id int PRIMARY KEY
);
CREATE TABLE b (
b_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, UNIQUE (a_id, b_id) -- redundant, but required for FK
);
CREATE TABLE c (
c_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, b_id int
, CONSTRAINT fk_simple_and_safe_solution
FOREIGN KEY (a_id, b_id) REFERENCES b(a_id, b_id) -- THIS !
);
Minimal sample data:
INSERT INTO a(a_id) VALUES
(1)
, (2);
INSERT INTO b(b_id, a_id) VALUES
(1, 1)
, (2, 2);
INSERT INTO c(c_id, a_id, b_id) VALUES
(1, 1, NULL) -- allowed
, (2, 2, 2); -- allowed
Disallowed as requested:
INSERT INTO c(c_id, a_id, b_id) VALUES (3,2,1);
ERROR: insert or update on table "c" violates foreign key constraint "fk_simple_and_safe_solution"
DETAIL: Key (a_id, b_id)=(2, 1) is not present in table "b".
The default MATCH SIMPLE behavior of FK constraints works like this (quoting the manual):
MATCH SIMPLE allows any of the foreign key columns to be null; if any of them are null, the row is not required to have a match in the referenced table.
So NULL values in c(b_id) are still allowed (as requested: "optional field"). The FK constraint is "disabled" for this special case.
We need the logically redundant UNIQUE constraint on b(a_id, b_id) to allow the FK reference to it. But by defining it on (a_id, b_id) instead of (b_id, a_id), it is also useful in its own right: its index has a_id as the leading column, so it supports the other FK constraint on b(a_id), among other things. See:
Is a composite index also good for queries on the first field?
(An additional index on c(a_id) is typically useful accordingly.)
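If the tables already exist, the same constraints can be added after the fact. The following is a minimal sketch, assuming tables b and c with the column names used above and without these constraints yet; the constraint and index names are hypothetical.

```sql
-- Hypothetical constraint and index names; adjust to your schema.

-- The multicolumn FK needs a matching UNIQUE (or PRIMARY KEY) constraint on b.
ALTER TABLE b
    ADD CONSTRAINT b_a_id_b_id_key UNIQUE (a_id, b_id);

-- Existing rows in c are validated when the constraint is added.
ALTER TABLE c
    ADD CONSTRAINT c_a_id_b_id_fkey
    FOREIGN KEY (a_id, b_id) REFERENCES b (a_id, b_id);

-- The additional index on c(a_id) mentioned above.
CREATE INDEX c_a_id_idx ON c (a_id);
```

The second statement fails if c already contains rows whose (a_id, b_id) pair has no match in b, so any such rows have to be cleaned up first.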
Further reading:
Differences between MATCH FULL, MATCH SIMPLE, and MATCH PARTIAL?
Enforcing constraints “two tables away”
I ended up creating a trigger as follows:
create function "check C.A = C.B.A"()
returns trigger
as $$
begin
if NEW.b is not null then
if NEW.a != (select a from B where id = NEW.b) then
raise exception 'a != b.a';
end if;
end if;
return NEW;
end;
$$
language plpgsql;
create trigger "ensure C.A = C.B.A"
before insert or update on C
for each row
execute procedure "check C.A = C.B.A"();
Would like information on if this is possible to achieve without database triggers
Yes, it is possible. The mechanism is called ASSERTION and is defined in the SQL-92 standard (though no major RDBMS implements it).
In short, it allows multi-row or multi-table check constraints to be created.
In PostgreSQL it can be emulated with a view defined WITH CHECK OPTION, performing the operations on the view instead of the base table.
WITH CHECK OPTION
This option controls the behavior of automatically updatable views. When this option is specified, INSERT and UPDATE commands on the view will be checked to ensure that new rows satisfy the view-defining condition (that is, the new rows are checked to ensure that they are visible through the view). If they are not, the update will be rejected.
Example:
CREATE TABLE a(id INT PRIMARY KEY, cola VARCHAR(10));
CREATE TABLE b(id INT PRIMARY KEY, colb VARCHAR(10), a_id INT REFERENCES a(id) NOT NULL);
CREATE TABLE c(id INT PRIMARY KEY, colc VARCHAR(10),
a_id INT REFERENCES a(id) NOT NULL,
b_id INT REFERENCES b(id));
Sample inserts:
INSERT INTO a(id, cola) VALUES (1, 'A');
INSERT INTO a(id, cola) VALUES (2, 'A2');
INSERT INTO b(id, colb, a_id) VALUES (12, 'B', 1);
INSERT INTO c(id, colc, a_id) VALUES (15, 'C', 2);
Violating the condition (linking C to a B row that has a different a_id) directly on the base table:
UPDATE c SET b_id = 12 WHERE id = 15;
-- no issues whatsoever
Creating view:
CREATE VIEW view_c
AS
SELECT *
FROM c
WHERE NOT EXISTS(SELECT 1
FROM b
WHERE c.b_id = b.id
AND c.a_id != b.a_id) -- here is the clue, we want a_id to be the same
WITH CHECK OPTION ;
Trying the update a second time, now through the view (error):
UPDATE view_c SET b_id = 12 WHERE id = 15;
--ERROR: new row violates check option for view "view_c"
--DETAIL: Failing row contains (15, C, 2, 12).
Trying a brand-new insert with mismatched data through the view (also an error):
INSERT INTO b(id, colb, a_id) VALUES (20, 'B2', 2);
INSERT INTO view_c(id, colc, a_id, b_id) VALUES (30, 'C2', 1, 20);
--ERROR: new row violates check option for view "view_c"
--DETAIL: Failing row contains (30, C2, 1, 20)

Using triggers to maintain linking table

I'm considering using triggers to maintain a linking table. However, my initial approach fails with a foreign key constraint violation. Is there any way to solve this without disabling constraints?
CREATE TABLE foo (
id SERIAL PRIMARY KEY,
data TEXT
);
CREATE TABLE bar (
id SERIAL PRIMARY KEY,
data TEXT
);
CREATE TABLE foo_bar_link (
foo_id INT NOT NULL REFERENCES foo(id),
bar_id INT NOT NULL REFERENCES bar(id),
UNIQUE (foo_id, bar_id)
);
CREATE OR REPLACE FUNCTION maintain_link()
RETURNS TRIGGER AS
$maintain_link$
DECLARE
bar_id INT;
BEGIN
INSERT INTO bar (data) VALUES ('not_important_for_this_example_bar_data') RETURNING id INTO bar_id;
INSERT INTO foo_bar_link (foo_id, bar_id) VALUES (NEW.id, bar_id);
RETURN NEW;
END;
$maintain_link$
LANGUAGE plpgsql;
CREATE TRIGGER maintain_link BEFORE INSERT ON foo
FOR EACH ROW EXECUTE PROCEDURE maintain_link();
Use an AFTER INSERT trigger. A BEFORE INSERT trigger fails because the new row in foo does not exist yet when the trigger inserts into foo_bar_link, so the foreign key on foo_id is violated.
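As a sketch of that change, assuming the tables, the maintain_link() function and the BEFORE INSERT trigger from the question already exist, only the trigger definition needs to be replaced. The function body can stay as it is, since the return value of a row-level AFTER trigger is ignored.

```sql
-- Replace the BEFORE INSERT trigger from the question with an AFTER INSERT one.
DROP TRIGGER maintain_link ON foo;

CREATE TRIGGER maintain_link AFTER INSERT ON foo
FOR EACH ROW EXECUTE PROCEDURE maintain_link();

-- The row in foo now exists when the trigger function runs,
-- so the insert into foo_bar_link satisfies the foreign key on foo_id.
INSERT INTO foo (data) VALUES ('some foo data');
SELECT * FROM foo_bar_link;
```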

How to add a foreign key constraint to same table using ALTER TABLE in PostgreSQL

To create table I use:
CREATE TABLE category
(
cat_id serial NOT NULL,
cat_name character varying NOT NULL,
parent_id integer NOT NULL,
CONSTRAINT cat_id PRIMARY KEY (cat_id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE category
OWNER TO pgsql;
parent_id is the id of another category. Now I have a problem: how do I cascade the deletion of a record to its children? I need to make parent_id a foreign key to cat_id.
I tried this:
ALTER TABLE category
ADD CONSTRAINT cat_cat_id_fkey FOREIGN KEY (parent_id)
REFERENCES category (cat_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE;
But it fails with:
ERROR: insert or update on table "category" violates foreign key constraint "cat_cat_id_fkey"
DETAIL: Key (parent_id)=(0) is not present in table "category".
The problem: what would the parent_id of a category at the top of the hierarchy be?
If it is NULL, it violates the NOT NULL constraint.
If it is some arbitrary number like 0, it violates the foreign key (as in your example).
The common solution: drop the NOT NULL constraint on parent_id and set parent_id to NULL for top-level categories.
-- create some fake data for testing
--
DROP SCHEMA IF EXISTS tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE category
(
cat_id serial NOT NULL,
cat_name character varying NOT NULL,
parent_id integer NOT NULL,
CONSTRAINT cat_id PRIMARY KEY (cat_id)
);
INSERT INTO category(cat_name,parent_id)
SELECT 'Name_' || gs::text
, gs % 3
FROM generate_series(0,9) gs
;
-- find the records with the non-existing parents
SELECT ca.parent_id , COUNT(*)
FROM category ca
WHERE NOT EXISTS (
SELECT *
FROM category nx
WHERE nx.cat_id = ca.parent_id
)
GROUP BY ca.parent_id
;
-- if all is well: proceed
-- make parent pointer nullable
ALTER TABLE category
ALTER COLUMN parent_id DROP NOT NULL
;
-- set non-existing parent pointers to NULL
UPDATE category ca
SET parent_id = NULL
WHERE NOT EXISTS (
SELECT *
FROM category nx
WHERE nx.cat_id = ca.parent_id
)
;
-- Finally, add the FK constraint
ALTER TABLE category
ADD CONSTRAINT cat_cat_id_fkey FOREIGN KEY (parent_id)
REFERENCES category (cat_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
;
This is quite simple.
The foreign key parent_id refers to cat_id.
The constraint fails because a record with parent_id=0 exists, but no record with cat_id=0 does.
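To illustrate the cascading delete the question is after, here is a minimal sketch on a separate, hypothetical table category_demo (so it does not interfere with the real category table). It assumes a nullable parent_id, with NULL marking top-level categories.

```sql
-- Hypothetical demo table; mirrors category, but with a nullable parent_id.
CREATE TABLE category_demo (
    cat_id    integer PRIMARY KEY,
    cat_name  text NOT NULL,
    parent_id integer REFERENCES category_demo (cat_id)
              ON UPDATE CASCADE ON DELETE CASCADE
);

INSERT INTO category_demo (cat_id, cat_name, parent_id) VALUES
    (1, 'root',       NULL),  -- top-level category
    (2, 'child',      1),
    (3, 'grandchild', 2),
    (4, 'other root', NULL);

-- Deleting a category also removes its whole subtree via the cascade.
DELETE FROM category_demo WHERE cat_id = 1;

SELECT cat_id, cat_name FROM category_demo;  -- only cat_id 4 ('other root') remains
```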