Postgres: Querying whole rows in a many to many relationship

My table is currently structured as follows:
PATHS TABLE
UID uuid primary | NAME text | DURATION number
<uuid 0> | Path 1 | 60
STOPS TABLE
UID uuid primary | NAME text | ADDRESS text
<uuid 1> | Stop 1 | Whatever Str.
<uuid 2> | Stop 2 | Whatever2 Str.
PATH_STOP TABLE
id int primary | PATH uuid fk | STOP uuid fk
0 | <uuid 0> | <uuid 1>
1 | <uuid 0> | <uuid 2>
Meaning that each path has multiple stops assigned to it and one stop can possibly appear in more than one path, making it a many to many relationship.
I'm finding it confusing to query for paths and get the stops back with them in a single query.
I've been trying for a while to create a function that handles this, and this is how far I've come (spoiler: not that far):
create or replace function get_paths() returns setof paths as $$
declare
p paths[];
begin
select * into p from paths;
-- not sure how to move on from here.
end;
$$ language plpgsql;

From https://postgrest.org/en/latest/api.html#many-to-many-relationships
Many-to-many relationships are detected based on the join table. The join table must contain foreign keys to the other two tables and they must be part of its composite key.
For the many-to-many relationship between films and actors, the join table roles would be:
create table roles(
film_id int references films(id)
, actor_id int references actors(id)
, primary key(film_id, actor_id)
);
-- the join table can also be detected if the composite key has additional columns
create table roles(
id int generated always as identity
, film_id int references films(id)
, actor_id int references actors(id)
, primary key(id, film_id, actor_id)
);
Then you can do
await supabase.from('actors').select('first_name,last_name,films(title)')
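If you are querying with plain SQL rather than through PostgREST, a minimal sketch of a single query that returns every path together with its stops (assuming the paths, stops and path_stop tables from the question) could aggregate the stops into JSON:
select p.uid, p.name, p.duration,
       json_agg(json_build_object('uid', s.uid, 'name', s.name, 'address', s.address)) as stops
from paths p
join path_stop ps on ps.path = p.uid
join stops s on s.uid = ps.stop
group by p.uid, p.name, p.duration;
A set-returning function like the one attempted in the question could then simply RETURN QUERY this statement (with a matching return type) instead of looping over a paths array.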


Insert into two referencing tables by selecting from a single table

I have 2 permanent tables in my PostgreSQL 12 database with a one-to-many relationship (thing, and thing_identifier). The second -- thing_identifier -- has a column referencing thing, such that thing_identifier can hold multiple, external identifiers for a given thing:
CREATE TABLE IF NOT EXISTS thing
(
thing_id SERIAL PRIMARY KEY,
thing_name TEXT, --this is not necessarily unique
thing_attribute TEXT --also not unique
);
CREATE TABLE IF NOT EXISTS thing_identifier
(
id SERIAL PRIMARY KEY,
thing_id integer references thing (thing_id),
identifier text
);
I need to insert some new data into thing and thing_identifier, both of which come from a table I created by using COPY to pull the contents of a large CSV file into the database, something like:
CREATE TABLE IF NOT EXISTS things_to_add
(
id SERIAL PRIMARY KEY,
guid TEXT, --a unique identifier used by the supplier
thing_name TEXT, --not unique
thing_attribute TEXT --also not unique
);
Sample data:
INSERT INTO things_to_add (guid, thing_name, thing_attribute) VALUES
('[111-22-ABC]','Thing-a-ma-jig','pretty thing'),
('[999-88-XYZ]','Herk-a-ma-fob','blue thing');
The goal is to have each row in things_to_add result in one new row, each, in thing and thing_identifier, as in the following:
thing:
| thing_id | thing_name     | thing_attribute |
|----------|----------------|-----------------|
| 1        | thing-a-ma-jig | pretty thing    |
| 2        | herk-a-ma-fob  | blue thing      |
thing_identifier:
| id | thing_id | identifier |
|----|----------|------------------|
| 8 | 1 | '[111-22-ABC]' |
| 9 | 2 | '[999-88-XYZ]' |
I could use a CTE INSERT statement (with RETURNING thing_id) to get the thing_id that results from the INSERT on thing, but I can't figure out how to get both that thing_id from the INSERT on thing and the original guid from things_to_add, which needs to go into thing_identifier.identifier.
Just to be clear, the only guaranteed unique column in thing is thing_id, and the only guaranteed unique columns in things_to_add are id (which we don't want to store) and guid (which is what we want in thing_identifier.identifier), so there isn't any way to join thing and things_to_add after the INSERT on thing.
You can retrieve things_to_add.guid from a JOIN:
WITH list AS
(
INSERT INTO thing (thing_name)
SELECT thing_name
FROM things_to_add
RETURNING thing_id, thing_name
)
INSERT INTO thing_identifier (thing_id, identifier)
SELECT l.thing_id, t.guid
FROM list AS l
INNER JOIN things_to_add AS t
ON l.thing_name = t.thing_name;
Then, if thing.thing_name is not unique, the problem is more tricky. Updating both tables thing and thing_identifier from the same trigger on things_to_add may solve the issue:
CREATE OR REPLACE FUNCTION after_insert_things_to_add ()
RETURNS TRIGGER LANGUAGE plpgsql AS
$$
BEGIN
WITH list AS
(
INSERT INTO thing (thing_name)
SELECT NEW.thing_name
RETURNING thing_id
)
INSERT INTO thing_identifier (thing_id, identifier)
SELECT l.thing_id, NEW.guid
FROM list AS l;
RETURN NEW;
END;
$$;
DROP TRIGGER IF EXISTS after_insert ON things_to_add;
CREATE TRIGGER after_insert
AFTER INSERT
ON things_to_add
FOR EACH ROW
EXECUTE PROCEDURE after_insert_things_to_add();
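A quick way to exercise the trigger, reusing the sample data from the question:
INSERT INTO things_to_add (guid, thing_name, thing_attribute) VALUES
('[111-22-ABC]', 'Thing-a-ma-jig', 'pretty thing');
-- the AFTER INSERT trigger then writes one row into thing and one into thing_identifier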

ERROR: more than one owned sequence found in Postgres

I'm setting up an identity column on an existing column of the Patient table.
Here I would like to use GENERATED ALWAYS AS IDENTITY.
So I set up the identity column with the following statement (previously it was serial):
ALTER TABLE Patient ALTER PatientId
ADD GENERATED ALWAYS AS IDENTITY (START WITH 1);
The existing Patient table has a total of 5 records (PatientId 1 to 5).
When I insert a new record after the identity setup, it throws an error like:
more than one owned sequence found
Even after resetting the identity column, I still get the same error.
ALTER TABLE Patient ALTER COLUMN PatientId RESTART WITH 6;
Let me know if you have any solutions.
Update: This bug has been fixed in PostgreSQL v12 with commit 19781729f78.
The rest of the answer is relevant for older versions.
A serial column has a sequence that is owned by the column and a DEFAULT value that gets the next sequence value.
If you try to change that column into an identity column, you'll get an error that there is already a default value for the column.
Now you must have dropped the default value, but not the sequence that belongs to the serial column. Then when you converted the column into an identity column, a second sequence owned by the column was created.
Now when you try to insert a row, PostgreSQL tries to find and use the sequence owned by the column, but there are two, hence the error message.
I'd argue that this is a bug in PostgreSQL: in my opinion, it should either have repurposed the existing sequence for the identity column or given you an error that there is already a sequence owned by the column, and you should drop it. I'll try to get this bug fixed.
Meanwhile, you should manually drop the sequence left behind from the serial column.
Run the following query:
SELECT d.objid::regclass
FROM pg_depend AS d
JOIN pg_attribute AS a ON d.refobjid = a.attrelid AND
d.refobjsubid = a.attnum
WHERE d.classid = 'pg_class'::regclass
AND d.refclassid = 'pg_class'::regclass
AND d.deptype <> 'i'
AND a.attname = 'patientid'
AND d.refobjid = 'patient'::regclass;
That should give you the name of the sequence left behind from the serial column. Drop it, and the identity column should behave as desired.
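For example, if the leftover sequence follows the default serial naming (an assumption; use whatever name the query above actually returns), the cleanup would be:
DROP SEQUENCE patient_patientid_seq;  -- hypothetical name left behind by the old serial column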
This is not an answer -- apologies, but this allows me to show, with a vivid image, the crazy behavior that I (unintentionally) uncovered this morning...
All I had to do was this:
alter TABLE db.generic_items alter column generic_item_id drop default;
alter TABLE db.generic_items alter column generic_item_id add generated by default as identity;
and now when scripting the table to SQL I get (abbreviated):
CREATE TABLE db.generic_items
(
generic_item_id integer NOT NULL GENERATED BY DEFAULT AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1 ),
generic_item_id integer NOT NULL GENERATED BY DEFAULT AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1 ),
generic_item_name character varying(50) COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT pk_generic_items PRIMARY KEY (generic_item_id),
)
I am thankful for the answer posted above, by Laurenz Albe! As he explains, just delete the sequence that was used for the serial default, and this craziness goes away and the table looks normal again.
Again, this is NOT AN ANSWER, but commenting did not let me add enough text.
Apologies; this continues from my earlier comment(s).
This is what I executed, and it shows, imo, that the manual fix is not sufficient; with large tables, the repetitive trick I used (see below) would be impractical and potentially wrong, because it could adopt an id belonging to a deleted row.
-- pls disregard the absence of 2 id rows, this is the final situation
\d vaste_data.studie_type
Table "vaste_data.studie_type"
Column | Type | Collation | Nullable | Default
--------+-----------------------+-----------+----------+----------------------------------
id | integer | | not null | generated by default as identity
naam | character varying(25) | | not null |
Indexes:
"pk_tstudytype_tstudytype_id" PRIMARY KEY, btree (id)
Referenced by:
TABLE "stuwadoors" CONSTRAINT "fk_t_stuwadoors_t_studytype" FOREIGN KEY (study_type_id) REFERENCES vaste_data.studie_type(id)
TABLE "psux" CONSTRAINT "study_studytype_fk" FOREIGN KEY (studie_type_id) FOREIGN KEY (studie_type_id) REFERENCES vaste_data.studie_type(id)
alter table vaste_data.studie_type alter column id drop default;
ALTER TABLE
alter table vaste_data.studie_type alter column id add generated by default as identity;
ALTER TABLE
-- I chose to show both sequences so I could try to drop either one.
SELECT d.objid::regclass
FROM pg_depend AS d
JOIN pg_attribute AS a ON d.refobjid = a.attrelid AND
d.refobjsubid = a.attnum
WHERE d.classid = 'pg_class'::regclass
AND d.refclassid = 'pg_class'::regclass
AND a.attname = 'id'
AND d.refobjid = 'vaste_data.studie_type'::regclass;
objid
-----------------------------------------
vaste_data.studie_type_id_seq
vaste_data.tstudytype_tstudytype_id_seq
(2 rows)
drop sequence vaste_data.studie_type_id_seq;
ERROR: cannot drop sequence vaste_data.studie_type_id_seq because column id of table vaste_data.studie_type requires it
HINT: You can drop column id of table vaste_data.studie_type instead.
\d vaste_data.studie_type_id_seq
Sequence "vaste_data.studie_type_id_seq"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------+-------+---------+------------+-----------+---------+-------
integer | 1 | 1 | 2147483647 | 1 | no | 1
Sequence for identity column: vaste_data.studie_type.id
alter sequence vaste_data.studie_type_id_seq start 6;
ALTER SEQUENCE
drop sequence vaste_data.tstudytype_tstudytype_id_seq;
DROP SEQUENCE
insert into vaste_data.studie_type (naam) values('Overige leiding');
ERROR: duplicate key value violates unique constraint "pk_tstudytype_tstudytype_id"
DETAIL: Key (id)=(1) already exists.
...
ERROR: duplicate key value violates unique constraint "pk_tstudytype_tstudytype_id"
DETAIL: Key (id)=(5) already exists.
insert into vaste_data.studie_type (naam) values('Overige leiding');
INSERT 0 1

Using primary key & foreign key to build ltree

I am pretty new to postgres & especially new to ltree.
Searching the web for ltree brought me to examples where the tree was built by chaining characters. But I want to use the primary key & foreign key.
Therefore I build the following table:
create table fragment(
id serial primary key,
description text,
path ltree
);
create index tree_path_idx on fragment using gist (path);
Instead of A.B.G I want to have 1.3.5.
A root in the examples online is added like so:
insert into fragment (description, path) values ('A', 'A');
Instead of A I want to have the primary key (which I don't know at that moment). Is there a way to do that?
When adding a child I get the same problem:
insert into tree (letter, path) values ('B', '0.??');
I know the id of the parent but not of the child that I want to append.
Is there a way to do that, or am I completely off track?
Thank you very much!
You could create a trigger which modifies path before each insert. For example, using this setup:
DROP TABLE IF EXISTS fragment;
CREATE TABLE fragment(
id serial primary key
, description text
, path ltree
);
CREATE INDEX tree_path_idx ON fragment USING gist (path);
Define the trigger:
CREATE OR REPLACE FUNCTION before_insert_on_fragment()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
new.path := new.path || new.id::text;
return new;
END $$;
DROP TRIGGER IF EXISTS before_insert_on_fragment ON fragment;
CREATE TRIGGER before_insert_on_fragment
BEFORE INSERT ON fragment
FOR EACH ROW EXECUTE PROCEDURE before_insert_on_fragment();
Test the trigger:
INSERT INTO fragment (description, path) VALUES ('A', '');
SELECT * FROM fragment;
-- | id | description | path |
-- |----+-------------+------|
-- | 1 | A | 1 |
Now insert B under id = 1:
INSERT INTO fragment (description, path) VALUES ('B', (SELECT path FROM fragment WHERE id=1));
SELECT * FROM fragment;
-- | id | description | path |
-- |----+-------------+------|
-- | 1 | A | 1 |
-- | 2 | B | 1.2 |
Insert C under B:
INSERT INTO fragment (description, path) VALUES ('C', (SELECT path FROM fragment WHERE description='B'));
SELECT * FROM fragment;
-- | id | description | path |
-- |----+-------------+-------|
-- | 1 | A | 1 |
-- | 2 | B | 1.2 |
-- | 3 | C | 1.2.3 |
For anyone checking this in the future, I had the same issue and I figured out a way to do it without triggers and within the same INSERT query:
INSERT INTO fragment (description, path)
VALUES ('description', text2ltree('1.' || currval(pg_get_serial_sequence('fragment', 'id'))));
Explanation:
We can get the id of the current insert operation using currval(pg_get_serial_sequence('fragment', 'id')), which we can concatenate as a string with the parent's full path ('parent_path' ||) and finally convert to ltree using text2ltree(). The id from currval() doesn't have to be incremented manually, because nextval has already been called for it during the INSERT.
One edge case to be aware of: when you insert a node without any parent, you can't just remove the string concatenation '1.' ||, because the argument to text2ltree() must be text while id on its own is an integer. Instead you have to concatenate the id with an empty string: '' ||.
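For that root case, the insert would look like this (same fragment table as above):
INSERT INTO fragment (description, path)
VALUES ('root', text2ltree('' || currval(pg_get_serial_sequence('fragment', 'id'))));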
However, I prefer to create this function to get the path and clean up the insert query:
CREATE FUNCTION get_tree_path("table" TEXT, "column" TEXT, parent_path TEXT)
RETURNS LTREE
LANGUAGE PLPGSQL
AS
$$
BEGIN
IF NOT (parent_path = '') THEN
parent_path := parent_path || '.';
END IF;
RETURN text2ltree(parent_path || currval(pg_get_serial_sequence("table", "column")));
END;
$$;
Then, you can call it like this:
INSERT INTO fragment (description, path)
VALUES ('description', get_tree_path('fragment', 'id', '1.9.32'));
If you don't have any parent, then replace the parent_path '1.9.32' with empty text ''.
I came up with this; it needs the full parent path for the insert, but updates and deletes are simply cascaded :)
create table if not exists tree
(
-- primary key
id serial,
-- surrogate key
path ltree generated always as (coalesce(parent_path::text,'')::ltree || id::text::ltree) stored unique,
-- foreign key
parent_path ltree,
constraint fk_parent
foreign key(parent_path)
references tree(path)
on delete cascade
on update cascade,
-- content
name text
);
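A small usage sketch for this table, assuming it is freshly created so the serial assigns ids 1 and 2:
insert into tree (parent_path, name) values (null, 'root');   -- generated path: '1'
insert into tree (parent_path, name) values ('1', 'child');   -- generated path: '1.2'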

Postgres: complex CASCADE question - making sure you only delete unique foreign key references?

I've got some linked tables in a Postgres database, as follows:
Table "public.key"
Column | Type | Modifiers
--------+------+-----------
id | text | not null
name | text |
Referenced by:
TABLE "enumeration_value" CONSTRAINT "enumeration_value_key_id_fkey" FOREIGN KEY (key_id) REFERENCES key(id)
Table "public.enumeration_value"
Column | Type | Modifiers
--------+------+-----------
id | text | not null
key_id | text |
Foreign-key constraints:
"enumeration_value_key_id_fkey" FOREIGN KEY (key_id) REFERENCES key(id)
Referenced by:
TABLE "classification_item" CONSTRAINT "classification_item_value_id_fkey" FOREIGN KEY (value_id) REFERENCES enumeration_value(id)
Table "public.classification_item"
Column | Type | Modifiers
----------------+------+-----------
id | text | not null
transaction_id | text |
value_id | text |
Foreign-key constraints:
"classification_item_transaction_id_fkey" FOREIGN KEY (transaction_id) REFERENCES transaction(id)
"classification_item_value_id_fkey" FOREIGN KEY (value_id) REFERENCES enumeration_value(id)
I want to
delete all classification_items associated with a certain transaction
delete all enumeration_values associated with those classification_items
and finally, delete all key items associated with those enumeration_values.
The difficulty is that the key items are NOT unique to enumeration_values associated (via classification_item) with a certain transaction. They get created independently, and can exist across multiple of these transactions.
So I know how to do the second two of these steps, but not the first one:
delete from key where id in (select key_id from enumeration_value where id in (select value_id from "classification_item" where transaction_id in (select id from "transaction" where slice_id = (select id from slice where name = 'barnet'))));
# In statement above: help! How do I make sure these keys are ONLY used with these values?
delete from enumeration_value where id in (select value_id from "classification_item" where transaction_id in (select id from "transaction" where slice_id = (select id from slice where name = 'barnet')));
delete from classification_item where transaction_id in (select id from "transaction" where slice_id = (select id from slice where name = 'barnet'));
If only postgres had a CASCADE DELETE statement....
If only postgres had a CASCADE DELETE statement....
PostgreSQL has had this option for a long time, since version 8.0 (5 years ago). Just use it.
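Concretely, for the schema in the question that would mean recreating the two foreign keys with ON DELETE CASCADE (a sketch using the constraint names shown above); a single DELETE on key then removes the dependent enumeration_value and classification_item rows:
ALTER TABLE enumeration_value
DROP CONSTRAINT enumeration_value_key_id_fkey,
ADD CONSTRAINT enumeration_value_key_id_fkey
FOREIGN KEY (key_id) REFERENCES key(id) ON DELETE CASCADE;
ALTER TABLE classification_item
DROP CONSTRAINT classification_item_value_id_fkey,
ADD CONSTRAINT classification_item_value_id_fkey
FOREIGN KEY (value_id) REFERENCES enumeration_value(id) ON DELETE CASCADE;
Note that this alone does not check whether a key is still used by other enumeration_values, so it addresses the cascade part of the question rather than the uniqueness check.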

Inserting self-referential records in Postgresql

Given the following table in PostgreSQL, how do I insert a record which refers to itself?
CREATE TABLE refers (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
parent_id INTEGER NOT NULL,
FOREIGN KEY (parent_id) REFERENCES refers(id)
);
The examples I'm finding on the Web allow the parent_id to be NULL and then use a trigger to update it. I'd rather do it in one shot, if possible.
You can select last_value from the sequence that is automatically created when you use type serial:
create table test (
id serial primary key,
parent integer not null,
foreign key (parent) references test(id)
);
insert into test values(default, (select last_value from test_id_seq));
insert into test values(default, (select last_value from test_id_seq));
insert into test values(default, (select last_value from test_id_seq));
select * from test;
id | parent
----+--------
1 | 1
2 | 2
3 | 3
(3 rows)
And the following even simpler seems to work as well:
insert into test values(default, lastval());
Though I don't know how this would work when using multiple sequences... I looked it up; lastval() returns the last value returned or set with the last nextval or setval call to any sequence, so the following would get you in trouble:
create table test (
id serial primary key,
foo serial not null,
parent integer not null,
foreign key (parent) references test(id)
);
select setval('test_foo_seq', 100);
insert into test values(default, default, lastval());
ERROR: insert or update on table "test" violates foreign key constraint "test_parent_fkey"
DETAIL: Key (parent)=(101) is not present in table "test".
However the following would be okay:
insert into test values(default, default, currval('test_id_seq'));
select * from test;
id | foo | parent
----+-----+--------
2 | 102 | 2
(1 row)
The main question is: why would you want to insert a record which relates to itself?
The schema looks like a standard adjacency list, one of the methods for implementing trees in a relational database.
The thing is that in most cases you simply have parent_id NULL for the top-level element. This is actually much simpler to handle.
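A minimal sketch of that more common variant (hypothetical table name; nullable parent, so no self-reference is needed):
create table tree_node (
id serial primary key,
name varchar(255) not null,
parent_id integer references tree_node(id)  -- NULL marks a top-level element
);
insert into tree_node (name, parent_id) values ('root', null);
insert into tree_node (name, parent_id) values ('child', 1);  -- assumes the root row got id 1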