Finding the root node of a tree in Postgres

I am using PostgreSQL and I have data in a table like this.
id parent_id
1 NULL
2 1
3 2
4 NULL
5 4
I would like to get the following result.
id root_id
1 NULL
2 1
3 1
4 NULL
5 4
i.e., for each id I would like to walk up the parents until I reach a node that has no parent (the ultimate ancestor, so to speak).
Would be much obliged for some SQL-fu that solves this.
Thanks!

You should start from the roots and walk the tree toward the leaves:
with recursive my_tree as (
    select id, parent_id, id as root
    from my_table
    where parent_id is null
  union all
    select m.id, m.parent_id, t.root
    from my_table m
    join my_tree t on t.id = m.parent_id
)
select id, root
from my_tree
order by id;
Note that according to the definition (a node is a root when it does not have a parent), the root of node 1 is 1, not null.
Test it in db<>fiddle.
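If the output has to match the question exactly, with NULL as root_id on the root rows themselves, one option is to wrap root in NULLIF(root, id). Below is a minimal sketch of that variant, run through Python's built-in sqlite3 module as a stand-in for PostgreSQL (both accept this WITH RECURSIVE syntax). The table name my_table comes from the answer above; the root_id alias and the in-memory setup are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO my_table (id, parent_id) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 2), (4, None), (5, 4)],  # sample data from the question
)

query = """
WITH RECURSIVE my_tree AS (
    -- anchor: every root is its own root
    SELECT id, parent_id, id AS root
    FROM my_table
    WHERE parent_id IS NULL
    UNION ALL
    -- step: children inherit the root of their parent
    SELECT m.id, m.parent_id, t.root
    FROM my_table m
    JOIN my_tree t ON t.id = m.parent_id
)
-- NULLIF turns "root equals the node itself" into NULL
SELECT id, NULLIF(root, id) AS root_id
FROM my_tree
ORDER BY id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this gives NULL for ids 1 and 4, root 1 for ids 2 and 3, and root 4 for id 5.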

Related

Finding and creating missing rows in table

Hello postgres experts,
I have an app where users can vote on a poll. My schema looks like this:
polls table:
id name
1 Favorite fruit
options table:
id poll_id content
1 1 apple
2 1 orange
3 1 grape
4 1 banana
participants table:
id poll_id name
1 1 John
2 1 Jane
votes table:
id poll_id participant_id option_id type
1 1 1 1 yes
2 1 1 3 yes
3 1 2 2 yes
I made the poor choice of deciding to not create rows for "no" votes in the votes table thinking it would "save space". I realize now that it was not such a great idea because in the future I would like to know whether the user explicitly voted "no" or if perhaps the option was added after they voted and thus did not have the option to choose it. So I need to run a query that will fill all the missing "no" votes in the votes table for existing participants. The final result should look like this:
votes table:
id poll_id participant_id option_id type
1 1 1 1 yes
2 1 1 3 yes
3 1 2 2 yes
4 1 1 2 no
5 1 1 4 no
6 1 2 1 no
7 1 2 3 no
8 1 2 4 no
I have a dbfiddle with all the data already in it:
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=7d0f4c83095638cc6006b1d7876d0e01
Side question: Should I be concerned about the size of the votes table in this schema? I expect it to quickly blow up to millions of rows. Is a schema where options are stored as an array in the polls table and votes stored in the participants table a better idea?
Thank you for your help.

You seem to be looking for a JOIN of participants with options, EXCEPT the rows that are already in votes. There are various ways to do that, but the most straightforward is:
INSERT INTO votes(poll_id, participant_id, option_id, type)
SELECT poll_id, participant_id, option_id, 'no'
FROM (
    SELECT o.poll_id, p.id AS participant_id, o.id AS option_id
    FROM options o
    JOIN participants p ON o.poll_id = p.poll_id
  EXCEPT
    SELECT poll_id, participant_id, option_id
    FROM votes
) AS missing;
Alternatively:
INSERT INTO votes(poll_id, participant_id, option_id, type)
SELECT o.poll_id, p.id, o.id, 'no'
FROM options o
JOIN participants p ON o.poll_id = p.poll_id
WHERE NOT EXISTS (
    SELECT *
    FROM votes
    WHERE poll_id = o.poll_id AND participant_id = p.id AND option_id = o.id
);
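Or, assuming you already have a UNIQUE constraint on votes covering (poll_id, participant_id, option_id), here called votes_p_key, just:
INSERT INTO votes(poll_id, participant_id, option_id, type)
SELECT o.poll_id, p.id, o.id, 'no'
FROM options o
JOIN participants p ON o.poll_id = p.poll_id
ON CONFLICT ON CONSTRAINT votes_p_key
DO NOTHING;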
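To check that the NOT EXISTS variant fills exactly the missing combinations and is safe to run more than once, here is a small sketch against the sample data. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL; the column types and the count_votes helper are assumptions for illustration, not part of the original schema.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE options (id INTEGER PRIMARY KEY, poll_id INTEGER, content TEXT);
CREATE TABLE participants (id INTEGER PRIMARY KEY, poll_id INTEGER, name TEXT);
CREATE TABLE votes (
    id INTEGER PRIMARY KEY,  -- auto-assigned when omitted from the INSERT
    poll_id INTEGER,
    participant_id INTEGER,
    option_id INTEGER,
    type TEXT
);
INSERT INTO options VALUES (1, 1, 'apple'), (2, 1, 'orange'), (3, 1, 'grape'), (4, 1, 'banana');
INSERT INTO participants VALUES (1, 1, 'John'), (2, 1, 'Jane');
INSERT INTO votes VALUES (1, 1, 1, 1, 'yes'), (2, 1, 1, 3, 'yes'), (3, 1, 2, 2, 'yes');
""")

backfill = """
INSERT INTO votes (poll_id, participant_id, option_id, type)
SELECT o.poll_id, p.id, o.id, 'no'
FROM options o
JOIN participants p ON o.poll_id = p.poll_id
WHERE NOT EXISTS (
    SELECT 1
    FROM votes v
    WHERE v.poll_id = o.poll_id
      AND v.participant_id = p.id
      AND v.option_id = o.id
)
"""


def count_votes():
    """Hypothetical helper: number of rows currently in votes."""
    return conn.execute("SELECT COUNT(*) FROM votes").fetchone()[0]


before = count_votes()
conn.execute(backfill)
after_first = count_votes()
conn.execute(backfill)  # second run finds nothing missing
after_second = count_votes()

rows = conn.execute(
    "SELECT participant_id, option_id, type FROM votes "
    "ORDER BY participant_id, option_id"
).fetchall()
print(before, after_first, after_second)
print(rows)
```

Every participant ends up with one row per option, and rerunning the statement adds nothing, which is the property you want from a one-off backfill.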

How to include and exclude ids in one query in PostgreSQL

I use PostgreSQL 13.3.
I'm trying to work out how to apply include and exclude filters in the same query.
I have include_system_ids [1,5] and exclude_system_ids [3].
There's one big table, records, and a system_records table:
record system_id
1 1
1 5
1 3
2 1
2 5
If a record has an excluded system id, it should not appear in the final selection. I have made several attempts, but none gave the result I need.
Expected result: record with id 2
Actual result: 1, 2
My attempt:
select r.id from records r
left join (select record_id from system_records
where system_id in (1,5)
) include_ids on r.id = include_ids
left join (select record_id from system_records
where system_id not in (3)
) exclude_ids on r.id = exclude_ids.id
Honestly, I don't understand how to do this.
Can anyone help me?

Maybe this query could be a solution:
with x as (
    select record, string_agg(system_id::varchar, ',' order by system_id) as sys_id
    from records
    group by record
)
select records.*
from records, x
where records.record = x.record
and x.sys_id = '1,5'
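Comparing an aggregated string only matches records whose id list is exactly '1,5'. A more general pattern is to count included and excluded ids per record with GROUP BY and HAVING. The sketch below assumes "include" means at least one of the included ids is present and "exclude" means none of the excluded ids is present; it uses the system_records sample data from the question and runs on Python's built-in sqlite3 module as a stand-in for PostgreSQL.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE system_records (record INTEGER, system_id INTEGER)")
conn.executemany(
    "INSERT INTO system_records (record, system_id) VALUES (?, ?)",
    [(1, 1), (1, 5), (1, 3), (2, 1), (2, 5)],  # sample data from the question
)

query = """
SELECT record
FROM system_records
GROUP BY record
-- at least one included system id ...
HAVING SUM(CASE WHEN system_id IN (1, 5) THEN 1 ELSE 0 END) > 0
-- ... and no excluded system id
   AND SUM(CASE WHEN system_id IN (3) THEN 1 ELSE 0 END) = 0
ORDER BY record
"""

matching = [row[0] for row in conn.execute(query)]
print(matching)
```

Record 1 is dropped because it also has system id 3, so only record 2 remains.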

Subsetting records that contain multiple values in one column

In my postgres table, I have two columns of interest: id and name. My goal is to keep only records where id has more than one value in name. In other words, I would like to keep all records of ids that have multiple values and where at least one of those values is B.
UPDATE: I have tried adding WHERE EXISTS to the queries below but this does not work
The sample data would look like this:
> test
id name
1 1 A
2 2 A
3 3 A
4 4 A
5 5 A
6 6 A
7 7 A
8 2 B
9 1 B
10 2 B
and the output would look like this:
> output
id name
1 1 A
2 2 A
8 2 B
9 1 B
10 2 B
How would one write a query to select only these kinds of records?

Based on your description, you seem to want:
select id, name
from (select t.*,
             min(name) over (partition by id) as min_name,
             max(name) over (partition by id) as max_name
      from test t
     ) t
where min_name < max_name;

This can be done using EXISTS:
select id, name
from test t1
where exists (select *
              from test t2
              where t1.id = t2.id
                and t1.name <> t2.name) -- this will select those with multiple names for the id
  and exists (select *
              from test t3
              where t1.id = t3.id
                and t3.name = 'B') -- this will select those with at least one B for that id

You mean the records whose id appears with more than one name, right?
This could be formulated in SQL as follows:
select * from test t1
where id in (
    select id
    from test t2
    group by id
    having count(distinct name) > 1)
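Both conditions from the question (more than one distinct name, and at least one B) can also be checked in a single aggregate. This is a sketch rather than a definitive answer: it uses the sample data from the question, the alias picked is made up for illustration, and it runs on Python's built-in sqlite3 module as a stand-in for PostgreSQL.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO test (id, name) VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "A"), (4, "A"), (5, "A"),
     (6, "A"), (7, "A"), (2, "B"), (1, "B"), (2, "B")],  # sample data from the question
)

query = """
SELECT t.id, t.name
FROM test t
JOIN (
    SELECT id
    FROM test
    GROUP BY id
    -- DISTINCT matters: id 2 has B twice, which must count as one value
    HAVING COUNT(DISTINCT name) > 1
       AND SUM(CASE WHEN name = 'B' THEN 1 ELSE 0 END) > 0
) picked ON picked.id = t.id
ORDER BY t.id, t.name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only ids 1 and 2 survive, and all of their rows are kept, including the duplicate B for id 2.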

Using recursive CTE with Ecto

How would I go about using the result of a recursive CTE in a query I plan to run with Ecto? For example let's say I have a table, nodes, structured as so:
-- nodes table example --
id parent_id
1 NULL
2 1
3 1
4 1
5 2
6 2
7 3
8 5
and I also have another table nodes_users structured as so:
-- nodes_users table example --
node_id user_id
1 1
2 2
3 3
5 4
Now, I want to grab all the users with a node at or above a specific node, for the sake of an example let's choose the node w/ the id 8.
I could use the following recursive postgresql query to do so:
WITH RECURSIVE nodes_tree AS (
    SELECT *
    FROM nodes
    WHERE nodes.id = 8
  UNION ALL
    SELECT n.*
    FROM nodes n
    INNER JOIN nodes_tree nt ON nt.parent_id = n.id
)
SELECT u.* FROM users u
INNER JOIN nodes_users un ON un.user_id = u.id
INNER JOIN nodes_tree nt ON nt.id = un.node_id
This should return users.* for the users w/ id of 1, 2, and 4.
I'm not sure how I could run this same query using ecto, ideally in a manner that would return a chainable output. I understand that I can insert raw SQL into my query using the fragment macro, but I'm not exactly sure where that would go for this use or if that would even be the most appropriate route to take.
Help and/or suggestions would be appreciated!

I was able to accomplish this using a fragment. Here's an example of the code I used. I'll probably move this method to a stored procedure.
Repo.all(
  from u in MyProj.User,
    join: un in MyProj.UserNode, on: u.id == un.user_id,
    join: nt in fragment("""
    (
      WITH RECURSIVE node_tree AS (
        SELECT *
        FROM nodes
        WHERE nodes.id = ?
        UNION ALL
        SELECT n.*
        FROM nodes n
        INNER JOIN node_tree nt ON nt.parent_id = n.id
      )
      SELECT * FROM node_tree
    )
    """, ^node_id), on: un.node_id == nt.id
)
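Before wiring the SQL into a fragment, it can help to check the recursive part on its own. The sketch below does that with the sample nodes and nodes_users data from the question, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. Since the question gives no rows for the users table, the query returns user ids straight from nodes_users; that simplification is an assumption of this example.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.execute("CREATE TABLE nodes_users (node_id INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO nodes (id, parent_id) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 1), (5, 2), (6, 2), (7, 3), (8, 5)],
)
conn.executemany(
    "INSERT INTO nodes_users (node_id, user_id) VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 3), (5, 4)],
)

query = """
WITH RECURSIVE nodes_tree AS (
    -- anchor: the starting node
    SELECT id, parent_id FROM nodes WHERE id = ?
    UNION ALL
    -- step: move from each collected node to its parent
    SELECT n.id, n.parent_id
    FROM nodes n
    JOIN nodes_tree nt ON nt.parent_id = n.id
)
SELECT un.user_id
FROM nodes_users un
JOIN nodes_tree nt ON nt.id = un.node_id
ORDER BY un.user_id
"""

user_ids = [row[0] for row in conn.execute(query, (8,))]
print(user_ids)
```

Starting from node 8 the walk visits nodes 8, 5, 2 and 1, which matches the users 1, 2 and 4 expected in the question.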

CTE query to root element postgres

This is a very general question. I found some questions and discussions on more specific problems on SO, but I am quite sure that many of you have already solved this one:
input:
A table that has a tree structure in one field.
An arbitrary id of a database record x.
question:
How can I get the root of the tree of x?
I found out that there should be a way to implement this recursively, but I haven't managed it yet.

The root element can be found in the same way as the child elements of a given root,
but the query must search in the opposite direction.
Take a look at this simple demo: http://www.sqlfiddle.com/#!17/fdc8a/1
This query retrieves all children of a given root:
WITH RECURSIVE childs( id, parent_id )
AS (
    -- get parent
    SELECT id, parent_id
    FROM tab
    WHERE id = 10
  UNION ALL
    -- get all children
    SELECT t.id, t.parent_id
    FROM childs c
    JOIN tab t
    ON t.parent_id = c.id
)
SELECT * from childs;
and this query retrieves all parents of a given child node:
WITH RECURSIVE parents( id, parent_id )
AS (
    -- get leaf children
    SELECT id, parent_id
    FROM tab
    WHERE id = 14
  UNION ALL
    -- get all parents
    SELECT t.id, t.parent_id
    FROM parents p
    JOIN tab t
    ON p.parent_id = t.id
)
SELECT * from parents
-- WHERE parent_id is null;
If only the root node is needed, a WHERE parent_id IS NULL clause filters out everything except the root.
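As a sketch of that last step, the upward query with the IS NULL filter can be wrapped in a small function that returns the root for any starting id. The rows in tab below are made up for illustration (the data behind the linked demo is not reproduced here), find_root is a hypothetical helper name, and Python's built-in sqlite3 module stands in for PostgreSQL.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO tab (id, parent_id) VALUES (?, ?)",
    # illustrative data: two trees, rooted at 10 and 20
    [(10, None), (11, 10), (12, 11), (14, 12), (20, None), (21, 20)],
)

ROOT_QUERY = """
WITH RECURSIVE parents(id, parent_id) AS (
    -- anchor: the node we start from
    SELECT id, parent_id FROM tab WHERE id = ?
    UNION ALL
    -- step: climb to the parent of each collected node
    SELECT t.id, t.parent_id
    FROM parents p
    JOIN tab t ON p.parent_id = t.id
)
-- only the root has no parent
SELECT id FROM parents WHERE parent_id IS NULL
"""


def find_root(node_id):
    """Return the root id for node_id, or None if node_id does not exist."""
    row = conn.execute(ROOT_QUERY, (node_id,)).fetchone()
    return row[0] if row else None


print(find_root(14))
print(find_root(21))
```

A node that is already a root is returned as its own root, and an unknown id yields None.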