How to change the query to remain only leaf nodes - postgresql

I have table with the following data:
id | parent_id | short_name
----+-----------+----------------
6 | 5 | cpu
7 | 5 | ram
14 | 9 | tier-a
15 | 9 | rfc1918
16 | 9 | tolerant
17 | 9 | nononymous
13 | 12 | cloudstack
5 | 13 | virtualmachine
8 | 13 | volume
9 | 13 | ipv4
3 | | domain
4 | | account
12 | | vdc
(13 rows)
with recursive query it looks like this:
with recursive tree ( id, parent_id, short_name, deep_name ) as (
select resource_type_id, parent_resource_type_id, short_name, short_name::text
from resource_type
where parent_resource_type_id is null
union all
select rt.resource_type_id as id, rt.parent_resource_type_id, rt.short_name,
tree.deep_name || '.' || rt.short_name
from tree, resource_type rt
where tree.id = rt.parent_resource_type_id
)
select * from tree;
id | parent_id | short_name | deep_name
----+-----------+----------------+-----------------------------------
4 | | account | account
3 | | domain | domain
12 | | vdc | vdc
13 | 12 | cloudstack | vdc.cloudstack
9 | 13 | ipv4 | vdc.cloudstack.ipv4
5 | 13 | virtualmachine | vdc.cloudstack.virtualmachine
8 | 13 | volume | vdc.cloudstack.volume
6 | 5 | cpu | vdc.cloudstack.virtualmachine.cpu
15 | 9 | rfc1918 | vdc.cloudstack.ipv4.rfc1918
17 | 9 | nononymous | vdc.cloudstack.ipv4.nononymous
16 | 9 | tolerant | vdc.cloudstack.ipv4.tolerant
14 | 9 | tier-a | vdc.cloudstack.ipv4.tier-a
7 | 5 | ram | vdc.cloudstack.virtualmachine.ram
(13 rows)
How to fix the query so in result I get only leafs? eg. vdc.cloudstack.volume row and no vdc, vdc.cloudstack rows
UPD
rows with no children

Exclude the rows where deep_name has a superstring somewhere else in the table:
WITH RECURSIVE tree AS (...)
SELECT * FROM tree AS t1
WHERE NOT EXISTS (
SELECT 1 FROM tree AS t2
WHERE t2.deep_name
LIKE t1.deep_name || '.%'
);

Laurenz Albe's answer give me an idea. I think it would be more efficient to count childs than working with strings.
My solution is:
WITH RECURSIVE tree AS (...)
SELECT * FROM tree t1
WHERE not EXISTS ( SELECT 1 FROM tree t2 WHERE t1.id = t2.parent_id );

A leaf node is a child which is not itself a parent.
If all you want is a list of leaf notes you don't need the recursive CTE, you just need an anti-join in your preferred format.
If (as I imagine you do) you need the deep_name, I would anti-join the result of the recursive CTE to the raw source table on id = parent_id.
WITH RECURSIVE tree AS (...)
SELECT * FROM tree AS t1
WHERE NOT EXISTS (SELECT 1 FROM resource_type AS t2
WHERE t2.parent_resource_type_id = t1.id);

Related

Sorting columns with hierarchy and additional column for sorting

I have following structure:
id,
name,
parent_id,
order_by
and entries:
id | name | parent_id | order_by
----+-----------+-----------+----------
8 | Cat 1 | | 1
7 | Cat 2 | | 2
5 | Cat 3 | | 3
15 | Cat 1.1 | 8 | 1
17 | Cat 1.2 | 15 | 2
16 | Cat 2.1 | 8 | 1
20 | Cat 1.2.1 | 17 | 1
And I wanna output:
id | name | parent_id | order_by
----+-----------+-----------+----------
8 | Cat 1 | | 1
15 | Cat 1.1 | 8 | 1
17 | Cat 1.2 | 8 | 2
20 | Cat 1.2.1 | 17 | 1
7 | Cat 2 | | 2
16 | Cat 2.1 | 7 | 1
5 | Cat 3 | | 3
So sort main entries (without parent_id) using order_by column and sort in children using order_by column for children in one level.
Note: I assume that the for id = 16 the parent_id should be 7, not 8
You need a recursive query to go through the whole tree. You need a way to "remember" the main sort order and then sort by two different criteria: one for the "overall" sort order and one for each child level:
with recursive tree as (
select id, name, parent_id, order_by as main_order, null::int as child_order
from category
where parent_id is null
union all
select c.id, c.name, c.parent_id, p.main_order, c.order_by as child_order
from category c
join tree p on p.id = c.parent_id
)
select *
from tree
order by main_order, child_order nulls first;
By carrying the order_by from the root level to all children, we can keep all rows that belong to the same root together. The rows for one root are then sorted according to the fake child_order - the root rows will have null for that column and the nulls first puts them at the beginning of each group.
Online example: http://rextester.com/ZVLII98217

Multi-table recursive sql statement

I have been struggling to optimize a recursive call done purely in ruby. I have moved the data onto a postgresql database, and I would like to make use of the WITH RECURSIVE function that postgresql offers.
The examples that I could find all seems to use a single table, such as a menu or a categories table.
My situation is slightly different. I have a questions and an answers table.
+----------------------+ +------------------+
| questions | | answers |
+----------------------+ +------------------+
| id | | source_id | <- from question ID
| start_node (boolean) | | target_id | <- to question ID
| end_node (boolean) | +------------------+
+----------------------+
I would like to fetch all questions that's connected together by the related answers.
I would also like to be able to go the other way in the tree, e.g from any given node to the root node in the tree.
To give another example of a question-answer tree in a graphical way:
Q1
|-- A1
| '-- Q2
| |-- A2
| | '-- Q3
| '-- A3
| '-- Q4
'-- A4
'-- Q5
As you can see, a question can have multiple outgoing questions, but they can also have multiple incoming answers -- any-to-many.
I hope that someone has a good idea, or can point me to some examples, articles or guides.
Thanks in advance, everybody.
Regards,
Emil
This is far, far from ideal but I would play around recursive query over joins, like that:
WITH RECURSIVE questions_with_answers AS (
SELECT
q.*, a.*
FROM
questions q
LEFT OUTER JOIN
answers a ON (q.id = a.source_id)
UNION ALL
SELECT
q.*, a.*
FROM
questions_with_answers qa
JOIN
questions q ON (qa.target_id = q.id)
LEFT OUTER JOIN
answers a ON (q.id = a.source_id)
)
SELECT * FROM questions_with_answers WHERE source_id IS NOT NULL AND target_id IS NOT NULL;
Which gives me result:
id | name | start_node | end_node | source_id | target_id
----+------+------------+----------+-----------+-----------
1 | Q1 | | | 1 | 2
2 | A1 | | | 2 | 3
3 | Q2 | | | 3 | 4
3 | Q2 | | | 3 | 6
4 | A2 | | | 4 | 5
6 | A3 | | | 6 | 7
1 | Q1 | | | 1 | 8
8 | A4 | | | 8 | 9
2 | A1 | | | 2 | 3
3 | Q2 | | | 3 | 6
3 | Q2 | | | 3 | 4
4 | A2 | | | 4 | 5
6 | A3 | | | 6 | 7
8 | A4 | | | 8 | 9
3 | Q2 | | | 3 | 6
3 | Q2 | | | 3 | 4
6 | A3 | | | 6 | 7
4 | A2 | | | 4 | 5
6 | A3 | | | 6 | 7
4 | A2 | | | 4 | 5
(20 rows)
In fact you do not need two tables.
I would like to encourage you to analyse this example.
Maintaining one table instead of two will save you a lot of trouble, especially when it comes to recursive queries.
This minimal structure contains all the necessary information:
create table the_table (id int primary key, parent_id int);
insert into the_table values
(1, 0), -- root question
(2, 1),
(3, 1),
(4, 2),
(5, 2),
(6, 1),
(7, 3),
(8, 0), -- root question
(9, 8);
Whether the node is a question or an answer depends on its position in the tree. Of course, you can add a column with information about the type of node to the table.
Use this query to get answer for both your requests (uncomment adequate where condition):
with recursive cte(id, parent_id, depth, type, root) as (
select id, parent_id, 1, 'Q', id
from the_table
where parent_id = 0
-- and id = 1 <-- looking for list of a&q for root question #1
union all
select
t.id, t.parent_id, depth+ 1,
case when (depth & 1)::boolean then 'A' else 'Q' end, c.root
from cte c
join the_table t on t.parent_id = c.id
)
select *
from cte
-- where id = 9 <-- looking for root question for answer #9
order by id;
id | parent_id | depth | type | root
----+-----------+-------+------+------
1 | 0 | 1 | Q | 1
2 | 1 | 2 | A | 1
3 | 1 | 2 | A | 1
4 | 2 | 3 | Q | 1
5 | 2 | 3 | Q | 1
6 | 1 | 2 | A | 1
7 | 3 | 3 | Q | 1
8 | 0 | 1 | Q | 8
9 | 8 | 2 | A | 8
(9 rows)
The relationship child - parent is unambiguous and applies to both sides. There is no need to store this information twice. In other words, if we store information about parents, the information about children is redundant (and vice versa). It is one of the fundamental properties of the data structure called tree. See the examples:
-- find parent of node #6
select parent_id
from the_table
where id = 6;
-- find children of node #6
select id
from the_table
where parent_id = 6;

Full Outer Join on two columns is omitting rows

Some background, I am making a table in Postgres 9.5 that counts the number of actions performed by a user and grouping these actions by month using date_trunc(). The counts for each individual action are divided into separate tables, following this format:
Feedback table:
id | month | feedback_counted
----+---------+-------------------
1 | 2 | 3
1 | 3 | 10
1 | 4 | 7
1 | 5 | 2
Comments table:
id | month | comments_counted
----+---------+-------------------
1 | 4 | 12
1 | 5 | 4
1 | 6 | 57
1 | 7 | 12
Ideally, I would like to do a FULL OUTER JOIN of these tables ON the "id" and "month" columns at the same time and produce this query:
Combined table:
id | month | feedback_counted | comments_counted
----+---------+--------------------+-------------------
1 | 2 | 3 |
1 | 3 | 10 |
1 | 4 | 7 | 12
1 | 5 | 2 | 4
1 | 6 | | 57
1 | 7 | | 12
However, my current query does not capture the feedback dates, displaying it like such:
Rollup table:
id | month | feedback_counted | comments_counted
----+---------+--------------------+-------------------
| | |
| | |
1 | 4 | 7 | 12
1 | 5 | 2 | 4
1 | 6 | | 57
1 | 7 | | 12
This is my current statement, note that it uses date_trunc in place of month. I add the action counts later, the main issue is somewhere here.
CREATE TABLE rollup_table AS
SELECT c.id, c.date_trunc
FROM comments_counted c FULL OUTER JOIN feedback_counted f
ON c.id = f.id AND c.date_trunc = f.date_trunc
GROUP BY c.id, c.date_trunc, f.id, f.date_trunc;
I'm a bit of a novice with SQL and am not sure how to fix this, any help would be appreciated.
Replace ON c.id = f.id AND c.month = f.month with USING(id, month).
SELECT id, month, feedback_counted, comments_counted
FROM comments c
FULL OUTER JOIN feedback f
USING(id, month);
id | month | feedback_counted | comments_counted
----+-------+------------------+------------------
1 | 2 | 3 |
1 | 3 | 10 |
1 | 4 | 7 | 12
1 | 5 | 2 | 4
1 | 6 | | 57
1 | 7 | | 12
(6 rows)
Test it in db<>fiddle.
USING() basically is the same as ON, just that if the 2 tables share the same column names, you can use USING() instead of ON to save some typing effort. That being said, using USING() won't work. In Postgresql (not sure about other sql versions), you still need to specify c.id, and c.month, even with USING(). And as long as you specify the columns, Postgresql will only pull the rows where the values of these columns exist. That's why you will have missing rows under the full outer join.
Here is a way that at least works for me.
SELECT COALESCE(c.id, f.id) AS id,
COALESCE(c.month, f.month) AS month,
feedback_counted,
comments_counted
FROM comments c
FULL OUTER JOIN feedback f
ON c.id = f.id AND c.month = f.month;

Postgresql joining one column with 2 not null columns

I have 2 tables that I want match by ID.
EDIT: Table1:
| id | other columns A |
| 23 | ... |
| 27 | ... |
| 9 | ... |
| 50 | ... |
Table2:
| id_new | id_old | other columns B
| 23 | 7 | ...
| 27 | 8 | ...
| 33 | 9 | ...
Problem is that the second table contains 2 ID columns: first with new ID second with the old one - both can match the ID from first table.
EDIT: there are some rows from table A which ID not match neither id_new nor id_old. But I want them to retain in the new table.
This is my desired result:
| id | id_new | id_old | other columns A + B
| 23 | 23 | 7 | A + B
| 27 | 27 | 8 | A + B
| 9 | 33 | 9 | A + B
| 50 | -- | -- | A
I tried this one but it's a huge dataset and my query takes a long time to execute.
create table spoj2
as
select *
from table1
left join table2 on table1.id = table2.id_new
or table1.id = table2.id_old
Is this what you need?
select table1.id, IFNULL(T2A.id_new, T2B.id_new) as id_new
, IFNULL(T2A.id_old, T2B.id_old) as id_old
FROM table1
LEFT JOIN table2 as T2A ON table1.id = T2A.id_new
LEFT JOIN table2 as T2B ON table1.id = T2B.id_old
WITH t1(id,o_t1) AS ( VALUES
(23,'...'),
(27,'...'),
(9,'...')
), t2(id_new,id_old,o_t2) AS ( VALUES
(23,7,'...'),
(27,8,'...'),
(33,9,'...')
)
SELECT t1.id,t2.id_new,t2.id_old,t1.o_t1,t2.o_t2 FROM t1
INNER JOIN t2 ON t2.id_new = t1.id
UNION ALL
SELECT t1.id,t2.id_new,t2.id_old,t1.o_t1,t2.o_t2 FROM t1
INNER JOIN t2 ON t2.id_old = t1.id;
Result:
id | id_new | id_old | o_t1 | o_t2
----+--------+--------+------+------
23 | 23 | 7 | ... | ...
27 | 27 | 8 | ... | ...
9 | 33 | 9 | ... | ...
(3 rows)

PostgreSQL Query?

DB
| ID| VALUE | Parent | Position | lft | rgt |
|---|:------:|:-------:|:--------------:|:--------:|:--------:|
| 1 | A | | | 1 | 12 |
| 2 | B | 1 | L | 2 | 9 |
| 3 | C | 1 | R | 10 | 11 |
| 4 | D | 2 | L | 3 | 6 |
| 5 | F | 2 | R | 7 | 8 |
| 6 | G | 4 | L | 4 | 5 |
Get All Nodes directly under current Node in left side
SELECT "categories".* FROM "categories" WHERE ("categories"."position" = 'L') AND ("categories"."lft" >= 1 AND "categories"."lft" < 12) ORDER BY "categories"."lft"
output { B,D,G } incoorect!
Question !
how have Nodes directly under current Node in left and right side?
output-lft {B,D,F,G}
output-rgt {C}
It sounds like you're after something analogous to Oracle's CONNECT_BY statement, which is used to connect hierarchical data stored in a flat table.
It just so happens there's a way to do this with Postgres, using a recursive CTE.
here is the statement I came up with.
WITH RECURSIVE sub_categories AS
(
-- non-recursive term
SELECT * FROM categories WHERE position IS NOT NULL
UNION ALL
-- recursive term
SELECT c.*
FROM
categories AS c
JOIN
sub_categories AS sc
ON (c.parent = sc.id)
)
SELECT DISTINCT categories.value
FROM categories,
sub_categories
WHERE ( categories.parent = sub_categories.id
AND sub_categories.position = 'L' )
OR ( categories.parent = 1
AND categories.position = 'L' )
here is a SQL Fiddle with a working example.