How to get the top most parent in PostgreSQL - postgresql

I have a tree structure table with columns:
id,parent,name.
Given a tree A->B->C,
how could i get the most top parent A's ID according to C's ID?
Especially how to write SQL with "with recursive"?
Thanks!

WITH RECURSIVE q AS
(
SELECT m
FROM mytable m
WHERE id = 'C'
UNION ALL
SELECT m
FROM q
JOIN mytable m
ON m.id = q.parent
)
SELECT (m).*
FROM q
WHERE (m).parent IS NULL

To implement recursive queries, you need a Common Table Expression (CTE).
This query computes ancestors of all parent nodes. Since we want just the top level, we select where level=0.
WITH RECURSIVE Ancestors AS
(
SELECT id, parent, 0 AS level FROM YourTable WHERE parent IS NULL
UNION ALL
SELECT child.id, child.parent, level+1 FROM YourTable child INNER JOIN
Ancestors p ON p.id=child.parent
)
SELECT * FROM Ancestors WHERE a.level=0 AND a.id=C
If you want to fetch all your data, then use an inner join on the id, e.g.
SELECT YourTable.* FROM Ancestors a WHERE a.level=0 AND a.id=C
INNER JOIN YourTable ON YourTable.id = a.id

Assuming a table named "organization" with properties id, name, and parent_organization_id, here is what worked for me to get a list that included top level and parent level org ID's for each level.
WITH RECURSIVE orgs AS (
SELECT
o.id as top_org_id
,null::bigint as parent_org_id
,o.id as org_id
,o.name
,0 AS relative_depth
FROM organization o
UNION
SELECT
allorgs.top_org_id
,childorg.parent_organization_id
,childorg.id
,childorg.name
,allorgs.relative_depth + 1
FROM organization childorg
INNER JOIN orgs allorgs ON allorgs.org_id = childorg.parent_organization_id
) SELECT
*
FROM
orgs order by 1,5;

Related

How to use aggregate functions when using recursive query in postgresql

On multiple iteration on a recursive query in postgresql, I have got the following result when i run the below query
WITH recursive report AS (
select a.name, a.id, a.parentid, sum(b.id)
from table1 a
INNER JOIN table2 b on a.id=b.table1id
GROUP by a.name, a.id, a.parentid
), report2 AS (
SELECT , 0 as lvl
FROM report
WHERE parentid IS NULL
UNION ALL
SELECT child., parent.lvl + 1
FROM report child
JOIN report2 parent
ON parent.id = child.parentid
)
select * from report2
I want to sum the count column with the top most level, so my output should be like below,
What is the best possible way to get it.
If you calculate a path during recursion, like so:
WITH recursive report AS (
select a.name, a.id, a.parentid, sum(b.id) -- Is summing b.id the right thing here?
from table1 a
INNER JOIN table2 b on a.id=b.table1id
GROUP by a.name, a.id, a.parentid
), report2 AS (
SELECT report.*, 0 as lvl, array[report.id] as path_array
FROM report
WHERE parentid IS NULL
UNION ALL
SELECT child.*, parent.lvl + 1, report2.path_array||report.id
FROM report child
JOIN report2 parent
ON parent.id = child.parentid
)
select * from report2;
Do you really mean sum(b.id) and not count(*) in the report CTE?
You can get the sum of count for your top levels using this query as the main query from your recursion:
select t.name, sum(r.count) as total_count
from report2 r
join table1 t
on t.id = r.path_array[1]
group by t.name;

How to find in a many to many relation all the identical values in a column and join the table with other three tables?

I have a many to many relation with three columns, (owner_id,property_id,ownership_perc) and for this table applies (many owners have many properties).
So I would like to find all the owner_id who has many properties (property_id) and connect them with other three tables (Table 1,3,4) in order to get further information for the requested result.
All the tables that I'm using are
Table 1: owner (id_owner,name)
Table 2: owner_property (owner_id,property_id,ownership_perc)
Table 3: property(id_property,building_id)
Table 4: building(id_building,address,region)
So, when I'm trying it like this, the query runs but it returns empty.
SELECT address,region,name
FROM owner_property
JOIN property ON owner_property.property_id = property.id_property
JOIN owner ON owner.id_owner = owner_property.owner_id
JOIN building ON property.building_id=building.id_building
GROUP BY owner_id,address,region,name
HAVING count(owner_id) > 1
ORDER BY owner_id;
Only when I'm trying the code below, it returns the owner_id who has many properties (see image below) but without joining it with the other three tables:
SELECT a.*
FROM owner_property a
JOIN (SELECT owner_id, COUNT(owner_id)
FROM owner_property
GROUP BY owner_id
HAVING COUNT(owner_id)>1) b
ON a.owner_id = b.owner_id
ORDER BY a.owner_id,property_id ASC;
So, is there any suggestion on what I'm doing wrong when I'm joining the tables? Thank you!
This query:
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
returns all the owner_ids with more than 1 property_ids.
If there is a case of duplicates in the combination of owner_id and property_id then instead of COUNT(property_id) use COUNT(DISTINCT property_id) in the HAVING clause.
So join it to the other tables:
SELECT b.address, b.region, o.name
FROM (
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
) t
INNER JOIN owner_property op ON op.owner_id = t.owner_id
INNER JOIN property p ON op.property_id = p.id_property
INNER JOIN owner o ON o.id_owner = op.owner_id
INNER JOIN building b ON p.building_id = b.id_building
ORDER BY op.owner_id, op.property_id ASC;
Always qualify the column names with the table name/alias.
You can try to use a correlated subquery that counts the ownerships with EXISTS in the WHERE clause.
SELECT b1.address,
b1.region,
o1.name
FROM owner_property op1
INNER JOIN owner o1
ON o1.id_owner = op1.owner_id
INNER JOIN property p1
ON p1.id_property = op1.property_id
INNER JOIN building b1
ON b1.id_building = p1.building_id
WHERE EXISTS (SELECT ''
FROM owner_property op2
WHERE op2.owner_id = op1.owner_id
HAVING count(*) > 1);

How should I add fields without adding them to a GROUP BY?

I have a SQL statement that works as-is. I get an area name and the minimum value within that area. next, I need to add in a key so I can actually do something with the results. The key is necessary since names and values are unlikely to be unique.
select g.name, min(g.rndval) from
(
select p.rndval, a.name, p.id
from points p, areas a
where ST_WITHIN(p.geom, a.geom)
) AS g
group by g.name
When I add the Id field to the group by, the query returns multiple rows for each area, as expected since it's grouping by the name and id combination, and the results are no longer what I need. How should I add in the id field (p.id in the inner select)?
You can try:
WITH cte AS
( select p.rndval, a.name, p.id
from points p, areas a
where ST_WITHIN(p.geom, a.geom)
), cte_aggregated AS
(
SELECT name, min(rndval) AS min_value
FROM cte
GROUP BY name
)
SELECT DISTINCT c.rndval, c.name, c.id
FROM cte c
JOIN cte_aggregated ca
ON c.rndval = ca.min_value
AND c.name = ca.name;
You can solve this quite elegantly with a window function:
select name, rndval as min, id
from (
select a.name, p.rndval, p.id, rank() over (partition by a.name order by p.rndval) as rnk
from points p
join areas a on ST_Within(p.geom, a.geom)) as g
where rnk = 1;

Sorting rows by children?

I have this table:
CREATE TABLE items (
id SERIAL PRIMARY KEY,
data TEXT,
parent INT,
posted INT
);
Each item has a piece of data, a timestamp, and a parent. I'd like to select the top 10 root items (parent = 0), sorted by the timestamp of the most recent child.
If item #1 has a child #2 that has a child #3, #3 is considered a child of #1.
How can I do this?
EDIT:
The query has been rewritten to
first sort the child items
get the root parent id and the rank for each item
select the top 10 parents
select the details for the top 10 parents
Common Table expressions have been used to incrementally select the data following the above steps.
WITH recursive c AS
(
SELECT *
FROM seeds
UNION ALL
SELECT
T.id,
T.parent,
c.topParentID,
(c.child_level + 1),
c.child_rank
FROM items AS T
INNER JOIN c ON T.parent = c.id
WHERE T.id <> T.parent
)
, seeds AS
(
SELECT
id,
parent,
parent AS topParentID,
0 AS child_level,
rank() OVER (ORDER BY posted DESC) child_rank
FROM items
WHERE parent <> 0
ORDER BY posted DESC
)
, rank_level AS
(
SELECT DISTINCT
c2.id id,
c_ranks.min_child_rank child_rank,
c_roots.max_child_level root_level
FROM
(
SELECT
id,
MAX(child_level) max_child_level
FROM c
GROUP BY id
)
c_roots
INNER JOIN c c2 ON c_roots.id = c2.id
INNER JOIN
(
SELECT
id,
MIN(child_rank) min_child_rank
FROM c
GROUP BY id
)
c_ranks
ON c2.id = c_ranks.id
)
, top_10_parents AS
(
SELECT
c.topParentID id,
MIN(rl.child_rank) id_rank
FROM rank_level rl
INNER JOIN c ON rl.id = c.id AND c.child_level = rl.root_level
GROUP BY c.topParentID
ORDER BY MIN(rl.child_rank)
limit 10
)
SELECT
i.*
FROM
items i
INNER JOIN top_10_parents tp ON tp.id = i.id
ORDER BY tp.id_rank;
SQL Fiddle
Reference:
WITH Queries (Common Table Expressions) on PostgreSQL Manual

How to design a SQL recursive query?

How would I redesign the below query so that it will recursively loop through entire tree to return all descendants from root to leaves? (I'm using SSMS 2008). We have a President at the root. under him are the VPs, then upper management, etc., on down the line. I need to return the names and titles of each. But this query shouldn't be hard-coded; I need to be able to run this for any selected employee, not just the president. This query below is the hard-coded approach.
select P.staff_name [Level1],
P.job_title [Level1 Title],
Q.license_number [License 1],
E.staff_name [Level2],
E.job_title [Level2 Title],
G.staff_name [Level3],
G.job_title [Level3 Title]
from staff_view A
left join staff_site_link_expanded_view P on P.people_id = A.people_id
left join staff_site_link_expanded_view E on E.people_id = C.people_id
left join staff_site_link_expanded_view G on G.people_id = F.people_id
left join facility_view Q on Q.group_profile_id = P.group_profile_id
Thank you, this was most closely matching what I needed. Here is my CTE query below:
with Employee_Hierarchy (staff_name, job_title, id_number, billing_staff_credentials_code, site_name, group_profile_id, license_number, region_description, people_id)
as
(
select C.staff_name, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
from staff_view A
left join staff_site_link_expanded_view C on C.people_id = A.people_id
left join facility_view Q on Q.group_profile_id = C.group_profile_id
left join regions R on R.regions_id = Q.regions_id
where A.last_name = 'kromer'
)
select C.staff_name, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
from staff_view A
left join staff_site_link_expanded_view C on C.people_id = A.people_id
left join facility_view Q on Q.group_profile_id = C.group_profile_id
left join regions R on R.regions_id = Q.regions_id
WHERE C.STAFF_NAME IS NOT NULL
GROUP BY C.STAFF_NAME, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
ORDER BY C.STAFF_NAME
But I am wondering what is the purpose of the "Employee_Hierarchy"? When I replaced "staff_view" in the outer query with "Employee_Hierarchy", it only returned one record = "Kromer". So when/where can we use "Employee_Hierarchy"?
See:
SQL Server - Simple example of a recursive CTE
MSDN: Recursive Queries using Common Table Expression
SQL Server recursive CTE (this seems pretty much like exactly what you are working on!)
Update:
A proper recursive CTE consist of basically three things:
an anchor SELECT to begin with; that can select e.g. the root level employees (where the Reports_To is NULL), or it can select any arbitrary employee that you define, e.g. by a parameter
a UNION ALL
a recursive SELECT statement that selects from the same, typically self-referencing table and joins with the recursive CTE being currently built up
This gives you the ability to recursively build up a result set that you can then select from.
If you look at the Northwind sample database, it has a table called Employees which is self-referencing: Employees.ReportsTo --> Employees.EmployeeID defines who reports to whom.
Your CTE would look something like this:
;WITH RecursiveCTE AS
(
-- anchor query; get the CEO
SELECT EmployeeID, FirstName, LastName, Title, 1 AS 'Level', ReportsTo
FROM dbo.Employees
WHERE ReportsTo IS NULL
UNION ALL
-- recursive part; select next Employees that have ReportsTo -> cte.EmployeeID
SELECT
e.EmployeeID, e.FirstName, e.LastName, e.Title,
cte.Level + 1 AS 'Level', e.ReportsTo
FROM
dbo.Employees e
INNER JOIN
RecursiveCTE cte ON e.ReportsTo = cte.EmployeeID
)
SELECT *
FROM RecursiveCTE
ORDER BY Level, LastName
I don't know if you can translate your sample to a proper recursive CTE - but that's basically the gist of it: anchor query, UNION ALL, recursive query