SQL | Parent Child relationship in same table - postgresql

I have a table with a parent/child key relationship within the same table. I need to find the parent rows that don't have any children (for example, row 1 has no children) and, for parents that do have children, only the most recent child (for example, row 17 has 3 children, i.e. 10, 13 and 14, and we need to fetch only the most recent one, which is 10).

You need one part that selects the main parents:
SELECT parent.*
FROM MyTable AS parent
WHERE parent_id = 0
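Next you need to find the most recent direct child (window functions are not allowed in WHERE, so the rank is computed in a subquery):
SELECT *
FROM (
SELECT parent.id AS main_id, child.*,
RANK() OVER (PARTITION BY child.parent_id ORDER BY child.id DESC) AS rnk
FROM MyTable AS parent
LEFT JOIN MyTable AS child
ON child.parent_id = parent.id
WHERE parent.parent_id = 0
) AS ranked
WHERE rnk = 1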
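And add the no-child ones (again ranking in a subquery, since a window function cannot appear in WHERE):
SELECT *
FROM (
SELECT parent.id AS main_id, child.*,
RANK() OVER (PARTITION BY child.parent_id ORDER BY child.id DESC) AS rnk
FROM MyTable AS parent
LEFT JOIN MyTable AS child
ON child.parent_id = parent.id
WHERE parent.parent_id = 0
) AS ranked
WHERE rnk = 1
OR id IS NULL -- id comes from child.*; NULL means the parent has no children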

Something like this should work, but may not be the most efficient way of doing it:
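select {desired fields}
from your_table
where parent_id = 0 -- assumes root rows are marked with parent_id = 0
and not exists
(
select 1 from your_table B
where B.parent_id = your_table.id
)
UNION
select {desired fields}
from your_table
where parent_id <> 0
and id =
(
select max(B.id) -- highest id among the siblings, taken as the most recent child
from your_table B
where B.parent_id = your_table.parent_id
)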

Hmmm . . . You seem to want one row per parent. Either the most recent child or the parent if there are no children. You can actually do this with window functions:
select t.*
from (select t.*,
row_number() over (partition by coalesce(nullif(parent_id, 0), id)
order by parent_id desc, id desc
) as seqnum
from t
) t
where seqnum = 1;
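
To see what the window-function approach returns, here is a minimal sketch that runs the same idea against an in-memory SQLite database through Python's sqlite3 module, as a stand-in for PostgreSQL (it needs SQLite 3.25 or newer for window functions). The sample rows are hypothetical, and "most recent" is taken to mean the highest id, as in the answer above.

```python
import sqlite3


def one_row_per_parent(rows):
    """Return the id kept per parent group: the latest child, or the parent if childless."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL)")
    conn.executemany("INSERT INTO t (id, parent_id) VALUES (?, ?)", rows)
    query = """
        SELECT id
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (
                       -- roots (parent_id = 0) group under their own id,
                       -- children group under their parent's id
                       PARTITION BY COALESCE(NULLIF(parent_id, 0), id)
                       -- children sort before the root, newest id first
                       ORDER BY parent_id DESC, id DESC
                   ) AS seqnum
            FROM t
        ) AS numbered
        WHERE seqnum = 1
        ORDER BY id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


# Hypothetical data: 1 is a childless parent; 17 has children 10, 13 and 14.
sample = [(1, 0), (17, 0), (10, 17), (13, 17), (14, 17)]
kept_ids = one_row_per_parent(sample)
print(kept_ids)
```

With this data the query keeps row 1 (no children) and row 14 (the child of 17 with the highest id). If "most recent" is defined by a date column instead, that column would replace id in the ORDER BY.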

Related

Best way to repeat list of values for `IN` clauses

I need to use the same list of values in several IN clauses and I tried doing that with a WITH statement, but can't get it to work correctly.
Here's an example query:
SELECT * FROM parent WHERE
id IN (SELECT first_id FROM child WHERE id=119896 UNION ALL
SELECT second_id FROM child WHERE id=119896 UNION ALL
SELECT third_id FROM child WHERE id=119896) OR
id IN (SELECT was_first_id FROM parent WHERE id IN (SELECT first_id FROM child WHERE id=119896 UNION ALL
SELECT second_id FROM child WHERE id=119896 UNION ALL
SELECT third_id FROM child WHERE id=119896)) OR
id IN (SELECT was_second_id FROM parent WHERE id IN (SELECT first_id FROM child WHERE id=119896 UNION ALL
SELECT second_id FROM child WHERE id=119896 UNION ALL
SELECT third_id FROM child WHERE id=119896)) OR
id IN (SELECT was_third_id FROM parent WHERE id IN (SELECT first_id FROM child WHERE id=119896 UNION ALL
SELECT second_id FROM child WHERE id=119896 UNION ALL
SELECT third_id FROM child WHERE id=119896));
I was hoping to make it so that the 3 queries that are combined in the UNION ALL could be defined in a WITH and then re-used to simplify the query, and it would be nice if it improved performance as well.
Is there a good way to do this?
I would suggest
SELECT *
FROM parent p
WHERE (
SELECT p.id IN (c.first_id, c.second_id, c.third_id)
FROM child c
WHERE c.id = 119896
) OR EXISTS (
SELECT 1
FROM parent q
WHERE p.id IN (q.was_first_id, q.was_second_id, q.was_third_id)
AND (
SELECT q.id IN (c.first_id, c.second_id, c.third_id)
FROM child c
WHERE c.id = 119896
)
);
(I don't know about performance - as per comments, it's really bad)
An alternative, using the CTEs you suggested, would be
WITH child_ids AS (
SELECT UNNEST(ARRAY[first_id, second_id, third_id]) AS id
FROM child
WHERE id = 119896
), all_ids AS (
SELECT UNNEST(ARRAY[was_first_id, was_second_id, was_third_id]) AS id
FROM parent
JOIN child_ids USING (id) -- same as: WHERE id IN (SELECT id FROM child_ids)
UNION
TABLE child_ids
)
SELECT *
FROM parent
JOIN all_ids USING (id) -- same as: WHERE id IN (SELECT id FROM all_ids)
Based on the second solution by @Bergi, this works as well:
WITH in_child(id) AS (
SELECT first_id FROM child WHERE id=119896 UNION ALL
SELECT second_id FROM child WHERE id=119896 UNION ALL
SELECT third_id FROM child WHERE id=119896
),
in_parent(id) AS (
(SELECT id FROM in_child) UNION ALL
(SELECT was_first_id FROM parent WHERE id IN (SELECT id FROM in_child)) UNION ALL
(SELECT was_second_id FROM parent WHERE id IN (SELECT id FROM in_child)) UNION ALL
(SELECT was_third_id FROM parent WHERE id IN (SELECT id FROM in_child))
)
SELECT * FROM parent WHERE
id IN (SELECT id FROM in_parent);
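
The CTE approach can be tried out quickly with a small sketch like the following, which uses an in-memory SQLite database via Python's sqlite3 module rather than PostgreSQL. The table contents are hypothetical, and the parentheses around the UNION ALL branches are dropped because SQLite does not accept them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY,
                         was_first_id INT, was_second_id INT, was_third_id INT);
    CREATE TABLE child (id INTEGER PRIMARY KEY,
                        first_id INT, second_id INT, third_id INT);
    -- Hypothetical rows: child 119896 points at parents 1, 2 and 3;
    -- parent 1 in turn points back at parents 10 and 11.
    INSERT INTO parent VALUES (1, 10, 11, NULL), (2, NULL, NULL, NULL),
                              (3, NULL, NULL, NULL), (10, NULL, NULL, NULL),
                              (11, NULL, NULL, NULL), (12, NULL, NULL, NULL);
    INSERT INTO child VALUES (119896, 1, 2, 3);
""")


def parents_for_child(connection, child_id):
    """Define the id list once in a CTE and reuse it in every IN clause."""
    query = """
        WITH in_child(id) AS (
            SELECT first_id  FROM child WHERE id = :cid UNION ALL
            SELECT second_id FROM child WHERE id = :cid UNION ALL
            SELECT third_id  FROM child WHERE id = :cid
        ),
        in_parent(id) AS (
            SELECT id FROM in_child UNION ALL
            SELECT was_first_id  FROM parent WHERE id IN (SELECT id FROM in_child) UNION ALL
            SELECT was_second_id FROM parent WHERE id IN (SELECT id FROM in_child) UNION ALL
            SELECT was_third_id  FROM parent WHERE id IN (SELECT id FROM in_child)
        )
        SELECT id FROM parent
        WHERE id IN (SELECT id FROM in_parent)
        ORDER BY id
    """
    return [row[0] for row in connection.execute(query, {"cid": child_id})]


matched = parents_for_child(conn, 119896)
print(matched)
```

Parent 12 is not referenced by anything, so it is left out; the NULLs collected into in_parent do not match any id.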

postgres hierarchy - count of child levels and sort by date of children or grandchildren

I would like to know how to write a postgres subquery so that the following table example will output what I need.
id parent_id postdate
1   -1 2015-03-10
2     1 2015-03-11 (child level 1)
3     1 2015-03-12 (child level 1)
4     3 2015-03-13 (child level 2)
5    -1 2015-03-14
6    -1 2015-03-15
7     6 2015-03-16 (child level 1)
If I want to sort all the root ids by the postdate of their child level 1 rows, with a count of the children under each root, the output would be something like this
id count  date
6   2    2015-03-15
1   4    2015-03-10
5   1    2015-03-14
The output is sorted by the postdate of each root's children. The 'date' shown is the root's own postdate. Even though id#5 has a more recent postdate than id#1, root id#6 comes first because its child (id#7) has the most recent postdate. id#5 doesn't have any children, so it is placed at the end, sorted by date. The 'count' is the number of children (child level 1), grandchildren (child level 2) and the root itself. For instance, id #2, #3 and #4 all belong to id#1, so for id#1 the count would be 4.
My current subquery thus far:
SELECT p1.id,count(p1.id),p1.postdate
FROM mytable p1
LEFT JOIN mytable c1 ON c1.parent_id = p1.id AND p1.parent_id = -1
LEFT JOIN mytable c2 ON c2.parent_id = c1.id AND p1.parent_id = -1
GROUP BY p1.id,c1.postdate,p1.postdate
ORDER by c1.postdate DESC,p1.postdate DESC
create table mytable ( id serial primary key, parent_id int references mytable, postdate date );
create index mytable_parent_id_idx on mytable (parent_id);
insert into mytable (id, parent_id, postdate) values (1, null, '2015-03-10');
insert into mytable (id, parent_id, postdate) values (2, 1, '2015-03-11');
insert into mytable (id, parent_id, postdate) values (3, 1, '2015-03-12');
insert into mytable (id, parent_id, postdate) values (4, 3, '2015-03-13');
insert into mytable (id, parent_id, postdate) values (5, null, '2015-03-14');
insert into mytable (id, parent_id, postdate) values (6, null, '2015-03-15');
insert into mytable (id, parent_id, postdate) values (7, 6, '2015-03-16');
with recursive recu as (
select id as parent, id as root, null::date as child_postdate
from mytable
where parent_id is null
union all
select r.parent, mytable.id, mytable.postdate
from recu r
join mytable
on parent_id = r.root
)
select m.id, c.cnt, m.postdate, c.max_child_date
from mytable m
join ( select parent, count(*) as cnt, max(child_postdate) as max_child_date
from recu
group by parent
) c on c.parent = m.id
order by c.max_child_date desc nulls last, m.postdate desc;
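
As a rough check of the recursive query above, the sketch below loads the same seven rows into an in-memory SQLite database through Python's sqlite3 module. It is an adaptation, not the PostgreSQL original: the `::date` cast is dropped, dates are stored as ISO text, and `nulls last` is emulated with an `IS NULL` sort key so that it also runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES mytable,
        postdate TEXT
    );
    INSERT INTO mytable (id, parent_id, postdate) VALUES
        (1, NULL, '2015-03-10'), (2, 1, '2015-03-11'), (3, 1, '2015-03-12'),
        (4, 3, '2015-03-13'), (5, NULL, '2015-03-14'), (6, NULL, '2015-03-15'),
        (7, 6, '2015-03-16');
""")

query = """
    WITH RECURSIVE recu(parent, root, child_postdate) AS (
        -- anchor: every root starts its own subtree, with no child date yet
        SELECT id, id, NULL FROM mytable WHERE parent_id IS NULL
        UNION ALL
        -- step: attach each row to the root of the subtree it belongs to
        SELECT r.parent, m.id, m.postdate
        FROM recu AS r
        JOIN mytable AS m ON m.parent_id = r.root
    )
    SELECT m.id, c.cnt, m.postdate
    FROM mytable AS m
    JOIN (SELECT parent, COUNT(*) AS cnt, MAX(child_postdate) AS max_child_date
          FROM recu
          GROUP BY parent) AS c ON c.parent = m.id
    ORDER BY c.max_child_date IS NULL, c.max_child_date DESC, m.postdate DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The rows come back in the order 6, 1, 5 with counts 2, 4 and 1, which matches the output requested in the question.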
You'll need a recursive query to count the elements in the subtrees:
WITH RECURSIVE opa AS (
SELECT id AS par
, id AS moi
FROM the_tree
WHERE parent_id IS NULL
UNION ALL
SELECT o.par AS par
, t.id AS moi
FROM opa o
JOIN the_tree t ON t.parent_id = o.moi
)
SELECT t.id
, c.cnt
, t.postdate
FROM the_tree t
JOIN ( SELECT par, COUNT(*) AS cnt
FROM opa o
GROUP BY par
) c ON c.par = t.id
ORDER BY t.id
;
UPDATE (it appears the OP also wants the maxdate per tree)
-- The same, but also select the postdate
-- --------------------------------------
WITH RECURSIVE opa AS (
SELECT id AS par
, id AS moi
, postdate AS postdate
FROM the_tree
WHERE parent_id IS NULL
UNION ALL
SELECT o.par AS par
, t.id AS moi
-- , GREATEST(o.postdate,t.postdate) AS postdate
, t.postdate AS postdate
FROM opa o
JOIN the_tree t ON t.parent_id = o.moi
)
SELECT t.id
, c.cnt
, t.postdate
, c.maxdate
FROM the_tree t
JOIN ( SELECT par, COUNT(*) AS cnt
, MAX(o.postdate) AS maxdate -- and obtain the max()
FROM opa o
GROUP BY par
) c ON c.par = t.id
ORDER BY c.maxdate, t.id
;
After looking at everyone's code, I created the subquery I needed. I can use PHP to vary the 'case when' code depending on the user's sort selection. For instance, the code below will sort the root nodes based on child level 1's postdate.
with recursive cte as (
select id as parent, id as root, null::timestamp as child_postdate,0 as depth
from mytable
where parent_id = -1
union all
select r.parent, mytable.id, mytable.postdate,depth+1
from cte r
join mytable
on parent_id = r.root
)
select m.id, c.cnt, m.postdate
from mytable m
join ( select parent, count(*) as cnt, max(child_postdate) as max_child_date,depth
from cte
group by parent,depth
) c on c.parent = m.id
order by
case
when depth=2 then 1
when depth=1 then 2
else 0
end DESC,
c.max_child_date desc nulls last, m.postdate desc;
select
p.id,
(1+c.n) as parent_post_plus_number_of_subposts,
p.postdate
from
mytable as p
inner join
(
select
parent_id, count(*) as n, max(postdate) as _postdate
from mytable
group by parent_id
) as c
on p.id = c.parent_id
where p.parent_id = -1
order by c._postdate desc

How to use a Table type in query

I have 9000 rows in the News table and use this code to select 20 of them:
Select *
From (
Select *, ROW_NUMBER() OVER (ORDER BY DateSend DESC) AS Num
From News
Where SubjectID in(Select MenuSubject.SubjectID
From MenuSubject inner join Menu on MenuSubject.MenuID = Menu.MenuID)
) as myTable
where myTable.Num BETWEEN 100 and 120
But it takes 28 seconds to run! I also tested this query without the joined table and got the result in 1 second.
So I want to use a table type to hold the joined table's values and use that in the query. I declared a variable of a new table type with the following code:
DECLARE @MyTable2 IntListTable
Insert Into @MyTable2
Select MenuSubject.SubjectID
From MenuSubject inner join Menu on MenuSubject.MenuID = Menu.MenuID
Select *
From (
Select *, ROW_NUMBER() OVER (ORDER BY DateSend DESC) AS Num
From News
Where SubjectID in @MyTable2
) as myTable
where myTable.Num BETWEEN 100 and 120
But I get an error at
SubjectID in @MyTable2
Error:
Incorrect syntax near '@MyTable2'.
Edit:
When I test my code with:
Select myTable.Title
or use this code instead of the joined table:
Where SubjectID in(13,14,20,21,25,24,26,24,28,29,30,54,55,60,47,98,99,65,14,20,33,666,987,254)
I get the result in 1 second.
But when I use this in the query:
Select myTable.MoreText
it takes 28 seconds. Why?
Try this,
Select x.Num
From (
Select *, ROW_NUMBER() OVER (ORDER BY DateSend DESC) AS Num
From News
Where SubjectID in(Select MenuSubject.SubjectID
From MenuSubject inner join Menu on MenuSubject.MenuID = Menu.MenuID)
) x
where x.Num <21
WITH myTempTable as (Select MenuSubject.SubjectID
From MenuSubject inner join Menu on MenuSubject.MenuID = Menu.MenuID)
Select *
From (
Select *, ROW_NUMBER() OVER (ORDER BY DateSend DESC) AS Num
From News
Where SubjectID in (SELECT SubjectID FROM myTempTable)
) as myTable
where myTable.Num BETWEEN 100 and 120
You can try the above query.
There is absolutely no need for a User-Defined Table Type in this query. It adds work but no actual benefit.
The problem is most likely the fact that you are using an IN list as those translate out to be an OR condition for each of the values. But an IN list isn't needed either.
This query can actually be simplified by rethinking it in terms of an INNER JOIN, which should be better as it will allow the Query Optimizer to do its job.
SELECT *
FROM (
SELECT nw.*, ROW_NUMBER() OVER (ORDER BY DateSend DESC) AS [Num]
FROM News nw
INNER JOIN (
MenuSubject
INNER JOIN Menu
ON MenuSubject.MenuID = Menu.MenuID
) ON MenuSubject.SubjectID = nw.SubjectID
) AS myTable
WHERE myTable.Num BETWEEN 100 AND 120;
One final simplification that can be made, though I doubt it is needed here since 9000 rows is almost no data at all, is to first dump the results to a local temporary table and then use that in the INNER JOIN:
CREATE TABLE #Subjects
(
SubjectID INT NOT NULL -- PRIMARY KEY -- test with and without PK to see if it helps
);
INSERT INTO #Subjects (SubjectID)
SELECT MenuSubject.SubjectID
FROM MenuSubject
INNER JOIN Menu
ON Menu.MenuID = MenuSubject.MenuID;
SELECT *
FROM (
SELECT nw.*, ROW_NUMBER() OVER (ORDER BY DateSend DESC) AS [Num]
FROM News nw
INNER JOIN #Subjects sub
ON sub.SubjectID = nw.SubjectID
) AS myTable
WHERE myTable.Num BETWEEN 100 AND 120;
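
One thing worth checking before swapping the IN subquery for an INNER JOIN: the two only return the same rows if each SubjectID appears at most once in the joined set. The sketch below illustrates this with an in-memory SQLite database via Python's sqlite3 module instead of SQL Server; the rows and the NewsID column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Menu (MenuID INTEGER PRIMARY KEY);
    CREATE TABLE MenuSubject (MenuID INTEGER, SubjectID INTEGER);
    CREATE TABLE News (NewsID INTEGER PRIMARY KEY, SubjectID INTEGER, DateSend TEXT);
    INSERT INTO Menu VALUES (1), (2);
    -- Subject 13 is attached to two menus, subject 14 to one.
    INSERT INTO MenuSubject VALUES (1, 13), (2, 13), (1, 14);
    INSERT INTO News VALUES (1, 13, '2020-01-01'),
                            (2, 14, '2020-01-02'),
                            (3, 99, '2020-01-03');
""")

# IN: each News row is returned at most once.
in_count = conn.execute("""
    SELECT COUNT(*) FROM News
    WHERE SubjectID IN (SELECT ms.SubjectID
                        FROM MenuSubject ms
                        JOIN Menu m ON m.MenuID = ms.MenuID)
""").fetchone()[0]

# Plain JOIN: a News row is repeated once per matching MenuSubject row.
join_count = conn.execute("""
    SELECT COUNT(*) FROM News nw
    JOIN MenuSubject ms ON ms.SubjectID = nw.SubjectID
    JOIN Menu m ON m.MenuID = ms.MenuID
""").fetchone()[0]

# JOIN against a de-duplicated subject list: same rows as IN.
distinct_join_count = conn.execute("""
    SELECT COUNT(*) FROM News nw
    JOIN (SELECT DISTINCT ms.SubjectID
          FROM MenuSubject ms
          JOIN Menu m ON m.MenuID = ms.MenuID) s ON s.SubjectID = nw.SubjectID
""").fetchone()[0]

print(in_count, join_count, distinct_join_count)
```

If a subject can belong to several menus, adding DISTINCT when filling the subject list (or the temporary table) keeps the row numbering consistent with the original IN query.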

Using two different where clauses

I would like to know how to use a different WHERE clause based on a CASE or IF. I'd prefer a CASE, as the rest of the statement is complex, and I don't like the idea of that complexity being in two places with only a minor difference. However, I know cases are only used for values. I've replicated a simple version of my issue below.
Essentially, I have three tables. The first contains the master information (MasterTable). The second contains a one-to-many relationship belonging to the master table (Table1). The third is a list of selectors indicating which of the records in Table1 are to be used in this instance. I want the most recent record of Table2 to drive what is selected from Table1, with precedence given to SubID over OrderNum.
MasterTable | MasterID, OtherInfo
Table1 | T1UniqueId, MasterID, SubID, Text, OrderNum
Table2 | T2UniqueId, MasterID, SubID, OrderNum, Date
SELECT MasterID, OtherInfo, SubID
FROM MasterTable
OUTER APPLY(
SELECT TOP 1 SubID FROM Table1
WHERE Table1.MasterID=MasterTable.MasterID
CASE
WHEN
(
SELECT TOP 1 SubID FROM Table2
WHERE Table2.MasterID=MasterTable.MasterID
ORDER BY Date DESC
) Is NULL
THEN Table1.OrderNum=
(
SELECT TOP 1 OrderNum
FROM Table2
WHERE Table2.MasterId=MasterTable.MasterId
ORDER BY Date DESC
)
ELSE Table1.SubId=
(
SELECT TOP 1 SubId
FROM Table2
WHERE Table2.MasterId=MasterTable.MasterId
ORDER BY Date DESC
)
END
) SubData
One quick rewrite of this would result in the following:
IF ((SELECT TOP 1 SubID FROM Table2 WHERE Table2.MasterID=MasterTable.MasterID ORDER BY Date DESC) IS NULL)
BEGIN
SELECT
MasterID, OtherInfo, SubID
FROM MasterTable
OUTER APPLY(
SELECT TOP 1 SubID FROM Table1
WHERE
Table1.MasterID=MasterTable.MasterID
AND Table1.OrderNum =
(
SELECT TOP 1 OrderNum
FROM Table2
WHERE Table2.MasterId=MasterTable.MasterId
ORDER BY Date DESC
)
) SubData
END
ELSE
BEGIN
SELECT
MasterID, OtherInfo, SubID
FROM MasterTable
OUTER APPLY(
SELECT TOP 1 SubID FROM Table1
WHERE
Table1.MasterID=MasterTable.MasterID
AND Table1.SubId=
(
SELECT TOP 1 SubId
FROM Table2
WHERE Table2.MasterId=MasterTable.MasterId
ORDER BY Date DESC
)
) SubData
END
But as you noted that makes it look ugly, because you now have that complexity in two places...
I guess you could also formulate it this way (untested, but this should keep your complex logic in one place):
SELECT
MasterID, OtherInfo, SubID
FROM MasterTable
OUTER APPLY(
SELECT TOP 1 SubID FROM Table1
WHERE Table1.MasterID=MasterTable.MasterID
AND
(
(
(
SELECT
TOP 1 SubID
FROM Table2
WHERE Table2.MasterID=MasterTable.MasterID
ORDER BY Date DESC
) IS NULL
AND
Table1.OrderNum =
(
SELECT TOP 1 OrderNum
FROM Table2
WHERE Table2.MasterId=MasterTable.MasterId
ORDER BY Date DESC
)
)
OR
(
Table1.SubId =
(
SELECT
TOP 1 SubId
FROM Table2
WHERE Table2.MasterId=MasterTable.MasterId
ORDER BY Date DESC
)
)
)
) SubData
If SubID and OrderNum in Table1 and Table2 are the same, you can use a simple query with a nested select statement:
select m.MasterID, m.OtherInfo, (
select top 1 coalesce(t2.SubID, t2.OrderNum) from Table2 t2
where t2.MasterID = m.MasterID order by date desc
) as SubID
from MasterTable m;
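
Another way to keep the logic in one place is to compute the latest Table2 row per master once and then join Table1 on whichever key that row provides. The following is only a sketch of that idea, run against an in-memory SQLite database via Python's sqlite3 module (SQLite 3.25 or newer) because SQLite has no TOP or OUTER APPLY. The sample rows are hypothetical, and it assumes at most one Table1 row matches per master.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MasterTable (MasterID INTEGER PRIMARY KEY, OtherInfo TEXT);
    CREATE TABLE Table1 (T1UniqueId INTEGER PRIMARY KEY, MasterID INT,
                         SubID INT, "Text" TEXT, OrderNum INT);
    CREATE TABLE Table2 (T2UniqueId INTEGER PRIMARY KEY, MasterID INT,
                         SubID INT, OrderNum INT, "Date" TEXT);
    INSERT INTO MasterTable VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO Table1 VALUES (1, 1, 100, 'one-sub100', 1),
                              (2, 1, 200, 'one-sub200', 2),
                              (3, 2, 300, 'two-sub300', 1),
                              (4, 2, 400, 'two-sub400', 2);
    -- Master 1: latest selector has a SubID; master 2: latest has only OrderNum.
    INSERT INTO Table2 VALUES (1, 1, 100, 2, '2020-01-01'),
                              (2, 1, 200, 1, '2020-02-01'),
                              (3, 2, 300, 1, '2020-01-01'),
                              (4, 2, NULL, 2, '2020-03-01');
""")

query = """
    WITH latest AS (
        -- most recent Table2 row per master
        SELECT MasterID, SubID, OrderNum
        FROM (
            SELECT MasterID, SubID, OrderNum,
                   ROW_NUMBER() OVER (PARTITION BY MasterID
                                      ORDER BY "Date" DESC) AS rn
            FROM Table2
        ) AS ranked
        WHERE rn = 1
    )
    SELECT m.MasterID, m.OtherInfo, t1.SubID, t1."Text"
    FROM MasterTable AS m
    LEFT JOIN latest AS l ON l.MasterID = m.MasterID
    LEFT JOIN Table1 AS t1
           ON t1.MasterID = m.MasterID
          AND (   (l.SubID IS NOT NULL AND t1.SubID = l.SubID)
               OR (l.SubID IS NULL AND t1.OrderNum = l.OrderNum))
    ORDER BY m.MasterID
"""
selected = conn.execute(query).fetchall()
for row in selected:
    print(row)
```

Master 1 is matched by SubID, master 2 falls back to OrderNum because its latest selector has no SubID, and master 3 has no selector at all and comes back with NULLs.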

Returning the parent/ child relationship on a self-joining table

I need to be able to return a list of all children given a parent Id at all levels using SQL.
The table looks something like this:
ID ParentId Name
---------------------------------------
1 null Root
2 1 Child of Root
3 2 Child of Child of Root
Given an Id of '1', how would I return the entire list...? There is no limitation on the depth of the nesting either...
Thanks,
Kieron
To get all children for a given @ParentId stored in that manner you could use a recursive CTE.
declare @ParentId int
--set @ParentId = 1
;WITH T AS
(
select 1 AS ID,null AS ParentId, 'Root' as [Name] union all
select 2,1,'Child of Root' union all
select 3,2,'Child of Child of Root'
),
cte AS
(
SELECT ID, ParentId, Name
FROM T
WHERE ParentId = @ParentId OR (ParentId IS NULL AND @ParentId IS NULL)
UNION ALL
SELECT T.ID, T.ParentId, T.Name
FROM T
JOIN cte c ON c.ID = T.ParentId
)
SELECT ID, ParentId, Name
FROM cte
OPTION (MAXRECURSION 0)
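
The same recursive CTE pattern can be tried outside SQL Server. Below is a small sketch using an in-memory SQLite database through Python's sqlite3 module, with the three sample rows from the question; the function name is hypothetical, and SQLite has no MAXRECURSION option, so none is set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (ID INTEGER PRIMARY KEY, ParentId INTEGER, Name TEXT);
    INSERT INTO T VALUES (1, NULL, 'Root'),
                         (2, 1, 'Child of Root'),
                         (3, 2, 'Child of Child of Root');
""")


def descendants(connection, parent_id):
    """Return every row below parent_id, at any depth."""
    query = """
        WITH RECURSIVE cte(ID, ParentId, Name) AS (
            -- anchor: direct children of the requested parent
            SELECT ID, ParentId, Name FROM T WHERE ParentId = :parent_id
            UNION ALL
            -- step: children of rows already found
            SELECT T.ID, T.ParentId, T.Name
            FROM T
            JOIN cte AS c ON c.ID = T.ParentId
        )
        SELECT ID, ParentId, Name FROM cte ORDER BY ID
    """
    return connection.execute(query, {"parent_id": parent_id}).fetchall()


below_root = descendants(conn, 1)
print(below_root)
```

Passing 1 returns rows 2 and 3; passing 3 returns an empty list because that row has no children.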