Displaying related child records on a single row - tsql

In the DBs I'm familiar with, a query listing parents and children would look like this:
Parfirst ParLast Childfirst
Mary Smith Sally
Mary Smith Jim
Mary Smith Kim
However, I've been asked to create a report that looks like this:
Parfirst ParLast Child1 Child2 Child3 Child4
Mary Smith Sally Jim Kim
I'm at a loss as to how to accomplish this. Any suggestions are welcome.

You can do this with the STRING_AGG function (if you are using SQL Server 2017 or newer) or with some ugly XML magic using FOR XML. The FOR XML version comes first, followed by the STRING_AGG version:
SELECT t.ParFirst, t.ParLast, LEFT(t.Children, Len(t.Children)-1) As Children
FROM
(
SELECT DISTINCT p.ParFirst, p.ParLast,
(
SELECT c.ChildFirst + ',' AS [text()]
FROM dbo.Children c
WHERE c.ParentId = p.ParentId
ORDER BY c.ChildFirst
FOR XML PATH ('')
) AS Children
FROM dbo.Parents p
) t
SELECT p.ParFirst, p.ParLast, STRING_AGG(c.ChildFirst, ',') as Children
FROM dbo.Parents p
LEFT JOIN dbo.Children c on c.ParentId = p.ParentId
GROUP BY p.ParFirst, p.ParLast
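To try the aggregation idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite's group_concat as a stand-in for STRING_AGG, and the table contents (including the childless parent John Doe) are made up for illustration.

```python
import sqlite3

def children_per_parent():
    """Return one row per parent with the child names joined into one string."""
    con = sqlite3.connect(":memory:")
    # Hypothetical sample data; only the Mary Smith rows come from the question.
    con.executescript("""
        CREATE TABLE Parents  (ParentId INTEGER PRIMARY KEY, ParFirst TEXT, ParLast TEXT);
        CREATE TABLE Children (ChildId INTEGER PRIMARY KEY, ParentId INTEGER, ChildFirst TEXT);
        INSERT INTO Parents  VALUES (1, 'Mary', 'Smith'), (2, 'John', 'Doe');
        INSERT INTO Children VALUES (1, 1, 'Sally'), (2, 1, 'Jim'), (3, 1, 'Kim');
    """)
    # group_concat plays the role of STRING_AGG here. The LEFT JOIN keeps
    # parents without children; their Children value is NULL.
    rows = con.execute("""
        SELECT p.ParFirst, p.ParLast, group_concat(c.ChildFirst, ',') AS Children
        FROM Parents p
        LEFT JOIN Children c ON c.ParentId = p.ParentId
        GROUP BY p.ParentId, p.ParFirst, p.ParLast
        ORDER BY p.ParentId
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in children_per_parent():
        print(row)
```

The sketch groups by ParentId as well as the names, so two different parents who happen to share a name are not merged. The order of names inside the joined string is not guaranteed by this query.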
If you want each child name in its own column, you must define those columns in the query, and any children beyond the number of columns you defined will not be returned. Use ROW_NUMBER like this:
select p.ParFirst, p.ParLast, c1.ChildFirst as Child1, c2.ChildFirst as Child2, c3.ChildFirst as Child3
from dbo.Parents p
outer apply ( select ChildFirst from (select ChildFirst, row_number() over (order by ChildFirst) as rowNo from dbo.Children c where c.ParentId = p.ParentId) t1 where t1.rowNo = 1 ) c1
outer apply ( select ChildFirst from (select ChildFirst, row_number() over(order by ChildFirst) as rowNo from dbo.Children c where c.ParentId = p.ParentId) t2 where t2.rowNo = 2 ) c2
outer apply ( select ChildFirst from (select ChildFirst, row_number() over(order by ChildFirst) as rowNo from dbo.Children c where c.ParentId = p.ParentId) t3 where t3.rowNo = 3 ) c3
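The same ROW_NUMBER idea can also be written with conditional aggregation instead of one OUTER APPLY per column. The sketch below is that variant, not the query above. It runs on an in-memory SQLite database (3.25 or newer for window functions), and the sample rows are assumptions.

```python
import sqlite3

def children_in_columns():
    """Pivot up to three children per parent into Child1..Child3 columns."""
    con = sqlite3.connect(":memory:")
    # Hypothetical sample data; John Doe has no children on purpose.
    con.executescript("""
        CREATE TABLE Parents  (ParentId INTEGER PRIMARY KEY, ParFirst TEXT, ParLast TEXT);
        CREATE TABLE Children (ChildId INTEGER PRIMARY KEY, ParentId INTEGER, ChildFirst TEXT);
        INSERT INTO Parents  VALUES (1, 'Mary', 'Smith'), (2, 'John', 'Doe');
        INSERT INTO Children VALUES (1, 1, 'Sally'), (2, 1, 'Jim'), (3, 1, 'Kim');
    """)
    rows = con.execute("""
        WITH numbered AS (
            -- Number each parent's children alphabetically: 1, 2, 3, ...
            SELECT ParentId, ChildFirst,
                   ROW_NUMBER() OVER (PARTITION BY ParentId ORDER BY ChildFirst) AS rowNo
            FROM Children
        )
        SELECT p.ParFirst, p.ParLast,
               -- Each CASE keeps one numbered child; MAX collapses the group to it.
               MAX(CASE WHEN n.rowNo = 1 THEN n.ChildFirst END) AS Child1,
               MAX(CASE WHEN n.rowNo = 2 THEN n.ChildFirst END) AS Child2,
               MAX(CASE WHEN n.rowNo = 3 THEN n.ChildFirst END) AS Child3
        FROM Parents p
        LEFT JOIN numbered n ON n.ParentId = p.ParentId
        GROUP BY p.ParentId, p.ParFirst, p.ParLast
        ORDER BY p.ParentId
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in children_in_columns():
        print(row)
```

As with the OUTER APPLY version, a fourth child would be silently dropped unless a Child4 column is added.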

Related

Join with adding new row

I have a query which returns the following table, named first_table:
Name | ID
------------------------------
First | 1
Second | 2
I need to join it to another table, named second_table:
ID | ParentID
------------------------------
22 | 1
33 | 323
The join condition is first_table."ID" = second_table."ParentID": whenever a first_table ID has a matching ParentID, I need to add one more row carrying that first_table."Name" value and the second_table ID.
So the result should be:
Name | ID
------------------------------
First | 1
First | 22
Second | 2
You can do something like this (here t1 and t2 stand for first_table and second_table, and parent_id for ParentID):
select t1.name,t1.id
from t1 join t2 on t1.id = t2.parent_id
union
select t1.name,t2.id
from t1 join t2 on t1.id = t2.parent_id
union
select t1.name,t1.id
from t1
where t1.id not in (select parent_id from t2)
order by name,id
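For a quick check against the sample data from the question, here is a small sketch that runs on an in-memory SQLite database. It is a simplified variant, not the exact query above: since the first and third branches together return every row of the first table, it uses a single branch for them. The lower-case column names are an assumption.

```python
import sqlite3

def parents_plus_child_ids():
    """Return every first_table row plus one extra row per matching second_table row."""
    con = sqlite3.connect(":memory:")
    # Sample rows copied from the question.
    con.executescript("""
        CREATE TABLE first_table  (name TEXT, id INTEGER);
        CREATE TABLE second_table (id INTEGER, parent_id INTEGER);
        INSERT INTO first_table  VALUES ('First', 1), ('Second', 2);
        INSERT INTO second_table VALUES (22, 1), (33, 323);
    """)
    rows = con.execute("""
        -- Branch 1: every row of first_table as it is.
        SELECT name, id FROM first_table
        UNION
        -- Branch 2: the parent's name paired with each matching child id.
        SELECT f.name, s.id
        FROM first_table f
        JOIN second_table s ON s.parent_id = f.id
        ORDER BY name, id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in parents_plus_child_ids():
        print(row)
```

The row with parent_id 323 has no match in first_table, so it contributes nothing to the result.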

Cascading sum hierarchy using recursive cte

I'm trying to write a recursive CTE in Postgres, but I can't wrap my head around it. Performance shouldn't be a problem, since TABLE 1 has only 50 items.
TABLE 1 (expense):
id | parent_id | name
------------------------------
1 | null | A
2 | null | B
3 | 1 | C
4 | 1 | D
TABLE 2 (expense_amount):
ref_id | amount
-------------------------------
3 | 500
4 | 200
Expected Result:
id, name, amount
-------------------------------
1 | A | 700
2 | B | 0
3 | C | 500
4 | D | 200
Query
WITH RECURSIVE cte AS (
SELECT
expenses.id,
name,
parent_id,
expense_amount.total
FROM expenses
WHERE expenses.parent_id IS NULL
LEFT JOIN expense_amount ON expense_amount.expense_id = expenses.id
UNION ALL
SELECT
expenses.id,
expenses.name,
expenses.parent_id,
expense_amount.total
FROM cte
JOIN expenses ON expenses.parent_id = cte.id
LEFT JOIN expense_amount ON expense_amount.expense_id = expenses.id
)
SELECT
id,
SUM(amount)
FROM cte
GROUP BY 1
ORDER BY 1
Results
id | sum
--------------------
1 | null
2 | null
3 | 500
4 | 200
You can do a conditional sum() for only the root row:
with recursive tree as (
select id, parent_id, name, id as root_id
from expense
where parent_id is null
union all
select c.id, c.parent_id, c.name, p.root_id
from expense c
join tree p on c.parent_id = p.id
)
select e.id,
e.name,
e.root_id,
case
when e.id = e.root_id then sum(ea.amount) over (partition by root_id)
else amount
end as amount
from tree e
left join expense_amount ea on e.id = ea.ref_id
order by id;
I prefer doing the recursive part first, then join the related tables to the result of the recursive query, but you could do the join to the expense_amount also inside the CTE.
Online example: http://rextester.com/TGQUX53703
However, the above only aggregates on the top-level parent, not for any intermediate non-leaf rows.
If you want to see intermediate aggregates as well, this gets a bit more complicated (and probably won't scale well to large results, but you said your tables aren't that big):
with recursive tree as (
select id, parent_id, name, 1 as level, concat('/', id) as path, null::numeric as amount
from expense
where parent_id is null
union all
select c.id, c.parent_id, c.name, p.level + 1, concat(p.path, '/', c.id), ea.amount
from expense c
join tree p on c.parent_id = p.id
left join expense_amount ea on ea.ref_id = c.id
)
select e.id,
lpad(' ', (e.level - 1) * 2, ' ')||e.name as name,
e.amount as element_amount,
(select sum(amount)
from tree t
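-- match the node itself or anything below it; the '/' keeps /1 from matching /10
where t.path = e.path or t.path like e.path||'/%') as sub_tree_amount,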
e.path
from tree e
order by path;
Online example: http://rextester.com/MCE96740
The query builds up a path of all IDs belonging to a (sub)tree and then uses a scalar sub-select to get all child rows belonging to a node. That sub-select is what will make this quite slow as soon as the result of the recursive query can't be kept in memory.
I used the level column to create a "visual" display of the tree structure. This helps me debug the statement and understand the result better. If you need the real name of an element in your program, you would obviously use only e.name instead of prepending blanks to it.
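The path-based subtree sum can be tried out with the sketch below, which runs on an in-memory SQLite database rather than Postgres. It is a variant under a few assumptions: every path ends with a '/' so that a prefix match on /1/ cannot pick up /10/, the recursion runs first and the amounts are joined afterwards, and rows 10 and 11 are hypothetical additions to show the prefix case.

```python
import sqlite3

def subtree_amounts():
    """Return (id, name, sub_tree_amount) for every expense node."""
    con = sqlite3.connect(":memory:")
    # Rows 1-4 come from the question; rows 10 and 11 are made up.
    con.executescript("""
        CREATE TABLE expense (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
        CREATE TABLE expense_amount (ref_id INTEGER, amount INTEGER);
        INSERT INTO expense VALUES
            (1, NULL, 'A'), (2, NULL, 'B'), (3, 1, 'C'), (4, 1, 'D'),
            (10, NULL, 'E'), (11, 10, 'F');
        INSERT INTO expense_amount VALUES (3, 500), (4, 200), (11, 50);
    """)
    rows = con.execute("""
        WITH RECURSIVE tree AS (
            -- Roots start a path such as '/1/'.
            SELECT id, name, '/' || id || '/' AS path
            FROM expense
            WHERE parent_id IS NULL
            UNION ALL
            -- Children extend the parent's path: '/1/3/'.
            SELECT c.id, c.name, p.path || c.id || '/'
            FROM expense c
            JOIN tree p ON c.parent_id = p.id
        ),
        amounts AS (
            -- Attach each node's own amount, treating "no amount" as 0.
            SELECT t.id, t.name, t.path, COALESCE(ea.amount, 0) AS amount
            FROM tree t
            LEFT JOIN expense_amount ea ON ea.ref_id = t.id
        )
        SELECT e.id, e.name,
               -- Sum the node itself and everything whose path starts with its path.
               (SELECT SUM(a.amount) FROM amounts a
                WHERE a.path LIKE e.path || '%') AS sub_tree_amount
        FROM amounts e
        ORDER BY e.id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in subtree_amounts():
        print(row)
```

With paths that do not end in a delimiter, node A (path /1) would also match /10 and /10/11 and its total would come out too high.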
I could not get your query to work for some reason. Here's my attempt, which works without recursion for the particular table you provided (parent-child, no grandchild).
--- step 1: get parent-child data together
with parent_child as(
select t.*, amount
from
(select e.id, f.name as name,
coalesce(f.name, e.name) as pname
from expense e
left join expense f
on e.parent_id = f.id) t
left join expense_amount ea
on ea.ref_id = t.id
)
--- final step is to group by id, name
select id, pname, sum(amount)
from
(-- step 2: group by parent name and find corresponding amount
-- returns A, B
select e.id, t.pname, t.amount
from expense e
join (select pname, sum(amount) as amount
from parent_child
group by 1) t
on t.pname = e.name
-- step 3: to get C, D we union and get corresponding columns
-- results in all rows and corresponding value
union
select id, name, amount
from expense e
left join expense_amount ea
on e.id = ea.ref_id
) t
group by 1, 2
order by 1;

Avoiding Order By in T-SQL

The sample query below is part of my main query. I found that the SORT operator in it accounts for 30% of the cost.
I know that creating indexes would avoid the SORT. Is there any other way to optimize this code?
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA
WHERE ID = r.ID
AND Status = 3
AND TableA_ID >ISNULL((
SELECT TOP 1 TableA_ID
FROM TableA
WHERE ID = r.ID
AND Status <> 3
ORDER BY T_Date DESC
), 0)
ORDER BY T_Date ASC
It looks like you can use NOT EXISTS rather than the sorts. I think you'll probably get a better performance boost by using a CTE or derived table instead of a scalar subquery.
select *
from r ... left outer join
(
select ID, min(t_date) as min_date from TableA t1
where status = 3 and not exists (
select 1 from TableA t2
where t2.ID = t1.ID
and t2.status <> 3 and t2.t_date > t1.t_date
)
group by ID
) as md on md.ID = r.ID ...
or
select *
from r ... left outer join
(
select t1.ID, min(t1.t_date) as min_date
from TableA t1 left outer join TableA t2
on t2.ID = t1.ID and t2.status <> 3 and t2.t_date > t1.t_date
where t1.status = 3 and t2.ID is null
group by t1.ID
) as md on md.ID = r.ID ...
It also appears that you're relying on an identity column but it's not clear what those values mean. I'm basically ignoring it and using the date column instead.
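To see what the date-based rewrite returns, here is a minimal sketch on an in-memory SQLite database. It runs a NOT EXISTS query and an anti-join (LEFT JOIN ... IS NULL) with the same intent and returns both results. The TableA rows and the ISO-formatted text dates are assumptions made up for the example.

```python
import sqlite3

def earliest_final_status_dates():
    """Per ID, the earliest Status = 3 date that has no later non-3 row."""
    con = sqlite3.connect(":memory:")
    # Hypothetical data. ISO date strings compare correctly as text.
    con.executescript("""
        CREATE TABLE TableA (TableA_ID INTEGER PRIMARY KEY, ID INTEGER,
                             Status INTEGER, T_Date TEXT);
        INSERT INTO TableA (ID, Status, T_Date) VALUES
            (1, 3, '2020-01-01'), (1, 1, '2020-01-05'),
            (1, 3, '2020-01-10'), (1, 3, '2020-01-15'),
            (2, 3, '2020-02-01'), (2, 2, '2020-02-02'),
            (3, 3, '2020-03-01');
    """)
    not_exists = con.execute("""
        SELECT t1.ID, MIN(t1.T_Date) AS min_date
        FROM TableA t1
        WHERE t1.Status = 3
          AND NOT EXISTS (SELECT 1 FROM TableA t2
                          WHERE t2.ID = t1.ID
                            AND t2.Status <> 3
                            AND t2.T_Date > t1.T_Date)
        GROUP BY t1.ID
        ORDER BY t1.ID
    """).fetchall()
    anti_join = con.execute("""
        SELECT t1.ID, MIN(t1.T_Date) AS min_date
        FROM TableA t1
        LEFT JOIN TableA t2
               ON t2.ID = t1.ID AND t2.Status <> 3 AND t2.T_Date > t1.T_Date
        -- t2.ID IS NULL keeps only rows with no later non-3 row.
        WHERE t1.Status = 3 AND t2.ID IS NULL
        GROUP BY t1.ID
        ORDER BY t1.ID
    """).fetchall()
    con.close()
    return not_exists, anti_join

if __name__ == "__main__":
    print(earliest_final_status_dates())
```

ID 2 is missing from both results because its only Status = 3 row is followed by a later row with another status.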
Try this:
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA a1
LEFT JOIN (
SELECT ID, MAX(TableA_ID) AS MaxAID
FROM TableA
WHERE Status <> 3
GROUP BY ID
) a2 ON a2.ID = a1.ID
WHERE a1.ID = r.ID AND a1.Status = 3
AND a1.TableA_ID > COALESCE(a2.MaxAID, 0)
ORDER BY T_Date ASC
The use of TOP 1 in combination with the unexplained r alias concerns me. There's almost certainly a MUCH better way to get this data into your results that doesn't involve doing this in a subquery (unless this is for an APPLY operation).

Sorting rows by children?

I have this table:
CREATE TABLE items (
id SERIAL PRIMARY KEY,
data TEXT,
parent INT,
posted INT
);
Each item has a piece of data, a timestamp, and a parent. I'd like to select the top 10 root items (parent = 0), sorted by the timestamp of the most recent child.
If item #1 has a child #2 that has a child #3, #3 is considered a child of #1.
How can I do this?
EDIT:
The query has been rewritten to:
1. first sort the child items
2. get the root parent id and the rank for each item
3. select the top 10 parents
4. select the details for the top 10 parents
Common Table expressions have been used to incrementally select the data following the above steps.
WITH recursive c AS
(
SELECT *
FROM seeds
UNION ALL
SELECT
T.id,
T.parent,
c.topParentID,
(c.child_level + 1),
c.child_rank
FROM items AS T
INNER JOIN c ON T.parent = c.id
WHERE T.id <> T.parent
)
, seeds AS
(
SELECT
id,
parent,
parent AS topParentID,
0 AS child_level,
rank() OVER (ORDER BY posted DESC) child_rank
FROM items
WHERE parent <> 0
ORDER BY posted DESC
)
, rank_level AS
(
SELECT DISTINCT
c2.id id,
c_ranks.min_child_rank child_rank,
c_roots.max_child_level root_level
FROM
(
SELECT
id,
MAX(child_level) max_child_level
FROM c
GROUP BY id
)
c_roots
INNER JOIN c c2 ON c_roots.id = c2.id
INNER JOIN
(
SELECT
id,
MIN(child_rank) min_child_rank
FROM c
GROUP BY id
)
c_ranks
ON c2.id = c_ranks.id
)
, top_10_parents AS
(
SELECT
c.topParentID id,
MIN(rl.child_rank) id_rank
FROM rank_level rl
INNER JOIN c ON rl.id = c.id AND c.child_level = rl.root_level
GROUP BY c.topParentID
ORDER BY MIN(rl.child_rank)
limit 10
)
SELECT
i.*
FROM
items i
INNER JOIN top_10_parents tp ON tp.id = i.id
ORDER BY tp.id_rank;
Reference:
WITH Queries (Common Table Expressions) on PostgreSQL Manual
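For comparison, here is a shorter sketch of the same goal. It is a different, simpler technique than the query above: it walks down from each root, carries the root id along, and takes the latest timestamp among the descendants. It runs on an in-memory SQLite database; the sample rows are assumptions, and roots without children sort last here only because SQLite treats NULL as the smallest value (Postgres would need NULLS LAST).

```python
import sqlite3

def top_roots_by_latest_descendant(limit=10):
    """Return (id, data, latest) for root items, newest descendant first."""
    con = sqlite3.connect(":memory:")
    # Hypothetical data: item 5 is a grandchild of root 1, root 3 has no children.
    con.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, data TEXT, parent INT, posted INT);
        INSERT INTO items VALUES
            (1, 'root a', 0, 100), (2, 'root b', 0, 200), (3, 'root c', 0, 300),
            (4, 'child of a', 1, 150), (5, 'grandchild of a', 4, 900),
            (6, 'child of b', 2, 500);
    """)
    rows = con.execute("""
        WITH RECURSIVE tree(root_id, id, posted) AS (
            -- Each root starts its own tree.
            SELECT id, id, posted FROM items WHERE parent = 0
            UNION ALL
            -- Every descendant inherits the root id of its parent.
            SELECT t.root_id, i.id, i.posted
            FROM items i
            JOIN tree t ON i.parent = t.id
        )
        SELECT r.id, r.data,
               -- Ignore the root's own timestamp; only descendants count.
               MAX(CASE WHEN t.id <> t.root_id THEN t.posted END) AS latest
        FROM tree t
        JOIN items r ON r.id = t.root_id
        GROUP BY r.id, r.data
        ORDER BY latest DESC
        LIMIT ?
    """, (limit,)).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in top_roots_by_latest_descendant():
        print(row)
```

The sketch assumes the parent links contain no cycles; a row that is its own ancestor would make the recursion run forever.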

Aggregate similar row

Suppose I have a table like this:
NAME REF1 REF2 DRCT
A (null) Ra D1
A Rb (null) D1
A (null) Rc D2
B Rd (null) D3
B (null) Re D3
I want to aggregate this table into something like:
NAME REF1 REF2 DRCT
A Rb Ra D1
A (null) Rc D2
B Rd Re D3
As you can see, I want to aggregate the rows that share the same NAME and DRCT. I've searched through COALESCE and various aggregate functions, but I haven't found what I was looking for. Any ideas?
Assuming that what I asked about in my previous comment is true (only null or one given value for REF1 and REF2 for each NAME, DRCT pair), this seems to work:
select NAME, M_REF1, M_REF2, DRCT
from (
select A.NAME, coalesce(A.REF1, B.REF1) m_REF1,
coalesce(A.REF2, B.REF2) m_REF2, A.REF1 A_REF1, B.REF1 B_REF1,
A.REF2 A_REF2, B.REF2 B_REF2, A.DRCT
from Table1 A JOIN Table1 B on A.NAME = B.NAME AND A.DRCT = B.DRCT)
WHERE A_REF1 = m_REF1 AND B_REF2 = m_REF2
UNION
select A.NAME, A.REF1, A.REF2, A.DRCT
FROM Table1 A JOIN
(select NAME, DRCT, COUNT(*)
from Table1
group by NAME, DRCT
HAVING COUNT(*) = 1) B ON A.NAME = B.NAME AND A.DRCT = B.DRCT;
The union is used because the rows with only one record are not included in the first SELECT.
But this is somewhat simpler, and works too:
select A.NAME, coalesce(A.REF1, B.REF1) M_REF1, coalesce(A.REF2,B.REF2) M_REF2,A.DRCT
from Table1 A LEFT OUTER JOIN Table1 B ON A.DRCT = B.DRCT AND A.NAME = B.NAME
WHERE NVL2(A.REF1,0,1) = 1 AND NVL2(B.REF1,0,1) =0
AND NVL2(A.REF2,0,1) = 0 AND NVL2(B.REF2,0,1) = 1
UNION
select A.NAME, A.REF1, A.REF2, A.DRCT
FROM Table1 A JOIN
(select NAME, DRCT, COUNT(*)
from Table1
group by NAME, DRCT
HAVING COUNT(*) = 1) B ON A.NAME = B.NAME AND A.DRCT = B.DRCT;
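Under the same assumption (at most one non-null REF1 and one non-null REF2 per NAME, DRCT pair), a plain GROUP BY with MAX gives the same shape, because aggregate functions skip NULLs. This is a different technique from the self-joins above, sketched here on an in-memory SQLite database with the rows from the question.

```python
import sqlite3

def merge_rows():
    """Collapse rows sharing NAME and DRCT, keeping the non-null REF values."""
    con = sqlite3.connect(":memory:")
    # Sample rows copied from the question.
    con.executescript("""
        CREATE TABLE Table1 (NAME TEXT, REF1 TEXT, REF2 TEXT, DRCT TEXT);
        INSERT INTO Table1 VALUES
            ('A', NULL, 'Ra', 'D1'),
            ('A', 'Rb', NULL, 'D1'),
            ('A', NULL, 'Rc', 'D2'),
            ('B', 'Rd', NULL, 'D3'),
            ('B', NULL, 'Re', 'D3');
    """)
    rows = con.execute("""
        -- MAX ignores NULLs, so it returns the single non-null value per group,
        -- or NULL when the whole group is NULL.
        SELECT NAME, MAX(REF1) AS REF1, MAX(REF2) AS REF2, DRCT
        FROM Table1
        GROUP BY NAME, DRCT
        ORDER BY NAME, DRCT
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in merge_rows():
        print(row)
```

If a pair could hold two different non-null values in the same column, MAX would keep only the larger one, so the assumption matters.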