A real recursion with CTE? - tsql

I just discovered CTEs this afternoon, and after enjoying them for two hours I realized that they do not perform recursion the way we all learned it in other languages.
What I mean is, I always think of recursion as a tree search. So I was expecting the CTE to go all the way down to the first leaf it finds, but no. It works by layers: it begins with the head, then all the branches, then all the sub-branches, and so on, and only THEN the leaves.
Is there a way to make it search differently? Perhaps I missed something...
I work on SQL Server 2005 (no, I can't upgrade to 2008).
To make things clear, I don't want:
team1
team2
team3
team1-1
team3-1
team1-2
but
team1
team1-1
team1-2
team2
team3
team3-1
Thanks

You can build a column to sort by when you do the recursion.
Something like this:
declare @T table
(
  ID int,
  ParentID int,
  Name varchar(10)
);
-- Multi-row VALUES requires SQL Server 2008 or later; on 2005, use one insert per row.
insert into @T values
(1, null, 'team1'),
(2, null, 'team2'),
(3, null, 'team3'),
(4, 1, 'team1-1'),
(5, 1, 'team1-2'),
(6, 3, 'team3-1');
with C as
(
  -- Anchor: root rows get a zero-padded sequence number as their sort key.
  select T.ID,
         T.ParentID,
         T.Name,
         cast(right(100000 + row_number() over(order by T.ID), 5) as varchar(max)) as Sort
  from @T as T
  where T.ParentID is null
  union all
  -- Recursive step: append the child's sequence number to the parent's sort key.
  select T.ID,
         T.ParentID,
         T.Name,
         C.Sort+right(100000 + row_number() over(order by T.ID), 5)
  from @T as T
    inner join C
      on T.ParentID = C.ID
)
select *
from C
order by Sort
Result:
ID          ParentID    Name       Sort
----------- ----------- ---------- ------------
1           NULL        team1      00001
4           1           team1-1    0000100001
5           1           team1-2    0000100002
2           NULL        team2      00002
3           NULL        team3      00003
6           3           team3-1    0000300001
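
For readers without a SQL Server instance at hand, the same sort-path idea can be tried with Python's built-in sqlite3 module. This is only a sketch under a few assumptions: the table name team is made up for the example, SQLite does not allow window functions in the recursive part of a CTE, so the zero-padded ID itself is used instead of row_number(), and the padding width of 5 assumes IDs stay below 100000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team (id INTEGER, parent_id INTEGER, name TEXT);
    INSERT INTO team VALUES
        (1, NULL, 'team1'), (2, NULL, 'team2'), (3, NULL, 'team3'),
        (4, 1, 'team1-1'), (5, 1, 'team1-2'), (6, 3, 'team3-1');
""")

DEPTH_FIRST = """
WITH RECURSIVE c(id, parent_id, name, sort) AS (
    -- Anchor: each root starts a path made of its own zero-padded id.
    SELECT id, parent_id, name, printf('%05d', id)
    FROM team
    WHERE parent_id IS NULL
    UNION ALL
    -- Recursive step: extend the parent's path with the child's padded id.
    SELECT t.id, t.parent_id, t.name, c.sort || printf('%05d', t.id)
    FROM team AS t
    JOIN c ON t.parent_id = c.id
)
SELECT name, sort FROM c ORDER BY sort
"""

# Sorting by the accumulated path lists every parent directly before its children.
rows = conn.execute(DEPTH_FIRST).fetchall()
for name, sort in rows:
    print(name, sort)
```

The CTE still expands layer by layer internally; only the final order by on the path column makes the output read like a depth-first walk.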

Related

Unpivot Columns with Most Recent Record

Student records are updated per subject, each with its own update date. A student can be enrolled in one or multiple subjects. I would like to get, for each student and subject, the record with the most recent update date and its status.
CREATE TABLE Student
(
StudentID int,
FirstName varchar(100),
LastName varchar(100),
FullAddress varchar(100),
CityState varchar(100),
MathStatus varchar(100),
MUpdateDate datetime2,
ScienceStatus varchar(100),
SUpdateDate datetime2,
EnglishStatus varchar(100),
EUpdateDate datetime2
);
Desired query output is shown first. I am currently using the CTE method that follows it, but I am trying to find a better alternative.
SELECT StudentID, FirstName, LastName, FullAddress, CityState, [SubjectStatus], UpdateDate
FROM Student
;WITH original AS
(SELECT * FROM Student)
,Math as
(
SELECT DISTINCT o.StudentID, o.FirstName, o.LastName, o.FullAddress, o.CityState,
ROW_NUMBER() OVER (PARTITION BY _o.StudentID, _o.MathStatus ORDER BY _o.MUpdateDate DESC) as rn
, _o.MathStatus as SubjectStatus, _o.MUpdateDate as UpdateDate
FROM original as o
left join original as _o on o.StudentID = _o.StudentID
where _o.MathStatus is not null and _o.MUpdateDate is not null
)
,Science AS
(
...--Same as Math
)
,English AS
(
...--Same As Math
)
SELECT * FROM Math WHERE rn = 1
UNION
SELECT * FROM Science WHERE rn = 1
UNION
SELECT * FROM English WHERE rn = 1
First: storing data in a denormalized form is not recommended. Some data model redesign might be in order. There are multiple resources about data normalization available on the web.
Now then, I made some guesses about how your source table is populated based on the query you wrote. I generated some sample data that could show how the source data is created. Besides that I also reduced the number of columns to reduce my typing efforts. The general approach should still be valid.
Sample data
create table Student
(
StudentId int,
StudentName varchar(15),
MathStat varchar(5),
MathDate date,
ScienceStat varchar(5),
ScienceDate date
);
insert into Student (StudentID, StudentName, MathStat, MathDate, ScienceStat, ScienceDate) values
(1, 'John Smith', 'A', '2020-01-01', 'B', '2020-05-01'),
(1, 'John Smith', 'A', '2020-01-01', 'B+', '2020-06-01'), -- B for Science was updated to B+ month later
(2, 'Peter Parker', 'F', '2020-01-01', 'A', '2020-05-01'),
(2, 'Peter Parker', 'A+', '2020-03-01', 'A', '2020-05-01'), -- Spider-Man would never fail Math, fixed...
(3, 'Tom Holland', null, null, 'A', '2020-05-01'),
(3, 'Tom Holland', 'A-', '2020-07-01', 'A', '2020-05-01'); -- Tom was sick for Math, but got a second chance
Solution
Your question title already contains the word unpivot, which actually exists in T-SQL as a keyword; you can learn about it in the documentation. Your own solution already uses common table expressions, so these constructions should look familiar.
Steps:
1. cte_unpivot: unpivot all rows, create a Subject column, and place the corresponding values (SubjectStat, Date) next to it with a case expression.
2. cte_recent: number the rows to find the most recent row per student and subject.
3. Select only those most recent rows.
This gives:
with cte_unpivot as
(
select up.StudentId,
up.StudentName,
case up.[Subject]
when 'MathStat' then 'Math'
when 'ScienceStat' then 'Science'
end as [Subject],
up.SubjectStat,
case up.[Subject]
when 'MathStat' then up.MathDate
when 'ScienceStat' then up.ScienceDate
end as [Date]
from Student s
unpivot ([SubjectStat] for [Subject] in ([MathStat], [ScienceStat])) up
),
cte_recent as
(
select cu.StudentId, cu.StudentName, cu.[Subject], cu.SubjectStat, cu.[Date],
row_number() over (partition by cu.StudentId, cu.[Subject] order by cu.[Date] desc) as [RowNum]
from cte_unpivot cu
)
select cr.StudentId, cr.StudentName, cr.[Subject], cr.SubjectStat, cr.[Date]
from cte_recent cr
where cr.RowNum = 1;
Result
StudentId StudentName Subject SubjectStat Date
----------- --------------- ------- ----------- ----------
1 John Smith Math A 2020-01-01
1 John Smith Science B+ 2020-06-01
2 Peter Parker Math A+ 2020-03-01
2 Peter Parker Science A 2020-05-01
3 Tom Holland Math A- 2020-07-01
3 Tom Holland Science A 2020-05-01
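
On engines without an unpivot keyword, the same two steps can be sketched with one UNION ALL branch per subject followed by row_number(). The example below swaps in that technique and runs it through Python's sqlite3 module against the sample data above. It assumes a SQLite build with window functions (3.25 or later), stores dates as ISO text so they sort correctly, and uses the made-up column name UpdateDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (
        StudentId INTEGER, StudentName TEXT,
        MathStat TEXT, MathDate TEXT,
        ScienceStat TEXT, ScienceDate TEXT
    );
    INSERT INTO Student VALUES
        (1, 'John Smith',   'A',  '2020-01-01', 'B',  '2020-05-01'),
        (1, 'John Smith',   'A',  '2020-01-01', 'B+', '2020-06-01'),
        (2, 'Peter Parker', 'F',  '2020-01-01', 'A',  '2020-05-01'),
        (2, 'Peter Parker', 'A+', '2020-03-01', 'A',  '2020-05-01'),
        (3, 'Tom Holland',  NULL, NULL,         'A',  '2020-05-01'),
        (3, 'Tom Holland',  'A-', '2020-07-01', 'A',  '2020-05-01');
""")

MOST_RECENT = """
WITH unpivoted AS (
    -- One branch per subject; rows without a status are skipped, as UNPIVOT does.
    SELECT StudentId, StudentName, 'Math' AS Subject,
           MathStat AS SubjectStat, MathDate AS UpdateDate
    FROM Student WHERE MathStat IS NOT NULL
    UNION ALL
    SELECT StudentId, StudentName, 'Science', ScienceStat, ScienceDate
    FROM Student WHERE ScienceStat IS NOT NULL
),
recent AS (
    -- Number rows per student and subject, newest first.
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY StudentId, Subject ORDER BY UpdateDate DESC) AS RowNum
    FROM unpivoted
)
SELECT StudentId, StudentName, Subject, SubjectStat, UpdateDate
FROM recent
WHERE RowNum = 1
ORDER BY StudentId, Subject
"""

latest = conn.execute(MOST_RECENT).fetchall()
for row in latest:
    print(row)
```

Adding a subject means adding one more UNION ALL branch, which is more typing than unpivot but makes the status/date pairing explicit.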

Is there a way to implement a many-to-one relationship in DB2?

I need to create a data structure like this:
Table 1
Code, Value, Offer_ID
I am creating a service that, for a given combination of "Code" and "Value", must return an Offer_ID that I preconfigured.
For example:
Code Value Offer_ID
------ ------- ----------
Age 30 OFF1
Age 30 OFF2
Province RM OFF2
Age 40 OFF3
Province TO OFF3
Age 40 OFF4
Province TO OFF4
Operator TIM OFF4
The calling service always calls me passing the Age, Province and Operator values.
I have to look in this table for an Offer_ID that matches all three values together (like OFF4), or two of them (like OFF3), or only Age, which is the only mandatory value (like OFF1).
So if the client passes me Province BO and Operator WIND, I have to return OFF1.
How can I do this? How should I structure the tables and the query?
I hope I managed to explain the problem...
Many thanks to anyone who can help; we are going crazy!
Try this:
with tab (age, province, operator, offer_id) as (values
(30, null, null, 'OFF1')
, (30, 'RM', null, 'OFF2')
, (40, 'TO', null, 'OFF3')
, (40, 'TO', 'TIM', 'OFF4')
)
, op_inp (age, province, operator) as (values
--(40, 'TO', 'TIM') --'OFF4'
(40, 'TO', 'VODAFONE') --'OFF3'
--(30, 'RM', 'VODAFONE') --'OFF2'
--(30, 'TO', 'VODAFONE') --'OFF1'
)
select offer_id /*Just for info*/, order_flag
from
(
select t.*, 3 as order_flag
from tab t
join op_inp o on o.age=t.age and o.province=t.province and o.operator=t.operator
union all
-- The "is null" checks keep a more specific offer (e.g. OFF4) out of the less specific branches.
select t.*, 2 as order_flag
from tab t
join op_inp o on o.age=t.age and o.province=t.province and t.operator is null
union all
select t.*, 1 as order_flag
from tab t
join op_inp o on o.age=t.age and t.province is null and t.operator is null
)
order by order_flag desc
fetch first 1 row only
;
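
The "most specific match wins" lookup can also be tried outside DB2. The sketch below uses Python's sqlite3 module and a slightly different formulation than the union above: a NULL in the offer row is treated as a wildcard, and candidates are ranked by how many optional columns they pin down. The table name offer and the function best_offer are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offer (age INTEGER, province TEXT, operator TEXT, offer_id TEXT);
    INSERT INTO offer VALUES
        (30, NULL, NULL,  'OFF1'),
        (30, 'RM', NULL,  'OFF2'),
        (40, 'TO', NULL,  'OFF3'),
        (40, 'TO', 'TIM', 'OFF4');
""")

def best_offer(age, province, operator):
    """Return the most specific offer_id for the input, or None if nothing matches."""
    row = conn.execute(
        """
        SELECT offer_id
        FROM offer
        WHERE age = ?
          AND (province IS NULL OR province = ?)   -- NULL acts as "any province"
          AND (operator IS NULL OR operator = ?)   -- NULL acts as "any operator"
        -- Rows that fix more optional columns are more specific and sort first.
        ORDER BY (province IS NOT NULL) + (operator IS NOT NULL) DESC
        LIMIT 1
        """,
        (age, province, operator),
    ).fetchone()
    return row[0] if row else None

print(best_offer(40, "TO", "VODAFONE"))
print(best_offer(30, "BO", "WIND"))
```

Storing one row per offer with nullable Province and Operator columns, as the answer does, is what makes this a single-table lookup instead of a join across Code/Value pairs.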

Postgresql dense ranking to start at 2 if there is an initial tie at 1

I have a table and a query that ranks items by cost and must not allow ties at position 1: if there is a tie at position 1, the ranking starts at 2.
Here is the schema with sample data:
CREATE TABLE applications
(id int, name char(10), cost int);
INSERT INTO applications
(id, name, cost)
VALUES
(1, 'nfhfjs', 10),
(2, 'oopdld', 20),
(3, 'Wedass', 14),
(4, 'djskck', 22),
(5, 'laookd', 25),
(6, 'mfjjf', 25),
(7, 'vfhgg', 28),
(8, 'nvopq', 29),
(9, 'nfhfj', 56),
(10, 'voapp', 56);
Here is the query
WITH start_tie AS (
SELECT
DENSE_RANK() OVER(ORDER BY cost DESC) cost_rank,
lead(cost,1) OVER (ORDER BY cost DESC) as next_app_cost
FROM
applications LIMIT 1
)
SELECT
*,
DENSE_RANK() OVER(ORDER BY cost DESC) cost_rank,
(CASE start_tie.cost_rank WHEN start_tie.next_app_cost THEN cost_rank+1 ELSE cost_rank END) AS right_cost_rank
FROM
applications;
my expected result is
id name cost cost_rank
10 voapp 56 2
9 nfhfj 56 2
8 nvopq 29 3
7 vfhgg 28 4
6 mfjjf 25 5
5 laookd 25 5
4 djskck 22 6
2 oopdld 20 7
3 Wedass 14 8
1 nfhfjs 10 9
Please modify the query to achieve the result.
All you need to do is to check if the highest cost is the same as the second-highest cost. And if that is the case, add 1 to all rank values:
with start_tie as (
select case
when cost = lead(cost) over (order by cost desc) then 1
else 0
end as tie_offset
from applications
order by cost desc
limit 1
)
select *,
dense_rank() over (order by cost desc) + (select tie_offset from start_tie) cost_rank
from applications;
Example: http://rextester.com/EKSLJK65530
If the number of ties defines the offset to be used for the "new" ranking, the offset could be calculated using this:
with start_tie as (
select count(*) - 1 as tie_offset
from applications a1
where cost = (select max(cost) from applications)
)
select *,
dense_rank() over(order by cost desc) + (select tie_offset from start_tie) cost_rank
from applications;
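
The fixed "+1 on a tie" variant can be checked quickly with Python's sqlite3 module. This is a sketch rather than the PostgreSQL query itself: it detects the tie by counting rows at the maximum cost instead of using lead(), so the outcome does not depend on which tied row sorts first. It assumes a SQLite build with window functions (3.25 or later), and the helper name ranked is made up.

```python
import sqlite3

QUERY = """
WITH start_tie AS (
    -- 1 if more than one row shares the highest cost, otherwise 0.
    SELECT CASE WHEN COUNT(*) > 1 THEN 1 ELSE 0 END AS tie_offset
    FROM applications
    WHERE cost = (SELECT MAX(cost) FROM applications)
)
SELECT id,
       cost,
       DENSE_RANK() OVER (ORDER BY cost DESC)
           + (SELECT tie_offset FROM start_tie) AS cost_rank
FROM applications
ORDER BY cost DESC, id DESC
"""

def ranked(costs):
    """Load costs with ids 1..n and return (id, cost, cost_rank) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE applications (id INTEGER, cost INTEGER)")
    conn.executemany(
        "INSERT INTO applications VALUES (?, ?)",
        list(enumerate(costs, start=1)),
    )
    return conn.execute(QUERY).fetchall()

# Costs from the question, in id order 1..10.
for row in ranked([10, 20, 14, 22, 25, 25, 28, 29, 56, 56]):
    print(row)
```

With no tie at the top, tie_offset is 0 and the result is a plain dense rank starting at 1.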
The "no tie at first place" rule is violated when more than one row has rank 1.
Replace r.cost_rank+x.c-1 with r.cost_rank+1 if the ranking should always start at 2, regardless of how many rows are tied at the top.
WITH r AS (
SELECT
*
,DENSE_RANK() OVER(ORDER BY cost DESC) cost_rank
FROM
applications
), x as (select count(*) as c from r where cost_rank=1)
SELECT
r.*, (CASE WHEN 1<x.c THEN r.cost_rank+x.c-1 ELSE r.cost_rank END) as fixed
FROM
r,x;

Hierarchical query rollup

I have the following table:
parent_id child_id child_class
1 2 1
1 3 1
1 4 2
2 5 2
2 6 2
Parent_id represents a folder id. Child id represents either a child folder (where child_class=1) or child file (where child_class=2).
I'd like to get a bottom-up rollup counter of files only (child_class=2). For example, if C is a leaf folder (no child folders) with 5 files, and B is a parent folder of C with 4 files of its own, the counter on C should say 5 and the counter on B should say 9 (5 from C plus 4 files in B), and so on recursively going bottom up, taking sibling folders into account.
In the example above I expect the results below (notice 3 is a child folder with no files in it):
parent_id FilesCounter
3 0
2 2
1 3
I prefer a SQL query for performance, but a function is also possible.
I tried mixing a hierarchical query with rollup (SQL Server 2008 R2), with no success so far.
Please advise.
This CTE should do the trick:
SELECT parent_id, child_id, child_class,
(SELECT COUNT(*) FROM tbl a WHERE a.parent_id = e.parent_id AND child_class <> 1) AS child_count
INTO tbl2
FROM tbl e
;WITH CTE (parent_id, child_id, child_class, child_count)
AS
(
-- Start with leaf nodes
SELECT parent_id, child_id, child_class, child_count
FROM tbl2
WHERE child_id NOT IN (SELECT parent_id from tbl)
UNION ALL
-- Recursively go up the chain
SELECT e.parent_id, e.child_id, e.child_class, e.child_count + d.child_count
FROM tbl2 e
INNER JOIN CTE AS d
ON e.child_id = d.parent_id
)
-- Statement that executes the CTE
SELECT FOLDERS.parent_id, max(ISNULL(child_count,0)) FilesCounter
FROM (SELECT parent_id FROM tbl2 WHERE parent_id NOT IN (select child_id from tbl2)
UNION
SELECT child_id FROM tbl2 WHERE child_class = 1) FOLDERS
LEFT JOIN CTE ON FOLDERS.parent_id = CTE.parent_id
GROUP BY FOLDERS.parent_id
Zak's answer was close, but the root folder did not roll up correctly. The following does the work:
with par_child as (
select 1 as parent_id, 2 as child_id, 1 as child_class
union all select 1, 3, 1
union all select 1, 4, 2
union all select 2, 5, 1
union all select 2, 6, 2
union all select 2, 10, 2
union all select 3, 11, 2
union all select 3, 7 , 2
union all select 5, 8 , 2
union all select 5, 9 , 2
union all select 5, 12, 1
union all select 5, 13, 1
)
, child_cnt as
(
select parent_id as root_parent_id, parent_id, child_id, child_class, 1 as lvl from par_child union all
select cc.root_parent_id, pc.parent_id, pc.child_id, pc.child_class, cc.lvl + 1 as lvl from
par_child pc join child_cnt cc on (pc.parent_id=cc.child_id)
),
distinct_folders as (
select distinct child_id as folder_id from par_child where child_class=1
)
select root_parent_id, count(child_id) as cnt from child_cnt where child_class=2 group by root_parent_id
union all
select folder_id, 0 from distinct_folders df where not exists (select 1 from par_child pc where df.folder_id=pc.parent_id)
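
To check the rollup against the small table from the question, here is a sketch using Python's sqlite3 module. It follows the same idea of expanding every folder into all of its descendants and counting the files among them, but it uses a left join from a folder list so that empty folders show up with 0 without a separate union. The table name tbl comes from the first answer; the CTE names folders and descendants are made up, and the sketch assumes the hierarchy contains no cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (parent_id INTEGER, child_id INTEGER, child_class INTEGER);
    INSERT INTO tbl VALUES (1, 2, 1), (1, 3, 1), (1, 4, 2), (2, 5, 2), (2, 6, 2);
""")

ROLLUP = """
WITH RECURSIVE folders(folder_id) AS (
    -- Every id that is a folder: any parent, plus children marked as folders.
    SELECT parent_id FROM tbl
    UNION
    SELECT child_id FROM tbl WHERE child_class = 1
),
descendants(root_id, child_id, child_class) AS (
    -- Anchor: the direct children of every folder.
    SELECT parent_id, child_id, child_class FROM tbl
    UNION ALL
    -- Recursive step: descend only through child folders, keeping the root.
    SELECT d.root_id, t.child_id, t.child_class
    FROM tbl AS t
    JOIN descendants AS d
      ON t.parent_id = d.child_id AND d.child_class = 1
)
SELECT f.folder_id, COUNT(d.child_id) AS FilesCounter
FROM folders AS f
LEFT JOIN descendants AS d
  ON d.root_id = f.folder_id AND d.child_class = 2   -- count files only
GROUP BY f.folder_id
ORDER BY f.folder_id DESC
"""

counts = conn.execute(ROLLUP).fetchall()
for folder_id, files in counts:
    print(folder_id, files)
```

Expanding every folder into its full subtree is simple but grows with depth times size; for very large trees a bottom-up aggregation may be worth measuring against it.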

Find duplicate row "details" in table

OrderId OrderCode Description
-------------------------------
1 Z123 Stuff
2 ABC999 Things
3 Z123 Stuff
I have duplicates in a table like the above. I'm trying to get a report of which Orders are duplicates, and what Order they are duplicates of, so I can figure out how they got into the database.
So ideally I'd like to get an output something like;
OrderId IsDuplicatedBy
-------------------------
1 3
3 1
I can't work out how to code this in SQL.
You can use the same table twice in one query and join on the fields you need to check against. T1.OrderID <> T2.OrderID is needed to not find a duplicate for the same row.
declare @T table (OrderID int, OrderCode varchar(10), Description varchar(50))
insert into @T values
(1, 'Z123', 'Stuff'),
(2, 'ABC999', 'Things'),
(3, 'Z123', 'Stuff')
select
  T1.OrderID,
  T2.OrderID as IsDuplicatedBy
from @T as T1
  inner join @T as T2
    on T1.OrderCode = T2.OrderCode and
       T1.Description = T2.Description and
       T1.OrderID <> T2.OrderID
Result:
OrderID IsDuplicatedBy
1 3
3 1