Getting depth-first traversal insted of breadth first in T-SQL - tsql

I have the following T-SQL function: https://gist.github.com/cwattengard/11365802
This returns data in a breadth-first traversal. Is there a simple way to make this function return its data in a depth-first traversal? I have a treeview-component that excpects this (legacy system).
I already have a similar stored procedure that returns the tree in a depth-first traversal, but it's using cursors and is really slow. (6-7 seconds as opposed to this function that takes less than a second on the same data).

I think I just had a eureka moment. If I add the Path variable already supplied by the CTE, and sort by that, I get what I want. The OrgID is a unique ID. So ordering by it would make it sort by the expected output for the user (chronologically) and be depth-first for the treeview.

http://sqlanywhere.blogspot.in/2012/10/example-recursive-union-tree-traversal.html
Here's a diagram showing the primary keys for a tree-structured table:
1
|
---------------------------------------
2 93 4 5
| | | |
-------------- ------------ -------- ------
6 7 8 9 10 11 12 13 14 15 16 17 18 19
| | | |
----- ----- ----- -----
27 26 25 24 23 22 21 20
Here's what the breadth-first and depth-first queries should return:
Breadth-First Depth-First
1 1
2 2
93 6
4 7
5 27
6 26
7 8
8 9
9 10
10 93
11 11
12 12
13 13
14 25
15 24
16 14
17 4
18 15
19 16
27 17
26 23
25 22
24 5
23 18
22 21
21 20
20 19

If you order the output by tekst will that do it?
First populate a table variable #unsorted inside the function; then finally return Select * from #unsorted order by tekst?

I know this is very late to the game, but seems like you could use hierarchyid to get a nice depth first search...
The github file referenced in the op appears to have gone missing, but the basic formula is
Put hierarchyid::GetRoot() as Foo in your CTE anchor query
Put cast (cte.Foo.ToString() + cast(row_number() over(order by ) as varchar) + '/' as hierarchyid) as Foo in your recursive query
order By Foo when you invoke the CTE
and the results come out depth first

Related

Trying to partition to remove rows where two columns don't match sql

How can I filter out rows within a group that do not have matching values in two columns?
I have a table A like:
CODE
US_ID
US_PRICE
NON_US_ID
NON_US_PRICE
5109
57
10
75
10
0206
85
11
58
11
0206
85
15
33
14
0206
85
41
22
70
T100
20
10
49
NULL
T100
20
38
64
38
Within each CODE group, I want to check whether US_PRICE = NON_US_PRICE and remove that row from the resulting table.
I tried:
SELECT *,
CASE WHEN US_PRICE != NON_US_PRICE OVER (PARTITION BY CODE) END
FROM A;
but I think I am missing something when I try to partition by CODE.
I want the resulting table to look like
CODE
US_ID
US_PRICE
NON_US_ID
NON_US_PRICE
0206
85
15
33
14
0206
85
41
22
70
T100
20
10
49
NULL
For provided sample, simple WHERE clause could produce such result:
SELECT *
FROM A
WHERE US_PRICE IS DISTINCT FROM NON_US_PRICE;
IS DISTINCT FROM handles NULLs comparing to != operator.

PostgreSQL query with UNIQUE values returned based on condition

Its an example of a table from PostgreSQL.
I learning the SQL query and cant find anything to help me pass this.
What I`m working to achieve is:
Return UNIQ(DISTINCT) values of WNR WHEN tdate >='2020-01-13 00:00:01.757000'
WNR tdate T1 T2 T3
2 '2020-01-06 00:05:23.229000' 8 18 15
2 '2020-01-06 00:05:23.725000' 11 4 7
2 '2020-01-06 00:05:31.578000' 19 12 6
3 '2020-01-13 00:00:01.655000' 9 9 3
3 '2020-01-13 00:00:01.757000' 5 11 16
3 '2020-01-13 00:00:05.778000' 16 17 16
4 '2020-01-20 00:00:11.925000' 18 13 4
4 '2020-01-20 00:00:12.177000' 18 3 15
4 '2020-01-20 00:00:12.694000' 7 12 7
5 '2020-01-27 00:00:04.860000' 19 3 14
5 '2020-01-27 00:00:05.056000' 14 18 8
5 '2020-01-27 00:00:05.107000' 18 7 14
Result expected should be 3,4,5
Thank you!
To select distinct values in Postgresql you can use DISTINCT clause.
From Postgresql documentation: SELECT DISTINCT eliminates duplicate rows from the result. SELECT DISTINCT ON eliminates rows that match on all the specified expressions. SELECT ALL (the default) will return all candidate rows, including duplicates. (See DISTINCT Clause below.)
SELECT DISTINCT WNR
FROM table_name
WHERE tdate >='2020-01-13 00:00:01.757000';

Maximum count of overlapping intervals in PostgreSQL

Suppose there is a table structured as follows:
id start end
--------------------
01 00:18 00:23
02 00:22 00:31
03 00:23 00:48
04 00:23 00:39
05 00:24 00:25
06 00:24 00:31
07 00:24 00:38
08 00:25 00:37
09 00:26 00:42
10 00:31 00:34
11 00:33 00:38
The objective is to compute the overall maximum number of rows having been active (i.e. between start and end) at any given moment in time. This would be relatively straightforward using a procedural algorithm, but I'm not sure how to do this in SQL.
According to the above example, this maximum value would be 8 and would correspond to the 00:31 timestamp where active rows were 2, 3, 4, 6, 7, 8, 9, 10 (as shown in the schema below).
Obtaining the timestamp(s) and the active rows corresponding to the maximum value is not important, all is needed is the actual value itself.
I was thinking of at first, using generate_series() to iterate every minute and get the count of active intervals for each, then take the max of this.
You can improve your idea and iterate only "start" values from the table because one of "start" points includes in time interval with maximum active rows.
select id, start,
(select count(1) from tbl t where tbl.start between t.start and t."end")
from tbl;
Here results
id start count
-----------------
1 00:18:00 1
2 00:22:00 2
3 00:23:00 4
4 00:23:00 4
5 00:24:00 6
6 00:24:00 6
7 00:24:00 6
8 00:25:00 7
9 00:26:00 7
10 00:31:00 8
11 00:33:00 7
So, this query gives you maximum number of rows having been active
select
max((select count(1) from tbl t where tbl.start between t.start and t."end"))
from tbl;
max
-----
8

While loop to add data for pivot

Currently i have a requirement which needs a table to look like this:
Instrument Long Short 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 ....
Fixed 41 41 35 35 35 35 35 35 35 53 25 25
Index 16 16 22 22 22 32 12 12 12 12 12 12
Credits 29 29 41 16 16 16 16 16 16 16 16 16
Short term 12 12 5 5 5 5 5 5 5 5 5 17
My worktable looks like the following:
Instrument Long Short Annual Coupon Maturity Date Instrument ID
Fixed 10 10 10 01/01/2025 1
Index 5 5 10 10/05/2016 2
Credits 15 15 16 25/06/2020 3
Short term 12 12 5 31/10/2022 4
Fixed 13 13 15 31/03/2030 5
Fixed 18 18 10 31/01/2019 6
Credits 14 14 11 31/12/2013 7
Index 11 11 12 31/10/2040 8
..... etc
So basically the long and the short in the pivot should be the sum of each distinct instrument ID. And then for each year i need to take the sum of each Annual Coupon until the maturity date year where the long and the coupon rate are added together.
My thinking was that i had to create a while loop which would populate a table with a record for each year for each instrument until the maturity date, so that i could then pivot using an sql pivot some how. Does this seem feasible? Any other ideas on the best way of doing this, particularly i might need help on the while loop?
The following solution uses a numbers table to unfold ranges in your table, performs some special processing on some of the data columns in the unfolded set, and finally pivots the results:
WITH unfolded AS (
SELECT
t.Instrument,
Long = SUM(z.Long ) OVER (PARTITION BY Instrument),
Short = SUM(z.Short) OVER (PARTITION BY Instrument),
Year = y.Number,
YearValue = t.AnnualCoupon + z.Long + z.Short
FROM YourTable t
CROSS APPLY (SELECT YEAR(t.MaturityDate)) x (Year)
INNER JOIN numbers y ON y.Number BETWEEN YEAR(GETDATE()) AND x.Year
CROSS APPLY (
SELECT
Long = CASE y.Number WHEN x.Year THEN t.Long ELSE 0 END,
Short = CASE y.Number WHEN x.Year THEN t.Short ELSE 0 END
) z (Long, Short)
),
pivoted AS (
SELECT *
FROM unfolded
PIVOT (
SUM(YearValue) FOR Year IN ([2013], [2014], [2015], [2016], [2017], [2018], [2019], [2020],
[2021], [2022], [2023], [2024], [2025], [2026], [2027], [2028], [2029], [2030],
[2031], [2032], [2033], [2034], [2035], [2036], [2037], [2038], [2039], [2040])
) p
)
SELECT *
FROM pivoted
;
It returns results for a static range years. To use it for a dynamically calculated year range, you'll first need to prepare the list of years as a CSV string, something like this:
SET #columnlist = STUFF(
(
SELECT ', [' + CAST(Number) + ']'
FROM numbers
WHERE Number BETWEEN YEAR(GETDATE())
AND (SELECT YEAR(MAX(MaturityDate)) FROM YourTable)
ORDER BY Number
FOR XML PATH ('')
),
1, 2, ''
);
then put it into the dynamic SQL version of the query:
SET #sql = N'
WITH unfolded AS (
...
PIVOT (
SUM(YearValue) FOR Year IN (' + #columnlist + ')
) p
)
SELECT *
FROM pivoted;
';
and execute the result:
EXECUTE(#sql);
You can try this solution at SQL Fiddle.

Subselect and Max

Alright, I've been trying to conceptualize this for a better part of the afternoon and still cannot figure out how to structure this subselect.
The data that I need to report are ages for a given student major grouped by the past 3 fiscal years. Each fiscal year has 3 semesters (summer, fall, spring). I need to have my query grouped on the fiscalyear and agerange fields and then count the distinct student id's.
I currently have this for my SQL statement:
Select COUNT(distinct StuID), AgeRange, FiscalYear
from tblStatic
where Campus like 'World%' and (enrl_act like 'REG%' or enrl_act like 'SCH%')
and StuMaj = 'LAWSC' and FiscalYear IN ('09/10', '10/11', '11/12')
group by FiscalYear, AgeRange
order by FiscalYear, AgeRange
So this is all fine and dandy except it doesn't match my headcount of students for the fiscalyear. The reason being, that people may cross over in the age ranges during the fiscal year and is adding them to my count twice.
How can I use a subselect to resolve this duplicate entry? The field I have been trying to get working is my semester field and using a max to find the max semester during a fiscalyear for a given student.
Data Sample:
Count AgeRange FiscalYear
3 1 to 19 09/10
20 20 to 23 09/10
60 24 to 29 09/10
96 30 to 39 09/10
34 40 to 49 09/10
14 50 to 59 09/10
3 60+ 09/10
2 1 to 19 10/11
24 20 to 23 10/11
73 24 to 29 10/11
109 30 to 39 10/11
43 40 to 49 10/11
11 50 to 59 10/11
2 60+ 10/11
1 1 to 19 11/12
17 20 to 23 11/12
75 24 to 29 11/12
123 30 to 39 11/12
44 40 to 49 11/12
14 50 to 59 11/12
2 60+ 11/12
Solution: (Just got this working and produced my headcounts that match what they are suppose to be)
Select COUNT(distinct S.StuID), AR.AgeRange, S.FiscalYear
from tblStatic S
INNER JOIN
( Select S.StuID, MIN(AgeRange) as AgeRange
From tblStatic S
Group By S.StuID) AR on S.StuID=AR.StuID
where Campus like 'World%' and (enrl_act like 'REG%' or
enrl_act like 'SCH%')
and StuMaj = 'LAWSC' and FiscalYear IN ('09/10', '10/11', '11/12')
group by S.FiscalYear, AR.AgeRange
order by S.FiscalYear, AR.AgeRange
Replace each student's age range with its maximum (or minimum, if you like) age range that fiscal year, then count them:
;
WITH sourceData AS (
SELECT
StudID,
MaxAgeRangeThisFiscalYear = MAX(AgeRange) OVER
(PARTITION BY StudID, FiscalYear),
FiscalYear
FROM tblStatic
WHERE Campus LIKE 'World%'
AND (enrl_act LIKE 'REG%' OR enrl_act LIKE 'SCH%')
AND StuMaj = 'LAWSC'
AND FiscalYear IN ('09/10', '10/11', '11/12')
)
SELECT
FiscalYear,
AgeRange = MaxAgeRangeThisFiscalYear,
Count = COUNT(DISTINCT StudID)
FROM sourceData
GROUP BY
FiscalYear,
MaxAgeRangeThisFiscalYear
ORDER BY
FiscalYear,
MaxAgeRangeThisFiscalYear