SQL Grouping Data by Intervals

All,
I am looking for a scalable way to "bucket" these staff IDs into either a "Left" or "Right" dimension for reporting purposes. I need to put the first 3 distinct IDs into the left group, the next 3 into the right group, the next 3 into the left group again, and so on. The actual data set contains hundreds of IDs.
Thanks
Raw data:
Rank Faculty_Staff_ID
----------------------------
1 zcrm_315216
1 zcrm_315216
1 zcrm_315216
2 zcrm_315217
2 zcrm_315217
2 zcrm_315217
3 zcrm_315218
4 zcrm_315219
4 zcrm_315219
4 zcrm_315219
5 zcrm_319795
5 zcrm_319795
6 zcrm_315220
6 zcrm_315220
7 zcrm_315221
8 zcrm_315222
9 zcrm_315223
9 zcrm_315223
9 zcrm_315223
Desired output:
L_or_R Rank Faculty_Staff_ID
----------------------------------
L 1 zcrm_315216
L 1 zcrm_315216
L 1 zcrm_315216
L 2 zcrm_315217
L 2 zcrm_315217
L 2 zcrm_315217
L 3 zcrm_315218
R 4 zcrm_315219
R 4 zcrm_315219
R 4 zcrm_315219
R 5 zcrm_319795
R 5 zcrm_319795
R 6 zcrm_315220
R 6 zcrm_315220
L 7 zcrm_315221
L 8 zcrm_315222
L 9 zcrm_315223
L 9 zcrm_315223
L 9 zcrm_315223

You can follow these steps:
Number the distinct Rank values with ROW_NUMBER().
In a subquery, use a running SUM over CASE WHEN to start a new group (grp) every 3 ranks.
In the main query, use CASE WHEN grp % 2 = 0 to split the groups into 'R' and 'L'.
You can try this query:
SELECT t.*,
       (CASE WHEN t1.grp % 2 = 0 THEN 'R' ELSE 'L' END) AS L_or_R
FROM T t
INNER JOIN (
    SELECT rnk,
           SUM(CASE WHEN (rn - 1) % 3 = 0 THEN 1 ELSE 0 END) OVER (ORDER BY rn) AS grp
    FROM (
        SELECT rnk,
               ROW_NUMBER() OVER (ORDER BY rnk) AS rn
        FROM (
            SELECT DISTINCT Rank AS rnk
            FROM T
        ) d
    ) n
) t1 ON t.Rank = t1.rnk
dbfiddle: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=615c015a856b57511a2dcf0323f0d4a5
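The three steps above can also be traced outside the database. This is only a rough Python sketch of the same logic, not the query itself; the function name label_by_running_groups and the (rank, staff_id) tuple layout are assumptions made for illustration.

```python
from itertools import accumulate


def label_by_running_groups(rows, size=3):
    """Label (rank, staff_id) rows 'L' or 'R' in alternating blocks of `size` distinct ranks."""
    # Step 1: number the distinct ranks, like ROW_NUMBER() over the DISTINCT subquery.
    distinct_ranks = sorted({rank for rank, _ in rows})
    # Step 2: running sum that grows by 1 at every block start, like
    # SUM(CASE WHEN (rn - 1) % 3 = 0 THEN 1 ELSE 0 END) OVER (ORDER BY rn).
    starts = [1 if i % size == 0 else 0 for i in range(len(distinct_ranks))]
    grp = dict(zip(distinct_ranks, accumulate(starts)))
    # Step 3: even group numbers become 'R', odd ones 'L'.
    return [("R" if grp[rank] % 2 == 0 else "L", rank, staff) for rank, staff in rows]


sample_rows = [
    (1, "zcrm_315216"), (1, "zcrm_315216"), (2, "zcrm_315217"),
    (3, "zcrm_315218"), (4, "zcrm_315219"), (5, "zcrm_319795"),
    (6, "zcrm_315220"), (7, "zcrm_315221"),
]
sides = [side for side, _, _ in label_by_running_groups(sample_rows)]
print(sides)
```

Duplicate ranks share one group number because the numbering is done on the distinct ranks first, which is the reason the query joins back to T at the end.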

The DENSE_RANK function and a bit of arithmetic are enough to solve this problem.
CREATE TABLE T(
Rank INT,
Faculty_Staff_ID VARCHAR(50)
);
INSERT INTO T VALUES
(1,'zcrm_315216'),
(1,'zcrm_315216'),
(1,'zcrm_315216'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(2,'zcrm_315217'),
(3,'zcrm_315218'),
(4,'zcrm_315219'),
(4,'zcrm_315219'),
(4,'zcrm_315219'),
(5,'zcrm_319795'),
(5,'zcrm_319795'),
(6,'zcrm_315220'),
(6,'zcrm_315220'),
(7,'zcrm_315221'),
(8,'zcrm_315222'),
(10,'zcrm_315223'),
(21,'zcrm_315223'),
(23,'zcrm_315223'),
(25,'zcrm_315223'),
(25,'zcrm_315223'),
(27,'zcrm_315223');
SELECT *,
IIF(((DENSE_RANK() OVER (ORDER BY Rank) - 1) / 3) % 2 = 0, 'L', 'R') L_or_R
FROM T
ORDER BY Rank;
You are welcome to check the result.
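To see why the arithmetic works even when the Rank values have gaps (10, 21, 23, ...), here is a small Python sketch of the same expression. It is an illustration under assumptions: the helper name l_or_r and the tuple layout are mine, not part of the query.

```python
def l_or_r(rows, size=3):
    """Mimic IIF(((DENSE_RANK() OVER (ORDER BY Rank) - 1) / 3) % 2 = 0, 'L', 'R')."""
    # DENSE_RANK: 1-based position of each rank among the sorted distinct ranks,
    # so gaps in the raw values do not matter.
    dense = {rank: i + 1 for i, rank in enumerate(sorted({r for r, _ in rows}))}
    labelled = []
    for rank, staff in rows:
        block = (dense[rank] - 1) // size  # integer division: 0,0,0,1,1,1,2,...
        labelled.append(("L" if block % 2 == 0 else "R", rank, staff))
    return labelled


gappy_rows = [(r, "staff") for r in (1, 2, 2, 3, 4, 5, 6, 7, 8, 10, 21)]
gappy_sides = [side for side, _, _ in l_or_r(gappy_rows)]
print(gappy_sides)
```

Ranks 10 and 21 get dense ranks 9 and 10, so 10 still falls into the third block (L) and 21 opens the fourth one (R).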

Related

Get count of values in different subgroups

I need to delete the rows in a dataset where the speed equals zero and the run of zeros lasts more than N consecutive rows (let's assume N is 2).
The structure of the table demo looks like:
id car speed time
-----------------
1 foo 0 1
2 foo 0 2
3 foo 0 3
4 foo 1 4
5 foo 1 5
6 foo 0 6
7 bar 0 1
8 bar 0 2
9 bar 5 3
10 bar 5 4
11 bar 5 5
12 bar 5 6
Then I hope to generate a table like the one below by using window_function:
id car speed time lasting
-------------------------
1 foo 0 1 3
2 foo 0 2 3
3 foo 0 3 3
4 foo 1 4 2
5 foo 1 5 2
6 foo 0 6 1
7 bar 0 1 2
8 bar 0 2 2
9 bar 5 3 4
10 bar 5 4 4
11 bar 5 5 4
12 bar 5 6 4
Then I can easily exclude those rows with WHERE NOT (speed = 0 AND lasting > 2).
Here is the code I tried. It didn't return the values I expected, and I suspect the nested FROM (SELECT ... FROM (SELECT ... chain is not the best way to solve the problem:
SELECT g3.*, count(id) OVER (PARTITION BY car, cumsum ORDER BY id) as num
FROM (SELECT g2.*, sum(grp2) OVER (PARTITION BY car ORDER BY id) AS cumsum
FROM (SELECT g1.*, (CASE ne0 WHEN 0 THEN 0 ELSE 1 END) AS grp2
FROM (SELECT g.*, speed - lag(speed, 1, 0) OVER (PARTITION BY car) AS ne0
FROM (SELECT *, row_number() OVER (PARTITION BY car) AS grp FROM demo) g ) g1 ) g2 ) g3
ORDER BY id;

You can use the LAG() window function to check the previous speed value for each row and the SUM() window function to create groups of consecutive equal values.
Then, with the COUNT() window function, you can count the rows in each group and filter out the rows with 0 speed in groups that have more than 2 rows:
SELECT id, car, speed, time
FROM (
    SELECT *, COUNT(*) OVER (PARTITION BY car, grp) counter
    FROM (
        SELECT *, SUM(flag::int) OVER (PARTITION BY car ORDER BY time) grp
        FROM (
            SELECT *, speed <> LAG(speed, 1, speed - 1) OVER (PARTITION BY car ORDER BY time) flag
            FROM demo
        ) t
    ) t
) t
WHERE speed <> 0 OR counter <= 2
ORDER BY id;
See the demo.
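If it helps to follow the flag / running sum / count chain step by step, the same idea can be sketched in Python. This is an illustrative sketch only: the function name drop_long_zero_runs and the (id, car, speed, time) tuple layout are assumptions, and the data is the demo table from the question.

```python
from collections import Counter

demo = [  # (id, car, speed, time)
    (1, "foo", 0, 1), (2, "foo", 0, 2), (3, "foo", 0, 3),
    (4, "foo", 1, 4), (5, "foo", 1, 5), (6, "foo", 0, 6),
    (7, "bar", 0, 1), (8, "bar", 0, 2), (9, "bar", 5, 3),
    (10, "bar", 5, 4), (11, "bar", 5, 5), (12, "bar", 5, 6),
]


def drop_long_zero_runs(rows, max_run=2):
    """Drop zero-speed rows that belong to a run longer than `max_run`."""
    ordered = sorted(rows, key=lambda r: (r[1], r[3]))  # PARTITION BY car ORDER BY time
    grouped = []  # pairs of (row, (car, grp))
    prev_car, prev_speed, grp = None, None, 0
    for row in ordered:
        _, car, speed, _ = row
        if car != prev_car:
            grp, prev_speed = 0, None  # a new partition restarts the running sum
        if speed != prev_speed:  # the flag: speed differs from the previous row
            grp += 1  # the running SUM over the flag
        grouped.append((row, (car, grp)))
        prev_car, prev_speed = car, speed
    counter = Counter(key for _, key in grouped)  # COUNT(*) per (car, grp)
    return sorted(row for row, key in grouped if row[2] != 0 or counter[key] <= max_run)


kept_ids = [row[0] for row in drop_long_zero_runs(demo)]
print(kept_ids)
```

Only ids 1 to 3 disappear: they form the single zero run longer than 2, while the zero rows 6, 7 and 8 sit in runs of length 1 and 2.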

How to enumerate rows by division?

I have the following table
id num sub_id
1 3 1
1 5 2
1 1 1
1 4 2
2 1 5
2 2 5
I want to get this result
id num sub_id number
1 3 1 1
1 5 2 2
1 1 1 1
1 4 2 2
2 1 5 1
2 2 5 1
I tried row_number() over (partition by id order by num, sub_id DESC), but the result obviously differs.

I don't fully understand your requirement because you didn't explain the logic behind it, but maybe this query helps:
Result and info: dbfiddle
with recursive
cte_r as (
select id,
num,
sub_id,
row_number() over () as rn
from test),
cte as (
select id,
num,
sub_id,
rn,
rn as grp
from cte_r
where rn = 1
union all
select cr.id,
cr.num,
cr.sub_id,
cr.rn,
case
when cr.id != c.id then 1
when cr.id = c.id and cr.sub_id = c.sub_id then c.grp
when cr.id = c.id and cr.sub_id > c.sub_id then c.grp + 1
when cr.id = c.id and cr.sub_id < c.sub_id then 1
end
from cte c,
cte_r cr
where c.rn = cr.rn - 1)
select id,
num,
sub_id,
grp
from cte
order by id

It looks like you actually want to ignore the num column and use DENSE_RANK on sub_id:
SELECT *, dense_rank() OVER (PARTITION BY id ORDER BY sub_id) AS number FROM …;
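As a quick check of that reading against the sample data, here is a small Python sketch of a dense rank per id. The function name dense_rank_by_id and the (id, num, sub_id) tuple layout are assumptions for illustration only.

```python
def dense_rank_by_id(rows):
    """Append DENSE_RANK() OVER (PARTITION BY id ORDER BY sub_id) to (id, num, sub_id) rows."""
    # Collect the distinct sub_id values of every id (the partition).
    distinct = {}
    for id_, _, sub_id in rows:
        distinct.setdefault(id_, set()).add(sub_id)
    # Dense rank: 1-based position among the sorted distinct values, ties share a rank.
    rank = {
        id_: {sub: pos + 1 for pos, sub in enumerate(sorted(subs))}
        for id_, subs in distinct.items()
    }
    return [(id_, num, sub_id, rank[id_][sub_id]) for id_, num, sub_id in rows]


test_rows = [(1, 3, 1), (1, 5, 2), (1, 1, 1), (1, 4, 2), (2, 1, 5), (2, 2, 5)]
numbers = [row[3] for row in dense_rank_by_id(test_rows)]
print(numbers)
```

The num column never enters the ranking, which is why rows with the same sub_id get the same number.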

Refer to current row in window function

Is it possible to refer to the current row in a window partition? I want to do something like the following:
SELECT min(ABS(variable - CURRENT.variable)) over (order by criterion RANGE UNBOUNDED PRECEDING)
That is, I want to find, in the given partition, the variable that is closest to the current value. Is it possible to do something like that?
As an example, from:
criterion | variable
1 2
2 4
3 2
4 7
5 6
We would obtain:
null
2
0
3
1
Thanks

As far as I know, this cannot be done with window functions.
But it can be done with a self join:
SELECT a.criterion,
a.variable,
min(abs(a.variable - b.variable))
FROM mydata a
LEFT JOIN mydata b
ON (b.criterion < a.criterion)
GROUP BY a.criterion, a.variable
ORDER BY a.criterion;
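For reference, this is roughly what the self join computes, written as a Python sketch. The function name closest_preceding_delta and the (criterion, variable) tuple layout are assumptions, and the data is the example from the question.

```python
def closest_preceding_delta(rows):
    """For each (criterion, variable) row, the smallest |variable - earlier variable|."""
    result = []
    for criterion, variable in sorted(rows):
        # The join condition b.criterion < a.criterion: only strictly earlier rows.
        earlier = [v for c, v in rows if c < criterion]
        # min() over no rows yields None, like the NULL produced by the LEFT JOIN.
        delta = min((abs(variable - v) for v in earlier), default=None)
        result.append((criterion, delta))
    return result


mydata = [(1, 2), (2, 4), (3, 2), (4, 7), (5, 6)]
deltas = closest_preceding_delta(mydata)
print(deltas)
```

Like the join, this compares every row with all earlier rows, so the work grows quadratically with the number of rows.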

If I understand correctly:
with t (v) as (values (-5),(-2),(0),(1),(3),(10))
select v,
least(
v - lag(v) over (order by v),
lead(v) over (order by v) - v
) as closest
from t
;
v | closest
----+---------
-5 | 3
-2 | 2
0 | 1
1 | 1
3 | 2
10 | 7

I hope this helps (watch out for performance problems).
I tried this in MSSQL (the PostgreSQL version is at the bottom):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
CROSS APPLY (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
Output:
CRITERION MIN_DELTA
----------- -----------
2 2
3 0
4 3
5 1
POSTGRESQL Version (tested on Rextester http://rextester.com/VMGJ87600):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT * FROM TX;
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
LEFT JOIN LATERAL (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B ON TRUE
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
DROP TABLE TX;
Output:
criterion variabile
1 1 2
2 2 4
3 3 2
4 4 7
5 5 6
criterion min_delta
1 1 NULL
2 2 2
3 3 0
4 4 3
5 5 1

Overlapping condition for case-when

I have the following query:
SELECT case
when tbl.id % 2 = 0 then 'mod-2'
when tbl.id % 3 = 0 then 'mod-3'
when tbl.id % 5 = 0 then 'mod-5'
else 'mod-x'
end as odds, tbl.id from some_xyz_table tbl;
If the table has the ids 5, 6 and 7, it returns this output (copied from pgAdmin):
"mod-5";5
"mod-2";6
"mod-x";7
But, here I can see 6 is divisible by both 2 and 3. And my expected output is:
"mod-5";5
"mod-2";6
"mod-3";6 <-- this
"mod-x";7
Is there any way to modify this query to obtain such output? Any alternate solution will do for me.

You could do this with UNION queries [EDIT changed it to use UNION ALL]:
SELECT 'mod-5', id FROM tbl -- divisible by 5
WHERE id %5 = 0
UNION ALL
SELECT 'mod-2', id FROM tbl -- divisible by 2
WHERE id %2 = 0
UNION ALL
SELECT 'mod-3', id FROM tbl -- divisible by 3
WHERE id %3 = 0
UNION ALL
SELECT 'mod-x',id FROM tbl -- not divisible by 5,3 or 2
WHERE id %5 <> 0 AND id%2 <> 0 AND id % 3 <> 0
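The difference from CASE is that CASE stops at the first matching branch, while each UNION ALL branch is evaluated independently. A minimal Python sketch of that behaviour follows; the function name mod_labels is hypothetical, and the fixed output order here comes from the Python loop (the SQL query would need an ORDER BY to guarantee one).

```python
def mod_labels(ids, divisors=(5, 2, 3)):
    """Emit one (label, id) pair for every divisor that divides the id."""
    out = []
    for d in divisors:  # one pass per UNION ALL branch
        out.extend((f"mod-{d}", i) for i in ids if i % d == 0)
    # Last branch: ids that none of the divisors divide.
    out.extend(("mod-x", i) for i in ids if all(i % d != 0 for d in divisors))
    return out


pairs = mod_labels([5, 6, 7])
print(pairs)
```

Here 6 shows up twice, once as mod-2 and once as mod-3, which is the output asked for in the question.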

How to read all records recursively and show by level depth TSQL

Is there a way to read records recursively from a table like the one below and order them by depth level?
#table:
id int | parent int | value string
--------------------------------------------
1 -1 some
2 1 some2
3 2 some3
4 2 some4
5 3 some5
6 4 some6
7 3 some5
8 3 some5
9 8 some5
10 8 some5
Is there a way to select recursively so that the result table looks like this?
select * from #table where id=3
id int | parent int | value string | depth
--------------------------------------------------------
3 2 some3 0
5 3 some5 1
7 3 some5 1
8 3 some5 1
9 8 some5 2
10 8 some5 2
So if I choose id=3, I would see the recursion for id=3 and its children.
Thank you

;with C as
(
select id,
parent,
value,
0 as depth
from YourTable
where id = 3
union all
select T.id,
T.parent,
T.value,
C.depth + 1
from YourTable as T
inner join C
on T.parent = C.id
)
select *
from C
SE-Data
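To see what the recursive CTE does level by level, here is a rough Python equivalent. It is only a sketch: the function name subtree_with_depth and the (id, parent, value) tuple layout are assumptions, the data is the table from the question, and the loop assumes the parent links contain no cycles.

```python
from collections import defaultdict

table = [  # (id, parent, value)
    (1, -1, "some"), (2, 1, "some2"), (3, 2, "some3"), (4, 2, "some4"),
    (5, 3, "some5"), (6, 4, "some6"), (7, 3, "some5"), (8, 3, "some5"),
    (9, 8, "some5"), (10, 8, "some5"),
]


def subtree_with_depth(rows, root_id):
    """Return the root row and all its descendants as (id, parent, value, depth)."""
    by_id = {}
    children = defaultdict(list)
    for row in rows:
        by_id[row[0]] = row
        children[row[1]].append(row)
    frontier = [by_id[root_id] + (0,)]  # anchor member: the row with id = root_id
    result = list(frontier)
    while frontier:
        # Recursive member: join the previous level to its children, depth + 1.
        frontier = [
            child + (parent[3] + 1,)
            for parent in frontier
            for child in children[parent[0]]
        ]
        result.extend(frontier)
    return result


depths = [(row[0], row[3]) for row in subtree_with_depth(table, 3)]
print(depths)
```

The loop stops when a level produces no rows, which is the same condition that ends the recursion in the CTE.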

You can accomplish this using CTEs, in particular recursive CTEs (rCTEs).
Example to follow:
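WITH sampleCTE (id, parent, value, depth)
AS (
    -- Anchor definition
    SELECT id
    , parent
    , value
    , 0
    FROM #table
    WHERE id = @targetId -- @targetId is a variable that must be declared beforehand
    -- Recursive definition
    UNION ALL
    SELECT child.id
    , child.parent
    , child.value
    , sampleCTE.depth + 1
    FROM #table child
    INNER JOIN sampleCTE ON sampleCTE.id = child.parent
)
-- A CTE must be followed by a statement that uses it
SELECT *
FROM sampleCTE
ORDER BY depth;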
