What's wrong with this Partition By - tsql

I have a query that uses PARTITION BY over a time column, but the result is a bit unexpected. What's wrong here? Why do I get more than one 1 in RN (one for 21:00:02.100 and another for 21:00:02.600)?
SELECT TOP 500
ROW_NUMBER() OVER(
PARTITION BY [Date], CAST([Time] AS Time(0))
ORDER BY [DATE] ASC, CAST([Time] AS Time(0)) ASC
) RN,
[DATE],
[Time]
FROM [DB]..[TABLE]
ORDER BY [Date] ASC,
[Time] ASC,
[RN] ASC
Results:
**1 2010-10-03 21:00:02.100**
2 2010-10-03 21:00:02.100
3 2010-10-03 21:00:02.200
4 2010-10-03 21:00:02.200
5 2010-10-03 21:00:02.200
4 2010-10-03 21:00:02.500
**1 2010-10-03 21:00:02.600**
2 2010-10-03 21:00:02.600
3 2010-10-03 21:00:02.600
5 2010-10-03 21:00:02.700
6 2010-10-03 21:00:02.700
7 2010-10-03 21:00:02.700
8 2010-10-03 21:00:02.700
9 2010-10-03 21:00:02.700
10 2010-10-03 21:00:02.700
11 2010-10-03 21:00:02.700
12 2010-10-03 21:00:02.700
13 2010-10-03 21:00:02.700
14 2010-10-03 21:00:02.700
15 2010-10-03 21:00:02.700
16 2010-10-03 21:00:02.700
17 2010-10-03 21:00:02.700
18 2010-10-03 21:00:02.700
19 2010-10-03 21:00:02.700
20 2010-10-03 21:00:02.700
21 2010-10-03 21:00:02.700
22 2010-10-03 21:00:02.700

You are CASTing to time(0) in your PARTITION BY, which rounds (not truncates) the time to second precision, so 21:00:02.600 lands in the 21:00:03 partition while 21:00:02.100 lands in 21:00:02. It is working exactly as advertised...
Edit:
It makes no sense to have the same expressions in both the PARTITION BY and the ORDER BY...
My guess is that you are trying to partition by the second and want row numbers within that interval.
Try this:
ROW_NUMBER() OVER(
PARTITION BY [Date], CAST([Time] AS Time(0))
ORDER BY [DATE], [Time]
) RN
If you get duplicate row numbers crossing the 0.5-second boundary, use this to force truncation rather than rounding (DATEADD is used here because the time type does not support the subtract operator):
ROW_NUMBER() OVER(
PARTITION BY [Date], CAST(DATEADD(ms, -500, [Time]) AS Time(0))
ORDER BY [DATE], [Time]
) RN

Thanks a lot for your feedback. It turns out that the cast rounds, which is why it wasn't working (giving me the 1 twice). Subtracting a literal from [Time] didn't work for me; I got an error. In the end I used this code to get it working as wanted:
ROW_NUMBER() OVER(
PARTITION BY CONVERT(nvarchar(8), [Time], 8)
ORDER BY [Date], [Time]) RN
FROM [DB]..[TABLE]
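The OP's eventual fix partitions by the time formatted as a string, which truncates rather than rounds. A minimal runnable sketch of that idea using SQLite (hypothetical table and column names; SQLite's strftime drops fractional seconds, standing in for CONVERT(nvarchar(8), [Time], 8)):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (ts TEXT)")
con.executemany("INSERT INTO events VALUES (?)", [
    ("2010-10-03 21:00:02.100",),
    ("2010-10-03 21:00:02.600",),  # would round up to 21:00:03 under CAST AS time(0)
    ("2010-10-03 21:00:03.200",),
])
rows = con.execute("""
    SELECT ROW_NUMBER() OVER (
               PARTITION BY strftime('%Y-%m-%d %H:%M:%S', ts)
               ORDER BY ts
           ) AS rn, ts
    FROM events
    ORDER BY ts
""").fetchall()
# Both 21:00:02.* rows share one partition; 21:00:03.200 starts a new one.
print(rows)
```

Because the partition key is the truncated second, RN restarts exactly at whole-second boundaries, with no rounding across the 0.5-second mark.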

Related

SQL DimDate with Fiscal Year, not 4-5-4 or 4-4-5 but with a Calendar Range

I have a script that works fine except for the fiscal year start/end. For example, year0601 should be the first day of the fiscal year and year0530 the last day, every year. However, when the script runs, the dates have an offset. Has anyone scripted this? I'm thinking I can repurpose the calendar logic and add +5 to the month logic, but wanted to check here first, as I've seen a number of 4-5-4 and 4-4-5 solutions and perhaps I am applying them incorrectly. Thank you for your assistance.
Here is the link to the original code:
https://www.codeproject.com/Articles/647950/%2FArticles%2F647950%2FCreate-and-Populate-Date-Dimension-for-Data-Wareho
/*******************************************************************************************************************************************************/
/* Loop on days in interval*/
WHILE (DATEPART(yy, @CurrentDate) <= @LastYear)
BEGIN
/*SET fiscal Month*/
SELECT @FiscalMonth = CASE
/*Use this section for a 4-5-4 calendar. Every leap year the result will be a 4-5-5*/
--WHEN @FiscalWeekOfYear BETWEEN 1 AND 4 THEN 1 /*4 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 5 AND 9 THEN 2 /*5 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 10 AND 13 THEN 3 /*4 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 14 AND 17 THEN 4 /*4 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 18 AND 22 THEN 5 /*5 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 23 AND 26 THEN 6 /*4 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 27 AND 30 THEN 7 /*4 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 31 AND 35 THEN 8 /*5 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 36 AND 39 THEN 9 /*4 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 40 AND 43 THEN 10 /*4 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN 44 AND (48+@LeapWeek) THEN 11 /*5 weeks*/
--WHEN @FiscalWeekOfYear BETWEEN (49+@LeapWeek) AND (52+@LeapWeek) THEN 12 /*4 weeks (5 weeks on leap year)*/
/*Use this section for a 4-4-5 calendar. Every leap year the result will be a 4-5-5*/
WHEN @FiscalWeekOfYear BETWEEN 1 AND 4 THEN 1 /*4 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 5 AND 8 THEN 2 /*4 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 9 AND 13 THEN 3 /*5 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 14 AND 17 THEN 4 /*4 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 18 AND 21 THEN 5 /*4 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 22 AND 26 THEN 6 /*5 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 27 AND 30 THEN 7 /*4 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 31 AND 34 THEN 8 /*4 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 35 AND 39 THEN 9 /*5 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 40 AND 43 THEN 10 /*4 weeks*/
WHEN @FiscalWeekOfYear BETWEEN 44 AND (47+@LeapWeek) THEN 11 /*4 weeks (5 weeks on leap year)*/
WHEN @FiscalWeekOfYear BETWEEN (48+@LeapWeek) AND (52+@LeapWeek) THEN 12 /*5 weeks*/
END
/*SET Fiscal Quarter*/
SELECT @FiscalQuarter = CASE
WHEN @FiscalMonth BETWEEN 1 AND 3 THEN 1
WHEN @FiscalMonth BETWEEN 4 AND 6 THEN 2
WHEN @FiscalMonth BETWEEN 7 AND 9 THEN 3
WHEN @FiscalMonth BETWEEN 10 AND 12 THEN 4
END
SELECT @FiscalQuarterName = CASE
WHEN @FiscalMonth BETWEEN 1 AND 3 THEN 'First'
WHEN @FiscalMonth BETWEEN 4 AND 6 THEN 'Second'
WHEN @FiscalMonth BETWEEN 7 AND 9 THEN 'Third'
WHEN @FiscalMonth BETWEEN 10 AND 12 THEN 'Fourth'
END
/*Set Fiscal Year Name*/
SELECT @FiscalYearName = 'FY ' + CONVERT(VARCHAR, @FiscalYear)
INSERT INTO #tb (PeriodDate, FiscalDayOfYear, FiscalWeekOfYear, FiscalMonth, FiscalQuarter, FiscalQuarterName, FiscalYear, FiscalYearName) VALUES
(@CurrentDate, @FiscalDayOfYear, @FiscalWeekOfYear, @FiscalMonth, @FiscalQuarter, @FiscalQuarterName, @FiscalYear, @FiscalYearName)
/*SET next day*/
SET @CurrentDate = DATEADD(dd, 1, @CurrentDate)
SET @FiscalDayOfYear = @FiscalDayOfYear + 1
SET @FiscalWeekOfYear = ((@FiscalDayOfYear - 1) / 7) + 1
IF (@FiscalWeekOfYear > (52+@LeapWeek))
BEGIN
/*Reset a new year*/
SET @FiscalDayOfYear = 1
SET @FiscalWeekOfYear = 1
SET @FiscalYear = @FiscalYear + 1
IF ( EXISTS (SELECT * FROM #leapTable WHERE @FiscalYear = leapyear))
BEGIN
SET @LeapWeek = 1
END
ELSE
BEGIN
SET @LeapWeek = 0
END
END
END
/*******************************************************************************************************************************************************/
/*Set first and last days of the fiscal months*/
UPDATE #tb
SET
FiscalFirstDayOfMonth = minmax.StartDate,
FiscalLastDayOfMonth = minmax.EndDate
FROM
#tb t,
(
SELECT FiscalMonth, FiscalQuarter, FiscalYear, MIN(PeriodDate) AS StartDate, MAX(PeriodDate) AS EndDate
FROM #tb
GROUP BY FiscalMonth, FiscalQuarter, FiscalYear
) minmax
WHERE
t.FiscalMonth = minmax.FiscalMonth AND
t.FiscalQuarter = minmax.FiscalQuarter AND
t.FiscalYear = minmax.FiscalYear
/*Set first and last days of the fiscal quarters*/
UPDATE #tb
SET
FiscalFirstDayOfQuarter = minmax.StartDate,
FiscalLastDayOfQuarter = minmax.EndDate
FROM
#tb t,
(
SELECT FiscalQuarter, FiscalYear, min(PeriodDate) as StartDate, max(PeriodDate) as EndDate
FROM #tb
GROUP BY FiscalQuarter, FiscalYear
) minmax
WHERE
t.FiscalQuarter = minmax.FiscalQuarter AND
t.FiscalYear = minmax.FiscalYear
/*Set first and last days of the fiscal years*/
UPDATE #tb
SET
FiscalFirstDayOfYear = minmax.StartDate,
FiscalLastDayOfYear = minmax.EndDate
FROM
#tb t,
(
SELECT FiscalYear, min(PeriodDate) as StartDate, max(PeriodDate) as EndDate
FROM #tb
GROUP BY FiscalYear
) minmax
WHERE
t.FiscalYear = minmax.FiscalYear
/*Set FiscalYearMonth*/
UPDATE #tb
SET
FiscalMonthYear =
CASE FiscalMonth
WHEN 1 THEN 'Jun'
WHEN 2 THEN 'Jul'
WHEN 3 THEN 'Aug'
WHEN 4 THEN 'Sep'
WHEN 5 THEN 'Oct'
WHEN 6 THEN 'Nov'
WHEN 7 THEN 'Dec'
WHEN 8 THEN 'Jan'
WHEN 9 THEN 'Feb'
WHEN 10 THEN 'Mar'
WHEN 11 THEN 'Apr'
WHEN 12 THEN 'May'
END + '-' + CONVERT(VARCHAR, FiscalYear)
/*Set FiscalMMYYYY*/
UPDATE #tb
SET
FiscalMMYYYY = RIGHT('0' + CONVERT(VARCHAR, FiscalMonth),2) + CONVERT(VARCHAR, FiscalYear)
/*******************************************************************************************************************************************************/
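As a sanity check on the asker's stated goal (a fiscal year that always runs from June 1 to the end of May, derived by a plain calendar shift rather than 4-5-4/4-4-5 week logic), here is a minimal sketch; the helper name and the June-start convention chosen are assumptions from the question, not part of the original script:

```python
from datetime import date

def fiscal_year_start(d: date) -> date:
    # Dates in January-May belong to the fiscal year that began the
    # previous June; dates from June onward start a new fiscal year.
    fy = d.year if d.month >= 6 else d.year - 1
    return date(fy, 6, 1)

print(fiscal_year_start(date(2024, 5, 30)))  # → 2023-06-01
```

If this shift matches the business rule, the week-bucket CASE in the script above can be replaced by plain calendar months offset so that June maps to fiscal month 1.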

PostgreSQL window function & difference between dates

Suppose I have data formatted in the following way (FYI, total row count is over 30K):
customer_id order_date order_rank
A 2017-02-19 1
A 2017-02-24 2
A 2017-03-31 3
A 2017-07-03 4
A 2017-08-10 5
B 2016-04-24 1
B 2016-04-30 2
C 2016-07-18 1
C 2016-09-01 2
C 2016-09-13 3
I need a 4th column, let's call it days_since_last_order, which is 0 where order_rank = 1 and otherwise the number of days since the previous order (the one with rank n-1).
So, the above would return:
customer_id order_date order_rank days_since_last_order
A 2017-02-19 1 0
A 2017-02-24 2 5
A 2017-03-31 3 35
A 2017-07-03 4 94
A 2017-08-10 5 38
B 2016-04-24 1 0
B 2016-04-30 2 6
C 2016-07-18 1 0
C 2016-09-01 2 45
C 2016-09-13 3 12
Is there an easier way to calculate the above with a window function (or similar) rather than join the entire dataset against itself (eg. on A.order_rank = B.order_rank - 1) and doing the calc?
Thanks!
Use the LAG window function:
SELECT
customer_id
, order_date
, order_rank
, COALESCE(
DATE(order_date)
- DATE(LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date))
, 0) AS days_since_last_order
FROM <table_name>
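The LAG-with-COALESCE pattern above can be run end to end; here is a SQLite approximation (hypothetical orders table; julianday arithmetic stands in for Postgres's date subtraction, which returns an integer number of days):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT)")
con.executemany("INSERT INTO orders VALUES (?, ?)", [
    ("A", "2017-02-19"), ("A", "2017-02-24"), ("A", "2017-03-31"),
    ("B", "2016-04-24"), ("B", "2016-04-30"),
])
rows = con.execute("""
    SELECT customer_id, order_date,
           COALESCE(
               CAST(julianday(order_date)
                    - julianday(LAG(order_date) OVER (
                          PARTITION BY customer_id ORDER BY order_date)) AS INTEGER),
               0) AS days_since_last_order
    FROM orders
    ORDER BY customer_id, order_date
""").fetchall()
print(rows)
```

LAG returns NULL for the first order of each customer, so COALESCE supplies the 0 the question asks for, and no self-join over 30K rows is needed.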

TSQL - First and last number in range

I have table with:
1
2
3
4
5
6
9
10
11
12
and I need to receive:
1-6
9-12
How can I do that? I need to see that the table contains two or more ranges of numbers, in this case one from 1 to 6 and one from 9 to 12.
SELECT
CONCAT(MIN(A.b), '-', MAX(A.b))
FROM
(
SELECT
*,
ROW_NUMBER() OVER (ORDER BY b) RowId
FROM
(VALUES (1),(2),(3),(4),(5),(6),(9),(10),(11),(12)) a(b)
) A
GROUP BY
-- consecutive values share a constant b - RowId, so each island forms one group
A.b - A.RowId
ORDER BY
MIN(A.b)
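The value-minus-ROW_NUMBER() trick works because consecutive values keep a constant difference from their row numbers. A runnable sketch in SQLite with the question's data (SQLite has no CONCAT, so || is used instead):

```python
import sqlite3

con = sqlite3.connect(":memory:")
rows = con.execute("""
    WITH nums(b) AS (VALUES (1),(2),(3),(4),(5),(6),(9),(10),(11),(12)),
    numbered AS (
        SELECT b, ROW_NUMBER() OVER (ORDER BY b) AS rn
        FROM nums
    )
    -- each gap in b shifts b - rn, splitting the islands into groups
    SELECT MIN(b) || '-' || MAX(b) AS range
    FROM numbered
    GROUP BY b - rn
    ORDER BY MIN(b)
""").fetchall()
print(rows)
```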

How can I evaluate data over time in Postgresql?

I need to find users who have posted three times or more, three months in a row. I wrote this query:
select count(id), owneruserid, extract(month from creationdate) as postmonth from posts
group by owneruserid, postmonth
having count(id) >=3
order by owneruserid, postmonth
And I get this:
count owneruserid postmonth
36 -1 1
23 -1 2
45 -1 3
41 -1 4
18 -1 5
24 -1 6
31 -1 7
78 -1 8
83 -1 9
17 -1 10
88 -1 11
127 -1 12
3 6 11
3 7 12
4 8 1
8 8 12
4 12 4
3 12 5
3 22 2
4 22 4
(truncated)
Which is great. How can I query for users who posted three times or more, three months or more in a row? Thanks.
This is called the gaps-and-islands problem; specifically, it is an islands problem over a date range. You should:
Fix this question up.
Flag it to be migrated to dba.stackexchange.com.
To solve it:
Create a pseudo-column with a window function that is 1 when the preceding row does not fall in the preceding month.
Turn that flag into group numbers with a running COUNT().
Keep only the groups whose count(*) is three or more.
Query:
SELECT l.owneruserid, creationdaterange, count(*)
FROM (
SELECT t.owneruserid,
t.creationdate,
count(range_reset) OVER (PARTITION BY t.owneruserid ORDER BY t.creationdate) AS creationdaterange
FROM (
SELECT owneruserid,
creationdate,
CASE
-- flag the start of a new island: the previous row is neither in this month nor in the preceding month
WHEN date_trunc('month', creationdate) IS DISTINCT FROM date_trunc('month', lag(creationdate) OVER w)
AND date_trunc('month', creationdate) - interval '1 month' IS DISTINCT FROM date_trunc('month', lag(creationdate) OVER w)
THEN 1
END AS range_reset
FROM posts
WINDOW w AS (PARTITION BY owneruserid ORDER BY creationdate)
) AS t
) AS l
GROUP BY l.owneruserid, creationdaterange
HAVING count(*) >= 3;
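A compact alternative for "three or more months in a row" is the same islands trick applied to month numbers. A sketch in SQLite, assuming a hypothetical pre-aggregated monthly table (one row per user per month that already has 3+ posts, month as an integer, ignoring year wrap-around):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE monthly (owneruserid INTEGER, postmonth INTEGER)")
# user 6 posted in months 3, 4, 5 (a 3-month run); user 7 only in months 1 and 3
con.executemany("INSERT INTO monthly VALUES (?, ?)",
                [(6, 3), (6, 4), (6, 5), (7, 1), (7, 3)])
rows = con.execute("""
    WITH g AS (
        SELECT owneruserid,
               postmonth - ROW_NUMBER() OVER (
                   PARTITION BY owneruserid ORDER BY postmonth) AS grp
        FROM monthly
    )
    -- consecutive months share a constant grp; count each run's length
    SELECT owneruserid
    FROM g
    GROUP BY owneruserid, grp
    HAVING COUNT(*) >= 3
""").fetchall()
print(rows)
```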

Fun with row_number() - Redshift Postgres - Time sequence and restarting numbering

I am looking to number streaks within my data; the goal is to find where at least 3 consecutive rows are flagged by np.
Here is a subset of my data:
drop table if exists bi_adhoc.test;
create table bi_adhoc.test (id varchar(12), rd date, np decimal);
insert into bi_adhoc.test
select 'aaabbbccc', '2016-07-25'::date, 0 union all
select 'aaabbbccc', '2016-08-01'::date, 0 union all
select 'aaabbbccc', '2016-08-08'::date, 0 union all
select 'aaabbbccc', '2016-08-15'::date, 0 union all
select 'aaabbbccc', '2016-08-22'::date, 1 union all
select 'aaabbbccc', '2016-08-29'::date, 0 union all
select 'aaabbbccc', '2016-09-05'::date, 1 union all
select 'aaabbbccc', '2016-09-12'::date, 0 union all
select 'aaabbbccc', '2016-09-19'::date, 1;
I am hoping to use row_number() and count(), but it doesn't seem to be giving me the result I want.
select
*
,row_number() over (partition by t.id order by t.rd) all_ctr
,count(t.id) over (partition by t.id) all_count
,row_number() over (partition by t.id,t.np order by t.rd) np_counter
,count(t.id) over (partition by t.id,t.np) np_non_np
from
bi_adhoc.test t
order by
t.rd;
Here are my results, and the desired result:
id rd np all_ctr all_count np_counter np_non_np **Desired**
aaabbbccc 7/25/2016 0 1 9 1 6 **1**
aaabbbccc 8/1/2016 0 2 9 2 6 **2**
aaabbbccc 8/8/2016 0 3 9 3 6 **3**
aaabbbccc 8/15/2016 0 4 9 4 6 **4**
aaabbbccc 8/22/2016 1 5 9 1 3 **1**
aaabbbccc 8/29/2016 0 6 9 5 6 **1**
aaabbbccc 9/5/2016 1 7 9 2 3 **1**
aaabbbccc 9/12/2016 0 8 9 6 6 **1**
aaabbbccc 9/19/2016 1 9 9 3 3 **1**
One way to do this is to calculate the lagged np value in a CTE, and then compare the current and lagged np to detect a streak. This may not be the most optimal approach, but it seems to work fine.
with source_cte as
(
select
*
,row_number() over (partition by t.id order by t.rd) row_num
,lag(np,1) over (partition by t.id order by t.rd) as prev_np
from
bi_adhoc.test t
)
, streak_cte as
(
select
*,
case when np=prev_np or prev_np is NULL then 1 else 0 end as is_streak
from
source_cte
)
select
*,
case when is_streak=1 then dense_rank() over (partition by id, is_streak order by rd) else 1 end as desired
from
streak_cte
order by
rd;
First, I added some additional data to help fully illustrate the problem...
drop table if exists bi_adhoc.test;
create table bi_adhoc.test (id varchar(12),period date,hit decimal);
insert into bi_adhoc.test
select 'aaabbbccc', '2016-07-25'::date, 0 union all
select 'aaabbbccc', '2016-08-01'::date, 0 union all
select 'aaabbbccc', '2016-08-08'::date, 0 union all
select 'aaabbbccc', '2016-08-15'::date, 1 union all
select 'aaabbbccc', '2016-08-22'::date, 1 union all
select 'aaabbbccc', '2016-08-29'::date, 0 union all
select 'aaabbbccc', '2016-09-05'::date, 0 union all
select 'aaabbbccc', '2016-09-12'::date, 1 union all
select 'aaabbbccc', '2016-09-19'::date, 0 union all
select 'aaabbbccc', '2016-09-26'::date, 1 union all
select 'aaabbbccc', '2016-10-03'::date, 1 union all
select 'aaabbbccc', '2016-10-10'::date, 1 union all
select 'aaabbbccc', '2016-10-17'::date, 1 union all
select 'aaabbbccc', '2016-10-24'::date, 1 union all
select 'aaabbbccc', '2016-10-31'::date, 0 union all
select 'aaabbbccc', '2016-11-07'::date, 0 union all
select 'aaabbbccc', '2016-11-14'::date, 0 union all
select 'aaabbbccc', '2016-11-21'::date, 0 union all
select 'aaabbbccc', '2016-11-28'::date, 0 union all
select 'aaabbbccc', '2016-12-05'::date, 1 union all
select 'aaabbbccc', '2016-12-12'::date, 1;
Then the key was to figure out what constitutes a streak and how to identify each one, so that I would have something to partition the data by.
select
*
,case
when t1.hit = 1 then row_number() over (partition by t1.id,t1.hit_partition order by t1.period)
when t1.hit = 0 then row_number() over (partition by t1.id,t1.miss_partition order by t1.period)
else null
end desired
from
(
select
*
,row_number() over (partition by t.id order by t.id,t.period) as row_number
,case
when t.hit = 1 then row_number() over (partition by t.id, t.hit order by t.period)
else null
end hit_counter
,case
when t.hit = 1 then row_number() over (partition by t.id order by t.id,t.period) - row_number() over (partition by t.id, t.hit order by t.period)
else null
end hit_partition
,case
when t.hit = 0 then row_number() over (partition by t.id, t.hit order by t.period)
else null
end miss_counter
,case
when t.hit = 0 then row_number() over (partition by t.id order by t.id,t.period) - row_number() over (partition by t.id, t.hit order by t.period)
else null
end miss_partition
from
bi_adhoc.test t
) t1
order by
t1.id
,t1.period;
The result of this:
id period hit row_number hit_counter hit_partition miss_counter miss_partition desired
aaabbbccc 2016-07-25 0 1 NULL NULL 1 0 1
aaabbbccc 2016-08-01 0 2 NULL NULL 2 0 2
aaabbbccc 2016-08-08 0 3 NULL NULL 3 0 3
aaabbbccc 2016-08-15 1 4 1 3 NULL NULL 1
aaabbbccc 2016-08-22 1 5 2 3 NULL NULL 2
aaabbbccc 2016-08-29 0 6 NULL NULL 4 2 1
aaabbbccc 2016-09-05 0 7 NULL NULL 5 2 2
aaabbbccc 2016-09-12 1 8 3 5 NULL NULL 1
aaabbbccc 2016-09-19 0 9 NULL NULL 6 3 1
aaabbbccc 2016-09-26 1 10 4 6 NULL NULL 1
aaabbbccc 2016-10-03 1 11 5 6 NULL NULL 2
aaabbbccc 2016-10-10 1 12 6 6 NULL NULL 3
aaabbbccc 2016-10-17 1 13 7 6 NULL NULL 4
aaabbbccc 2016-10-24 1 14 8 6 NULL NULL 5
aaabbbccc 2016-10-31 0 15 NULL NULL 7 8 1
aaabbbccc 2016-11-07 0 16 NULL NULL 8 8 2
aaabbbccc 2016-11-14 0 17 NULL NULL 9 8 3
aaabbbccc 2016-11-21 0 18 NULL NULL 10 8 4
aaabbbccc 2016-11-28 0 19 NULL NULL 11 8 5
aaabbbccc 2016-12-05 1 20 9 11 NULL NULL 1
aaabbbccc 2016-12-12 1 21 10 11 NULL NULL 2