Show every week of the Year even if there is no data - tsql

I have query that pulls data by week and groups it together. But i does not display weeks that doesn't have any data. I want show all weeks even if they don't have data as null maybe
Here is the query if someone can help me with this it will awesome
SELECT
DATEADD (week, datediff(week, 0, StartDate), -1) as 'WeekOf'
,DATEADD (week, datediff(week, 0, StartDate), +5) as 'to'
,DATEPART(wk, StartDate) as 'WeekNumber'
FROM [DESOutage].[dbo].[OPSInterruption]
Where StartDate > '2020-01-01' and EndDate <'2020-02-01'
Group by DATEADD (week, datediff(week, 0, StartDate), -1),DATEPART(wk, StartDate),DATEADD (week, datediff(week, 0, StartDate), +5)
***************Output***************
As you could see week 2 and 4 is missing out since there is no data being returned. I would still like to see week 2 and 4 in the output with maybe 0 as result.
WeekOf to WeekNumber
2019-12-29 00:00:00.000 2020-01-04 00:00:00.000 1
2020-01-12 00:00:00.000 2020-01-18 00:00:00.000 3
2020-01-26 00:00:00.000 2020-02-01 00:00:00.000 5

You probably need a calendar table. Here is a quick way of generating one, with an untested implementation of your code. I am assuming that the StartDate may contain a time component thus the need to coalesce the dates.
DECLARE #StartYear DATETIME = '20200101'
DECLARE #days INT = 366
;WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^4 or 10,000 rows
E8(N) AS (SELECT 1 FROM E4 a, E4 b), -- 1*10^8 or 100,000,000 rows
Tally(N) AS (SELECT TOP (#Days) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E8),
Calendar AS (
SELECT StartOfDay = DATEADD(dd,N-1,#StartYear),
EndOfDay = DATEADD(second, -1, DATEADD(dd,N ,#StartYear))
FROM Tally)
SELECT DATEADD (week, datediff(week, 0, COALESCE(x.StartDate, c.StartOfDay) ), -1) as 'WeekOf'
, DATEADD (week, datediff(week, 0, COALESCE(x.StartDate, c.StartOfDay)), +5) as 'to'
, DATEPART(wk, COALESCE(x.StartDate, c.StartOfDay)) as 'WeekNumber'
FROM Calendar c
INNER JOIN [DESOutage].[dbo].[OPSInterruption] x
ON x.StartDate > c.StartOfDay AND x.StartDate <= c.EndOfDay
WHERE c.StartOfDay > '2020-01-01' AND c.StartOfDay <'2020-02-01'
GROUP BY DATEADD (week, datediff(week, 0, COALESCE(x.StartDate, c.StartOfDay)), -1),
DATEPART(wk, COALESCE(x.StartDate, c.StartOfDay)),
DATEADD (week, datediff(week, 0, COALESCE(x.StartDate, c.StartOfDay)), +5)

Related

Cohort Analysis with RedShift by Month

I am trying to build a cohort analysis for monthly retention but experiencing challenge getting the Month Number column right. The month number is supposed to return month(s) user transacted i.e 0 for registration month, 1 for the first month after registration month, 2 for the second month until the last month but currently, it returns negative month numbers in some cells.
It should be like this table:
cohort_month total_users month_number percentage
---------- ----------- -- ------------ ---------
January 100 0 40
January 341 1 90
January 115 2 90
February 103 0 73
February 100 1 40
March 90 0 90
Here is the SQL:
with cohort_items as (
select
extract(month from insert_date) as cohort_month,
msisdn as user_id
from mfscore.t_um_user_detail where extract(year from insert_date)=2020
order by 1, 2
),
user_activities as (
select
A.sender_msisdn,
extract(month from A.insert_date)-C.cohort_month as month_number
from mfscore.t_wm_transaction_logs A
left join cohort_items C ON A.sender_msisdn = C.user_id
where extract(year from A.insert_date)=2020
group by 1, 2
),
cohort_size as (
select cohort_month, count(1) as num_users
from cohort_items
group by 1
order by 1
),
B as (
select
C.cohort_month,
A.month_number,
count(1) as num_users
from user_activities A
left join cohort_items C ON A.sender_msisdn = C.user_id
group by 1, 2
)
select
B.cohort_month,
S.num_users as total_users,
B.month_number,
B.num_users * 100 / S.num_users as percentage
from B
left join cohort_size S ON B.cohort_month = S.cohort_month
where B.cohort_month IS NOT NULL
order by 1, 3
I think the RANK window function is the right solution. So the idea is to assigne a rank to months of user activities for each user, order by year and month.
Something like:
WITH activity_per_user AS (
SELECT
user_id,
event_date,
RANK() OVER (PARTITION BY user_id ORDER BY DATE_PART('year', event_date) , DATE_PART('month', event_date) ASC) AS month_number
FROM user_activities_table
)
RANK number starts from 1, so you may want to substract 1.
Then, you can group by user_id and month_number to get the number of interactions for each user per month from the subscription (adapt to your use case accordingly).
SELECT
user_id,
month_number,
COUNT(1) AS n_interactions
FROM activity_per_user
GROUP BY 1, 2
Here is the documentation:
https://docs.aws.amazon.com/redshift/latest/dg/r_WF_RANK.html

Yearly total for items over multiple years

I am trying to find the total per year
For example
Start date End Date Total Value
1 07/01/14 01/01/15 $10,000
2 08/01/13 12/01/14 $10,000
3 03/01/13 05/01/15 $10,000
As you can see, Some items are over multiple years. Is there a way to find out what the total value is per year.
Solution should be:
item 3
2013- $3600
2014-$4800
2015-1600
Then a summation would be down for all three items to give a yearly total.
What I have so far:
I have a rolling summation code which is shown below.
case when
(
[begin date] >= dateadd(mm,0,DATEADD(mm,DATEDIFF(mm,0,getdate()),0))
and [end date] >= dateadd(mm,0,DATEADD(mm,DATEDIFF(mm,0,getdate()),0))
)
OR
(
[Begin Date] < dateadd(mm,0,DATEADD(mm,DATEDIFF(mm,0,getdate()),0))
and [End Date] >= dateadd(mm,0,DATEADD(mm,DATEDIFF(mm,0,getdate()),0))
)
then [Totalvalue]/nullif(DATEDIFF(mm,[begin date],[end date]),0)
else 0
end [Current Month]
I dono how you got that total values for item 3
but for item 3 i hope it should be
2013 = 3704
2014 = 4444
2015 = 1852
Dono how efficient this code is just have a try
CREATE TABLE #tblName
(
itemid INT,
startdate DATETIME,
endate DATETIME,
value int
)
INSERT INTO #tblName
VALUES (1,'2014/07/01','2015/01/01',10000),
(2,'2013/08/01','2014/12/01',10000),
(3,'2013/03/01','2015/05/01',10000)
DECLARE #mindate DATETIME,
#maxdate DATETIME
SELECT #mindate = Min(startdate),
#maxdate = Max(endate)
FROM #tblName
SELECT *
FROM #tblName;
WITH cte
AS (SELECT #mindate startdate
UNION ALL
SELECT Dateadd(mm, 1, startdate) startdate
FROM cte
WHERE startdate <= Dateadd(mm, -1, #maxdate))
SELECT a.value * ( ( convert(numeric(22,6),a.cnt) / convert(numeric(22,6),c.total) ) * 100 ) / 100,a.itemid,a.startdate
FROM (SELECT Avg(value) value,
Count(1) cnt,
itemid,
Year(a.startdate) startdate
FROM cte a
JOIN #tblName b
ON a.startdate BETWEEN b.startdate AND b.endate
GROUP BY itemid,
Year(a.startdate)) a
JOIN(SELECT Sum(cnt) total,
itemid
FROM (SELECT Avg(value) value,
Count(1) cnt,
itemid,
Year(a.startdate) startdate
FROM cte a
JOIN #tblName b
ON a.startdate BETWEEN b.startdate AND b.endate
GROUP BY itemid,
Year(a.startdate)) B
GROUP BY itemid) C
ON a.itemid = c.itemid
WHERE a.itemid = 3

Getting the start date and end date of all weeks by giving the year

I have a query which displays a set of 52 numbers along with the respective dates of that week
SELECT kkk, TO_CHAR (start_date, 'DD-MON-YYYY'),
TO_CHAR (start_date + 6, 'DD-MON-YYYY') AS end_day
FROM (SELECT TRUNC (TRUNC (TO_DATE ('2014', 'YYYY'), 'YYYY') + 1 * 7,
'IW'
)
- 1 start_date,
ROWNUM AS kkk
FROM DUAL
CONNECT BY ROWNUM <= 52);
but the problem with this query is that I am only getting the first week dates but not for the next consecutive weeks.Please help
Try like this,
SELECT kkk,
TO_CHAR(start_date, 'DD-MON-YYYY') start_date,
TO_CHAR(start_date + 6, 'DD-MON-YYYY') end_day
FROM(
SELECT TRUNC(Trunc(to_date('2014', 'YYYY'),'YYYY')+ LEVEL * 7,'IW')-1 start_date ,
ROWNUM kkk
FROM duaL
CONNECT BY LEVEL <= 52
);
Instead of ROWNUM you can also use LEVEL, Like,
SELECT TRUNC(Trunc(to_date('2014', 'YYYY'),'YYYY')+ LEVEL * 7,'IW')-1 start_date ,
level kkk
FROM duaL
CONNECT BY LEVEL <= 52

Get First and Last Day of Any Year

I'm currently trying to get the first and last day of any year. I have data from 1950 and I want to get the first day of the year in the dataset to the last day of the year in the dataset (note that the last day of the year might not be December 31rst and same with the first day of the year).
Initially I thought I could use a CTE and call DATEPART with the day of the year selection, but this wouldn't partition appropriately. I also tried a CTE self-join, but since the last day or first day of the year might be different, this also yields inaccurate results.
For instance, using the below actually generates some MINs in the MAX and vice versa, though in theory it should only grab the MAX date for the year and the MIN date for the year:
;WITH CT AS(
SELECT Points
, Date
, DATEPART(DY,Date) DA
FROM Table
WHERE DATEPART(DY,Date) BETWEEN 363 AND 366
OR DATEPART(DY,Date) BETWEEN 1 AND 3
)
SELECT MIN(c.Date) MinYear
, MAX(c.Date) MaxYear
FROM CT c
GROUP BY YEAR(c.Date)
You want something like this for the first day of the year:
dateadd(year, datediff(year,0, c.Date), 0)
and this for the last day of the year:
--first day of next year -1
dateadd(day, -1, dateadd(year, datediff(year,0, c.Date) + 1, 0)
try this
for getting first day ,last day of the year && firstofthe next_year
SELECT
DATEADD(yy, DATEDIFF(yy,0,getdate()), 0) AS Start_Of_Year,
dateadd(yy, datediff(yy,-1, getdate()), -1) AS Last_Day_Of_Year,
DATEADD(yy, DATEDIFF(yy,0,getdate()) + 1, 0) AS FirstOf_the_NextYear
so putting this in your query
;WITH CT AS(
SELECT Points
, Date
, DATEPART(DY,Date) DA
FROM Table
WHERE DATEPART(DY,Date) BETWEEN
DATEPART(day,DATEADD(yy, DATEDIFF(yy,0,getdate()), 0)) AND
DATEPART(day,dateadd(yy, datediff(yy,-1, getdate()), -1))
)
SELECT MIN(c.Date) MinYear
, MAX(c.Date) MaxYear
FROM CT c
GROUP BY YEAR(c.Date)
I should refrain from developing in the evenings because I solved it, and it's actually quite simple:
SELECT MIN(Date)
, MAX(Date)
FROM Table
GROUP BY YEAR(Date)
I can put these values into a CTE and then JOIN on the dates and get what I need:
;WITH CT AS(
SELECT MIN(Date) Mi
, MAX(Date) Ma
FROM Table
GROUP BY YEAR(Date)
)
SELECT c.Mi
, m.Points
, c.Ma
, f.Points
FROM CT c
INNER JOIN Table m ON c.Mi = m.Date
INNER JOIN Table f ON c.Ma = f.Date

Understanding TSQL Coalesce function

I am trying to select all 12 months / year. And I thought following TSQL code would do this. However, this does not include all months like I want. What is the cause of this? This is modified code:
DECLARE #END_YEAR VARCHAR(10)
DECLARE #END_MONTH VARCHAR(10)
SET #END_YEAR = '2010'
SET #END_MONTH = '10'
DECLARE #TheMonthLastDate DATETIME
DECLARE #TempDate DATETIME
SET #TempDate = '2010-11-01 00:00:00.000'
SET #TheMonthLastDate = '2010-11-01 00:00:00.000'
;with months
as
(
select dateadd(month, -1, dateadd(day, datediff(day, 0, #TempDate), 0)) as m
union all
select dateadd(month, -1, m)
from months
where m > dateadd(month, -12, #TempDate)
)
,yourTable(DateOpened, DateClosed)
as
(select TSK_START_DATE, BTK_CLOSED_DATE
FROM [PROC].ALL_AUDIT
WHERE
(BTK_CLOSED_DATE < #TheMonthLastDate OR
TSK_START_DATE < #TheMonthLastDate
)
)
select yt.DateClosed 'r2', m.m 'r3',
month(coalesce(yt.DateClosed, m.m)) as 'MonthClosed',
year(coalesce(yt.DateClosed, m.m)) as 'YearClosed'
from months m
left join yourTable yt
on
( datepart(year, yt.DateClosed) = DATEPART(year, m.m)
and datepart(month, yt.DateClosed) = DATEPART(month, m.m)
or
datepart(year, yt.DateOpened) = DATEPART(year, m.m)
and datepart(month, yt.DateOpened) = DATEPART(month, m.m)
)
AND year(coalesce(yt.DateClosed, m.m)) = 2010
order by yt.DateClosed
So above query does not return all months. But if I change above WHERE lines to:
FROM [PROC].ALL_AUDIT
WHERE
BTK_CLOSED_DATE < #TheMonthLastDate
then this query does return all 12 months. How can this be?
Output that I want and that I see when WHERE is BTK_CLOSED_DATE < #TheMonthLastDate:
r2 r3 MonthClosed YearClosed
NULL 2010-06-01 00:00:00.000 6 2010
NULL 2009-11-01 00:00:00.000 11 2009
2010-01-06 20:02:19.127 2010-01-01 00:00:00.000 1 2010
2010-01-27 23:13:45.570 2010-01-01 00:00:00.000 1 2010
2010-02-15 14:49:14.427 2010-02-01 00:00:00.000 2 2010
2010-02-15 14:49:14.427 2009-12-01 00:00:00.000 2 2010
But if I instead use WHERE:
(BTK_CLOSED_DATE < #TheMonthLastDate OR
TSK_START_DATE < #TheMonthLastDate
)
then I see:
r2 r3 MonthClosed YearClosed
NULL 2010-10-01 00:00:00.000 10 2010
NULL 2010-09-01 00:00:00.000 9 2010
NULL 2010-09-01 00:00:00.000 9 2010
NULL 2010-08-01 00:00:00.000 8 2010
NULL 2010-08-01 00:00:00.000 8 2010
...
So notice that in first result I see NULL for June 2010, which is what I want.
I think the problem has something to do with the fact that my data contains 2009-2011 data, but I only compare months and not years. How would I add in this additional logic?
The only place where you are reducing the data is with the WHERE clause you have already identified. Therefore, the reason you are not getting all the months you expect is down to the column TSK_START_DATE not being less than #TheMonthLastDate for all months.
To prove this hypothesis, run the following section of your query (I have commented out part of the where clause and removed everything under 'yourTable' cte). The results should show you what is being returned in the TSK_Start_Date column for your missing months and help you identify why the rows are missing when applying the < #TheMonthLastDate clause.
DECLARE #END_YEAR VARCHAR(10)
DECLARE #END_MONTH VARCHAR(10)
SET #END_YEAR = '2010'
SET #END_MONTH = '10'
DECLARE #TheMonthLastDate DATETIME
DECLARE #TempDate DATETIME
SET #TempDate = '2010-11-01 00:00:00.000'
SET #TheMonthLastDate = '2010-11-01 00:00:00.000'
;with months
as
(
select dateadd(month, -1, dateadd(day, datediff(day, 0, #TempDate), 0)) as m
union all
select dateadd(month, -1, m)
from months
where m > dateadd(month, -12, #TempDate)
)
,yourTable(DateOpened, DateClosed)
as
(select TSK_START_DATE, BTK_CLOSED_DATE
FROM [PROC].ALL_AUDIT
WHERE
(BTK_CLOSED_DATE < #TheMonthLastDate OR
--TSK_START_DATE < #TheMonthLastDate
)
)
select * , #TheMonthLastDate TheMonthLastDate from yourTable