T-SQL get months from ranges of date


I have a table with an ID, a start date, and an end date. I want to insert into another table the ID together with the end of each month between the start date and the end date, e.g.
ID  Start Date  End Date
1   2012-01-01  2012-03-31
2   2012-10-01  2012-12-31
Results
ID  MONTH END
1   2012-01-31
1   2012-02-29
1   2012-03-31
2   2012-10-31
2   2012-11-30
2   2012-12-31

This answer makes some assumptions (for example, that no start date is later than its end date), but you should see how it works. It builds a recursive CTE (an anchor query plus a UNION ALL step) that advances one month at a time and computes the end of each month.
CREATE TABLE #Dates
(
    ID INT IDENTITY PRIMARY KEY,
    START_DATE DATETIME2(0) NOT NULL,
    END_DATE DATETIME2(0) NOT NULL
);
-- The statement before a CTE must end with a semicolon
INSERT INTO #Dates VALUES ('2012-01-01', '2012-03-31'), ('2012-10-01', '2012-12-31');
WITH MONTHS ([ID], [Month], [Date], [End])
AS
(
    -- Anchor: the month of each start date; [End] is the last second of that month
    SELECT ID, DATEPART(m, START_DATE) AS [Month], START_DATE AS [Date],
           DATEADD(s, -1, DATEADD(m, DATEDIFF(m, 0, START_DATE) + 1, 0)) AS [End]
    FROM #Dates
    UNION ALL
    -- Recursive step: move one month forward while still before END_DATE
    SELECT D.ID, DATEPART(m, DATEADD(m, 1, [Date])), DATEADD(m, 1, [Date]),
           DATEADD(s, -1, DATEADD(m, DATEDIFF(m, 0, DATEADD(m, 1, [Date])) + 1, 0)) AS [End]
    FROM #Dates D
    INNER JOIN MONTHS M
        ON D.ID = M.ID
    WHERE DATEADD(m, 1, [Date]) < [END_DATE]
)
SELECT *
FROM MONTHS
ORDER BY ID, [Date];
DROP TABLE #Dates;
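
On SQL Server 2012 or later, EOMONTH can replace the DATEADD/DATEDIFF arithmetic. The following is only a minimal sketch, not the answer's own method: the table variable @Dates and the CTE name MonthEnds are made up for illustration, and it assumes each END_DATE falls on a month end as in the question's sample (a mid-month END_DATE would leave that final month out).

```sql
-- Hypothetical sample data mirroring the question
DECLARE @Dates TABLE (ID int, START_DATE date, END_DATE date);
INSERT INTO @Dates VALUES (1, '2012-01-01', '2012-03-31'),
                          (2, '2012-10-01', '2012-12-31');

WITH MonthEnds (ID, MonthEnd, END_DATE) AS
(
    -- Anchor: end of the month that contains the start date
    SELECT ID, EOMONTH(START_DATE), END_DATE
    FROM @Dates
    UNION ALL
    -- Recursive step: end of the following month, while it does not pass END_DATE
    SELECT ID, EOMONTH(DATEADD(MONTH, 1, MonthEnd)), END_DATE
    FROM MonthEnds
    WHERE EOMONTH(DATEADD(MONTH, 1, MonthEnd)) <= END_DATE
)
SELECT ID, MonthEnd
FROM MonthEnds
ORDER BY ID, MonthEnd;
-- ID 1: 2012-01-31, 2012-02-29, 2012-03-31
-- ID 2: 2012-10-31, 2012-11-30, 2012-12-31
```

To load the result into another table, the final SELECT could be turned into an INSERT INTO ... SELECT against whatever target table you have.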

Related

DATE ADD function in PostgreSQL

I currently have the following code in Microsoft SQL Server to get users who viewed videos on two days in a row.
WITH userviewvideo (date, user_id) AS (
    SELECT DISTINCT date, user_id
    FROM clickstream_videos
    WHERE event_name = 'video_play'
      AND user_id IS NOT NULL
)
SELECT currentday.date AS date,
       COUNT(currentday.user_id) AS users_view_videos,
       COUNT(nextday.user_id) AS users_view_next_day
FROM userviewvideo currentday
LEFT JOIN userviewvideo nextday
    ON currentday.user_id = nextday.user_id
   AND DATEADD(DAY, 1, currentday.date) = nextday.date
GROUP BY currentday.date
I am trying to get the DATEADD function to work in PostgreSQL, but I have been unable to figure out how. Any suggestions?

I don't think PostgreSQL really has a DATEADD function. Instead, just add an interval:
+ INTERVAL '1 day'
SQL Server:
Add 1 day to the current date (November 21, 2012):
SELECT DATEADD(day, 1, GETDATE()); -- 2012-11-22 17:22:01.423
PostgreSQL:
Add 1 day to the current date (November 21, 2012):
SELECT CURRENT_DATE + INTERVAL '1 day'; -- 2012-11-22 00:00:00 (a timestamp; CURRENT_DATE has no time part)
SELECT CURRENT_DATE + 1; -- 2012-11-22 (a date)
http://www.sqlines.com/postgresql/how-to/dateadd
EDIT:
If the length of time is dynamic (for example, a number of days stored in a column), you can build a string and cast it to an interval:
+ (col_days || ' days')::interval
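
As a side-by-side sketch of the options above, the query below adds days to a fixed date in PostgreSQL. The date literal and the count of 3 are arbitrary illustration values; multiplying an interval by a number and make_interval (available from PostgreSQL 9.4) are alternatives to the string cast, not something the answer itself proposes.

```sql
SELECT DATE '2012-11-21' + 1                        AS plus_integer,       -- date 2012-11-22
       DATE '2012-11-21' + INTERVAL '1 day'         AS plus_interval,      -- timestamp 2012-11-22 00:00:00
       DATE '2012-11-21' + 3 * INTERVAL '1 day'     AS plus_n_days,        -- timestamp 2012-11-24 00:00:00
       DATE '2012-11-21' + make_interval(days => 3) AS plus_make_interval; -- timestamp 2012-11-24 00:00:00
```

Note the result types: date plus integer stays a date, while date plus interval becomes a timestamp, which matters when the result is compared with a date column in a join.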

You can use date + 1 to do the equivalent of dateadd(), but I do not think that your query does what you want to do.
You should use window functions, instead:
with plays as (
    select distinct date, user_id
    from clickstream_videos
    where event_name = 'video_play'
      and user_id is not null
), nextdaywatch as (
    select date, user_id,
           case
               when lead(date) over (partition by user_id
                                     order by date) = date + 1 then 1
               else 0
           end as user_view_next_day
    from plays
)
select date,
       count(*) as users_view_videos,
       sum(user_view_next_day) as users_view_next_day
from nextdaywatch
group by date
order by date;

Gaps and Islands - get a list of dates unemployed over a date range with PostgreSQL

I have a table called Position with the following rows. Dates are inclusive (yyyy-mm-dd); below is a simplified view of the employment dates:
id, person_id, start_date, end_date , title
1 , 1 , 2001-12-01, 2002-01-31, 'admin'
2 , 1 , 2002-02-11, 2002-03-31, 'admin'
3 , 1 , 2002-02-15, 2002-05-31, 'sales'
4 , 1 , 2002-06-15, 2002-12-31, 'ops'
I'd like to calculate the gaps in employment, given that some of the date ranges overlap, to produce the following output for the person with id=1:
person_id, start_date, end_date , last_position_id, gap_in_days
1 , 2002-02-01, 2002-02-10, 1 , 10
1 , 2002-06-01, 2002-06-14, 3 , 14
I have looked at numerous solutions, UNIONS, Materialized views, tables with generated calendar date ranges, etc. I really am not sure what is the best way to do this. Is there a single query where I can get this done?

You just need the lead() window function. It brings a value from the next row (start_date in this case) into the current row.
SELECT
    person_id,
    end_date + 1 AS start_date,
    lead - 1 AS end_date,
    id AS last_position_id,
    lead - (end_date + 1) AS gap_in_days
FROM (
    SELECT
        *,
        lead(start_date) OVER (PARTITION BY person_id ORDER BY start_date)
    FROM
        positions
) s
WHERE lead - (end_date + 1) > 0
After getting the next start_date you can compare it with the current end_date. If the next start_date is later than end_date + 1, there is a gap, and the difference is the gap length in days. Only these positive values pass the WHERE clause.
(If two positions overlap, the difference is negative, so the row is ignored.)
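
One case the comparison above does not cover is a long position that fully contains a later, shorter one: the shorter row's end_date would then produce a gap that does not exist. A possible variation, sketched below under that assumption, compares the next start_date with the latest end_date seen so far instead of the current row's end_date. The inline VALUES list stands in for the positions table, the names covered, covered_until, and next_start are made up for illustration, and the sketch leaves out last_position_id.

```sql
WITH positions (id, person_id, start_date, end_date) AS (
    VALUES (1, 1, DATE '2001-12-01', DATE '2002-01-31'),
           (2, 1, DATE '2002-02-11', DATE '2002-03-31'),
           (3, 1, DATE '2002-02-15', DATE '2002-05-31'),
           (4, 1, DATE '2002-06-15', DATE '2002-12-31')
), covered AS (
    SELECT person_id,
           -- latest end_date among this row and all earlier rows
           max(end_date) OVER (PARTITION BY person_id ORDER BY start_date, id) AS covered_until,
           -- start_date of the following position
           lead(start_date) OVER (PARTITION BY person_id ORDER BY start_date, id) AS next_start
    FROM positions
)
SELECT person_id,
       covered_until + 1 AS start_date,
       next_start - 1 AS end_date,
       next_start - (covered_until + 1) AS gap_in_days
FROM covered
WHERE next_start > covered_until + 1
ORDER BY start_date;
-- 1 | 2002-02-01 | 2002-02-10 | 10
-- 1 | 2002-06-01 | 2002-06-14 | 14
```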

First, find which date ranges overlap (see "Determine Whether Two Date Ranges Overlap").
Then merge those ranges into a single one and keep the last id.
Finally, calculate the number of days between one end_date and the next start_date - 1.
with find_overlap as (
SELECT t1."id" as t1_id, t1."person_id", t1."start_date", t1."end_date",
t2."id" as t2_id, t2."start_date" as t2_start_date, t2."end_date" as t2_end_date
FROM Table1 t1
LEFT JOIN Table1 t2
ON t1."person_id" = t2."person_id"
AND t1."start_date" <= t2."end_date"
AND t1."end_date" >= t2."start_date"
AND t1.id < t2.id
), merge_overlap as (
SELECT
person_id,
start_date,
COALESCE(t2_end_date, end_date) as end_date,
COALESCE(t2_id, t1_id) as last_position_id
FROM find_overlap
WHERE t1_id NOT IN (SELECT t2_id FROM find_overlap WHERE t2_ID IS NOT NULL)
), cte as (
SELECT *,
LEAD(start_date) OVER (partition by person_id order by start_date) next_start
FROM merge_overlap
)
SELECT *,
DATE_PART('day',
(next_start::timestamp - INTERVAL '1 DAY') - end_date::timestamp
) as days
FROM cte
WHERE next_start IS NOT NULL
OUTPUT
| person_id | start_date | end_date | last_position_id | next_start | days |
|-----------|------------|------------|------------------|------------|------|
| 1 | 2001-12-01 | 2002-01-31 | 1 | 2002-02-11 | 10 |
| 1 | 2002-02-11 | 2002-05-31 | 3 | 2002-06-15 | 14 |

Generating series Postgres

I want to be able to group rows by day, week, month, or whatever interval I set.
Following this solution, it works when the granularity is one month. But with an interval of 1 week, no records are returned.
These are the rows in my table:
This is my current query for the per-month interval, which works perfectly.
SELECT *
FROM (
SELECT day::date
FROM generate_series(timestamp '2018-09-01'
, timestamp '2018-12-01'
, interval '1 month') day
) d
LEFT JOIN (
SELECT date_trunc('month', created_date)::date AS day
, SUM(escrow_amount) AS profit, sum(total_amount) as revenue
FROM (
select distinct on (order_id) order_id, escrow_amount, total_amount, create_time from order_item
WHERE created_date >= date '2018-09-01'
AND created_date <= date '2018-12-01'
-- AND ... more conditions
) t2 GROUP BY 1
) t USING (day)
ORDER BY day;
Result from this query
And this is the per week interval query. I will reduce the range to two months for brevity.
SELECT *
FROM (
SELECT day::date
FROM generate_series(timestamp '2018-09-01'
, timestamp '2018-11-01'
, interval '1 week') day
) d
LEFT JOIN (
SELECT date_trunc('week', created_date)::date AS day
, SUM(escrow_amount) AS profit, sum(total_amount) as revenue
FROM (
select distinct on (order_id) order_id, escrow_amount, total_amount, create_time from order_item
WHERE created_date >= date '2018-09-01'
AND created_date <= date '2018-11-01'
-- AND ... more conditions
) t2 GROUP BY 1
) t USING (day)
ORDER BY day;
Take note that I have records from October, but the result here doesn't show anything for October dates.
Any idea what I am missing here?

The series values in your weekly query are not truncated to the beginning of the week, while the values you join them to are.
date_trunc('week', '2018-09-01'::date)::date
is equal to
'2018-08-27'::date
so the join on day finds no match:
'2018-09-01'::date <> '2018-08-27'::date
Your query should look more like this:
SELECT *
FROM (
    SELECT day::date
    FROM generate_series(date_trunc('week', timestamp '2018-09-01') -- truncate the start of the series
                       , timestamp '2018-11-01'
                       , interval '1 week') day
) d
LEFT JOIN (
    SELECT date_trunc('week', created_date::date)::date AS day
         , SUM(escrow_amount) AS profit, sum(total_amount) as revenue
    FROM (
        -- created_date is selected here (the question selected create_time) so the outer date_trunc can see it
        select distinct on (order_id) order_id, escrow_amount, total_amount, created_date from order_item
        WHERE created_date::date >= date '2018-09-01'
          AND created_date::date <= date '2018-11-01'
        -- AND ... more conditions
    ) t2 GROUP BY 1
) t USING (day)
WHERE day >= '2018-09-01' -- skip the week that starts before the requested range
ORDER BY day;
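
To see the mismatch in isolation, the sketch below (with an arbitrary four-week range and made-up column names) lists the untruncated series next to the week start that date_trunc produces for the same value. 2018-09-01 is a Saturday, and date_trunc('week', ...) goes back to the Monday.

```sql
SELECT s::date                                      AS series_day,
       date_trunc('week', s)::date                  AS week_start,
       s::date = date_trunc('week', s)::date        AS joinable
FROM generate_series(timestamp '2018-09-01',
                     timestamp '2018-09-22',
                     interval '1 week') AS s;
-- 2018-09-01 | 2018-08-27 | false
-- 2018-09-08 | 2018-09-03 | false
-- 2018-09-15 | 2018-09-10 | false
-- 2018-09-22 | 2018-09-17 | false
```

None of the series values equals a week start, which is why the original weekly join returned no matching rows.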

t-sql select max value between two columns, or col one when col two is null

This is not easy for me to describe in the title (please forgive me), but here is my problem:
Suppose you have the following table:
CREATE TABLE Subscriptions (product char(3), start_date datetime, end_date datetime);
INSERT INTO Subscriptions
VALUES ('ABC', '2015-01-28 00:00:00', '2016-02-15 00:00:00'),
       ('ABC', '2016-02-04 12:08:00', NULL),
       ('DEF', '2013-04-15 00:00:00', '2013-06-10 00:00:00'),
       ('GHI', '2013-01-11 00:00:00', '2013-04-08 00:00:00');
Now I want to find out how long a subscription has been either active or passive. I therefore need to select the newest end_date per product, BUT if end_date is NULL, I want the start_date instead.
So, I have:
product  start_date        end_date
ABC      28-01-2015 00:00  15-02-2016 00:00
ABC      04-02-2016 12:08  NULL
DEF      15-04-2013 00:00  10-06-2013 00:00
GHI      11-01-2013 00:00  08-04-2013 00:00
What I want to find in my query:
product  relevant_date
ABC      04-02-2016 12:08
DEF      10-06-2013 00:00
GHI      08-04-2013 00:00
I have tried using a union, and that seems to work, but it is very slow. My question is: is there a more efficient way to solve this? (I am using MS SQL Server 2012.)
SELECT [product]
,MAX([start_date]) AS start_date
,NULL AS [end_date]
,MAX([start_date]) AS relevant_date
FROM Subscriptions
where end_date IS NULL
GROUP BY product
UNION
SELECT [product]
,NULL
,MAX([end_date])
,MAX([end_date])
FROM Subscriptions
where end_date IS not NULL and product not in (SELECT product FROM Subscriptions
where end_date IS NULL)
GROUP BY product
(If you have a suggestion for another title for my question, I am also all ears!)

For version 2012 or higher you can use a combination of distinct, first_value and isnull, like this:
SELECT DISTINCT
product,
FIRST_VALUE(ISNULL(end_date,start_date))
OVER(PARTITION BY product
ORDER BY ISNULL(end_date, '9999-12-31') DESC) AS EndDate
FROM Subscriptions
Results:
product EndDate
ABC 04.02.2016 12:08:00
DEF 10.06.2013 00:00:00
GHI 08.04.2013 00:00:00
For earlier versions (2005 and later, where FIRST_VALUE is not available), you can use a CTE with ROW_NUMBER to get the same effect:
;WITH CTE AS
(
SELECT product,
ISNULL(end_date,start_date) As relevant_date,
ROW_NUMBER() OVER(PARTITION BY product ORDER BY ISNULL(end_date, '9999-12-31') DESC) As rn
FROM Subscriptions
)
SELECT product,
relevant_date
FROM CTE
WHERE rn = 1

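If the second ABC row is showing an incorrect start_date, then this simpler query should work. Note that with the sample data as given it returns 2016-02-15 for ABC, because the closed row's end_date is later than the open row's start_date: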
SELECT S.product
, relevant_date = MAX(ISNULL(S.end_date,S.start_date))
FROM dbo.Subscriptions S
GROUP BY S.product

This should do it:
select s1.product,
       MAX(case when useStartDate = 1 then s1.start_date else s1.end_date end) AS SubscriptionDate
from Subscriptions s1
join (
    -- useStartDate = 1 when the product has at least one open (NULL end_date) subscription
    select s2s1.product,
           max(case when s2s1.end_date is null then 1 else 0 end) AS useStartDate
    from Subscriptions s2s1
    group by s2s1.product
) s2 on s1.product = s2.product
group by s1.product

Last Working Day is showing null while on weekend

Here is my code, but it returns NULL when today is Friday. I would like to get the last working day.
-- Insert statements for procedure here
--Below is the param you would pass
DECLARE @dateToEvaluate date=GETDATE();
--Routine
DECLARE @startDate date=CAST('1/1/'+CAST(YEAR(@dateToEvaluate) AS char(4)) AS date); -- let's get the first of the year
WITH
tally(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1 FROM sys.all_columns),
dates AS (
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS dt_id,
           DATEADD(DAY,n,@startDate) AS dt,
           DATENAME(WEEKDAY,DATEADD(DAY,n,@startDate)) AS dt_name
    FROM tally
    WHERE n<366 --arbitrary
      AND DATEPART(WEEKDAY,DATEADD(DAY,n,@startDate)) NOT IN (6)
      AND DATEADD(DAY,n,@startDate) NOT IN (SELECT CAST(HolidayDate AS date) FROM Holiday)),
curr_id(id) AS (SELECT dt_id FROM dates WHERE dt=@dateToEvaluate)
SELECT d.dt
FROM dates AS d
CROSS JOIN
curr_id c
WHERE d.dt_id+1=c.id

The code below takes any date and "walks backward" to find the previous weekday (Monday to Friday) that is not in the @holidays table.
declare @currentdate datetime = '2015-03-22'
declare @holidays table (holiday datetime)
insert @holidays values ('2015-03-20')
;with cte as (
    select
        @currentdate k
    union all
    -- keep stepping back one day while k is the start date, a weekend day, or a holiday
    select
        dateadd(day, -1, k)
    from cte
    where
        k = @currentdate
        or ((datepart(dw, k) + @@DATEFIRST - 1 - 1) % 7) + 1 > 5 --determine day of week independent of culture
        or k in (select holiday from @holidays)
)
select min(k) from cte

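The dates table doesn't have any Friday dates in it: with the default DATEFIRST setting of 7 (Sunday = 1), DATEPART(WEEKDAY, ...) returns 6 for Friday, so NOT IN (6) filters out every Friday. Change NOT IN (6) to NOT IN (1, 7), which removes Saturdays and Sundays from the dates table instead.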
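
The weekday numbers depend on the session's DATEFIRST setting, so it can help to check them directly. The sketch below sets DATEFIRST to 7 explicitly (the us_english default) and uses three consecutive dates from March 2015 chosen only for illustration; the column aliases are made up.

```sql
SET DATEFIRST 7;  -- Sunday = 1 ... Saturday = 7

SELECT DATEPART(WEEKDAY, CAST('2015-03-20' AS date)) AS friday_dw,    -- 6
       DATEPART(WEEKDAY, CAST('2015-03-21' AS date)) AS saturday_dw,  -- 7
       DATEPART(WEEKDAY, CAST('2015-03-22' AS date)) AS sunday_dw;    -- 1
```

Under a different DATEFIRST value these numbers shift, in which case a setting-independent expression such as the @@DATEFIRST formula in the previous answer may be the safer filter.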