How does this Time Difference Calculation work? - tsql

I wanted to display the difference in HH:MM:SS between two datetime fields in SQL Server 2014.
I found a solution in this Stack Overflow post. And it works perfectly. But I want to understand the "why" of how this arrives at the correct answer.
T-SQL:
SELECT y.CustomerID ,
y.createDate ,
y.HarvestDate ,
y.DateDif ,
DATEDIFF ( DAY, 0, y.DateDif ) AS [Days] ,
DATEPART ( HOUR, y.DateDif ) AS [Hours] ,
DATEPART ( MINUTE, y.DateDif ) AS [Minutes]
FROM (
SELECT x.createDate - x.HarvestDate AS [DateDif] ,
x.createDate ,
x.HarvestDate ,
x.CustomerID
FROM (
SELECT CustomerID ,
HarvestDate ,
createDate
FROM dbo.CustomerHarvestReports
WHERE HarvestDate >= DATEADD ( MONTH, -6, GETDATE ())
) AS [x]
) AS [y]
ORDER BY DATEDIFF ( DAY, 0, y.DateDif ) DESC;
Results:
1239090 2017-11-07 08:51:03.870 2017-10-14 11:39:49.540 1900-01-24 21:11:14.330 23 21 11
1239090 2017-11-07 08:51:04.823 2017-10-19 11:17:48.320 1900-01-19 21:33:16.503 18 21 33
1843212 2017-10-27 19:14:02.070 2017-10-21 10:49:57.733 1900-01-07 08:24:04.337 6 8 24
1843212 2017-10-27 19:14:03.057 2017-10-21 10:49:57.733 1900-01-07 08:24:05.323 6 8 24
The first column is the customer ID; the second and third columns are the ones I wanted to calculate the time difference between. The fourth column is the difference between the two, and it is one of the points in the code that I do not understand.
If you subtract two datetime fields like this, createDate - HarvestDate, why does the result default to the year 1900?
And regarding DATEDIFF ( DAY, 0, y.DateDif ): what does the 0 mean? Does the 0 set the date to '01-01-1900'?
It works - for that I am grateful. I was hoping I could get an explanation as to why this behavior works?

I've added some comments that should explain it:
SELECT y.CustomerID ,
y.createDate ,
y.HarvestDate ,
y.DateDif ,
DATEDIFF ( DAY, 0, y.DateDif ) AS [Days] , -- 0 converts to '1900-01-01 00:00:00.000', so this counts the whole days
-- between that base date and [DateDif]
DATEPART ( HOUR, y.DateDif ) AS [Hours] , -- the leftover hours (0-23) are already encoded in the time portion of
-- [DateDif], so all that is required is to extract that figure
-- using DATEPART
DATEPART ( MINUTE, y.DateDif ) AS [Minutes] -- same explanation as [Hours]
FROM (
SELECT x.createDate - x.HarvestDate AS [DateDif] , -- calculates the difference expressed as a datetime;
-- 0 is '1900-01-01 00:00:00.000' as a datetime, so the
-- resulting datetime will be that plus the difference
x.createDate ,
x.HarvestDate ,
x.CustomerID
FROM (
SELECT CustomerID ,
HarvestDate ,
createDate
FROM dbo.CustomerHarvestReports
WHERE HarvestDate >= DATEADD ( MONTH, -6, GETDATE ())
) AS [x]
) AS [y]
ORDER BY DATEDIFF ( DAY, 0, y.DateDif ) DESC;
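The two points from the question can be checked in isolation. The following is a minimal T-SQL sketch, not part of the original answer: the two literals are copied from the third result row in the question, the variable names are only illustrative, and the [Seconds] column is an addition (the question asked for HH:MM:SS). It assumes both values are of type datetime; the - operator is not defined for datetime2, and the trick breaks down if the second value is later than the first.

```sql
-- The integer 0 implicitly converts to the datetime base date.
SELECT CAST(0 AS datetime) AS [ZeroAsDatetime];   -- 1900-01-01 00:00:00.000

-- Values taken from the third result row in the question.
DECLARE @createDate  datetime = '2017-10-27T19:14:02.070',
        @HarvestDate datetime = '2017-10-21T10:49:57.733';

SELECT d.DateDif ,                                   -- 1900-01-07 08:24:04.337
       DATEDIFF ( DAY, 0, d.DateDif ) AS [Days] ,    -- 6
       DATEPART ( HOUR, d.DateDif )   AS [Hours] ,   -- 8
       DATEPART ( MINUTE, d.DateDif ) AS [Minutes] , -- 24
       DATEPART ( SECOND, d.DateDif ) AS [Seconds]   -- 4
FROM ( SELECT @createDate - @HarvestDate AS [DateDif] ) AS [d];
```

The subtraction yields the elapsed time added to the base date, which is why every [DateDif] value in the results falls in January 1900.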

Related

Generating series Postgres

I want to be able to generate groups of rows by day, week, month, or whatever interval I set.
Following this solution, it works when the granularity is one month. But with an interval of 1 week, no records are returned.
These are the rows in my table
This is the current query I have for the per-month interval, which works perfectly.
SELECT *
FROM (
SELECT day::date
FROM generate_series(timestamp '2018-09-01'
, timestamp '2018-12-01'
, interval '1 month') day
) d
LEFT JOIN (
SELECT date_trunc('month', created_date)::date AS day
, SUM(escrow_amount) AS profit, sum(total_amount) as revenue
FROM (
select distinct on (order_id) order_id, escrow_amount, total_amount, create_time from order_item
WHERE created_date >= date '2018-09-01'
AND created_date <= date '2018-12-01'
-- AND ... more conditions
) t2 GROUP BY 1
) t USING (day)
ORDER BY day;
Result from this query
And this is the per week interval query. I will reduce the range to two months for brevity.
SELECT *
FROM (
SELECT day::date
FROM generate_series(timestamp '2018-09-01'
, timestamp '2018-11-01'
, interval '1 week') day
) d
LEFT JOIN (
SELECT date_trunc('week', created_date)::date AS day
, SUM(escrow_amount) AS profit, sum(total_amount) as revenue
FROM (
select distinct on (order_id) order_id, escrow_amount, total_amount, create_time from order_item
WHERE created_date >= date '2018-09-01'
AND created_date <= date '2018-11-01'
-- AND ... more conditions
) t2 GROUP BY 1
) t USING (day)
ORDER BY day;
Take note that I have records from October, but the result here doesn't show anything for October dates.
Any idea what I am missing here?
The dates generated by the series in your weekly query are not truncated to the beginning of the week.
date_trunc('week', '2018-09-01'::date)::date
is equal to
'2018-08-27'::date
so your join using day never matches:
'2018-09-01'::date <> '2018-08-27'::date
Your query should look more like this:
SELECT *
FROM (
SELECT day::date
FROM generate_series(date_trunc('week',timestamp '2018-09-01') -- truncate the series start to the beginning of its week
, timestamp '2018-11-01'
, interval '1 week') day
) d
LEFT JOIN (
SELECT date_trunc('week', created_date::date)::date AS day
, SUM(escrow_amount) AS profit, sum(total_amount) as revenue
FROM (
select distinct on (order_id) order_id, escrow_amount, total_amount, create_time from order_item
WHERE created_date::date >= date '2018-09-01'
AND created_date::date <= date '2018-11-01'
-- AND ... more conditions
) t2 GROUP BY 1
) t USING (day)
WHERE day >= '2018-09-01' -- skip the week that starts before the original (untruncated) series start
ORDER BY day;
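To see the mismatch directly, the generated days can be listed next to the week start that date_trunc would assign to them. This is a small PostgreSQL sketch over a shortened range, with column names chosen only for illustration:

```sql
-- Each generated day is a Saturday, while date_trunc('week', ...) returns the Monday of that week,
-- so the two columns never coincide and a join on them finds nothing.
SELECT d::date                       AS series_day,
       date_trunc('week', d)::date   AS week_start
FROM generate_series(timestamp '2018-09-01'
                   , timestamp '2018-09-22'
                   , interval '1 week') AS d;
-- series_day | week_start
-- 2018-09-01 | 2018-08-27
-- 2018-09-08 | 2018-09-03
-- 2018-09-15 | 2018-09-10
-- 2018-09-22 | 2018-09-17
```

The monthly query did not hit this problem because 2018-09-01 is already the first day of its month, so the series and date_trunc('month', ...) produce the same dates.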

Group Records By Time Period

I'm trying to take a time frame the user selects and then group the selection into time periods - in this case: 2 weeks.
For instance, today is 5/4/2018 and if I set that as my start date and 5/31/2018 as my end date, I get the following:
DECLARE @StartDate DATE ,
@EndDate DATE ,
@ToDate DATE;
SET @StartDate = GETDATE ();
SET @EndDate = '20180531';
SET @ToDate = DATEADD ( DAY, 1, @EndDate );
SELECT dd.Date ,
ROW_NUMBER () OVER ( ORDER BY DATEPART ( WEEK, dd.Date )) AS [rownumb]
FROM dbo.DateDimension AS [dd]
WHERE dd.Date >= @StartDate
AND dd.Date < @ToDate;
And the results look like:
Date rownumb
2018-05-04 1
2018-05-05 2
2018-05-06 3
2018-05-07 4
2018-05-08 5
2018-05-09 6
2018-05-10 7
2018-05-11 8
2018-05-12 9
2018-05-13 10
2018-05-14 11
2018-05-15 12
2018-05-16 13
2018-05-17 14
2018-05-18 15
2018-05-19 16
2018-05-20 17
2018-05-21 18
2018-05-22 19
2018-05-23 20
2018-05-24 21
2018-05-25 22
2018-05-26 23
2018-05-27 24
2018-05-28 25
2018-05-29 26
2018-05-30 27
2018-05-31 28
I was playing around with ROW_NUMBER (along with RANK and DENSE_RANK), but I have not been able to get these functions to do what I am looking for. What I am hoping for is an additional column called "TimePeriod" in which the dates are grouped in 2-week (14-day) increments, so that 5/4/18 through 5/17/18 have a value of 1 in the "TimePeriod" column and 5/18/18 through 5/31/18 have a value of 2. This should be dynamic, so that wider date ranges are grouped into two-week periods, with each period's number increasing by 1.
Suggestions?
If there's no requirement to use the ordering and ranking functions, you can easily implement it as below.
Get the total number of days between the start and end date.
For each date, subtract the number of days remaining until the end date from that total, then divide the result by 14 (integer division).
This gives you the 2-week interval to which the current date belongs. It is zero-based, so you might want to add 1 to it.
DECLARE @StartDate DATE ,
@EndDate DATE ,
@ToDate DATE;
DECLARE @DaysDiff INT;
SET @StartDate = GETDATE ();
SET @EndDate = '20180531';
SET @ToDate = DATEADD ( DAY, 1, @EndDate );
--GET the difference in days between the start and end date
SET @DaysDiff = DATEDIFF( Day, @StartDate,@ToDate );
SELECT dd.Date ,
( @DaysDiff - DATEDIFF(Day,dd.Date,@ToDate) )/14 AS [TimePeriod] -- zero-based; add 1 for periods starting at 1
FROM dbo.DateDimension AS [dd]
WHERE dd.Date >= @StartDate
AND dd.Date < @ToDate;
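The expression above reduces to the number of days elapsed since the start date, because (ToDate - StartDate) - (ToDate - Date) = Date - StartDate. The following sketch uses that shorter form and adds 1 so the periods start at 1. It is self-contained: the start date is fixed to 2018-05-04 and a few boundary dates from the question stand in for dbo.DateDimension, both purely for illustration.

```sql
DECLARE @StartDate DATE = '20180504';

-- Integer division by 14 buckets the elapsed days into two-week periods.
SELECT d.[Date] ,
       DATEDIFF ( DAY, @StartDate, d.[Date] ) / 14 + 1 AS [TimePeriod]
FROM ( VALUES ( CAST( '20180504' AS DATE )) ,   -- TimePeriod 1 (day 0)
              ( '20180517' ) ,                  -- TimePeriod 1 (day 13)
              ( '20180518' ) ,                  -- TimePeriod 2 (day 14)
              ( '20180531' )                    -- TimePeriod 2 (day 27)
     ) AS d ( [Date] );
```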

2 T-SQL queries that should be the same are not

I have the 2 below queries that should produce the same result as far as I can tell but they are actually producing vastly different numbers. Why is "Between" dates not the same as specifying the month and year of those dates?
What could be causing this?
SELECT [Account]
, SUM([Amount]) AS [Amount]
FROM [Table]
WHERE [Account] = 'Specific Account'
AND Month([Date]) = 5
AND Year([Date]) = 2015
GROUP BY [Account]
Sum Result: -1,500,000
SELECT [Account]
, SUM([Amount]) AS [Amount]
FROM [Table]
WHERE [Account] = 'Specific Account'
AND [Date] BETWEEN '2015-05-01' AND '2015-05-31'
GROUP BY [Account]
Sum Result: 350,000
I need the first one to be correct because I need to group the results by Month and Year, which would be cumbersome using the second query.
Query that I need ultimately:
SELECT [Account]
, Month([Date]) AS [Month]
, Year([Date]) AS [Year]
, SUM([Amount]) AS [Amount]
FROM [Table]
GROUP BY [Account]
, Month([Date])
, Year([Date])
[Date] BETWEEN '2015-05-01' AND '2015-05-31'
will only include rows on the 31st where the time component is midnight and omit the rest of the day.
You should forget about BETWEEN here: there is no string literal you can put on the right-hand side that works correctly for all of datetime, smalldatetime, and datetime2(0) through datetime2(7). Use a half-open range instead:
WHERE [Date] >= '2015-05-01' AND [Date] < '2015-06-01'
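A small sketch that reproduces the effect, assuming the [Date] column is a datetime that carries a time portion. The table variable and its three rows are made up for the demonstration:

```sql
DECLARE @t TABLE ( [Date] datetime, [Amount] int );
INSERT INTO @t VALUES ( '2015-05-31T00:00:00', 1 ) ,    -- midnight on the 31st
                      ( '2015-05-31T14:30:00', 10 ) ,   -- later on the 31st
                      ( '2015-06-01T00:00:00', 100 );   -- first instant of June

SELECT ( SELECT SUM([Amount]) FROM @t
         WHERE [Date] BETWEEN '20150501' AND '20150531' ) AS [BetweenSum] ,   -- 1: misses the 14:30 row
       ( SELECT SUM([Amount]) FROM @t
         WHERE [Date] >= '20150501' AND [Date] < '20150601' ) AS [HalfOpenSum]; -- 11: all of May, nothing from June
```

The half-open form gives the same rows as the Month()/Year() filter, which would explain why the two queries in the question disagree.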
Try below for your first case, where you are getting more rows.
AND (Month([Date]) = 5 AND Year([Date]) = 2015)
instead of
AND Month([Date]) = 5 AND Year([Date]) = 2015
==Update==
I would suggest using the CONVERT function. You should revise your query like below:
CONVERT(varchar(10),DATE_COLUMN,112) between '20150501' and '20150531'

Rolling sum per time interval per group

Table, data and task as follows.
See SQL-Fiddle-Link for demo-data and estimated results.
create table "data"
(
"item" int
, "timestamp" date
, "balance" float
, "rollingSum" float
)
insert into "data" ( "item", "timestamp", "balance", "rollingSum" ) values
( 1, '2014-02-10', -10, -10 )
, ( 1, '2014-02-15', 5, -5 )
, ( 1, '2014-02-20', 2, -3 )
, ( 1, '2014-02-25', 13, 10 )
, ( 2, '2014-02-13', 15, 15 )
, ( 2, '2014-02-16', 15, 30 )
, ( 2, '2014-03-01', 15, 45 )
I need to get all rows in a defined time interval. The table above doesn't hold a record per item for each possible date; only dates on which changes were applied are recorded (it is possible that there are n rows per timestamp per item).
If the given interval does not fit exactly on stored timestamps, the latest timestamp before the start date (nearest smaller neighbour) should be used as the start balance/rolling sum.
Expected results (time interval: startdate = '2014-02-13', enddate = '2014-02-20'):
"item", "timestamp" , "balance", "rollingSum"
1 , '2014-02-13' , -10 , -10
1 , '2014-02-15' , 5 , -5
1 , '2014-02-20' , 2 , -3
2 , '2014-02-13' , 15 , 15
2 , '2014-02-16' , 15 , 30
I checked questions like this and googled a lot, but haven't found a solution yet.
I don't think it's a good idea to extend the "data" table with one row per missing date per item, since the complete interval (smallest date <-----> latest date per item) may span several years.
Thanks in advance!
select sum(balance)
from table
where timestamp >= (select max(timestamp) from table where timestamp <= 'startdate')
and timestamp <= 'enddate'
Don't know what you mean by rolling-sum.
Here is an attempt. It seems to give the right result, though it is not so beautiful. It would have been easier in SQL Server 2012+:
declare @from date = '2014-02-13'
declare @to date = '2014-02-20'
;with x as
(
select
item, timestamp, balance, row_number() over (partition by item order by timestamp, balance) rn
from (select item, timestamp, balance from data
union all
select distinct item, @from, null from data) z
where timestamp <= @to
)
, y as
(
select item,
timestamp,
coalesce(balance, rollingsum) balance ,
a.rollingsum,
rn
from x d
cross apply
(select sum(balance) rollingsum from x where rn <= d.rn and d.item = item) a
where timestamp between @from and @to
)
select item, timestamp, balance, rollingsum from y
where rollingsum is not null
order by item, rn, timestamp
Result:
item timestamp balance rollingsum
1 2014-02-13 -10,00 -10,00
1 2014-02-15 5,00 -5,00
1 2014-02-20 2,00 -3,00
2 2014-02-13 15,00 15,00
2 2014-02-16 15,00 30,00
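For SQL Server 2012 or later, the running total can be computed with a windowed SUM instead of the cross apply. The following is a sketch of that variant, not the answer's own code: the table variable, the CTE names (running, opening) and the hard-coded sample rows are assumptions made to keep it self-contained.

```sql
DECLARE @from date = '20140213', @to date = '20140220';

DECLARE @data TABLE ( item int, [timestamp] date, balance float );
INSERT INTO @data VALUES
  ( 1, '20140210', -10 ), ( 1, '20140215', 5 ), ( 1, '20140220', 2 ), ( 1, '20140225', 13 ),
  ( 2, '20140213', 15 ),  ( 2, '20140216', 15 ), ( 2, '20140301', 15 );

WITH running AS
(   -- running total per item; the default RANGE frame gives rows sharing a timestamp the same total
    SELECT item, [timestamp], balance,
           SUM(balance) OVER ( PARTITION BY item ORDER BY [timestamp] ) AS rollingSum
    FROM @data
    WHERE [timestamp] <= @to
)
, opening AS
(   -- latest row strictly before @from per item, re-dated to @from as the start balance
    SELECT p.item, @from AS [timestamp], p.rollingSum AS balance, p.rollingSum
    FROM ( SELECT r.*, ROW_NUMBER() OVER ( PARTITION BY r.item ORDER BY r.[timestamp] DESC ) AS rn
           FROM running AS r
           WHERE r.[timestamp] < @from ) AS p
    WHERE p.rn = 1
)
SELECT item, [timestamp], balance, rollingSum FROM opening
UNION ALL
SELECT item, [timestamp], balance, rollingSum FROM running WHERE [timestamp] >= @from
ORDER BY item, [timestamp];
```

On the sample rows this returns the same five rows as the result above. As in the original attempt, an item that has both earlier rows and a row dated exactly @from gets two rows for that date.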

Get First and Last Day of Any Year

I'm currently trying to get the first and last day of any year. I have data from 1950 and I want to get the first day of the year in the dataset to the last day of the year in the dataset (note that the last day of the year might not be December 31st, and the same goes for the first day of the year).
Initially I thought I could use a CTE and call DATEPART with the day of the year selection, but this wouldn't partition appropriately. I also tried a CTE self-join, but since the last day or first day of the year might be different, this also yields inaccurate results.
For instance, using the below actually generates some MINs in the MAX and vice versa, though in theory it should only grab the MAX date for the year and the MIN date for the year:
;WITH CT AS(
SELECT Points
, Date
, DATEPART(DY,Date) DA
FROM Table
WHERE DATEPART(DY,Date) BETWEEN 363 AND 366
OR DATEPART(DY,Date) BETWEEN 1 AND 3
)
SELECT MIN(c.Date) MinYear
, MAX(c.Date) MaxYear
FROM CT c
GROUP BY YEAR(c.Date)
You want something like this for the first day of the year:
dateadd(year, datediff(year,0, c.Date), 0)
and this for the last day of the year:
--first day of next year -1
dateadd(day, -1, dateadd(year, datediff(year,0, c.Date) + 1, 0))
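A quick check of both expressions against a fixed date, as a sketch. The variable @d and its value are arbitrary and stand in for c.Date:

```sql
DECLARE @d datetime = '20180504';

-- DATEDIFF(YEAR, 0, @d) counts the years since the base date 1900-01-01;
-- adding that many years back onto 0 lands on January 1 of @d's year.
SELECT DATEADD(YEAR, DATEDIFF(YEAR, 0, @d), 0)                      AS FirstDayOfYear , -- 2018-01-01 00:00:00.000
       DATEADD(DAY, -1, DATEADD(YEAR, DATEDIFF(YEAR, 0, @d) + 1, 0)) AS LastDayOfYear;   -- 2018-12-31 00:00:00.000
```

Note that these return the calendar boundaries of the year, which may differ from the first and last dates actually present in the dataset.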
Try this to get the first day of the year, the last day of the year, and the first day of the next year:
SELECT
DATEADD(yy, DATEDIFF(yy,0,getdate()), 0) AS Start_Of_Year,
dateadd(yy, datediff(yy,-1, getdate()), -1) AS Last_Day_Of_Year,
DATEADD(yy, DATEDIFF(yy,0,getdate()) + 1, 0) AS FirstOf_the_NextYear
so putting this in your query
;WITH CT AS(
SELECT Points
, Date
, DATEPART(DY,Date) DA
FROM Table
WHERE DATEPART(DY,Date) BETWEEN
DATEPART(day,DATEADD(yy, DATEDIFF(yy,0,getdate()), 0)) AND
DATEPART(day,dateadd(yy, datediff(yy,-1, getdate()), -1))
)
SELECT MIN(c.Date) MinYear
, MAX(c.Date) MaxYear
FROM CT c
GROUP BY YEAR(c.Date)
I should refrain from developing in the evenings because I solved it, and it's actually quite simple:
SELECT MIN(Date)
, MAX(Date)
FROM Table
GROUP BY YEAR(Date)
I can put these values into a CTE and then JOIN on the dates and get what I need:
;WITH CT AS(
SELECT MIN(Date) Mi
, MAX(Date) Ma
FROM Table
GROUP BY YEAR(Date)
)
SELECT c.Mi
, m.Points
, c.Ma
, f.Points
FROM CT c
INNER JOIN Table m ON c.Mi = m.Date
INNER JOIN Table f ON c.Ma = f.Date