Redshift how to subtract the number of minutes from two different date columns - amazon-redshift

I would like to subtract two date columns and get the difference in minutes. In the table below, a notification has an ideal_date of 12/29 1pm, but its actual_date shows it was sent on 12/30 1pm. That means it took 24 hours, or 1440 minutes, for the notification to be sent out.
I tried the following query but I'm not getting what I need.
select n.ideal_date,
n.actual_date,
abs(date_part(minute,n.ideal_date) - date_part(minute,n.actual_date)) as minutes
from table_date n
id  ideal_date          actual_date         minutes
58  12/29/2021, 1:00pm  12/30/2021, 1:00pm  1440 mins

You want DATEDIFF(). https://docs.aws.amazon.com/redshift/latest/dg/r_DATEDIFF_function.html
select n.ideal_date,
n.actual_date,
abs(DATEDIFF(minute,n.ideal_date,n.actual_date)) as minutes
from table_date n
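
As a rough illustration of why the original query returns 0 for this row: date_part(minute, ...) yields only the minute-of-hour field of each value, so two timestamps at 1:00pm on different days both give 0. The Python sketch below is not Redshift code, and its variable names are made up for illustration; it only contrasts the two calculations for the row in the question.

```python
from datetime import datetime

# The row from the question (id 58).
ideal_date = datetime(2021, 12, 29, 13, 0)
actual_date = datetime(2021, 12, 30, 13, 0)

# What the date_part approach compares: only the minute field (0-59) of each value.
minute_field_diff = abs(ideal_date.minute - actual_date.minute)

# What was actually wanted: the whole elapsed time, expressed in minutes.
elapsed_minutes = abs(int((actual_date - ideal_date).total_seconds() // 60))

print(minute_field_diff)  # 0
print(elapsed_minutes)    # 1440
```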

Related

How do I compare two TIMESTAMP columns to check for a difference of at most 15 minutes?

I'm using PostgreSQL 9.5. I have a column in my table, article, of type TIMESTAMP. I would like to write a query in which one of the conditions is to compare two articles whose dates are separated by at most 15 minutes. I tried this ...
where extract(minute from a2.created_on - a1.created_on) < 15
but I'm realizing this is incorrect. It returns articles separated by less than 15 minutes, but also articles separated by an hour and 15 minutes, two hours and 15 minutes, etc. How do I refine my condition so that it only considers articles separated by at most 15 minutes?
It can be simpler:
WHERE a2.created_on - a1.created_on < '15min'
The difference of two timestamp values is an interval value, so it can be compared directly against another interval.
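
To see the pitfall outside the database, here is a small Python sketch. It is an analogy rather than PostgreSQL itself: minute_field is a hypothetical helper that mimics extracting only the minute component of a (non-negative) interval, and the timestamps are invented for the example.

```python
from datetime import datetime, timedelta

def minute_field(delta: timedelta) -> int:
    """Return only the minutes component (0-59) of a non-negative timedelta."""
    return (delta.seconds // 60) % 60

a1_created_on = datetime(2017, 1, 1, 12, 0)
a2_created_on = a1_created_on + timedelta(hours=1, minutes=10)
gap = a2_created_on - a1_created_on

# Comparing only the minute field wrongly accepts a 1 h 10 min gap.
print(minute_field(gap) < 15)        # True

# Comparing the whole interval rejects it, as intended.
print(gap < timedelta(minutes=15))   # False
```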

Issue selecting timespan values in KDB

I am facing an issue while selecting on the following timespan values:
t:([] date:2#.z.d ; time: 10D21:28:47.425287000 10D12:18:23.287989000 )
date time
--------------------------------
2018.03.15 10D21:28:47.425287000
2018.03.15 10D12:18:23.287989000
When I run the following query, I do not get the second record back:
select from t where time within (12:00;13:00)
I am expecting the 2nd record from the table:
date time
-------------------------------
2018.03.15 10D12:18:23.287989000
Is the 10 in the time value 10D12:18:23.287989000 intentional?
The data does not come back because the time column (type timespan) does not hold the nanoseconds since midnight; per the table, it holds 10 days plus the nanoseconds since midnight.
To select the data on the basis of the time of day only:
q)select from t where (`time$(`date$0)+time) within (12:00;13:00)
date time
-------------------------------
2018.03.15 10D12:18:23.287989000
Try adding the date and time from the table; you will see the date moved forward by 10 days:
q)select date+time from t
date
-----------------------------
2018.03.25D21:28:47.425287000
2018.03.25D12:18:23.287989000
The timespan has the form nDhh:mm:ss.sssssssss, where n is the number of days relative to midnight. If n is 0, it's the current day; otherwise it's +/- n days (depending on whether n is positive or negative).
Try running the following; it returns the difference between the two timestamps as a timespan with n=10:
q)2018.03.25D10:12:00.000000000 - 2018.03.15D10:00:00.000000000
10D00:12:00.000000000
You should fix your timestamps (there shouldn't be a 10D). But if you can't fix the upstream data and you believe the timestamps to actually be correct, you can strip away the 10D as follows:
q)update mod[;`long$10D]time from t
date time
-------------------------------
2018.03.16 0D21:28:47.425287000
2018.03.16 0D12:18:23.287989000
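
For readers more familiar with Python, the same idea can be sketched with timedelta, which also carries a day count in addition to the time of day. This is only an analogy to the q behaviour above (timedelta has microsecond rather than nanosecond precision), and the variable names are invented for the example.

```python
from datetime import timedelta

# The two timespans from the table, rounded to microseconds.
spans = [
    timedelta(days=10, hours=21, minutes=28, seconds=47, microseconds=425287),
    timedelta(days=10, hours=12, minutes=18, seconds=23, microseconds=287989),
]

ONE_DAY = timedelta(days=1)
lo, hi = timedelta(hours=12), timedelta(hours=13)

# Comparing the raw values finds nothing: each one is more than 10 days long.
raw_matches = [s for s in spans if lo <= s <= hi]

# Taking each value modulo one day drops the day count and keeps the time of day.
time_of_day = [s % ONE_DAY for s in spans]
matches = [s for s in spans if lo <= s % ONE_DAY <= hi]

print(raw_matches)  # []
print(matches == [spans[1]])  # True
```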

inconsistency between month, day, second representation of interval data type

I understand why PostgreSQL uses month, day and second fields to represent the SQL interval data type. A month is not always the same length, and a day can have 23, 24 or 25 hours if a daylight saving time adjustment is involved. This is from the PostgreSQL documentation.
But I do not understand why this is not handled consistently for both months and days. See the following query, where the number of seconds between the two points in time can be calculated exactly:
select ('2017-01-01'::timestamp-'2016-01-01'::timestamp); -->366 days.
PostgreSQL chooses to give a result in days, not in months and not in seconds.
But why is the result in days and not seconds? The length of a day is NOT defined (it can be 23, 24 or 25 hours long). So why doesn't it give the output in seconds?
And since the length of a month is also not defined, why doesn't PostgreSQL give an output of 12 months instead of 366 days?
It does not care that the length of a day is not defined, but it obviously cares that the length of a month is not defined.
Why this asymmetry?
For further explanation, see this query:
select ('10 days'::interval-'24 hours'::interval); --> 10 days -24:00:00
You can see that PostgreSQL correctly refuses to answer with 9 days. It is well aware that days and hours cannot be interchanged. But then why does the first query return days?
I can't answer your question, but I think I can point you in the right direction. I think the book SQL-99 Complete, Really is the most accessible source for understanding SQL intervals. It's available online: https://mariadb.com/kb/en/sql-99/08-temporal-values/.
SQL standards describe two kinds of intervals: year-month intervals and day-time intervals. They do this to prevent month parts and day parts from appearing in the same interval, because, as you already know, the number of days in a month is ambiguous. The number of days in the interval '3' month depends on which three months you're talking about.
I think this is the verbose, standard SQL way to write your first query.
select cast(timestamp '2017-01-01' - timestamp '2016-01-01' as interval day to hour) as new_column;
new_column
interval day to hour
--
366 days
I suspect that you'll find that SQL standards have rules for what a SQL dbms is supposed to do when things like interval day to hour are omitted. PostgreSQL might or might not follow those rules.
Regarding "postgresql chooses to give a result in days. not in months and not in seconds.":
Standard SQL prevents month parts and day parts from appearing in the same interval. Also, the range of valid seconds is from 0 to 59.
select interval '59' second;
interval
interval second
--
00:00:59
select interval '60' second;
interval
interval second
--
00:01:00
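
The second query in the question ('10 days' minus '24 hours') becomes less surprising once you recall that PostgreSQL documents intervals as being stored in three separate fields (months, days, and microseconds). The Python sketch below is a toy model of that idea, not PostgreSQL's implementation: the Interval class is hypothetical, and it uses whole seconds instead of microseconds to stay short. Subtraction happens field by field, with no carrying between days and seconds.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Toy three-field interval; fields are never converted into one another."""
    months: int = 0
    days: int = 0
    seconds: int = 0  # simplification: whole seconds instead of microseconds

    def __sub__(self, other: "Interval") -> "Interval":
        # Field-wise subtraction: days stay days, seconds stay seconds.
        return Interval(
            self.months - other.months,
            self.days - other.days,
            self.seconds - other.seconds,
        )

ten_days = Interval(days=10)
twenty_four_hours = Interval(seconds=24 * 3600)

result = ten_days - twenty_four_hours
print(result)  # Interval(months=0, days=10, seconds=-86400)
```

In this model the result is "10 days and minus 24 hours", which mirrors the 10 days -24:00:00 shown in the question, and it is not equal to Interval(days=9).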

Joining time series events with daily 'shift' data?

What is the best practice for joining 'shift' data and other time series data in Tableau? I am working with data from multiple regions (from LA to India, UK, NY, Malaysia, Australia, China etc), and a lot of employees work past midnight.
For example, an employee has a shift from 9 PM to 6 AM on 2016-07-31. The 'report date' is 2016-07-31 but no time zone information is provided.
This employee does work and there are events (time stamps in UTC) between 2016-07-31 21:00 and 2016-08-01 06:00. When I look at the events though, 7/31 will only have the events between 21:00 and 23:59. If I filter for just July, my calculations will be skewed (the event data will be cut off at midnight even though the shift extended to 6 AM).
I need to make calculations based upon the total time an employee was actually engaged with work (productive) and the total time they were paid. The request is for this to be daily/weekly/monthly.
If anyone can help me out here or give me some talking points to explain this to my superiors, it would be appreciated. This seems like it must be a common scenario. Do I need to request a new raw data format or is there something I can do on my end?
The shift data only looks like this:
id   date        regular_hours  overtime_hours  total_hours
abc  2016-06-17  8              0.52            8.52
abc  2016-06-18  7.64           0.83            8.47
abc  2016-06-19  7.87           0.23            8.1
The event data is more detailed (30 minute interval data on events handled and the time it took to complete those events in seconds):
id   date        interval  events  event_duration
abc  2016-06-17  01:30:00  4       688
abc  2016-06-17  02:00:00  6       924
abc  2016-06-17  02:30:00  10      1320
So, you sum up the event_duration for an entire day and you get a number of seconds which was actually spent doing work. You can then compare this to amount of time that the employee was paid to see how efficient the staffing is.
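
To make that calculation concrete, here is a minimal Python sketch using only the sample rows shown above. The variable names and the "productive ratio" label are my own; the three event rows are of course only a fragment of a real day, so the resulting number is illustrative.

```python
from collections import defaultdict

# (id, date) -> total paid hours, from the shift sample.
paid_hours = {("abc", "2016-06-17"): 8.52}

# (id, date, interval, events, event_duration in seconds), from the event sample.
events = [
    ("abc", "2016-06-17", "01:30:00", 4, 688),
    ("abc", "2016-06-17", "02:00:00", 6, 924),
    ("abc", "2016-06-17", "02:30:00", 10, 1320),
]

# Sum event_duration per employee and calendar date.
worked_seconds = defaultdict(int)
for emp_id, date, _interval, _count, duration in events:
    worked_seconds[(emp_id, date)] += duration

key = ("abc", "2016-06-17")
paid_seconds = paid_hours[key] * 3600
productive_ratio = worked_seconds[key] / paid_seconds

print(worked_seconds[key])            # 2932
print(round(productive_ratio, 4))
```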
My concern is that the event data has the date and the time (UTC). The payroll data only has a date without any time zone information. This causes inaccuracies when blending data in Tableau because some shifts cross midnight. Is there a way around this or do I need to propose new data requirements?
(FYI: people have most likely been calculating this based on the date alone for years, without considering time zones. My assumption is that they just did not realize that this could cause inaccurate results.)
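
The midnight cut-off described above can be shown with a few lines of Python. Everything in this sketch is hypothetical: the event timestamps and durations are invented, and it assumes the shift's start and end are known in UTC, which the current payroll data does not provide. It only demonstrates how grouping by calendar date loses the part of the shift after midnight.

```python
from datetime import datetime

# Hypothetical shift window in UTC (the 9 PM to 6 AM example).
shift_start = datetime(2016, 7, 31, 21, 0)
shift_end = datetime(2016, 8, 1, 6, 0)

# Invented events: (UTC timestamp, event_duration in seconds).
events = [
    (datetime(2016, 7, 31, 21, 30), 600),
    (datetime(2016, 7, 31, 23, 30), 900),
    (datetime(2016, 8, 1, 2, 0), 1200),
    (datetime(2016, 8, 1, 5, 30), 300),
]

# Grouping by the report date only sees events before midnight.
by_report_date = sum(d for ts, d in events if ts.date() == shift_start.date())

# Grouping by the shift window sees the whole shift.
by_shift_window = sum(d for ts, d in events if shift_start <= ts < shift_end)

print(by_report_date)   # 1500
print(by_shift_window)  # 3000
```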

iCalendar (RFC5545) recurrence rule for year

How can I make an event occur every year on some selected days? For example: select the first 45 days, then skip 15 days, then select 30 days and skip 30 days, and repeat that select 30 / skip 30 pair for a total of five times.
RRULE:FREQ=YEARLY;BYYEARDAY=1,2,..,45,61,62,...,90,120,121....
Is this the right procedure?
A combination of FREQ=YEARLY and BYYEARDAY is indeed the way forward. Although it does not look applicable in your case, you could also consider BYWEEKNO.
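
Since typing out that many day numbers by hand is error-prone, the BYYEARDAY list could be generated. The Python sketch below is one possible reading of the pattern in the question (select 45, skip 15, then five rounds of select 30, skip 30, counting from day 1); the year_days helper and the PATTERN constant are made-up names, and the exact block boundaries are an assumption you should check against what you actually need.

```python
def year_days(pattern, start=1):
    """Expand (select, skip) pairs into a list of day-of-year numbers."""
    days = []
    day = start
    for select, skip in pattern:
        days.extend(range(day, day + select))  # the selected block
        day += select + skip                   # jump over the skipped block
    return days

# Assumed reading of the question: 45 on, 15 off, then 30 on / 30 off five times.
PATTERN = [(45, 15)] + [(30, 30)] * 5

days = year_days(PATTERN)
rrule = "RRULE:FREQ=YEARLY;BYYEARDAY=" + ",".join(str(d) for d in days)

print(len(days))   # 195
print(days[-1])    # 330
print(rrule[:40])
```

Under this reading the selected blocks are days 1-45, 61-90, 121-150, 181-210, 241-270 and 301-330, all within the 1 to 366 range that BYYEARDAY accepts.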