How to date trunc in HANA

I have a query to get the count of buses that travel less than 100 km per day. In PostgreSQL I use this query:
select day,count(*)as bus_count from(
SELECT date_trunc('hour',start_time)::timestamp::date as day,bus_id,sum(distance_two_points) as distance
FROM public.datatable where start_time >= '2015-09-05 00:00:00' and start_time <= '2015-09-05 23:59:59'
group by day,bus_id
) as A where distance<=250000 group by day
The inner query returns this result:
 day                   | bus_id | distance
-----------------------+--------+----------
 "2015-09-05 00:00:00" |      1 |   523247
 "2015-09-05 00:00:00" |      2 |   135114
 "2015-09-05 00:00:00" |      3 |   178560
 "2015-09-05 00:00:00" |      4 |   400071
 "2015-09-05 00:00:00" |      5 |   312832
 "2015-09-05 00:00:00" |      6 |   237075
I now want to run the same query (with the same results) in SAP HANA, but HANA has no date_trunc function. I also tried:
SELECT EXTRACT (DAY FROM TO_DATE (START_TIME, 'YYYY-MM-DD')) "extract" as day,
bus_id, sum(distance_two_points) as distance
FROM public.datatable
where start_time >= '2015-09-05 00:00:00' and start_time <= '2015-09-05 23:59:59'
group by day,bus_id
) as A where distance<=250000 group by day
Any help is appreciated.

SELECT SERIES_ROUND('2013-05-24', 'INTERVAL 1 YEAR', ROUND_DOWN) "result" FROM DUMMY;
SELECT SERIES_ROUND('04:25:01', 'INTERVAL 10 MINUTE') "result" FROM DUMMY;
SERIES_ROUND in SAP HANA provides functionality similar to DATE_TRUNC in other databases.
https://help.sap.com/docs/SAP_HANA_PLATFORM/4fe29514fd584807ac9f2a04f6754767/435ec476ab494ad6b8409f22abec13fe.html?version=2.0.00

Converting to a non-datetime data type is usually not a good idea (additional parsing, encoding, semantics...).
Instead use a less granular datetime data type: daydate in this case.
create column table datatab (start_time seconddate, bus_id int, distance_two_points decimal (10, 2));
insert into datatab values (to_seconddate('05.09.2015 13:12:00'), 1, 50.2);
insert into datatab values (to_seconddate('05.09.2015 13:22:00'), 1, 1.2);
insert into datatab values (to_seconddate('05.09.2015 15:32:00'), 1, 24);
insert into datatab values (to_seconddate('05.09.2015 14:22:00'), 2, 1.2);
insert into datatab values (to_seconddate('05.09.2015 16:32:00'), 2, 24);
select to_seconddate(day) as day,count(*) as bus_count from(
SELECT to_date(start_time) as day, bus_id, sum(distance_two_points) as distance
FROM datatab
where start_time between '2015-09-05 00:00:00' and '2015-09-05 23:59:59'
group by to_date(start_time),bus_id
) as A
where distance<=250000
group by day;
The inner query gives you:
DAY BUS_ID DISTANCE
2015-09-05 1 75.40
2015-09-05 2 25.20
So, your seconddate "start_time" is now aggregated as daydate and then converted back to 'seconddate'.

What I prefer is using the seconds_between() or nano100_between() function.
select now(),
add_seconds( to_date('1970.01.01', 'YYYY.MM.DD'),
round(
SECONDS_BETWEEN(
to_date('1970.01.01', 'YYYY.MM.DD'),
now()
)/3600
)*3600
)
from dummy;
This looks a bit ugly, but the to_date() is calculated just once rather than for each row, and the seconds arithmetic is close to how HANA stores the value internally, so it should be the most efficient of the lot.
It is also the most flexible: you can round by second, minute, hour, day, and so on; any unit below a year is fine.
PS: round() supports all round and truncate options.
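The epoch arithmetic above can be checked outside the database. The following is a minimal Python sketch of the same idea, not HANA code: the function name `bucket` and its `mode` parameter are made up for illustration. The "floor" mode corresponds to truncation (what date_trunc does), while "nearest" corresponds to plain rounding.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def bucket(ts: datetime, seconds: int, mode: str = "floor") -> datetime:
    """Snap ts to a multiple of `seconds` counted from EPOCH (hypothetical helper)."""
    elapsed = int((ts - EPOCH).total_seconds())
    if mode == "floor":
        n = elapsed // seconds                    # truncate, like date_trunc
    elif mode == "nearest":
        n = (elapsed + seconds // 2) // seconds   # round half up
    else:
        raise ValueError(mode)
    return EPOCH + timedelta(seconds=n * seconds)

ts = datetime(2015, 9, 5, 13, 42, 10)
print(bucket(ts, 3600))             # 2015-09-05 13:00:00
print(bucket(ts, 3600, "nearest"))  # 2015-09-05 14:00:00
print(bucket(ts, 86400))            # 2015-09-05 00:00:00
```

Changing the bucket size (60, 3600, 86400, ...) is all it takes to switch between minute, hour and day granularity, which is the flexibility the answer refers to.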

Assuming your start_time is of some date/time type (e.g. SECONDDATE), you could use
...TO_NVARCHAR(START_TIME, 'YYYY-MM-DD') AS DAY...
instead of the date_trunc... expression from PostgreSQL.

Why don't you use CAST() conversion function?
select
cast( now() as date ) myDate
from dummy;

Related

How to get financial year wise periods for a given date range

My financial year runs from 01-Jul to 30-Jun every year.
I want to find all financial-year periods within a given date range.
Say the date range is From_Date: 16-Jun-2021, To_Date: 31-Aug-2022. Then my output should look like this:
Start_Date, End_date
16-Jun-2021, 30-Jun-2021
01-Jul-2021, 30-Jun-2022
01-Jul-2022, 31-Aug-2022
Please help me with the query. The first record's Start_Date must be From_Date and the last record's End_Date must be To_Date.

This should work for the current century.
with t(fys, fye) as
(
select (y + interval '6 months')::date,
(y + interval '1 year 6 months - 1 day')::date
from generate_series ('2000-01-01'::date, '2100-01-01', interval '1 year') y
),
periods (period_start, period_end) as
(
select
case when fys < '16-Jun-2021'::date then '16-Jun-2021'::date else fys end,
case when fye > '31-Aug-2022'::date then '31-Aug-2022'::date else fye end
from t
)
select * from periods where period_start < period_end;
 period_start | period_end
--------------+------------
 2021-06-16   | 2021-06-30
 2021-07-01   | 2022-06-30
 2022-07-01   | 2022-08-31
It also works well as a parameterized query, with '16-Jun-2021' and '31-Aug-2022' replaced by parameter placeholders.
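To make the clamping logic easier to follow, here is a minimal Python sketch of the same split. It is not a translation of the SQL: it walks forward from From_Date instead of generating a century of fiscal years, and the function name `fiscal_periods` and the `start_month` parameter are assumptions made for illustration.

```python
from datetime import date, timedelta

def fiscal_periods(from_date: date, to_date: date, start_month: int = 7):
    """Split [from_date, to_date] at fiscal-year boundaries (1 Jul by default)."""
    periods = []
    start = from_date
    while start <= to_date:
        # First day of the fiscal year that follows `start`.
        year = start.year + 1 if start.month >= start_month else start.year
        next_fy_start = date(year, start_month, 1)
        # The period ends the day before that, or at to_date if that comes first.
        end = min(next_fy_start - timedelta(days=1), to_date)
        periods.append((start, end))
        start = end + timedelta(days=1)
    return periods

for start, end in fiscal_periods(date(2021, 6, 16), date(2022, 8, 31)):
    print(start, end)
# 2021-06-16 2021-06-30
# 2021-07-01 2022-06-30
# 2022-07-01 2022-08-31
```

One difference worth noting: this sketch keeps a period that is a single day long, whereas the filter `period_start < period_end` in the query drops it.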

You want to create multiple records from one record (your date range). To accomplish this, you will need some kind of helper table.
In this example I created that helper table using GENERATE_SERIES and use it to join it to your date range, with some logic to get the dates you want.
--Generate a range of fiscal years
WITH FISCAL_YEARS AS (
SELECT
CONCAT(SEQUENCE.YEAR, '-07-01')::DATE AS FISCAL_START,
CONCAT(SEQUENCE.YEAR + 1, '-06-30')::DATE AS FISCAL_END
FROM GENERATE_SERIES(2000, 2030) AS SEQUENCE (YEAR)
),
--Your date range
DATE_RANGE AS (
SELECT
'2021-06-16'::DATE AS RANGE_START,
'2022-08-31'::DATE AS RANGE_END
)
SELECT
--Case statement in case the range_start is later
--than the start of the fiscal year
CASE
WHEN RANGE_START > FISCAL_START
THEN RANGE_START
ELSE FISCAL_START
END AS START_DATE,
--Case statement in case the range_end is earlier
--than the end of the fiscal year
CASE
WHEN RANGE_END < FISCAL_END
THEN RANGE_END
ELSE FISCAL_END
END AS END_DATE
FROM FISCAL_YEARS
JOIN DATE_RANGE
--Join every fiscal year that overlaps the range, including the case
--where the whole range falls inside a single fiscal year
ON FISCAL_YEARS.FISCAL_START <= DATE_RANGE.RANGE_END
AND FISCAL_YEARS.FISCAL_END >= DATE_RANGE.RANGE_START;

Dynamic value passing in Postgres

Here is a complex query to which I need to pass some dates dynamically. For now I have hardcoded the two dates '2021-08-01' and '2022-07-31'.
I need to pass these dates dynamically so that each month looks back over 12 months of data: for month 2022-06 the dates passed would be '2021-07-01' and '2022-06-30',
and for 2022-05 they would be '2021-06-01' and '2022-05-31'.
How can we achieve this? Any suggestions or help will be much appreciated.
The query is below for reference:
WITH base as
(
SELECT created_at as period ,order_number, TRIM(email) as email ,is_first_order
FROM orders
WHERE created_at::DATE BETWEEN '2021-08-01' AND '2022-07-31'
)
,base_agg as
(
select TO_CHAR(period,'YYYY-MM') as period
,COUNT(DISTINCT email)FILTER(WHERE is_first_order IS TRUE) as new_users
,COUNT(DISTINCT order_number)FILTER(WHERE is_first_order IS FALSE) as returning_orders
FROM base
GROUP BY 1
)
,base_cumulative as
(
SELECT ROW_NUMBER() OVER(ORDER BY PERIOD DESC ) as rno
,period
,new_users
,returning_orders
,sum("new_users")over (order by "period" asc rows between unbounded preceding and current row) as "cumulative_total"
from base_agg
)
SELECT
(SELECT period FROM base_cumulative WHERE rno=1) period
,(SELECT cumulative_total FROM base_cumulative WHERE rno=1) as cumulated_customers
,SUM(returning_orders) as returning_orders
,SUM(returning_orders)/NULLIF((SELECT cumulative_total FROM base_cumulative WHERE rno=1),0) as rate
FROM base_cumulative

You can calculate the end of the current month from NOW() with date_trunc and interval arithmetic; the same approach gives the first day of the 12-month window:
select date_trunc('month', now())::date + interval '1 month - 1 day' end_of_this_month,
date_trunc('month', now())::date + interval '1 month - 1 day'::interval - '1 year'::interval + '1 day'::interval first_day_of_prev_year_month
;
Result
end_of_this_month | first_day_of_prev_year_month
---------------------+------------------------------
2022-08-31 00:00:00 | 2021-09-01 00:00:00
(1 row)
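For checking the window boundaries against the examples in the question, here is a minimal Python sketch of the same calendar arithmetic. The function name `trailing_year` is hypothetical, and it takes the reporting month as input rather than deriving it from the current time.

```python
from datetime import date, timedelta

def trailing_year(year: int, month: int):
    """Return (first_day, last_day) of the 12 months ending with year-month."""
    # Last day of the month: first day of the following month minus one day.
    if month == 12:
        next_month_start = date(year + 1, 1, 1)
    else:
        next_month_start = date(year, month + 1, 1)
    last_day = next_month_start - timedelta(days=1)
    # The window starts on that same "following month" day, one year earlier.
    first_day = date(next_month_start.year - 1, next_month_start.month, 1)
    return first_day, last_day

print(trailing_year(2022, 7))  # (datetime.date(2021, 8, 1), datetime.date(2022, 7, 31))
print(trailing_year(2022, 6))  # (datetime.date(2021, 7, 1), datetime.date(2022, 6, 30))
print(trailing_year(2022, 5))  # (datetime.date(2021, 6, 1), datetime.date(2022, 5, 31))
```

Working from the first day of the following month avoids any dependence on how many days the month has.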

DATE ADD function in PostgreSQL

I currently have the following code in Microsoft SQL Server to get users that viewed on two days in a row.
WITH userviewvideo (date, user_id) AS (
SELECT DISTINCT date, user_id
FROM clickstream_videos
WHERE event_name ='video_play'
and user_id IS NOT NULL
)
SELECT currentday.date AS date,
COUNT(currentday.user_id) AS users_view_videos,
COUNT(nextday.user_id) AS users_view_next_day
FROM userviewvideo currentday
LEFT JOIN userviewvideo nextday
ON currentday.user_id = nextday.user_id AND DATEADD(DAY, 1,
currentday.date) = nextday.date
GROUP BY currentday.date
I am trying to get the DATEADD function to work in PostgreSQL but I've been unable to figure out how to get this to work. Any suggestions?

PostgreSQL has no DATEADD function. Instead, add an interval:
+ INTERVAL '1 day'
SQL Server:
Add 1 day to the current date (November 21, 2012):
SELECT DATEADD(day, 1, GETDATE()); -- 2012-11-22 17:22:01.423
PostgreSQL:
Add 1 day to the current date (November 21, 2012):
SELECT CURRENT_DATE + INTERVAL '1 day'; -- 2012-11-22 00:00:00 (timestamp)
SELECT CURRENT_DATE + 1; -- 2012-11-22 (date)
http://www.sqlines.com/postgresql/how-to/dateadd
EDIT:
If the length of time is dynamic (for example, held in a column), you can build a string and cast it to an interval:
+ (col_days || ' days')::interval

You can use date + 1 to do the equivalent of dateadd(), but I do not think that your query does what you want to do.
You should use window functions, instead:
with plays as (
select distinct date, user_id
from clickstream_videos
where event_name = 'video_play'
and user_id is not null
), nextdaywatch as (
select date, user_id,
case
when lead(date) over (partition by user_id
order by date) = date + 1 then 1
else 0
end as user_view_next_day
from plays
)
select date,
count(*) as users_view_videos,
sum(user_view_next_day) as users_view_next_day
from nextdaywatch
group by date
order by date;
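To see what the query is meant to count, here is a minimal Python sketch of the same "came back the next day" logic on a tiny in-memory data set. The function name `next_day_retention` and the sample rows are made up for illustration; they are not from the question's table.

```python
from collections import defaultdict
from datetime import date, timedelta

def next_day_retention(plays):
    """plays: iterable of (day, user_id) pairs.
    Returns {day: (viewers, viewers_who_also_viewed_the_next_day)}."""
    seen = set(plays)  # de-duplicate, like SELECT DISTINCT
    stats = defaultdict(lambda: [0, 0])
    for day, user in seen:
        stats[day][0] += 1
        if (day + timedelta(days=1), user) in seen:
            stats[day][1] += 1
    return {day: tuple(counts) for day, counts in sorted(stats.items())}

# Hypothetical sample data: user "a" views on two consecutive days, "b" does not.
plays = [
    (date(2020, 1, 1), "a"), (date(2020, 1, 2), "a"),
    (date(2020, 1, 1), "b"), (date(2020, 1, 3), "b"),
    (date(2020, 1, 1), "a"),  # duplicate event on the same day
]
for day, counts in next_day_retention(plays).items():
    print(day, counts)
# 2020-01-01 (2, 1)
# 2020-01-02 (1, 0)
# 2020-01-03 (1, 0)
```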

Generate dates for postgres

I have a table
and a range from '2019-01-02' to '2019-01-04'.
I need to generate the ID and the dates for each row of my table whose started_at and ended_at (nullable) fall within the given range.
The result must look like this:
ID 4 from the table is not included in the result because its started_at and ended_at are not within the range '2019-01-02' to '2019-01-04'.
I need a query that does this in Postgres.

Use generate_series()
select t.id, g.dt::date
from the_table t
cross join generate_series(t.started_at::date + 1,
least(t.ended_at::date, date '2019-01-04'),
interval '1 day') as g(dt)
where t.started_at >= date '2019-01-02'
and t.started_at < date '2019-01-04';

This variant worked:
select t.id, g.dt::date from the_table t
cross join generate_series(t.started_at::date + 1,
least(t.ended_at::date, date '2019-01-04'), interval '1 day') as g(dt)
where dt >= date '2019-01-02' and dt < date '2019-01-04';

postgresql list of time slots from 'Monday' | 09:00:00 | 11:00:00

I’m building a booking system where a user sets their availability (e.g. "I’m available Mondays from 9am to 11am, Tuesdays from 9am to 5pm", etc.), and I need to generate a list of time slots 15 minutes apart from that availability.
I have the following table (but am flexible to changing this):
availabilities(day_of_week text, start_time: time, end_time: time)
which returns records like:
'Monday' | 09:00:00 | 11:00:00
'Monday' | 13:00:00 | 17:00:00
'Tuesday' | 08:00:00 | 17:00:00
So I’m trying to build a stored procedure to generate a list of time slots. So far I've got this:
create or replace function timeslots ()
return setof timeslots as $$
declare
rec record;
begin
for rec in select * from availabilities loop
/*
convert 'Monday' | 09:00:00 | 11:00:00 into:
2020-02-03 09:00:00
2020-02-03 09:15:00
2020-02-03 09:30:00
2020-02-03 09:45:00
2020-02-03 10:00:00
and so on...
*/
return next
end loop
$$ language plpgsql stable;
I return a setof instead of a table as I'm using Hasura and it needs to return a setof so I just create a blank table.
I think I'm on the right track but am currently stuck on two things:
How do I create a timestamp from 'Monday' 09:00:00 for the next Monday, as I only care about timeslots from today onwards?
How do I convert 'Monday' | 09:00:00 | 11:00:00 into a list of time slots 15 mins apart?

"how do I create a timestamp from 'Monday' 09:00:00 for the next monday as I only care about timeslots from today onwards?"
You can use date_trunc for this:
SELECT date_trunc('week', current_date) + interval '1 week';
From the docs re week:
"The number of the ISO 8601 week-numbering week of the year. By definition, ISO weeks start on Mondays"
So taking this value and adding a week gives next Monday (you may need to amend this behaviour depending on what should happen when today is Monday).
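The date_trunc('week', ...) + 1 week arithmetic can be checked with a minimal Python sketch. The function name `next_monday` is made up for illustration; like the SQL, it always returns the Monday of the following ISO week, even when today is already a Monday.

```python
from datetime import date, timedelta

def next_monday(today: date) -> date:
    """Monday of the week after the ISO week containing `today`."""
    this_monday = today - timedelta(days=today.weekday())  # weekday(): Monday == 0
    return this_monday + timedelta(weeks=1)

print(next_monday(date(2020, 2, 2)))  # 2020-02-03 (2020-02-02 is a Sunday)
print(next_monday(date(2020, 2, 3)))  # 2020-02-10 (a Monday maps to the following Monday)
```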
"how do I convert 'Monday' | 09:00:00 | 11:00:00 into a list of time slots 15 mins apart?"
This is a little trickier; generate_series will give you the timeslots, but the trick is getting them into a result set. The following should do the job (I have included your sample data; change the values part to refer to your table):
with avail_times as (
select
date_trunc('week', current_date) + interval '1 week' + case day_of_week when 'Monday' then interval '0 day' when 'Tuesday' then interval '1 day' end + start_time as start_time,
date_trunc('week', current_date) + interval '1 week' + case day_of_week when 'Monday' then interval '0 day' when 'Tuesday' then interval '1 day' end + end_time as end_time
from
(
values
('Monday','09:00:00'::time,'11:00:00'::time),
('Monday','13:00:00'::time,'17:00:00'::time),
('Tuesday','08:00:00'::time,'17:00:00'::time)
) as availabilities (day_of_week,
start_time,
end_time) )
select
g.ts
from
(
select
start_time,
end_time
from
avail_times) avail,
generate_series(avail.start_time, avail.end_time - interval '1ms', '15 minutes') g(ts);
A few notes:
The CTE avail_times is used to simplify things; it generates two columns (start_time and end_time) which are the full timestamps (so including the date). In this example the first row is "2020-02-03 09:00:00, 2020-02-03 11:00:00" (I'm running this on 2020-02-02 so 2020-02-03 is next Monday).
The way I'm converting 'monday' etc to a day of the week is a bit of a hack (and I have not bothered to do the full week); there is probably a better way but storing the day of week as an integer would make this simpler.
I subtract 1ms from the end time because I'm assuming you don't want the end time itself in the result set.
The main query uses a LATERAL subquery.
Additional question
"how to adjust this so I can pass in a start and end date so I can get time slots for a particular period"
You could do something like the following (just adjust the dates CTE to return whatever days you want to include; you could convert it to a function or just pass the dates in as parameters).
Note that, as @Belayer mentions, my original solution did not cater for shifts over midnight, so this addresses that too.
with dates as (
select
day
from
generate_series('2020-02-20'::date, '2020-03-10'::date, '1 day') as day ),
availabilities as (
select
*
from
(
values (1,'09:00:00'::time,'11:00:00'::time),
(1,'13:00:00'::time,'17:00:00'::time),
(2,'08:00:00'::time,'17:00:00'::time),
(3,'23:00:00'::time,'01:00:00'::time)
) as availabilities
(day_of_week, -- 1 = monday
start_time,
end_time) ) ,
avail_times as (
select
d.day + start_time as start_time,
case
end_time > start_time
when true then d.day
else d.day + interval '1 day' end + end_time as end_time
from
availabilities a
inner join dates d on extract(ISODOW from d.day) = a.day_of_week )
select
g.ts
from
(
select
start_time,
end_time
from
avail_times) avail,
generate_series(avail.start_time, avail.end_time - interval '1ms', '15 minutes') g(ts)
order by
g.ts;
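The slot generation, including the over-midnight case, can be summarised in a minimal Python sketch. It is an illustration under stated assumptions rather than a port of the query: the function name `slots` and the `step_minutes` parameter are hypothetical, and it handles one availability row for one calendar day.

```python
from datetime import date, datetime, time, timedelta

def slots(day: date, start: time, end: time, step_minutes: int = 15):
    """Slot start times in [start, end); an end at or before start means the next day."""
    begin = datetime.combine(day, start)
    finish = datetime.combine(day, end)
    if finish <= begin:
        finish += timedelta(days=1)  # shift crosses midnight
    step = timedelta(minutes=step_minutes)
    out = []
    while begin < finish:            # end time itself is excluded
        out.append(begin)
        begin += step
    return out

morning = slots(date(2020, 2, 3), time(9, 0), time(11, 0))
print(len(morning), morning[0], morning[-1])  # 8 2020-02-03 09:00:00 2020-02-03 10:45:00

night = slots(date(2020, 2, 26), time(23, 0), time(1, 0))
print(len(night), night[0], night[-1])        # 8 2020-02-26 23:00:00 2020-02-27 00:45:00
```

The half-open interval plays the same role as subtracting 1ms from the end time in the query.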

The following uses many of the techniques mentioned by @Brits. That answer presents some very good information, so I'll not repeat it but suggest you review it (and the links).
I do, however, take a slightly different approach. First, a couple of table changes. I use the ISO day of week 1-7 (Monday-Sunday) rather than the day name. The day name is easily extracted from the date later.
I also use interval instead of time for the start and end times. (A time data type works for most scenarios, but there is one where it doesn't; more on that later.)
One thing your description does not make clear is whether the ending time is included in the available time or not. If included, the last interval would be 11:00-11:15. If excluded, the last interval is 10:45-11:00. I have assumed it is excluded. In the final results the end time is to be read as "up to but not including".
-- setup
create table availabilities (weekday integer, start_time interval, end_time interval);
insert into availabilities (weekday , start_time , end_time )
select wkday
, start_time
, end_time
from (select *
from (values (1, '09:00'::interval, '11:00'::interval)
, (1, '13:00'::interval, '17:00'::interval)
, (2, '08:00'::interval, '17:00'::interval)
, (3, '08:30'::interval, '10:45'::interval)
, (4, '10:30'::interval, '12:45'::interval)
) as v(wkday,start_time,end_time)
) r ;
select * from availabilities;
The Query
It begins with a CTE (next_week) that generates an entry for each day of next week, beginning with Monday, together with the corresponding ISO day number. The main query joins these with the availabilities table to pick up the times for matching days. Finally, that result is cross joined with a generated series of timestamps to get the 15-minute intervals.
-- Main
with next_week (wkday,tm) as
(SELECT n+1, date_trunc('week', current_date) + interval '1 week' + n*interval '1 day'
from generate_series (0, 6) n
)
select to_char(gdtm,'Day'), gdtm start_time, gdtm+interval '15 min' end_time
from ( select wkday, tm, start_time, end_time
from next_week nw
join availabilities av
on (av.weekday = nw.wkday)
) s
cross join lateral
generate_series(start_time+tm, end_time+tm- interval '1 sec', interval '15 min') gdtm ;
The outlier
As mentioned, there is one scenario where a time data type does not work satisfactorily, but you may not need it. What happens when a shift worker says their available time is 23:00-01:30? Believe me, when a shift worker goes to work at 22:00 on Friday, 01:30 is still Friday night, even though the calendar might not agree. (I worked that shift for many years.) The following, using interval, handles that issue. It loads the same kind of data as before, with an addition for this case.
insert into availabilities (weekday, start_time, end_time )
select wkday
, start_time
, end_time + case when end_time < start_time
then interval '1 day'
else interval '0 day'
end
from (select *
from (values (1, '09:00'::interval, '11:00'::interval)
, (1, '13:00'::interval, '17:00'::interval)
, (2, '08:00'::interval, '17:00'::interval)
, (3, '08:30'::interval, '10:45'::interval)
, (5, '23:30'::interval, '02:30'::interval) -- Friday Night - Saturday Morning
) as v(wkday,start_time,end_time)
) r
;
select * from availabilities;
Hope this helps.