How to get count for every 1 hour interval - postgresql

select
count("Status") as Total_Count
from "dbo"
where "Status" = 'Pass'
and "StartDateTime" BETWEEN '2020-11-01 15:00:00' AND '2020-11-01 16:00:00'
group by "Status"
How can I get the count for every 1-hour interval, as in the image above? Currently I am changing the time interval manually. I want the counts from 12 AM to 12 AM the next day, in 1-hour intervals.

demo: db<>fiddle
When you truncate the start time with date_trunc() at the hour part, all times will be normalized to full hours. This can be used as the GROUP BY criterion.
SELECT
COUNT(*)
FROM
t
GROUP BY date_trunc('hour', starttime)
To format the time column as you expect, you can use the to_char() function:
SELECT
to_char(date_trunc('hour', starttime), 'HH12:MI:SS AM') || ' - ' || to_char(date_trunc('hour', starttime) + interval '1 hour', 'HH12:MI:SS AM'),
COUNT(*)
FROM
t
GROUP BY date_trunc('hour', starttime)
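One thing to watch with grouping alone: hours that contain no rows simply do not appear in the result. If every hour of the day should be listed, a possible sketch is to generate the hourly buckets with generate_series() and LEFT JOIN the data onto them. The inline sample rows and the names t, starttime and status are assumptions for illustration, not the asker's real table.

```sql
-- Hypothetical sample data standing in for the real table
WITH t(starttime, status) AS (
    VALUES (timestamp '2020-11-01 15:10:00', 'Pass'),
           (timestamp '2020-11-01 15:40:00', 'Pass'),
           (timestamp '2020-11-01 16:05:00', 'Fail'),
           (timestamp '2020-11-01 16:20:00', 'Pass')
)
SELECT
    h.hour_start,
    COUNT(t.starttime) AS total_count   -- counts matched rows only, so empty hours give 0
FROM generate_series(
         timestamp '2020-11-01 00:00:00',
         timestamp '2020-11-01 23:00:00',
         interval '1 hour'
     ) AS h(hour_start)
LEFT JOIN t
       ON t.starttime >= h.hour_start
      AND t.starttime <  h.hour_start + interval '1 hour'
      AND t.status = 'Pass'
GROUP BY h.hour_start
ORDER BY h.hour_start;
```

This returns 24 rows; with the sample data the 15:00 bucket has 2, the 16:00 bucket has 1, and every other hour has 0. The status filter sits in the ON clause rather than in WHERE so that empty hours are kept.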

Related

Dynamic value passing in Postgres

Here is a complex query to which I need to pass some dates dynamically. For now I have hardcoded the two dates '2021-08-01' and '2022-07-31'.
I need to pass these dates dynamically so that for the next month processed, i.e. 2022-06, the dates passed are '2021-07-01' and '2022-06-30'; basically, the 12 months of data ending with that month.
For 2022-05 the dates passed should be '2021-06-01' and '2022-05-31'.
How can we achieve this? Any suggestions or help would be much appreciated.
The query is below for reference:
WITH base as
(
SELECT created_at as period ,order_number, TRIM(email) as email ,is_first_order
FROM orders
WHERE created_at::DATE BETWEEN '2021-08-01' AND '2022-07-31'
)
,base_agg as
(
select TO_CHAR(period,'YYYY-MM') as period
,COUNT(DISTINCT email)FILTER(WHERE is_first_order IS TRUE) as new_users
,COUNT(DISTINCT order_number)FILTER(WHERE is_first_order IS FALSE) as returning_orders
FROM base
GROUP BY 1
)
,base_cumulative as
(
SELECT ROW_NUMBER() OVER(ORDER BY PERIOD DESC ) as rno
,period
,new_users
,returning_orders
,sum("new_users")over (order by "period" asc rows between unbounded preceding and current row) as "cumulative_total"
from base_agg
)
SELECT
(SELECT period FROM base_cumulative WHERE rno=1) period
,(SELECT cumulative_total FROM base_cumulative WHERE rno=1) as cumulated_customers
,SUM(returning_orders) as returning_orders
,SUM(returning_orders)/NULLIF((SELECT cumulative_total FROM base_cumulative WHERE rno=1),0) as rate
FROM base_cumulative
You can calculate the end of the current month from NOW() with date_trunc() and interval arithmetic; the same approach works for the other boundary:
select date_trunc('month', now())::date + interval '1 month - 1 day' end_of_this_month,
date_trunc('month', now())::date + interval '1 month - 1 day'::interval - '1 year'::interval + '1 day'::interval first_day_of_prev_year_month
;
Result
end_of_this_month | first_day_of_prev_year_month
---------------------+------------------------------
2022-08-31 00:00:00 | 2021-09-01 00:00:00
(1 row)
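If the window has to follow an arbitrary reporting month rather than NOW(), the same arithmetic can hang off a single parameter. This is only a sketch: params, month_start, window_start and window_end are made-up names, and the month is passed as the first day of that month.

```sql
-- month_start is a hypothetical parameter: the first day of the reporting month
WITH params AS (
    SELECT date '2022-07-01' AS month_start
)
SELECT
    (month_start - interval '11 months')::date              AS window_start,
    (month_start + interval '1 month' - interval '1 day')::date AS window_end
FROM params;
-- window_start = 2021-08-01, window_end = 2022-07-31
```

With date '2022-06-01' the result is 2021-07-01 and 2022-06-30, which matches the dates in the question. The two expressions could then replace the hardcoded literals in the BETWEEN clause of the base CTE.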

Postgresql - query to get difference in data count

I have two tables, today's_table and yeterday's_table.
I need to compare the data for a 15-minute interval at exactly the same times today and yesterday.
For example, for the data below, let's say I need to check between 00:00:00 and 00:15:00 on 20201202 and 20201202. The difference should come out as '3', since yesterday's_table has 8 records and today's_table has 5 records.
today's_table:
Yesterday's table:
I tried something like this (assume now() is 00:15:00):
select count(*) from yeterday's_table where time between now() - interval "24 hours" and now() - interval "23 hours 45 mins"
minus
select count(*) from today's_table where time = now() - interval "15 minutes";
Is there any other way to do this?
You can easily do this with subqueries:
SELECT b.c - a.c
FROM (select count(*) as c from yeterdays_table where time between now() - interval '24 hours' and now() - interval '23 hours 45 mins') a,
(select count(*) as c from todays_table where time = now() - interval '15 minutes') b;
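Bear in mind that interval literals must be single-quoted (double quotes denote identifiers in PostgreSQL), and that unquoted table names cannot contain an apostrophe. Also note that PostgreSQL has no MINUS operator; its set-difference operator is EXCEPT, and it compares rows rather than subtracting the counts.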
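For what it's worth, the query in the question counts today's rows with an equality test and yesterday's rows over a range, so the two windows do not line up. A sketch with matching half-open 15-minute windows might look like the following. The reference time is fixed instead of now() so the result is reproducible, and the two CTEs of generated rows (8 for yesterday, 5 for today) are stand-ins I made up for the real tables.

```sql
WITH params AS (
    SELECT timestamp '2020-12-02 00:15:00' AS ref   -- stands in for now()
),
yesterdays_rows(ts) AS (   -- hypothetical data: 8 rows between 00:00 and 00:14
    SELECT generate_series(timestamp '2020-12-01 00:00',
                           timestamp '2020-12-01 00:14',
                           interval '2 minutes')
),
todays_rows(ts) AS (       -- hypothetical data: 5 rows between 00:00 and 00:12
    SELECT generate_series(timestamp '2020-12-02 00:00',
                           timestamp '2020-12-02 00:12',
                           interval '3 minutes')
)
SELECT
    (SELECT count(*)
       FROM yesterdays_rows y, params p
      WHERE y.ts >= p.ref - interval '1 day 15 minutes'
        AND y.ts <  p.ref - interval '1 day')
  - (SELECT count(*)
       FROM todays_rows t, params p
      WHERE t.ts >= p.ref - interval '15 minutes'
        AND t.ts <  p.ref) AS diff;
-- diff = 3 (8 rows yesterday minus 5 rows today)
```

Using >= and < instead of BETWEEN avoids counting a row that falls exactly on a boundary in two adjacent windows.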

postgresql list of time slots from 'Monday' | 09:00:00 | 11:00:00

I’m building a booking system where a user sets their availability, e.g. "I’m available Mondays from 9am to 11am, Tuesdays from 9am to 5pm", etc. I need to generate a list of time slots 15 minutes apart from that availability.
I have the following table (but am flexible to changing this):
availabilities(day_of_week text, start_time: time, end_time: time)
which returns records like:
‘Monday’ | 09:00:00 | 11:00:00
‘Monday’ | 13:00:00 | 17:00:00
‘Tuesday’ | 08:00:00 | 17:00:00
So I’m trying to build a stored procedure to generate a list of time slots. So far I've got this:
create or replace function timeslots ()
return setof timeslots as $$
declare
rec record;
begin
for rec in select * from availabilities loop
/*
convert 'Monday' | 09:00:00 | 11:00:00 into:
2020-02-03 09:00:00
2020-02-03 09:15:00
2020-02-03 09:30:00
2020-02-03 09:45:00
2020-02-03 10:00:00
and so on...
*/
return next
end loop
$$ language plpgsql stable;
I return a setof instead of a table because I'm using Hasura, which needs the function to return a setof, so I just created a blank table for it.
I think I'm on the right track but am currently stuck on:
How do I create a timestamp from 'Monday' 09:00:00 for the next Monday, as I only care about time slots from today onwards?
How do I convert 'Monday' | 09:00:00 | 11:00:00 into a list of time slots 15 minutes apart?
how do I create a timestamp from 'Monday' 09:00:00 for the next monday
as I only care about timeslots from today onwards?
You can use date_trunc for this (see this question for more info):
SELECT date_trunc('week', current_date) + interval '1 week';
From the docs re week:
The number of the ISO 8601 week-numbering week of the year. By
definition, ISO weeks start on Mondays
So taking this value and adding a week gives next Monday (you may need to amend this behaviour depending on what you want to happen if today is Monday!).
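If "today counts when it is already the right weekday" is the behaviour you want, one possible alternative is modular arithmetic on the ISO day-of-week number. This is a sketch, not part of the approach above; params, today and target_isodow are illustrative names, and today is fixed to a known Sunday so the output is reproducible (swap in current_date for real use).

```sql
-- target_isodow: 1 = Monday ... 7 = Sunday (ISO numbering)
WITH params AS (
    SELECT date '2020-02-02' AS today,   -- a Sunday
           1                 AS target_isodow
)
SELECT today
       + ((target_isodow - EXTRACT(isodow FROM today)::int + 7) % 7)
       AS next_occurrence
FROM params;
-- next_occurrence = 2020-02-03 (the following Monday)
```

When today already falls on the target weekday the offset is 0, so the result is today itself; adding 7 in that case would give the strictly-next occurrence instead.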
how do I convert 'Monday' | 09:00:00 | 11:00:00 into a list of time
slots 15 mins apart?
This is a little trickier; generate_series will give you the time slots, but the trick is getting them into a result set. The following should do the job (I have included your sample data; change the values part to refer to your table) - dbfiddle:
with avail_times as (
select
date_trunc('week', current_date) + interval '1 week' + case day_of_week when 'Monday' then interval '0 day' when 'Tuesday' then interval '1 day' end + start_time as start_time,
date_trunc('week', current_date) + interval '1 week' + case day_of_week when 'Monday' then interval '0 day' when 'Tuesday' then interval '1 day' end + end_time as end_time
from
(
values
('Monday','09:00:00'::time,'11:00:00'::time),
('Monday','13:00:00'::time,'17:00:00'::time),
('Tuesday','08:00:00'::time,'17:00:00'::time)
) as availabilities (day_of_week,
start_time,
end_time) )
select
g.ts
from
(
select
start_time,
end_time
from
avail_times) avail,
generate_series(avail.start_time, avail.end_time - interval '1ms', '15 minutes') g(ts);
A few notes:
The CTE avail_times is used to simplify things; it generates two columns (start_time and end_time) which are the full timestamps (so including the date). In this example the first row is "2020-02-03 09:00:00, 2020-02-03 11:00:00" (I'm running this on 2020-02-02 so 2020-02-03 is next Monday).
The way I'm converting 'Monday' etc. to a day of the week is a bit of a hack (and I have not bothered to do the full week); there is probably a better way, and storing the day of the week as an integer would make this simpler.
I subtract 1ms from the end time because I'm assuming you don't want the end time itself in the result set.
The main query is using a LATERAL Subquery. See this question for more info.
Additional Question
how to adjust this so I can pass in a start and end date so I can get
time slots for a particular period
You could do something like the following (just adjust the dates CTE to return whatever days you want to include; you could convert to a function or just pass the dates in as parameters).
Note that, as @Belayer mentions, my original solution did not cater for shifts over midnight, so this addresses that too.
with dates as (
select
day
from
generate_series('2020-02-20'::date, '2020-03-10'::date, '1 day') as day ),
availabilities as (
select
*
from
(
values (1,'09:00:00'::time,'11:00:00'::time),
(1,'13:00:00'::time,'17:00:00'::time),
(2,'08:00:00'::time,'17:00:00'::time),
(3,'23:00:00'::time,'01:00:00'::time)
) as availabilities
(day_of_week, -- 1 = monday
start_time,
end_time) ) ,
avail_times as (
select
d.day + start_time as start_time,
case
end_time > start_time
when true then d.day
else d.day + interval '1 day' end + end_time as end_time
from
availabilities a
inner join dates d on extract(ISODOW from d.day) = a.day_of_week )
select
g.ts
from
(
select
start_time,
end_time
from
avail_times) avail,
generate_series(avail.start_time, avail.end_time - interval '1ms', '15 minutes') g(ts)
order by
g.ts;
The following uses many of the techniques mentioned by @Brits. They present some very good information, so I'll not repeat it but suggest you review it (and the links).
I do, however, take a slightly different approach. First, a couple of table changes. I use the ISO day of week 1-7 (Monday-Sunday) rather than the day name. The day name is easily extracted from the date later.
I also use interval instead of time for the start and end times. (A time data type works for most scenarios, but there is one where it doesn't; more on that later.)
One thing your description does not make clear is whether the ending time is included in the available time or not. If included, the last interval would be 11:00-11:15. If excluded, the last interval is 10:45-11:00. I have assumed it is excluded. In the final results the end time is to be read as "up to but not including".
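To make the "up to but not including" reading concrete, here is a tiny standalone sketch of the exclusive-end trick for a single 09:00-11:00 window. The date is arbitrary and the column names slot_start and slot_end are just for illustration.

```sql
SELECT ts                          AS slot_start,
       ts + interval '15 minutes'  AS slot_end
FROM generate_series(
         timestamp '2020-02-03 09:00',
         timestamp '2020-02-03 11:00' - interval '1 second',  -- keeps 11:00 out of the series
         interval '15 minutes'
     ) AS g(ts);
-- 8 rows: the first slot starts at 09:00, the last starts at 10:45 and ends at 11:00
```

Without the one-second subtraction, generate_series would also emit 11:00 as a slot start, giving the 11:00-11:15 slot described above.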
-- setup
create table availabilities (weekday integer, start_time interval, end_time interval);
insert into availabilities (weekday , start_time , end_time )
select wkday
, start_time
, end_time
from (select *
from (values (1, '09:00'::interval, '11:00'::interval)
, (1, '13:00'::interval, '17:00'::interval)
, (2, '08:00'::interval, '17:00'::interval)
, (3, '08:30'::interval, '10:45'::interval)
, (4, '10:30'::interval, '12:45'::interval)
) as v(wkday,start_time,end_time)
) r ;
select * from availabilities;
The Query
It begins with a CTE (next_week) that generates an entry for each day of next week, beginning with Monday, along with the appropriate ISO day number. The main query joins these with the availabilities table to pick up the times for matching days. Finally, that result is cross joined with a generated series of timestamps to get the 15-minute intervals.
-- Main
with next_week (wkday,tm) as
(SELECT n+1, date_trunc('week', current_date) + interval '1 week' + n*interval '1 day'
from generate_series (0, 6) n
)
select to_char(gdtm,'Day'), gdtm start_time, gdtm+interval '15 min' end_time
from ( select wkday, tm, start_time, end_time
from next_week nw
join availabilities av
on (av.weekday = nw.wkday)
) s
cross join lateral
generate_series(start_time+tm, end_time+tm- interval '1 sec', interval '15 min') gdtm ;
The outlier
As mentioned, there is one scenario where a time data type does not work satisfactorily, though you may not need it. What happens when a shift worker says their available time is 23:00-01:30? Believe me, when a shift worker goes to work at 22:00 on Friday, 01:30 is still Friday night, even though the calendar might not agree. (I worked that shift for many years.) The following, using interval, handles that issue. It loads the same data as before, with an addition for this case.
insert into availabilities (weekday, start_time, end_time )
select wkday
, start_time
, end_time + case when end_time < start_time
then interval '1 day'
else interval '0 day'
end
from (select *
from (values (1, '09:00'::interval, '11:00'::interval)
, (1, '13:00'::interval, '17:00'::interval)
, (2, '08:00'::interval, '17:00'::interval)
, (3, '08:30'::interval, '10:45'::interval)
, (5, '23:30'::interval, '02:30'::interval) -- Friday Night - Saturday Morning
) as v(wkday,start_time,end_time)
) r
;
select * from availabilities;
Hope this helps.

getting dynamic timezone offset based on month

I've a query
"TO_CHAR((TIMEZONE('#{ tz_offset }', created_at) + (#{ months.to_s } * INTERVAL '1 month')))"
The final query becomes:
"SELECT sum(price) as sum_total, min(TO_CHAR((TIMEZONE('-05:00', created_at) + (0 * INTERVAL '1 month')), 'YYYY-MM')) as formatted_date, min(TO_CHAR((TIMEZONE('-05:00', created_at) + (0 * INTERVAL '1 month')), 'Mon/YY')) as key FROM items WHERE created_at BETWEEN '2018-09-01 04:00:00' AND '2020-02-01 03:59:59' GROUP BY TO_CHAR((TIMEZONE('-05:00', created_at) + (0 * INTERVAL '1 month')), 'YYYY-MM') ORDER BY TO_CHAR((TIMEZONE('-05:00', created_at) + (0 * INTERVAL '1 month')), 'YYYY-MM')"
As you can see the tz_offset becomes -05:00
However, the problem I'm running into is that tz_offset is static, while the real offset can change based on the month. For example, CST time is -05:00 in July but -06:00 in August. I would like to make sure the amounts fall into the correct months, taking daylight saving time into account. Is it possible for Postgres to calculate this tz_offset based on the time zone?
I was able to get it to work using
created_at AT TIME ZONE <tz_name>
Rather than the offset, use the name of the time zone, for example America/Chicago. That includes the daylight savings time setting.
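A small sketch of why the named zone matters for month bucketing. It assumes created_at is a timestamptz column (for a plain timestamp, AT TIME ZONE converts in the opposite direction); the two sample instants and the aliases chicago_local and month_bucket are made up for illustration.

```sql
SELECT
    ts AT TIME ZONE 'America/Chicago'                     AS chicago_local,
    to_char(ts AT TIME ZONE 'America/Chicago', 'YYYY-MM') AS month_bucket
FROM (VALUES
    (timestamptz '2019-07-01 04:30:00+00'),   -- summer: Chicago is UTC-5
    (timestamptz '2019-12-01 05:30:00+00')    -- winter: Chicago is UTC-6
) AS v(ts);
-- chicago_local       | month_bucket
-- 2019-06-30 23:30:00 | 2019-06
-- 2019-11-30 23:30:00 | 2019-11
```

Both instants land at 23:30 on the last day of the previous month in Chicago local time. A fixed UTC-5 offset would push the December instant to 00:30 on 2019-12-01 and into the wrong month.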

How to calculate how many intervals at given daterange? simpler version

I can write:
select count(*) from generate_series(
'2019-03-01'::date, '2019-05-01'::date,
interval '3 day 1 hour'
)
-- exclude upper boundary
where generate_series <> date '2019-05-01'::date;
Is there a way to do it simpler? like:
daterange( '2019-03-01', '2019-05-01' ) / interval '3 day 1 hour'
You can use
EXTRACT(epoch FROM some_interval)
to get an interval's duration in seconds.
You could use that as follows:
SELECT EXTRACT(epoch FROM '2019-05-01'::timestamptz - '2019-03-01'::timestamptz)
/ EXTRACT(epoch FROM interval '3 day 1 hour');
Note that this only gives correct answers for intervals measured in days or smaller units; for months and longer you have to go with your original solution.
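The division returns a fraction (about 20.05 here), so to reproduce the count from the question it needs rounding. A sketch, assuming the upper boundary is excluded as in the question: rounding up with ceil() gives the number of series points before the end. I use timestamp rather than timestamptz so that a daylight saving change inside the range cannot shorten the span; the aliases by_division and by_series are illustrative.

```sql
SELECT
    ceil(
        EXTRACT(epoch FROM timestamp '2019-05-01' - timestamp '2019-03-01')
      / EXTRACT(epoch FROM interval '3 days 1 hour')
    ) AS by_division,
    (SELECT count(*)
       FROM generate_series(timestamp '2019-03-01',
                            timestamp '2019-05-01',
                            interval '3 days 1 hour') AS g(ts)
      WHERE g.ts < timestamp '2019-05-01') AS by_series;
-- both columns give 21
```

The span is 61 days (5270400 seconds) and the step is 73 hours (262800 seconds), so the quotient is roughly 20.05 and ceil() gives 21, matching the generate_series count.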