How many seconds passed by grouped by hour between two dates - postgresql

Suppose I have a start date of 2016-06-19 09:30:00 and an end date of 2016-06-19 10:20:00.
I would like to get the number of seconds that elapsed within each hour, either up to the start of the next hour or up to the end time, grouped by hour and date. The result I'm trying to achieve (so far without success) would look like this:
hour | date | time_elapsed_in_seconds
9 | 2016-06-19 | 1800 (there are 1800 seconds between 09:30:00 and 10:00:00)
10 | 2016-06-19 | 1200 (there are 1200 seconds between 10:00:00 and 10:20:00)

Try this :
with table1 as (
  select '2016-06-19 09:30:00'::timestamp without time zone start_date,
         '2016-06-19 10:20:00'::timestamp without time zone end_date
)
select extract(hour from the_hour) "hour",
       the_hour::date "date",
       extract(epoch from (new_end - new_start)) "time_elapsed"
from (
  select the_hour,
         case when date_trunc('hour', start_date) = the_hour then start_date else the_hour end new_start,
         case when date_trunc('hour', end_date) = the_hour then end_date else the_hour + '1 hour'::interval end new_end
  from (
    select generate_series(date_trunc('hour', start_date), end_date, '1 hour'::interval) the_hour,
           start_date, end_date
    from table1
  ) a
) b
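To make the clamping logic easier to follow, here is a minimal sketch of the same idea in Python rather than SQL. The function name seconds_per_hour is made up for illustration; it walks hour by hour and clips each bucket to the start and end times, as the two CASE expressions do.

```python
from datetime import datetime, timedelta

def seconds_per_hour(start, end):
    """Return (hour, date, seconds) rows for each clock hour touched by [start, end)."""
    rows = []
    # Equivalent of date_trunc('hour', start_date)
    bucket = start.replace(minute=0, second=0, microsecond=0)
    while bucket < end:
        next_bucket = bucket + timedelta(hours=1)
        seg_start = max(start, bucket)    # first bucket begins at start, not on the hour
        seg_end = min(end, next_bucket)   # last bucket stops at end, not on the next hour
        rows.append((bucket.hour, bucket.date(), int((seg_end - seg_start).total_seconds())))
        bucket = next_bucket
    return rows

rows = seconds_per_hour(datetime(2016, 6, 19, 9, 30), datetime(2016, 6, 19, 10, 20))
for hour, day, seconds in rows:
    print(hour, day, seconds)
```

One difference from the query: when the end time falls exactly on the hour, generate_series still emits that hour and the query returns a row with 0 seconds, whereas this loop stops before it.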

Related

Week Day Starting from a Certain Day (01 Jan 2021) in Postgres

I am trying to get week numbers in a Year starting from a certain day
I've searched Stack Overflow but I'm still quite confused.
SELECT EXTRACT(WEEK FROM TIMESTAMP '2021-01-01'),
extract('year' from TIMESTAMP '2021-01-01')
The output is 53|2021
I want it to be 01|2021
I understand the principle of the ISO week, but I want the year to start on 01-01-2021.
The aim is to use intervals from this day to determine week numbers:
Week No | End Date
1 | 01-01-2021
2 | 01-08-2021
5 | 01-29-2021
...
This is a really strange way to determine the week number, but in the end it's simple arithmetic: the number of days since January 1st, integer-divided by 7, plus 1.
You can create a function for this:
create function custom_week(p_input date)
returns int
as
$$
select (p_input - date_trunc('year', p_input)::date) / 7 + 1;
$$
language sql
immutable;
So this:
select date, custom_week(date)
from (
values
(date '2021-01-01'),
(date '2021-01-08'),
(date '2021-01-29')
) as v(date)
yields
date | custom_week
-----------+------------
2021-01-01 | 1
2021-01-08 | 2
2021-01-29 | 5
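For readers who want to check the arithmetic outside the database, the same calculation can be sketched in Python. This mirrors the SQL function above under the same definition (week 1 starts on January 1st); it is an illustration, not a replacement for the function.

```python
from datetime import date

def custom_week(d):
    """Week number counted in 7-day blocks starting on January 1st of d's year."""
    jan_first = date(d.year, 1, 1)
    # Whole days since January 1st, integer-divided by 7, plus 1
    return (d - jan_first).days // 7 + 1

for d in (date(2021, 1, 1), date(2021, 1, 8), date(2021, 1, 29)):
    print(d, custom_week(d))
```

With this definition the last day or two of a year land in week 53.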

Group by Date and sum of total duration for that day

I am using SQL Workbench/J with a Postgres DB for my query, which is as follows:
Input
ID | utc_tune_start_time | utc_tune_end_time
---+---------------------+--------------------
A  | 04-03-2019 19:00:00 | 04-03-2019 20:00:00
A  | 04-03-2019 23:00:00 | 05-03-2019 01:00:00
A  | 05-03-2019 10:00:00 | 05-03-2019 10:30:00
Output
ID | Day        | Duration in Minutes
---+------------+--------------------
A  | 04-03-2019 | 120
A  | 05-03-2019 | 90
For rows that cross midnight, I need the duration from utc_tune_start_time until the end of that day and, similarly, the time elapsed from the start of the day until utc_tune_end_time.
Thanks for your clarifications. This is possible with some case statements. Basically, if utc_tune_start_time and utc_tune_end_time are on the same day, just use the difference, otherwise calculate the difference from the end or start of the day.
WITH all_activity as (
select date_trunc('day', utc_tune_start_time) as day,
case when date_trunc('day', utc_tune_start_time) =
date_trunc('day', utc_tune_end_time)
then utc_tune_end_time - utc_tune_start_time
else date_trunc('day', utc_tune_start_time) +
interval '1 day' - utc_tune_start_time
end as time_spent
from test
UNION ALL
select date_trunc('day', utc_tune_end_time),
case when date_trunc('day', utc_tune_start_time) =
date_trunc('day', utc_tune_end_time)
then null -- we already calculated this earlier
else utc_tune_end_time - date_trunc('day', utc_tune_end_time)
end
FROM test
)
select day, sum(time_spent)
FROM all_activity
GROUP BY day;
day | sum
---------------------+----------
2019-03-04 00:00:00 | 02:00:00
2019-03-05 00:00:00 | 01:30:00
(2 rows)
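The splitting rule can also be sketched in Python to verify the expected totals in minutes. The helper name minutes_per_day is hypothetical, and the sample intervals are the three rows from the question (dates read as day-month-year).

```python
from collections import defaultdict
from datetime import datetime, time, timedelta

def minutes_per_day(intervals):
    """Sum the minutes of each (start, end) interval per calendar day, splitting at midnight."""
    totals = defaultdict(int)
    for start, end in intervals:
        cursor = start
        while cursor < end:
            # Midnight that ends the day containing `cursor`
            next_midnight = datetime.combine(cursor.date() + timedelta(days=1), time.min)
            piece_end = min(end, next_midnight)
            totals[cursor.date()] += int((piece_end - cursor).total_seconds() // 60)
            cursor = piece_end
    return dict(totals)

sample = [
    (datetime(2019, 3, 4, 19, 0), datetime(2019, 3, 4, 20, 0)),
    (datetime(2019, 3, 4, 23, 0), datetime(2019, 3, 5, 1, 0)),
    (datetime(2019, 3, 5, 10, 0), datetime(2019, 3, 5, 10, 30)),
]
totals = minutes_per_day(sample)
for day in sorted(totals):
    print(day, totals[day])
```

Because it loops from midnight to midnight, this sketch also handles an interval that spans more than two calendar days, a case the two-branch query above does not cover.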

generate series using break time

I have a table that stores opening and closing hours:
CREATE TABLE public.open_hours
(
id bigint NOT NULL,
open_hour character varying(255),
end_hour character varying(255),
day character varying(255),
CONSTRAINT pk_open_hour_id PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE public.open_hours
OWNER TO postgres;
I have another table that stores break hours:
CREATE TABLE public.break_hours
(
id bigint ,
start_time character varying(255),
end_time character varying(255),
open_hour_id bigint ,
CONSTRAINT break_hours_pkey PRIMARY KEY (id),
CONSTRAINT fkinhl5x01pnn54nv15ol5ntxr5 FOREIGN KEY (open_hour_id )
REFERENCES public.open_hours(id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
ALTER TABLE public.break_hours
OWNER TO postgres;
I need to generate a time series at 30-minute intervals that takes the break times into account.
For example, if my opening hour is 08:00 AM, my closing hour is 06:00 PM, and I have one break from 11:00 AM to 11:30 AM and another from 03:00 PM to 03:15 PM, then I need to generate series from 08:00 AM to 11:00 AM, from 11:30 AM to 03:00 PM, and from 03:15 PM to 06:00 PM.
Sample data
open_hours
-----------
id open_hour end_hour day
1 08:00 AM 06:00 PM Monday
break_hours
id start_time end_time open_hour_id
1 11:00 AM 11:30 AM 1
2 03:00 PM 03:15 PM 1
Sample output
--------------
08:00 AM
08:30 AM
09:00 AM
09:30 AM
10:00 AM
10:30 AM
11:30 AM
12:00 PM
12:30 PM
01:00 PM
01:30 PM
02:00 PM
02:30 PM
03:15 PM
03:45 PM
04:15 PM
04:45 PM
05:15 PM
Query used for generating series between open hours is
SELECT DISTINCT gs AS start_time,gs + interval '30min' as end_time
FROM generate_series( timestamp '2018-11-09 08:00 AM', timestamp '2018-11-09 06:00 PM', interval '30min' )gs
ORDER BY start_time
It seems that your table model should be cleaned up. For example, you should not store times as text types but as time without time zone.
WITH hours AS (
SELECT
oh.open_hour + '1970-01-01'::date as open_hour,
oh.end_hour + '1970-01-01'::date as end_hour,
bh.start_time + '1970-01-01'::date as break_start,
bh.end_time + '1970-01-01'::date as break_end,
lead(start_time + '1970-01-01'::date) OVER (ORDER BY start_time) as next_start_time
FROM open_hours oh
LEFT JOIN break_hours bh
ON oh.id = bh.open_hour_id
)
SELECT generate_series(open_hour, break_start, interval '30 minutes')::time as time_slot
FROM (
SELECT
open_hour, break_start
FROM hours
ORDER BY break_start
LIMIT 1
)s
UNION
SELECT
generate_series(break_end, next_start_time, interval '30 minutes')::time
FROM (
SELECT
break_end, next_start_time
FROM
hours
WHERE next_start_time IS NOT NULL
) s
UNION
SELECT generate_series(break_end, end_hour, interval '30 minutes')::time
FROM (
SELECT
break_end, end_hour
FROM hours
ORDER BY break_start DESC
LIMIT 1
) s
Explanation:
WITH clause (CTE):
Joining both tables. I am adding an arbitrary date because this turns each time into a timestamp. The generate_series function used later only works with timestamps, not with type time. The date part is cut away after the generation with the ::time cast.
The result of the CTE is:
open_hour end_hour break_start break_end next_start_time
1970-01-01 08:00:00 1970-01-01 18:00:00 1970-01-01 09:30:00 1970-01-01 09:45:00 1970-01-01 11:00:00
1970-01-01 08:00:00 1970-01-01 18:00:00 1970-01-01 11:00:00 1970-01-01 11:30:00 1970-01-01 15:00:00
1970-01-01 08:00:00 1970-01-01 18:00:00 1970-01-01 15:00:00 1970-01-01 15:15:00 (NULL)
UNION part:
This part contains three subparts, because I have to merge the time series from both tables:
1. Taking the opening hour. Generate a time series to the first break beginning.
For this I only need the first row from the CTE above. That's why LIMIT 1 is used.
2. For all breaks: Generate a time series from current break ending to the next break beginning.
The CTE contains a window function lead() which shifts the start_time of the next row into the current one (have a look at the last column of the CTE result). So now I am able to get all break times, no matter how many there are. In my example I added a third break from 9:30 to 9:45 to demonstrate it. So the next time series can be generated from all these columns (current break_end to next_start_time). Only the last row does not contain a next_start_time because there is none.
3. Last step: Generate a time series from the last break ending to the closing hour.
This is quite similar to (1). After iterating over all break times I have to add the last time series, from the end of the last break to the closing time. This could be achieved either by filtering for the row without next_start_time or by sorting DESC and using LIMIT 1, as I did.
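The three steps can be sketched in Python as a sanity check. The names series and time_slots are made up for illustration; series imitates the inclusive behaviour of generate_series, and the early return for a day without breaks is an addition that the query does not have.

```python
from datetime import datetime, timedelta

def series(start, stop, step):
    """Inclusive series, imitating generate_series(start, stop, step)."""
    out = []
    current = start
    while current <= stop:
        out.append(current)
        current += step
    return out

def time_slots(open_hour, end_hour, breaks, step=timedelta(minutes=30)):
    """breaks is a list of (break_start, break_end) pairs for one day."""
    breaks = sorted(breaks)
    if not breaks:  # assumption: a day without breaks is one plain series
        return series(open_hour, end_hour, step)
    # 1. Opening hour up to the first break start
    slots = set(series(open_hour, breaks[0][0], step))
    # 2. Each break end up to the next break start (the role of lead())
    for (_, break_end), (next_start, _) in zip(breaks, breaks[1:]):
        slots.update(series(break_end, next_start, step))
    # 3. Last break end up to the closing hour
    slots.update(series(breaks[-1][1], end_hour, step))
    return sorted(slots)  # the set plays the role of UNION

day = datetime(1970, 1, 1)  # arbitrary date, as in the query
slots = time_slots(
    day.replace(hour=8), day.replace(hour=18),
    [(day.replace(hour=11), day.replace(hour=11, minute=30)),
     (day.replace(hour=15), day.replace(hour=15, minute=15))],
)
labels = [s.strftime("%H:%M") for s in slots]
print(labels)
```

Because the series is inclusive, a slot that starts exactly at a break start (11:00 and 15:00 here) is part of the result, which is also how the query behaves.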
More complex case with more day types:
WITH hours AS (
SELECT
oh.id as day_id,
oh.open_hour + '1970-01-01'::date as open_hour,
oh.end_hour + '1970-01-01'::date as end_hour,
bh.start_time + '1970-01-01'::date as break_start,
bh.end_time + '1970-01-01'::date as break_end,
lead(start_time + '1970-01-01'::date) OVER (PARTITION BY oh.id ORDER BY start_time) as next_start_time
FROM open_hours oh
LEFT JOIN break_hours bh
ON oh.id = bh.open_hour_id
)
SELECT day_id, generate_series(open_hour, break_start, interval '30 minutes')::time as time_slot
FROM (
SELECT DISTINCT ON (day_id)
day_id, open_hour, break_start
FROM hours
ORDER BY day_id, break_start
)s
UNION
SELECT
day_id, generate_series(break_end, next_start_time, interval '30 minutes')::time
FROM (
SELECT
day_id, break_end, next_start_time
FROM
hours
WHERE next_start_time IS NOT NULL
) s
UNION
SELECT day_id, generate_series(break_end, end_hour, interval '30 minutes')::time
FROM (
SELECT DISTINCT ON (day_id)
day_id, break_end, end_hour
FROM hours
ORDER BY day_id, break_start DESC
) s
ORDER BY day_id, time_slot
The main idea stays the same as in the example for only one day. The difference is that we have to consider the different day types. I expanded the example above and added a second day with different opening hours and break times.
Changes:
The window function in the CTE got a PARTITION BY clause. This ensures that only start_times belonging to the same day are shifted.
LIMIT 1 no longer works because it limits the whole result to one row. It has been replaced by DISTINCT ON (day_id), which keeps the first row of each day.

PostgreSQL: Compute number of hours in the day on daylight savings time

I'm trying to come up with a query that will properly count that there are 25 hours on the day daylight savings time ends. My table has a column of type timestamptz called hourly_timestamp. The incorrect answer I have so far looks like this:
select EXTRACT(epoch FROM tomorrow-today)/3600
from(
select date_trunc('day', timezone('America/New_York', hourly_timestamp)) as today,
date_trunc('day', timezone('America/New_York', hourly_timestamp))
+ '1 day'::interval as tomorrow
)t;
When this query is executed for a day on which daylight savings time ends, I still only get 24 hours back and not 25. Any ideas how to do this correctly?
The number of hours varies with the clock.
with hours as (
select (timestamp with time zone '2014-11-01 00:00:00 America/New_York' + (n || ' hour')::interval) as hourly_timestamp
from generate_series(0, 72) n
)
select hourly_timestamp
, hourly_timestamp + interval '1' day as one_day_later
, hourly_timestamp + interval '1' day - hourly_timestamp as elapsed_time
from hours;
hourly_timestamp one_day_later elapsed_time
--
[snip]
2014-11-01 22:00:00-04 2014-11-02 22:00:00-05 1 day 01:00:00
2014-11-01 23:00:00-04 2014-11-02 23:00:00-05 1 day 01:00:00
2014-11-02 00:00:00-04 2014-11-03 00:00:00-05 1 day 01:00:00
2014-11-02 01:00:00-04 2014-11-03 01:00:00-05 1 day 01:00:00
2014-11-02 01:00:00-05 2014-11-03 01:00:00-05 1 day
2014-11-02 02:00:00-05 2014-11-03 02:00:00-05 1 day
2014-11-02 03:00:00-05 2014-11-03 03:00:00-05 1 day
2014-11-02 04:00:00-05 2014-11-03 04:00:00-05 1 day
[snip]
Note that 01:00 repeats, but with a different offset. Daylight savings time ends at 02:00: the clocks fall back and repeat the hour between 01:00 and 02:00, and since daylight savings time has ended, there are now five hours between UTC and the America/New_York time zone.
This similar query displays dates, not timestamps.
with dates as (
select (timestamp with time zone '2014-11-01 00:00:00 America/New_York' + (n || ' day')::interval) as daily_timestamp
from generate_series(0, 2) n
)
select daily_timestamp::date
, (daily_timestamp + interval '1' day)::date as one_day_later
, daily_timestamp + interval '1' day - daily_timestamp as elapsed_time
from dates;
daily_timestamp one_day_later elapsed_time
--
2014-11-01 2014-11-02 1 day
2014-11-02 2014-11-03 1 day 01:00:00
2014-11-03 2014-11-04 1 day
Where did you go wrong? By calculating the elapsed time after you truncated the time information. (Dates don't have time zones associated with them.) If I take the second query and cast "daily_timestamp" to a date in the common table expression, I get 24 hours, too.
with dates as (
select (timestamp with time zone '2014-11-01 00:00:00 America/New_York' + (n || ' day')::interval)::date as daily_timestamp
from generate_series(0, 2) n
)
select daily_timestamp::date
, (daily_timestamp + interval '1' day)::date as one_day_later
, daily_timestamp + interval '1' day - daily_timestamp as elapsed_time
from dates;
daily_timestamp one_day_later elapsed_time
--
2014-11-01 2014-11-02 1 day
2014-11-02 2014-11-03 1 day
2014-11-03 2014-11-04 1 day
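You first have to do the extraction to epoch and then the calculations (the values in the comments depend on the session time zone; they assume one with a DST transition on 2014-10-26, as in most European zones):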
WITH test AS (
SELECT '2014-10-26'::timestamptz at time zone 'America/New_York' AS today,
'2014-10-27'::timestamptz at time zone 'America/New_York' AS tomorrow
)
SELECT
extract(epoch from tomorrow) - extract(epoch from today) AS seconds, -- 90000
(extract(epoch from tomorrow) - extract(epoch from today)) / 3600 AS hours -- 25
FROM test;
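As a cross-check outside the database, the same principle (measure the elapsed time between two absolute instants before any time zone information is discarded) can be sketched in Python. This is only an illustration and not part of either answer: hours_in_local_day is a hypothetical helper, and it assumes Python 3.9+ with IANA time zone data available to zoneinfo.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

def hours_in_local_day(year, month, day, tz_name):
    """Number of real hours between local midnight and the next local midnight."""
    tz = ZoneInfo(tz_name)
    local_start = datetime(year, month, day, tzinfo=tz)
    # Adding a day to an aware datetime moves the wall clock to the next local midnight
    local_end = local_start + timedelta(days=1)
    # Subtracting two datetimes that share a tzinfo ignores offsets,
    # so convert both to UTC first to get the true elapsed time
    elapsed = local_end.astimezone(timezone.utc) - local_start.astimezone(timezone.utc)
    return elapsed.total_seconds() / 3600

print(hours_in_local_day(2014, 11, 1, "America/New_York"))
print(hours_in_local_day(2014, 11, 2, "America/New_York"))
```

Subtracting local_start from local_end directly would give 24 hours for every day, which is the same mistake as computing the difference after truncating to a date.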

Grouping by date, with 0 when count() yields no lines

I'm using Postgresql 9 and I'm fighting with counting and grouping when no lines are counted.
Let's assume the following schema :
create table views (
date_event timestamp with time zone,
event_id integer
);
Let's imagine the following content :
2012-01-01 00:00:05 2
2012-01-01 01:00:05 5
2012-01-01 03:00:05 8
2012-01-01 03:00:15 20
I want to group by hour, and count the number of lines. I wish I could retrieve the following :
2012-01-01 00:00:00 1
2012-01-01 01:00:00 1
2012-01-01 02:00:00 0
2012-01-01 03:00:00 2
2012-01-01 04:00:00 0
2012-01-01 05:00:00 0
.
.
2012-01-07 23:00:00 0
I mean that for each time slot, I count the number of lines in my table whose date falls in that slot; if there are none, I return a line with a count of zero.
The following will definitely not work (it will yield only lines with a count > 0).
SELECT extract ( hour from date_event ),count(*)
FROM views
where date_event > '2012-01-01' and date_event <'2012-01-07'
GROUP BY extract ( hour from date_event );
Please note I might also need to group by minute, or by hour, or by day, or by month, or by year (multiple queries is possible of course).
I can only use plain old sql, and since my views table can be very big (>100M records), I try to keep performance in mind.
How can this be achieved ?
Thank you !
Given that you don't have the dates in the table, you need a way to generate them. You can use the generate_series function:
SELECT * FROM generate_series('2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') AS ts;
This will produce results like this:
ts
---------------------
2012-01-01 00:00:00
2012-01-01 01:00:00
2012-01-01 02:00:00
2012-01-01 03:00:00
...
2012-01-07 21:00:00
2012-01-07 22:00:00
2012-01-07 23:00:00
(168 rows)
The remaining task is to join the two selects using an outer join like this :
select extract(day from ts) as day, extract(hour from ts) as hour, coalesce(cnt.count, 0) as count
from (
select extract(day from date_event) as day, extract(hour from date_event) as hr, count(*)
from views
where date_event > '2012-01-01' and date_event < '2012-01-07'
group by extract(day from date_event), extract(hour from date_event)
) as cnt
right outer join (select * from generate_series('2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') as ts) as dtetable
on extract(hour from ts) = cnt.hr and extract(day from ts) = cnt.day
order by day, hour asc;
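The zero-filling idea (generate every slot first, then look up the count for each) can be sketched in Python with the four sample rows from the question. The function name hourly_counts is hypothetical, and this sketch keys the counts on the full truncated timestamp rather than on day and hour separately.

```python
from collections import Counter
from datetime import datetime, timedelta

def hourly_counts(events, start, stop):
    """Return (slot, count) for every hour from start to stop inclusive, with 0 for empty hours."""
    # Equivalent of grouping by date_trunc('hour', date_event)
    counts = Counter(e.replace(minute=0, second=0, microsecond=0) for e in events)
    rows = []
    slot = start
    while slot <= stop:                   # equivalent of generate_series(start, stop, '1 hour')
        rows.append((slot, counts[slot])) # a Counter returns 0 for missing keys (the outer join + coalesce)
        slot += timedelta(hours=1)
    return rows

events = [
    datetime(2012, 1, 1, 0, 0, 5),
    datetime(2012, 1, 1, 1, 0, 5),
    datetime(2012, 1, 1, 3, 0, 5),
    datetime(2012, 1, 1, 3, 0, 15),
]
rows = hourly_counts(events, datetime(2012, 1, 1), datetime(2012, 1, 7, 23))
for slot, count in rows[:5]:
    print(slot, count)
```

Keying on the truncated timestamp avoids matching the same day-of-month and hour from different months, which could matter if the date range were extended beyond a single month.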
This query gives a count per hour, but only for hours that contain at least one row:
select to_char(date_event, 'YYYY-MM-DD HH24:00') as time, count (to_char(date_event, 'HH24:00')) as count from views where date(date_event) > '2012-01-01' and date(date_event) < '2012-01-07' group by time order by time;