Calculating a monthly hourly average from a PostgreSQL database

I have timestamps and values in the database. For a given month, I need to calculate the average for each hour of the day, i.e.
YYYY-mm-dd (the day can be omitted)
2021-01-01 00:00:00 value=avg(values from 00:00:00 until 00:59:59 for every day of this month at this hour interval)
2021-01-01 01:00:00 value=avg(values from 01:00:00 until 01:59:59 idem as above)
...
2021-01-01 23:00:00 value=avg(values from 23:00:00 until 23:59:59)
2021-02-01 00:00:00 value=avg(values from 00:00:00 until 00:59:59)
2021-02-01 01:00:00 value=avg(values from 01:00:00 until 01:59:59)
...
2021-02-01 23:00:00 value=avg(values from 23:00:00 until 23:59:59)
...

You can use date_trunc('hour', datestamp) in a GROUP BY clause, something like this. It returns one row per hour of each day in the range.
SELECT DATE_TRUNC('hour', datestamp) hour_beginning, AVG(value) average_value
FROM mytable
WHERE datestamp >= '2021-01-01'
AND datestamp < '2021-02-01'
GROUP BY DATE_TRUNC('hour', datestamp)
ORDER BY DATE_TRUNC('hour', datestamp)
To generalize, in place of DATE_TRUNC you can group by any expression that maps each timestamp to the bucket you want.
You could use
to_char(datestamp, 'YYYY-MM-01 HH24:00:00')
to get one result row per hour of the day for every month in your date range.
SELECT to_char(datestamp, 'YYYY-MM-01 HH24:00:00') AS hour_beginning,
AVG(value) AS average_value
FROM mytable
GROUP BY to_char(datestamp, 'YYYY-MM-01 HH24:00:00')
ORDER BY to_char(datestamp, 'YYYY-MM-01 HH24:00:00')
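To cross-check what the to_char grouping computes, here is a minimal Python sketch of the same bucketing logic. The function name and the sample rows are hypothetical, and it assumes the rows have already been fetched as (datetime, value) pairs.

```python
from collections import defaultdict
from datetime import datetime


def monthly_hourly_average(rows):
    """Average values per (year, month, hour-of-day) bucket.

    rows: iterable of (datetime, number) pairs.
    Returns a dict keyed by the first day of the month at the given hour.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bucket -> [running total, count]
    for stamp, value in rows:
        # Same bucket as to_char(datestamp, 'YYYY-MM-01 HH24:00:00')
        bucket = stamp.replace(day=1, minute=0, second=0, microsecond=0)
        sums[bucket][0] += value
        sums[bucket][1] += 1
    return {bucket: total / count
            for bucket, (total, count) in sorted(sums.items())}


# Hypothetical sample data, not taken from the question
sample = [
    (datetime(2021, 1, 1, 0, 15), 10.0),
    (datetime(2021, 1, 2, 0, 45), 20.0),
    (datetime(2021, 1, 2, 1, 5), 7.0),
    (datetime(2021, 2, 3, 0, 30), 4.0),
]
result = monthly_hourly_average(sample)
```

The two January readings in the 00:00 hour land in the same bucket even though they come from different days, which is the behaviour the question asks for.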

Related

PostgreSQL creating timestamp ranges

In PostgreSQL 11, I am trying to get a weekend time range, from 17:00 Friday to 17:00 Sunday.
So far I am able to get the working days by doing
select * from generate_series(date '2021-01-01',date '2021-12-31',interval '1' day) as t(dt) where extract (dow from dt) between 1 and 5;
However, I am having trouble creating two columns, start (17:00 Friday) and stop (17:00 Sunday).
Expected output should be something like this:
start stop
2022-10-07 17:00 2022-10-09 17:00
2022-10-14 17:00 2022-10-16 17:00
2022-10-21 17:00 2022-10-23 17:00

To get a series of all hours between 17:00 on Friday and 17:00 on Sunday:
SELECT
*
FROM
generate_series(timestamp '2021-01-01', timestamp '2021-12-31', interval '1' hour) AS t (dt)
WHERE
extract(dow FROM dt) IN (5, 6, 0)
AND CASE WHEN extract(dow FROM dt) = 5 THEN
extract(hour FROM dt) >= 17
WHEN extract(dow FROM dt) = 0 THEN
extract(hour FROM dt) <= 17
ELSE
extract(hour FROM dt) IS NOT NULL
END;
UPDATE
To get two timestamps that represent the start and stop of each period (Friday 17:00 to Sunday 17:00) over a range of dates:
SELECT
dt + '17:00'::time as start, (dt + '17:00'::time) + '2 days'::interval as stop
FROM
generate_series(date '2022-01-01', date '2022-12-31', interval '1' day) AS t (dt)
WHERE
extract(dow FROM dt) = 5
;
start | stop
-------------------------+-------------------------
01/07/2022 17:00:00 PST | 01/09/2022 17:00:00 PST
01/14/2022 17:00:00 PST | 01/16/2022 17:00:00 PST
01/21/2022 17:00:00 PST | 01/23/2022 17:00:00 PST
01/28/2022 17:00:00 PST | 01/30/2022 17:00:00 PST
02/04/2022 17:00:00 PST | 02/06/2022 17:00:00 PST
02/11/2022 17:00:00 PST | 02/13/2022 17:00:00 PST
02/18/2022 17:00:00 PST | 02/20/2022 17:00:00 PST
02/25/2022 17:00:00 PST | 02/27/2022 17:00:00 PST
03/04/2022 17:00:00 PST | 03/06/2022 17:00:00 PST
03/11/2022 17:00:00 PST | 03/13/2022 17:00:00 PDT
03/18/2022 17:00:00 PDT | 03/20/2022 17:00:00 PDT
03/25/2022 17:00:00 PDT | 03/27/2022 17:00:00 PDT
...
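The same start/stop logic can be sketched outside the database. This is a minimal Python version with a hypothetical function name; it uses naive datetimes, so it does not reproduce the PST/PDT handling visible in the timestamptz output above.

```python
from datetime import date, datetime, time, timedelta


def weekend_ranges(first_day, last_day):
    """Yield (start, stop) pairs running from Friday 17:00 to Sunday 17:00."""
    day = first_day
    while day <= last_day:
        if day.weekday() == 4:  # Monday is 0, so Friday is 4
            start = datetime.combine(day, time(17, 0))
            yield start, start + timedelta(days=2)
        day += timedelta(days=1)


ranges = list(weekend_ranges(date(2022, 10, 1), date(2022, 10, 31)))
for start, stop in ranges:
    print(start, stop)
```

Note that Python's weekday() numbers Friday as 4, while PostgreSQL's extract(dow ...) and extract(isodow ...) both number it as 5.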

--timestamptz type.
SELECT
(day + interval '17:30') AS start,
(day + interval '17:30' + interval '2 days') AS "end"
FROM
generate_series(date '2022-10-01', date '2022-12-31', interval '1' day) _ (day)
WHERE
EXTRACT(ISODOW FROM day) = 5;
--timestamp type.
SELECT
(day + interval '17:30')::timestamp AS start,
(day + interval '17:30' + interval '2 days')::timestamp AS "end"
FROM
generate_series(date '2022-10-01', date '2022-12-31', interval '1' day) _ (day)
WHERE
EXTRACT(ISODOW FROM day) = 5;
I checked the results against the calendar, and they are correct.

How to average hourly values over multiple days with SQL

I have a SQL table (PostgreSQL/TimescaleDB) with hourly values, e.g.:
Timestamp Value
...
2021-02-17 13:00:00 2
2021-02-17 14:00:00 4
...
2021-02-18 13:00:00 3
2021-02-18 14:00:00 3
...
I want to get the average value for each hour over a specific timespan, mapped to today's date, something like this:
select avg(value)
from table
where Timestamp between '2021-02-10' and '2021-02-20'
group by *hourpart of timestamp*
The result today (2021-10-08) should be:
...
Timestamp Value
2021-10-08 13:00:00 2.5
2021-10-08 14:00:00 3.5
...
If I run the same select tomorrow (2021-10-09), the result should change to:
...
Timestamp Value
2021-10-09 13:00:00 2.5
2021-10-09 14:00:00 3.5
...

I solved the problem myself.
Solution:
-- mytable stands in for the real table name ("table" is a reserved word)
SELECT EXTRACT(HOUR FROM "Timestamp") AS hour,
avg("Value") AS average
FROM mytable
WHERE "Timestamp" BETWEEN '2021-02-10' AND '2021-02-20'
GROUP BY hour
ORDER BY hour;
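The question also asks for the averages to be mapped onto today's date. A minimal Python sketch of that step, using the four sample rows from the question, could look like this. The function name is hypothetical, and the target day is passed in explicitly instead of being read from the clock so the result is reproducible.

```python
from collections import defaultdict
from datetime import date, datetime, time


def hourly_average_on(rows, target_day):
    """Average values per hour of day and stamp each result onto target_day."""
    buckets = defaultdict(list)
    for stamp, value in rows:
        buckets[stamp.hour].append(value)  # like EXTRACT(HOUR FROM ...)
    return {
        datetime.combine(target_day, time(hour)): sum(values) / len(values)
        for hour, values in sorted(buckets.items())
    }


# Sample rows from the question
sample = [
    (datetime(2021, 2, 17, 13), 2),
    (datetime(2021, 2, 17, 14), 4),
    (datetime(2021, 2, 18, 13), 3),
    (datetime(2021, 2, 18, 14), 3),
]
averages = hourly_average_on(sample, date(2021, 10, 8))
```

Passing date.today() as target_day would give the "changes every day" behaviour described in the question.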

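Alternatively, you can cast the timestamp to text and group by its hour part (this assumes the default ISO DateStyle, where the hour occupies characters 12 and 13):
select substring("Timestamp"::text, 12, 2) as hour, avg("Value")
from mytable
where "Timestamp" between '2021-02-10' and '2021-02-20'
group by substring("Timestamp"::text, 12, 2)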

Group by Date and sum of total duration for that day

I am using SQL Workbench/J with a Postgres DB for my query, which is as follows:
Input
ID | utc_tune_start_time | utc_tune_end_time
---+---------------------+--------------------
A  | 04-03-2019 19:00:00 | 04-03-2019 20:00:00
A  | 04-03-2019 23:00:00 | 05-03-2019 01:00:00
A  | 05-03-2019 10:00:00 | 05-03-2019 10:30:00
Output
ID | Day        | Duration in Minutes
---+------------+--------------------
A  | 04-03-2019 | 120
A  | 05-03-2019 | 90
I require the duration elapsed from the utc_tune_start_time till the end of the day and similarly, the time elapsed for utc_tune_end_time since the start of the day.

Thanks for your clarifications. This is possible with some case statements. Basically, if utc_tune_start_time and utc_tune_end_time are on the same day, just use the difference, otherwise calculate the difference from the end or start of the day.
WITH all_activity as (
select date_trunc('day', utc_tune_start_time) as day,
case when date_trunc('day', utc_tune_start_time) =
date_trunc('day', utc_tune_end_time)
then utc_tune_end_time - utc_tune_start_time
else date_trunc('day', utc_tune_start_time) +
interval '1 day' - utc_tune_start_time
end as time_spent
from test
UNION ALL
select date_trunc('day', utc_tune_end_time),
case when date_trunc('day', utc_tune_start_time) =
date_trunc('day', utc_tune_end_time)
then null -- we already calculated this earlier
else utc_tune_end_time - date_trunc('day', utc_tune_end_time)
end
FROM test
)
select day, sum(time_spent)
FROM all_activity
GROUP BY day;
day | sum
---------------------+----------
2019-03-04 00:00:00 | 02:00:00
2019-03-05 00:00:00 | 01:30:00
(2 rows)
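The splitting rule described above can also be sketched in Python. The function name is hypothetical and the input is the three sample rows from the question. Unlike the CASE-based query, this loop also handles an interval that crosses more than one midnight, which is an extension beyond what the question needs.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def minutes_per_day(intervals):
    """Split each (start, end) interval at midnight and total minutes per day."""
    totals = defaultdict(timedelta)
    for start, end in intervals:
        cursor = start
        while cursor < end:
            # Midnight that ends the day the cursor is currently in
            next_midnight = (datetime.combine(cursor.date(), datetime.min.time())
                             + timedelta(days=1))
            piece_end = min(end, next_midnight)
            totals[cursor.date()] += piece_end - cursor
            cursor = piece_end
    return {day: int(total.total_seconds() // 60)
            for day, total in sorted(totals.items())}


# Sample rows from the question
sample = [
    (datetime(2019, 3, 4, 19, 0), datetime(2019, 3, 4, 20, 0)),
    (datetime(2019, 3, 4, 23, 0), datetime(2019, 3, 5, 1, 0)),
    (datetime(2019, 3, 5, 10, 0), datetime(2019, 3, 5, 10, 30)),
]
durations = minutes_per_day(sample)
```

The result is expressed in minutes, matching the "Duration in Minutes" column the question asks for.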

How to add one month to a date column and subtract one day in PostgreSQL

I have a date column containing a value such as '2016-05-06'. I want to add one full month to this column, but the result should be one day earlier.
So when I execute a query like:
select date,(date + interval '1 month') as new_column
from batchproduct_info;
it gives me this result:
date new_column
2016-05-06 2016-06-06 00:00:00
2016-05-07 2016-06-07 00:00:00
But I want result in this format:
date new_column
2016-05-06 2016-06-05 00:00:00
2016-05-07 2016-06-06 00:00:00
i.e. it should add one month and then subtract one day.

This is a solution to your problem:
select date, (date + '1 month'::interval - '1 day'::interval) as new_column
from batchproduct_info;
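For comparison, here is a rough Python sketch of the same arithmetic. The function name is hypothetical. The standard library has no month interval, so the sketch clamps the day to the end of the target month, which is my understanding of how PostgreSQL treats date + interval '1 month'. It returns a date, while the SQL expression returns a timestamp.

```python
import calendar
from datetime import date, timedelta


def add_month_minus_day(d):
    """Add one calendar month (clamping to month end), then subtract one day."""
    year = d.year + (d.month // 12)   # roll over to next year after December
    month = d.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    shifted = date(year, month, min(d.day, last_day))
    return shifted - timedelta(days=1)


print(add_month_minus_day(date(2016, 5, 6)))
print(add_month_minus_day(date(2016, 5, 7)))
```

The order of operations matters near month ends: adding the month first and subtracting the day afterwards can give a different result from subtracting the day first.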

Grouping by date, with 0 when count() yields no lines

I'm using PostgreSQL 9 and I'm struggling with counting and grouping when no rows are counted.
Let's assume the following schema:
create table views (
date_event timestamp with time zone,
event_id integer
);
Let's imagine the following content:
2012-01-01 00:00:05 2
2012-01-01 01:00:05 5
2012-01-01 03:00:05 8
2012-01-01 03:00:15 20
I want to group by hour, and count the number of lines. I wish I could retrieve the following :
2012-01-01 00:00:00 1
2012-01-01 01:00:00 1
2012-01-01 02:00:00 0
2012-01-01 03:00:00 2
2012-01-01 04:00:00 0
2012-01-01 05:00:00 0
.
.
2012-01-07 23:00:00 0
I mean that for each time slot, I count the number of rows in my table whose date falls within it; otherwise, I return a row with a count of zero.
The following will definitely not work (it yields only rows with a count > 0).
SELECT extract ( hour from date_event ),count(*)
FROM views
where date_event > '2012-01-01' and date_event <'2012-01-07'
GROUP BY extract ( hour from date_event );
Please note I might also need to group by minute, or by hour, or by day, or by month, or by year (multiple queries is possible of course).
I can only use plain old sql, and since my views table can be very big (>100M records), I try to keep performance in mind.
How can this be achieved ?
Thank you !

Given that you don't have the dates in the table, you need a way to generate them. You can use the generate_series function:
SELECT * FROM generate_series('2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') AS ts;
This will produce results like this:
ts
---------------------
2012-01-01 00:00:00
2012-01-01 01:00:00
2012-01-01 02:00:00
2012-01-01 03:00:00
...
2012-01-07 21:00:00
2012-01-07 22:00:00
2012-01-07 23:00:00
(168 rows)
The remaining task is to join the two selects using an outer join like this:
select extract ( day from ts ) as day, extract ( hour from ts ) as hour,coalesce(count,0) as count from
(
SELECT extract ( day from date_event ) as day , extract ( hour from date_event ) as hr ,count(*)
FROM views
where date_event >= '2012-01-01' and date_event < '2012-01-08'
GROUP BY extract ( day from date_event ) , extract ( hour from date_event )
) AS cnt
right outer join ( SELECT * FROM generate_series ( '2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') AS ts ) as dtetable on extract ( hour from ts ) = cnt.hr and extract ( day from ts ) = cnt.day
order by day,hour asc;
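The idea behind the outer join (generate every slot first, then look up its count and fall back to zero) can be sketched in Python as follows. The function name is hypothetical and the events are the four sample rows from the question.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_counts(events, first_hour, last_hour):
    """Return (slot, count) for every hour in the range, including empty hours."""
    # Truncate each event to the start of its hour, then count per slot
    counts = Counter(e.replace(minute=0, second=0, microsecond=0) for e in events)
    result = []
    slot = first_hour
    while slot <= last_hour:
        result.append((slot, counts.get(slot, 0)))  # plays the role of coalesce(count, 0)
        slot += timedelta(hours=1)
    return result


# Sample rows from the question
events = [
    datetime(2012, 1, 1, 0, 0, 5),
    datetime(2012, 1, 1, 1, 0, 5),
    datetime(2012, 1, 1, 3, 0, 5),
    datetime(2012, 1, 1, 3, 0, 15),
]
rows = hourly_counts(events, datetime(2012, 1, 1, 0), datetime(2012, 1, 7, 23))
```

Keying the lookup on the full truncated timestamp, not on separate day and hour numbers, keeps the approach valid for ranges that span more than one month.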

This query returns the hourly counts, though only for hours that contain at least one event:
select to_char(date_event, 'YYYY-MM-DD HH24:00') as time, count(*) as count
from views
where date(date_event) >= '2012-01-01' and date(date_event) <= '2012-01-07'
group by time
order by time;
