How to average hourly values over multiple days with SQL - postgresql

I have a SQL table (PostgreSQL/TimescaleDB) with hourly values, e.g.:
Timestamp Value
...
2021-02-17 13:00:00 2
2021-02-17 14:00:00 4
...
2021-02-18 13:00:00 3
2021-02-18 14:00:00 3
...
I want to get the average value for each hour of the day over a specific timespan, mapped to today's date, something like this:
select avg(value)
from table
where Timestamp between '2021-02-10' and '2021-02-20'
group by *hourpart of timestamp*
The result today (2021-10-08) should be:
...
Timestamp Value
2021-10-08 13:00:00 2.5
2021-10-08 14:00:00 3.5
...
If I run the same select tomorrow (2021-10-09), the result should change to:
...
Timestamp Value
2021-10-09 13:00:00 2.5
2021-10-09 14:00:00 3.5
...

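I solved the problem myself. The column is quoted consistently here, because an unquoted Timestamp would be folded to lower case and would not match a column created as "Timestamp":
SELECT EXTRACT(HOUR FROM table."Timestamp") as hour,
avg(table."Value") as average
from table
where table."Timestamp" between '2021-02-10' and '2021-02-20'
group by hour
order by hour;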
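The query above returns the hour number rather than a timestamp on today's date, which is what the question asked for. The remaining step is sketched here in Python rather than SQL, purely as an illustration: group by hour of day, then attach each hour to a target date. The function name `hourly_average_on` is hypothetical, and the sample rows are the ones from the question.

```python
from collections import defaultdict
from datetime import date, datetime, time


def hourly_average_on(rows, target_day):
    """Average values per hour of day and stamp each result with target_day."""
    buckets = defaultdict(list)
    for ts, value in rows:
        # Only the hour of day decides the bucket; the original date is dropped.
        buckets[ts.hour].append(value)
    return {
        datetime.combine(target_day, time(hour)): sum(vals) / len(vals)
        for hour, vals in sorted(buckets.items())
    }


rows = [
    (datetime(2021, 2, 17, 13), 2),
    (datetime(2021, 2, 17, 14), 4),
    (datetime(2021, 2, 18, 13), 3),
    (datetime(2021, 2, 18, 14), 3),
]

result = hourly_average_on(rows, date(2021, 10, 8))
for ts, avg in result.items():
    print(ts, avg)
# 2021-10-08 13:00:00 2.5
# 2021-10-08 14:00:00 3.5
```

Passing a different target date (for example tomorrow's) changes only the date part of the keys, not the averages.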

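Alternatively, cast the timestamp to text and group by its time-of-day part only (grouping by the date part as well would produce one group per row):
select substring(Timestamp::text, 12, 8) as time_of_day, avg(value)
from table
where Timestamp between '2021-02-10' and '2021-02-20'
group by substring(Timestamp::text, 12, 8)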

Related

Postgresql extracting 'epoch' from timestamp cuts off last date in date range

My table has the column event_ts with column type numeric.
Here is my query:
select
min(to_timestamp(event_ts)), max(to_timestamp(event_ts))
from
table1
where
event_ts >= extract('epoch' from '2021-07-01'::timestamp) and
event_ts <= extract('epoch' from '2021-07-31'::timestamp)
However, the results are
min: 2021-06-30 20:00:00.000 -0400
max: 2021-07-30 20:00:00.000 -0400
I would think the where clause would include data from 2021-07-01 to 2021-07-31.
There is data for July 31st, 2021.
Why does this query start at 2021-06-30 and end 2021-07-30?
show timezone;
TimeZone
------------
US/Pacific
select extract('epoch' from '2021-07-01'::timestamp);
extract
-------------------
1625097600.000000
select to_timestamp(1625097600);
to_timestamp
-------------------------
06/30/2021 17:00:00 PDT
select extract('epoch' from '2021-07-01'::timestamptz);
extract
-------------------
1625122800.000000
(1 row)
select to_timestamp(1625122800);
to_timestamp
-------------------------
07/01/2021 00:00:00 PDT
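So by casting to timestamp (without time zone), the epoch is computed as if the value were in UTC, and converting it back with to_timestamp yields a local time shifted by the time zone offset. Casting to timestamptz interprets '2021-07-01' as local midnight, so the round trip returns a timestamp at 00:00:00 local time.
This follows from the documentation: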
https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
epoch
For timestamp with time zone values, the number of seconds since 1970-01-01 00:00:00 UTC (negative for timestamps before that); for date and timestamp values, the nominal number of seconds since 1970-01-01 00:00:00, without regard to timezone or daylight-savings rules; for interval values, the total number of seconds in the interval
Epoch is based on UTC timezone.
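If you do not need epoch values in the where clause, you can compare timestamps directly. Because event_ts is numeric, convert it with to_timestamp first; a half-open range also keeps all of July 31st:
...
where
to_timestamp(event_ts) >= '2021-07-01'::timestamptz and
to_timestamp(event_ts) < '2021-08-01'::timestamptz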
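The two epoch values from the psql session can be reproduced outside the database. The following is a minimal Python sketch, not PostgreSQL itself; it assumes a fixed UTC-7 offset to stand in for US/Pacific in July (PDT), which is an illustration choice and ignores daylight-saving transitions.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed -07:00 offset stands in for US/Pacific during July (PDT).
PDT = timezone(timedelta(hours=-7), "PDT")

naive_midnight = datetime(2021, 7, 1)

# Like '2021-07-01'::timestamp: the wall-clock value is counted as if it were UTC.
epoch_as_utc = int(naive_midnight.replace(tzinfo=timezone.utc).timestamp())

# Like '2021-07-01'::timestamptz with a Pacific session: local midnight.
epoch_as_local = int(naive_midnight.replace(tzinfo=PDT).timestamp())

print(epoch_as_utc)    # 1625097600
print(epoch_as_local)  # 1625122800

# Converting back and displaying in the local zone, as to_timestamp does:
print(datetime.fromtimestamp(epoch_as_utc, PDT))    # 2021-06-30 17:00:00-07:00
print(datetime.fromtimestamp(epoch_as_local, PDT))  # 2021-07-01 00:00:00-07:00
```

The 25200-second difference between the two epochs is exactly the 7-hour offset, which is why the range in the original query appears shifted.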

How to get the minimum value of unique items based upon a datediff function in t-sql?

I am trying to figure out the minimum time elapsed between two columns, grouped by values in a third column:
ID Start Time End Time
1 2021-08-22 00:00:00 2021-08-24 00:00:00
1 2021-08-21 00:00:00 2021-08-24 00:00:00
2 2021-08-22 00:00:00 2021-08-24 00:00:00
2 2021-08-21 00:00:00 2021-08-24 00:00:00
3 2021-08-22 00:00:00 2021-08-24 00:00:00
3 2021-08-21 00:00:00 2021-08-24 00:00:00
From this table, I would like to get the results:
ID Elapsed Time
1 48 hours
2 48 hours
3 48 hours
Currently I have this SQL function
SELECT ID, datediff(hour, Start Time, End Time) as diff
FROM t
WHERE
MIN(diff)
GROUP BY ID
Jacob, this should give you the results you are looking for:
SELECT
ID,
MIN(DATEDIFF (HOUR, StartTime, EndTime)) AS diff
FROM
t
GROUP BY
ID;
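The same "compute the difference per row, then take the minimum per group" logic can be checked against the sample table. This is a Python sketch with a hypothetical helper name, `min_elapsed_hours`. Note one assumption: it counts whole elapsed hours, whereas T-SQL's DATEDIFF(HOUR, ...) counts hour boundaries crossed. The two agree for the whole-hour timestamps in the question.

```python
from datetime import datetime, timedelta

rows = [
    (1, datetime(2021, 8, 22), datetime(2021, 8, 24)),
    (1, datetime(2021, 8, 21), datetime(2021, 8, 24)),
    (2, datetime(2021, 8, 22), datetime(2021, 8, 24)),
    (2, datetime(2021, 8, 21), datetime(2021, 8, 24)),
    (3, datetime(2021, 8, 22), datetime(2021, 8, 24)),
    (3, datetime(2021, 8, 21), datetime(2021, 8, 24)),
]


def min_elapsed_hours(rows):
    """Return {id: smallest (end - start) in whole hours} over all rows."""
    shortest = {}
    for row_id, start, end in rows:
        hours = (end - start) // timedelta(hours=1)
        # Keep the smallest difference seen so far for this id.
        if row_id not in shortest or hours < shortest[row_id]:
            shortest[row_id] = hours
    return shortest


print(min_elapsed_hours(rows))  # {1: 48, 2: 48, 3: 48}
```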

Monthly-hourly-average calculate from Postgresql database

I have the time and the values in the database. For a given month, I need to calculate the average for each hour of the day, i.e.:
YYYY-mm-dd (the day can be omitted)
2021-01-01 00:00:00 value=avg(values from 00:00:00 until 00:59:59 for every day of this month at this hour interval)
2021-01-01 01:00:00 value=avg(values from 01:00:00 until 01:59:59 idem as above)
...
2021-01-01 23:00:00 value=avg(values from 23:00:00 until 23:59:59)
2021-02-01 00:00:00 value=avg(values from 00:00:00 until 00:59:59)
2021-02-01 01:00:00 value=avg(values from 01:00:00 until 01:59:59)
...
2021-02-01 23:00:00 value=avg(values from 23:00:00 until 23:59:59)
...
You can use date_trunc('hour', datestamp) in a GROUP BY statement, something like this.
SELECT DATE_TRUNC('hour', datestamp) hour_beginning, AVG(value) average_value
FROM mytable
WHERE datestamp >= '2021-01-01'
AND datestamp < '2021-02-01'
GROUP BY DATE_TRUNC('hour', datestamp)
ORDER BY DATE_TRUNC('hour', datestamp)
To generalize, in place of DATE_TRUNC you can use any expression that returns the same value for every timestamp that belongs in the same group.
You could use
to_char(datestamp, 'YYYY-MM-01 HH24:00:00')
to get one result row per hour for every month in your date range.
SELECT to_char(datestamp, 'YYYY-MM-01 HH24:00:00') hour,
AVG(value) average_value
FROM mytable
GROUP BY to_char(datestamp, 'YYYY-MM-01 HH24:00:00')
ORDER BY to_char(datestamp, 'YYYY-MM-01 HH24:00:00')
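To make the bucketing concrete, here is a small Python sketch of what the to_char(datestamp, 'YYYY-MM-01 HH24:00:00') key does: every timestamp is mapped to the first day of its month at the start of its hour, so all days of a month share one bucket per hour. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def monthly_hourly_average(rows):
    """Average values per (month, hour of day), keyed like 'YYYY-MM-01 HH:00:00'."""
    buckets = defaultdict(list)
    for ts, value in rows:
        # Collapse the day to 1 and drop minutes and seconds: this is the group key.
        key = ts.replace(day=1, minute=0, second=0, microsecond=0)
        buckets[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


# Hypothetical sample data, not taken from the question.
rows = [
    (datetime(2021, 1, 5, 0, 15), 10.0),
    (datetime(2021, 1, 20, 0, 45), 20.0),
    (datetime(2021, 1, 5, 1, 30), 4.0),
    (datetime(2021, 2, 3, 0, 10), 7.0),
]

averages = monthly_hourly_average(rows)
for key, avg in averages.items():
    print(key, avg)
# 2021-01-01 00:00:00 15.0
# 2021-01-01 01:00:00 4.0
# 2021-02-01 00:00:00 7.0
```

By contrast, a key that keeps the day (as DATE_TRUNC('hour', ...) does) would put the two January readings at hour 00 into separate groups.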

Hive query to return last date of each month for 3 years

Please provide a Hive query that returns the last date of each month in 'yyyy-mm-dd' format for 3 years.
Substitute the start and end dates in this example with yours. How it works: the space() function generates a string of spaces whose length is the number of days returned by datediff(); split() by space turns it into an array; posexplode() explodes the array and returns the position of each element, which corresponds to a day offset. Then date_add('${hivevar:start_date}', s.i) returns the date for each day, and the last_day() function (available in Hive since version 1.1) converts each date to the last day of its month (hence the distinct). Run this example:
set hivevar:start_date=2015-07-01;
set hivevar:end_date=current_date;
select distinct last_day(date_add ('${hivevar:start_date}',s.i)) as last_date
from ( select posexplode(split(space(datediff(${hivevar:end_date},'${hivevar:start_date}')),' ')) as (i,x)
) s
order by last_date
;
Output:
OK
2015-07-31
2015-08-31
2015-09-30
2015-10-31
2015-11-30
2015-12-31
2016-01-31
2016-02-29
2016-03-31
2016-04-30
2016-05-31
2016-06-30
2016-07-31
2016-08-31
2016-09-30
2016-10-31
2016-11-30
2016-12-31
2017-01-31
2017-02-28
2017-03-31
2017-04-30
2017-05-31
2017-06-30
2017-07-31
2017-08-31
2017-09-30
2017-10-31
2017-11-30
2017-12-31
2018-01-31
2018-02-28
2018-03-31
2018-04-30
2018-05-31
2018-06-30
2018-07-31
Time taken: 71.581 seconds, Fetched: 37 row(s)
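As a cross-check of the output above, the same list can be produced by stepping month by month instead of day by day. This Python sketch is not Hive; `month_ends` is a hypothetical helper, and the end date is an assumption (some day in July 2018, since the original run used current_date and its last row is 2018-07-31).

```python
import calendar
from datetime import date


def month_ends(start, end):
    """Return the last day of every month from start's month to end's month."""
    result = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        # monthrange returns (weekday of the 1st, number of days in the month).
        last_day = calendar.monthrange(year, month)[1]
        result.append(date(year, month, last_day))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return result


# Assumption: the original query was run on some day in July 2018.
ends = month_ends(date(2015, 7, 1), date(2018, 7, 10))
print(len(ends))          # 37
print(ends[0], ends[-1])  # 2015-07-31 2018-07-31
```

Like last_day() in the Hive query, this includes the end of the final month even when the end date falls earlier in that month.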

Concatenate date and time fields and turn into datetime postgresql

I have a table with the date and time fields separated
Table1
data hora id
2015-01-01 11:40:06 1
2015-01-01 15:40:06 2
2015-01-02 15:40:06 3
2015-01-05 10:40:06 4
2015-01-05 15:40:06 5
2015-01-06 08:23:00 6
Now I need to query the ids between 2015-01-01 12:00:00 and 2015-01-05 12:00:00, which should return the ids 2, 3, 4.
I'm trying to convert and concatenate the separate date and time fields into a single datetime value so that I can use 'between', but I can't get the syntax right. Can someone give an example?
It works!
SELECT
*
FROM
tableA
WHERE
(dataemissao + hora) BETWEEN (date '2015-01-21' + time '14:00')
AND (date '2015-01-21' + time '18:00')
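The idea of adding a date and a time to get one comparable value can be tried against the sample rows from the question. This Python sketch uses datetime.combine as the analogue of date + time; the function name `ids_between` is hypothetical, and the range is the one stated in the question rather than the one in the query above.

```python
from datetime import date, datetime, time

# (data, hora, id) rows from Table1 in the question.
table1 = [
    (date(2015, 1, 1), time(11, 40, 6), 1),
    (date(2015, 1, 1), time(15, 40, 6), 2),
    (date(2015, 1, 2), time(15, 40, 6), 3),
    (date(2015, 1, 5), time(10, 40, 6), 4),
    (date(2015, 1, 5), time(15, 40, 6), 5),
    (date(2015, 1, 6), time(8, 23, 0), 6),
]


def ids_between(rows, start, end):
    """Return ids whose combined date and time fall in [start, end], inclusive."""
    return [
        row_id
        for data, hora, row_id in rows
        # Combining the two fields gives a single value that can be range-checked.
        if start <= datetime.combine(data, hora) <= end
    ]


matches = ids_between(table1, datetime(2015, 1, 1, 12), datetime(2015, 1, 5, 12))
print(matches)  # [2, 3, 4]
```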