I have a SQL table with 2 fields: TimeStamp and Value. Below is an excerpt of some of the data.
2005-02-17 13:31:00 2
2005-02-17 13:46:00 3
2005-02-17 14:01:00 1.7
2005-02-17 14:16:00 2.3
2005-02-17 14:31:00 2
2005-02-17 14:46:00 2.5
2005-02-17 15:01:00 2.2
2005-02-17 15:16:00 2.4
2005-02-17 15:31:00 2.6
2005-02-17 15:46:00 2.6
2005-02-17 16:01:00 2.7
I am trying to take an hourly average of the Value column, but I cannot seem to make this work correctly. The final output should show the starting hour of the TimeStamp and the averaged value of the Value column.
For the final output I am looking to get a full timestamp as a result, not just the hour. So for 14:00 - 14:59 on 2005-02-17 (the four samples 1.7, 2.3, 2 and 2.5) the resulting output would be:
2005-02-17 14:00:00 2.125
I would do it like this:
SELECT CAST(FLOOR(CAST(timestamp AS float)) AS datetime) AS day --strip time
, DATEPART(hh, timestamp) AS hour
, AVG(value) AS average
FROM times
GROUP BY CAST(FLOOR(CAST(timestamp AS float)) AS datetime)
, DATEPART(hh, timestamp)
select Time_Stamp_Hour=dateadd(hh,datepart(hh,Time_Stamp), cast(CAST(Time_Stamp as date) as datetime))
, AvgValue=AVG(Value)
from ValueLog
group by dateadd(hh,datepart(hh,Time_Stamp), cast(CAST(Time_Stamp as date) as datetime))
Result:
Time_Stamp_Hour AvgValue
----------------------- ----------------------
2005-02-17 13:00:00.000 2.5
2005-02-17 14:00:00.000 2.125
2005-02-17 15:00:00.000 2.45
2005-02-17 16:00:00.000 2.7
Compatibility: SQL Server 2008+.
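An equivalent idiom (a sketch, not from the original answer, reusing the same ValueLog / Time_Stamp / Value names) is DATEADD/DATEDIFF against a fixed anchor date, which truncates the timestamp to the start of its hour without any string or float conversion:
select Time_Stamp_Hour = dateadd(hour, datediff(hour, 0, Time_Stamp), 0) -- 0 is implicitly 1900-01-01, the anchor
, AvgValue = AVG(Value)
from ValueLog
group by dateadd(hour, datediff(hour, 0, Time_Stamp), 0)
DATEDIFF counts whole hours elapsed since the anchor and DATEADD adds them back onto the anchor, landing exactly on the hour boundary.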
SELECT DATEPART(hour, Col1) AS hourcol, AVG(Col2)
FROM Yourtable
GROUP BY DATEPART(hour, Col1);
OR
SELECT SUBSTRING(CONVERT(varchar(19), Col1, 120), 1, 14) + '00' AS hourcol, AVG(Col2)
FROM Yourtable
GROUP BY SUBSTRING(CONVERT(varchar(19), Col1, 120), 1, 14) + '00';
In this query the DATEPART function extracts the hour from each value in the DATETIME column, and the average of the second column is then calculated per hour.
I think you also want it grouped by date, not only by hour, right?
select
convert(VARCHAR(10), date, 111) as aDate,
datepart(HH, date) anHour,
avg(value) anAverage
from t
group by convert(VARCHAR(10), date, 111), datepart(HH, date)
Or this:
; with aTable as (
select
convert(VARCHAR(10), date, 111) as aDate,
datepart(HH, date) anHour,
value
from t)
select aDate, anHour, avg(value) from aTable
group by aDate, anHour
SELECT
AVG(myvalue) [Average],
DATEADD(HOUR, DATEPART(HOUR, mydate), CAST(CAST(mydate as Date) as datetime)) [Hour]
FROM
myTable
GROUP BY
DATEADD(HOUR, DATEPART(HOUR, mydate), CAST(CAST(mydate as Date) as datetime))
ORDER BY
DATEADD(HOUR, DATEPART(HOUR, mydate), CAST(CAST(mydate as Date) as datetime))
Here is a complex query where I need to pass some dates dynamically. As of now I have hardcoded the two dates '2021-08-01' and '2022-07-31'.
But I have to pass these dates dynamically so that for the next period, i.e. the 2022-06 month, the dates passed would be '2021-07-01' and '2022-06-30', basically the trailing 12 months of data.
If we take 2022-05, then the passed dates should be '2021-06-01' and '2022-05-31'.
How can we achieve this? Any suggestions or help will be much appreciated.
Below is the query for reference:
WITH base as
(
SELECT created_at as period ,order_number, TRIM(email) as email ,is_first_order
FROM orders
WHERE created_at::DATE BETWEEN '2021-08-01' AND '2022-07-31'
)
,base_agg as
(
select TO_CHAR(period,'YYYY-MM') as period
,COUNT(DISTINCT email)FILTER(WHERE is_first_order IS TRUE) as new_users
,COUNT(DISTINCT order_number)FILTER(WHERE is_first_order IS FALSE) as returning_orders
FROM base
GROUP BY 1
)
,base_cumulative as
(
SELECT ROW_NUMBER() OVER(ORDER BY PERIOD DESC ) as rno
,period
,new_users
,returning_orders
,sum("new_users")over (order by "period" asc rows between unbounded preceding and current row) as "cumulative_total"
from base_agg
)
SELECT
(SELECT period FROM base_cumulative WHERE rno=1) period
,(SELECT cumulative_total FROM base_cumulative WHERE rno=1) as cumulated_customers
,SUM(returning_orders) as returning_orders
,SUM(returning_orders)/NULLIF((SELECT cumulative_total FROM base_cumulative WHERE rno=1),0) as rate
FROM base_cumulative
You can calculate the end of the current month based on NOW() and some interval arithmetic; the same approach can be applied to the rest of the calculation:
select date_trunc('month', now())::date + interval '1 month - 1 day' end_of_this_month,
date_trunc('month', now())::date + interval '1 month - 1 day'::interval - '1 year'::interval + '1 day'::interval first_day_of_prev_year_month
;
Result
end_of_this_month | first_day_of_prev_year_month
---------------------+------------------------------
2022-08-31 00:00:00 | 2021-09-01 00:00:00
(1 row)
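To reproduce the exact window from the question (the first day of the month twelve months back through the last day of the previous month), a similar sketch can be built on date_trunc; assuming the query runs in the month right after the reporting period, running it in August 2022 yields 2021-08-01 and 2022-07-31:
select (date_trunc('month', now()) - interval '12 months')::date as window_start,
       (date_trunc('month', now()) - interval '1 day')::date as window_end;
These two expressions could then replace the hardcoded dates in the BETWEEN clause of the base CTE.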
I have a table like follows:
id start_date end_date
1 2020-01-01 2020-05-01
2 2020-03-01 2021-04-02
I need to be able to split the rows by financial year (e.g. 2020-04-01 -> 2021-03-31).
So the result of the query would be as follows:
id start_date end_date
1 2020-01-01 2020-03-31
1 2020-04-01 2020-05-01
2 2020-03-01 2020-03-31
2 2020-04-01 2021-03-31
2 2021-04-01 2021-04-02
Actually another post helped me resolve this: Date split-up based on Fiscal Year
DROP TABLE your_table;
CREATE TABLE your_table (id int, start_date date, end_date date);
INSERT INTO your_table VALUES (1, '2020-01-01', '2020-05-01');
INSERT INTO your_table VALUES (2, '2020-03-01', '2021-04-02');
SELECT
id,
GREATEST(start_date, ('01-04-'||series.year)::date) AS year_start,
LEAST(end_date, ('31-03-'||series.year + 1)::date) AS year_end
FROM
(SELECT
id,
start_date,
end_date,
generate_series(
date_part('year', your_table.start_date - INTERVAL '3 months')::int,
date_part('year', your_table.end_date - INTERVAL '3 months')::int)
FROM your_table) AS series(id, start_date, end_date, year)
ORDER BY
start_date;
Result:
"id","year_start","year_end"
1,"2020-01-01","2020-03-31"
1,"2020-04-01","2020-05-01"
2,"2020-03-01","2020-03-31"
2,"2020-04-01","2021-03-31"
2,"2021-04-01","2021-04-02"
So let's say you have a table with the column below (type is datetimeoffset(3)).
DTO_Created
2017-04-28 03:16:56.942 -05:00
2017-05-01 00:20:54.925 -05:00
2017-05-01 12:17:52.752 -05:00
2017-05-01 23:21:00.198 -05:00
2017-05-02 01:19:23.254 -05:00
How would you query to only get the rows created on 2017-05-01 (3 rows total)? I am attempting this, but am not getting all 3:
SELECT * FROM MyTable WHERE DTO_Created >= '2017-05-01 00:00:00.000' AND DTO_Created <= '2017-05-01 23:59:59.999'
The problem seems to be caused by the type (datetimeoffset) because this doesn't happen with regular datetime columns.
Environment: SQL Server 2016
You need to use datetimeoffset values in the WHERE clause as well.
First, create and populate a sample table (please save us this step in your future questions):
DECLARE @T AS TABLE
(
DTO_Created DateTimeOffset
)
INSERT INTO @T(DTO_Created) VALUES
('2017-04-28 03:16:56.942 -05:00'),
('2017-05-01 00:20:54.925 -05:00'),
('2017-05-01 12:17:52.752 -05:00'),
('2017-05-01 23:21:00.198 -05:00'),
('2017-05-02 01:19:23.254 -05:00')
The query:
SELECT *
FROM @T
WHERE DTO_Created >= '2017-05-01 00:00:00.000 -05:00'
AND DTO_Created < '2017-05-02 00:00:00.000 -05:00'
Results:
DTO_Created
2017-05-01 00:20:54.925 -05:00
2017-05-01 12:17:52.752 -05:00
2017-05-01 23:21:00.198 -05:00
Another option, for 2016 or higher versions, is to use AT TIME ZONE, but you'll have to add the hours difference to the search value, and beware of daylight saving time:
SELECT *
FROM @T
WHERE DTO_Created >= CAST('2017-05-01 05:00:00' AS DateTime2) AT TIME ZONE 'Easter Island Standard Time' -- -05:00
AND DTO_Created < CAST('2017-05-02 05:00:00' AS DateTime2) AT TIME ZONE 'Easter Island Standard Time'
If you want to ignore the offset and simply compare the local date part of the DateTimeOffset, you can cast it to Date, though that would be a non-sargable predicate:
SELECT *
FROM @T
WHERE CAST(DTO_Created As Date) = '2017-05-01'
Try this:
CONVERT(DATETIME, DTO_Created, 1) >= '2017-05-01 00:00:00.000' AND CONVERT(DATETIME, DTO_Created, 1) <= '2017-05-01 23:59:59.999'
Or this:
cast(DTO_Created as datetime) >= '2017-05-01 00:00:00.000' AND cast(DTO_Created as datetime) <= '2017-05-01 23:59:59.999'
There is one table:
ID DATE
1 2017-09-16 20:12:48
2 2017-09-16 20:38:54
3 2017-09-16 23:58:01
4 2017-09-17 00:24:48
5 2017-09-17 00:26:42
..
The result I need is the last 7 days of data with an hourly aggregated count of rows:
COUNT DATE
2 2017-09-16 21:00:00
0 2017-09-16 22:00:00
0 2017-09-16 23:00:00
1 2017-09-17 00:00:00
2 2017-09-17 01:00:00
..
I tried different approaches with EXTRACT, DISTINCT, and the generate_series function (mostly taken from similar Stack Overflow questions).
This attempt was the best one so far:
SELECT
date_trunc('hour', demotime) as date,
COUNT(demotime) as count
FROM demo
GROUP BY date
How can I generate an hourly series for 7 days and fill in the count of rows?
SELECT dd, count("demotime")
FROM generate_series
( current_date - interval '7 days'
, current_date
, '1 hour'::interval) dd
LEFT JOIN Table1
ON dd = date_trunc('hour', demotime)
GROUP BY dd;
To cover the window from now - 7 days up to now:
SELECT dd, count("demotime")
FROM generate_series
( date_trunc('hour', NOW()) - interval '7 days'
, date_trunc('hour', NOW())
, '1 hour'::interval) dd
LEFT JOIN Table1
ON dd = date_trunc('hour', demotime)
GROUP BY dd;
I have a query to get the count of buses which travel less than 100 km per day, so I use this query in PostgreSQL:
select day,count(*)as bus_count from(
SELECT date_trunc('hour',start_time)::timestamp::date as day,bus_id,sum(distance_two_points) as distance
FROM public.datatable where start_time >= '2015-09-05 00:00:00' and start_time <= '2015-09-05 23:59:59'
group by day,bus_id
) as A where distance<=250000 group by day
The inner query returns this result:
day bus_id distance
___ ________ _________
"2015-09-05 00:00:00" 1 523247
"2015-09-05 00:00:00" 2 135114
"2015-09-05 00:00:00" 3 178560
"2015-09-05 00:00:00" 4 400071
"2015-09-05 00:00:00" 5 312832
"2015-09-05 00:00:00" 6 237075
I now want to use this same query (achieving the same results) in SAP HANA, but there is no date_trunc function. I also tried:
SELECT EXTRACT (DAY FROM TO_DATE (START_TIME, 'YYYY-MM-DD')) "extract" as day,
bus_id, sum(distance_two_points) as distance
FROM public.datatable
where start_time >= '2015-09-05 00:00:00' and start_time <= '2015-09-05 23:59:59'
group by day,bus_id
) as A where distance<=250000 group by day
Any help is appreciated.
SELECT SERIES_ROUND('2013-05-24', 'INTERVAL 1 YEAR', ROUND_DOWN) "result" FROM DUMMY;
SELECT SERIES_ROUND('04:25:01', 'INTERVAL 10 MINUTE') "result" FROM DUMMY;
SERIES_ROUND in SAP HANA provides functionality similar to DATE_TRUNC in other databases.
https://help.sap.com/docs/SAP_HANA_PLATFORM/4fe29514fd584807ac9f2a04f6754767/435ec476ab494ad6b8409f22abec13fe.html?version=2.0.00
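Applied to the question's data, it might look like the following sketch (assuming the datatable / start_time / bus_id / distance_two_points names from the question and an 'INTERVAL 1 DAY' rounding unit):
SELECT day, COUNT(*) AS bus_count FROM (
  SELECT SERIES_ROUND(start_time, 'INTERVAL 1 DAY', ROUND_DOWN) AS day,
         bus_id,
         SUM(distance_two_points) AS distance
  FROM datatable
  WHERE start_time >= '2015-09-05 00:00:00' AND start_time <= '2015-09-05 23:59:59'
  GROUP BY SERIES_ROUND(start_time, 'INTERVAL 1 DAY', ROUND_DOWN), bus_id
) AS A
WHERE distance <= 250000
GROUP BY day;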
Converting to a non-datetime data type is usually not a good idea (additional parsing, encoding, semantics...).
Instead use a less granular datetime data type: daydate in this case.
create column table datatab (start_time seconddate, bus_id int, distance_two_points decimal (10, 2));
insert into datatab values (to_seconddate('05.09.2015 13:12:00'), 1, 50.2);
insert into datatab values (to_seconddate('05.09.2015 13:22:00'), 1, 1.2);
insert into datatab values (to_seconddate('05.09.2015 15:32:00'), 1, 24);
insert into datatab values (to_seconddate('05.09.2015 13:12:00'), 1, 50.2);
insert into datatab values (to_seconddate('05.09.2015 14:22:00'), 2, 1.2);
insert into datatab values (to_seconddate('05.09.2015 16:32:00'), 2, 24);
select to_seconddate(day) as day,count(*) as bus_count from(
SELECT to_date(start_time) as day, bus_id, sum(distance_two_points) as distance
FROM datatab
where start_time between '2015-09-05 00:00:00' and '2015-09-05 23:59:59'
group by to_date(start_time),bus_id
) as A
where distance<=250000
group by day;
The inner query gives you:
DAY BUS_ID DISTANCE
2015-09-05 1 75.40
2015-09-05 2 25.20
So your seconddate "start_time" is now aggregated at day level (as daydate) and then converted back to seconddate.
What I prefer is using the seconds_between() or nano100_between() function.
select now(),
add_seconds( to_date('1970.01.01', 'YYYY.MM.DD'),
round(
SECONDS_BETWEEN(
to_date('1970.01.01', 'YYYY.MM.DD'),
now()
)/3600
)*3600
)
from dummy;
This looks a bit ugly, but since to_date() is calculated just once rather than for each row, and the seconds arithmetic is close to how HANA stores the value internally, it should be the most efficient of the lot.
It is also the most flexible: round by second, minute, hour, day, ... everything below a year is fine.
PS: round() supports all round and truncate options.
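For example, truncating to the start of the hour (the equivalent of date_trunc('hour', ...)) could look like this sketch, using floor() instead of a plain round() (round() with its truncate option would do the same):
select add_seconds( to_date('1970.01.01', 'YYYY.MM.DD'),
       floor( seconds_between( to_date('1970.01.01', 'YYYY.MM.DD'), now() ) / 3600 ) * 3600 )
from dummy;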
Assuming your start_time is of some date/time type (e.g. SECONDDATE), you could use
...TO_NVARCHAR(START_TIME, 'YYYY-MM-DD') AS DAY...
instead of date_trunc... in PostgreSQL.
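Dropped into the shape of the question's inner query, that might look like this sketch (reusing the hypothetical datatab table from the answer above):
SELECT TO_NVARCHAR(start_time, 'YYYY-MM-DD') AS day, bus_id, SUM(distance_two_points) AS distance
FROM datatab
GROUP BY TO_NVARCHAR(start_time, 'YYYY-MM-DD'), bus_id;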
Why don't you use the CAST() conversion function?
select
cast( now() as date ) myDate
from dummy;