intersect date intervals and calculate value within them - postgresql

I have a table containing "start_at", "finish_at" and "value" columns:
start_at            | finish_at           | value
--------------------------------------------------
2022-11-01 10:00:00 | 2022-11-01 22:00:00 | 5
2022-11-01 16:00:00 | 2022-11-01 19:00:00 | 8
2022-11-01 18:00:00 | 2022-11-01 23:00:00 | 3
I want to aggregate date intervals and calculate max value within them. Expected result:
start_at            | finish_at           | max
------------------------------------------------
2022-11-01 10:00:00 | 2022-11-01 16:00:00 | 5
2022-11-01 16:00:00 | 2022-11-01 19:00:00 | 8
2022-11-01 19:00:00 | 2022-11-01 22:00:00 | 5
2022-11-01 22:00:00 | 2022-11-01 23:00:00 | 3
I tried
SELECT range_agg(tsrange(start_at, finish_at)), max(value) FROM my_table;
but the result is not exactly what I need:
range_agg                                       | max
------------------------------------------------------
{["2022-11-01 10:00:00","2022-11-01 23:00:00")} | 8

Setup:
BEGIN;
CREATE TABLE test6 (
    start_at timestamp,
    finish_at timestamp,
    value numeric
);
INSERT INTO test6
VALUES ('2022-11-01 10:00:00', '2022-11-01 22:00:00', 5),
       ('2022-11-01 16:00:00', '2022-11-01 19:00:00', 8),
       ('2022-11-01 18:00:00', '2022-11-01 23:00:00', 3);
-- The range column is aliased explicitly; without an alias it would
-- implicitly be named after the function (tsrange).
CREATE VIEW test6v AS
SELECT
    tsrange(start_at, finish_at, '[)') AS tsrange,
    value
FROM
    test6;
COMMIT;
Query:
WITH cte AS (
    -- cut the span from min(start_at) to max(finish_at) into 3-hour slices
    SELECT
        tsrange(s, s + interval '3 hour', '[)') AS RANGE
    FROM (
        SELECT
            generate_series(min(start_at), max(finish_at), interval '3 hour')
        FROM
            test6) foo (s))
SELECT
    RANGE,
    -- max over the rows whose tsrange overlaps the slice
    max(value) FILTER (WHERE RANGE && tsrange IS TRUE)
FROM
    cte
    CROSS JOIN test6v
GROUP BY
    RANGE
ORDER BY
    RANGE;
You can build whatever table of slice ranges you need. Here I cut the span from min(start_at) to max(finish_at) into slices of interval '3 hour'.
The third parameter of the tsrange function sets whether each bound of the range is inclusive or exclusive.
Group by the custom tsrange, then take the max of value over the rows of test6 whose tsrange overlaps the custom tsrange.
The following variant, with hand-picked slice start points, is almost identical to your expected result:
WITH cte AS (
    SELECT
        tsrange(s, s + interval '3 hour', '[)') AS RANGE
    FROM ((
            -- 3-hour slices from the hard-coded second start point onwards
            SELECT
                generate_series('2022-11-01 16:00:00'::timestamp, max(finish_at), interval '3 hour')
            FROM
                test6)
        UNION (
            -- plus one slice beginning at the first start point
            SELECT
                '2022-11-01 10:00:00'::timestamp)) s (s))
SELECT
    RANGE,
    max(value) FILTER (WHERE RANGE && tsrange IS TRUE)
FROM
    cte
    CROSS JOIN test6v
GROUP BY
    RANGE
ORDER BY
    RANGE;
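A fixed 3-hour grid can only approximate the expected result, because interval boundaries such as 18:00 or 23:00 need not fall on it. As a rough sketch of a data-driven alternative (not part of the answer above), the slices can be cut at the start_at/finish_at values themselves, and adjacent slices with the same max can be merged afterwards. The CTE names bounds, slices, maxed, flagged and grouped are made up for illustration, and the sample rows are inlined so the query runs on its own:

```sql
WITH test6 (start_at, finish_at, value) AS (
    VALUES (timestamp '2022-11-01 10:00:00', timestamp '2022-11-01 22:00:00', 5),
           (timestamp '2022-11-01 16:00:00', timestamp '2022-11-01 19:00:00', 8),
           (timestamp '2022-11-01 18:00:00', timestamp '2022-11-01 23:00:00', 3)
), bounds AS (
    -- every distinct start or finish timestamp is a potential cut point
    SELECT start_at AS t FROM test6
    UNION
    SELECT finish_at FROM test6
), slices AS (
    -- consecutive cut points form the elementary slices
    SELECT t AS start_at, lead(t) OVER (ORDER BY t) AS finish_at
    FROM bounds
), maxed AS (
    -- max value among the rows overlapping each slice; the last cut point has a
    -- NULL finish_at and drops out of the join, as do slices that no row covers
    SELECT s.start_at, s.finish_at, max(t.value) AS max_value
    FROM slices s
    JOIN test6 t ON t.start_at < s.finish_at AND t.finish_at > s.start_at
    GROUP BY s.start_at, s.finish_at
), flagged AS (
    -- a slice opens a new group unless it directly continues the previous
    -- slice with the same max
    SELECT *,
           CASE WHEN lag(finish_at) OVER w = start_at
                 AND lag(max_value) OVER w = max_value
                THEN 0 ELSE 1 END AS is_new
    FROM maxed
    WINDOW w AS (ORDER BY start_at)
), grouped AS (
    -- running sum of the flags numbers the groups
    SELECT *, sum(is_new) OVER (ORDER BY start_at) AS grp
    FROM flagged
)
SELECT min(start_at) AS start_at, max(finish_at) AS finish_at, max(max_value) AS max
FROM grouped
GROUP BY grp
ORDER BY 1;
```

With the three sample rows, the elementary slices 16:00-18:00 and 18:00-19:00 both have max 8 and are merged, so the output has the four rows of the expected result in the question.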

Related

I need help in writing a subquery

I have a query like this to create date series:
Select month
From
    (select to_char(created_date, 'Mon') as Month,
            created_date::date as start_day,
            (created_date::date + interval '1 month - 1 day ')::date as end_day
     from generate_series(date '2021-01-26',
                          date '2022-04-26', interval '1 month') as g(created_date)) AS "thang"
And the table looks like this:
month
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Now I want to count the status from the KYC table.
So I try this:
Select
    (Select month
     From
         (select to_char(created_date, 'Mon') as Month,
                 created_date::date as start_day,
                 (created_date::date + interval '1 month - 1 day ')::date as end_day
          from generate_series(date '2021-01-26',
                               date '2022-04-26', interval '1 month') as g(created_date)) AS "thang"),
    count(*) filter (where status = 4) as "KYC_Success"
From kyc
group by 1
I hope the result will be like this:
Month | KYC_Success
Jan | 234
Feb | 435
Mar | 546
Apr | 157
But it said
error: more than one row returned by a subquery used as an expression
What should I change in this query?

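Let us assume that the KYC table has a timestamp column called created_date as well as the status column, and that you want to count the successful rows (status = 4, as in your query) per month, including months with zero successes. A scalar subquery in the select list may return only one row, which is why your attempt fails; join the generated months to kyc instead.
SELECT thang.month
    , count(CASE WHEN kyc.status = 4 THEN 1 END) AS "KYC_Success"
FROM (
    SELECT to_char(created_date, 'Mon') AS month
        , created_date::DATE AS start_date
        -- exclusive upper bound, so the last day of each period is counted too
        , (created_date::DATE + interval '1 month')::DATE AS end_date
    FROM generate_series(DATE '2021-01-26', DATE '2022-04-26', interval '1 month') AS g(created_date)
    ) AS thang
LEFT JOIN kyc ON kyc.created_date >= thang.start_date
    AND kyc.created_date < thang.end_date
-- group by start_date as well, so that e.g. Jan 2021 and Jan 2022 stay separate
GROUP BY thang.month, thang.start_date
ORDER BY thang.start_date;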
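To see the join-based approach in isolation, here is a minimal self-contained sketch. The kyc rows are invented sample data, the column types are assumptions, and the periods are plain calendar months rather than the 26th-to-25th periods of the question:

```sql
WITH kyc (created_date, status) AS (
    -- hypothetical sample rows; status 4 is taken to mean success, as in the question
    VALUES (timestamp '2021-01-05 09:00', 4),
           (timestamp '2021-01-20 14:30', 4),
           (timestamp '2021-01-31 23:59', 1),
           (timestamp '2021-03-31 23:59', 4)
), months AS (
    -- one row per month with a half-open period [month_start, next_month)
    SELECT m AS month_start, m + interval '1 month' AS next_month
    FROM generate_series(timestamp '2021-01-01',
                         timestamp '2021-04-01',
                         interval '1 month') AS g(m)
)
SELECT to_char(month_start, 'YYYY Mon') AS month,
       count(*) FILTER (WHERE kyc.status = 4) AS "KYC_Success"
FROM months
LEFT JOIN kyc ON kyc.created_date >= months.month_start
             AND kyc.created_date <  months.next_month
GROUP BY month_start
ORDER BY month_start;
```

Because the months drive the LEFT JOIN, February and April still appear with a count of 0, and the row on 2021-03-31 23:59 is counted for March thanks to the exclusive upper bound.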

How can I generate a series of date ranges that fall between a start date and end date?

I would like to generate an array of all the time ranges that fall within 4 pm-9 pm between a start and end date
start_date | end_date
----------------------------------------
2020-11-01 16:30:00 | 2020-11-03 18:30:00
The query should be able to turn the above table into:
row | start_date          | end_date
------------------------------------------------
1   | 2020-11-01 16:30:00 | 2020-11-01 21:00:00
2   | 2020-11-02 16:00:00 | 2020-11-02 21:00:00
3   | 2020-11-03 16:00:00 | 2020-11-03 18:30:00
Could someone point me in the right direction on how to approach this?

I would do it like this:
with input as (
    select '2020-11-01 16:30:00'::timestamptz as start_date,
           '2020-11-03 18:30:00'::timestamptz as end_date
)
select row_number() over (order by ddate) as row,
       case
           when start_date::date = ddate
                and start_date > ddate + interval '16 hours'
               then start_date
           else ddate + interval '16 hours'
       end as start_date,
       case
           when end_date::date = ddate
                and end_date < ddate + interval '21 hours'
               then end_date
           else ddate + interval '21 hours'
       end as end_date
from input
cross join lateral
     generate_series(
         case
             when start_date::time > '21:00' then start_date::date + interval '1 day'
             else start_date::date
         end,
         case
             when end_date::time < '16:00' then end_date::date - interval '1 day'
             else end_date::date
         end,
         interval '1 day') as gs(ddate)
;
┌─────┬────────────────────────┬────────────────────────┐
│ row │       start_date       │        end_date        │
├─────┼────────────────────────┼────────────────────────┤
│   1 │ 2020-11-01 16:30:00-05 │ 2020-11-01 21:00:00-05 │
│   2 │ 2020-11-02 16:00:00-05 │ 2020-11-02 21:00:00-05 │
│   3 │ 2020-11-03 16:00:00-05 │ 2020-11-03 18:30:00-05 │
└─────┴────────────────────────┴────────────────────────┘
(3 rows)

If you run generate_series with an hourly step instead of the daily step used in @MikeOrganek's solution, you can filter the generated hours directly. The series has to start and end on full hours, which date_trunc() takes care of. Because date_trunc() cuts the minutes off the end date, the series is extended by one hour; that is why an hour is added in the example.
Finally, GROUP BY the date and return the MIN and MAX time of each day. least() and greatest() clamp the first and last day to the actual start and end timestamps.
SELECT
    GREATEST(MIN(gs), start_date) as start_time,
    LEAST(MAX(gs), end_date) as end_time
FROM
    t,  -- t is the table holding start_date and end_date
    generate_series(
        date_trunc('hour', start_date),
        date_trunc('hour', end_date) + interval '1 hour',
        interval '1 hour'
    ) as gs
-- keep only the generated hours inside the 4 pm-9 pm window
WHERE gs::time >= '16:00:00' and gs::time <= '21:00:00'
GROUP BY start_date, end_date, gs::date
ORDER BY gs::date

Another solution is to use generate_series() with a one day interval and construct the time part depending on the value of the time in start/end date for the first and last day.
with from_to (start_date, end_date) as (
    values (timestamp '2020-11-01 16:30:00', timestamp '2020-11-03 18:30:00')
)
select g.nr as row,
       g.d::date + case
                       when g.d::date = start_date::date and start_date::time > time '16:00' then start_date::time
                       else time '16:00'
                   end as start_date,
       g.d::date + case
                       when g.d::date = end_date::date and end_date::time < time '21:00' then end_date::time
                       else time '21:00'
                   end as end_date
from from_to
     -- iterate over whole days; stepping from start_date itself could skip the
     -- last day when end_date has an earlier time of day than start_date
     cross join generate_series(start_date::date, end_date::date, interval '1 day') with ordinality as g(d,nr)
order by g.nr;
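The CASE expressions in these answers clamp each daily window to the overall interval by hand. As a hedged alternative sketch, the same clamping can be left to the range intersection operator `*`: build the 16:00-21:00 window for each day as a tsrange and intersect it with the input range. The names input, g, x and w are illustrative, and timestamp (without time zone) is assumed for both columns:

```sql
WITH input (start_date, end_date) AS (
    VALUES (timestamp '2020-11-01 16:30:00', timestamp '2020-11-03 18:30:00')
)
SELECT row_number() OVER (ORDER BY g.d) AS row,
       lower(x.w) AS start_date,
       upper(x.w) AS end_date
FROM input i
-- one row per calendar day touched by the input interval
CROSS JOIN LATERAL generate_series(i.start_date::date::timestamp,
                                   i.end_date::date::timestamp,
                                   interval '1 day') AS g(d)
-- intersect that day's 16:00-21:00 window with the input interval
CROSS JOIN LATERAL (
    SELECT tsrange(g.d + interval '16 hours', g.d + interval '21 hours')
         * tsrange(i.start_date, i.end_date) AS w
) AS x
-- days whose window lies entirely outside the interval give an empty range
WHERE NOT isempty(x.w)
ORDER BY g.d;
```

For the sample input this yields the three rows from the question. A start after 21:00 or an end before 16:00 simply produces an empty intersection for that day, which the WHERE clause drops.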

Postgresql split date range by year parts (financial year)

I have a table like follows:
id start_date end_date
1 2020-01-01 2020-05-01
2 2020-03-01 2021-04-02
I need to be able to split the rows by financial year (e.g. 2020-04-01 -> 2021-03-31).
So the result of the query would be as follows:
id start_date end_date
1 2020-01-01 2020-03-31
1 2020-04-01 2020-05-01
2 2020-03-01 2020-03-31
2 2020-04-01 2021-03-31
2 2021-04-01 2021-04-02

Actually another post helped me resolve this: Date split-up based on Fiscal Year
DROP TABLE IF EXISTS your_table;
CREATE TABLE your_table (id int, start_date date, end_date date);
INSERT INTO your_table VALUES (1, '2020-01-01', '2020-05-01');
INSERT INTO your_table VALUES (2, '2020-03-01', '2021-04-02');
SELECT
    id,
    -- make_date() avoids building dates from 'DD-MM-YYYY' strings,
    -- whose interpretation depends on the DateStyle setting
    GREATEST(start_date, make_date(series.year, 4, 1)) AS year_start,
    LEAST(end_date, make_date(series.year + 1, 3, 31)) AS year_end
FROM
    (SELECT
        id,
        start_date,
        end_date,
        -- one row per financial year touched; shifting by 3 months maps
        -- an April-to-March year onto a calendar year
        generate_series(
            date_part('year', your_table.start_date - INTERVAL '3 months')::int,
            date_part('year', your_table.end_date - INTERVAL '3 months')::int)
    FROM your_table) AS series(id, start_date, end_date, year)
ORDER BY
    id,
    year_start;
Result:
"id","year_start","year_end"
1,"2020-01-01","2020-03-31"
1,"2020-04-01","2020-05-01"
2,"2020-03-01","2020-03-31"
2,"2020-04-01","2021-03-31"
2,"2021-04-01","2021-04-02"
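The GREATEST/LEAST pair above is an interval intersection written out by hand. As a sketch under the same assumptions (financial year running from 1 April to 31 March, inclusive end dates), the intersection can also be expressed with daterange and the `*` operator. The aliases fy, x and p are made up for illustration, and the sample rows are inlined:

```sql
WITH your_table (id, start_date, end_date) AS (
    VALUES (1, date '2020-01-01', date '2020-05-01'),
           (2, date '2020-03-01', date '2021-04-02')
)
SELECT t.id,
       lower(x.p)     AS start_date,
       -- a daterange is stored with an exclusive upper bound, so step back one day
       upper(x.p) - 1 AS end_date
FROM your_table t
-- one row per financial year touched by the row's interval
CROSS JOIN LATERAL generate_series(
        date_part('year', t.start_date - interval '3 months')::int,
        date_part('year', t.end_date   - interval '3 months')::int) AS fy(y)
-- intersect the row's interval with that financial year
CROSS JOIN LATERAL (
    SELECT daterange(t.start_date, t.end_date, '[]')
         * daterange(make_date(fy.y, 4, 1), make_date(fy.y + 1, 4, 1)) AS p
) AS x
ORDER BY t.id, lower(x.p);
```

For the two sample rows this produces the same five rows as the result listed above.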

How to query hourly aggregated data by date with postgresql?

There is one table:
ID DATE
1 2017-09-16 20:12:48
2 2017-09-16 20:38:54
3 2017-09-16 23:58:01
4 2017-09-17 00:24:48
5 2017-09-17 00:26:42
..
The result I need is the last 7-days of data with hourly aggregated count of rows:
COUNT DATE
2 2017-09-16 21:00:00
0 2017-09-16 22:00:00
0 2017-09-16 23:00:00
1 2017-09-17 00:00:00
2 2017-09-17 01:00:00
..
I tried different stuff with EXTRACT, DISTINCT and also used the generate_series function (most stuff from similar stackoverflow questions)
This try was the best one currently:
SELECT
    date_trunc('hour', demotime) as date,
    COUNT(demotime) as count
FROM demo
GROUP BY date
How to generate hourly series for 7 days and fill-in the count of rows?

-- demo is the table from the question; hours without rows get a count of 0
SELECT dd, count(demotime)
FROM generate_series
        ( current_date - interval '7 days'
        , current_date
        , '1 hour'::interval) dd
LEFT JOIN demo
       ON dd = date_trunc('hour', demotime)
GROUP BY dd
ORDER BY dd;
To cover the last 7 days up to the current hour instead of up to midnight:
SELECT dd, count(demotime)
FROM generate_series
        ( date_trunc('hour', NOW()) - interval '7 days'
        , date_trunc('hour', NOW())
        , '1 hour'::interval) dd
LEFT JOIN demo
       ON dd = date_trunc('hour', demotime)
GROUP BY dd
ORDER BY dd;
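For a reproducible check, here is a small sketch with the question's five rows inlined and a fixed window instead of current_date. It joins on a half-open range (hour start inclusive, next hour exclusive) rather than on date_trunc of the column; that is a swapped-in variant, not the query above, and it can let the planner use a plain index on demotime. The alias h and the chosen window bounds are assumptions:

```sql
WITH demo (id, demotime) AS (
    VALUES (1, timestamp '2017-09-16 20:12:48'),
           (2, timestamp '2017-09-16 20:38:54'),
           (3, timestamp '2017-09-16 23:58:01'),
           (4, timestamp '2017-09-17 00:24:48'),
           (5, timestamp '2017-09-17 00:26:42')
)
SELECT g.h AS hour_start,
       count(demo.id) AS count   -- counts only matched rows, so empty hours give 0
FROM generate_series(timestamp '2017-09-16 20:00:00',
                     timestamp '2017-09-17 00:00:00',
                     interval '1 hour') AS g(h)
LEFT JOIN demo ON demo.demotime >= g.h
              AND demo.demotime <  g.h + interval '1 hour'
GROUP BY g.h
ORDER BY g.h;
```

Each row is labelled with the start of its hour, so the two rows at 20:xx appear under 20:00:00, whereas the expected output in the question labels them with the end of the hour (21:00:00). Adding interval '1 hour' to hour_start in the select list would reproduce that labelling.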

postgresql daysdiff between two dates grouped by month

I have a table with the date columns start_date and end_date, and I want to calculate the number of days between these dates, broken down by month.
I am able to get the difference in days, but I do not know how to group it by month. Any suggestions?
Table:
id Start_date End_date days
1234 2014-06-03 2014-07-05 32
12345 2014-02-02 2014-05-10 97
Expected results:
month diff_days
2 26
3 30
4 31
5 10
6 27
7 5

I think your expected output numbers are off a little. You might want to double-check.
I use a calendar table myself, but this query uses a CTE and date arithmetic. Avoiding the hard-coded date '2014-01-01' and the interval for 365 days is straightforward, but it makes the query harder to read, so I just used those values directly.
with your_data as (
    select date '2014-06-03' as start_date, date '2014-07-05' as end_date union all
    select '2014-02-02', '2014-05-10'
), calendar as (
    select date '2014-01-01' + (n || ' days')::interval calendar_date
    from generate_series(0, 365) n
)
select extract (month from calendar_date) calendar_month, count(*)
from calendar
inner join your_data on calendar.calendar_date between start_date and end_date
group by calendar_month
order by calendar_month;
calendar_month count
--
2 27
3 31
4 30
5 10
6 28
7 5
As a rule of thumb, you should never group by the month alone--doing that risks grouping data from different years. This is a safer version that includes the year, and which also restricts output to a single calendar year.
with your_data as (
    select date '2014-06-03' as start_date, date '2014-07-05' as end_date union all
    select '2014-02-02', '2014-05-10'
), calendar as (
    select date '2014-01-01' + (n || ' days')::interval calendar_date
    from generate_series(0, 700) n
)
select extract (year from calendar_date) calendar_year, extract (month from calendar_date) calendar_month, count(*)
from calendar
inner join your_data on calendar.calendar_date between start_date and end_date
where calendar_date between '2014-01-01' and '2014-12-31'
group by calendar_year, calendar_month
order by calendar_year, calendar_month;
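One possible way to avoid the hard-coded calendar start and length, sketched here with the same two sample rows, is to generate the days of each row directly with a LATERAL generate_series and group by the truncated month. The alias g(d) and the 'YYYY-MM' label format are illustrative choices:

```sql
WITH your_data (start_date, end_date) AS (
    VALUES (date '2014-06-03', date '2014-07-05'),
           (date '2014-02-02', date '2014-05-10')
)
SELECT to_char(date_trunc('month', g.d), 'YYYY-MM') AS month,
       count(*) AS days
FROM your_data
-- one row per day of each interval, both end points included
CROSS JOIN LATERAL generate_series(start_date::timestamp,
                                   end_date::timestamp,
                                   interval '1 day') AS g(d)
-- date_trunc keeps the year, so months of different years stay separate
GROUP BY date_trunc('month', g.d)
ORDER BY date_trunc('month', g.d);
```

On the sample rows this gives 27, 31, 30, 10, 28 and 5 days for February to July 2014, matching the counts above.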

with min_max as (
    select min(start_date) as start_date, max(end_date) as end_date
    from t
), g as (
    select daterange(d::date, (d + interval '1 month')::date, '[)') as r
    from generate_series(
        (select date_trunc('month', start_date) from min_max),
        (select end_date from min_max),
        '1 month'
    ) g(d)
)
select *
from (
    select
        to_char(lower(r), 'YYYY Mon') as "Month",
        sum(upper(r) - lower(r)) as days
    from (
        select t.r * g.r as r
        from
            (
                select daterange(start_date, end_date, '[]') as r
                from t
            ) t
            inner join
            g on t.r && g.r
    ) s
    group by 1
) s
order by to_timestamp("Month", 'YYYY Mon')
;
Month | days
----------+------
2014 Feb | 27
2014 Mar | 31
2014 Apr | 30
2014 May | 10
2014 Jun | 28
2014 Jul | 5
See the PostgreSQL documentation on range data types and on range functions and operators.
