Speed up postgres read window query on overlapping date ranges - postgresql

I have a (simplified) table that contains readings as follows:
meter_id  read_date   value
1         2017-01-01  10
1         2017-01-15  15
1         2017-02-05  20
1         2017-04-15  22
2         2016-12-14  120
2         2016-03-02  200
This table contains millions of readings.
And I have a view (or query) that goes something like:
select meter_id, read_date as start_read_date, value as start_value,
CASE
WHEN lead(read_date) OVER read_wdw IS NULL THEN (date_trunc('month'::text, read_date + '1 day'::interval) + '1 mon'::interval - '1 day'::interval) + '1 mon'::interval - '1 day'::interval
ELSE lead(read_date) OVER read_wdw
END::date AS end_read_date,
lead(value) OVER read_wdw AS end_value
from reads_table
WINDOW read_wdw AS (PARTITION BY meter_id ORDER BY read_date);
I need to be able to query readings within a certain month, i.e. rows whose start_read_date to end_read_date range overlaps e.g. '2017-01-01' to '2017-01-31'.
So e.g.
select * from my_view where daterange(start_read_date, end_read_date, '[]') && daterange('2017-01-01', '2017-01-31', '[]')
Which with the above table would return
meter_id  start_read_date  start_value  end_read_date  end_value
1         2017-01-01       10           2017-01-15     15
1         2017-01-15       15           2017-02-05     20
2         2016-12-14       120          2016-03-02     200
Is there a way to do a similar query on this table without having to build the whole view first to get my desired result?
Something like the following (which doesn't work, because window functions are not allowed in WHERE):
select meter_id, read_date as start_read_date, value as start_value,
CASE
WHEN lead(read_date) OVER read_wdw IS NULL THEN (date_trunc('month'::text, read_date + '1 day'::interval) + '1 mon'::interval - '1 day'::interval) + '1 mon'::interval - '1 day'::interval
ELSE lead(read_date) OVER read_wdw
END::date AS end_read_date,
lead(value) OVER read_wdw AS end_value
from reads_table
where read_date between '2017-01-01' and '2017-01-31'
or lead(read_date) over read_wdw between '2017-01-01' and '2017-01-31'
WINDOW read_wdw AS (PARTITION BY meter_id ORDER BY read_date);

Actually, wrapping it in another select seems to resolve it:
select * from (
select meter_id, read_date as start_read_date, value as start_value,
CASE
WHEN lead(read_date) OVER read_wdw IS NULL THEN (date_trunc('month'::text, read_date + '1 day'::interval) + '1 mon'::interval - '1 day'::interval) + '1 mon'::interval - '1 day'::interval
ELSE lead(read_date) OVER read_wdw
END::date AS end_read_date,
lead(value) OVER read_wdw AS end_value
from reads_table
WINDOW read_wdw AS (PARTITION BY meter_id ORDER BY read_date)
) sub
where start_read_date between ...
or end_read_date between ...
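For what it's worth, the wrapping is needed because window functions are evaluated after WHERE, so lead() can only be filtered on from an outer query. Below is a minimal sketch of the wrapped form with a concrete overlap test. The sample rows come from the question; the temp table, the window name w and the DATE literals are illustrative assumptions, and the end-of-month fallback for the last reading is left out for brevity, so the last reading of each meter (whose end date is NULL) is filtered out here.

```sql
-- Hypothetical sample table mirroring the question's reads_table.
CREATE TEMP TABLE reads_table (meter_id int, read_date date, value int);
INSERT INTO reads_table VALUES
  (1, '2017-01-01', 10), (1, '2017-01-15', 15),
  (1, '2017-02-05', 20), (1, '2017-04-15', 22);

-- Compute lead() in a subquery, then filter on its output outside.
SELECT *
FROM (
  SELECT meter_id,
         read_date              AS start_read_date,
         value                  AS start_value,
         lead(read_date) OVER w AS end_read_date,
         lead(value)     OVER w AS end_value
  FROM reads_table
  WINDOW w AS (PARTITION BY meter_id ORDER BY read_date)
) sub
-- Two closed ranges overlap when each starts on or before the other ends.
WHERE start_read_date <= DATE '2017-01-31'
  AND end_read_date   >= DATE '2017-01-01';
```

With the sample rows this should return the 2017-01-01 to 2017-01-15 and 2017-01-15 to 2017-02-05 intervals for meter 1. As far as I know, the subquery still computes lead() over every row before the outer filter is applied, so this fixes the syntax rather than the cost.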

Related

PostgreSQL: Date Calendar Days Interval Scenario

I would like to print this table (displaying only 4 rows for brevity):
Dates        Period
01-MAR-2022  61
02-MAR-2022  61
03-MAR-2022  61
30-APR-2022  61
So far I have:
SELECT CAST(TRUNC(date_trunc('month',CURRENT_DATE) + interval '-2 month') AS DATE) + (n || 'day')::INTERVAL AS Dates
, date_trunc('month',CURRENT_DATE) + interval '-2 month' + INTERVAL '2 month' - date_trunc('month',CURRENT_DATE) + interval '-2 month' AS Period
FROM generate_series(0,61) n
Please help with a better way of generating the period and also replacing the hard-coded 61 in generate_series(0,61).
Thanks!
What are you actually trying to accomplish? It is neither clear nor specified. BTW, your query is invalid. It appears you are looking to list each date from the first day of the month 2 months prior through the last day of the previous month, along with the total number of days in that range. The following gives the first date, and date subtraction gives the number of days.
with full_range( first_dt, num_days) as
( select date_trunc ('month', (current_date - interval '2 months'))::date
, date_trunc ('month', current_date)::date -
date_trunc ('month', (current_date - interval '2 months'))::date
)
select *
from full_range;
With that in hand you can use the num_days with generate series with the expression
select generate_series(0, num_days-1) from full_range
Finally combine the above arriving at: (see demo)
with full_range( first_dt, num_days) as
( select date_trunc ('month', (current_date - interval '2 months'))::date
, date_trunc ('month', current_date)::date -
date_trunc ('month', (current_date - interval '2 months'))::date
)
select (first_dt + n*interval '1 day')::date, num_days
from full_range
cross join (select generate_series(0, num_days-1) from full_range) gn(n);
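As an alternative sketch (not the answerer's query), generate_series can step over the dates directly, and a window count supplies the period, so no hard-coded 61 is needed. The column aliases dates and period follow the question's headings; the cast to timestamp is my own choice to keep time zones out of the arithmetic.

```sql
-- One row per day, from the first day of the month two months back
-- through the last day of the previous month.
SELECT d::date          AS dates,
       count(*) OVER () AS period   -- total number of generated days
FROM generate_series(
       date_trunc('month', current_date::timestamp) - interval '2 months',
       date_trunc('month', current_date::timestamp) - interval '1 day',
       interval '1 day') AS d;
```

Run on any day in May 2022, this should list 2022-03-01 through 2022-04-30 with a period of 61 on every row.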

How can I get a week range for a given month in Postgres

This is my current implementation
SELECT
date_trunc('month', do_date::date)::date as starting_of_the_month,
(date_trunc('month', do_date::date) + interval '1 month' - interval '1 day')::date as ending_of_the_month,
case when 1 + FLOOR((EXTRACT(DAY FROM do_date) - 1) / 7) = 1
THEN date_trunc('week', do_date)::date || ' - ' ||
(date_trunc('week', do_date) + '6 days') ::date end as week1,
case when 1 + FLOOR((EXTRACT(DAY FROM do_date) - 1) / 7) = 2
THEN date_trunc('week', do_date)::date || ' - ' ||
(date_trunc('week', do_date) + '6 days') ::date end as week2,
case when 1 + FLOOR((EXTRACT(DAY FROM do_date) - 1) / 7) = 3
THEN date_trunc('week', do_date)::date || ' - ' ||
(date_trunc('week', do_date) + '6 days') ::date end as week3,
case when 1 + FLOOR((EXTRACT(DAY FROM do_date) - 1) / 7) = 4
THEN date_trunc('week', do_date)::date || ' - ' ||
(date_trunc('week', do_date) + '6 days') ::date end as week4,
case when 1 + FLOOR((EXTRACT(DAY FROM do_date) - 1) / 7) = 5
THEN date_trunc('week', do_date)::date || ' - ' ||
(date_trunc('week', do_date) + '6 days') ::date end as week5
FROM sales_dos
WHERE date_trunc('month', do_date::date)::date >= '2021-02-01' AND date_trunc('month', do_date::date)::date < '2021-02-28'
This is my output for now :
I want the output to display as below :
Week 1 : 2021-02-01 - 2021-02-07
Week 2 : 2021-02-08 - 2021-02-14
Week 3 : 2021-02-15 - 2021-02-21
Week 4 : 2021-02-22 - 2021-02-28
Week 5 : -
Here is another way to do it (example for January 2021).
with
t as (select date_trunc('month', '2021-01-11'::date) as aday), -- any date in Jan-2021
s as
(
select d::date, d::date + 6 ed, extract('isodow' from d) wd
from t, generate_series (aday, aday + interval '1 month - 1 day', interval '1 day') d
)
select format ('Week %s', (extract(day from d)::integer - 1) / 7 + 1) as weekname, d, ed
from s
where wd = 1;
So what you are looking for is a hybrid of ISO weeks and the standard calendar. You are taking the ISO week start and end dates, but instead of all weeks being exactly 7 days you potentially truncate the first and/or last week.
The change needed for this is not actually extensive. The initial query of the recursive CTE returns the ISO week begin date instead of the 1st of the month. The main query then checks for week 1 and, if so, produces the 1st of the month. The only twist is determining the ISO week begin date. For this I've just included a function I have had for some time specifically for that. The changes to the week_dates function are marked --<<<.
create or replace function iso_first_of_week(date_in date)
returns date
language sql
immutable strict
/*
Given a date return the 1st day of the week according to ISO-8601.
I.e. Return the Date if it is Monday otherwise return the preceding Monday
*/
AS $$
with wk_adj(l_days) as (values (array[0,1,2,3,4,5,6]))
select date_in - l_days[ extract (isodow from date_in)::integer ]
from wk_adj;
$$;
create or replace
function week_dates( do_date_in date)
returns table (week_num integer, first_date date, last_date date)
language sql
immutable strict
as $$
with recursive date_list(week_num,first_date,terminate_date) as
( select 1
, iso_first_of_week(date_trunc('month', do_date_in)::date)::timestamp --<<<
, (date_trunc('month', do_date_in) + interval '1 month' - interval '1 day')::timestamp
union all
select week_num+1, (first_date+interval '7 day'), terminate_date
from date_list
where first_date+interval '6 day' < terminate_date::timestamp
)
select week_num
, case when week_num = 1 --<<<
then date_trunc('month', do_date_in)::date --<<<
else first_date::date --<<<
end --<<<
, case when (first_date+interval '6 day')::date > terminate_date
then terminate_date::date
else (first_date+interval '6 day')::date
end last_date
from date_list;
$$;
---------- Original Reply
You can use a recursive CTE to get the week number and first date for each week of the month specified. The main query calculates the ending date, shortening the last week if necessary. Then wrap that in a SQL function to return the week number and date range for each week. See example.
create or replace
function week_dates( do_date_in date)
returns table (week_num integer, first_date date, last_date date)
language sql
immutable strict
as $$
with recursive date_list(week_num,first_date,terminate_date) as
( select 1
, date_trunc('month', do_date_in)::timestamp
, (date_trunc('month', do_date_in) + interval '1 month' - interval '1 day')::timestamp
union all
select week_num+1, (first_date+interval '7 day'), terminate_date
from date_list
where first_date+interval '6 day' < terminate_date::timestamp
)
select week_num
, first_date::date
, case when (first_date+interval '6 day')::date > terminate_date
then terminate_date::date
else (first_date+interval '6 day')::date
end last_date
from date_list;
$$;
Response to: "How can I put the output in a single row with week1, week2, week3, week4 and week5". This is essentially the initial output that did not satisfy what you wanted. The term for this type of action is PIVOT, and it is generally understood: it transforms row orientation into column orientation. It is not overly difficult, but it is messy.
IMHO this is something that belongs in the presentation layer and is not suitable for SQL. After all, you are rearranging the data structure for presentation purposes. Let the database server use its natural format and use the presentation layer to reformat. This allows reuse of the queries instead of rewriting them when the presentation changes or another view of the same data is required.
If you actually want this, then just use your initial query, or see the answer from @Bohemian. However, the below shows how this issue can be handled with just SQL (assuming the function week_dates was created).
select week1s
, case when week5e is null
then week4e
else week5e
end "end of month"
, week1s || ' - ' || week1e
, week2s || ' - ' || week2e
, week3s || ' - ' || week3e
, week4s || ' - ' || week4e
, week5s || ' - ' || week5e
from ( select max(case when (week_num=1) then first_date else NULL end) as week1s
, max(case when (week_num=1) then last_date else NULL end) as week1e
, max(case when (week_num=2) then first_date else NULL end) as week2s
, max(case when (week_num=2) then last_date else NULL end) as week2e
, max(case when (week_num=3) then first_date else NULL end) as week3s
, max(case when (week_num=3) then last_date else NULL end) as week3e
, max(case when (week_num=4) then first_date else NULL end) as week4s
, max(case when (week_num=4) then last_date else NULL end) as week4e
, max(case when (week_num=5) then first_date else NULL end) as week5s
, max(case when (week_num=5) then last_date else NULL end) as week5e
from week_dates(current_date)
) w ;
As before I have wrapped the above in a SQL function and provide an example here.
I would first simplify to:
(extract(day from do_date)::int - 1) / 7 + 1 as week_in_month
then pivot on that using crosstab().
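To illustrate the week-in-month arithmetic on its own, here is a small sketch over generated dates for February 2021 rather than the sales_dos table. It subtracts 1 from the day before dividing, so that day 7 still falls in week 1; the aliases first_date and last_date are only illustrative.

```sql
-- Integer division groups days 1-7, 8-14, 15-21, 22-28, 29-31.
SELECT (extract(day FROM d)::int - 1) / 7 + 1 AS week_in_month,
       min(d)::date AS first_date,
       max(d)::date AS last_date
FROM generate_series(timestamp '2021-02-01',
                     timestamp '2021-02-28',
                     interval '1 day') AS d
GROUP BY 1
ORDER BY 1;
-- week_in_month | first_date | last_date
--             1 | 2021-02-01 | 2021-02-07
--             2 | 2021-02-08 | 2021-02-14
--             3 | 2021-02-15 | 2021-02-21
--             4 | 2021-02-22 | 2021-02-28
```

These rows are the long form of the requested output; pivoting them into week1..week5 columns is a separate step.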

Postgresql compare two select result on the same table

I compare results from two selects and get 1 or 0 as a final result.
Below query syntax is good but this query causes timeout.
SELECT (CASE WHEN (
select count(*) from order where ordered_date > (NOW() - INTERVAL '120 minutes')
and order_ordered = current_date) >
(select count(*)/3
from order
where ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes')
and ordered_date < (NOW() - INTERVAL '2 days'))
THEN 1 ELSE 0 end);
Therefore, i try to optimize the query to use an alias for each select as below :
select (case when a > b then 1 else 0 end) from (select count(*) from order where ordered_date > (NOW() - INTERVAL '120 minutes')
and order_ordered = current_date) as a,
from (select count(*)/3
from order
where ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes')
and ordered_date < (NOW() - INTERVAL '2 days'))as b;
I get a syntax error near "from"; as far as I remember, this kind of syntax works in MySQL.
Could you please advise whether it is possible to use "from" twice with aliases in PostgreSQL? If you know of another possibility, I would welcome it.
Sample:
First query gives : select count(*) from order where ordered_date > (NOW() - INTERVAL '120 minutes') and order_ordered = current_date -> 60
Second query gives : select count(*)/3 from order where ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes') and ordered_date < (NOW() - INTERVAL '2 days') -> 20
Final condition : case when (60 > 20) then 1 else 0 end
Result expected : 1
Thanks
I suggest using SELECT in a WITH clause (see the documentation).
-- "order" is a reserved word in PostgreSQL, so the table name must be quoted
WITH orders_current_date AS (
SELECT count(*)
FROM "order"
WHERE ordered_date > (NOW() - INTERVAL '120 minutes')
AND order_ordered = current_date
), orders_interval AS (
SELECT count(*)/3
FROM "order"
WHERE ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes')
AND ordered_date < (NOW() - INTERVAL '2 days')
)
SELECT
CASE
WHEN (SELECT * FROM orders_current_date) > (SELECT * FROM orders_interval)
THEN 1
ELSE 0
END;
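On the original question of using two aliased subqueries: a single FROM clause can hold both derived tables if they are joined, and each count needs its own column alias to be referenced. The sketch below assumes a hypothetical temp table with only the two columns used in the question; the aliases recent and baseline are made up for illustration.

```sql
-- Hypothetical stand-in for the real table; "order" must be quoted.
CREATE TEMP TABLE "order" (ordered_date timestamptz, order_ordered date);

SELECT CASE WHEN a.recent > b.baseline THEN 1 ELSE 0 END AS result
FROM (SELECT count(*) AS recent
      FROM "order"
      WHERE ordered_date > now() - interval '120 minutes'
        AND order_ordered = current_date) AS a
CROSS JOIN
     (SELECT count(*) / 3 AS baseline
      FROM "order"
      WHERE ordered_date > now() - interval '2 days' - interval '120 minutes'
        AND ordered_date < now() - interval '2 days') AS b;
```

Each derived table returns exactly one row, so the cross join yields one row. This only addresses the syntax; if the timeout persists, an index on ordered_date is probably the first thing to check.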

Postgres distinct union only for specific columns

I have two sets of data, one of which is dynamically generated.
If I leave off the state column the UNION works perfectly, since that column doesn't really exist in the generated set. My question is: how can I ignore a column for the UNION so that it still combines the two datasets? (As it is, it behaves the same as UNION ALL.) I.e. I prefer the first table and want any rows from the second dataset ignored if they already exist in the first one.
SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
UNION
SELECT id event_id,
GENERATE_SERIES(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time, current_date + interval '3 weeks', '1 week'::INTERVAL) AS start_at,
'draft' AS state
FROM events
Update, also tried:
WITH future_logs AS (
SELECT id event_id,
GENERATE_SERIES(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time, current_date + interval '3 weeks', '1 week'::INTERVAL) AS start_at,
'draft' AS state
FROM events)
SELECT future_logs.event_id, future_logs.start_at, future_logs.state
FROM future_logs
LEFT JOIN event_logs ON future_logs.event_id = event_logs.event_id AND future_logs.start_at = event_logs.start_at
WHERE event_logs.start_at BETWEEN current_date AND current_date + interval '3 weeks'
But got too few results 77 vs ~1000 expected.
Just add NOT EXISTS() to the second leg, and you can use UNION ALL to avoid sort/merging.
SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
UNION ALL
SELECT id AS event_id
, generate_series(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time
, current_date + interval '3 weeks'
, '1 week'::INTERVAL) AS start_at
, 'draft' AS state
FROM events ev
WHERE NOT EXISTS ( SELECT *
FROM event_logs nx
WHERE nx.event_id = ev.id
AND nx.start_at BETWEEN current_date AND current_date + interval '3 weeks' )
;
select DISTINCT ON (date_day) date_day, state from(
SELECT day::date as date_day, null as state
FROM generate_series(now()- interval '2 week'
, now()
, interval '1 day') day
UNION ALL
select distinct
date_trunc('day',des.updated_at) as date_day,
max(des.state) over (partition by date_trunc('day',des.updated_at)) as state
from device_event as des where des.id=49 and des.updated_at >= now() - interval '2 week'
) dba order by 1
I would add one more column, taborder, to your UNION query to ensure a simple ordering of the rows, and use the window function row_number() over(...) in the following way:
SELECT
event_id,
start_at,
state
FROM (
SELECT
event_id,
start_at,
state,
row_number() OVER (PARTITION BY event_id, start_at ORDER BY taborder) AS rownum
FROM (
SELECT
event_id,
start_at,
state,
1 AS taborder
FROM original_table
UNION
SELECT
event_id,
start_at,
state,
2 AS taborder
FROM draft_table
) src0
) src1
WHERE rownum = 1
ORDER BY 1, 2, 3
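The same "prefer the first table" idea can also be sketched with PostgreSQL's DISTINCT ON, which keeps the first row of each (event_id, start_at) group according to the ORDER BY. The table names original_table and draft_table are the placeholders from the answer above; the temp tables and sample rows are hypothetical.

```sql
-- Hypothetical sample data: one real log and two generated drafts.
CREATE TEMP TABLE original_table (event_id int, start_at timestamp, state text);
CREATE TEMP TABLE draft_table    (event_id int, start_at timestamp, state text);
INSERT INTO original_table VALUES (1, '2024-01-01 10:00', 'done');
INSERT INTO draft_table    VALUES (1, '2024-01-01 10:00', 'draft'),
                                  (1, '2024-01-08 10:00', 'draft');

-- taborder 1 sorts before 2, so a row from original_table wins a tie.
SELECT DISTINCT ON (event_id, start_at) event_id, start_at, state
FROM (
  SELECT event_id, start_at, state, 1 AS taborder FROM original_table
  UNION ALL
  SELECT event_id, start_at, state, 2 AS taborder FROM draft_table
) src
ORDER BY event_id, start_at, taborder;
-- event_id | start_at            | state
--        1 | 2024-01-01 10:00:00 | done
--        1 | 2024-01-08 10:00:00 | draft
```

DISTINCT ON is a PostgreSQL extension, so the row_number() form is the one to use if portability matters.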

Unify select sql. Postgres

Can I unify the two selects below into a single query, where the first column returns the result of the first select and the second column the result of the second?
select count(*) from rrhh.empleado where fecha_contratado > current_date - interval '100 days'; -- select1
select count(*) from rrhh.empleado where fecha_fin_contrato > current_date - interval '100 days'; -- select2
Thank you
try:
with a as (
select
case when fecha_contratado > current_date - interval '100 days' then 1
else 0 end q1
, case when fecha_fin_contrato > current_date - interval '100 days' then 1
else 0 end q2
from rrhh.empleado
)
select sum(q1), sum(q2)
from a
;
This is a typical case for conditional aggregation:
select count(*) filter (where fecha_contratado > current_date - interval '100 days'),
count(*) filter (where fecha_fin_contrato > current_date - interval '100 days')
from rrhh.empleado
You can use a CASE expression (and the fact that most aggregates ignore NULL values) for versions earlier than 9.4:
select count(case when fecha_contratado > current_date - interval '100 days' then 1 end),
count(case when fecha_fin_contrato > current_date - interval '100 days' then 1 end)
from rrhh.empleado
Note: these queries will scan the whole table, while your original queries could make use of indexes on fecha_contratado and fecha_fin_contrato. If performance matters to you, you could append a filter to these queries too (greatest, because a row must be kept if either date is recent):
where greatest(fecha_contratado, fecha_fin_contrato) > current_date - interval '100 days'
and you could index the expression: greatest(fecha_contratado, fecha_fin_contrato).
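Putting the pieces together, a sketch of conditional aggregation with a pre-filter and an expression index might look like this. It uses greatest() because a row has to be scanned if either of its dates is recent. The temp table stands in for rrhh.empleado, and the index name and column aliases are made up for illustration; whether the planner actually uses the index depends on your data.

```sql
-- Hypothetical stand-in for rrhh.empleado.
CREATE TEMP TABLE empleado (fecha_contratado date, fecha_fin_contrato date);

-- Expression indexes need an extra pair of parentheses.
CREATE INDEX empleado_fecha_reciente_idx
    ON empleado ((greatest(fecha_contratado, fecha_fin_contrato)));

SELECT count(*) FILTER (WHERE fecha_contratado   > current_date - interval '100 days') AS contratados,
       count(*) FILTER (WHERE fecha_fin_contrato > current_date - interval '100 days') AS finalizados
FROM empleado
-- Keep only rows that can contribute to at least one of the two counts.
WHERE greatest(fecha_contratado, fecha_fin_contrato) > current_date - interval '100 days';
```

In PostgreSQL, greatest() ignores NULL arguments, so a row with a NULL fecha_fin_contrato is still considered on the strength of fecha_contratado alone.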