Postgres expand time window using date_part - postgresql

I have two dates, '2018-05-01' and '2018-06-01'. I would like to expand this window into the past by the number of days between those dates.
SELECT * FROM data
WHERE
start_time > CAST('2018-05-01' AS timestamptz) - INTERVAL '30 DAY'
AND start_time < CAST('2018-06-01' AS timestamptz)
How can I replace INTERVAL '30 DAY' with the number of days between the given dates, without explicitly specifying the number of days? I know how to calculate the day difference:
date_part('day',age('2018-05-01', '2018-06-01'))
But I am not sure how to incorporate it into the subtraction. The dates, and the number of days between them, will change.

You can use date_trunc('mon', some_date_expression) to round down to the start of a month:
select date_trunc('mon', now() - '3 mon'::interval) as date_begin
, date_trunc('mon', now() - '1 day'::interval) as date_end
;
Result
date_begin | date_end
------------------------+------------------------
2018-03-01 00:00:00+01 | 2018-06-01 00:00:00+02
(1 row)

You can simply subtract the difference from the start date:
with t (start_date, end_date) as (
values (date '2018-05-01', date '2018-06-01')
)
select start_date - (end_date - start_date) as new_start,
end_date
from t;
returns
 new_start  |  end_date
------------+------------
 2018-03-31 | 2018-06-01
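Applied to the query from the question, this could look roughly like the sketch below. The table data and the column start_time come from the question; the bounds CTE and its column names are assumptions made up for illustration.

```sql
-- Sketch only: "bounds" and its column names are hypothetical helpers.
WITH bounds (range_start, range_end) AS (
    VALUES (date '2018-05-01', date '2018-06-01')
)
SELECT d.*
FROM data AS d
CROSS JOIN bounds AS b
-- date - date yields an integer number of days; date - integer yields a date
WHERE d.start_time > (b.range_start - (b.range_end - b.range_start))::timestamptz
  AND d.start_time < b.range_end::timestamptz;
```

Because the day difference is computed from the two dates, changing the values in the CTE is enough to change both the window and the amount by which it is extended.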

Related

How are months intervals internally calculated in Postgres?

In PostgreSQL, the interval of '1 month' sometimes counts as 30 days and sometimes counts as 31 days. What are the criteria used to determine this?
I ran the below query to demonstrate my confusion.
select
now() - interval '1 month'
, now() - interval '30 days'
, interval '30 days' = interval '1 month'
, interval '31 days' = interval '1 month'
The query returns:
2022-03-27 21:09:30.933434+00 | 2022-03-28 21:09:30.933434+00 | true | false
I would expect the query to return March 28th for both timestamps, since an interval of one month is equal to an interval of 30 days.
It comes down to the specific versus the general: a day is the specific unit and a month is the general one. The same happens with day and hour, as in:
select '2022-03-13 12:00 PDT'::timestamptz - '1 day'::interval;
?column?
------------------------
2022-03-12 12:00:00-08
select '2022-03-13 12:00 PDT'::timestamptz - '24 hours'::interval;
?column?
------------------------
2022-03-12 11:00:00-08
The DST change occurred on the morning of 2022-03-13 in PST/PDT. So subtracting a day is generalized to the same wall-clock time one day earlier, whereas subtracting 24 hours means 24 hours actually elapsing.
In your case:
select
now() - interval '1 month'
, now() - interval '30 days';
?column? | ?column?
-------------------------------+-------------------------------
2022-03-27 14:44:33.515669-07 | 2022-03-28 14:44:33.515669-07
The 1 month is going to go back to the same date and time one month back, whereas 30 days is going back an actual 30 days.
In this case:
select '2022-03-30 21:17:05'::timestamp - interval '1 month' ;
?column?
---------------------
2022-02-28 21:17:05
There is no day 30 in February, so it goes to the actual end of the month, the 28th.
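A small sketch of the same month-end behaviour, using fixed timestamps that are not from the original post so the results do not depend on now():

```sql
SELECT timestamp '2022-03-31 00:00' - interval '1 month' AS minus_month,    -- 2022-02-28 00:00:00
       timestamp '2022-03-31 00:00' - interval '30 days' AS minus_30_days,  -- 2022-03-01 00:00:00
       timestamp '2024-03-31 00:00' - interval '1 month' AS leap_year;      -- 2024-02-29 00:00:00
```

Subtracting a month keeps the day of the month where possible and otherwise falls back to the last day of the target month, while subtracting 30 days always moves back exactly 30 calendar days.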

Generate date series by month between two dates and average by month in postgresql

I want to create a row for every month between two dates. The start of each row should be either the start date or the first day of the month, and the end of each row should be either the last day of the month or the end date. Each row should also have an average, which is the fraction of the month that is covered (if the start date is the 15th, then the average should be 15/30) for my table.
input :
product_id | date_start | date_end
1 | 16-01-2020 | 15-03-2020
2 | 07-01-2020 | 22-04-2020
The result should be :
product_id | date_start | date_end | average
1 | 16-01-2020 | 31-01-2020 | 0.5
1 | 01-02-2020 | 29-02-2020 | 1
1 | 01-03-2020 | 15-03-2020 | 0.5
2 | 07-01-2020 | 31-01-2020 | 0.76 -- (30-07)/30
2 | 01-02-2020 | 29-02-2020 | 1
2 | 01-03-2020 | 31-03-2020 | 1
2 | 01-04-2020 | 22-04-2020 | 0.76
I tried using generate_series, date_trunc and UNION:
SELECT (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE AS date_start ,
(date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ):: DATE AS date_end
FROM generate_series( DATE '2020-01-15', DATE '2020-05-21', interval '1 MONTH' ) AS dt
union select '2020-01-15' as date_start,
(date_trunc('month', '2020-01-15'::date) + INTERVAL '1 MONTH - 1 day' ):: DATE AS date_end
union select (date_trunc('month', '2020-05-21'::date) ):: DATE AS date_start ,
'2020-05-21' AS date_end
order by date_start
To add the average, I calculate the difference between the two dates:
SELECT (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE AS date_start ,
(date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ):: DATE AS date_end,
((date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ) - (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE )
FROM generate_series( DATE '2020-01-15', DATE '2020-05-21', interval '1 MONTH' ) AS dt
With this, I seem to have hit a wall.
The following gives approximately the result you wanted; only the averages deviate. I believe this stems from an inconsistency in your calculations, where the dates are inclusive in some rows and exclude either the start or the end date in others. I was inclusive in all of them. The other difference is that I used the actual number of days in the month as the denominator, calculating it instead of using 30. This is necessary for February to ever have an average of 1; otherwise its maximum would be 0.97, and full months with 31 days would average 1.03.
with product_dates(product_id, date_start, date_end) as
( values (1,'2020-01-16'::date,'2020-03-15'::date)
, (2,'2020-01-07'::date,'2020-04-22'::date)
)
select product_id, start_date, end_date, round((end_date-start_date+1 ) * 1.0 / (eom-som+1),2) average
from (select product_id
, greatest(date_start,dt::date) start_date
, least(date_end, (dt+interval '1 month' -interval '1 day')::date) end_date
, dt::date som
, (dt+interval '1 month' -interval '1 day')::date eom
from product_dates
cross join generate_series(date_trunc('month', date_start)
,date_trunc('month', date_end) + interval '1 month' - interval '1 day'
,interval '1 month'
) gs(dt)
) s1;
The heart is the generate_series working directly with dates; notice the date manipulation to ensure I had the first day and last day of each month. Then, in the select list around it, I picked either the parameter date or the generated one (using the greatest and least functions).
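As a worked check of the arithmetic for a single row (product 1, January 2020), the following sketch spells out the inclusive day counts with literal dates. The column aliases are made up for illustration.

```sql
-- Product 1 covers 16 Jan through 31 Jan 2020, both ends inclusive.
SELECT date '2020-01-31' - date '2020-01-16' + 1 AS days_covered,   -- 16
       date '2020-01-31' - date '2020-01-01' + 1 AS days_in_month,  -- 31
       round((date '2020-01-31' - date '2020-01-16' + 1) * 1.0
             / (date '2020-01-31' - date '2020-01-01' + 1), 2) AS average;  -- 0.52
```

This is why the first row comes out as 0.52 rather than the 0.5 shown in the question: 16 of 31 days are covered when both endpoints are counted.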

Postgres search available time slots with generate_series

I have a table in my postgres database which has a column of dates. I want to search which of those dates is missing - for example:
date
2016-11-09 18:30:00
2016-11-09 19:00:00
2016-11-09 20:15:00
2016-11-09 22:20:00
2016-11-09 23:00:00
Here, 2016-11-09 21:00:00 is missing. From my generated series, I need to remove every slot (each slot is a 1-hour interval) for which my table has an entry falling inside it.
I want to write a query with generate_series that returns the missing date. Is this possible?
Here is the sample query that I used to generate the series:
SELECT t
FROM generate_series(
TIMESTAMP WITH TIME ZONE '2016-11-09 18:00:00',
TIMESTAMP WITH TIME ZONE '2016-11-09 23:00:00',
INTERVAL '1 hour'
) t
EXCEPT
SELECT tscol
FROM mytable;
But this query is not removing 2016-11-09 18:30:00, 2016-11-09 20:15:00, etc., because I used EXCEPT.
This is not a gaps-and-islands problem. You just want to find the 1-hour intervals for which no record exists in the table.
EXCEPT does not work here because it does an equality comparison, while you want to check whether or not a record exists within a range.
A typical solution for this is an anti-join, written as a left join:
select dt
from generate_series(
timestamp with time zone '2016-11-09 18:00:00',
timestamp with time zone '2016-11-09 23:00:00',
interval '1 hour'
) d(dt)
left join mytable t
on t.tscol >= dt and t.tscol < dt + interval '1 hour'
where t.tscol is null
You can also use not exists:
select dt
from generate_series(
timestamp with time zone '2016-11-09 18:00:00',
timestamp with time zone '2016-11-09 23:00:00',
interval '1 hour'
) d(dt)
where not exists (
select 1
from mytable t
where t.tscol >= dt and t.tscol < dt + interval '1 hour'
)
In this demo on DB Fiddle, both queries return:
| dt |
| :--------------------- |
| 2016-11-09 21:00:00+00 |
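To try this without creating a table, the sample rows can be inlined. This is a sketch under the assumption that the stored values are in UTC; the question does not state a time zone, so the +00 offsets are an illustrative choice.

```sql
-- Sample data from the question, inlined as a CTE (UTC offsets are assumed).
WITH mytable (tscol) AS (
    VALUES (timestamptz '2016-11-09 18:30:00+00'),
           (timestamptz '2016-11-09 19:00:00+00'),
           (timestamptz '2016-11-09 20:15:00+00'),
           (timestamptz '2016-11-09 22:20:00+00'),
           (timestamptz '2016-11-09 23:00:00+00')
)
SELECT d.dt
FROM generate_series(
         timestamptz '2016-11-09 18:00:00+00',
         timestamptz '2016-11-09 23:00:00+00',
         interval '1 hour'
     ) AS d(dt)
WHERE NOT EXISTS (
    SELECT 1
    FROM mytable AS t
    -- half-open slot: [dt, dt + 1 hour)
    WHERE t.tscol >= d.dt
      AND t.tscol <  d.dt + interval '1 hour'
);
```

The only slot without a matching row is the one starting at 21:00 UTC; how it is displayed depends on the session's TimeZone setting.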

How do I generate months between start date and now() in postgresql

I also have a question about how to get code blocks to work on Stack Overflow, but that's a side issue.
I have this quasi-code that works:
select
*
from
unnest('{2018-6-1,2018-7-1,2018-8-1,2018-9-1}'::date[],
'{2018-6-30,2018-7-31,2018-8-31,2018-9-30}'::date[]
) zdate(start_date, end_date)
left join lateral pipe_f(zdate...
But now I want it to work from 6/1/2018 until now(). What's the best way to do this?
Oh, postgresql 10. yay!!
Your query gives a list of the first and last days of the months between "2018-06-01" and now. So I am assuming that you want to do this in a more dynamic way:
SELECT
start_date,
(start_date + interval '1 month -1 day')::date as end_date
FROM (
SELECT generate_series('2018-6-1', now(), interval '1 month')::date as start_date
)s
Result:
start_date end_date
2018-06-01 2018-06-30
2018-07-01 2018-07-31
2018-08-01 2018-08-31
2018-09-01 2018-09-30
2018-10-01 2018-10-31
generate_series(timestamp, timestamp, interval) generates a list of timestamps. Starting with "2018-06-01" until now() with the 1 month interval gives this:
start_date
2018-06-01 00:00:00+01
2018-07-01 00:00:00+01
2018-08-01 00:00:00+01
2018-09-01 00:00:00+01
2018-10-01 00:00:00+01
These timestamps are converted into dates with ::date cast.
Then I add 1 month to get the next month. But as we are interested in the last day of the current month, I subtract one day again (+ interval '1 month -1 day').
Another option that's more ANSI-compliant is to use a recursive CTE:
WITH RECURSIVE
dates(d) AS
(
SELECT '2018-06-01'::TIMESTAMP
UNION ALL
SELECT d + INTERVAL '1 month'
FROM dates
WHERE d + INTERVAL '1 month' <= '2018-10-01'
)
SELECT
d AS start_date,
-- add 1 month, then subtract 1 day, to get end of current month
(d + interval '1 month') - interval '1 day' AS end_date
FROM dates

postgres '1 year' equals '360 days'?

I am wondering if anyone else has encountered this or knows anything about it.
Today is November 3, 2014, and if I check whether or not November 5, 2013 is within the last year, I get different answers depending on how I check: 1 year versus 365 days.
select now() as right_now, now() - '20131105' as diff,
case when now() - '20131105' <= '1 year' then 'within year' else 'not within year' end as yr_check,
case when now() - '20131105' <= '365 days' then 'within 365 days' else 'not within 365 days' end as day_check
2014-11-03 16:27:38.39669-06; 363 days 16:27:38.39669; not within year; within 365 days
It looks like it's OK when querying against November 9, though:
select now() as right_now, now() - '20131109' as diff,
case when now() - '20131109' <= '1 year' then 'within year' else 'not within year' end as yr_check,
case when now() - '20131109' <= '365 days' then 'within 365 days' else 'not within 365 days' end as day_check
2014-11-03 16:31:12.464469-06; 359 days 16:31:12.464469; within year; within 365 days
Does anyone have an idea about this? Or is there something about date arithmetic that's funny?
The Postgres version is 9.2.4.
"or is there something about date arithmetic that's funny?"
It's funny alright, but not in the way that makes you laugh.
Twelve months has to equal a year doesn't it?
=> SELECT '12 months'::interval = '1 year'::interval;
?column?
----------
t
Good. Makes sense. Hmm - wonder how long a month is.
=> SELECT '30 days'::interval = '1 month'::interval;
?column?
----------
t
Fair enough. Suppose they had to pick something.
Hmm - but that means...
=> SELECT '360 days'::interval = '12 months'::interval;
?column?
----------
t
Which seems to imply...
=> SELECT '360 days'::interval = '1 year'::interval;
?column?
----------
t
That can't be right! What they need to do is have a month equal to 30.41666 days. No hang on, what about leap years? Hmm - does this affect weeks? AARGH!
Basically, you can't convert sensibly between time units. There aren't always 60 seconds in a minute, or 24 hours in a day, or 52 weeks in a year, or even 365 days. Unfortunately, humans (particularly customer-shaped humans) like converting between time units, so we end up with a mess like this.
PostgreSQL's system is no more loony than any other and in fact is better than most.
I'm not sure what the real problem with this check is, but it works the other way around:
select now() - interval '1 year' <= date '2013-11-05'
I'm no expert in Postgres, but it may have something to do with the types being compared, because:
select pg_typeof(now() - date '2013-11-05'),
pg_typeof(now() - interval '1 year')
yields the result:
interval, timestamp with time zone
So your example compares an interval with an interval, but at different scales (days versus a year), whereas my solution compares a timestamp with a date, which seems to work.
UPDATE:
You can check that interval '1 year', when not attached to a date (not added to a date or timestamp), equals 360 days:
select interval '1 year' <= interval '359 days',
interval '1 year' <= interval '360 days'
which yields:
f, t
From my understanding, you can't meaningfully compare a bare year interval when you don't know which year it is attached to. Always compare dates, and use the interval only to create a new date value.
select now() - interval '1 year' <= now() - interval '365 days'
t
From www.postgresql.org/docs/current/static/datatype-datetime.html:
Internally interval values are stored as months, days, and seconds. This is done because the number of days in a month varies, and a day can have 23 or 25 hours if a daylight savings time adjustment is involved. The months and days fields are integers while the seconds field can store fractions. Because intervals are usually created from constant strings or timestamp subtraction, this storage method works well in most cases. Functions justify_days and justify_hours are available for adjusting days and hours that overflow their normal ranges.
Because you compare two intervals, PostgreSQL internally normalizes the values (like justify_interval()) before comparing:
SELECT INTERVAL '31 days' > INTERVAL '1 mon' -- yields 't'
But if you apply interval subtraction/addition, the varying day and month lengths are taken into consideration:
SELECT (timestamptz '2014-11-03 00:00:00 America/New_York' - INTERVAL '1 day') AT TIME ZONE 'America/New_York',
timestamptz '2014-11-03 00:00:00 America/New_York' - timestamptz '2014-11-02 00:00:00 America/New_York' <= interval '1 day';
-- | timestamp | boolean |
-- +---------------------+---------+
-- | 2014-11-02 01:00:00 | f |
So, if you need to test whether a timestamp/date is within a range, you should manipulate timestamps/dates (or use timestamp/date ranges) and compare those values with <, > or BETWEEN.
SELECT timestamp '2014-11-03 00:00:00' - timestamp '2014-10-03 00:00:00' <= interval '1 mon',
timestamp '2014-11-03 00:00:00' - interval '1 mon' <= timestamp '2014-10-03 00:00:00';
-- | boolean | boolean |
-- +---------+---------+
-- | f | t |
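Applied to the dates from the question, the difference between the two styles can be sketched as follows. The literal timestamp stands in for now() at the time of the question (an assumption, so that the result is reproducible), and the column aliases are made up for illustration.

```sql
SELECT timestamp '2014-11-03 16:27:38' - timestamp '2013-11-05 00:00:00'
           AS diff,             -- 363 days 16:27:38
       -- interval vs. interval: '1 year' is compared as 360 days
       timestamp '2014-11-03 16:27:38' - timestamp '2013-11-05 00:00:00'
           <= interval '1 year'
           AS interval_check,   -- false
       -- timestamp vs. timestamp: one calendar year back is 2013-11-03 16:27:38
       timestamp '2013-11-05 00:00:00'
           >= timestamp '2014-11-03 16:27:38' - interval '1 year'
           AS timestamp_check;  -- true
```

The second form shifts the reference timestamp by a calendar year first and only then compares two points in time, so no assumption about the length of a year or month is involved.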