Postgres generate_series excluding date ranges - postgresql

I'm creating a subscription management system and need to generate a list of upcoming billing dates for the next 2 years. I've been able to use generate_series to get the appropriate dates like this:
SELECT i::DATE
FROM generate_series('2015-08-01', '2017-08-01', '1 month'::INTERVAL) i
The last step I need to take is to exclude specific date ranges from the calculation. These excluded date ranges may be any range of time. Additionally, they should not be factored into the time range for the generate_series.
For example, say we have a date range exclusion from '2015-08-27' to '2015-09-03'. The resulting generate_series should exclude that week from the calculation, and push all future monthly billing dates one week into the future:
2015-08-01
2015-09-10
2015-10-10
2015-11-10
2015-12-10

First you create a time series of dates over the next two years, EXCEPT your blackout dates:
SELECT dt
FROM generate_series('2015-08-01'::date, '2017-08-01'::date, interval '1 day') AS s(dt)
EXCEPT
SELECT dt
FROM generate_series('2015-08-27'::date, '2015-09-03'::date, interval '1 day') as ex1(dt)
Note that you can have as many EXCEPT clauses as you need. For individual blackout days (as opposed to ranges) you could use a VALUES clause instead of a SELECT.
Then you window over that time-series to generate row numbers of billable days:
SELECT row_number() OVER (ORDER BY dt) AS rn, dt
FROM (<query above>) x
Then you select those days where you want to bill:
SELECT dt
FROM (<query above>) y
WHERE rn % 30 = 1; -- billing on the first day of the period
(This last query follows Craig's advice of billing every 30 days.)
Putting it all together yields:
SELECT dt
FROM (
SELECT row_number() OVER (ORDER BY dt) AS rn, dt
FROM (
SELECT dt
FROM generate_series('2015-08-01'::date, '2017-08-01'::date, interval '1 day') AS s(dt)
EXCEPT
SELECT dt
FROM generate_series('2015-08-27'::date, '2015-09-03'::date, interval '1 day') as ex1(dt)
) x
) y
WHERE rn % 30 = 1;
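If there are several blackout ranges, chaining EXCEPT clauses gets unwieldy. A minimal sketch of one alternative, assuming the exclusions are kept as inclusive dateranges (the exclusions CTE and its second range are made up for illustration; in practice this would probably be a table):

```sql
-- Hypothetical exclusion list; the second range is invented for the example.
WITH exclusions(r) AS (
    VALUES (daterange('2015-08-27', '2015-09-03', '[]')),
           (daterange('2015-12-24', '2015-12-26', '[]'))
),
billable AS (
    -- Number only the days that are not covered by any exclusion range.
    SELECT s.dt::date AS dt,
           row_number() OVER (ORDER BY s.dt) AS rn
    FROM generate_series('2015-08-01'::date, '2017-08-01'::date, interval '1 day') AS s(dt)
    WHERE NOT EXISTS (SELECT 1 FROM exclusions ex WHERE ex.r @> s.dt::date)
)
SELECT dt
FROM billable
WHERE rn % 30 = 1   -- first day of each 30-day billing period
ORDER BY dt;
```

Because row numbers are assigned only to days that survive the filter, each excluded day pushes later billing dates back by one day, just as in the EXCEPT version.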

You will have to split the generate_series call around the exclusions. Something like this:
A union of 3 queries:
The first query pulls dates from the start date up to the start of the exclusion range.
The second query pulls dates from the end of the exclusion range to your end date.
The third query pulls dates when none of your series dates cross the exclusion range.
Note: You still need a way to loop through the exclusion list (if you have one). Also, this query may not be very efficient, as such scenarios can be better handled through functions or procedural code.

Related

Calculate the difference between two dates in business days

I'm trying to calculate the difference in business days between two dates. In my searches I found approaches using functions and generate_series(), but I would like to find something more practical.
Sample Table:
|start_date |end_date |
|2022-06-01 |2022-06-01|
|2022-05-29 |2022-06-02|
What would you consider more practical? It seems generate_series and extract are made specifically for what you are asking.
select start_date, end_date, count(*) "Business days"
from sometable
cross join generate_series(start_date,end_date,interval '1 day') gs(dt)
where extract(isodow from dt) < 6
group by start_date, end_date;
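One thing to watch for with the cross join: a row whose range contains no weekdays at all (a weekend-only range, say) drops out of the result instead of showing 0. A sketch of a variant that keeps such rows by counting in a correlated subquery; the inline sometable data reuses the sample rows, and the third row is an assumption added for illustration:

```sql
WITH sometable(start_date, end_date) AS (
    VALUES (date '2022-06-01', date '2022-06-01'),   -- a single Wednesday
           (date '2022-05-29', date '2022-06-02'),   -- Sunday through Thursday
           (date '2022-06-04', date '2022-06-05')    -- hypothetical weekend-only range
)
SELECT t.start_date,
       t.end_date,
       -- isodow is 1 (Monday) through 7 (Sunday), so < 6 keeps weekdays
       (SELECT count(*)
        FROM generate_series(t.start_date, t.end_date, interval '1 day') AS gs(dt)
        WHERE extract(isodow FROM gs.dt) < 6) AS business_days
FROM sometable t
ORDER BY t.start_date;
-- business_days: 4 for 2022-05-29..2022-06-02, 1 for 2022-06-01, 0 for the weekend row
```

Neither version accounts for public holidays; those would need a separate calendar table.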

How to select data between two dates using only the start date?

I have a problem selecting data between two dates when only the start_date is available.
For example, I want to see which discount_nr was active between 2020-07-01 and 2020-07-15, or on a single day such as 2020-07-14. I tried different solutions (date ranges, generate_series, and so on) but was not able to get it to work.
The table only has start dates, no end dates.
Example:
discount_nr, start_date
1, 2020-06-30
2, 2020-07-03
3, 2020-07-10
4, 2020-07-15
You can get the end dates by looking at the start date of the next row. This is done with lead. lead(start_date) over(order by start_date asc) will get you the start_date of the next row. If we take 1 day from that we'll get the inclusive end date.
Rather than separate start/end columns, a single daterange column is easier to work with. You can use that as a CTE or create a view.
create view discount_durations as
select
discount_nr,
daterange(
start_date,
lead(start_date) over(order by start_date asc)
) as duration
from discounts
Now querying it is easy using range operators. Use @> to check if the range contains a date.
select *
from discount_durations
where duration @> '2020-07-14'::date
And use && to see if they have any overlap.
select *
from discount_durations
where duration && daterange('2020-07-01', '2020-07-15');
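To try this without creating a view, here is a self-contained sketch using the sample rows from the question (the column name discount_nr is taken from the question; adjust to your schema). Note that daterange defaults to an exclusive upper bound, so the third argument '[]' is used here on the assumption that 2020-07-15 itself should be part of the search window:

```sql
WITH discounts(discount_nr, start_date) AS (
    VALUES (1, date '2020-06-30'),
           (2, date '2020-07-03'),
           (3, date '2020-07-10'),
           (4, date '2020-07-15')
),
discount_durations AS (
    -- The next row's start date is the exclusive end of this row's range;
    -- the last row gets a NULL upper bound, i.e. an open-ended range.
    SELECT discount_nr,
           daterange(start_date, lead(start_date) OVER (ORDER BY start_date)) AS duration
    FROM discounts
)
SELECT discount_nr, duration
FROM discount_durations
WHERE duration && daterange('2020-07-01', '2020-07-15', '[]')
ORDER BY discount_nr;
-- returns discounts 1, 2, 3 and 4
```

With the default '[)' bounds the search range would stop just before 2020-07-15, and discount 4 would not be returned.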

Postgres: How to change start day of week and apply it in date_part?

with partial as(
select
date_part('week', activated_at) as weekly,
count(*) as count
from vendors
where activated_at notnull
group by weekly
)
This query counts the number of vendors activated per week. I need to change the start day of the week from Monday to Saturday. Similar posts, like "how to change the first day of the week in PostgreSQL" or "Making Postgres date_trunc() use a Sunday based week", do not explain how to embed this in the date_part function. I would like to know how to use this function in my query with the week starting on Saturday.
Thanks in advance.
This may be a little bit overkill, but you can use some CTEs and window functions. First generate your intervals, starting with the first Saturday you want (e.g. 2018-01-06 00:00) and ending with the last day you want (2018-12-31). Then select your data, join it, and sum it. As a benefit, you also get weeks with zero activations:
with temp_days as (
SELECT a as a ,
a + '7 days'::interval as e
FROM generate_series('2018-01-06 00:00'::timestamp,
'2018-12-31 00:00', '7 day') as a
),
temp_data as (
select
1 as counter,
vendors.activated_at
from vendors
where activated_at notnull
),
temp_order as
(
select *
from temp_days
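-- half-open range [a, e): an activation at exactly Saturday 00:00 is counted in one week only
left join temp_data on temp_data.activated_at >= temp_days.a
and temp_data.activated_at < temp_days.e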
)
select
distinct on (temp_order.a)
temp_order.a,
temp_order.e,
coalesce(sum(temp_order.counter) over (partition by temp_order.a),0) as result
from temp_order
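If you don't need the zero-activation weeks, a lighter alternative is to shift the timestamp before truncating. This is only a sketch: date_trunc('week', ...) always cuts at Monday, so adding 2 days moves Saturday onto Monday, and subtracting them again after truncating gives the Saturday that starts the week. The inline vendors rows are made up for illustration:

```sql
WITH vendors(activated_at) AS (
    VALUES (timestamp '2018-01-05 10:00'),   -- Friday
           (timestamp '2018-01-06 00:00'),   -- Saturday
           (timestamp '2018-01-12 23:59'),   -- Friday
           (timestamp '2018-01-13 08:00')    -- Saturday
)
SELECT (date_trunc('week', activated_at + interval '2 days')
        - interval '2 days')::date AS week_start,   -- always a Saturday
       count(*) AS activations
FROM vendors
WHERE activated_at IS NOT NULL
GROUP BY week_start
ORDER BY week_start;
-- 2017-12-30 | 1
-- 2018-01-06 | 2
-- 2018-01-13 | 1
```

Grouping by the week's start date rather than a week number also avoids mixing up the same week number from different years.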

Produce a row for dates that do not exist in a table [duplicate]

I have a postgresql table userDistributions like this :
user_id, start_date, end_date, project_id, distribution
I need to write a query that, given a date range and a user id, outputs the sum of all distributions for every day for that user.
So for the input '2-2-2012' - '2-4-2012' and some user id, the output should look like this:
Date SUM(Distribution)
2-2-2012 12
2-3-2012 15
2-4-2012 34
A user has distribution in many projects, so I need to sum the distributions in all projects for each day and output that sum against that day.
My problem is what I should group by. If I had a single date field (instead of start_date and end_date), then I could just write something like
select date, SUM(distributions) from userDistributions group by date;
but in this case I am stumped as what to do. Thanks for the help.
Use generate_series to produce your dates, something like this:
select dt.d::date, sum(u.distributions)
from userdistributions u
join generate_series('2012-02-02'::date, '2012-02-04'::date, '1 day') as dt(d)
on dt.d::date between u.start_date and u.end_date
group by dt.d::date
Your date format is ambiguous, so I guessed while converting it to ISO 8601.
This is much like @mu's answer.
However, to cover days with no matches you should use LEFT JOIN:
SELECT d.d::date, sum(u.distributions) AS dist_sum
FROM generate_series('2012-02-02'::date, '2012-02-04'::date, '1 day') AS d(d)
LEFT JOIN userdistributions u ON d.d::date BETWEEN u.start_date AND u.end_date
GROUP BY 1
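Two details the queries above leave open are the user filter and the NULL sums on days without matches. A sketch covering both, with made-up rows and the column name distribution from the question's table definition (the answers above use distributions, so adjust to whichever your table has):

```sql
WITH userdistributions(user_id, start_date, end_date, project_id, distribution) AS (
    -- Hypothetical sample rows for user 1
    VALUES (1, date '2012-02-02', date '2012-02-03', 10, 12),
           (1, date '2012-02-03', date '2012-02-03', 11, 3)
)
SELECT g.day::date AS day,
       coalesce(sum(u.distribution), 0) AS dist_sum   -- 0 instead of NULL on empty days
FROM generate_series('2012-02-02'::date, '2012-02-04'::date, interval '1 day') AS g(day)
LEFT JOIN userdistributions u
       ON g.day::date BETWEEN u.start_date AND u.end_date
      AND u.user_id = 1   -- filter in ON, not WHERE, so days without rows are kept
GROUP BY 1
ORDER BY 1;
-- 2012-02-02 | 12
-- 2012-02-03 | 15
-- 2012-02-04 | 0
```

Putting the user_id condition in a WHERE clause instead would turn the LEFT JOIN back into an inner join and drop the empty days.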

PostgreSQL: are there robust third-party date-math functions to augment the built-in date operators?

I'm porting some T-SQL stored procs to PL/pgSql and, being very new to PostgreSQL, don't know what helpful utility functions might be available in the pg community. Is there a set of robust date-math functions that "nearly everybody uses" out there somewhere? I don't want to quickly cobble together some date-math functions if there's already a great package out there.
The PostgreSQL date math operators with "natural language" string literal arguments are user-friendly if you're typing a query and you happen to know the interval:
select now() - interval '1 day'
but if the interval 1 is the result of a calculation involving nested date-math function calls, these string literals are actually not very user-friendly at all, and it would be easier to work with a date_add function:
select dateadd(d, {calculation that returns the interval}, now() )
Thanks
Let me give you an example. I want to subtract from an arbitrary date the number of months that have elapsed since 1/1/1970, and then add that number of months to 1/1/1970 to return the first day of the month in which the arbitrary date falls
select (date_trunc('month', '2013-01-30'::date))::date
Or add a month to the first day of this month to get the first day of the next month, then subtract one day to get the last day of this month
select date_trunc('month', '2013-01-30'::date + 1 * interval '1 month')::date - 1
Notice in the above example you can add any number of months by multiplying the interval '1 month' by an integer. You can do that with any interval without manipulating the string '1 month'. So to add or subtract any interval you just:
select current_date + 5 * interval '1 month'
No need for messy string manipulations. You can multiply by fractions also:
select current_timestamp + 3.5 * interval '1 minute'
To add or subtract days to a date type you use an integer:
select current_date + 10
The "natural language" strings you're talking about are interval literals. Intervals can also be obtained by using date arithmetic.
Surely dateadd can be quite simply emulated in PostgreSQL as follows:
select d + ({calculation that returns the interval}::text || ' day')::interval
Substitute "month" or "hours" etc as appropriate.
In PostgreSQL, you simply add and subtract interval values to datetime values:
'2001-06-27 14:43:21'::TIMESTAMP - '00:10:00'::INTERVAL = '2001-06-27 14:33:21'::TIMESTAMP
'2001-06-27 14:43:21'::TIMESTAMP - '2001-06-27 14:33:21'::TIMESTAMP = '00:10:00'::INTERVAL
For more information, see "Functions and Operators" in the PostgreSQL online docs.
To compute the first day of the month of a date: date_trunc('month', date)
First day of the next month: date_trunc('month', date) + '1 month'::INTERVAL
Add three months to the first day of the month of this date: date_trunc('month', date) + 3*('1 month'::INTERVAL)
The interval is a data type, not a string, and you can do computations with its values.
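To tie this back to the dateadd question: when the amount is itself computed, you can multiply an interval by the expression or, on PostgreSQL 9.4 and later, pass it to make_interval. A small sketch with arbitrary example values:

```sql
SELECT
    -- computed number of months times a unit interval
    date '2013-01-30' + (2 + 3) * interval '1 month'      AS plus_computed_months,  -- 2013-06-30 00:00:00
    -- the same idea with make_interval and a named argument
    date '2013-01-30' + make_interval(days => 2 * 5)      AS plus_computed_days,    -- 2013-02-09 00:00:00
    -- last day of the month: first day of next month minus one day
    (date_trunc('month', date '2013-01-30')
        + interval '1 month' - interval '1 day')::date    AS last_day_of_month;     -- 2013-01-31
```

Both forms avoid building an interval out of concatenated text, which keeps the calculation typed from start to finish.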