Calculate the difference between two dates in business days - postgresql

I'm trying to calculate the difference in business days between two dates. In my searches I found approaches using functions and generate_series(), but I would like to find something more practical.
Sample Table:
|start_date |end_date |
|2022-06-01 |2022-06-01|
|2022-05-29 |2022-06-02|

What would you consider more practical? generate_series and extract seem to be made for exactly what you are asking:
select start_date, end_date, count(*) "Business days"
from sometable
cross join generate_series(start_date,end_date,interval '1 day') gs(dt)
where extract(isodow from dt) < 6
group by start_date, end_date;
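One caveat with the query above: a range that contains no weekdays (say, Saturday to Sunday) produces no rows after the where filter, so it disappears from the result instead of showing 0. A minimal sketch of a variant that keeps every row by counting inside a lateral subquery; the inline sample data, including the weekend-only third row, is an assumption added for illustration:

```sql
-- Hypothetical sample data standing in for the table in the question.
with sometable(start_date, end_date) as (
    values (date '2022-05-29', date '2022-06-02'),   -- Sunday to Thursday -> 4
           (date '2022-06-01', date '2022-06-01'),   -- a single Wednesday -> 1
           (date '2022-06-04', date '2022-06-05')    -- Saturday to Sunday -> 0
)
select s.start_date, s.end_date, b.business_days
from sometable s
cross join lateral (
    -- count(*) without group by always returns exactly one row,
    -- so ranges with no weekdays still appear, with a count of 0
    select count(*) as business_days
    from generate_series(s.start_date, s.end_date, interval '1 day') gs(dt)
    where extract(isodow from gs.dt) < 6
) b
order by s.start_date;
```

Public holidays are not handled here; that would need a holiday table of your own to exclude.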

Related

Postgres: How to change start day of week and apply it in date_part?

with partial as(
select
date_part('week', activated_at) as weekly,
count(*) as count
from vendors
where activated_at notnull
group by weekly
)
This query counts the number of vendors activated per week. I need to change the start day of the week from Monday to Saturday. Similar posts, like "how to change the first day of the week in PostgreSQL" or "Making Postgres date_trunc() use a Sunday based week", do not explain how to embed this in the date_part function. I would like to know how to use this function in my query with the week starting on Saturday.
Thanks in advance.
Maybe a little bit overkill for that, but you can use some CTEs and window functions. First generate your intervals, starting with the first Saturday you want (e.g. 2018-01-06 00:00) and ending with the last day you want (2018-12-31). Then select your data, join it and sum it. As a benefit you also get weeks with zero activations:
with temp_days as (
SELECT a as a ,
a + '7 days'::interval as e
FROM generate_series('2018-01-06 00:00'::timestamp,
'2018-12-31 00:00', '7 day') as a
),
temp_data as (
select
1 as counter,
vendors.activated_at
from vendors
where activated_at notnull
),
temp_order as
(
select *
from temp_days
-- half-open range, so a timestamp exactly on a week boundary is counted only once
left join temp_data on temp_data.activated_at >= temp_days.a and temp_data.activated_at < temp_days.e
)
select
distinct on (temp_order.a)
temp_order.a,
temp_order.e,
coalesce(sum(temp_order.counter) over (partition by temp_order.a),0) as result
from temp_order
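If the zero-activation weeks are not needed, a lighter option may be to shift the timestamp before truncating: date_trunc('week', ...) cuts at Monday, so adding two days first and subtracting them afterwards moves the boundary to Saturday. A minimal sketch, with made-up sample rows standing in for the vendors table:

```sql
-- Hypothetical sample rows standing in for the vendors table.
with vendors(activated_at) as (
    values (timestamp '2018-01-05 10:00'),   -- Friday, week starting Sat 2017-12-30
           (timestamp '2018-01-06 00:00'),   -- Saturday, week starting Sat 2018-01-06
           (timestamp '2018-01-12 23:59'),   -- Friday, same week as the row above
           (null)
)
select (date_trunc('week', activated_at + interval '2 days')
        - interval '2 days')::date as week_start,
       count(*) as activations
from vendors
where activated_at is not null
group by week_start
order by week_start;
-- 2017-12-30 | 1
-- 2018-01-06 | 2
```

The same shift should work inside date_part('week', activated_at + interval '2 days') if you want a week number rather than a start date; the number is then the ISO week of the shifted date.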

Produce a row for dates that do not exist in a table [duplicate]

I have a postgresql table userDistributions like this :
user_id, start_date, end_date, project_id, distribution
I need to write a query that, for a given date range and user id, outputs the sum of all distributions for every day for that user.
So for the input '2-2-2012' - '2-4-2012' and some user id, the output should look like this:
Date SUM(Distribution)
2-2-2012 12
2-3-2012 15
2-4-2012 34
A user has distribution in many projects, so I need to sum the distributions in all projects for each day and output that sum against that day.
My problem is: what should I group by? If I had a single date field (instead of start_date and end_date), I could just write something like
select date, SUM(distributions) from userDistributions group by date;
but in this case I am stumped as to what to do. Thanks for the help.
Use generate_series to produce your dates, something like this:
select dt.d::date, sum(u.distributions)
from userdistributions u
join generate_series('2012-02-02'::date, '2012-02-04'::date, '1 day') as dt(d)
on dt.d::date between u.start_date and u.end_date
group by dt.d::date
Your date format is ambiguous, so I had to guess while converting it to ISO 8601.
This is much like @mu's answer.
However, to cover days with no matches you should use LEFT JOIN:
SELECT d.d::date, sum(u.distributions) AS dist_sum
FROM generate_series('2012-02-02'::date, '2012-02-04'::date, '1 day') AS d(d)
LEFT JOIN userdistributions u ON d.d::date BETWEEN u.start_date AND u.end_date
GROUP BY 1
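Two details are easy to trip over with the LEFT JOIN version: sum() returns NULL (not 0) for days without a match, and a user filter placed in WHERE would throw those days away again. A sketch under assumed sample data (the rows and the user id 1 are made up; the column is called distribution here, as in the question's table description):

```sql
-- Hypothetical rows; column names follow the question's table description.
with userdistributions(user_id, start_date, end_date, project_id, distribution) as (
    values (1, date '2012-02-01', date '2012-02-02', 10, 12),
           (1, date '2012-02-02', date '2012-02-02', 11, 3),
           (2, date '2012-02-03', date '2012-02-04', 10, 7)
)
select d.d::date as dist_date,
       coalesce(sum(u.distribution), 0) as dist_sum   -- 0 instead of NULL on empty days
from generate_series(date '2012-02-02', date '2012-02-04', interval '1 day') as d(d)
left join userdistributions u
       on d.d::date between u.start_date and u.end_date
      and u.user_id = 1        -- filter in ON, not WHERE, so empty days survive
group by 1
order by 1;
-- 2012-02-02 | 15
-- 2012-02-03 | 0
-- 2012-02-04 | 0
```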

Postgres generate_series excluding date ranges

I'm creating a subscription management system, and need to generate a list of upcoming billing dates for the next 2 years. I've been able to use generate_series to get the appropriate dates as such:
SELECT i::DATE
FROM generate_series('2015-08-01', '2017-08-01', '1 month'::INTERVAL) i
The last step I need to take is to exclude specific date ranges from the calculation. These excluded date ranges may be any range of time. Additionally, they should not be factored into the time range for the generate_series.
For example, say we have a date range exclusion from '2015-08-27' to '2015-09-03'. The resulting generate_series should exclude the dates in that week from the calculation, and basically push all future monthly billing dates one week into the future:
2015-08-01
2015-09-10
2015-10-10
2015-11-10
2015-12-10
First you create a time series of dates over the next two years, EXCEPT your blackout dates:
SELECT dt
FROM generate_series('2015-08-01'::date, '2017-08-01'::date, interval '1 day') AS s(dt)
EXCEPT
SELECT dt
FROM generate_series('2015-08-27'::date, '2015-09-03'::date, interval '1 day') as ex1(dt)
Note that you can have as many EXCEPT clauses as you need. For individual blackout days (as opposed to ranges) you could use a VALUES clause instead of a SELECT.
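For illustration, a short sketch of the VALUES variant for individual blackout days (the two dates and the shortened one-week series are arbitrary examples, not from the question). The series is cast to date so that both sides of EXCEPT are plain dates:

```sql
select dt::date as dt
from generate_series(date '2015-08-01', date '2015-08-07', interval '1 day') as s(dt)
except
values (date '2015-08-03'), (date '2015-08-05')   -- individual blackout days
order by dt;
-- returns Aug 1, 2, 4, 6 and 7
```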
Then you window over that time-series to generate row numbers of billable days:
SELECT row_number() OVER (ORDER BY dt) AS rn, dt
FROM (<query above>) x
Then you select those days where you want to bill:
SELECT dt
FROM (<query above>) y
WHERE rn % 30 = 1; -- billing on the first day of the period
(This last query follows Craig's advice of billing every 30 days.)
Yields:
SELECT dt
FROM (
SELECT row_number() OVER (ORDER BY dt) AS rn, dt
FROM (
SELECT dt
FROM generate_series('2015-08-01'::date, '2017-08-01'::date, interval '1 day') AS s(dt)
EXCEPT
SELECT dt
FROM generate_series('2015-08-27'::date, '2015-09-03'::date, interval '1 day') as ex1(dt)
) x
) y
WHERE rn % 30 = 1;
You will have to split the call to generate_series around the exclusions. Something like this:
A union of 3 queries:
The first query pulls dates from the start up to the exclusion range's "from" date.
The second query pulls dates from the exclusion range's "to" date up to your end date.
The third query pulls dates when none of your series dates cross the exclusion range.
Note: You still need a way to loop through exclusion list (if you have one). Also this query may not be very efficient as such scenarios can be better handled through functions or procedural code.
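If the exclusion ranges live in a table, one way to avoid looping may be an anti-join: generate every day, drop the days covered by any exclusion row with NOT EXISTS, then number the remaining days. A minimal sketch; the blackout CTE and its column names are hypothetical stand-ins for such a table, and the 30-day billing period is an assumption:

```sql
with blackout(from_date, to_date) as (   -- hypothetical exclusion list
    values (date '2015-08-27', date '2015-09-03')
),
billable as (
    select s.dt::date as dt,
           -- numbered after the where filter, so excluded days get no number
           row_number() over (order by s.dt) as rn
    from generate_series(date '2015-08-01', date '2017-08-01', interval '1 day') as s(dt)
    where not exists (
        select 1
        from blackout b
        where s.dt::date between b.from_date and b.to_date
    )
)
select dt
from billable
where rn % 30 = 1      -- first day of each 30-day billing period
order by dt;
-- starts with 2015-08-01, then 2015-09-08
```

Adding more rows to blackout excludes more ranges without changing the query.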

PostgreSQL subquery not working

What's wrong with this query?
select extract(week from created_at) as week,
count(*) as received,
(select count(*) from bugs where extract(week from updated_at) = a.week) as done
from bugs as a
group by week
The error message is:
column a.week does not exist
UPDATE:
following the suggestion of the first comment, I tried this:
select a.extract(week from created_at) as week,
count(*) as received, (select count(*)
from bugs
where extract(week from updated_at) = a.week) as done from bugs as a group by week
But it doesn't seem to work:
ERROR: syntax error at or near "from"
LINE 1: select a.extract(week from a.created_at) as week, count(*) a...
As far as I can tell you don't need the sub-select at all:
select extract(week from created_at) as week,
count(*) as received,
sum( case when extract(week from updated_at) = extract(week from created_at) then 1 end) as done
from bugs
group by week
This counts all bugs per week and counts those that are updated in the same week as "done".
Note that your query will only report correct values if you never have more than one year in your table.
If you have more than one year of data in the table you need to include the year in the comparison as well:
select to_char(created_at, 'iyyy-iw') as week,
count(*) as received,
sum( case when to_char(created_at, 'iyyy-iw') = to_char(updated_at, 'iyyy-iw') then 1 end) as done
from bugs
group by week
Note that I used IYYY and IW to cater for the ISO definition of the year and the week around the year end/start.
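On PostgreSQL 9.4 or later, the same conditional count can also be written with an aggregate FILTER clause, which returns 0 rather than NULL for weeks where nothing matches. A sketch with made-up sample rows standing in for the bugs table:

```sql
with bugs(created_at, updated_at) as (   -- hypothetical sample rows
    values (timestamp '2013-01-07 09:00', timestamp '2013-01-09 12:00'),  -- same ISO week
           (timestamp '2013-01-08 09:00', timestamp '2013-01-21 12:00'),  -- updated two weeks later
           (timestamp '2013-01-15 09:00', timestamp '2013-01-16 12:00')   -- same ISO week
)
select to_char(created_at, 'iyyy-iw') as week,
       count(*) as received,
       count(*) filter (
           where to_char(created_at, 'iyyy-iw') = to_char(updated_at, 'iyyy-iw')
       ) as done
from bugs
group by week
order by week;
-- 2013-02 | 2 | 1
-- 2013-03 | 1 | 1
```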
Maybe a little explanation on why your original query did not work would be helpful:
The "outer" query uses two aliases
a table alias for bugs named a
a column alias for the expression extract(week from created_at) named week
The column alias week can only be used in the group by (or order by) clause, not elsewhere on the same query level.
To the sub-select (select count(*) from bugs where extract(week from updated_at) = a.week) the alias a is visible, but not the alias week (that's how the SQL standard is defined).
To get your subselect working (in terms of column visibility) you would need to reference the full expression of the "outer" column:
(select count(*) from bugs b where extract(week from b.updated_at) = extract(week from a.created_at))
Note that I introduced another table alias b in order to make it clear which column stems from which alias.
But even then you'd have a problem with the grouping as you can't reference an ungrouped column like that.
That could work as well:
with origin as (
select extract(week from created_at) as week, count(*) as received
from bugs
group by week
)
select week, received,
(select count(*) from bugs where week = extract(week from updated_at) )
from origin
It should have good performance.

How to get the no of Days from the difference of two dates in PostgreSQL?

select extract(day from age('2013-04-06','2013-04-04'));
gives me the number of days, i.e. 2 days,
but it fails when the dates are in different months:
select extract(day from age('2013-05-02','2013-04-01'));
So what I need is to get the number of days, i.e. 32 days.
Subtraction seems more intuitive.
select '2013-05-02'::date - '2013-04-01'::date
Run this query to see why the result is 31 instead of 32:
with dates as (
select generate_series('2013-04-01'::date, '2013-05-02'::date, '1 day')::date as end_date,
'2013-04-01'::date as start_date
)
select end_date, start_date, end_date - start_date as difference
from dates
order by end_date
The difference between today and today is zero days. Whether that matters is application-dependent. You can just add 1 if you need to.
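As for why the age() version in the question goes wrong: age() returns an interval broken into months and days, and extract(day ...) only reads the day field of that interval, not the total. A small side-by-side sketch using the dates from the question:

```sql
select age(date '2013-05-02', date '2013-04-01')                   as age_interval,    -- 1 mon 1 day
       extract(day from age(date '2013-05-02', date '2013-04-01')) as day_field,       -- 1
       date '2013-05-02' - date '2013-04-01'                       as days_between,    -- 31
       date '2013-05-02' - date '2013-04-01' + 1                   as days_inclusive;  -- 32
```

Subtracting two date values yields a plain integer, which is why adding 1 for an inclusive count works directly.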