I have a query that returns all dates between 01/01/2011 and 12/31/2041 (excluding Saturdays and Sundays). It works fine, but when I try to put the filter "TRIM(TO_CHAR(dt, 'DAY')) NOT IN ('SATURDAY', 'SUNDAY')" inside the WITH, it only returns one row. Any suggestions? Please keep in mind that I'm brand new to PostgreSQL.
WITH RECURSIVE bd(dt,rn) AS
(
SELECT
TO_DATE('01/01/2011', 'MM/DD/YYYY') AS dt,
1 AS rn
UNION
SELECT
dt + 1, rn + 1
FROM
bd
WHERE
dt BETWEEN TO_DATE('01/01/2011', 'MM/DD/YYYY') AND TO_DATE('12/31/2041', 'MM/DD/YYYY') - 1
)
SELECT
dt, rn
FROM
bd
WHERE TRIM(TO_CHAR(dt, 'DAY')) NOT IN ('SATURDAY', 'SUNDAY')
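The filter fails inside the WITH because 01/01/2011 is a Saturday: the recursive term produces no row from that first row, so the recursion stops immediately and only the starting row is returned. There is no need for a recursive query, though. This can be done much more easily using generate_series()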
select t.dt::date
from generate_series(date '2011-01-01', date '2041-12-31', interval '1 day') as t(dt)
where extract(isodow from t.dt) not in (6,7);
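If the rn column from the question is also wanted, a minimal sketch is to number the generated rows with row_number(). Note the assumption here: rn counts only the weekdays that survive the filter, whereas the recursive query in the question numbered every calendar day before filtering.

```sql
-- Sketch: weekdays between the two dates, numbered consecutively.
-- Assumption: rn should count working days only (this differs from the
-- question's rn, which was assigned before weekends were removed).
select t.dt::date as dt,
       row_number() over (order by t.dt) as rn
from generate_series(date '2011-01-01', date '2041-12-31', interval '1 day') as t(dt)
where extract(isodow from t.dt) not in (6, 7)  -- 6 = Saturday, 7 = Sunday
order by t.dt;
```

The window function is evaluated after the WHERE clause, which is why the numbering has no gaps.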
I have a dataset of sales. To summarize, the structure is
client_id
date_purchase
There might be several purchases done by the same customer on different dates. There can also be several purchases done on the same date (by different or the same customer).
My goal is to get the number of customers, for any given day, that made 2 or more purchases between that day and 90 days prior.
That is, the expected output is
 date_purchase | number_of_customers
---------------+---------------------
 2022-12-19    | 200
 2022-12-18    | 194
 (...)
Please note that this gives, for any given date, the number of customers with 2+ purchases between that date and 90 days prior.
I know it has something to do with a window function. But so far I have not found a way to calculate, for every window of 90 days, how many customers have done 2+ purchases.
I've tried several window functions with no success:
partition by date_purchase
range between interval '90 days' preceding and current row
So far I have not managed to calculate the number correctly for each date.
A window function doesn't seem to be relevant here because there is no relationship between the rows of the same window. A simple query or a self-join query should provide the expected result.
Assuming that client_id and date_purchase are two columns of my_table :
1. Query for a given date reference_date (a placeholder for the date parameter):
SELECT a.reference_date AS date_purchase, count(*) AS number_of_customers
FROM ( SELECT reference_date AS reference_date, client_id
FROM my_table
WHERE date_purchase <= reference_date AND date_purchase >= reference_date - INTERVAL '90 days'
GROUP BY client_id
HAVING count(*) >= 2
) AS a
GROUP BY a.reference_date
2. Query for a given interval of dates reference_date => reference_date + INTERVAL '20 days' :
SELECT a.date AS date_purchase, count(*) AS number_of_customers
FROM ( SELECT ref.date, t.client_id
FROM my_table AS t
INNER JOIN generate_series(reference_date, reference_date + INTERVAL '20 days', '1 day') AS ref(date)
ON t.date_purchase <= ref.date AND t.date_purchase >= ref.date - INTERVAL '90 days'
GROUP BY ref.date, t.client_id
HAVING count(*) >= 2
) AS a
GROUP BY a.date
ORDER BY a.date
3. Query for all the date_purchase values in my_table :
SELECT a.date AS date_purchase, count(*) AS number_of_customers
FROM ( SELECT ref.date, t.client_id
FROM my_table AS t
INNER JOIN (SELECT DISTINCT date_purchase AS date FROM my_table) AS ref
ON t.date_purchase <= ref.date AND t.date_purchase >= ref.date - INTERVAL '90 days'
GROUP BY ref.date, t.client_id
HAVING count(*) >= 2
) AS a
GROUP BY a.date
ORDER BY a.date
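To try query 3 without a real table, here is a small self-contained sketch. The sample rows and the ref_date alias are hypothetical and only serve the illustration; the CTE named my_table stands in for the real table.

```sql
-- Hypothetical sample data; this CTE shadows the real my_table for the demo.
WITH my_table(client_id, date_purchase) AS (
    VALUES (1, date '2022-12-01'),
           (1, date '2022-12-10'),
           (2, date '2022-08-01'),   -- more than 90 days before 2022-12-10
           (2, date '2022-12-10')
)
SELECT a.ref_date AS date_purchase, count(*) AS number_of_customers
FROM ( SELECT ref.ref_date, t.client_id
       FROM my_table AS t
       INNER JOIN (SELECT DISTINCT date_purchase AS ref_date FROM my_table) AS ref
          ON t.date_purchase <= ref.ref_date
         AND t.date_purchase >= ref.ref_date - INTERVAL '90 days'
       GROUP BY ref.ref_date, t.client_id
       HAVING count(*) >= 2           -- keep customers with 2+ purchases in the window
     ) AS a
GROUP BY a.ref_date
ORDER BY a.ref_date;
```

With this data only 2022-12-10 appears, with a count of 1 (client 1). Dates on which no customer qualifies are missing from the output rather than listed with 0; if zero rows are needed, the list of dates would have to be LEFT JOINed to the aggregated result.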
Here is a complex query to which I need to pass some dates dynamically. For now I have hardcoded the two dates '2021-08-01' and '2022-07-31'.
I need to pass these dates dynamically, so that for the month 2022-06 the dates passed are '2021-07-01' and '2022-06-30', i.e. the 12 months of data ending with that month.
If we take 2022-05, then the dates passed should be '2021-06-01' and '2022-05-31'.
How can we achieve this? Any suggestions or help will be much appreciated.
Below is the query for reference:
WITH base as
(
SELECT created_at as period ,order_number, TRIM(email) as email ,is_first_order
FROM orders
WHERE created_at::DATE BETWEEN '2021-08-01' AND '2022-07-31'
)
,base_agg as
(
select TO_CHAR(period,'YYYY-MM') as period
,COUNT(DISTINCT email)FILTER(WHERE is_first_order IS TRUE) as new_users
,COUNT(DISTINCT order_number)FILTER(WHERE is_first_order IS FALSE) as returning_orders
FROM base
GROUP BY 1
)
,base_cumulative as
(
SELECT ROW_NUMBER() OVER(ORDER BY PERIOD DESC ) as rno
,period
,new_users
,returning_orders
,sum("new_users")over (order by "period" asc rows between unbounded preceding and current row) as "cumulative_total"
from base_agg
)
SELECT
(SELECT period FROM base_cumulative WHERE rno=1) period
,(SELECT cumulative_total FROM base_cumulative WHERE rno=1) as cumulated_customers
,SUM(returning_orders) as returning_orders
,SUM(returning_orders)/NULLIF((SELECT cumulative_total FROM base_cumulative WHERE rno=1),0) as rate
FROM base_cumulative
You can calculate the end of the current month from NOW() with some date arithmetic; the same approach can be applied to the rest of the calculation
select date_trunc('month', now())::date + interval '1 month - 1 day' end_of_this_month,
date_trunc('month', now())::date + interval '1 month - 1 day'::interval - '1 year'::interval + '1 day'::interval first_day_of_prev_year_month
;
Result
end_of_this_month | first_day_of_prev_year_month
---------------------+------------------------------
2022-08-31 00:00:00 | 2021-09-01 00:00:00
(1 row)
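Applied to the question's requirement (a 12-month window ending with a chosen month), the same arithmetic could look like the sketch below. The params and bounds CTE names and the hard-coded sample date are assumptions for illustration; the sample date could be replaced by now() or by a bound parameter.

```sql
-- Assumption: the reporting month is supplied as any date inside that month
-- (hard-coded here as 2022-07-15 purely for illustration).
WITH params AS (
    SELECT date_trunc('month', date '2022-07-15') AS month_start
),
bounds AS (
    SELECT (month_start - interval '11 months')::date       AS period_start,
           (month_start + interval '1 month - 1 day')::date AS period_end
    FROM params
)
SELECT period_start, period_end
FROM bounds;
-- In the question's query, the hard-coded filter could then be written as:
--   FROM orders CROSS JOIN bounds
--   WHERE created_at::date BETWEEN bounds.period_start AND bounds.period_end
```

For the sample date this yields 2021-08-01 and 2022-07-31, the pair that was hard-coded in the question.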
I have a pickupDate and a returnDate in my OrderHistory table. I want to extract the sum of rental days of all OrderHistory entries, grouped/ordered by month. A CTE seems to be the solution, but I don't get how to implement it in my query, since the CTEs I saw were referring to themselves where it says "FROM cte".
I tried something like this:
SELECT
SUM((EXTRACT (DAY FROM("OrderHistory"."returnDate")-("OrderHistory"."pickupDate")))) as traveltime
, to_char("OrderHistory"."pickupDate"::date, 'YYYY-MM') as M
FROM
"OrderHistory"
GROUP BY
M
ORDER BY
M
But the outcome doesn't split bookings between two months (e.g. pickupDate = 27 March 2022 and returnDate = 3 April 2022); it assigns the whole 7 days to March, since the pickup date is in that month. It should show 4 days in March and 3 in April.
Sorry for the probably very stupid question, but I am a beginner. (My code is written in PostgreSQL, by the way.)
See "PostgreSQL naming conventions" and "Are PostgreSQL column names case-sensitive?": use legal, lower-case names exclusively so double-quoting is not needed.
Final result in db fiddle
Add a daterange column holding the month of the pickup date:
alter table order_history add column date_ranges daterange;
with a(m_begin, m_end, pickup_date) as
(select date_trunc('month', pickup_date)::date,
(date_trunc('month', pickup_date) + interval '1 month - 1 day')::date,
pickup_date from order_history)
update order_history set date_ranges =
daterange(a.m_begin, a.m_end,'[]') from a
where a.pickup_date = order_history.pickup_date;
Then the final query:
WITH A AS(
select
pickup_date,
return_date,
return_date - pickup_date as total,
case when return_date <@ date_ranges then (return_date - pickup_date)
else ( date_trunc('month', pickup_date) + interval '1 month - 1 day')::date - pickup_date
end partial_mth
from order_history),
b as (SELECT *, a.total - partial_mth parital_not_mth FROM a)
select *,
case when to_char(pickup_date,'YYYY-MM') = to_char(return_date,'YYYY-MM')
then
sum(partial_mth) over(partition by to_char(pickup_date,'YYYY-MM')) +
sum(parital_not_mth) over (partition by to_char(return_date,'YYYY-MM'))
else sum(partial_mth) over(partition by to_char(pickup_date,'YYYY-MM'))
end
from b;
After trying different things I think I found the best answer to my question, that I want to share with the community:
WITH hier as (
SELECT
"OrderHistory"."pickupDate" as start_date
, "OrderHistory"."returnDate" as end_date
, to_char("OrderHistory"."pickupDate"::date, 'YYYY-MM') as M
FROM
"OrderHistory"
GROUP BY
1, 2, 3
ORDER BY
3
), calendar as (
select date '2022-01-01' + (n || ' days')::interval calendar_date
from generate_series(0, 365) n
)
select
to_char(calendar_date::date, 'YYYY-MM')
, count(*) as tage_gebucht
from calendar
inner join hier on calendar.calendar_date between start_date and end_date
where calendar_date between '2022-01-01' and '2022-12-31'
group by 1
order by 1;
I think this is the simplest solution I came up with.
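A variant of the same idea that avoids a fixed calendar year is to expand each booking into its own days with generate_series() and then group by month. This is only a sketch under assumptions: it uses a hypothetical snake_case sample table supplied via a CTE, and it counts the days after the pickup date up to and including the return date, which is the convention that gives the "4 days in March and 3 in April" split described in the question.

```sql
-- Hypothetical sample booking taken from the question's example.
WITH order_history(pickup_date, return_date) AS (
    VALUES (date '2022-03-27', date '2022-04-03')
)
SELECT to_char(g.d, 'YYYY-MM') AS booking_month,
       count(*)                AS rental_days
FROM order_history AS o
-- One row per rental day: the day after pickup through the return day.
CROSS JOIN LATERAL generate_series(o.pickup_date + 1,
                                   o.return_date,
                                   interval '1 day') AS g(d)
GROUP BY 1
ORDER BY 1;
```

For the sample booking this returns 2022-03 with 4 days and 2022-04 with 3 days. If the pickup day itself should count as a rental day, the series would start at o.pickup_date instead.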
I have a table,
and I have a range from '2019-01-02' to '2019-01-04'.
I need to generate ID and date rows for the records in my table whose started_at and ended_at (nullable) fall within the given range.
The result must look like this:
ID 4 from the table is not included in the result because its started_at and ended_at are not in the range '2019-01-02' to '2019-01-04'.
I need a query that will do that in Postgres.
Use generate_series()
select t.id, g.dt::date
from the_table t
cross join generate_series(t.started_at::date + 1,
least(t.ended_at::date, date '2019-01-04'),
interval '1 day') as g(dt)
where t.started_at >= date '2019-01-02'
and t.started_at < date '2019-01-04';
This variant worked:
select t.id, g.dt::date from the_table t
cross join generate_series(t.started_at::date + 1,
least(t.ended_at::date, date '2019-01-04'), interval '1 day') as g(dt)
where dt >= date '2019-01-02' and dt < date '2019-01-04';
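Another way to express the range restriction is to clamp both ends of the series with greatest() and least(), so that no rows outside the range are generated in the first place. The sketch below rests on assumptions: the sample rows are hypothetical (the table from the question is not shown here), and it includes started_at itself, so the bounds may need adjusting to match the expected result.

```sql
-- Hypothetical sample rows; ended_at may be NULL (still running).
WITH the_table(id, started_at, ended_at) AS (
    VALUES (1, date '2019-01-01', date '2019-01-03'),
           (2, date '2019-01-03', NULL::date),
           (4, date '2019-01-10', date '2019-01-12')
)
SELECT t.id, g.dt::date AS dt
FROM the_table AS t
CROSS JOIN LATERAL generate_series(
    greatest(t.started_at, date '2019-01-02'),  -- do not start before the range
    least(t.ended_at, date '2019-01-04'),       -- least() ignores a NULL ended_at
    interval '1 day') AS g(dt)
ORDER BY t.id, g.dt;
```

Row 4 produces no output because its clamped start lies after the clamped end, so generate_series() returns an empty set for it.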
Here is my code, but it returns NULL when today is a Friday. I would like to get the last working day.
-- Insert statements for procedure here
--Below is the param you would pass
DECLARE @dateToEvaluate date=GETDATE();
--Routine
DECLARE @startDate date=CAST('1/1/'+CAST(YEAR(@dateToEvaluate) AS char(4)) AS date); -- let's get the first of the year
WITH
tally(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1 FROM sys.all_columns),
dates AS (
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS dt_id,
DATEADD(DAY,n,@startDate) AS dt,
DATENAME(WEEKDAY,DATEADD(DAY,n,@startdate)) AS dt_name
FROM tally
WHERE n<366 --arbitrary
AND DATEPART(WEEKDAY,DATEADD(DAY,n,@startDate)) NOT IN (6)
AND DATEADD(DAY,n,@startDate) NOT IN (SELECT CAST(HolidayDate AS date) FROM Holiday)),
curr_id(id) AS (SELECT dt_id FROM dates WHERE dt=@dateToEvaluate)
SELECT d.dt
FROM dates AS d
CROSS JOIN
curr_id c
WHERE d.dt_id+1=c.id
The code below will take any date and "walk backward" to find the previous weekday (M-F) which is not in the @holidays table.
declare @currentdate datetime = '2015-03-22'
declare @holidays table (holiday datetime)
insert @holidays values ('2015-03-20')
;with cte as (
select
@currentdate k
union all
select
dateadd(day, -1, k)
from cte
where
k = @currentdate
or ((datepart(dw, k) + @@DATEFIRST - 1 - 1) % 7) + 1 > 5 --determine day of week independent of culture
or k in (select holiday from @holidays)
)
select min(k) from cte
The dates table doesn't have any Friday dates in it, because NOT IN (6) removes Fridays under the default DATEFIRST setting. Change the NOT IN (6) to NOT IN (1, 7). This will remove Saturdays and Sundays from the dates table instead.