Join on generate_series and count - postgresql

I'm trying to find the number of users who did action A or action B on a monthly basis.
Table: User
- id
- "creationDate"
Table: action_A
- user_id (= user.id)
- "creationDate"
Table: action_B
- user_id (= user.id)
- "creationDate"
The general idea of what I was trying to do was that I'd find the list of users who did action A in Month X and the list of users who did action B in Month X, then count how many ids are there for every month based on a generate_series of monthly dates.
I tried the following, but the query times out, and I'm not sure whether it can be optimized (or whether it is even correct).
SELECT monthseries."Month", count(*)
FROM
(SELECT to_char(DAY::date, 'YYYY-MM') AS "Month"
FROM generate_series('2014-01-01'::date, CURRENT_DATE, '1 month') DAY) monthseries
LEFT JOIN
(SELECT to_char("creationDate", 'YYYY-MM') AS "Month",
id
FROM action_A) did_action_A ON monthseries."Month" = did_action_A."Month"
LEFT JOIN
(SELECT to_char("creationDate", 'YYYY-MM') AS "Month",
id
FROM action_B) did_action_B ON monthseries."Month" = did_action_B."Month"
GROUP BY monthseries."Month"
Any comments/ help would be immensely helpful!

If you want to count distinct users:
select to_char(month, 'YYYY-MM') as "Month",
count(s.user_id) as users -- count(*) would report 1 for months with no activity
from
generate_series(
'2014-01-01'::date, current_date, '1 month'
) monthseries (month)
left join (
(
-- the action tables expose user_id (not id), per the schema in the question
select distinct date_trunc('month', "creationDate") as month, user_id
from action_a
) a
full outer join (
select distinct date_trunc('month', "creationDate") as month, user_id
from action_b
) b using (month, user_id)
) s using (month)
group by 1
order by 1
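As a possible alternative, here is a sketch that stacks both action tables with UNION ALL and joins each month to a half-open date range instead of comparing formatted strings. It assumes both action tables expose user_id and "creationDate" as listed in the question; the aliases m and u are made up for illustration. Whether it is faster depends on the data and on an index on "creationDate", so treat it as something to test rather than a guaranteed fix.

```sql
-- Sketch only: assumes action_a and action_b both have user_id and "creationDate".
select to_char(m.month, 'YYYY-MM') as "Month",
       count(distinct u.user_id)   as users   -- 0 for months with no activity
from generate_series(
         timestamp '2014-01-01',
         date_trunc('month', now())::timestamp,
         interval '1 month'
     ) as m(month)
left join (
    select user_id, "creationDate" from action_a
    union all
    select user_id, "creationDate" from action_b
) u
  on u."creationDate" >= m.month
 and u."creationDate" <  m.month + interval '1 month'
group by m.month
order by m.month;
```

A user who did both actions in the same month is counted once, because the count is over distinct user_id values.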

Related

postgresql query for getting all the values which pass through a state

I have a table with four columns: id (primary key), created_at (timestamp), order_id, and order_state_id.
I want a query that, grouped by day, returns for each day the order_id and created_at of the orders whose order_state_id is within a given list of values.
I came up with something like this:
select *
from marketplace_management.orders mmo
inner join (
select max(created_at),order_id,date(created_at+ interval '2 hours') as zian
from marketplace_management.order_state_updates
group by zian,order_id
having max(order_state_id) <4
) stategr
on mmo.id =stategr.order_id
and date(mmo.created_at+ interval '2 hours')=stategr.zian
and mmo.created_at between '2022-03-17 22:00:00' AND '2022-03-18 22:00:00'
Basically, I want the rows whose order_state_id is 1, 2, or 3 on a given day, even if the order moves to another state the next day.
select *
from marketplace_management.orders mmo
inner join
(select inter1.order_id,inter1.zian,states,
case when position(',' in states)=0 then states
else substring(states,0,position(',' in states))end as "ult_stare"
from
(select mmosu.order_id,
string_agg(cast(mmosu.order_state_id as varchar),','
order by created_at desc) as states,
date(mmosu.created_at+ interval '2 hours') as zian
from marketplace_management.order_state_updates mmosu
group by date(mmosu.created_at+ interval '2 hours'),mmosu.order_id)inter1
) laststate
on mmo.id=laststate.order_id and date(mmo.created_at+ interval '2 hours')=laststate.zian and
mmo.created_at between '2022-03-17 22:00:00' AND '2022-03-18 22:00:00' and
laststate.ult_stare in ('1','2','3')
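The "last state per order per day" step can also be sketched with PostgreSQL's DISTINCT ON instead of aggregating the states into a string and cutting off the first element. This is only an illustration: it assumes order_state_id is an integer column, and last_state is a name introduced here.

```sql
-- Sketch only: assumes order_state_id is an integer.
select mmo.*, laststate.zian, laststate.last_state
from marketplace_management.orders mmo
inner join (
    -- keep one row per (order, shifted day): the most recent state update
    select distinct on (order_id, date(created_at + interval '2 hours'))
           order_id,
           date(created_at + interval '2 hours') as zian,
           order_state_id                        as last_state
    from marketplace_management.order_state_updates
    order by order_id, date(created_at + interval '2 hours'), created_at desc
) laststate
  on mmo.id = laststate.order_id
 and date(mmo.created_at + interval '2 hours') = laststate.zian
where mmo.created_at between '2022-03-17 22:00:00' and '2022-03-18 22:00:00'
  and laststate.last_state in (1, 2, 3);
```

DISTINCT ON keeps the first row of each group in ORDER BY order, so sorting by created_at desc within the group selects the latest state of that day.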

How to get number of consecutive days from current date using postgres?

I want to get the number of consecutive days from the current date using Postgres SQL.
(The original post included an image here showing the scenario, with the expected consecutive-day counts highlighted.)
Below is the SQL query which I have created but it's not returning the expected result
with grouped_dates as (
select user_id, created_at::timestamp::date,
(created_at::timestamp::date - (row_number() over (partition by user_id order by created_at::timestamp::date) || ' days')::interval)::date as grouping_date
from watch_history
)
select * , dense_rank() over (partition by grouping_date order by created_at::timestamp::date) as in_streak
from grouped_dates where user_id = 702
order by created_at::timestamp::date
Can anyone please help me resolve this issue?
If there is a way to apply DISTINCT to the created_at field in the query below, that would solve my issue.
WITH list AS
(
SELECT user_id,
(created_at::timestamp::date - (row_number() over (partition by user_id order by created_at::timestamp::date) || ' days')::interval)::date as next_day
FROM watch_history
)
SELECT user_id, count(*) AS number_of_consecutive_days
FROM list
WHERE next_day IS NOT NULL
GROUP BY user_id
Does anyone have an idea how to apply DISTINCT to created_at in the query above?
To get the "number of consecutive days" for the same user_id :
WITH list AS
(
SELECT user_id
, array_agg(created_at) OVER (PARTITION BY user_id ORDER BY created_at RANGE BETWEEN CURRENT ROW AND '1 day' FOLLOWING) AS consecutive_days
FROM watch_history
)
SELECT user_id, count(DISTINCT d.day) AS number_of_consecutive_days
FROM list
CROSS JOIN LATERAL unnest(consecutive_days) AS d(day)
WHERE array_length(consecutive_days, 1) > 1
GROUP BY user_id
To get the list of "consecutive days" for the same user_id :
WITH list AS
(
SELECT user_id
, array_agg(created_at) OVER (PARTITION BY user_id ORDER BY created_at RANGE BETWEEN CURRENT ROW AND '1 day' FOLLOWING) AS consecutive_days
FROM watch_history
)
SELECT user_id
, array_agg(DISTINCT d.day ORDER BY d.day) AS list_of_consecutive_days
FROM list
CROSS JOIN LATERAL unnest(consecutive_days) AS d(day)
WHERE array_length(consecutive_days, 1) > 1
GROUP BY user_id
full example & result in dbfiddle
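For the specific case asked about (the streak that ends on the current date, with duplicate watch events on the same day counted once), a gaps-and-islands sketch along the lines of the asker's row_number() attempt might look like this. The CTE names days and grouped and the columns watch_day and grp are made up for illustration; it assumes created_at can be cast to date.

```sql
-- Sketch only: one row per user and calendar day, then group consecutive days.
with days as (
    select distinct user_id, created_at::date as watch_day
    from watch_history
),
grouped as (
    select user_id,
           watch_day,
           -- consecutive days share the same (day minus row number) value
           watch_day - (row_number() over (partition by user_id
                                           order by watch_day))::int as grp
    from days
)
select g.user_id, count(*) as number_of_consecutive_days
from grouped g
where g.grp = (select g2.grp
               from grouped g2
               where g2.user_id = g.user_id
                 and g2.watch_day = current_date)
group by g.user_id;
```

Users with no row for current_date drop out of the result, because the subquery returns NULL for them.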

Is there a SQL code for cumulative count of SaaS customer over months?

I have a table with:
ID (id client), date_start (subscription of SaaS), date_end (could be a date value or be NULL).
So I need a cumulative count of active clients month by month.
Any idea how to write that in Postgres and achieve this result?
I started from this, but I don't know how to proceed:
select
date_trunc('month', c.date_start)::date,
count(*)
from customer
Please check the following solution:
select
subscribed_date,
subscribed_customers,
unsubscribed_customers,
coalesce(subscribed_customers, 0) - coalesce(unsubscribed_customers, 0) cumulative
from (
select distinct
date_trunc('month', c.date_start)::date subscribed_date,
sum(1) over (order by date_trunc('month', c.date_start)) subscribed_customers
from customer c
order by subscribed_date
) subscribed
left join (
select distinct
date_trunc('month', c.date_end)::date unsubscribed_date,
sum(1) over (order by date_trunc('month', c.date_end)) unsubscribed_customers
from customer c
where date_end is not null
order by unsubscribed_date
) unsubscribed on subscribed.subscribed_date = unsubscribed.unsubscribed_date;
You have a table of customers with a start date and sometimes an end date. You want to group by date, but there are two dates in the table, so you need to split these into two sets first.
Then, you may have months where only customers came and others where only customers left. So, you'll want a full outer join of the two sets.
For a cumulative sum (also called a running total), use SUM OVER.
with came as
(
select date_trunc('month', date_start) as month, count(*) as cnt
from customer
group by date_trunc('month', date_start)
)
, went as
(
select date_trunc('month', date_end) as month, count(*) as cnt
from customer
where date_end is not null
group by date_trunc('month', date_end)
)
select
month,
came.cnt as cust_new,
went.cnt as cust_gone,
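-- after the full outer join either count can be NULL for a month, so default both to 0
sum(coalesce(came.cnt, 0) - coalesce(went.cnt, 0)) over (order by month) as cust_active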
from came full outer join went using (month)
order by month;
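If months in which nobody joined or left should still appear in the output, one option is to drive the query from generate_series and count the customers active at the end of each month directly. This is a sketch under assumptions: the customer table has an id column (the question only says "ID"), and a customer whose date_end falls inside a month is no longer counted for that month, matching the running total above. The alias m is made up for illustration.

```sql
-- Sketch only: one row per calendar month, including months without events.
select m.month::date as month,
       count(c.id)   as cust_active
from generate_series(
         (select date_trunc('month', min(date_start))::timestamp from customer),
         date_trunc('month', now())::timestamp,
         interval '1 month'
     ) as m(month)
left join customer c
  on c.date_start < m.month + interval '1 month'     -- started by the end of the month
 and (c.date_end is null
      or c.date_end >= m.month + interval '1 month') -- and not ended during or before it
group by m.month
order by m.month;
```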

Count PostgreSQL with condition

I have one table for persons and another for visits. I need to count the visits for the current month, but only for people whose visit this month is their first visit ever.
Example:
PEOPLE:
id
---------
20
30
23
VISITS
id | date
-------------------------
20 | 09-20-2019
20 | 10-01-2019
23 | 10-09-2019
30 | 10-07-2019
I want a visit to be counted only if it is the person's first one. In this example the count should be 2, because the person with ID 20 already had a visit last month.
This is what I have so far on my query
SELECT * FROM visits
LEFT JOIN people ON visits.id = people.id
WHERE date_trunc('month', current_date) <= visits.visit_date AND visits.visit_date < date_trunc('month', current_date) + INTERVAL '1 month'
This just gives me the visits for the month. How can I filter so that only people without past visits are included?
You can simplify the condition by just comparing months. Add not exists to check whether the person had a visit before the current month.
select *
from visits
left join people on visits.id = people.id
where date_trunc('month', current_date) = date_trunc('month', visits.visit_date)
and not exists (
select from visits
where id = people.id
and date_trunc('month', current_date) > date_trunc('month', visits.visit_date)
)
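Since the question asks for a count rather than the rows themselves, another way to express the same idea is to compute each person's first visit and count those that fall in the current month. This is a sketch that assumes the visits columns are id and visit_date, as used in the queries above; first_visit and first_time_visitors are names introduced here.

```sql
-- Sketch only: count people whose earliest visit is in the current month.
select count(*) as first_time_visitors
from (
    select id, min(visit_date) as first_visit
    from visits
    group by id
) f
where f.first_visit >= date_trunc('month', current_date)
  and f.first_visit <  date_trunc('month', current_date) + interval '1 month';
```

With the sample data from the question, persons 23 and 30 would qualify and person 20 would not, provided the query runs in October 2019.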
From what I understand of your requirement, you just need the person ids that had a previous visit.
My proposal is to use dense_rank() based on date_trunc('month', visit_date):
select v.id from visits v
inner join
(select dense_rank() over (partition by date_trunc('month', visit_date) order by visit_date) as rn
, *
from visits) t1 on t1.id = v.id
group by v.id, t1.rn
having count(v.id) > 1

Generate count of group memberships x days after group creation

I have a table of group_id, member_id, and created_at.
I'm trying to track the growth in group membership across time. Since group_id's are created when the first member_id joins, the min(created_at) for a given group should give the created date. I think this broken code gets the point across for what I'm trying to do (at the month level in this case):
SELECT brand_id,
min(created_at) as created_date,
min(created_at) + INTERVAL '1 month' as end_date,
count(member_id)
FROM member_group
HAVING created_at < end_date
group by 1
It seems to me that you are looking for a query like this:
SELECT g.brand_id, x.created_date, x.end_date, count(g.member_id)
FROM member_group g
JOIN (
SELECT brand_id,
min(created_at) as created_date,
min(created_at) + INTERVAL '1' month as end_date
FROM member_group
GROUP BY brand_id
) x
ON ( g.brand_id = x.brand_id
AND g.created_at BETWEEN x.created_date AND x.end_date )
GROUP BY g.brand_id, x.created_date, x.end_date
Alternatively, count conditionally against each brand's first created_at:
select
brand_id,
count(created_at < brand_date + interval '1 month' or null) as total
from
member_group
inner join (
select brand_id, min(created_at) as brand_date
from member_group
group by 1
) s using (brand_id)
group by 1
order by 1
;
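To look at several "x days after creation" checkpoints in one pass, the conditional count can be repeated with the FILTER clause (available since PostgreSQL 9.4). This is a sketch: the 7-day and 30-day cut-offs and the output column names are arbitrary examples, not something the question specifies.

```sql
-- Sketch only: membership size at two example checkpoints after group creation.
select brand_id,
       count(*) filter (where created_at < brand_date + interval '7 days')  as members_day_7,
       count(*) filter (where created_at < brand_date + interval '30 days') as members_day_30
from member_group
inner join (
    -- the first member's created_at is taken as the group's creation date
    select brand_id, min(created_at) as brand_date
    from member_group
    group by brand_id
) s using (brand_id)
group by brand_id
order by brand_id;
```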