How to get the number of consecutive days from the current date using Postgres?

I want to get the number of consecutive days from the current date using Postgres SQL.
[Image not available: the screenshot highlighted the rows that should be counted as consecutive days.]
Below is the SQL query I have written, but it does not return the expected result:
with grouped_dates as (
select user_id, created_at::timestamp::date,
(created_at::timestamp::date - (row_number() over (partition by user_id order by created_at::timestamp::date) || ' days')::interval)::date as grouping_date
from watch_history
)
select * , dense_rank() over (partition by grouping_date order by created_at::timestamp::date) as in_streak
from grouped_dates where user_id = 702
order by created_at::timestamp::date
Can anyone please help me resolve this issue?
If there is a way to apply DISTINCT to the created_at field in the query below, that would solve my problem.
WITH list AS
(
SELECT user_id,
(created_at::timestamp::date - (row_number() over (partition by user_id order by created_at::timestamp::date) || ' days')::interval)::date as next_day
FROM watch_history
)
SELECT user_id, count(*) AS number_of_consecutive_days
FROM list
WHERE next_day IS NOT NULL
GROUP BY user_id
Does anyone have an idea how to apply DISTINCT to created_at in the query above?

To get the "number of consecutive days" for the same user_id:
WITH list AS
(
SELECT user_id
, array_agg(created_at) OVER (PARTITION BY user_id ORDER BY created_at RANGE BETWEEN CURRENT ROW AND '1 day' FOLLOWING) AS consecutive_days
FROM watch_history
)
SELECT user_id, count(DISTINCT d.day) AS number_of_consecutive_days
FROM list
CROSS JOIN LATERAL unnest(consecutive_days) AS d(day)
WHERE array_length(consecutive_days, 1) > 1
GROUP BY user_id
To get the list of "consecutive days" for the same user_id:
WITH list AS
(
SELECT user_id
, array_agg(created_at) OVER (PARTITION BY user_id ORDER BY created_at RANGE BETWEEN CURRENT ROW AND '1 day' FOLLOWING) AS consecutive_days
FROM watch_history
)
SELECT user_id
, array_agg(DISTINCT d.day ORDER BY d.day) AS list_of_consecutive_days
FROM list
CROSS JOIN LATERAL unnest(consecutive_days) AS d(day)
WHERE array_length(consecutive_days, 1) > 1
GROUP BY user_id
full example & result in dbfiddle
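As a cross-check on the logic, here is a minimal Python sketch of what "consecutive days from the current date" means once duplicate dates are ignored. It is not the query above, just the same idea outside the database; the function name `current_streak` and the sample dates are made up for illustration.

```python
from datetime import date, timedelta

def current_streak(watch_dates, today):
    """Count consecutive days with activity, walking back from `today`."""
    days = set(watch_dates)  # the set drops duplicate dates (the DISTINCT step)
    streak = 0
    day = today
    while day in days:
        streak += 1
        day -= timedelta(days=1)
    return streak

# Hypothetical sample: several rows on the same day, and a gap on 2022-03-07
history = [
    date(2022, 3, 10), date(2022, 3, 10),
    date(2022, 3, 9),
    date(2022, 3, 8), date(2022, 3, 8),
    date(2022, 3, 6),
]
print(current_streak(history, today=date(2022, 3, 10)))  # 3
```

The streak stops at the first missing day, so 2022-03-06 is not counted even though it has a row.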

Related

simpler query for counting total row in the column

I'm not sure whether my query below is right; I am trying to reduce it to a single query in Oracle SQL.
select distinct (master_id), last_name
from (
  select q2.*, max(count_a) over (partition by master_id) count_b
  from (
    select q1.*, count(*) over (partition by master_id order by purchased_date desc) count_a
    from profile q1
  ) q2
)
where count_b > 2
I am trying to reduce the time it takes to fetch the data by cutting down on subqueries.
For example, the query above has two subqueries:
max (count_a) over (partition by master_id) count_b
count (*) over (partition by master_id order by purchased_date desc ) count_a
So I experimented until I arrived at this expression:
max (count (*)) over (partition by master_id) count
The full SQL query:
select * from profile a
join( select * from (
select master_id, max (count(*)) over (partition by
master_id) count from profile) where count >2) b
ON a. master_id = b. master_id
Thank you in advance for your help

Reset increment in PostgreSQL

I just started learning Postgres, and I'm trying to make an aggregation table that has the columns:
user_id
booking_sequence
booking_created_time
booking_paid_time
booking_price_amount
total_spent
All columns are provided, except for the booking_sequence column. I need to make a query that shows the first five flights of each user that has at least x purchases and has spent more than a certain amount of money, then sort it by the amount of money spent by the user, and then sort it by the booking sequence column.
I've tried :
select user_id,
row_number() over(partition by user_id order by user_id) as booking_sequence,
booking_created_time as booking_created_date,
booking_price_amount,
sum(booking_price_amount) as total_booking_price_amount
from fact_flight_sales
group by user_id, booking_created_time, booking_price_amount
having count(user_id) > 5
and total_booking_price_amount > 1000
order by total_booking_price_amount;
I got 0 when I added count(user_id) > 5, and total_booking_price_amount is not found when I add the second condition in the HAVING clause.
Edit:
I managed to make the code function correctly, for those who are curious:
select x.user_id, row_number() over(partition by x.user_id)
as booking_sequence, x.booking_created_time::date as booking_created_date, x.booking_price_amount,
sum(y.booking_price_amount) as total_booking_price_amount from
(
select user_id, booking_created_time, booking_price_amount from fact_flight_sales
group by user_id, booking_created_time, booking_price_amount
) as x
join
(
select user_id, booking_price_amount
from fact_flight_sales group by user_id, booking_price_amount
) as y
on x.user_id = y.user_id
group by x.user_id, x.booking_created_time, x.booking_price_amount
having count(x.user_id) >= 1 and sum(y.booking_price_amount) >250000
order by total_booking_price_amount desc, booking_sequence asc;
Big thanks to Laurenz for the help!
About count(user_id) > 5:
HAVING is evaluated before window functions, so rows excluded by the HAVING clause are not available when the window function is calculated.
About total_booking_price_amount in HAVING:
You cannot use aliases from the SELECT list in the HAVING clause. You will have to repeat the expression (or use a subquery).
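The order of evaluation can be mimicked in plain Python: aggregate per user, filter on the aggregates (the HAVING step), and only then number the surviving rows (the window step). This is a sketch under assumptions, not the query from the question; the function name `first_bookings`, its parameters, and the sample rows are hypothetical.

```python
from collections import defaultdict

def first_bookings(rows, min_bookings, min_spent, limit=5):
    """rows: iterable of (user_id, created_time, price) tuples."""
    by_user = defaultdict(list)
    for user_id, created, price in rows:
        by_user[user_id].append((created, price))

    result = []
    for user_id, bookings in by_user.items():
        total = sum(price for _, price in bookings)
        # Filter on the per-user aggregates first (the GROUP BY / HAVING step)
        if len(bookings) < min_bookings or total <= min_spent:
            continue
        # Only then number the surviving rows (the window-function step)
        for seq, (created, price) in enumerate(sorted(bookings)[:limit], start=1):
            result.append((user_id, seq, created, price, total))

    # Highest total first, then user, then booking sequence
    result.sort(key=lambda r: (-r[4], r[0], r[1]))
    return result

# Hypothetical sample data: (user_id, booking_created_time, booking_price_amount)
sample = [
    (1, "2020-01-01", 100), (1, "2020-01-03", 300), (1, "2020-01-02", 200),
    (2, "2020-01-01", 50),
    (3, "2020-02-01", 1000), (3, "2020-02-02", 500),
]
for row in first_bookings(sample, min_bookings=2, min_spent=400):
    print(row)
```

Because the per-user count and total are computed before any row is dropped, the filter sees every booking of a user, which is what the grouped query with a per-row GROUP BY could not do.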

Is there a SQL code for cumulative count of SaaS customer over months?

I have a table with:
ID (id client), date_start (subscription of SaaS), date_end (could be a date value or be NULL).
So I need a cumulative count of active clients month by month.
Any idea how to write that in Postgres and achieve this result?
I started from this, but I don't know how to proceed:
select
date_trunc('month', c.date_start)::date,
count(*)
from customer
Please check the following solution:
select
subscribed_date,
subscribed_customers,
unsubscribed_customers,
coalesce(subscribed_customers, 0) - coalesce(unsubscribed_customers, 0) cumulative
from (
select distinct
date_trunc('month', c.date_start)::date subscribed_date,
sum(1) over (order by date_trunc('month', c.date_start)) subscribed_customers
from customer c
order by subscribed_date
) subscribed
left join (
select distinct
date_trunc('month', c.date_end)::date unsubscribed_date,
sum(1) over (order by date_trunc('month', c.date_end)) unsubscribed_customers
from customer c
where date_end is not null
order by unsubscribed_date
) unsubscribed on subscribed.subscribed_date = unsubscribed.unsubscribed_date;
You have a table of customers with a start date and sometimes an end date. As you want to group by date, but there are two dates in the table, you need to split these first.
Then, you may have months where only customers came and others where only customers left. So, you'll want a full outer join of the two sets.
For a cumulative sum (also called a running total), use SUM OVER.
with came as
(
select date_trunc('month', date_start) as month, count(*) as cnt
from customer
group by date_trunc('month', date_start)
)
, went as
(
select date_trunc('month', date_end) as month, count(*) as cnt
from customer
where date_end is not null
group by date_trunc('month', date_end)
)
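select
month,
coalesce(came.cnt, 0) as cust_new,
coalesce(went.cnt, 0) as cust_gone,
-- coalesce is needed: after the full outer join, a month present on only one side has a NULL count
sum(coalesce(came.cnt, 0) - coalesce(went.cnt, 0)) over (order by month) as cust_active
from came full outer join went using (month)
order by month;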
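The same three steps (split the two dates, combine the months from both sides, keep a running total) can be sketched in Python to check the expected numbers by hand. The function name `active_by_month` and the sample customers are assumptions made for this illustration only.

```python
from collections import Counter
from datetime import date

def active_by_month(customers):
    """customers: iterable of (date_start, date_end_or_None) pairs."""
    came, went = Counter(), Counter()
    for start, end in customers:
        came[(start.year, start.month)] += 1
        if end is not None:
            went[(end.year, end.month)] += 1

    active = 0
    rows = []
    # The union of both key sets plays the role of the full outer join
    for month in sorted(came.keys() | went.keys()):
        # A Counter returns 0 for a missing month, like coalesce(cnt, 0)
        active += came[month] - went[month]
        rows.append((month, came[month], went[month], active))
    return rows

# Hypothetical sample: (date_start, date_end)
customers = [
    (date(2021, 1, 5), None),
    (date(2021, 1, 20), date(2021, 3, 2)),
    (date(2021, 2, 1), date(2021, 2, 28)),
    (date(2021, 4, 10), None),
]
for row in active_by_month(customers):
    print(row)
```

March has no new customers and April has no departures, which are exactly the months where a missing count must be treated as 0 rather than skipped.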

Join on generate_series and count

I'm trying to find the # users who did action A or action B on a monthly basis.
Table: User
- id
- "creationDate"
Table: action_A
- user_id (= user.id)
- "creationDate"
Table: action_B
- user_id (= user.id)
- "creationDate"
The general idea of what I was trying to do was that I'd find the list of users who did action A in Month X and the list of users who did action B in Month X, then count how many ids are there for every month based on a generate_series of monthly dates.
I tried the following, however, the query times out when running and I'm not sure if there's any way to optimize it (or if it is even correct).
SELECT monthseries."Month", count(*)
FROM
(SELECT to_char(DAY::date, 'YYYY-MM') AS "Month"
FROM generate_series('2014-01-01'::date, CURRENT_DATE, '1 month') DAY) monthseries
LEFT JOIN
(SELECT to_char("creationDate", 'YYYY-MM') AS "Month",
id
FROM action_A) did_action_A ON monthseries."Month" = did_action_A."Month"
LEFT JOIN
(SELECT to_char("creationDate", 'YYYY-MM') AS "Month",
id
FROM action_B) did_action_B ON monthseries."Month" = did_action_B."Month"
GROUP BY monthseries."Month"
Any comments/ help would be immensely helpful!
If you want to count distinct users:
select to_char(month, 'YYYY-MM') as "Month", count(s.user_id) -- count(*) would report 1 for months with no actions
from
generate_series(
'2014-01-01'::date, current_date, '1 month'
) monthseries (month)
left join (
(
select distinct date_trunc('month', "creationDate") as month, user_id
from action_a
) a
full outer join (
select distinct date_trunc('month', "creationDate") as month, user_id
from action_b
) b using (month, user_id)
) s using (month)
group by 1
order by 1
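To sanity-check the result on a small data set, the idea can be sketched in Python: build the month series, collect the distinct users per month from both action tables, and report 0 for months without any action. The helper names `month_series` and `monthly_active_users` and the sample rows are hypothetical.

```python
from datetime import date

def month_series(start, end):
    """Yield (year, month) pairs from start to end inclusive, like generate_series."""
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        yield (y, m)
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)

def monthly_active_users(action_a, action_b, start, end):
    """action_a / action_b: iterables of (user_id, creation_date) pairs."""
    users = {}
    for user_id, created in list(action_a) + list(action_b):
        # A set per month keeps each user once, whichever action they did
        users.setdefault((created.year, created.month), set()).add(user_id)
    # Months without any action still appear, with a count of 0
    return [(month, len(users.get(month, ()))) for month in month_series(start, end)]

# Hypothetical sample rows: (user_id, creationDate)
a_rows = [(1, date(2014, 1, 3)), (1, date(2014, 1, 9)), (2, date(2014, 3, 1))]
b_rows = [(1, date(2014, 1, 15)), (3, date(2014, 1, 20))]
print(monthly_active_users(a_rows, b_rows, date(2014, 1, 1), date(2014, 3, 31)))
```

User 1 did both actions several times in January but is counted once, which is the behaviour the DISTINCT subqueries and the full outer join are meant to produce.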

selecting only two employees from every department

Can you let me know how to select only two employees from every department? The table has deptname, ssn, name . I am doing a sampling and I need only two ssns for every department name. Can someone help?
You can accomplish this with an "OLAP expression" row_number()
with e as
( select deptname, ssn, empname,
row_number() over (partition by deptname order by empname) as pick
from employees
)
select deptname, ssn, empname
from e
where pick < 3
order by deptname, ssn
This example will give you the two employees whose names sort first within each department, because that is the ordering specified in the ORDER BY of the row_number() window.
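For comparison, the partition-then-number idea looks like this in Python. It is only a sketch of the technique, assuming rows shaped as (deptname, ssn, empname); the function name `pick_per_department` and the sample employees are invented for the example.

```python
from itertools import groupby
from operator import itemgetter

def pick_per_department(employees, n=2):
    """employees: iterable of (deptname, ssn, empname) tuples."""
    # Sort by department, then by name: PARTITION BY deptname ORDER BY empname
    ordered = sorted(employees, key=itemgetter(0, 2))
    picked = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        # Keeping the first n rows of each group matches "pick < 3" for n = 2
        picked.extend(list(group)[:n])
    return sorted(picked, key=itemgetter(0, 1))  # ORDER BY deptname, ssn

# Hypothetical sample rows
staff = [
    ("IT", "333", "Carol"), ("IT", "111", "Alice"), ("IT", "222", "Bob"),
    ("HR", "555", "Eve"), ("HR", "444", "Dan"),
]
for row in pick_per_department(staff):
    print(row)
```

Changing the sort key inside the function changes which two employees are picked, in the same way that changing the ORDER BY of the window does.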
Try this:
select *
from t t1
where (
select count(*)
from t t2
where
t2.deptname = t1.deptname
and
t2.ssn <= t1.ssn) <= 2
order by deptname, ssn,name;
The above returns the two rows with the smallest ssn in each department.
If you want the two largest instead, change the condition to t2.ssn >= t1.ssn
select * from
( select rank() over (partition by deptname order by empname) as count , *
from employees
)
where count<=2
order by deptname, ssn,name;