Correlated subquery in Postgres

I have the query below, which finds the stock details of certain products. The query works, but I don't think it is efficient or fast enough (DB: PostgreSQL version 11).
In this code I need to find the quantity of a product ordered (qty_last_7d_from_oos_date) during the period from 7 days before the out-of-stock date up to the out-of-stock date. I have to find the revenue for the same period as well.
What I did was write the same subquery twice, one returning the revenue and the other the quantity, which is not efficient. Does anyone have suggestions on how to rewrite this to make it more efficient?
WITH final as
(
SELECT product_id,product_name,item_sku,out_of_stock_at
,out_of_stock_at - INTERVAL '7 days' as previous_7_days
,back_in_stock_at
FROM oos_base
)
SELECT product_id,product_name,item_sku,out_of_stock_at,previous_7_days
,back_in_stock_at
,(SELECT coalesce(sum(i.qty_ordered), 0) AS qty_last_7d_from_oos_date
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
)
,( SELECT coalesce(sum(i.row_amount_minus_discount_order), 0) AS rev_last_7d_from_oos_date
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
)
FROM final f
In the above code, the CTE "final" gives you two dates, "out_of_stock_at" and "previous_7_days". I want to find the quantity and revenue of a product between these two dates, i.e. between "previous_7_days" and "out_of_stock_at".
The query below returns both the quantity and the revenue of a product, but it needs the period between "previous_7_days" and "out_of_stock_at" from the CTE above.
At the moment I use this code twice, once to obtain the revenue and once to obtain the quantity.
SELECT coalesce(sum(i.qty_ordered), 0) AS qty ,
coalesce(sum(i.row_amount_minus_discount_order), 0)
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
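One possible direction, sketched under heavy assumptions: join the order rows to the out-of-stock rows once and compute both aggregates in the same pass, instead of running two correlated subqueries. The sketch below uses an in-memory SQLite database through Python so it can be run without a server. The tables are simplified stand-ins (no schema prefix, no products/store_id join, no COALESCE to current_date), the sample rows are invented, and SQLite's date() replaces the Postgres interval arithmetic. In Postgres itself, a LEFT JOIN LATERAL around the combined subquery would be another way to get the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified, hypothetical versions of oos_base, ol.orders and ol.items
    CREATE TABLE oos_base (product_id INTEGER, out_of_stock_at TEXT);
    CREATE TABLE orders (order_id INTEGER, order_state_2 TEXT, created_at_order TEXT);
    CREATE TABLE items (order_id INTEGER, product_id INTEGER,
                        qty_ordered INTEGER, row_amount_minus_discount_order REAL);

    INSERT INTO oos_base VALUES (1, '2021-03-10 08:00:00'), (2, '2021-03-10 09:00:00');
    INSERT INTO orders VALUES
        (100, 'complete',   '2021-03-05 12:00:00'),
        (101, 'processing', '2021-03-09 10:00:00'),
        (102, 'canceled',   '2021-03-08 10:00:00'),
        (103, 'complete',   '2021-02-20 10:00:00');
    INSERT INTO items VALUES
        (100, 1, 2, 20.0), (101, 1, 3, 30.0), (102, 1, 5, 50.0), (103, 1, 7, 70.0);
    """
)

# One join, one GROUP BY: quantity and revenue come out of the same scan.
stock_stats = conn.execute(
    """
    SELECT f.product_id,
           COALESCE(SUM(s.qty_ordered), 0) AS qty_last_7d_from_oos_date,
           COALESCE(SUM(s.row_amount_minus_discount_order), 0) AS rev_last_7d_from_oos_date
    FROM oos_base f
    LEFT JOIN (
        SELECT i.product_id, i.qty_ordered, i.row_amount_minus_discount_order,
               date(o.created_at_order) AS order_date
        FROM items i
        JOIN orders o ON o.order_id = i.order_id
        WHERE o.order_state_2 IN ('complete', 'processing')
    ) s ON s.product_id = f.product_id
       AND s.order_date BETWEEN date(f.out_of_stock_at, '-7 days')
                            AND date(f.out_of_stock_at)
    GROUP BY f.product_id, f.out_of_stock_at
    ORDER BY f.product_id
    """
).fetchall()

for row in stock_stats:
    print(row)
```

Product 2 has no matching orders, so the LEFT JOIN keeps its row and COALESCE turns the missing sums into 0, which mirrors what the two original subqueries return.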

Related

How to calculate values from previous month to current month

I want to calculate a running total from the table below. For each month, the number of new participants should be added to the total of the previous months.
month     no_participant
2021-01   10
2021-02   20
2021-03   5
2021-04   17
Something like this as output:
month     no_participant   count
2021-01   10               10
2021-02   20               30
2021-03   5                35
2021-04   17               52
Here's my query. I am using Postgres. Thanks for your help.
SELECT (TO_CHAR(CSD.SCHEDULED_START_DATETIME, 'YYYY-MM'))AS MONTH,
COUNT(DISTINCT PARTICIPANT_ID) AS PARTICIPANT
FROM TSUP.COURSE_SCHEDULE_DETAIL AS CSD
INNER JOIN TSUP.COURSE_PARTICIPANT AS CP
ON CSD.COURSE_SCHEDULE_ID = CP.COURSE_SCHEDULE_ID
INNER JOIN(
SELECT
MIN(COALESCE(CSD.RESCHEDULED_START_DATETIME, CSD.SCHEDULED_START_DATETIME)) AS SCHEDULED_START_DATETIME,
MAX(COALESCE(CSD.RESCHEDULED_END_DATETIME, CSD.SCHEDULED_END_DATETIME)) AS SCHEDULED_END_DATETIME,
COUNT(CSD.SCHEDULED_START_DATETIME) AS "COUNT"
FROM TSUP.COURSE_SCHEDULE_DETAIL AS CSD
INNER JOIN (
SELECT CP.PARTICIPANT_ID AS "PARTICIPANT",
MIN(COALESCE(CSD.RESCHEDULED_START_DATETIME, CSD.SCHEDULED_START_DATETIME)) AS SCHEDULED_START_DATETIME,
MAX(COALESCE(CSD.RESCHEDULED_END_DATETIME, CSD.SCHEDULED_END_DATETIME)) AS SCHEDULED_END_DATETIME
FROM TSUP.COURSE_PARTICIPANT AS CP
INNER JOIN TSUP.COURSE_SCHEDULE_DETAIL AS CSD
ON CP.COURSE_SCHEDULE_ID = CSD.COURSE_SCHEDULE_ID
INNER JOIN TSUP.COURSE_SCHEDULE AS CS
ON CSD.ID = CS.ID
INNER JOIN TSUP.COURSE AS C
ON CS.COURSE_ID = C.ID
INNER JOIN TSUP.COURSE_CATEGORY AS CC
ON C.COURSE_CATEGORY_ID = CC.ID
INNER JOIN TSUP.EMPLOYEE AS E
ON CP.PARTICIPANT_ID = E.ID
INNER JOIN TSUP.MEMBER_ROLE AS MR
ON E.MEMBER_ROLE_ID = MR.ID
WHERE C.MANDATORY = 'Yes'
AND MR.ROLE_TYPE = 'Dev'
AND CC.CATEGORY = 'JJ'
GROUP BY CP.PARTICIPANT_ID)
TEMP ON CSD.SCHEDULED_START_DATETIME = TEMP.SCHEDULED_START_DATETIME
GROUP BY CSD.RESCHEDULED_START_DATETIME, CSD.SCHEDULED_START_DATETIME,
CSD.RESCHEDULED_END_DATETIME, CSD.SCHEDULED_END_DATETIME
)
TEMP ON CSD.SCHEDULED_START_DATETIME = TEMP.SCHEDULED_START_DATETIME
GROUP BY MONTH
The query you provided is verbose, and also does not seem to exactly line up with the sample data. I will give the following query based on the sample data shown:
SELECT month, no_participant, SUM(no_participant) OVER (ORDER BY month) AS count
FROM yourTable
ORDER BY month;
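The above logic uses SUM() as an analytic (window) function: with OVER (ORDER BY month), each row receives the sum of no_participant for its own month and all earlier months.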
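As a quick check of the running-total logic against the sample data, here is a minimal sketch that runs the same window function in an in-memory SQLite database through Python. SQLite (3.25 or later) is only a stand-in for Postgres here, and the output column is named running_total instead of count purely for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (month TEXT, no_participant INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [("2021-01", 10), ("2021-02", 20), ("2021-03", 5), ("2021-04", 17)],
)

# SUM(...) OVER (ORDER BY month) accumulates the value row by row in month order.
running_totals = conn.execute(
    """
    SELECT month,
           no_participant,
           SUM(no_participant) OVER (ORDER BY month) AS running_total
    FROM yourTable
    ORDER BY month
    """
).fetchall()

for row in running_totals:
    print(row)
```

The last column reproduces the expected output from the question: 10, 30, 35, 52.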

How can I make the denominator a constant for each of the numbers in the same row in SQL?

I am trying to create a table with the average amount of sales per cohort of users who signed up for an account in a given month. However, I can only figure out how to divide by the number of people who made a purchase in that specific month, which is lower than the total size of the cohort. How do I change the query below so that each Avg_Successful_Transacted amount is divided by the cohort 0 user count for each month?
Thank you.
select sum (t.amount_in_dollars)/ count (distinct u.id) as Avg_Successful_Transacted, (datediff(month,[u.created:month],[t.createdon:month])) as Cohort, [u.created:month] as Months,
count (distinct u.id) as Users
from [transaction_cache as t]
left join [user_cache as u] on t.owner = u.id
where t.type = 'savings' and t.status = 'successful' and [u.created:year] > ['2017-01-01':date:year]
group by cohort, months
order by Cohort, Months
You will need to break out the cohort sizing into its own subquery or CTE in order to calculate the total number of distinct users who were created during the month which matches the cohort's basis month.
I approached this by bucketing users by the month they were created using the date_trunc('month', <date>) function, but you may wish to approach it differently based on the specific business logic that generates your cohorts.
I don't work with Periscope, so the example query below is structured for pure Redshift, but hopefully it is easy to translate the syntax into Periscope's expected format:
WITH cohort_sizes AS (
-- Total users created in each month: the constant denominator per cohort
SELECT date_trunc('month', created)::DATE AS cohort_month
, COUNT(DISTINCT id) AS cohort_size
FROM user_cache
GROUP BY 1
),
cohort_transactions AS (
-- Successful savings transactions, tagged with the owner's cohort month.
-- Columns are qualified so that names shared by both tables are not ambiguous.
SELECT date_trunc('month', u.created)::DATE AS cohort_month
, t.createdon
, t.owner
, t.amount_in_dollars
FROM transaction_cache t
JOIN user_cache u ON t.owner = u.id
WHERE t.type = 'savings'
AND t.status = 'successful'
AND u.created > '2017-01-01'
)
SELECT SUM(t.amount_in_dollars) / s.cohort_size AS Avg_Successful_Transacted
, datediff(MONTH, t.cohort_month, t.createdon) AS Cohort
, t.cohort_month AS Months
, COUNT(DISTINCT t.owner) AS Users
FROM cohort_transactions t
JOIN cohort_sizes s ON t.cohort_month = s.cohort_month
GROUP BY s.cohort_size, Cohort, Months
ORDER BY Cohort, Months
;
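To see the effect of the constant denominator on a tiny data set, here is a hedged sketch in Python using an in-memory SQLite database rather than Redshift. The rows are invented, strftime('%Y-%m', ...) stands in for date_trunc, and the result is grouped by transaction month instead of a datediff offset to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_cache (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE transaction_cache (owner INTEGER, createdon TEXT,
                                    amount_in_dollars REAL, type TEXT, status TEXT);

    -- Three users sign up in February, one in March (made-up data)
    INSERT INTO user_cache VALUES
        (1, '2017-02-03'), (2, '2017-02-10'), (3, '2017-02-20'), (4, '2017-03-05');
    INSERT INTO transaction_cache VALUES
        (1, '2017-02-15', 30.0, 'savings', 'successful'),
        (2, '2017-03-01', 60.0, 'savings', 'successful'),
        (1, '2017-03-10', 30.0, 'savings', 'successful'),
        (4, '2017-03-20', 50.0, 'savings', 'successful'),
        (3, '2017-02-25', 99.0, 'savings', 'failed');
    """
)

cohort_rows = conn.execute(
    """
    WITH cohort_sizes AS (
        -- Denominator: every user created in the month, buyers or not
        SELECT strftime('%Y-%m', created) AS cohort_month,
               COUNT(DISTINCT id) AS cohort_size
        FROM user_cache
        GROUP BY strftime('%Y-%m', created)
    ),
    cohort_transactions AS (
        SELECT strftime('%Y-%m', u.created) AS cohort_month,
               strftime('%Y-%m', t.createdon) AS txn_month,
               t.owner,
               t.amount_in_dollars
        FROM transaction_cache t
        JOIN user_cache u ON t.owner = u.id
        WHERE t.type = 'savings' AND t.status = 'successful'
    )
    SELECT ct.cohort_month,
           ct.txn_month,
           SUM(ct.amount_in_dollars) / cs.cohort_size AS avg_per_cohort_member,
           COUNT(DISTINCT ct.owner) AS buyers
    FROM cohort_transactions ct
    JOIN cohort_sizes cs ON cs.cohort_month = ct.cohort_month
    GROUP BY ct.cohort_month, ct.txn_month, cs.cohort_size
    ORDER BY ct.cohort_month, ct.txn_month
    """
).fetchall()

for row in cohort_rows:
    print(row)
```

User 3 never completes a successful transaction but still counts toward the February cohort size of 3, so the February sums are divided by 3 in every month, not by the number of buyers in that month.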

How to filter database table by a multiple join records from another one table but different types?

I have a products table and a corresponding ratings table, which contains a foreign key product_id, a grade (int) and a type, which is an enum accepting the values robustness and price_quality_ratio.
The grades accept values from 1 to 10. So, for example, what would the query look like if I wanted to filter the products where the minimum grade for robustness is 7 and the minimum grade for price_quality_ratio is 8?
You can join twice, once per rating. The inner joins eliminate the products that fail any rating criteria:
select p.*
from products p
inner join rating r1
on r1.product_id = p.product_id
and r1.type = 'robustness'
and r1.rating >= 7
inner join rating r2
on r2.product_id = p.product_id
and r2.type = 'price_quality_ratio'
and r2.rating >= 8
Another option is to use conditional aggregation. This requires only one join, then a group by; the rating criteria are checked in the having clause.
select p.product_id, p.product_name
from products p
inner join rating r
on r.product_id = p.product_id
and r.type in ('robustness', 'price_quality_ratio')
group by p.product_id, p.product_name
having
min(case when r.type = 'robustness' then r.rating end) >= 7
and min(case when r.type = 'price_quality_ratio' then r.rating end) >= 8
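The conditional-aggregation variant can be tried out on a few invented rows. The sketch below runs it in an in-memory SQLite database through Python as a stand-in for the real database; the table and column names follow the query above (rating table, rating column), and the product names and grades are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE rating (product_id INTEGER, type TEXT, rating INTEGER);

    INSERT INTO products VALUES (1, 'Anvil'), (2, 'Kettle'), (3, 'Lamp'), (4, 'Desk');
    INSERT INTO rating VALUES
        (1, 'robustness', 9), (1, 'robustness', 7), (1, 'price_quality_ratio', 8),
        (2, 'robustness', 7), (2, 'price_quality_ratio', 6),
        (3, 'robustness', 5), (3, 'price_quality_ratio', 10),
        (4, 'robustness', 8), (4, 'price_quality_ratio', 9);
    """
)

# One join, then the per-type minimum is checked in HAVING.
# CASE without ELSE yields NULL for the other type, and MIN ignores NULLs.
matching_products = conn.execute(
    """
    SELECT p.product_id, p.product_name
    FROM products p
    INNER JOIN rating r
            ON r.product_id = p.product_id
           AND r.type IN ('robustness', 'price_quality_ratio')
    GROUP BY p.product_id, p.product_name
    HAVING MIN(CASE WHEN r.type = 'robustness' THEN r.rating END) >= 7
       AND MIN(CASE WHEN r.type = 'price_quality_ratio' THEN r.rating END) >= 8
    ORDER BY p.product_id
    """
).fetchall()

print(matching_products)
```

One difference worth keeping in mind if a product can have several ratings of the same type: the MIN-based check requires every rating of that type to meet the threshold, while the double-join version only requires at least one rating to meet it (and can return the product more than once).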
The JOIN proposed by @GMB would've been my first suggestion as well. If that gets too complicated with having to maintain too many rX.ratings, you can also use a nested query:
SELECT *
FROM (
SELECT p.*, r1.rating as robustness, r2.rating as price_quality_ratio
FROM products p
JOIN rating r1 ON (r1.product_id = p.product_id AND r1.type = 'robustness')
JOIN rating r2 ON (r2.product_id = p.product_id AND r2.type = 'price_quality_ratio')
) AS tmp
WHERE robustness >= 7
AND price_quality_ratio >= 8
-- ORDER BY price_quality_ratio DESC, robustness DESC -- etc

How do you organize this query by week

Here is my Query so far:
select one.week, total, comeback, round(comeback)::Numeric / total::numeric * 100 as comeback_percent
FROM
(
SELECT count(username) as total, week
FROM
(
select row_number () over (partition by u.id order by creation_date) as row, username, date_trunc ('month', creation_date)::date AS week
FROM users u
left join entries e on u.id = e.user_id
where ((entry_type = 0 and distance >= 1) or (entry_type = 1 and seconds_running >= 600))
) x
where row = 1
group by week
order by week asc
) one
join
(
SELECT count(username) as comeback, week
FROM
(
select row_number () over (partition by u.id order by creation_date) as row, username, runs_completed, date_trunc ('month', creation_date)::date AS week
FROM entries e
left join users u on e.user_id = u.id
where ((entry_type = 0 and distance >= 1) or (entry_type = 1 and seconds_running >= 600))
) y
where runs_completed > 1 and row = 1
group by week
order by week asc
) two
on one.week = two.week
What I want to accomplish is to return a line graph of users who have completed one run with us, grouped by week, together with the percentage of each week's users who have completed a second run EVER, not just within that week. Our funnel has improved by a factor of 5 since we started, yet the line graph that is produced does not show similar results.
I could be joining them together incorrectly, or there may be a cleaner way to use CTEs or window functions to perform this query. I am open to any and all suggestions. Thanks!
If you need tables or further information, let me know. I'm happy to provide anything that may be needed.

UNION and BETWEEN dates

I have two tables, people and shifts, and for every person I want to get all shifts for a week.
The problem is that there doesn't have to be a shift for every date.
If there is no shift, I want to get a dynamic template row with the date on which no shift is available.
SELECT p.id, p.name, s.date_of_shift
FROM people AS p
LEFT JOIN LATERAL (
SELECT sh.id, sh.date_of_shift, sh.person_id
FROM shifts as sh
) AS s ON p.id = s.person_id
WHERE p.id = 2 AND s.date_of_shift BETWEEN '2016-03-21' AND '2016-03-25'
UNION ALL
SELECT null, null, '2016-03-21'
WHERE NOT EXISTS (
SELECT 1
FROM people AS p
LEFT JOIN LATERAL (
SELECT sh.id, sh.date_of_shift, sh.person_id
FROM shifts AS sh
) AS s ON p.id = s.person_id
WHERE p.id = 88000 AND s.date_of_shift BETWEEN '2016-03-21' AND '2016-03-25');
This is the query I managed to create. The problem is that I always get the same date, but I want each date in the BETWEEN range on which there is no shift.
In a case like this where you want all dates in a range, even when there is possibly no data for a specific date, you should use the generate_series() function and LEFT JOIN your data to it:
SELECT DISTINCT p.id, p.name, date_of_shift
FROM generate_series('2016-03-21'::date, '2016-03-25', interval '1 day') AS d(date_of_shift)
LEFT JOIN shifts sh USING (date_of_shift)
LEFT JOIN (SELECT id, name FROM people WHERE id = 2) p ON p.id = sh.person_id;
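The idea of driving the query from a complete list of dates and left-joining the shifts onto it can be sketched on sample data as follows. This is an illustration in Python with an in-memory SQLite database, not the Postgres query itself: a recursive CTE stands in for generate_series(), which a default SQLite build may not provide, and the person filter is placed in the joins so that each day appears exactly once for person 2. All rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shifts (id INTEGER PRIMARY KEY, date_of_shift TEXT, person_id INTEGER);

    INSERT INTO people VALUES (2, 'Ann'), (3, 'Bob');
    INSERT INTO shifts VALUES
        (10, '2016-03-21', 2),
        (11, '2016-03-23', 2),
        (12, '2016-03-22', 3);
    """
)

week_rows = conn.execute(
    """
    WITH RECURSIVE days(date_of_shift) AS (
        -- Stand-in for generate_series: one row per day in the range
        SELECT '2016-03-21'
        UNION ALL
        SELECT date(date_of_shift, '+1 day') FROM days
        WHERE date_of_shift < '2016-03-25'
    )
    SELECT d.date_of_shift, p.id, p.name, sh.id AS shift_id
    FROM days d
    CROSS JOIN people p
    LEFT JOIN shifts sh
           ON sh.date_of_shift = d.date_of_shift
          AND sh.person_id = p.id
    WHERE p.id = 2
    ORDER BY d.date_of_shift
    """
).fetchall()

for row in week_rows:
    print(row)
```

Days without a shift for that person come back with shift_id set to NULL (None in Python), which can serve as the "template" row the question asks for.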