How to filter a database table by multiple joined records of different types from another table? - postgresql

I have a products table and a corresponding ratings table, which contains a foreign key product_id, a grade (int), and a type, which is an enum accepting the values robustness and price_quality_ratio.
Grades range from 1 to 10. For example, what would the query look like if I wanted to filter for products where the minimum grade for robustness is 7 and the minimum grade for price_quality_ratio is 8?

You can join twice, once per rating type. The inner joins eliminate the products that fail either rating criterion:
select p.*
from products p
inner join rating r1
on r1.product_id = p.product_id
and r1.type = 'robustness'
and r1.rating >= 7
inner join rating r2
on r2.product_id = p.product_id
and r2.type = 'price_quality_ratio'
and r2.rating >= 8
Another option is to use conditional aggregation. This requires only one join, then a group by; the rating criteria are checked in the having clause:
select p.product_id, p.product_name
from products p
inner join rating r
on r.product_id = p.product_id
and r.type in ('robustness', 'price_quality_ratio')
group by p.product_id, p.product_name
having
min(case when r.type = 'robustness' then r.rating end) >= 7
and min(case when r.type = 'price_quality_ratio' then r.rating end) >= 8
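To compare the two queries above side by side, here is a minimal sketch that runs both against a tiny data set. It assumes an in-memory SQLite database as a stand-in for PostgreSQL, and the sample rows and product names are invented for illustration; the table and column names follow the answer (rating, rating.rating).

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE rating (product_id INTEGER, type TEXT, rating INTEGER);
    INSERT INTO products VALUES (1, 'anvil'), (2, 'kettle'), (3, 'umbrella');
    INSERT INTO rating VALUES
        (1, 'robustness', 9), (1, 'price_quality_ratio', 8),
        (2, 'robustness', 7), (2, 'price_quality_ratio', 6),
        (3, 'robustness', 5), (3, 'price_quality_ratio', 10);
""")

# Approach 1: one inner join per rating type.
join_twice = """
    SELECT p.product_id
    FROM products p
    JOIN rating r1 ON r1.product_id = p.product_id
        AND r1.type = 'robustness' AND r1.rating >= 7
    JOIN rating r2 ON r2.product_id = p.product_id
        AND r2.type = 'price_quality_ratio' AND r2.rating >= 8
    ORDER BY p.product_id
"""

# Approach 2: a single join, with the criteria checked in HAVING.
conditional_aggregation = """
    SELECT p.product_id
    FROM products p
    JOIN rating r ON r.product_id = p.product_id
        AND r.type IN ('robustness', 'price_quality_ratio')
    GROUP BY p.product_id
    HAVING MIN(CASE WHEN r.type = 'robustness' THEN r.rating END) >= 7
       AND MIN(CASE WHEN r.type = 'price_quality_ratio' THEN r.rating END) >= 8
    ORDER BY p.product_id
"""

ids_join = [row[0] for row in conn.execute(join_twice)]
ids_agg = [row[0] for row in conn.execute(conditional_aggregation)]
print(ids_join, ids_agg)
```

With one rating per type and product, both queries agree. If a product can have several ratings of the same type, they differ: the double join keeps a product when any rating of each type qualifies (and may return it more than once), while the MIN-based HAVING requires every rating of that type to qualify.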

The JOIN proposed by @GMB would've been my first suggestion as well. If that gets too complicated because you have to maintain too many rX.rating conditions, you can also use a nested query:
SELECT *
FROM (
SELECT p.*, r1.rating as robustness, r2.rating as price_quality_ratio
FROM products p
JOIN rating r1 ON (r1.product_id = p.product_id AND r1.type = 'robustness')
JOIN rating r2 ON (r2.product_id = p.product_id AND r2.type = 'price_quality_ratio')
) AS tmp
WHERE robustness >= 7
AND price_quality_ratio >= 8
-- ORDER BY price_quality_ratio DESC, robustness DESC -- etc

Related

Using another order when window function result is equal

I'm using the window function rank in the following query:
select *,
rank() over(partition by challenge_id order by total_distance desc nulls last) as classification
from user_challenges uc
left join users u on u.id = uc.user_id
left join profiles p on p.user_id = u.id
where challenge_id = 'de26076b-88ac-466e-8878-89d589c48d1c'
limit 10 offset 0
This query returns the users ranked by the total_distance reached. When two or more users have the same total_distance, I'd like to order them by the name of the user instead. How can I achieve that?
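One possible approach, sketched here rather than taken from an answer, is to add the name as a second sort key inside the window's ORDER BY, so that equal distances are ranked alphabetically. The sketch assumes the name lives in a users.name column (it might be in profiles instead), uses in-memory SQLite in place of PostgreSQL, and works on made-up rows; in PostgreSQL the original "nulls last" can stay on the first sort key.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; users.name is an assumed column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_challenges (user_id INTEGER, challenge_id TEXT, total_distance REAL);
    INSERT INTO users VALUES (1, 'Carol'), (2, 'Alice'), (3, 'Bob');
    INSERT INTO user_challenges VALUES (1, 'c1', 42.0), (2, 'c1', 42.0), (3, 'c1', 50.0);
""")

query = """
    SELECT u.name,
           rank() OVER (
               PARTITION BY uc.challenge_id
               -- the second key only matters when total_distance is equal
               ORDER BY uc.total_distance DESC, u.name ASC
           ) AS classification
    FROM user_challenges uc
    LEFT JOIN users u ON u.id = uc.user_id
    WHERE uc.challenge_id = ?
    ORDER BY classification
"""
ranking = conn.execute(query, ("c1",)).fetchall()
for name, classification in ranking:
    print(classification, name)
```

If tied users should keep the same rank and only be listed alphabetically, an alternative is to leave the window unchanged and sort the outer query by classification and then name.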

Correlated subquery in Postgres

I have a query like the one below to find the stock details of certain products. The query works, but I think it is not efficient or fast enough (DB: PostgreSQL version 11).
There is a CTE "result_set" in this code where I need to find the quantity of a product ordered (qty_last_7d_from_oos_date) during the period between the out-of-stock date and the 7 days before it. In the same way, I also have to find the revenue.
So I wrote the same subquery twice, one returning the revenue and the other the quantity, which is not efficient. Does anyone have suggestions on how to rewrite this to make it more efficient?
WITH final as
(
SELECT product_id,product_name,item_sku,out_of_stock_at
,out_of_stock_at - INTERVAL '7 days' as previous_7_days
,back_in_stock_at
FROM oos_base
)
SELECT product_id,product_name,item_sku,out_of_stock_at,previous_7_days
,back_in_stock_at
,(SELECT coalesce(sum(i.qty_ordered), 0) AS qty_last_7d_from_oos_date
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
)
,( SELECT coalesce(sum(i.row_amount_minus_discount_order), 0) AS rev_last_7d_from_oos_date
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
)
FROM final f
In the above code the CTE "final" gives you two dates, "out_of_stock_at" and "previous_7_days". I want to find the quantity and revenue of a product based on these two dates, that is, between "previous_7_days" and "out_of_stock_at".
The query below gives the quantity and revenue of the products, restricted to the period between "previous_7_days" and "out_of_stock_at" from the CTE above.
As of now I have used the code below twice, once for the revenue and once for the quantity.
SELECT coalesce(sum(i.qty_ordered), 0) AS qty ,
coalesce(sum(i.row_amount_minus_discount_order), 0)
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
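One way to avoid running the same subquery twice, offered only as a sketch, is to join the order lines once and compute both sums in a single GROUP BY. The example below uses in-memory SQLite instead of PostgreSQL, a simplified hypothetical schema (orders.state, orders.created_at, items.row_amount, no ol.products join), and invented rows. It also assumes one row per product in oos_base. In PostgreSQL, a LEFT JOIN LATERAL subquery that returns both aggregates would be another option.

```python
import sqlite3

# Simplified, hypothetical schema; dates are ISO strings so they compare correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE oos_base (product_id INTEGER, out_of_stock_at TEXT);
    CREATE TABLE orders (order_id INTEGER, state TEXT, created_at TEXT);
    CREATE TABLE items (order_id INTEGER, product_id INTEGER, qty_ordered INTEGER, row_amount REAL);
    INSERT INTO oos_base VALUES (1, '2024-01-10'), (2, '2024-01-10');
    INSERT INTO orders VALUES
        (100, 'complete',   '2024-01-05'),
        (101, 'canceled',   '2024-01-06'),
        (102, 'processing', '2023-12-20'),
        (103, 'complete',   '2024-01-09');
    INSERT INTO items VALUES
        (100, 1, 2, 20.0), (101, 1, 5, 50.0), (102, 1, 1, 10.0), (103, 1, 3, 30.0);
""")

query = """
    SELECT f.product_id,
           COALESCE(SUM(s.qty_ordered), 0) AS qty_last_7d_from_oos_date,
           COALESCE(SUM(s.row_amount), 0)  AS rev_last_7d_from_oos_date
    FROM oos_base f
    LEFT JOIN (
        -- order lines of relevant orders, read once for both aggregates
        SELECT i.product_id, i.qty_ordered, i.row_amount, o.created_at
        FROM items i
        JOIN orders o ON o.order_id = i.order_id
        WHERE o.state IN ('complete', 'processing')
    ) s ON s.product_id = f.product_id
       AND s.created_at BETWEEN date(f.out_of_stock_at, '-7 days') AND f.out_of_stock_at
    GROUP BY f.product_id
    ORDER BY f.product_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN keeps products with no qualifying orders, and COALESCE turns their empty sums into 0, matching the behaviour of the original scalar subqueries.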

Postgresql - how to combine these two queries

I am trying to combine these two queries into one.
The result of these queries is the number of accepted / rejected applications for a given operator.
I want to get the result in three columns: the number of accepted applications, the number of rejected applications, and the operator they are assigned to.
select count(applications.id) as number_of_applications, operator_id
from applications
inner join travel p on applications.id = p.application_id
inner join trip_details sp on p.id = sp.trip_id
where application_status ilike '%rejected%'
group by operator_id
order by number_of_applications desc;
select count(applications.id) as number_of_applications, operator_id
from applications
inner join travel p on applications.id = p.application_id
inner join trip_details sp on p.id = sp.trip_id
where application_status ilike '%accepted%'
group by operator_id
order by number_of_applications desc;
With conditional aggregation:
select
sum(case when application_status ilike '%accepted%' then 1 else 0 end) as number_of_applications_accepted,
sum(case when application_status ilike '%rejected%' then 1 else 0 end) as number_of_applications_rejected,
operator_id
from applications
inner join travel p on applications.id = p.application_id
inner join trip_details sp on p.id = sp.trip_id
where (application_status ilike '%rejected%') or (application_status ilike '%accepted%')
group by operator_id;
You can add the ordering that you prefer.
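The conditional aggregation above can be tried out with a small sketch. It assumes in-memory SQLite instead of PostgreSQL, so LIKE (case-insensitive for ASCII in SQLite) replaces ilike; the joins to travel and trip_details are left out, operator_id is placed directly on a simplified applications table, and the rows are invented.

```python
import sqlite3

# Simplified, hypothetical table: operator_id sits directly on applications.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE applications (
        id INTEGER PRIMARY KEY, operator_id INTEGER, application_status TEXT);
    INSERT INTO applications VALUES
        (1, 10, 'Accepted'), (2, 10, 'rejected by operator'), (3, 10, 'accepted'),
        (4, 20, 'REJECTED'), (5, 20, 'pending');
""")

query = """
    SELECT operator_id,
           SUM(CASE WHEN application_status LIKE '%accepted%' THEN 1 ELSE 0 END)
               AS number_of_applications_accepted,
           SUM(CASE WHEN application_status LIKE '%rejected%' THEN 1 ELSE 0 END)
               AS number_of_applications_rejected
    FROM applications
    WHERE application_status LIKE '%accepted%'
       OR application_status LIKE '%rejected%'
    GROUP BY operator_id
    ORDER BY operator_id
"""
counts = conn.execute(query).fetchall()
for operator_id, accepted, rejected in counts:
    print(operator_id, accepted, rejected)
```

Each CASE expression contributes 1 only for rows of its own status, so one pass over the table yields both counts per operator. In PostgreSQL the same counts can also be written as count(*) FILTER (WHERE ...).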

How do I do LIMIT within GROUP in the same table?

I can't figure out how to apply a limit within a group, although I've read all the similar questions here. Reading the PSQL docs didn't help either :( Consider the following:
CREATE TABLE article_relationship
(
article_from INT NOT NULL,
article_to INT NOT NULL,
score INT
);
I want to get a list of the top 5 related articles for each of the given article IDs, sorted by score.
Here is what I tried:
select DISTINCT o.article_from
from article_relationship o
join lateral (
select i.article_from, i.article_to, i.score from article_relationship i
order by score desc
limit 5
) p on p.article_from = o.article_from
where o.article_from IN (18329382, 61913904, 66538293, 66540477, 66496909)
order by o.article_from;
And it returns nothing. I was under the impression that the outer query works like a loop, so I guess I only need the source IDs there.
Also, what if I want to join on the articles table, which has the columns id and title, and get the titles of the related articles in the result set?
I added a join in the inner query:
select o.id, p.*
from articles o
join lateral (
select a.title, i.article_from, i.article_to, i.score
from article_relationship i
INNER JOIN articles a on a.id = i.article_to
where i.article_from = o.id
order by score desc
limit 5
) p on true
where o.id IN (18329382, 61913904, 66538293, 66540477, 66496909)
order by o.id;
But it made it very very slow.
The problem with no rows returning from your query is that your join condition is wrong: ON p.article_from = o.article_from; this should obviously be ON p.article_from = o.article_to.
That issue aside, your query will not return the top 5 scoring relations per article id; instead it will return the article IDs that reference one of the 5 top rated referenced articles throughout the table and (also) at least 1 of the 5 referenced articles for which you specify the id.
You can get the top 5 rated referenced articles per referencing article with a window function to rank the scores in a sub-select and then select only the top 5 in the main query. Specifying a list of referenced article IDs effectively means that you will rank how these referenced articles are scored for each referencing article:
SELECT article_from, article_to, score
FROM (
SELECT article_from, article_to, score,
rank() OVER (PARTITION BY article_from ORDER BY score DESC) AS rnk
FROM article_relationship
WHERE article_to IN (18329382, 61913904, 66538293, 66540477, 66496909) ) a
WHERE rnk < 6
ORDER BY article_from, score DESC;
This is different from your code in that it returns up to 5 records for each article_from but it is consistent with your initial description.
Adding columns from table articles is trivially done in the main query:
SELECT a.article_from, a.article_to, a.score, articles.*
FROM (
SELECT article_from, article_to, score,
rank() OVER (PARTITION BY article_from ORDER BY score DESC) AS rnk
FROM article_relationship
WHERE article_to IN (18329382, 61913904, 66538293, 66540477, 66496909) ) a
JOIN articles ON articles.id = a.article_to
WHERE a.rnk < 6
ORDER BY a.article_from, a.score DESC;
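One detail worth checking with the window-function approach is how ties are handled: rank() gives tied scores the same rank, so a "top N" filter can return more than N rows per article, whereas row_number() always cuts off at N. The sketch below illustrates this under a few assumptions: in-memory SQLite instead of PostgreSQL, a limit of 2 instead of 5 to keep the data small, invented rows, no article_to filter, and a hypothetical helper named top_related.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article_relationship (
        article_from INT NOT NULL, article_to INT NOT NULL, score INT);
    INSERT INTO article_relationship VALUES
        (1, 10, 90), (1, 11, 80), (1, 12, 80), (1, 13, 10),
        (2, 10, 50);
""")


def top_related(rank_fn, limit):
    """Return up to `limit` best-scored relations per article_from."""
    # rank_fn is interpolated into the SQL, so only the two known names are allowed.
    if rank_fn not in ("rank", "row_number"):
        raise ValueError(rank_fn)
    query = f"""
        SELECT article_from, article_to, score
        FROM (
            SELECT article_from, article_to, score,
                   {rank_fn}() OVER (PARTITION BY article_from
                                     ORDER BY score DESC) AS rnk
            FROM article_relationship
        ) ranked
        WHERE rnk <= ?
        ORDER BY article_from, score DESC
    """
    return conn.execute(query, (limit,)).fetchall()


with_rank = top_related("rank", 2)              # ties at the cutoff are all kept
with_row_number = top_related("row_number", 2)  # never more than 2 per article
print(len(with_rank), len(with_row_number))
```

With row_number(), which of the tied rows survives is arbitrary unless a tie-breaking column such as article_to is added to the window's ORDER BY.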
Version with join lateral
select o.id as from_id, p.article_to as to_id, a.title, a.journal_id, a.pub_date_p from articles o
join lateral (
select i.article_to from article_relationship i
where i.article_from = o.id
order by score desc
limit 5
) p on true
INNER JOIN articles a on a.id = p.article_to
where o.id IN (18329382, 61913904, 66538293, 66540477, 66496909)
order by o.id;

How to count detail rows on nested categories?

Let us consider that we have Categories (with PK as CategoryId) and Products (with PK as ProductId). Also, assume that every Category can relate to its parent category (using ParentCategoryId column in Categories).
How can I get Category wise product count? The parent category should include the count of all products of all of its sub-categories as well.
Is there an easier way to do this?
Sounds like what you are asking for would be a good use for WITH ROLLUP:
select cola, colb, SUM(colc) AS sumc
from table
group by cola, colb
with rollup
This would give a sum for colb and a rollup sum for cola. Example result below. Hope the formatting works. The null values are the rollup sums for the group.
cola  colb  sumc
1     a     1
1     b     4
1     NULL  5
2     c     2
2     d     3
2     NULL  5
NULL  NULL  10
Give it a go and let me know if that has worked.
--EDIT
OK, I think I've got this, as it is working on a small test set I am using. I've started to see a place where I need this myself, so thanks for asking the question. I will admit this is a bit messy, but it should work for any number of levels and will only return the sum at the highest level.
I made the assumption that there is a number field in Products.
with x
as (
select c.CategoryID, c.parentid, p.number, cast(c.CategoryID as varchar(8000)) as grp, c.CategoryID as thisid
from Categories as c
join Products as p on p.Categoryid = c.CategoryID
union all
select c.CategoryID, c.parentid, p.number, cast(c.CategoryID as varchar(8000))+'.'+x.grp , x.thisid
from Categories as c
join Products as p on p.Categoryid = c.CategoryID
join x on x.parentid = c.CategoryID
)
select x.CategoryID, SUM(x.number) as Amount
from x
left join Categories a on (a.CategoryID = LEFT(x.grp, case when charindex('.',x.grp)-1 > 0 then charindex('.',x.grp)-1 else 0 end))
or (a.CategoryID = x.thisid)
where a.parentid = 0
group by x.CategoryID
Assuming that Products can only point to a subcategory, here's a probable solution to the problem:
SELECT
cp.CategoryId,
ProductCount = COUNT(*)
FROM Products p
INNER JOIN Categories cc ON p.CategoryId = cc.CategoryId
INNER JOIN Categories cp ON cc.ParentCategoryId = cp.CategoryId
GROUP BY cp.CategoryId
But if the above assumption is wrong and a product can reference a parent category directly as well as a subcategory, then here's how you could count the products in this case:
SELECT
CategoryId = ISNULL(c2.CategoryId, c1.CategoryId),
ProductCount = COUNT(*)
FROM Products p
INNER JOIN Categories c1 ON p.CategoryId = c1.CategoryId
LEFT JOIN Categories c2 ON c1.ParentCategoryId = c2.CategoryId
GROUP BY ISNULL(c2.CategoryId, c1.CategoryId)
EDIT
This should work for 3 levels of hierarchy of categories (category, sub-category, sub-sub-category).
SELECT
CategoryId = COALESCE(c3.CategoryId, c2.CategoryId, c1.CategoryId),
ProductCount = COUNT(*)
FROM Products p
INNER JOIN Categories c1 ON p.CategoryId = c1.CategoryId
LEFT JOIN Categories c2 ON c1.ParentCategoryId = c2.CategoryId
LEFT JOIN Categories c3 ON c2.ParentCategoryId = c3.CategoryId
GROUP BY COALESCE(c3.CategoryId, c2.CategoryId, c1.CategoryId)
COALESCE picks the first non-NULL argument. If the product's category is a grand-child, it picks c3.CategoryId, which is its grand-parent; if it is a child, its parent c2.CategoryId is chosen; otherwise the category is itself a top-level one (c1.CategoryId).
In the end, it selects only grand-parent categories, and shows product count for them that includes all the subcategories of all levels.
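For hierarchies of unknown depth, and for counts at every level rather than only at the top, a recursive CTE is one alternative to chaining a fixed number of self-joins. The sketch below is an illustration under stated assumptions, not a drop-in replacement for the answers above: it uses in-memory SQLite, marks root categories with a NULL ParentCategoryId, and works on invented rows. SQL Server accepts the same shape written as WITH without the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (CategoryId INTEGER PRIMARY KEY, ParentCategoryId INTEGER);
    CREATE TABLE Products (ProductId INTEGER PRIMARY KEY, CategoryId INTEGER);
    -- 1 is a root; 2 and 3 are its children; 4 is a child of 2; 5 is another root
    INSERT INTO Categories VALUES (1, NULL), (2, 1), (3, 1), (4, 2), (5, NULL);
    INSERT INTO Products VALUES (100, 1), (101, 2), (102, 4), (103, 4), (104, 3), (105, 5);
""")

query = """
    WITH RECURSIVE tree(ancestor_id, descendant_id) AS (
        -- every category is its own descendant at depth 0
        SELECT CategoryId, CategoryId FROM Categories
        UNION ALL
        -- extend each (ancestor, descendant) pair by one level of children
        SELECT t.ancestor_id, c.CategoryId
        FROM tree t
        JOIN Categories c ON c.ParentCategoryId = t.descendant_id
    )
    SELECT t.ancestor_id AS CategoryId, COUNT(p.ProductId) AS ProductCount
    FROM tree t
    LEFT JOIN Products p ON p.CategoryId = t.descendant_id
    GROUP BY t.ancestor_id
    ORDER BY t.ancestor_id
"""
category_counts = conn.execute(query).fetchall()
for category_id, product_count in category_counts:
    print(category_id, product_count)
```

Each category appears once with the number of products in its whole subtree; filtering the final SELECT to root categories would reproduce the "top level only" result of the join-based queries.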