Finding Min and Max per Country - postgresql

I'm trying to find the distributors with the lowest and highest quantity for each country,
in two columns: one for the distributor with the minimum quantity and one for the distributor with the maximum quantity.
I have been able to get this information using other posts, but the results come back in a single column. I want one row per country instead.
See http://sqlfiddle.com/#!17/448f6/2
Desired result
"country" "min_qty_name" "max_qty_name"
1. "Madagascar" "Leonard Cardenas" "Gwendolyn Mccarty"
2. "Malaysia" "Arsenio Knowles" "Yael Carter"
3. "Palau" "Brittany Burris" "Clark Weaver"
4. "Tanzania" "Levi Douglas" "Levi Douglas"

You can use subqueries:
select distinct country,
(select distributor_name
from product
where country = p.country
order by quantity limit 1) as min_qty_name,
(select distributor_name
from product
where country = p.country
order by quantity desc limit 1) as max_qty_name
from product p;

You can also do it with CTEs:
WITH max_table AS
(
SELECT ROW_NUMBER() OVER (partition by country order by country,quantity DESC) AS rank,
country, quantity,distributor_name
FROM
product
),
min_table AS
(
SELECT ROW_NUMBER() OVER (partition by country order by country,quantity) AS rank,
country, quantity,distributor_name
FROM
product
)
SELECT m1.country, m2.distributor_name AS min_qty_name, m1.distributor_name AS max_qty_name
from max_table m1, min_table m2
where m1.country = m2.country
and m1.rank = 1 and m2.rank = 1

You can do this with a single sort and pass through the data as follows:
with min_max as (
select distinct country,
first_value(distributor_name) over w as min_qty_name,
last_value(distributor_name) over w as max_qty_name
from product
window w as (partition by country
order by quantity
rows between unbounded preceding
and unbounded following)
)
select *
from min_max
order by country;
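The explicit frame in the window definition matters for last_value. With an ORDER BY and no frame clause, the default frame ends at the current row (and its peers), so last_value would return the current row's distributor rather than the last one in the partition. A minimal sketch of the difference, using made-up rows (the names A, B, C and the quantities are hypothetical, not taken from the question's data):

```sql
-- Hypothetical sample rows; not the data from the question.
SELECT country, distributor_name, quantity,
       -- Default frame: stops at the current row, so this just echoes the current row.
       last_value(distributor_name) OVER (PARTITION BY country
                                          ORDER BY quantity) AS default_frame,
       -- Full frame: sees the whole partition, so this is the max-quantity distributor.
       last_value(distributor_name) OVER (PARTITION BY country
                                          ORDER BY quantity
                                          ROWS BETWEEN UNBOUNDED PRECEDING
                                                   AND UNBOUNDED FOLLOWING) AS full_frame
FROM (VALUES ('Palau', 'A', 10),
             ('Palau', 'B', 20),
             ('Palau', 'C', 30)) AS p(country, distributor_name, quantity)
ORDER BY quantity;
```

With these rows, default_frame comes back as A, B, C while full_frame is C on every row, which is why the frame clause is needed for the max_qty_name column.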

Related

can you use max in this query?

From this table, I'm trying to determine the nation(s) with the highest number of teams (a nation X has a team if that team has at least one athlete from country X).
driver(id,name, team, country)
This query returns all countries in descending order of team count. Is it possible to return only the one(s) with the most teams rather than all of them? I think I should use max, but I'm not sure.
SELECT (country) ,count(distinct team)
FROM driver
GROUP BY country
order by count(distinct team) DESC;
I would use your query as a CTE and then select from it like this:
WITH t AS
(
SELECT country, count(distinct team) cnt
FROM driver
GROUP BY country
)
SELECT country, cnt FROM t
WHERE cnt = (SELECT max(cnt) FROM t);
You can combine this with a window function:
with counts as (
SELECT country,
count(distinct team) as num_teams,
dense_rank() over (order by count(distinct team) desc) as rnk
FROM driver
GROUP BY country
)
select country, num_teams
from counts
where rnk = 1;
If you are using Postgres 13 or later, you can use fetch first with the with ties option:
SELECT country,
count(distinct team) as num_teams
FROM driver
GROUP BY country
order by count(distinct team) desc
fetch first 1 rows with ties;
If two countries have the same highest number of teams, this returns both. Without the with ties option (which was introduced in Postgres 13), only one of them would be returned.
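To see the effect of with ties without needing the driver table, here is a small sketch over made-up per-country counts (the countries and numbers are hypothetical):

```sql
-- Hypothetical team counts; requires a PostgreSQL version that supports WITH TIES.
SELECT country, num_teams
FROM (VALUES ('Italy',   3),
             ('Germany', 3),
             ('France',  2)) AS t(country, num_teams)
ORDER BY num_teams DESC
FETCH FIRST 1 ROWS WITH TIES;
```

This returns both Italy and Germany, because they tie on the ORDER BY value of the first row. Replacing the last line with FETCH FIRST 1 ROWS ONLY returns just one of the two, and which one is not guaranteed.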

How to get the MAX(SUM of values) to find the category with the biggest total? PostgreSQL

I have two tables, Transactions and Tickets. Tickets has the Ticket_Number, the name of the Category (Theater, Cinema, Concert), and the Price of the ticket. Transactions also has the Ticket_Number. I want to get the SUM of money for each Category, and then select the Category with the most money.
I already managed to get the SUM for each category, but I am stuck here:
SELECT category, SUM (Tickets.Price) AS Price
FROM Tickets,Transactions
WHERE Tickets.ticket_num=Transactions.ticket_num
GROUP BY Category
ORDER BY Price DESC;
I know I can add LIMIT 1, but that is not correct because two or more categories can have the same total.
Use ROW_NUMBER to generate a sequence ordered by the summed price, then keep only the aggregated row with the highest total price.
WITH cte AS (
SELECT category, SUM(t1.Price) AS Price,
ROW_NUMBER() OVER (ORDER BY SUM(t1.Price) DESC) rn
FROM Tickets t1
INNER JOIN Transactions t2
ON t1.ticket_num = t2.ticket_num
GROUP BY Category
)
SELECT category, Price
FROM cte
WHERE rn = 1
ORDER BY Price DESC;
Note that if you want to capture all categories tied for the highest price, should a tie occur, then replace ROW_NUMBER in the above CTE with RANK, keeping everything else the same.
What you are looking for is a window function DENSE_RANK() which will handle ties properly.
RANK() will also work for your case, but if you would like to extend it to get TOP N places with ties (where N > 1), dense rank is the way to go.
SELECT Category, Price
FROM (
SELECT
Category,
SUM(ti.Price) AS Price,
DENSE_RANK() OVER (ORDER BY SUM(ti.Price) DESC) AS rnk
FROM Tickets ti
INNER JOIN Transactions tr ON
ti.ticket_num = tr.ticket_num
GROUP BY Category
) t
WHERE rnk = 1
I've also replaced the old-style, discouraged comma-separated table list in the FROM clause with a proper INNER JOIN clause and assigned aliases to the tables.
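The difference between RANK() and DENSE_RANK() only shows up after a tie. A minimal sketch over made-up category totals (the amounts are hypothetical):

```sql
-- Hypothetical totals: Theater and Cinema tie for first place.
SELECT category, total,
       rank()       OVER (ORDER BY total DESC) AS rnk,        -- 1, 1, 3
       dense_rank() OVER (ORDER BY total DESC) AS dense_rnk   -- 1, 1, 2
FROM (VALUES ('Theater', 500),
             ('Cinema',  500),
             ('Concert', 300)) AS s(category, total)
ORDER BY total DESC, category;
```

Filtering on rnk = 1 or dense_rnk = 1 gives the same two rows. For the top two places, rnk <= 2 still returns only the two tied rows (rank 2 is skipped), while dense_rnk <= 2 also returns Concert.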
You can use rank() to rank the sums of the prices, more expensive first.
SELECT category,
price
FROM (SELECT category,
sum(tickets.price) price,
rank() OVER (ORDER BY sum(tickets.price) DESC) r
FROM tickets
INNER JOIN transactions
ON transactions.ticket_num = tickets.ticket_num
GROUP BY category) x
WHERE r = 1;
I also took the liberty of rewriting your join from the ancient comma style to the modern, clearer explicit JOIN syntax.

PostgreSQL: select column and exclude from group

This question has probably been asked in different formats, but I could not find the answer.
I have a table orders with these columns:
date, quantity_ordered, unit_cost_cents , product_model_number, title
I would like to:
SELECT
model_number,
title,
SUM(unit_cost_cents / 100.00 * quantity_ordered) as total
FROM orders
GROUP BY model_number
HAVING SUM(quantity_submitted) > 0
ORDER BY total DESC
But this requires grouping by the title as well.
My problem is that the title changes over time. I'd like to keep the stored titles as they are and simply select the most recent title, without grouping by title, which would change the totals.
You can use a subquery to fetch the latest title:
SELECT
model_number,
(select max(title) from orders
where model_number = o.model_number
and date = (select max(date) from orders where model_number = o.model_number)
) title,
SUM(unit_cost_cents / 100.00 * quantity_ordered) as total
FROM orders o
GROUP BY model_number
HAVING SUM(quantity_submitted) > 0
ORDER BY total DESC
I used select max(title) instead of select title to make sure that the subquery never returns more than one row (just in case).
SELECT
o.model_number
, om.title
, SUM(o.unit_cost_cents / 100.00 * o.quantity_ordered) as total
FROM orders o
JOIN (SELECT model_number, title
,row_number() OVER (PARTITION BY model_number ORDER BY date DESC) AS rn
FROM orders) om
ON om.model_number=o.model_number AND om.rn=1
GROUP BY 1,2
HAVING SUM(o.quantity_submitted) > 0
ORDER BY 3 DESC
;
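PostgreSQL's DISTINCT ON is another way to pick the most recent title per model, and its result could be joined to the totals in the same way as the row_number() subquery above. A minimal sketch, assuming the model_number, title and date columns used in the queries above; the sample rows are made up:

```sql
-- Hypothetical sample rows standing in for the orders table.
WITH orders (date, model_number, title) AS (
    VALUES (DATE '2020-01-01', 'A1', 'Old title'),
           (DATE '2021-06-01', 'A1', 'New title'),
           (DATE '2020-03-15', 'B2', 'Only title')
)
-- DISTINCT ON keeps the first row per model_number in ORDER BY order,
-- which is the row with the latest date.
SELECT DISTINCT ON (model_number)
       model_number, title
FROM orders
ORDER BY model_number, date DESC;
```

For this sample, model A1 comes back with 'New title' and B2 with 'Only title'.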

postgres - get top category purchased by customer

I have a denormalized table with the columns:
buyer_id
order_id
item_id
item_price
item_category
I would like a query that returns one row per buyer_id:
buyer_id, sum(item_price), item_category
-- but ONLY for the category with the highest rank of sales along that specific buyer_id.
I can't get row_number() or partition to work because I need to order by the sum of item_price relative to item_category relative to buyer. Am I overlooking anything obvious?
You need a few layers of fudging here:
SELECT buyer_id, item_sum, item_category
FROM (
SELECT buyer_id,
rank() OVER (PARTITION BY buyer_id ORDER BY item_sum DESC) AS rnk,
item_sum, item_category
FROM (
SELECT buyer_id, sum(item_price) AS item_sum, item_category
FROM my_table
GROUP BY 1, 3) AS sub2) AS sub
WHERE rnk = 1;
In sub2 you calculate the sum of 'item_price' for each 'item_category' for each 'buyer_id'. In sub you rank these with a window function by 'buyer_id', ordering by 'item_sum' in descending order (so the highest 'item_sum' comes first). In the main query you select those rows where rnk = 1.
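In PostgreSQL, window functions are evaluated after grouping, so the ranking can also order by the aggregate directly and one level of nesting can be dropped. A sketch of that variant, with a few made-up rows standing in for my_table (the values are hypothetical):

```sql
-- Hypothetical sample rows standing in for my_table.
WITH my_table (buyer_id, order_id, item_id, item_price, item_category) AS (
    VALUES (1, 10, 100, 20.00, 'books'),
           (1, 10, 101, 35.00, 'games'),
           (1, 11, 102, 30.00, 'books'),
           (2, 12, 103, 15.00, 'games')
)
SELECT buyer_id, item_sum, item_category
FROM (
    SELECT buyer_id,
           item_category,
           sum(item_price) AS item_sum,
           -- The window function ranks the grouped rows by their summed price.
           rank() OVER (PARTITION BY buyer_id
                        ORDER BY sum(item_price) DESC) AS rnk
    FROM my_table
    GROUP BY buyer_id, item_category
) AS ranked
WHERE rnk = 1
ORDER BY buyer_id;
```

For this sample, buyer 1 gets books with a total of 50.00 and buyer 2 gets games with 15.00. As with the query above, rank() returns more than one row for a buyer whose top categories tie.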

selecting only two employees from every department

Can you let me know how to select only two employees from every department? The table has deptname, ssn, name. I am doing a sampling and I need only two ssns for every department name. Can someone help?
You can accomplish this with the window function (an "OLAP expression") row_number():
with e as
( select deptname, ssn, empname,
row_number() over (partition by deptname order by empname) as pick
from employees
)
select deptname, ssn, empname
from e
where pick < 3
order by deptname, ssn
This example gives you the two employees in each department whose names sort first, because that is the ordering specified in the row_number() over (... order by ...) expression.
Try this:
select *
from t t1
where (
select count(*)
from t t2
where
t2.deptname = t1.deptname
and
t2.ssn <= t1.ssn) <= 2
order by deptname, ssn,name;
The above returns the two smallest ssn values per department.
If you want the two largest, change the comparison to t2.ssn >= t1.ssn
select * from
( select rank() over (partition by deptname order by name) as count , *
from employees
) e
where count<=2
order by deptname, ssn,name;
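One thing to keep in mind when choosing between rank() and row_number() for sampling: rank() assigns the same value to ties, so a filter of <= 2 can return more than two rows per department, while row_number() never does. A small sketch with made-up rows (the names and ssn values are hypothetical):

```sql
-- Hypothetical sample rows: three Sales employees share the same name.
WITH employees (deptname, ssn, name) AS (
    VALUES ('Sales', '111', 'Ann'),
           ('Sales', '222', 'Ann'),
           ('Sales', '333', 'Ann'),
           ('IT',    '444', 'Bob')
)
SELECT deptname, ssn, name,
       rank()       OVER w AS rnk,  -- 1 for all three tied Sales rows
       row_number() OVER w AS rn    -- 1, 2, 3 within Sales, in arbitrary order among ties
FROM employees
WINDOW w AS (PARTITION BY deptname ORDER BY name)
ORDER BY deptname, ssn;
```

Filtering on rnk <= 2 keeps all three Sales rows here, whereas rn <= 2 keeps exactly two. Adding a unique column such as ssn to the window's ORDER BY makes the row_number() pick deterministic.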