Find high and low prices in sql - tsql

In this sample database there are two tables, products and prices.
The goal is to find the highest and the lowest price for each product.
The price table can have either zero, one or two rows per product.
create table products(
id int,
name nvarchar(50)
)
create table prices(
productId int,
price int
)
insert into products (id, name) values (33,'bike')
insert into products (id, name) values (44,'car')
insert into products (id, name) values (55,'bus')
insert into prices (productId, price) values (33, 10)
insert into prices (productId, price) values (33, 40)
insert into prices (productId, price) values (44, 300)
The sql query should result in this:
productId highPrice lowPrice
33 40 10
44 300 NULL
55 NULL NULL

This is for MySQL, but it might work for you too.
SELECT
products.id as productId
, MAX(price) as highPrice
, MIN(price) as lowPrice
FROM products
LEFT JOIN prices ON products.id=prices.productId
GROUP BY products.id

SELECT productId,
MAX(price) AS highPrice,
MIN(price) AS lowPrice
FROM prices
GROUP BY productId
and if you want the product name in there as well:
SELECT name,
MAX(price) AS highPrice,
MIN(price) AS lowPrice
FROM products
LEFT OUTER JOIN prices ON ID = ProductID
GROUP BY name

This gives you the table that you're looking for (I notice that the other answers don't), in SQL Server 2005
select P.ID as ProductID,
nullif(sum(case when idx=1 then price else 0 end), 0) as highPrice,
nullif(sum(case when idx=2 then price else 0 end), 0) as lowPrice from
(
select productid, price, row_number() over(partition by productID order by price desc) as idx from prices
) T
right join products P on T.productID = P.ID
group by P.ID
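The expected output in the question leaves lowPrice NULL when a product has only one price row, which plain MIN/MAX does not do (both return 300 for product 44). A minimal sketch of another way to get that shape is below. It uses Python's sqlite3 module as a stand-in for SQL Server, which is an assumption made only so the example runs on its own, and it reports the low price only when more than one price exists.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the question's SQL Server tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER, name TEXT);
    CREATE TABLE prices (productId INTEGER, price INTEGER);
    INSERT INTO products VALUES (33, 'bike'), (44, 'car'), (55, 'bus');
    INSERT INTO prices VALUES (33, 10), (33, 40), (44, 300);
""")

# LEFT JOIN keeps products without prices; the CASE hides the low price
# unless the product has at least two price rows.
rows = conn.execute("""
    SELECT p.id AS productId,
           MAX(pr.price) AS highPrice,
           CASE WHEN COUNT(pr.price) > 1 THEN MIN(pr.price) END AS lowPrice
    FROM products p
    LEFT JOIN prices pr ON pr.productId = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

COUNT(pr.price) ignores the NULLs produced by the outer join, so a product with no prices gets NULL in both columns.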

Related

Finding Min and Max per Country

I'm trying to find, for each country, the distributor with the lowest quantity and the distributor with the highest quantity,
shown in two columns: the distributor with the minimum quantity and the distributor with the maximum quantity.
Using other posts I have been able to get this information, but only stacked in a single column. I want it on one row per country.
See http://sqlfiddle.com/#!17/448f6/2
Desired result
"country" "min_qty_name" "max_qty_name"
1. "Madagascar" "Leonard Cardenas" "Gwendolyn Mccarty"
2. "Malaysia" "Arsenio Knowles" "Yael Carter"
3. "Palau" "Brittany Burris" "Clark Weaver"
4. "Tanzania" "Levi Douglas" "Levi Douglas"
You can use subqueries:
select distinct country,
(select distributor_name
from product
where country = p.country
order by quantity limit 1) as min_qty_name,
(select distributor_name
from product
where country = p.country
order by quantity desc limit 1) as max_qty_name
from product p;
You can do it with a CTE too:
WITH max_table AS
(
SELECT ROW_NUMBER() OVER (partition by country order by country,quantity DESC) AS rank,
country, quantity,distributor_name
FROM
product
),
min_table AS
(
SELECT ROW_NUMBER() OVER (partition by country order by country,quantity) AS rank,
country, quantity,distributor_name
FROM
product
)
SELECT m1.country, m2.distributor_name AS min_qty_name, m1.distributor_name AS max_qty_name
from max_table m1, min_table m2
where m1.country = m2.country
and m1.rank = 1 and m2.rank = 1
You can do this with a single sort and pass through the data as follows:
with min_max as (
select distinct country,
first_value(distributor_name) over w as min_qty_name,
last_value(distributor_name) over w as max_qty_name
from product
window w as (partition by country
order by quantity
rows between unbounded preceding
and unbounded following)
)
select *
from min_max
order by country;
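The explicit frame in the window definition matters: with an ORDER BY and no frame clause, the default frame ends at the current row, so last_value would not reach the highest quantity in the partition. The sketch below runs the same idea through Python's sqlite3 module (window functions need SQLite 3.25 or later). The quantities and the 'Sample Distributor' row are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (country TEXT, distributor_name TEXT, quantity INTEGER)")
# Hypothetical rows: the quantities are made up for this example.
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [
    ("Palau", "Clark Weaver", 90),
    ("Palau", "Sample Distributor", 40),
    ("Palau", "Brittany Burris", 5),
    ("Tanzania", "Levi Douglas", 40),
])

# One sort per country; the frame spans the whole partition so that
# last_value sees the row with the largest quantity.
rows = conn.execute("""
    SELECT DISTINCT country,
           first_value(distributor_name) OVER w AS min_qty_name,
           last_value(distributor_name) OVER w AS max_qty_name
    FROM product
    WINDOW w AS (PARTITION BY country ORDER BY quantity
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    ORDER BY country
""").fetchall()

for row in rows:
    print(row)
```

A country with a single distributor, like Tanzania here, shows the same name in both columns, matching the desired result in the question.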

How can I combine the two select queries on the same table horizontally in Postgresql?

Hello everyone. I am a PostgreSQL beginner and recently ran into a problem.
I have one table named 'sales'.
create table sales
(
cust varchar(20),
prod varchar(20),
day integer,
month integer,
year integer,
state char(2),
quant integer
);
insert into sales values ('Bloom', 'Pepsi', 2, 12, 2001, 'NY', 4232);
insert into sales values ('Knuth', 'Bread', 23, 5, 2005, 'PA', 4167);
insert into sales values ('Emily', 'Pepsi', 22, 1, 2006, 'CT', 4404);
insert into sales values ('Emily', 'Fruits', 11, 1, 2000, 'NJ', 4369);
insert into sales values ('Helen', 'Milk', 7, 11, 2006, 'CT', 210);
insert into sales values ('Emily', 'Soap', 2, 4, 2002, 'CT', 2549);
insert into sales values ('Bloom', 'Eggs', 30, 11, 2000, 'NJ', 559);
....
There are 498 rows in total.
Here is the overview of this table:
Now I want to compute the maximum and minimum sales quantities for each product, along with the corresponding customer (who purchased the product), the date of each of those sales, and the state in which the transaction took place.
I also want the average sales quantity for each product.
The combined one should be like this:
It should have 10 rows because there are 10 distinct products in total.
I have tried:
select prod,
max(quant),
cust as MAX_CUST
from sales
group by prod;
but it returned an error saying that cust must appear in the GROUP BY clause. However, I only want to group by product.
What's more, how can I horizontally combine max_q and its customer, date, and state with min_q and its customer, date, and state, plus AVG_Q, keyed by product name?
I feel really confused!
You can use the window function ROW_NUMBER() in a subquery to rank the records of each product by increasing and by decreasing sales quantity, and then do conditional aggregation:
SELECT
prod product,
MAX(CASE WHEN rn2 = 1 THEN quant END) max_quant,
MAX(CASE WHEN rn2 = 1 THEN cust END) max_cust,
MAX(CASE WHEN rn2 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) max_date,
MAX(CASE WHEN rn2 = 1 THEN state END) max_state,
MAX(CASE WHEN rn1 = 1 THEN quant END) min_quant,
MAX(CASE WHEN rn1 = 1 THEN cust END) min_cust,
MAX(CASE WHEN rn1 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) min_date,
MAX(CASE WHEN rn1 = 1 THEN state END) min_state,
avg_quant
FROM (
SELECT
s.*,
ROW_NUMBER() OVER(PARTITION BY prod ORDER BY quant) rn1,
ROW_NUMBER() OVER(PARTITION BY prod ORDER BY quant DESC) rn2,
AVG(quant) OVER(PARTITION BY prod) avg_quant
FROM sales s
) x
WHERE rn1 = 1 OR rn2 = 1
GROUP BY prod, avg_quant
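To see the conditional aggregation at work, here is a minimal sketch that runs the same query shape against the seven sample rows from the question. It uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption, so that the example is self-contained; SQLite 3.25 or later is needed for window functions) and leaves out the TO_DATE columns, since SQLite has no TO_DATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE sales (cust TEXT, prod TEXT, day INTEGER, month INTEGER,
                                    year INTEGER, state TEXT, quant INTEGER)""")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("Bloom", "Pepsi", 2, 12, 2001, "NY", 4232),
    ("Knuth", "Bread", 23, 5, 2005, "PA", 4167),
    ("Emily", "Pepsi", 22, 1, 2006, "CT", 4404),
    ("Emily", "Fruits", 11, 1, 2000, "NJ", 4369),
    ("Helen", "Milk", 7, 11, 2006, "CT", 210),
    ("Emily", "Soap", 2, 4, 2002, "CT", 2549),
    ("Bloom", "Eggs", 30, 11, 2000, "NJ", 559),
])

# rn1 = 1 marks the smallest sale per product, rn2 = 1 the largest.
# Each CASE picks its value from exactly one of the (at most two) kept rows.
rows = conn.execute("""
    SELECT prod,
           MAX(CASE WHEN rn2 = 1 THEN quant END) AS max_quant,
           MAX(CASE WHEN rn2 = 1 THEN cust END)  AS max_cust,
           MAX(CASE WHEN rn1 = 1 THEN quant END) AS min_quant,
           MAX(CASE WHEN rn1 = 1 THEN cust END)  AS min_cust,
           avg_quant
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY prod ORDER BY quant)      AS rn1,
               ROW_NUMBER() OVER (PARTITION BY prod ORDER BY quant DESC) AS rn2,
               AVG(quant)   OVER (PARTITION BY prod)                     AS avg_quant
        FROM sales s
    ) x
    WHERE rn1 = 1 OR rn2 = 1
    GROUP BY prod, avg_quant
    ORDER BY prod
""").fetchall()

by_prod = {r[0]: r[1:] for r in rows}
print(by_prod["Pepsi"])
```

For Pepsi the two sample sales are 4232 (Bloom) and 4404 (Emily), so the row carries Emily as the maximum, Bloom as the minimum, and an average of 4318.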
Applying two aggregate functions (min and max) to a column and also selecting the row each value came from is not that straightforward. If you wanted only one of them, you could rank the rows with dense_rank (a window function) in a subquery and filter on the rank outside it, because a window function alias cannot be referenced in the WHERE clause of the same query level:
SELECT prod, quant, cust
FROM (SELECT prod, quant, cust,
dense_rank() OVER (PARTITION BY prod ORDER BY quant DESC) AS c_rank
FROM sales) ranked
WHERE c_rank < 2;
This gives you the rows with the maximum quant for each product. You can do the same for the minimum quant by sorting in ascending order. Doing both in the same query is more complicated; a simple way is to build a CTE for each case and join them, as shown below.
with max_quant as (
SELECT prod, quant, cust
FROM (SELECT prod, quant, cust,
dense_rank() OVER (PARTITION BY prod ORDER BY quant DESC) AS c_rank
FROM sales) ranked
WHERE c_rank < 2
),
min_quant as (
SELECT prod, quant, cust
FROM (SELECT prod, quant, cust,
dense_rank() OVER (PARTITION BY prod ORDER BY quant) AS c_rank
FROM sales) ranked
WHERE c_rank < 2
),
avg_quant as (
select prod, avg(quant) as avg_quant from sales group by prod
)
select mx.prod, mx.quant as max_quant, mx.cust as max_cust, mn.quant as min_quant, mn.cust as min_cust, ag.avg_quant
from max_quant mx
join min_quant mn on mn.prod = mx.prod
join avg_quant ag on ag.prod = mx.prod;
You can't use a plain GROUP BY to select the min/max here, because you want the complete row that holds the min/max value of quant, and GROUP BY alone cannot return that.
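One thing to keep in mind with dense_rank: tied quantities share rank 1, so the filtered result can hold several rows per product, and joining it to another such result multiplies them. The sketch below contrasts dense_rank with row_number on a tiny table. It uses Python's sqlite3 module (SQLite 3.25 or later), and the customers 'Ann', 'Bob', and 'Cy' are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (prod TEXT, cust TEXT, quant INTEGER)")
# Hypothetical data: Ann and Bob tie for the largest Milk sale.
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("Milk", "Ann", 10),
    ("Milk", "Bob", 10),
    ("Milk", "Cy", 3),
])

def top_customers(rank_fn):
    """Return the customers ranked first per product by the given window function."""
    sql = f"""
        SELECT cust FROM (
            SELECT cust,
                   {rank_fn}() OVER (PARTITION BY prod ORDER BY quant DESC) AS r
            FROM sales
        ) ranked
        WHERE r = 1
        ORDER BY cust
    """
    return [cust for (cust,) in conn.execute(sql)]

dense = top_customers("dense_rank")   # every customer tied for the maximum
single = top_customers("row_number")  # exactly one row, chosen arbitrarily among ties
print(dense, single)
```

If exactly one row per product is required, row_number with a deterministic tie-breaker in the ORDER BY is the usual choice; dense_rank fits when all tied rows should be reported.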

Updating a column in Postgres only returning one value incorrectly (Postgres 9.5.1)

Suppose I have a table created by the following:
create table test.sales
(
customer text,
purchased text,
date_purchased date,
rownumber integer,
primary key (customer, date_purchased)
)
;
insert into test.sales
values
('kevin', 'books', '2017-01-01'::date, null),
('kevin', 'movies', '2017-01-02'::date, null),
('paul', 'books', '2017-01-05'::date, null),
('paul', 'movies', '2017-01-07'::date, null)
At this point, rownumber is always NULL and I want to set the value of rownumber with row_number() over (partition by customer order by date_purchased) as rownumber. The way I am going about this is the following:
update test.sales as a
set (rownumber) =
(
select
row_number() over (partition by customer order by date_purchased) as rownumber
from test.sales as b
-- These fields correspond to the primary keys of the table.
where a.customer = b.customer
and a.date_purchased = b.date_purchased
)
But for some reason this returned:
customer purchased date_purchased rownumber
kevin books 1/1/17 1
kevin movies 1/2/17 1
paul books 1/5/17 1
paul movies 1/7/17 1
I'm expecting this:
customer purchased date_purchased rownumber
kevin books 1/1/17 1
kevin movies 1/2/17 2
paul books 1/5/17 1
paul movies 1/7/17 2
Notice that in the actual results, rownumber is always 1. Why is this?
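The subquery is correlated on the primary key, so its WHERE clause narrows test.sales to a single row before row_number() is evaluated, and numbering a one-row result always yields 1. Compute the row numbers over the whole table in a derived table first, then join that to the rows being updated with UPDATE ... FROM: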
UPDATE test.sales dst
SET rownumber = src.rownumber
FROM ( SELECT customer,date_purchased
, row_number() over (partition by customer order by date_purchased) as rownumber
from test.sales
) src
WHERE dst.customer = src.customer
AND dst.date_purchased = src.date_purchased
;
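The order of evaluation is the whole story here: WHERE runs before window functions. A minimal sketch of the difference, using Python's sqlite3 module as a stand-in for Postgres (an assumption, so that the example runs anywhere; SQLite 3.25 or later is needed) and the table without its schema prefix:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (
        customer TEXT, purchased TEXT, date_purchased TEXT, rownumber INTEGER,
        PRIMARY KEY (customer, date_purchased)
    );
    INSERT INTO sales VALUES
        ('kevin', 'books',  '2017-01-01', NULL),
        ('kevin', 'movies', '2017-01-02', NULL),
        ('paul',  'books',  '2017-01-05', NULL),
        ('paul',  'movies', '2017-01-07', NULL);
""")

# Filter first, number second: the window only ever sees one row.
filtered_first = conn.execute("""
    SELECT row_number() OVER (PARTITION BY customer ORDER BY date_purchased)
    FROM sales
    WHERE customer = 'kevin' AND date_purchased = '2017-01-02'
""").fetchone()[0]

# Number first (inside a derived table), filter second.
numbered_first = conn.execute("""
    SELECT rn FROM (
        SELECT customer, date_purchased,
               row_number() OVER (PARTITION BY customer ORDER BY date_purchased) AS rn
        FROM sales
    ) src
    WHERE customer = 'kevin' AND date_purchased = '2017-01-02'
""").fetchone()[0]

print(filtered_first, numbered_first)
```

The first query mirrors what the original UPDATE does for each target row; the second mirrors the derived table in the corrected statement.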

postgres - get top category purchased by customer

I have a denormalized table with the columns:
buyer_id
order_id
item_id
item_price
item_category
I would like to return something that returns 1 row per buyer_id
buyer_id, sum(item_price), item_category
-- but ONLY for the category with the highest rank of sales along that specific buyer_id.
I can't get row_number() or partition to work because I need to order by the sum of item_price relative to item_category relative to buyer. Am I overlooking anything obvious?
You need a few layers of fudging here:
SELECT buyer_id, item_sum, item_category
FROM (
SELECT buyer_id,
rank() OVER (PARTITION BY buyer_id ORDER BY item_sum DESC) AS rnk,
item_sum, item_category
FROM (
SELECT buyer_id, sum(item_price) AS item_sum, item_category
FROM my_table
GROUP BY 1, 3) AS sub2) AS sub
WHERE rnk = 1;
In sub2 you calculate the sum of 'item_price' for each 'item_category' for each 'buyer_id'. In sub you rank these with a window function by 'buyer_id', ordering by 'item_sum' in descending order (so the highest 'item_sum' comes first). In the main query you select those rows where rnk = 1.
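The three layers can be tried out on a few rows. The sketch below uses Python's sqlite3 module as a stand-in for Postgres (SQLite 3.25 or later); the rows are hypothetical and the grouping columns are spelled out by name instead of by position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE my_table (buyer_id INTEGER, order_id INTEGER, item_id INTEGER,
                                       item_price INTEGER, item_category TEXT)""")
# Hypothetical purchases: buyer 1 spends most on books, buyer 2 on games.
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?, ?)", [
    (1, 100, 1, 10, "books"),
    (1, 100, 2, 25, "games"),
    (1, 101, 3, 20, "books"),
    (2, 102, 4, 5, "toys"),
    (2, 103, 5, 7, "games"),
])

rows = conn.execute("""
    SELECT buyer_id, item_sum, item_category
    FROM (
        -- Rank each buyer's categories by total spend, highest first.
        SELECT buyer_id, item_sum, item_category,
               rank() OVER (PARTITION BY buyer_id ORDER BY item_sum DESC) AS rnk
        FROM (
            -- Total spend per buyer and category.
            SELECT buyer_id, item_category, sum(item_price) AS item_sum
            FROM my_table
            GROUP BY buyer_id, item_category
        ) AS sub2
    ) AS sub
    WHERE rnk = 1
    ORDER BY buyer_id
""").fetchall()

for row in rows:
    print(row)
```

Because rank() gives tied categories the same rank, a buyer whose top two categories have equal totals would appear twice; row_number() would force a single row per buyer.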

How to calculate average date occurrence frequency in SQL

I'm trying to produce a query on the following table (relevant portion only):
Create Table [Order] (
OrderID int NOT NULL IDENTITY(1,1),
CreationDate datetime NOT NULL,
CustomerID int NOT NULL
)
I would like to see a list of CustomerIDs with each customer's average number of days between orders. I'm curious if this can be done with a pure set based solution or if a cursor/temp table solution is necessary.
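;WITH base AS
(
SELECT CustomerID, CreationDate,
ROW_NUMBER() over (partition BY CustomerID ORDER BY CreationDate, OrderID) AS rn
FROM [Order]
)
-- Pair each order with the same customer's next order and average the gaps.
-- DATEDIFF returns an int, so AVG truncates; cast to float for fractional days.
SELECT b1.CustomerID,
AVG(DATEDIFF(DAY,b1.CreationDate, b2.CreationDate) ) AS AvgDaysBetweenOrders
FROM base b1
JOIN base b2
ON b1.CustomerID=b2.CustomerID
AND b2.rn =b1.rn+1
GROUP BY b1.CustomerID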
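The set-based self-join can be checked on a handful of orders. This is a sketch under a few assumptions: it runs on Python's sqlite3 module instead of SQL Server (SQLite 3.25 or later), the table is named orders to avoid quoting a reserved word, julianday() differences stand in for DATEDIFF, and the dates are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (OrderID INTEGER PRIMARY KEY,
                                     CreationDate TEXT NOT NULL,
                                     CustomerID INTEGER NOT NULL)""")
# Hypothetical orders: customer 1 has gaps of 10 and 20 days,
# customer 2 a single gap of 4 days, customer 3 only one order.
conn.executemany("INSERT INTO orders (CreationDate, CustomerID) VALUES (?, ?)", [
    ("2024-01-01", 1), ("2024-01-11", 1), ("2024-01-31", 1),
    ("2024-03-01", 2), ("2024-03-05", 2),
    ("2024-05-01", 3),
])

rows = conn.execute("""
    WITH base AS (
        SELECT CustomerID, CreationDate,
               ROW_NUMBER() OVER (PARTITION BY CustomerID
                                  ORDER BY CreationDate, OrderID) AS rn
        FROM orders
    )
    -- Join each order to the same customer's next order and average the gaps.
    SELECT b1.CustomerID,
           AVG(julianday(b2.CreationDate) - julianday(b1.CreationDate)) AS avg_days
    FROM base b1
    JOIN base b2
      ON b1.CustomerID = b2.CustomerID
     AND b2.rn = b1.rn + 1
    GROUP BY b1.CustomerID
    ORDER BY b1.CustomerID
""").fetchall()

for row in rows:
    print(row)
```

Customers with a single order have no consecutive pair, so the inner join drops them from the result; a LEFT JOIN from a customer list would be needed to show them with a NULL average.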