T-SQL select distinct based on highest values from another column

I have a table that contains the tryout number, the customerID, the status of that tryout, and some other columns with various data.
A single customerID can of course have multiple tryouts (in the real table the first tryout is number 1, the second is number 2, etc.).
Example:
Customer ID = 1, tryout = 1
Customer ID = 1, tryout = 2
Customer ID = 1, tryout = 3
Customer ID = 2, tryout = 1
Customer ID = 3, tryout = 1
Customer ID = 3, tryout = 2
I want one row per distinct customerID: the row that contains the highest tryout number for that customer, with the data from all the other columns as well.
Example columns:
tryouts, customerID, status, data1, data2
How can I achieve that?

If you only want the customer ID and tryout value then you can try the following:
SELECT customerID, MAX(tryout) AS max_tryout
FROM yourTable
GROUP BY customerID
If you want the entire record, then one option would be to use ROW_NUMBER():
SELECT t.customerID, t.tryout, t.status, t.data1, t.data2
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY customerID ORDER BY tryout DESC) rn
FROM yourTable
) t
WHERE t.rn = 1
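To check the ROW_NUMBER() approach on the sample data from the question, here is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module. SQLite (3.25 or newer, for window functions) is only a stand-in for SQL Server here, and the status values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (customerID INTEGER, tryout INTEGER, status TEXT)")
# Sample rows from the question; the status values are hypothetical.
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, 1, "failed"), (1, 2, "failed"), (1, 3, "passed"),
     (2, 1, "passed"), (3, 1, "failed"), (3, 2, "pending")],
)

# Number each customer's rows from the highest tryout down, then keep row 1.
latest = conn.execute("""
    SELECT customerID, tryout, status
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY customerID ORDER BY tryout DESC) AS rn
        FROM yourTable
    ) AS t
    WHERE rn = 1
    ORDER BY customerID
""").fetchall()
print(latest)
```

Each customer appears once, with the full row that carries its highest tryout number.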

Try
SELECT
CustomerID,
MAX(tryout) AS [Max tryout]
FROM
TheTable
GROUP BY
CustomerID
This should give you what you want

Related

Finding Min and Max per Country

I'm trying to find the distributor with the highest and the lowest quantity for each country,
shown in two columns: the distributor with the minimum quantity and the distributor with the maximum quantity.
I have been able to get this information from other posts, but there it comes back in a single column, whereas I want it on one row per country.
See http://sqlfiddle.com/#!17/448f6/2
Desired result
"country" "min_qty_name" "max_qty_name"
1. "Madagascar" "Leonard Cardenas" "Gwendolyn Mccarty"
2. "Malaysia" "Arsenio Knowles" "Yael Carter"
3. "Palau" "Brittany Burris" "Clark Weaver"
4. "Tanzania" "Levi Douglas" "Levi Douglas"

You can use subqueries:
select distinct country,
(select distributor_name
from product
where country = p.country
order by quantity limit 1) as min_qty_name,
(select distributor_name
from product
where country = p.country
order by quantity desc limit 1) as max_qty_name
from product p;

You can do it with a CTE too:
WITH max_table AS
(
SELECT ROW_NUMBER() OVER (partition by country order by country,quantity DESC) AS rank,
country, quantity,distributor_name
FROM
product
),
min_table AS
(
SELECT ROW_NUMBER() OVER (partition by country order by country,quantity) AS rank,
country, quantity,distributor_name
FROM
product
)
SELECT m1.country, m2.distributor_name AS min_qty_name, m1.distributor_name AS max_qty_name
from max_table m1, min_table m2
where m1.country = m2.country
and m1.rank = 1 and m2.rank = 1

You can do this with a single sort and pass through the data as follows:
with min_max as (
select distinct country,
first_value(distributor_name) over w as min_qty_name,
last_value(distributor_name) over w as max_qty_name
from product
window w as (partition by country
order by quantity
rows between unbounded preceding
and unbounded following)
)
select *
from min_max
order by country;

Sum of one column grouped by 2nd column with groups made based on 3rd column

Data
So my data looks like:
product user_id value id
pizza 1 50 1
burger 1 30 2
pizza 2 50 3
fries 1 10 4
pizza 3 50 5
burger 1 30 6
burger 2 30 7
Problem Statement
I want to compute the lifetime value (LTV) of the customers of each product, as a metric for which products are doing well in terms of user retention.
Desired Output
My desired output is:
product value_by_customers_of_these_products total_customers ltv
pizza 250 3 250/3 = 83.33
burger 200 2 200/2 = 100
fries 120 1 120/1 = 120
Columns Description:
value_by_customers_of_these_products: total value generated by the customers of each product, including orders which do not contain the product
total_customers: simple COUNT(DISTINCT user_id) GROUP BY product
Current Workaround
Currently I am doing this:
SELECT "pizza" AS product, SUM(value) value_by_customers_of_these_products, COUNT(DISTINCT user_id) users FROM orders WHERE user_id in (SELECT user_id FROM orders WHERE product = "pizza")
UNION ALL
SELECT "burger" AS product, SUM(value) value_by_customers_of_these_products, COUNT(DISTINCT user_id) users FROM orders WHERE user_id in (SELECT user_id FROM orders WHERE product = "burger")
UNION ALL
SELECT "fries" AS product, SUM(value) value_by_customers_of_these_products, COUNT(DISTINCT user_id) users FROM orders WHERE user_id in (SELECT user_id FROM orders WHERE product = "fries")
I have a Python script that obtains the DISTINCT product names from my table, repeats the query string for each product, and updates the query from time to time. This is a real pain because I have to redo it every time a new product is launched, and the ever-growing length of the query is another issue. How can I achieve this with built-in BigQuery functions, or otherwise with minimal headache?
Code to generate Sample Data
WITH orders as (SELECT "pizza" AS product,
1 AS user_id,
50 AS value, 1 AS id,
UNION ALL SELECT "burger", 1, 30,2
UNION ALL SELECT "pizza", 2, 50,3
UNION ALL SELECT "fries", 1, 10,4
UNION ALL SELECT "pizza", 3, 50,5
UNION ALL SELECT "burger", 1, 30, 6
UNION ALL SELECT "burger", 2, 30, 7)

Use below
with user_value as (
select user_id, sum(value) values
from `project.dataset.table`
group by user_id
), product_user as (
select distinct product, user_id
from `project.dataset.table`
)
select product,
sum(values) as value_by_customers_of_these_products,
count(user_id) as total_customers,
round(sum(values) / count(user_id), 2) as ltv
from product_user
join user_value
using(user_id)
group by product
When applied to the sample data in your question, the output matches the desired result shown there.
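The numbers can be reproduced locally with a minimal sketch that runs the same two-CTE query through Python's built-in sqlite3 module instead of BigQuery. Two adjustments are assumptions of this sketch: the alias values is renamed total_value because VALUES is a keyword in SQLite, and the division is multiplied by 1.0 because SQLite divides integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, user_id INTEGER, value INTEGER, id INTEGER)")
# Sample rows as listed in the question's data table.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [("pizza", 1, 50, 1), ("burger", 1, 30, 2), ("pizza", 2, 50, 3),
     ("fries", 1, 10, 4), ("pizza", 3, 50, 5), ("burger", 1, 30, 6),
     ("burger", 2, 30, 7)],
)

result = conn.execute("""
    WITH user_value AS (          -- total spend of each user across all products
        SELECT user_id, SUM(value) AS total_value
        FROM orders
        GROUP BY user_id
    ), product_user AS (          -- one row per (product, customer) pair
        SELECT DISTINCT product, user_id
        FROM orders
    )
    SELECT product,
           SUM(total_value) AS value_by_customers_of_these_products,
           COUNT(user_id) AS total_customers,
           ROUND(1.0 * SUM(total_value) / COUNT(user_id), 2) AS ltv
    FROM product_user
    JOIN user_value USING (user_id)
    GROUP BY product
    ORDER BY product
""").fetchall()

for row in result:
    print(row)
```

Because product_user holds each customer only once per product, joining it to the per-user totals counts every customer's full spend exactly once for each product they bought.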

how to list records that conform to a sequentially incrementing id in postgres

Is there a way to select records that are sequentially incremented?
For example, for a list of records
id 0
id 1
id 3
id 4
id 5
id 8
a command like:
select id incrementally from 3
Will return values 3,4 and 5. It won't return 8 because it's not sequentially incrementing from 5.

WITH groups AS ( -- 2
SELECT
*,
id - row_number() OVER (ORDER BY id) as group_id -- 1
FROM mytable
)
SELECT
*
FROM groups
WHERE group_id = ( -- 4
SELECT group_id FROM groups WHERE id = 3 -- 3
)
1. The row_number() window function creates a consecutive row count. Subtracting it from id gives the same value for every run of consecutive records (id values that increase by 1), so the difference can be used as a group identifier.
2. This query is put into a WITH clause because its result is used twice in the next steps.
3. Select the group_id of the starting record (here id = 3).
4. Filter the table for this group.
Additionally: if you want to start your output at id = 4, for example, you need to add an AND id >= 4 filter to the WHERE clause.
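The grouping trick can be tried on the ids from the question with a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Postgres (SQLite 3.25 or newer is assumed for window functions). The CTE is named islands here, a name chosen for this sketch only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?)", [(0,), (1,), (3,), (4,), (5,), (8,)])

# id - row_number() is constant within a run of consecutive ids:
# ids 0,1 -> -1; ids 3,4,5 -> 0; id 8 -> 2.
rows = conn.execute("""
    WITH islands AS (
        SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS group_id
        FROM mytable
    )
    SELECT id
    FROM islands
    WHERE group_id = (SELECT group_id FROM islands WHERE id = 3)
    ORDER BY id
""").fetchall()

run = [r[0] for r in rows]
print(run)
```

The id 8 is left out because its difference from the row count is 2, not 0.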

Updating a column in Postgres only returning one value incorrectly (Postgres 9.5.1)

Suppose I have a table created by the following:
create table test.sales
(
customer text,
purchased text,
date_purchased date,
rownumber integer,
primary key (customer, date_purchased)
)
;
insert into test.sales
values
('kevin', 'books', '2017-01-01'::date, null),
('kevin', 'movies', '2017-01-02'::date, null),
('paul', 'books', '2017-01-05'::date, null),
('paul', 'movies', '2017-01-07'::date, null);
At this point, rownumber is always NULL and I want to set the value of rownumber with row_number() over (partition by customer order by date_purchased) as rownumber. The way I am going about this is the following:
update test.sales as a
set (rownumber) =
(
select
row_number() over (partition by customer order by date_purchased) as rownumber
from test.sales as b
-- These fields correspond to the primary keys of the table.
where a.customer = b.customer
and a.date_purchased = b.date_purchased
)
But for some reason this returned:
customer purchased date_purchased rownumber
kevin books 1/1/17 1
kevin movies 1/2/17 1
paul books 1/5/17 1
paul movies 1/7/17 1
I'm expecting this:
customer purchased date_purchased rownumber
kevin books 1/1/17 1
kevin movies 1/2/17 2
paul books 1/5/17 1
paul movies 1/7/17 2
Notice that in the actual results, rownumber is always 1. Why is this?

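You are using a correlated scalar subquery. For each row being updated, its WHERE clause narrows the subquery to that single row before row_number() is evaluated, so the result is always 1. Compute the row numbers over the whole table in a derived table and join the update to it instead: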
UPDATE test.sales dst
SET rownumber = src.rownumber
FROM ( SELECT customer,date_purchased
, row_number() over (partition by customer order by date_purchased) as rownumber
from test.sales
) src
WHERE dst.customer = src.customer
AND dst.date_purchased = src.date_purchased
;
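The difference between the two forms can be seen without running an UPDATE at all. The sketch below compares them as plain SELECTs through Python's built-in sqlite3 module; SQLite is an assumption standing in for Postgres, and the table is called sales because there is no test schema here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, purchased TEXT, date_purchased TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("kevin", "books", "2017-01-01"), ("kevin", "movies", "2017-01-02"),
     ("paul", "books", "2017-01-05"), ("paul", "movies", "2017-01-07")],
)

# Form from the question: the subquery is filtered down to one row first,
# so row_number() only ever sees a single row.
per_row = [r[0] for r in conn.execute("""
    SELECT (
        SELECT ROW_NUMBER() OVER (PARTITION BY b.customer ORDER BY b.date_purchased)
        FROM sales AS b
        WHERE b.customer = a.customer AND b.date_purchased = a.date_purchased
    )
    FROM sales AS a
    ORDER BY a.customer, a.date_purchased
""")]

# Form from the answer: number all rows first, then match on the key.
whole_table = [r[0] for r in conn.execute("""
    SELECT src.rownumber
    FROM sales AS dst
    JOIN (
        SELECT customer, date_purchased,
               ROW_NUMBER() OVER (PARTITION BY customer ORDER BY date_purchased) AS rownumber
        FROM sales
    ) AS src
      ON dst.customer = src.customer AND dst.date_purchased = src.date_purchased
    ORDER BY dst.customer, dst.date_purchased
""")]

print(per_row)
print(whole_table)
```

The first list contains only ones, while the second restarts the numbering for each customer.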

Selecting distinct substring values

I have a field that is similar to a MAC address in that the first part is a group ID and the second part is a serial number. My field is alphanumeric and 5 characters long, and the first 3 characters are the group ID.
I need a query that gives me all distinct group IDs and the first serial number lexicographically. Here is sample data:
ID
-----
X4MCC
X4MEE
X4MFF
V21DD
8Z6BB
8Z6FF
Desired Output:
ID
-----
X4MCC
V21DD
8Z6BB
I know I can do SELECT DISTINCT SUBSTRING(ID, 1, 3) but I don't know how to get the first one lexicographically.

Another way which seems to have the same cost as the query by gbn:
SELECT MIN(id)
FROM your_table
GROUP BY SUBSTRING(id, 1, 3);
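As a quick check of the MIN/GROUP BY form against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption standing in for the original database, and it uses SUBSTR in place of SUBSTRING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?)",
    [("X4MCC",), ("X4MEE",), ("X4MFF",), ("V21DD",), ("8Z6BB",), ("8Z6FF",)],
)

# One group per 3-character prefix; MIN picks the first id lexicographically.
rows = conn.execute("""
    SELECT MIN(id) AS first_id
    FROM your_table
    GROUP BY SUBSTR(id, 1, 3)
    ORDER BY first_id
""").fetchall()

first_ids = [r[0] for r in rows]
print(first_ids)
```

The result holds the same three ids as the desired output, sorted here only to make the order deterministic.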

SELECT
ID
FROM
(
SELECT
ID,
ROW_NUMBER() OVER (PARTITION BY SUBSTRING(ID, 1, 3) ORDER BY ID) AS rn
FROM MyTable
) oops
WHERE
rn = 1
