SQL: select the max from each group and give them different labels - PostgreSQL

For the following tables:
-- People
id | category | count
----+----------+-------
1 | a | 2
1 | a | 3
1 | b | 2
2 | a | 2
2 | b | 3
3 | a | 1
3 | a | 2
I know that I can find the max count for each id in each category by doing:
SELECT id, category, max(count) from People group by category, id;
With result:
id | category | max
----+----------+-------
1 | a | 3
1 | b | 2
2 | a | 2
2 | b | 3
3 | a | 2
But what if now I want to label the max values differently, like:
id | max_b_count | max_a_count
----+-------------+------------
1 | 2 | 3
2 | 3 | 2
3 | Null | 2
Should I do something like the following?
WITH t AS (SELECT id, category, max(count) from People group by category, id)
SELECT t.id, t.count as max_a_count from t where t.category = 'a'
FULL OUTER JOIN t.id, t.count as max_b_count from t where t.category = 'b'
on t.id;
It looks weird to me.

This is exactly the use case for which the FILTER clause was added to aggregate expressions.
With a FILTER clause you can restrict which rows are fed to the aggregate:
aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
Applied to your example:
SELECT id,
max(count) filter (where category = 'a') as max_a_count,
max(count) filter (where category = 'b') as max_b_count
from People
group by id
order by 1;
id|max_a_count|max_b_count|
--+-----------+-----------+
1| 3| 2|
2| 2| 3|
3| 2| |
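
Where FILTER is not available (it arrived in PostgreSQL 9.4), the same pivot can be written with a CASE expression inside the aggregate. The sketch below is only an illustration: it runs the query against an in-memory SQLite database through Python's sqlite3 module, which is an assumption made to keep the example self-contained, and it uses the sample rows from the question. The SQL itself should carry over to PostgreSQL unchanged.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE People (id INTEGER, category TEXT, "count" INTEGER)')
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [(1, "a", 2), (1, "a", 3), (1, "b", 2), (2, "a", 2),
     (2, "b", 3), (3, "a", 1), (3, "a", 2)],
)

# CASE yields NULL for rows of the other category, and max() ignores NULLs,
# so each aggregate only sees the rows of its own category.
pivot_sql = """
SELECT id,
       max(CASE WHEN category = 'a' THEN "count" END) AS max_a_count,
       max(CASE WHEN category = 'b' THEN "count" END) AS max_b_count
FROM People
GROUP BY id
ORDER BY id
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
print(pivot_rows)  # [(1, 3, 2), (2, 2, 3), (3, 2, None)]
```

The missing category for id 3 comes back as NULL (None in Python), matching the desired output in the question.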

This is one way you can do it:
with T as (select id, category, max(count_ab) maks
from people
group by id, category
order by id)
select t3.id
, (select t1.maks from T t1 where category = 'b' and t1.id = t3.id) max_b_count
, (select t2.maks from T t2 where category = 'a' and t2.id = t3.id) max_a_count
from T t3
group by t3.id
order by t3.id
Also, as you can see, I have renamed the column count to count_ab, because using keywords as column names is not good practice.

SELECT DISTINCT on a ordered subquery's table

I'm working on a problem, involving these two tables.
books
isbn | title | author
------------+-----------------------------------------+------------------
1840918626 | Hogwarts: A History | Bathilda Bagshot
3458400871 | Fantastic Beasts and Where to Find Them | Newt Scamander
9136884926 | Advanced Potion-Making | Libatius Borage
transactions
id | patron_id | isbn | checked_out_date | checked_in_date
----+-----------+------------+------------------+-----------------
1 | 1 | 1840918626 | 2012-05-05 | 2012-05-06
2 | 4 | 9136884926 | 2012-05-05 | 2012-05-06
3 | 2 | 3458400871 | 2012-05-05 | 2012-05-06
4 | 3 | 3458400871 | 2018-04-29 | 2018-05-02
5 | 2 | 9136884926 | 2018-05-03 | NULL
6 | 1 | 3458400871 | 2018-05-03 | 2018-05-05
7 | 5 | 3458400871 | 2018-05-05 | NULL
The task is: "Make a list of all book titles and denote whether or not a copy of that book is checked out." So the result is essentially the first table plus a checked_out column.
I'm trying to SELECT DISTINCT on a subquery that lists the checked-out books first, but that doesn't work. I've researched this, and others suggest using a GROUP BY clause instead of DISTINCT, but the examples they provide are single-column queries, and they stop working once more columns are added.
This is my closest attempt:
SELECT DISTINCT ON (title)
title, checked_out
FROM(
SELECT b.title, t.checked_in_date IS NULL AS checked_out
FROM transactions t
natural join books b
ORDER BY checked_out DESC
) t;
Or you can join only the transactions where the book has not been checked in yet:
SELECT b.title, t.isbn IS NOT NULL AS checked_out
, t.checked_out_date
FROM books b
LEFT JOIN transactions t ON t.isbn = b.isbn AND t.checked_in_date IS NULL
ORDER BY checked_out DESC
I adjusted your attempt a little. Basically, I changed the way your data is joined:
SELECT DISTINCT ON (title)
title, checked_out
FROM(
SELECT b.title, t.checked_in_date IS NULL AS checked_out
FROM books b
LEFT OUTER JOIN transactions t USING (isbn)
ORDER BY checked_out DESC
) t;
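
One thing to keep in mind with DISTINCT ON: the PostgreSQL documentation describes the "first row" of each group as unpredictable unless ORDER BY is used to put the desired row first, so an ordering that lives only inside a subquery is not something I would rely on. An alternative is to aggregate instead. The sketch below is an illustration under assumptions, not part of the answers above: it uses Python's sqlite3 with an in-memory database as a stand-in for PostgreSQL, joins only the open transactions, and counts them per book.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT, author TEXT);
CREATE TABLE transactions (
    id INTEGER PRIMARY KEY, patron_id INTEGER, isbn TEXT,
    checked_out_date TEXT, checked_in_date TEXT);
INSERT INTO books VALUES
    ('1840918626', 'Hogwarts: A History', 'Bathilda Bagshot'),
    ('3458400871', 'Fantastic Beasts and Where to Find Them', 'Newt Scamander'),
    ('9136884926', 'Advanced Potion-Making', 'Libatius Borage');
INSERT INTO transactions VALUES
    (1, 1, '1840918626', '2012-05-05', '2012-05-06'),
    (2, 4, '9136884926', '2012-05-05', '2012-05-06'),
    (3, 2, '3458400871', '2012-05-05', '2012-05-06'),
    (4, 3, '3458400871', '2018-04-29', '2018-05-02'),
    (5, 2, '9136884926', '2018-05-03', NULL),
    (6, 1, '3458400871', '2018-05-03', '2018-05-05'),
    (7, 5, '3458400871', '2018-05-05', NULL);
""")

# Join only the transactions that are still open, then count them per book.
# A book with no open transaction gets a count of 0, i.e. not checked out.
status_sql = """
SELECT b.title, count(t.id) > 0 AS checked_out
FROM books b
LEFT JOIN transactions t
       ON t.isbn = b.isbn AND t.checked_in_date IS NULL
GROUP BY b.isbn, b.title
ORDER BY b.title
"""
# SQLite returns 1/0 for the comparison, so convert it to a Python bool.
book_status = [(title, bool(flag)) for title, flag in conn.execute(status_sql)]
print(book_status)
```

Because the join condition already filters to open transactions, a book that was never borrowed also comes out as not checked out, and each title appears exactly once without DISTINCT.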

Postgresql summing duplicate elements

The table can contain two rows that carry the same information, with only a single column value differing. Basically, the data is duplicated because of this one column. Can I somehow sum the other columns in a way that takes this duplication into account?
To illustrate the problem:
Example:
|id|type|val1|val2|
|1 | 2 | 1 | 1 |
|1 | 3 | 1 | 1 |
|1 | 2 | 2 | 2 |
|1 | 3 | 2 | 2 |
Expected result
|id|type|val1|val2|count|
|1 |2,3 | 3 | 3 | 2 |
Actual result
|id|type|val1|val2|count|
|1 |2,3 | 6 | 6 | 4 |
In the actual data, type and val come from two different tables connected through a third table, so the query looks like this:
SELECT id,
array_to_string(array_agg(DISTINCT x.type ORDER BY x.type), ','::text) AS type,
sum(y.val1) AS val1,
sum(y.val2) AS val2,
count(y.val1) AS count
FROM a
JOIN x ON x.a_id = a.id AND x.active = true
JOIN y ON y.a_id = a.id AND y.active = true
GROUP BY a.id
SOLUTION
SELECT id,
array_to_string(array_agg(DISTINCT x.type ORDER BY x.type), ','::text) AS type,
sum(distinct y.val1) AS val1,
sum(distinct y.val2) AS val2,
count(distinct y.val1) AS count
FROM a
JOIN x ON x.a_id = a.id AND x.active = true
JOIN y ON y.a_id = a.id AND y.active = true
GROUP BY a.id
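
One caveat with sum(DISTINCT ...): it also collapses rows of y that legitimately hold the same value, so two separate y rows with val1 = 1 would contribute only 1 in total. A more robust pattern is to aggregate x and y separately and then join the results. The sketch below shows the difference. It is an illustration under assumptions: Python's sqlite3 stands in for PostgreSQL, active is stored as an integer, the types are simply counted instead of concatenated, and the y rows are made up so that both hold the same values.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE x (a_id INTEGER, type INTEGER, active INTEGER);
CREATE TABLE y (a_id INTEGER, val1 INTEGER, val2 INTEGER, active INTEGER);
INSERT INTO a VALUES (1);
INSERT INTO x VALUES (1, 2, 1), (1, 3, 1);
-- Hypothetical data: two separate y rows that happen to hold equal values.
INSERT INTO y VALUES (1, 1, 1, 1), (1, 1, 1, 1);
""")

# sum(DISTINCT ...) after the join: the two equal y rows collapse into one.
distinct_row = conn.execute("""
SELECT a.id, sum(DISTINCT y.val1), count(DISTINCT y.val1)
FROM a
JOIN x ON x.a_id = a.id AND x.active = 1
JOIN y ON y.a_id = a.id AND y.active = 1
GROUP BY a.id
""").fetchone()

# Aggregating y on its own, before joining, never sees the fan-out from x.
preagg_row = conn.execute("""
SELECT a.id, xs.type_count, ys.val1, ys.val2, ys.cnt
FROM a
JOIN (SELECT a_id, count(DISTINCT type) AS type_count
      FROM x WHERE active = 1 GROUP BY a_id) xs ON xs.a_id = a.id
JOIN (SELECT a_id, sum(val1) AS val1, sum(val2) AS val2, count(val1) AS cnt
      FROM y WHERE active = 1 GROUP BY a_id) ys ON ys.a_id = a.id
""").fetchone()

print(distinct_row)  # (1, 1, 1): undercounts the two y rows
print(preagg_row)    # (1, 2, 2, 2, 2): both y rows are counted
```

On PostgreSQL, the xs subquery could keep the array_to_string(array_agg(...)) expression from the question in place of the count.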

Grouping by unique values inside a JSONB array

Consider the following table structure:
CREATE TABLE residences (id int, price int, categories jsonb);
INSERT INTO residences VALUES
(1, 3, '["monkeys", "hamsters", "foxes"]'),
(2, 5, '["monkeys", "hamsters", "foxes", "foxes"]'),
(3, 7, '[]'),
(4, 11, '["turtles"]');
SELECT * FROM residences;
id | price | categories
----+-------+-------------------------------------------
1 | 3 | ["monkeys", "hamsters", "foxes"]
2 | 5 | ["monkeys", "hamsters", "foxes", "foxes"]
3 | 7 | []
4 | 11 | ["turtles"]
Now I would like to know how many residences there are for each category, as well as the sum of their prices. The only way I found to do this was with a sub-query:
SELECT category, SUM(price), COUNT(*) AS residences_no
FROM
residences a,
(
SELECT DISTINCT(jsonb_array_elements(categories)) AS category
FROM residences
) b
WHERE a.categories @> category
GROUP BY category
ORDER BY category;
category | sum | residences_no
------------+-----+---------------
"foxes" | 8 | 2
"hamsters" | 8 | 2
"monkeys" | 8 | 2
"turtles" | 11 | 1
Using jsonb_array_elements without subquery would return three residences for foxes because of the duplicate entry in the second row. Also the price of the residence would be inflated by 5.
Is there any way to do this without using the sub-query, or any better way to accomplish this result?
EDIT
Initially I did not mention the price column.
select category, count(distinct (id, category))
from residences, jsonb_array_elements(categories) category
group by category
order by category;
category | count
------------+-------
"foxes" | 2
"hamsters" | 2
"monkeys" | 2
"turtles" | 1
(4 rows)
To aggregate another column as well, you have to use a derived table (the output below was produced with every price set to 10):
select category, count(*), sum(price) total
from (
select distinct id, category, price
from residences, jsonb_array_elements(categories) category
) s
group by category
order by category;
category | count | total
------------+-------+-------
"foxes" | 2 | 20
"hamsters" | 2 | 20
"monkeys" | 2 | 20
"turtles" | 1 | 10
(4 rows)
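
To tie the derived-table approach back to the prices given in the question (3, 5, 7, 11), here is a small runnable sketch. It is an illustration under assumptions: Python's sqlite3 stands in for PostgreSQL, and SQLite's json_each (which needs a SQLite build with JSON support) plays the role of jsonb_array_elements.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE residences (id INTEGER, price INTEGER, categories TEXT)")
conn.executemany(
    "INSERT INTO residences VALUES (?, ?, ?)",
    [(1, 3, '["monkeys", "hamsters", "foxes"]'),
     (2, 5, '["monkeys", "hamsters", "foxes", "foxes"]'),
     (3, 7, '[]'),
     (4, 11, '["turtles"]')],
)

# The inner DISTINCT keeps one row per (residence, category), so the
# duplicate "foxes" entry of residence 2 is counted and summed only once.
per_category = conn.execute("""
SELECT category, count(*) AS residences_no, sum(price) AS total
FROM (
    SELECT DISTINCT r.id, j.value AS category, r.price
    FROM residences r, json_each(r.categories) j
) s
GROUP BY category
ORDER BY category
""").fetchall()
print(per_category)
```

With the question's data this gives a total of 8 for foxes, hamsters and monkeys and 11 for turtles, the same figures as the sub-query in the question.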

How to eliminate repeated field with GROUP BY clause?

I have 3 tables called:
1. app_tenant (pk: id, fk: pasar_id)
id | nama | pasar_id
----+------+----------
1 | joe | 1
2 | adi | 2
3 | adam | 3
2. app_pasar (pk: id)
id | nama
----+--------------
1 | kosambi
2 | gede bage
3 | pasar minggu
3. app_kios (pk: id, fk: tenant_id)
id | nama | tenant_id
----+-------+-----------
1 | kios1 | 1
2 | kios2 | 2
3 | kios3 | 3
4 | kios4 | 1
5 | kios5 | 1
6 | kios6 | 2
7 | kios7 | 2
8 | kios8 | 3
9 | kios9 | 3
Then, with a LEFT JOIN query grouped by the id of every table, I want to display data like this:
id | nama_tenant | nama_pasar | nama_kios
----+-------------+--------------+-----------
1 | joe | kosambi | kios1
2 | adi | gede bage | kios2
3 | adam | pasar minggu | kios3
But when I execute this query, the data is not shown as expected. The problem is redundancy in the nama_tenant field. How can I eliminate the repeated nama_tenant records?
This is my query:
select a.id,a.nama as nama_tenant,
b.nama as nama_pasar,
c.nama as nama_kios
from app_tenant a
left join app_pasar b on a.id=b.id
left join app_kios c on a.id= c.tenant_id
group by
a.id,
b.id,
c.id
Table definitions:
CREATE TABLE app_tenant (
id serial PRIMARY KEY,
nama character varying,
pasar_id integer);
CREATE TABLE app_kios (
id serial PRIMARY KEY,
nama character varying,
tenant_id integer REFERENCES app_tenant);
The problem is that tenants can have multiple kiosks. From your sample data it looks like you want to display the first kiosk of every tenant (although "first" is a vague concept on strings, here I use alphabetical sort order). Your query would be like this:
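SELECT t.id, t.nama AS nama_tenant, p.nama AS nama_pasar, k.nama AS nama_kios
FROM app_tenant t
LEFT JOIN app_pasar p ON p.id = t.pasar_id
LEFT JOIN (
-- rnk cannot be filtered at the query level that computes it,
-- so the window function is wrapped in an inner sub-query
SELECT tenant_id, nama
FROM (
SELECT tenant_id, nama, rank() OVER (PARTITION BY tenant_id ORDER BY nama) AS rnk
FROM app_kios) ranked
WHERE rnk = 1) k ON k.tenant_id = t.id
ORDER BY t.id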
The sub-query on app_kios uses a window function to pick the first kiosk name for each tenant, after sorting that tenant's kiosk names.
I would also suggest using meaningful aliases for table names instead of simply a, b, c.
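
Since "first" here just means alphabetically smallest, a plain min() per tenant should give the same rows without a window function, and it always returns exactly one kiosk per tenant even when names tie. The sketch below is an illustration under assumptions: Python's sqlite3 stands in for PostgreSQL, and the rows are the sample data from the question.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_pasar (id INTEGER PRIMARY KEY, nama TEXT);
CREATE TABLE app_tenant (id INTEGER PRIMARY KEY, nama TEXT, pasar_id INTEGER);
CREATE TABLE app_kios (id INTEGER PRIMARY KEY, nama TEXT, tenant_id INTEGER);
INSERT INTO app_pasar VALUES (1, 'kosambi'), (2, 'gede bage'), (3, 'pasar minggu');
INSERT INTO app_tenant VALUES (1, 'joe', 1), (2, 'adi', 2), (3, 'adam', 3);
INSERT INTO app_kios VALUES
    (1, 'kios1', 1), (2, 'kios2', 2), (3, 'kios3', 3),
    (4, 'kios4', 1), (5, 'kios5', 1), (6, 'kios6', 2),
    (7, 'kios7', 2), (8, 'kios8', 3), (9, 'kios9', 3);
""")

# The derived table collapses each tenant's kiosks to the alphabetically
# first name before the join, so every tenant appears exactly once.
first_kios = conn.execute("""
SELECT t.id, t.nama AS nama_tenant, p.nama AS nama_pasar, k.nama AS nama_kios
FROM app_tenant t
LEFT JOIN app_pasar p ON p.id = t.pasar_id
LEFT JOIN (
    SELECT tenant_id, min(nama) AS nama
    FROM app_kios
    GROUP BY tenant_id
) k ON k.tenant_id = t.id
ORDER BY t.id
""").fetchall()
print(first_kios)
```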

SQL Server recursive query

I have a table in SQL Server 2008 R2 which contains product orders. For the most part, it is one entry per product:
ID | Prod | Qty
------------
1 | A | 1
4 | B | 1
7 | A | 1
8 | A | 1
9 | A | 1
12 | C | 1
15 | A | 1
16 | A | 1
21 | B | 1
I want to create a view based on the table which looks like this
ID | Prod | Qty
------------------
1 | A | 1
4 | B | 1
9 | A | 3
12 | C | 1
16 | A | 2
21 | B | 1
I've written a query using a table expression, but I am stumped on how to make it work. The sql below does not actually work, but is a sample of what I am trying to do. I've written this query multiple different ways, but cannot figure out how to get the right results. I am using row_number to generate a sequential id. From that, I can order and compare consecutive rows to see if the next row has the same product as the previous row since ReleaseId is sequential, but not necessarily contiguous.
;with myData AS
(
SELECT
row_number() over (order by a.ReleaseId) as 'Item',
a.ReleaseId,
a.ProductId,
a.Qty
FROM OrdersReleased a
UNION ALL
SELECT
row_number() over (order by b.ReleaseId) as 'Item',
b.ReleaseId,
b.ProductId,
b.Qty
FROM OrdersReleased b
INNER JOIN myData c ON b.Item = c.Item + 1 and b.ProductId = c.ProductId
)
SELECT * from myData
Usually you drop the ID out of something like this, since it is a summary.
SELECT a.ProductId,
SUM(a.Qty) AS Qty
FROM OrdersReleased a
GROUP BY a.ProductId
ORDER BY a.ProductId
-- If you also want to count back-to-back rows, a self-join does it
-- (assuming ReleaseId is unique and the dataset is not very large).
-- A correlated sub-query on a.ReleaseId is not allowed here, because
-- ReleaseId is not part of the GROUP BY.
SELECT a.ProductId,
SUM(a.Qty) AS Qty,
COUNT(b.ReleaseId) AS NumberBackToBack
FROM OrdersReleased a
LEFT JOIN OrdersReleased b
ON b.ReleaseId - 1 = a.ReleaseId
AND b.ProductId = a.ProductId
GROUP BY a.ProductId
ORDER BY a.ProductId
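
The result the question asks for, collapsing runs of consecutive rows with the same product, is usually called a gaps-and-islands problem. One common technique, sketched below, subtracts a per-product row number from a global row number: the difference stays constant within a run. This is an illustration under assumptions and not the method of the answer above: it runs on Python's sqlite3 (SQLite 3.25 or later is needed for window functions), and the column names follow the question's query. ROW_NUMBER() OVER is also available in SQL Server 2008 R2, so the query should carry over.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrdersReleased (ReleaseId INTEGER, ProductId TEXT, Qty INTEGER)"
)
conn.executemany(
    "INSERT INTO OrdersReleased VALUES (?, ?, ?)",
    [(1, "A", 1), (4, "B", 1), (7, "A", 1), (8, "A", 1), (9, "A", 1),
     (12, "C", 1), (15, "A", 1), (16, "A", 1), (21, "B", 1)],
)

# The first ROW_NUMBER numbers every row; the second numbers rows within
# one product. Inside a run of the same product both advance together, so
# their difference (grp) is constant for the run and changes between runs.
islands = conn.execute("""
WITH numbered AS (
    SELECT ReleaseId, ProductId, Qty,
           ROW_NUMBER() OVER (ORDER BY ReleaseId)
         - ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY ReleaseId) AS grp
    FROM OrdersReleased
)
SELECT MAX(ReleaseId) AS ID, ProductId AS Prod, SUM(Qty) AS Qty
FROM numbered
GROUP BY ProductId, grp
ORDER BY ID
""").fetchall()
print(islands)
```

Each run is reported under its last ReleaseId, which is how the desired view in the question labels the rows (9 for the first run of A, 16 for the second).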