How to eliminate repeated field with GROUP BY clause? - postgresql

I have 3 tables:
1. app_tenant (pk: id, fk: pasar_id)
 id | nama | pasar_id
----+------+----------
  1 | joe  |        1
  2 | adi  |        2
  3 | adam |        3
2. app_pasar (pk: id)
 id | nama
----+--------------
  1 | kosambi
  2 | gede bage
  3 | pasar minggu
3. app_kios (pk: id, fk: tenant_id)
 id | nama  | tenant_id
----+-------+-----------
  1 | kios1 |         1
  2 | kios2 |         2
  3 | kios3 |         3
  4 | kios4 |         1
  5 | kios5 |         1
  6 | kios6 |         2
  7 | kios7 |         2
  8 | kios8 |         3
  9 | kios9 |         3
Using a LEFT JOIN query, grouped by the id of every table, I want to display the data like this:
 id | nama_tenant | nama_pasar   | nama_kios
----+-------------+--------------+-----------
  1 | joe         | kosambi      | kios1
  2 | adi         | gede bage    | kios2
  3 | adam        | pasar minggu | kios3
But when I execute the query, the data is not shown as expected: the nama_tenant values are repeated. How can I eliminate the repeated nama_tenant records?
This is my query:
select a.id,a.nama as nama_tenant,
b.nama as nama_pasar,
c.nama as nama_kios
from app_tenant a
left join app_pasar b on a.id=b.id
left join app_kios c on a.id= c.tenant_id
group by
a.id,
b.id,
c.id
Table definitions:
CREATE TABLE app_tenant (
id serial PRIMARY KEY,
nama character varying,
pasar_id integer);
CREATE TABLE app_kios (
id serial PRIMARY KEY,
nama character varying,
tenant_id integer REFERENCES app_tenant);

The problem is that tenants can have multiple kiosks. From your sample data it looks like you want to display the first kiosk of every tenant (although "first" is a vague concept on strings, here I use alphabetical sort order). Your query would be like this:
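SELECT t.id, t.nama AS nama_tenant, p.nama AS nama_pasar, k.nama AS nama_kios
FROM app_tenant t
LEFT JOIN app_pasar p ON p.id = t.pasar_id
LEFT JOIN (
  -- number each tenant's kiosks alphabetically by name
  SELECT tenant_id, nama, row_number() OVER (PARTITION BY tenant_id ORDER BY nama) AS rnk
  FROM app_kios) k ON k.tenant_id = t.id AND k.rnk = 1
ORDER BY t.id
The sub-query on app_kios uses a window function to number the kiosks of each tenant in name order; the join condition then keeps only the first one (rnk = 1). The filter has to sit outside the sub-query, because the result of a window function cannot be referenced in the WHERE clause of the query level that computes it. row_number() is used rather than rank() so that two kiosks with the same name cannot both be numbered 1.
I would also suggest using meaningful aliases for table names instead of simply a, b, c.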
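As an alternative to the window function, PostgreSQL's DISTINCT ON can pick one row per tenant directly. The following is a minimal sketch against the tables from the question, again assuming that "first" means first in alphabetical order of the kiosk name:

```sql
-- DISTINCT ON (t.id) keeps only the first row for each tenant;
-- which row counts as "first" is decided by the ORDER BY below.
SELECT DISTINCT ON (t.id)
       t.id,
       t.nama AS nama_tenant,
       p.nama AS nama_pasar,
       k.nama AS nama_kios
FROM app_tenant t
LEFT JOIN app_pasar p ON p.id = t.pasar_id
LEFT JOIN app_kios  k ON k.tenant_id = t.id
-- The leading ORDER BY expression must match the DISTINCT ON expression.
ORDER BY t.id, k.nama;
```

If "first" should instead mean the kiosk with the lowest id, replacing k.nama with k.id in the ORDER BY should be enough. Note that DISTINCT ON is a PostgreSQL extension, so this form is not portable to other databases.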

Related

SQL Join multiple table without repetition

I've got 3 tables
Table A
----------------------
| ID| Data1 | Data2 |
---------------------
| 1 |John | 2021 |
| 2 |Steve | 2020 |
Table B
----------------------
|Row|ID|Value1|Value2|
----------------------
|1 |1 |iR3000|0.5 |
|2 |1 |iRC252|0.7 |
|3 |2 |Dr2000|0.4 |
Table C
----------------------
|Row|ID|Value3|Value4|
----------------------
|1 |1 |aaaaaa|12345 |
|2 |1 |bbbbbb|6789 |
My goal is to get a result like this:
-------------------------------------------------
| ID| Data1 | Data2 |Value1|Value2|Value3|Value4|
-------------------------------------------------
| 1 |John | 2021 |iR3000|0.5 |aaaaaa|12345 |
| 1 |John | 2021 |iRC252|0.7 |bbbbbb|6789 |
| 2 |Steve | 2020 |Dr2000|0.4 |null |null |
With my current query, ID 1 is duplicated 4 times.
Here is my query:
SELECT
a.id, a.data1,a.data2
,b.value1, b.value2
,c.value3,c.value4
FROM TableA a
JOIN TableB b
ON b.ID=a.ID
JOIN TableC c
ON c.ID=a.ID
What you had was close; only the JOIN to TableC was wrong. It needs to be an OUTER JOIN and also match on the Row column:
SELECT a.ID, a.Data1, a.Data2, b.Value1, b.Value2, c.Value3, c.Value4
FROM TableA a
INNER JOIN TableB b on b.ID = a.ID
LEFT JOIN TableC c on c.ID = b.ID AND c.Row = b.Row
Update based on the comment:
"I cannot use the Row column because its values do not always match between the tables."
Okay. If the Row column at least exists, we can still work with that to create projections that might be more consistent between tables:
With TableB2 AS (
SELECT *, row_number() over (partition by ID order by row) As Row2
FROM TableB
),
TableC2 As (
SELECT *, row_number() over (partition by ID order by row) As Row2
FROM TableC
)
SELECT a.ID, a.Data1, a.Data2, b.Value1, b.Value2, c.Value3, c.Value4
FROM TableA a
INNER JOIN TableB2 b on b.ID = a.ID
LEFT JOIN TableC2 c on c.ID = b.ID AND c.Row2 = b.Row2
What we cannot do is rely on the order of the records on disk or the insertion order. There MUST be some field that indicates, for example, that the iR3000 row in TableB relates to the aaaaaa row in TableC rather than the bbbbbb row.
The order in which records appear in the table is not good enough. Databases are based on relational set theory, so what we think of as "tables" are more formally defined as "unordered relations". Note the word "unordered" in that definition. While table order may seem stable over stretches, databases are free to re-order the rows on disk after insertion. They can and will do this to make queries more efficient, conform better with indexes, fill up pages, etc.

SELECT DISTINCT on a ordered subquery's table

I'm working on a problem, involving these two tables.
books
isbn | title | author
------------+-----------------------------------------+------------------
1840918626 | Hogwarts: A History | Bathilda Bagshot
3458400871 | Fantastic Beasts and Where to Find Them | Newt Scamander
9136884926 | Advanced Potion-Making | Libatius Borage
transactions
id | patron_id | isbn | checked_out_date | checked_in_date
----+-----------+------------+------------------+-----------------
1 | 1 | 1840918626 | 2012-05-05 | 2012-05-06
2 | 4 | 9136884926 | 2012-05-05 | 2012-05-06
3 | 2 | 3458400871 | 2012-05-05 | 2012-05-06
4 | 3 | 3458400871 | 2018-04-29 | 2018-05-02
5 | 2 | 9136884926 | 2018-05-03 | NULL
6 | 1 | 3458400871 | 2018-05-03 | 2018-05-05
7 | 5 | 3458400871 | 2018-05-05 | NULL
The task is: "Make a list of all book titles and denote whether or not a copy of that book is checked out." So the result is essentially the first table with a checked_out column added.
I'm trying to SELECT DISTINCT on a subquery that sorts the checked-out books first, but that doesn't work. From what I've read, others suggest using a GROUP BY clause instead of DISTINCT, but the examples they provide are single-column queries, and they stop working when more columns are added.
This is my closest attempt:
SELECT DISTINCT ON (title)
title, checked_out
FROM(
SELECT b.title, t.checked_in_date IS NULL AS checked_out
FROM transactions t
natural join books b
ORDER BY checked_out DESC
) t;
You can join only the transactions for books that have not been checked in yet:
SELECT b.title, t.isbn IS NOT NULL AS checked_out
, t.checked_out_date
FROM books b
LEFT JOIN transactions t ON t.isbn = b.isbn AND t.checked_in_date IS NULL
ORDER BY checked_out DESC
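I adjusted your attempt a little. The main change is the join: starting from books with a LEFT JOIN keeps titles that have no transactions at all. The ORDER BY also moves to the outer query, because DISTINCT ON keeps the first row per title according to the ordering of the query it belongs to, and the ordering of a subquery is not guaranteed to carry over. The extra t.id IS NOT NULL check prevents a book that was never borrowed from being reported as checked out.
SELECT DISTINCT ON (title)
title, checked_out
FROM (
SELECT b.title, (t.id IS NOT NULL AND t.checked_in_date IS NULL) AS checked_out
FROM books b
LEFT OUTER JOIN transactions t USING (isbn)
) t
ORDER BY title, checked_out DESC;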
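Since the question only asks whether any copy is currently out, the same result can also be expressed without DISTINCT ON. This is a sketch using EXISTS against the books and transactions tables from the question; it assumes, as the sample data suggests, that a NULL checked_in_date means the copy has not been returned:

```sql
SELECT b.title,
       EXISTS (
           SELECT 1
           FROM transactions t
           WHERE t.isbn = b.isbn
             AND t.checked_in_date IS NULL   -- no check-in recorded yet
       ) AS checked_out
FROM books b
ORDER BY checked_out DESC, b.title;
```

Each book produces exactly one row, so no de-duplication is needed, and a book with no transactions simply gets checked_out = false.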

How to order rows with linked parts in PostgreSQL

I have a table A with columns: id, title, condition.
I also have a table B that stores position information for some of the rows in table A. Table B has the columns id, next_id, prev_id.
How can I sort the rows from A based on the information in table B?
For example,
Table A
id| title
---+-----
1 | title1
2 | title2
3 | title3
4 | title4
5 | title5
Table B
id| next_id | prev_id
---+-----
2 | 1 | null
5 | 4 | 3
I want to get this result:
id| title
---+-----
2 | title2
1 | title1
3 | title3
5 | title5
4 | title4
After applying this sort, I also want to sort by the condition column.
I've already spent a lot of time looking for a solution and would appreciate your help.
You have to add weights to your data so that you can order by them. This example uses only next_id; I'm not sure whether you also need prev_id, since you don't explain how it is used.
Anyway, here's a code example:
-- Temporary data for the test:
CREATE TEMP TABLE table_a(id integer, title text);
CREATE TEMP TABLE table_b(id integer, next_id integer, prev_id integer);
INSERT INTO table_a VALUES
(1,'title1'),
(2,'title2'),
(3,'title3'),
(4,'title4'),
(5,'title5');
INSERT INTO table_b VALUES
(2,1,null),
(5,4,3);
-- QUERY:
SELECT
id, title,
CASE -- Adding weight
WHEN next_id IS NULL THEN (id + 0.1) -- no link: sort by own id, just after a row that points at it
ELSE next_id -- linked: sort just before the row it points to
END AS orden
FROM -- Joining tables
(SELECT ta.*, tb.next_id
FROM table_a ta
LEFT JOIN table_b tb
ON ta.id = tb.id) join_a_b
ORDER BY orden
And here's the result:
 id | title  | orden
----+--------+-------
  2 | title2 |     1
  1 | title1 |   1.1
  3 | title3 |   3.1
  5 | title5 |     4
  4 | title4 |   4.1
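The same weighting can be written more compactly by putting the weight straight into ORDER BY. The sketch below is self-contained: the CTEs a and b are stand-ins for the question's tables A and B, filled with the sample rows, and the trailing a.id is an added tie-breaker that is not part of the original answer:

```sql
WITH a(id, title) AS (
    VALUES (1, 'title1'), (2, 'title2'), (3, 'title3'),
           (4, 'title4'), (5, 'title5')
),
b(id, next_id, prev_id) AS (
    VALUES (2, 1, NULL::integer),
           (5, 4, 3)
)
SELECT a.id, a.title
FROM a
LEFT JOIN b ON b.id = a.id
-- Rows with a link sort at next_id; all others at id + 0.1,
-- which places them just after a row that points at them.
ORDER BY COALESCE(b.next_id, a.id + 0.1), a.id;
```

With the sample data this yields the ids in the order 2, 1, 3, 5, 4. To apply the secondary sort the question asks for, the condition column could presumably be appended to the ORDER BY list, although it only takes effect among rows whose weights are equal.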

Postgresql summing duplicate elements

The table can contain two rows that carry the same information and differ in only a single column, so the data is effectively duplicated because of that one column. Can I somehow sum the other columns in a way that takes this duplication into account?
To illustrate the problem:
Example:
|id|type|val1|val2|
|1 | 2 | 1 | 1 |
|1 | 3 | 1 | 1 |
|1 | 2 | 2 | 2 |
|1 | 3 | 2 | 2 |
Expected result
|id|type|val1|val2|count|
|1 |2,3 | 3 | 3 | 2 |
Actual result
|id|type|val1|val2|count|
|1 |2,3 | 6 | 6 | 4 |
In the actual data, type and val come from 2 different tables that are connected through a third table, so the query looks like this:
SELECT id,
array_to_string(array_agg(DISTINCT x.type ORDER BY x.type), ','::text) AS type,
sum(y.val1) AS val1,
sum(y.val2) AS val2,
count(y.val1) AS count
FROM a
JOIN x ON x.a_id = a.id AND x.active = true
JOIN y ON y.a_id = a.id AND y.active = true
GROUP BY a.id
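Solution: apply DISTINCT inside the aggregates. Note that this only gives the right totals when the values being summed are themselves distinct within each group; two genuinely separate rows that happen to share the same val1 would be counted only once.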
SELECT id,
array_to_string(array_agg(DISTINCT x.type ORDER BY x.type), ','::text) AS type,
sum(distinct y.val1) AS val1,
sum(distinct y.val2) AS val2,
count(distinct y.val1) AS count
FROM a
JOIN x ON x.a_id = a.id AND x.active = true
JOIN y ON y.a_id = a.id AND y.active = true
GROUP BY a.id
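A variant that does not depend on the values being distinct is to aggregate each child table on its own before joining, so the rows of x can no longer multiply the rows of y. This is only a sketch built on the table and column names from the query above; the aliases x_agg, y_agg, types and cnt are made up for the example:

```sql
SELECT a.id,
       x_agg.types AS type,
       y_agg.val1,
       y_agg.val2,
       y_agg.cnt   AS count
FROM a
JOIN (
    -- one row per a_id: the comma-separated list of active types
    SELECT x.a_id,
           array_to_string(array_agg(DISTINCT x.type ORDER BY x.type), ',') AS types
    FROM x
    WHERE x.active
    GROUP BY x.a_id
) x_agg ON x_agg.a_id = a.id
JOIN (
    -- one row per a_id: totals over the active y rows, each counted once
    SELECT y.a_id,
           sum(y.val1)   AS val1,
           sum(y.val2)   AS val2,
           count(y.val1) AS cnt
    FROM y
    WHERE y.active
    GROUP BY y.a_id
) y_agg ON y_agg.a_id = a.id;
```

Because both sub-queries return at most one row per a_id, the outer query needs no GROUP BY, and equal values in different y rows are still summed separately.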

postgresql Recycle ID numbers

I have a table as follows
|GroupID | UserID |
--------------------
|1 | 1 |
|1 | 2 |
|1 | 3 |
|2 | 1 |
|2 | 2 |
|3 | 20 |
|3 | 30 |
|5 | 200 |
|5 | 100 |
Basically what this does is create a "group" which user IDs get associated with, so when I wish to request members of a group I can call on the table.
Users have the option of leaving a group, and creating a new one.
When all users have left a group, there's no longer that groupID in my table.
Let's pretend this is for a chat application where users might close and open chats constantly. The group IDs will add up very quickly, but the number of chats will realistically not reach millions of chats with hundreds of users.
I'd like to recycle the group ID numbers, so that when I go to insert a new record, group 4 gets assigned if it is unused (as is the case above).
There are good reasons not to do this, but it's pretty straightforward in PostgreSQL. The technique--using generate_series() to find gaps in a sequence--is useful in other contexts, too.
WITH group_id_range AS (
SELECT generate_series((SELECT MIN(group_id) FROM groups),
(SELECT MAX(group_id) FROM groups)) group_id
)
SELECT min(gir.group_id)
FROM group_id_range gir
LEFT JOIN groups g ON (gir.group_id = g.group_id)
WHERE g.group_id IS NULL;
That query will return NULL if there are no gaps or if there are no rows at all in the table "groups". If you want to use this to return the next group id number regardless of the state of the table "groups", use this instead.
WITH group_id_range AS (
SELECT generate_series(
(COALESCE((SELECT MIN(group_id) FROM groups), 1)),
(COALESCE((SELECT MAX(group_id) FROM groups), 1))
) group_id
)
SELECT COALESCE(min(gir.group_id), (SELECT MAX(group_id)+1 FROM groups))
FROM group_id_range gir
LEFT JOIN groups g ON (gir.group_id = g.group_id)
WHERE g.group_id IS NULL;
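One of the reasons to be careful is concurrency: two sessions running the gap query at the same time can both pick the same free id. The sketch below shows one way the lookup and the insert might be combined under a table lock. The user_id column name and the value 42 are assumptions made for the example, and unlike the queries above it searches from 1 rather than from the current minimum:

```sql
BEGIN;

-- Block concurrent writers so two sessions cannot pick the same free id.
LOCK TABLE groups IN SHARE ROW EXCLUSIVE MODE;

INSERT INTO groups (group_id, user_id)
SELECT COALESCE(
           -- lowest unused id between 1 and the current maximum, if any
           (SELECT min(s.id)
            FROM generate_series(
                     1,
                     (SELECT COALESCE(max(group_id), 0) FROM groups)
                 ) AS s(id)
            WHERE NOT EXISTS (
                SELECT 1 FROM groups g WHERE g.group_id = s.id
            )),
           -- otherwise one past the current maximum (1 for an empty table)
           (SELECT COALESCE(max(group_id), 0) + 1 FROM groups)
       ),
       42;  -- hypothetical user id

COMMIT;
```

The lock serializes writers to the table for the duration of the transaction, which is acceptable for a small table but is part of why recycling ids is usually discouraged.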