Postgres sql query difficult join - postgresql

Here are the tables I have:
Table A which has entries with "item" and "grade" fields
Table B which has entries with A.id
Tuple table B-C
I want all the A entries that have item= "x" and grade = "y"
And all the C entries that are associated with a B entry that is associated with an A entry that has item = "x" and grade = "y"
For example
A table:
A.item = "x", A.Grade = "y", A.id = 1
A.item = "x", A.Grade = "y", A.id = 2
A.item = "x", A.Grade = "y", A.id = 3
A.item = "r", A.Grade = "z", A.id = 4
B Table
B.AID = 1, B.id = 10
B.AID = 1, B.id = 11
B.AID = 2, B.id = 13
B.AID = 3, B.id = 14
B.AID = 4, B.id = 15
B-C Tuple Table
BID = 10, CID = 20
BID = 11, CID = 20
BID = 13, CID = 20
BID = 15, CID = 21
The query should return all the entries in the A table and the entry 20 but not 21 in the C table because C.id = 21 is only tupled with a B that is associated with an A that does not meet the item and grade requirements.

The associations, while sounding complicated in written form, are just a simple join among three tables: a joins to b joins to c.
First identify how the tables need to be joined: "a B entry that is associated with an A entry", together with the column names, suggests joining on b.aid = a.id. The same reasoning gives b_c.bid = b.id for b and the B-C table.
SELECT ...
FROM
a
JOIN b ON b.aid = a.id
JOIN b_c ON b_c.bid = b.id
WHERE
...
This constructs the original dataset before it was split into the three normalised tables.
The next step is to filter by the given conditions. You only want rows where "item = "x" and grade = "y"", so add those to the WHERE clause, prefixing them with the table name (which is optional in this case):
WHERE
a.item = 'x'
AND a.grade = 'y'
Finally, you can pick which columns you really need in the SELECT clause. I'm guessing SELECT b_c.cid would do. If you also have a c table, you might want to join to that table too, and select columns from it.
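The pieces above can be assembled and checked against the sample data from the question. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL; the lowercase table names, the column types and the DISTINCT on cid are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a   (id INTEGER PRIMARY KEY, item TEXT, grade TEXT);
    CREATE TABLE b   (id INTEGER PRIMARY KEY, aid INTEGER REFERENCES a(id));
    CREATE TABLE b_c (bid INTEGER REFERENCES b(id), cid INTEGER);
    INSERT INTO a   VALUES (1,'x','y'), (2,'x','y'), (3,'x','y'), (4,'r','z');
    INSERT INTO b   VALUES (10,1), (11,1), (13,2), (14,3), (15,4);
    INSERT INTO b_c VALUES (10,20), (11,20), (13,20), (15,21);
""")

# All A entries that satisfy the filter.
matching_a = [row[0] for row in conn.execute(
    "SELECT id FROM a WHERE item = 'x' AND grade = 'y' ORDER BY id")]

# All C ids reachable from those A entries through b and b_c.
# DISTINCT collapses the three paths that all lead to cid 20.
matching_c = [row[0] for row in conn.execute("""
    SELECT DISTINCT b_c.cid
    FROM a
    JOIN b   ON b.aid = a.id
    JOIN b_c ON b_c.bid = b.id
    WHERE a.item = 'x'
      AND a.grade = 'y'
""")]

print(matching_a, matching_c)
```

The A entries are fetched separately here because the inner joins drop any A row without a B-C link (A.id = 3 in the sample data, whose only B entry, 14, is not in the B-C table).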

Related

Finding set of rows in table based on matching rows from another table

I know the topic is a bit vague at best, but cannot find a way to describe my problem better...
An example, I have the following two tables:
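TableA
 IdA | Code | Value
-----+------+-------
 123 | A    | 1
 123 | B    | 2
 123 | C    | 3
 456 | A    | 4
 456 | F    | 6
 456 | E    | 7
 ...
TableB
 IdB | Code | Value
-----+------+-------
 X   | A    | 1
 X   | B    | 2
 X   | C    | 3
 Y   | G    | 2
 Y   | D    | 8
 Y   | C    | 3
 Z   | A    | 1
 Z   | B    | 2
 Z   | C    | 3
 Z   | D    | 5
 ...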
A set of records for a given IdA in TableA correlates to an equivalent set of records in TableB having a specific IdB.
For instance, for IdA = 123 in TableA, I have exactly three rows with certain codes and values, this would "map" to rows with IdB = X in TableB because it has the same combination of Codes and Values and the same number of rows. Note that it would not map to IdB = Z in TableB, because it has an additional row for Code D which IdA = 123 doesn't have in TableA.
Given only IdA, how to best write a query to find IdB?
If the codes and values were known, I could have done something similar to this:
SELECT b.IdB FROM TableB b
WHERE
EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'A' AND x.Value = '1') AND
EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'B' AND x.Value = '2') AND
EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'C' AND x.Value = '3') AND
(SELECT COUNT(*) FROM TableB x WHERE x.IdB = b.IdB) = 3
But now I'm only given a value for IdA, so I need to look up values from TableA and combine that in the query for TableB. Any clever ideas on how to tackle this?
This is a question of Relational Division Without Remainder.
There are many solutions, here is one:
Take TableB and left join TableA to it
But calculate a total over the whole set of values from A
Group by IdB
Filter so we only have rows where the total count is equal to the number of matches to A (because COUNT(IdA) only counts non-nulls) and the total count must also be the same as the total number of rows that we want to match to.
DECLARE @idA int = 123;
SELECT
    b.IdB
FROM TableB b
LEFT JOIN (
    SELECT *,
        total = COUNT(*) OVER ()
    FROM TableA a
    WHERE a.IdA = @idA
) a ON b.Code = a.Code AND b.Value = a.Value
GROUP BY
    b.IdB
HAVING COUNT(*) = COUNT(a.IdA)
    AND COUNT(*) = MIN(a.total);
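To see how the two HAVING conditions rule out IdB = Y and IdB = Z, the same query can be run against the sample rows. This is a sketch using Python's sqlite3 module rather than SQL Server, so the variable becomes a bound parameter and the T-SQL `total = ...` alias is written with AS; the column types, the alias total_rows and the helper name find_id_b are assumptions, and window functions require SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (IdA INTEGER, Code TEXT, Value INTEGER);
    CREATE TABLE TableB (IdB TEXT,    Code TEXT, Value INTEGER);
    INSERT INTO TableA VALUES
        (123,'A',1), (123,'B',2), (123,'C',3),
        (456,'A',4), (456,'F',6), (456,'E',7);
    INSERT INTO TableB VALUES
        ('X','A',1), ('X','B',2), ('X','C',3),
        ('Y','G',2), ('Y','D',8), ('Y','C',3),
        ('Z','A',1), ('Z','B',2), ('Z','C',3), ('Z','D',5);
""")

QUERY = """
    SELECT b.IdB
    FROM TableB b
    LEFT JOIN (
        SELECT a.*, COUNT(*) OVER () AS total_rows  -- rows in the A set
        FROM TableA a
        WHERE a.IdA = ?
    ) a ON b.Code = a.Code AND b.Value = a.Value
    GROUP BY b.IdB
    HAVING COUNT(*) = COUNT(a.IdA)        -- every B row found a partner in A
       AND COUNT(*) = MIN(a.total_rows)   -- and B has as many rows as the A set
"""

def find_id_b(id_a):
    """Return the IdB values whose rows match the rows of id_a exactly."""
    return [row[0] for row in conn.execute(QUERY, (id_a,))]

print(find_id_b(123))
print(find_id_b(456))
```

With the sample data, Y fails the first condition (only its C row matches) and Z fails it as well (its D row has no partner in A), so only X remains for IdA = 123.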

Replace subquery with appropriate join

How can I replace the subquery with a join?
SELECT distinct t."groupId" FROM "contacts" c
INNER JOIN
(
SELECT DISTINCT td.* FROM "groups" g
INNER JOIN
"territory" td
ON
td."groupId" = g.id
WHERE g."orgId" = 3
)
t
ON
ST_Intersects(t.points, c."geoPoint")
WHERE c.id = 33 and c."orgId" = 3
There is nothing wrong with a subquery, but you should get rid of the dreaded DISTINCT:
SELECT td."groupId"
FROM territory AS td
WHERE EXISTS (SELECT 1 FROM contacts AS c
WHERE ST_Intersects(td.points, c."geoPoint")
AND c.id = 33
AND c."orgId" = 3)
AND EXISTS (SELECT 1 FROM groups AS g
WHERE td."groupId" = g.id
AND g."orgId" = 3);
If you insist on having no subqueries, use
SELECT DISTINCT td."groupId"
FROM contacts c
INNER JOIN territory td
ON ST_Intersects(td.points, c."geoPoint")
INNER JOIN groups g
ON td."groupId" = g.id
WHERE g."orgId" = 3
AND c.id = 33
AND c."orgId" = 3;
If you need to make sure that the st_intersects function is only called for rows from territory that match the join with groups, you will have to use a subquery. There is no other way to force a join order.

Is there a way to merge these json aggregations?

I am trying to create a JSON object by getting some info from one table, then creating integer arrays from the ids of some other tables and adding n > 1 (2 or more) arrays to the JSON object. I am using Postgres version 10.7:
select json_build_object(
'id', bi.id,
'street', ba.street,
'features1', features1.f1_json_arr,
'features2', features2.f2_json_arr
)
from business.info bi
inner join business.address ba on bi.id = ba.location_id
left outer join (
select f1.location_id,
json_agg(f1_id) as f1_json_arr
from business.feature1 as f1
group by f1.location_id
) features1 on bi.id = features1.location_id
left outer join (
select f2.location_id,
json_agg(f2_id) as f2_json_arr
from business.feature2 as f2
group by f2.location_id
) features2 on bi.id = features2.location_id
where bi.id='1234';
which gives me a result as I want, like so:
{
"id": "1234",
"street": "some street",
"features1": [
2,
1
],
"features2": [
3,
2,
1
]
}
Is there a cleaner way to do this? I tried this:
select json_build_object(
'id', bi.id,
'street', ba.street_name,
'features1', f1_and_f2.f1_json_arr,
'features2', f1_and_f2.f2_json_arr
)
from business.info bi
inner join business.address ba
on bi.id = ba.location_id
left outer join (
select f1.location_id,
json_agg(f1_id) as f1_json_arr,
json_agg(f2_id) as f2_json_arr
from business.feature1 as f1
inner join business.feature2 as f2 on f1.location_id = f2.location_id
group by f1.location_id
) f1_and_f2 on bi.id = f1_and_f2.location_id
where bi.id = '1234';
but got a result like this:
{
"id": "1234",
"street_name": "a street",
"features1": [
2,
2,
2,
1,
1,
1
],
"features2": [
3,
2,
1,
3,
2,
1
]
}
SELECT A.*, B.*, C_GROUPED.C_STUFF, D_GROUPED.D_STUFF
FROM A
INNER JOIN B ON B.A_ID = A.ID
LEFT JOIN ( SELECT A_ID, JSON_AGG(STUFF) AS C_STUFF FROM C GROUP BY A_ID ) AS C_GROUPED ON C_GROUPED.A_ID = A.ID
LEFT JOIN ( SELECT A_ID, JSON_AGG(OTHER_STUFF) AS D_STUFF FROM D GROUP BY A_ID ) AS D_GROUPED ON D_GROUPED.A_ID = A.ID
WHERE A.ID = 123;
should return the same result as
SELECT
A.*,
B.*,
( SELECT JSON_AGG(STUFF) FROM C WHERE C.A_ID = A.ID ) AS C_STUFF,
( SELECT JSON_AGG(OTHER_STUFF) FROM D WHERE D.A_ID = A.ID ) AS D_STUFF
FROM A
INNER JOIN B ON B.A_ID = A.ID
WHERE A.ID = 123
In fact, I would expect the second query to be faster.
PS: Since LEFT JOIN and LEFT OUTER JOIN are the same, I would suggest writing them the same way throughout your query.
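The repeated values in the second attempt come from joining the two feature tables to each other before aggregating: every f1 row pairs with every f2 row for the same location. The sketch below reproduces that with Python's sqlite3 module; the simplified unqualified table names are assumptions, and group_concat is used only as a stand-in for PostgreSQL's json_agg.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE info     (id TEXT PRIMARY KEY);
    CREATE TABLE feature1 (location_id TEXT, f1_id INTEGER);
    CREATE TABLE feature2 (location_id TEXT, f2_id INTEGER);
    INSERT INTO info     VALUES ('1234');
    INSERT INTO feature1 VALUES ('1234', 1), ('1234', 2);
    INSERT INTO feature2 VALUES ('1234', 1), ('1234', 2), ('1234', 3);
""")

# Joining the feature tables first yields 2 x 3 = 6 rows per location,
# so any aggregate over that join sees each id several times.
joined = conn.execute("""
    SELECT f1.f1_id, f2.f2_id
    FROM feature1 f1
    JOIN feature2 f2 ON f1.location_id = f2.location_id
""").fetchall()

# Aggregating each table in its own correlated subquery avoids the fan-out.
row = conn.execute("""
    SELECT bi.id,
           (SELECT group_concat(f1_id) FROM feature1 f1
             WHERE f1.location_id = bi.id) AS f1_ids,
           (SELECT group_concat(f2_id) FROM feature2 f2
             WHERE f2.location_id = bi.id) AS f2_ids
    FROM info bi
    WHERE bi.id = '1234'
""").fetchone()

# group_concat returns a comma-separated string; order is not guaranteed.
f1_ids = sorted(int(x) for x in row[1].split(","))
f2_ids = sorted(int(x) for x in row[2].split(","))
print(len(joined), f1_ids, f2_ids)
```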

difficult (for me) postgres sql query

Here are the tables I have:
AB tuple table
C table which has entries with A.id, B.id, C.units
D table which has entries with C.id
I want to count all the entries in D table which have a C.id that has the same A.id and B.id and subtract that count from the sum of all C.units that have the same A.id and B.id as a new column "difference"
So I want the query to return the "difference", the common A.id and the common B.id in a single line
It should also return an entry if the count is 0, in which case the "difference" will just be equal to sum(C.units)
For example
D table
D.id = 1, open=true, D.CID = 2
D.id = 2, open=true, D.CID = 3
D.id = 3, open=true, D.CID = 3
D.id = 4, open=true, D.CID = 4
C table
C.id = 2, A.id = 3, B.id = 5, units =4
C.id = 3, A.id = 3, B.id = 5, units = 6
C.id = 4, A.id = 4, B.id = 6, units = 8
C.id = 5, A.id = 4, B.id = 6, units = 10
Because the first 3 entries in D have CIDs with the same AID and BID, they are counted in the same entry. Also, the C entries that have the same A.id and B.id have their units summed, even when a C entry has no associated D entry. Therefore, the query should return the following 2 entries:
1. difference = (6+4)-3 = 7 A.id = 3 B.id = 5
2. difference = (10+8)-1 = 17 A.id = 4 B.id = 6
Setup (which you really should include in your question):
CREATE TABLE c
(
id int NOT NULL PRIMARY KEY,
aid int NOT NULL,
bid int NOT NULL,
units int NOT NULL
);
CREATE TABLE d
(
id int NOT NULL PRIMARY KEY,
open boolean NOT NULL,
cid int NOT NULL
);
INSERT INTO c VALUES (2,3,5,4),(3,3,5,6),(4,4,6,8),(5,4,6,10),(6,7,8,9);
INSERT INTO d VALUES (1,true,2),(2,true,3),(3,true,3),(4,true,4);
It's a little hard to understand the question, but I think you might be looking for something like this:
WITH n AS (
SELECT aid, bid, count(*) AS cnt
from c
JOIN d ON (d.cid = c.id)
GROUP BY aid, bid
)
SELECT aid, bid, sum(c.units) - COALESCE(n.cnt, 0) AS difference
FROM c
LEFT JOIN n USING (aid, bid)
GROUP BY aid, bid, n.cnt
ORDER BY aid, bid;
I get these results:
aid | bid | difference
-----+-----+------------
3 | 5 | 7
4 | 6 | 17
7 | 8 | 9
(3 rows)
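The count is computed in a separate CTE because joining c directly to d and aggregating in one pass would repeat a C row once per matching D row and inflate sum(units). The sketch below compares both forms on the setup data, using Python's sqlite3 module as a stand-in for PostgreSQL; the single-pass "naive" query is not from the answer and is included only for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (id INTEGER PRIMARY KEY, aid INTEGER, bid INTEGER, units INTEGER);
    CREATE TABLE d (id INTEGER PRIMARY KEY, open INTEGER, cid INTEGER);
    INSERT INTO c VALUES (2,3,5,4), (3,3,5,6), (4,4,6,8), (5,4,6,10), (6,7,8,9);
    INSERT INTO d VALUES (1,1,2), (2,1,3), (3,1,3), (4,1,4);
""")

# Single-pass attempt: C row 3 matches two D rows, so its 6 units are summed twice.
naive = conn.execute("""
    SELECT aid, bid, sum(c.units) - count(d.id) AS difference
    FROM c
    LEFT JOIN d ON d.cid = c.id
    GROUP BY aid, bid
    ORDER BY aid, bid
""").fetchall()

# The answer's approach: count the D rows per (aid, bid) first, then subtract.
correct = conn.execute("""
    WITH n AS (
        SELECT aid, bid, count(*) AS cnt
        FROM c
        JOIN d ON d.cid = c.id
        GROUP BY aid, bid
    )
    SELECT aid, bid, sum(c.units) - COALESCE(n.cnt, 0) AS difference
    FROM c
    LEFT JOIN n USING (aid, bid)
    GROUP BY aid, bid, n.cnt
    ORDER BY aid, bid
""").fetchall()

print(naive)
print(correct)
```

For (aid, bid) = (3, 5) the single-pass form gives (4 + 6 + 6) - 3 = 13, while the CTE form gives the expected (4 + 6) - 3 = 7.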

Duplicates removing [duplicate]

Possible Duplicate:
Delete duplicate records from a SQL table without a primary key
I have data:
SELECT
a
, b
FROM
(
select a = 1, b = 30
union all
select a = 2, b = 50
union all
select a = 3, b = 50
union all
select a = 4, b = 50
union all
select a = 5, b = 60
) t
I need the following output, where the subsequent duplicate records (ordered by a) are excluded from the result set:
a b
----------- -----------
1 30
2 50
3 50 -- should be excluded
4 50 -- should be excluded
5 60
SELECT
min(a) as a
, b
FROM
(
select a = 1, b = 30
union all
select a = 2, b = 50
union all
select a = 3, b = 50
union all
select a = 4, b = 50
union all
select a = 5, b = 60
) t
GROUP BY b
ORDER BY a
In Oracle I was able to do this using a GROUP BY clause; you should be able to do something similar.
select min(a), b
from (select 1 a, 30 b
from dual
union all
select 2 a, 50 b
from dual
union all
select 3 a, 50 b
from dual
union all
select 4 a, 50 b
from dual
union all
select 5 a, 60 b from dual)
group by b;
Edit: it looks like someone else came up with an MS SQL solution; I'll leave this here for posterity though.
The easiest way to do this is with a simple GROUP BY:
SELECT
a
, b
INTO #tmp
FROM
(
select a = 1, b = 30
union all
select a = 2, b = 50
union all
select a = 3, b = 50
union all
select a = 4, b = 50
union all
select a = 5, b = 60
) t
SELECT MIN(a) AS a, b
FROM #tmp
GROUP BY b
ORDER BY a
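The GROUP BY approach shared by the answers above can be checked against the sample rows. This is a minimal sketch using Python's sqlite3 module instead of SQL Server or Oracle; the table name t and the alias first_a are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER);
    INSERT INTO t VALUES (1,30), (2,50), (3,50), (4,50), (5,60);
""")

# Keep the smallest a for every distinct b.
rows = conn.execute("""
    SELECT min(a) AS first_a, b
    FROM t
    GROUP BY b
    ORDER BY first_a
""").fetchall()

print(rows)
```

Note that GROUP BY collapses every row sharing a value of b, not just adjacent ones. If b = 50 appeared again after the row with b = 60, it would still be merged into the group that starts at a = 2.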