difficult (for me) postgres sql query - postgresql

Here are the tables I have:
AB tuple table
C table which has entries with A.id, B.id, C.units
D table which has entries with C.id
I want to count all the entries in D table which have a C.id that has the same A.id and B.id and subtract that count from the sum of all C.units that have the same A.id and B.id as a new column "difference"
So I want the query to return the "difference", the common A.id and the common B.id in a single line
It should also return a row when the count is 0, in which case the "difference" is simply equal to sum(C.units)
For example
D table
D.id = 1, open=true, D.CID = 2
D.id = 2, open=true, D.CID = 3
D.id = 3, open=true, D.CID = 3
D.id = 4, open=true, D.CID = 4
C table
C.id = 2, A.id = 3, B.id = 5, units =4
C.id = 3, A.id = 3, B.id = 5, units = 6
C.id = 4, A.id = 4, B.id = 6, units = 8
C.id = 5, A.id = 4, B.id = 6, units = 10
Because the first 3 entries in D have CIDs with the same AID and BID, they are counted in the same entry. Also, the C entries that have the same A.id and B.id have their units summed, even when a C entry has no associated D entry. Therefore, the query should return the following 2 entries:
1. difference = (6+4)-3 = 7 A.id = 3 B.id = 5
2. difference = (10+8)-1 = 17 A.id = 4 B.id = 6

Setup (which you really should include in your question):
CREATE TABLE c
(
id int NOT NULL PRIMARY KEY,
aid int NOT NULL,
bid int NOT NULL,
units int NOT NULL
);
CREATE TABLE d
(
id int NOT NULL PRIMARY KEY,
open boolean NOT NULL,
cid int NOT NULL
);
INSERT INTO c VALUES (2,3,5,4),(3,3,5,6),(4,4,6,8),(5,4,6,10),(6,7,8,9);
INSERT INTO d VALUES (1,true,2),(2,true,3),(3,true,3),(4,true,4);
It's a little hard to understand the question, but I think you might be looking for something like this:
WITH n AS (
SELECT aid, bid, count(*) AS cnt
FROM c
JOIN d ON (d.cid = c.id)
GROUP BY aid, bid
)
SELECT aid, bid, sum(c.units) - COALESCE(n.cnt, 0) AS difference
FROM c
LEFT JOIN n USING (aid, bid)
GROUP BY aid, bid, n.cnt
ORDER BY aid, bid;
I get these results:
aid | bid | difference
-----+-----+------------
3 | 5 | 7
4 | 6 | 17
7 | 8 | 9
(3 rows)
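The setup and query above can be replayed without a PostgreSQL server. The sketch below is only a sanity check, and it assumes an in-memory SQLite database (via Python's standard sqlite3 module) as a stand-in; the boolean values are written as 1 because that is the portable spelling in SQLite.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (an assumption made
# only so the example runs without a server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE c (id int NOT NULL PRIMARY KEY, aid int NOT NULL,
                bid int NOT NULL, units int NOT NULL);
CREATE TABLE d (id int NOT NULL PRIMARY KEY, "open" boolean NOT NULL,
                cid int NOT NULL);
INSERT INTO c VALUES (2,3,5,4),(3,3,5,6),(4,4,6,8),(5,4,6,10),(6,7,8,9);
INSERT INTO d VALUES (1,1,2),(2,1,3),(3,1,3),(4,1,4);
""")

QUERY = """
WITH n AS (
    -- number of d rows per (aid, bid) pair
    SELECT aid, bid, count(*) AS cnt
    FROM c
    JOIN d ON (d.cid = c.id)
    GROUP BY aid, bid
)
SELECT aid, bid, sum(c.units) - COALESCE(n.cnt, 0) AS difference
FROM c
LEFT JOIN n USING (aid, bid)
GROUP BY aid, bid, n.cnt
ORDER BY aid, bid
"""

rows = conn.execute(QUERY).fetchall()
for aid, bid, difference in rows:
    print(aid, bid, difference)
```

The LEFT JOIN plus COALESCE is what keeps the (7, 8) pair in the output even though no d row points at it.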

Related

Finding set of rows in table based on matching rows from another table

I know the topic is a bit vague at best, but cannot find a way to describe my problem better...
An example, I have the following two tables:
TableA
IdA | Code | Value
-----+------+-------
123 | A | 1
123 | B | 2
123 | C | 3
456 | A | 4
456 | F | 6
456 | E | 7
...
TableB
IdB | Code | Value
-----+------+-------
X | A | 1
X | B | 2
X | C | 3
Y | G | 2
Y | D | 8
Y | C | 3
Z | A | 1
Z | B | 2
Z | C | 3
Z | D | 5
...
A set of records for a given IdA in TableA correlates to an equivalent set of records in TableB having a specific IdB.
For instance, for IdA = 123 in TableA, I have exactly three rows with certain codes and values, this would "map" to rows with IdB = X in TableB because it has the same combination of Codes and Values and the same number of rows. Note that it would not map to IdB = Z in TableB, because it has an additional row for Code D which IdA = 123 doesn't have in TableA.
Given only IdA, how to best write a query to find IdB?
If the codes and values were known, I could have done something similar to this:
SELECT b.IdB FROM TableB b
WHERE
EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'A' AND x.Value = '1') AND
EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'B' AND x.Value = '2') AND
EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'C' AND x.Value = '3') AND
(SELECT COUNT(*) FROM TableB x WHERE x.IdB = b.IdB) = 3
But now I'm only given a value for IdA, so I need to look up values from TableA and combine that in the query for TableB. Any clever ideas on how to tackle this?
This is a question of Relational Division Without Remainder.
There are many solutions, here is one:
Take TableB and left join TableA to it
But calculate a total over the whole set of values from A
Group by IdB
Filter so we only have rows where the total count is equal to the number of matches to A (because COUNT(IdA) only counts non-nulls) and the total count must also be the same as the total number of rows that we want to match to.
DECLARE @idA int = 123;
SELECT
b.IdB
FROM TableB b
LEFT JOIN (
SELECT *,
total = COUNT(*) OVER ()
FROM TableA a
WHERE a.IdA = @idA
) a ON b.Code = a.Code AND b.Value = a.Value
GROUP BY
b.IdB
HAVING COUNT(*) = COUNT(a.IdA)
AND COUNT(*) = MIN(a.total);
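To see what "division without remainder" means independently of the SQL, the same check can be sketched as plain set equality. This is not the query above, only an illustration in Python; the function name matching_id_b is hypothetical, the data is copied from the question, and the sketch assumes a (Code, Value) pair never repeats within one IdA or IdB.

```python
from collections import defaultdict

# Sample rows from the question: (id, code, value).
table_a = [
    (123, "A", 1), (123, "B", 2), (123, "C", 3),
    (456, "A", 4), (456, "F", 6), (456, "E", 7),
]
table_b = [
    ("X", "A", 1), ("X", "B", 2), ("X", "C", 3),
    ("Y", "G", 2), ("Y", "D", 8), ("Y", "C", 3),
    ("Z", "A", 1), ("Z", "B", 2), ("Z", "C", 3), ("Z", "D", 5),
]


def matching_id_b(id_a):
    """Return every IdB whose (Code, Value) set equals that of id_a."""
    wanted = {(code, value) for ida, code, value in table_a if ida == id_a}
    groups = defaultdict(set)
    for idb, code, value in table_b:
        groups[idb].add((code, value))
    # Set equality rules out both missing rows and extra rows (the remainder).
    return sorted(idb for idb, pairs in groups.items() if pairs == wanted)


print(matching_id_b(123))
```

Z is rejected because its set has one extra pair, which is the role the two COUNT comparisons play in the HAVING clause.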

What is the equivalent pg SQL for this Mysql statement (inner join in UPDATE statement)?

Mysql:
UPDATE a INNER JOIN b on a.b_id = b.id SET n=1 WHERE b.n > 2
Postgresql (I know):
UPDATE a SET n=1 FROM b WHERE b.n > 2 AND a.b_id = b.id
But what are the equivalent pg statements for:
UPDATE a OUTER JOIN b on a.b_id = b.id SET n=1 WHERE b.n > 2
UPDATE a LEFT JOIN b on a.b_id = b.id SET n=1 WHERE b.n > 2
More generally, what's the equivalent pg statement if I have several inner-joined tables (e.g. 3 tables) in MySQL, like:
UPDATE a
INNER JOIN b on a.b_id = b.id
INNER JOIN c on b.c_id = c.id
INNER JOIN d on c.d_id = d.id
SET n=1 WHERE d.n > 2
Generally, you can create a subQuery like this (which is very flexible and clear):
UPDATE tblA
SET colA = subQuery.colA
FROM (
SELECT tblA.id, tblA.colA
FROM tblA
INNER JOIN tblB AS b ON b.id = tblA.b_id
INNER JOIN tblC AS c ON c.id = b.c_id
WHERE c.someData > 100
) AS subQuery
WHERE tblA.id = subQuery.id
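What's the use of the left join if you're filtering it anyway using n > 2? A condition on b in the WHERE clause discards the rows where b is NULL, so the LEFT JOIN behaves like an INNER JOIN.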
Table a:
id | firstname | b_id
1 | elisabeth | 2
2 | sam | 2
3 | john | 3
table b:
id | surname
2 | smith
3 | doe
UPDATE a LEFT JOIN b on a.b_id = b.id SET firstname = null WHERE b.id > 2
Only john doe will be updated.
As for this one:
UPDATE a
INNER JOIN b on a.b_id = b.id
INNER JOIN c on b.c_id = c.id
INNER JOIN d on c.d_id = d.id
SET n=1 WHERE d.n > 2
In Postgres:
UPDATE a
SET n=1
FROM b, c, d
WHERE a.b_id = b.id
AND b.c_id = c.id
AND c.d_id = d.id
AND d.n > 2
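If a form that does not rely on UPDATE ... FROM is preferred, the joins can be moved into an IN subquery. The sketch below is an alternative to the statement above, not a copy of it, and it assumes b.id is unique. It is run against an in-memory SQLite database purely as a stand-in for PostgreSQL, and the sample rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for PostgreSQL here; tables and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE d (id int PRIMARY KEY, n int);
CREATE TABLE c (id int PRIMARY KEY, d_id int);
CREATE TABLE b (id int PRIMARY KEY, c_id int);
CREATE TABLE a (id int PRIMARY KEY, b_id int, n int);
INSERT INTO d VALUES (1, 5), (2, 1);
INSERT INTO c VALUES (1, 1), (2, 2);
INSERT INTO b VALUES (1, 1), (2, 2);
INSERT INTO a VALUES (1, 1, 0), (2, 2, 0), (3, 1, 0);
""")

# The join chain b -> c -> d lives in the subquery; only rows of a whose
# b_id reaches a d row with n > 2 are updated.
conn.execute("""
UPDATE a
SET n = 1
WHERE b_id IN (
    SELECT b.id
    FROM b
    JOIN c ON c.id = b.c_id
    JOIN d ON d.id = c.d_id
    WHERE d.n > 2
)
""")

updated = conn.execute("SELECT id, n FROM a ORDER BY id").fetchall()
print(updated)
```

Rows 1 and 3 of a reach d.id = 1 (n = 5) and are updated; row 2 reaches d.id = 2 (n = 1) and is left alone.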

Postgres sql query difficult join

Here are the tables I have:
Table A which has entries with "item" and "grade" fields
Table B which has entries with A.id
Tuple table B-C
I want all the A entries that have item= "x" and grade = "y"
And all the C entries that are associated with a B entry that is associated with an A entry that has item = "x" and grade = "y"
For example
A table:
A.item = "x", A.Grade = "y", A.id = 1
A.item = "x", A.Grade = "y", A.id = 2
A.item = "x", A.Grade = "y", A.id = 3
A.item = "r", A.Grade = "z", A.id = 4
B Table
B.AID = 1, B.id = 10
B.AID = 1, B.id = 11
B.AID = 2, B.id = 13
B.AID = 3, B.id = 14
B.AID = 4, B.id = 15
B-C Tuple Table
BID = 10, CID = 20
BID = 11, CID = 20
BID = 13, CID = 20
BID = 15, CID = 21
The query should return all the entries in the A table and the entry 20 but not 21 in the C table because C.id = 21 is only tupled with a B that is associated with an A that does not meet the item and grade requirements.
The associations, while sounding complicated in written form, are just a simple join among three tables: a joins to b joins to c.
You identify how the columns need to be joined: "a B entry that is associated with an A entry", and looking at the columns sounds like you want to join on b.aid = a.id. Similarly for b and c.
SELECT ...
FROM
a
JOIN b ON b.aid = a.id
JOIN b_c ON b_c.bid = b.id
WHERE
...
This constructs the original dataset before it was split into the three normalised tables.
The next step is to filter by the given conditions. You only want rows where item = "x" and grade = "y", so add those to the WHERE clause, prefixing them with the table name (which is optional in this case):
WHERE
a.item = 'x'
AND a.grade = 'y'
Finally, you can pick which columns you really need in the SELECT clause. I'm guessing SELECT b_c.cid would do, though if you also have a c table you might want to join on that table, too, and select columns from it.
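Putting the pieces together, the assembled query can be tried against the sample data from the question. This is a sketch under a few assumptions: the table and column names (a, b, b_c, aid, bid, cid) follow the answer's guesses, DISTINCT is added here so that each C id is listed once, and SQLite (via Python's sqlite3) stands in for PostgreSQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in; data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id int PRIMARY KEY, item text, grade text);
CREATE TABLE b (id int PRIMARY KEY, aid int);
CREATE TABLE b_c (bid int, cid int);
INSERT INTO a VALUES (1,'x','y'),(2,'x','y'),(3,'x','y'),(4,'r','z');
INSERT INTO b VALUES (10,1),(11,1),(13,2),(14,3),(15,4);
INSERT INTO b_c VALUES (10,20),(11,20),(13,20),(15,21);
""")

QUERY = """
SELECT DISTINCT b_c.cid
FROM a
JOIN b ON b.aid = a.id
JOIN b_c ON b_c.bid = b.id
WHERE a.item = 'x'
  AND a.grade = 'y'
ORDER BY b_c.cid
"""

cids = [row[0] for row in conn.execute(QUERY)]
print(cids)
```

Because these are inner joins, an A row whose B entries have no C (A.id = 3 here) does not appear in the join result; if the A rows themselves are also needed, they would have to be selected separately or through LEFT JOINs.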

write tsql query in entity

I have a table and its data looks like this:
id name date
--------- --------- ----------
1 a 2012-08-30 10:36:27.393
1 b 2012-08-30 14:36:27.393
2 c 2012-08-30 13:36:27.393
2 d 2012-08-30 16:36:27.393
I retrieve the max date time with this query:
select
t1.id
,t1.name
,t1.date
from
table1 t1
inner join (
SELECT id,Max(date) as mymaxdate
FROM table1
group by id
) mt1
on t1.id = mt1.id
and t1.date = mt1.mymaxdate
result:
1 b 2012-08-30 14:36:27.393
2 d 2012-08-30 16:36:27.393
How can I write this query in Entity Framework?
Thanks
The tricky part is that you also want the name of the item having the max date, otherwise the grouping would be much simpler. But it is possible by an almost 1:1 reproduction of the sql query:
from t in table1
join m in
(from t in table1
group t by t.id into g
select new { g.Key, mymaxdate = g.Max (x => x.date) })
on new { t.id, t.date } equals new { id = m.Key, date = m.mymaxdate }
select t
A join on multiple fields is done by creating anonymous types (new {t.id, t.date} and new {id = m.Key, date = m.mymaxdate}) where the names and types of the properties (id and date) should match.
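The two steps of that query (a per-id maximum, then a join on the composite key) can be illustrated outside LINQ as well. The following is a plain Python sketch of the same idea, not Entity Framework code; tuples play the role of the anonymous types, and the rows are the sample data from the question.

```python
# Sample rows from the question: (id, name, date as an ISO-style string).
rows = [
    (1, "a", "2012-08-30 10:36:27.393"),
    (1, "b", "2012-08-30 14:36:27.393"),
    (2, "c", "2012-08-30 13:36:27.393"),
    (2, "d", "2012-08-30 16:36:27.393"),
]

# Step 1: the grouped subquery, i.e. the maximum date per id.
# Strings in this fixed format compare in chronological order.
max_date = {}
for id_, _name, date in rows:
    if id_ not in max_date or date > max_date[id_]:
        max_date[id_] = date

# Step 2: the join on the composite key (id, date). Both sides must build
# the key from the same fields in the same order.
keys = set(max_date.items())
latest = [row for row in rows if (row[0], row[2]) in keys]

for row in latest:
    print(row)
```

As with the SQL and LINQ versions, two rows sharing the same id and the same maximum date would both be returned.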

Duplicates removing [duplicate]

Possible Duplicate:
Delete duplicate records from a SQL table without a primary key
I have data:
SELECT
a
, b
FROM
(
select a = 1, b = 30
union all
select a = 2, b = 50
union all
select a = 3, b = 50
union all
select a = 4, b = 50
union all
select a = 5, b = 60
) t
I have to get this output (subsequent duplicate records, ordered by a, should be excluded from the result set):
a b
----------- -----------
1 30
2 50
3 50 -- should be excluded
4 50 -- should be excluded
5 60
SELECT
min(a) as a
, b
FROM
(
select a = 1, b = 30
union all
select a = 2, b = 50
union all
select a = 3, b = 50
union all
select a = 4, b = 50
union all
select a = 5, b = 60
) t
GROUP BY b
ORDER BY a
In Oracle I was able to do this using a GROUP BY clause; you should be able to do something similar.
select min(a), b
from (select 1 a, 30 b
from dual
union all
select 2 a, 50 b
from dual
union all
select 3 a, 50 b
from dual
union all
select 4 a, 50 b
from dual
union all
select 5 a, 60 b from dual)
group by b;
edit: looks like someone else came up with a MS sql solution, I'll leave this here for posterity though.
The easiest way to do this is with a simple GROUP BY:
SELECT
a
, b
INTO #tmp
FROM
(
select a = 1, b = 30
union all
select a = 2, b = 50
union all
select a = 3, b = 50
union all
select a = 4, b = 50
union all
select a = 5, b = 60
) t
SELECT MIN(a) AS a, b
FROM #tmp
GROUP BY b
ORDER BY a
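The GROUP BY approach from the answers can be checked quickly against the sample rows. The sketch below assumes an in-memory SQLite database (Python's sqlite3) as a stand-in for the engines discussed above; the table name t and the alias first_a are illustrative.

```python
import sqlite3

# SQLite as a stand-in engine; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (a int, b int);
INSERT INTO t VALUES (1, 30), (2, 50), (3, 50), (4, 50), (5, 60);
""")

# Keep the smallest a for every distinct b.
deduped = conn.execute("""
SELECT MIN(a) AS first_a, b
FROM t
GROUP BY b
ORDER BY first_a
""").fetchall()

for first_a, b in deduped:
    print(first_a, b)
```

Note that GROUP BY b merges every row with the same b, not only adjacent ones; if a value of b reappeared later in the a ordering and should be kept as a separate run, a different technique would be needed.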