PostgreSQL UPDATE - query with left join problem

UPDATE user
SET balance = balance + p.amount
FROM payments p WHERE user.id = p.user_id AND p.id IN (36,38,40)
But it adds only the amount of the first payment, 1936, to the balance.
Please help me fix this; I do not want to loop in the code and run a lot of separate queries.

In a multiple-table UPDATE, each row in the target table is updated only once, even if the join returns it more than once.
From the docs:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the fromlist, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
Use this instead:
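-- "user" is quoted because USER is a reserved word in PostgreSQL
UPDATE "user" u
SET balance = balance + p.amount
FROM (
    -- one row per user, so each target row joins at most once
    SELECT user_id, SUM(amount) AS amount
    FROM payments
    WHERE id IN (36, 38, 40)
    GROUP BY user_id
) p
WHERE u.id = p.user_id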
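
To see the effect on concrete rows, here is a minimal sketch with made-up data. The temporary tables, their column types, and the sample values are assumptions for illustration only; the table and column names follow the question.

```sql
-- Hypothetical sample data (not from the question).
CREATE TEMP TABLE "user" (id integer PRIMARY KEY, balance numeric NOT NULL);
CREATE TEMP TABLE payments (
    id      integer PRIMARY KEY,
    user_id integer NOT NULL,
    amount  numeric NOT NULL
);

INSERT INTO "user" VALUES (1, 0), (2, 100);
INSERT INTO payments VALUES (36, 1, 10), (38, 1, 20), (40, 2, 5), (41, 2, 50);

-- Diagnostic: which users would join to more than one payment?
-- Any row returned here means a plain UPDATE ... FROM payments is unsafe.
SELECT user_id, COUNT(*) AS matching_payments
FROM payments
WHERE id IN (36, 38, 40)
GROUP BY user_id
HAVING COUNT(*) > 1;

-- Pre-aggregate so that the join yields exactly one row per user.
UPDATE "user" u
SET balance = u.balance + p.amount
FROM (
    SELECT user_id, SUM(amount) AS amount
    FROM payments
    WHERE id IN (36, 38, 40)
    GROUP BY user_id
) p
WHERE u.id = p.user_id;

SELECT id, balance FROM "user" ORDER BY id;
```

With this data, user 1 ends up with a balance of 30 (10 + 20) and user 2 with 105 (100 + 5); payment 41 is ignored because it is not in the IN list.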

Related

Postgresql finding max transaction_id for each type giving duplicates (when it's not supposed to for PK)

As the title says: I have the query below to find the transaction ID with the highest amount for each type of card.
SELECT tr.identifier, cc.type, tr.amount as max_amount
FROM credit_cards cc, transactions tr
WHERE (tr.amount, cc.type) IN (SELECT MAX(tr.amount), cc.type
FROM credit_cards cc, transactions tr
WHERE cc.number = tr.number
GROUP BY cc.type)
GROUP BY tr.identifier, cc.type;
When I run the query, I get duplicate transaction identifiers, which shouldn't happen since identifier is the PK of the transactions table. The output is shown below:
ID      Card type                     Max amount
2196    "diners-club-carte-blanche"   1000.62
2196    "visa"                        1000.62
11141   "mastercard"                  1000.54
2378    "mastercard"                  1000.54
For example, 2196 above exists for diners-club-carte-blanche, not for visa.
The 'mastercard' rows are correct, since two different IDs can share the same max amount.
The query should still allow such ties, because two different IDs can have the same max amount for a given type.
Does anyone know how to prevent the duplicates from occurring?
Is this due to the WHERE ... IN clause matching on either the max amount or the card type? (The duplicated rows are visa and diners-club-carte-blanche, which both have the same max value of 1000.62, so I think that's where the wrong match happens.)

TL;DR: add WHERE cc.number = tr.number to the outer query.
Long version
When you query FROM table_1, table_2 in the outer query and don't connect the tables (via a join or where clause) the result is a cartesian product, meaning EVERY row from table_1 is joined to EVERY row from table_2. This is the same as a CROSS JOIN.
So while your inner query has a where clause and (correctly) returns the max for each credit card type... your outer query does not, and so all possible combinations of credit card and transaction are being compared to the maximums, not just the valid ones.
For example, if cc has three rows (mastercard, visa, amex) and tr has three rows (1, 2, 3), selecting "FROM cc, tr" results in nine rows:
mastercard,1
mastercard,2
mastercard,3
visa,1
visa,2
visa,3
amex,1
amex,2
amex,3
where what you want is:
mastercard,1
visa,3
amex,2
Each row in the first table is repeated for each row in the second. The WHERE (...) IN (...) then restricts this set of rows to only those that match a row in the inner query. As you can imagine, this can easily lead to duplicate results. Some of those duplicates are removed by the outer GROUP BY, which should not be necessary once this issue is fixed.
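
The wrong matches can be reproduced on a tiny data set. The following is only a sketch: the inline cc and tr rows are invented for illustration, with mastercard and visa deliberately sharing the same maximum amount.

```sql
-- Hypothetical rows: one transaction per card, two card types tie at 10.00.
WITH cc(number, type) AS (
    VALUES (1, 'mastercard'), (2, 'visa'), (3, 'amex')
),
tr(identifier, number, amount) AS (
    VALUES (101, 1, 10.00), (102, 2, 10.00), (103, 3, 5.00)
)
SELECT tr.identifier, cc.type, tr.amount
FROM cc, tr                          -- no join condition: 3 x 3 = 9 candidate rows
WHERE (tr.amount, cc.type) IN (
    SELECT MAX(tr.amount), cc.type   -- correct per-type maximums
    FROM cc
    INNER JOIN tr ON cc.number = tr.number
    GROUP BY cc.type
)
ORDER BY tr.identifier, cc.type;
```

With this data the query returns five rows instead of three: transactions 101 and 102 are each reported for both mastercard and visa, because their amount equals the maximum of a card type they do not belong to. Adding AND cc.number = tr.number to the outer WHERE clause leaves only the three valid rows.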
As a general rule, I never use the comma syntax (FROM table_1, table_2) and prefer to ALWAYS be explicit about doing an inner or outer join (or, in some situations, a cross join) to help avoid this kind of issue and make the intent clearer to the reader.
SELECT tr.identifier, cc.type, tr.amount as max_amount
FROM credit_cards cc INNER JOIN transactions tr ON (cc.number = tr.number)
WHERE (tr.amount, cc.type) IN (
SELECT MAX(tr.amount), cc.type
FROM credit_cards cc
INNER JOIN transactions tr ON (cc.number = tr.number)
GROUP BY cc.type
)
NOTE: In the case of a tie, this will give you every transaction for each credit card type that is tied for the maximum amount.

Querying Postgres INHERITED tables directly

Postgres allows you to create a table using inheritance. We have a design where we have 1400 tables that inherit from one main table. These tables are for each of our vendor's inventory.
When I want to query stock for a vendor, I just query the main table. When running EXPLAIN, the plan shows that it goes through all 1400 indexes and quite a few of the inherited tables. This causes the query to run very slowly. If I query only the vendor's stock table, the query takes less than 50% of the time it takes against the main table.
We have a join on another table that pulls identifiers for the vendor's partner vendors and we also want to query their stock. Example:
SELECT
(select m2.company from sup.members m2 where m2.id = u.id) as company,
u.id,
u.item,
DATE_PART('day', CURRENT_TIMESTAMP - u.datein::timestamp) AS daysinstock,
u.grade as condition,
u.stockno AS stocknumber,
u.ic,
CASE WHEN u.rprice > 0 THEN
u.rprice
ELSE
NULL
END AS price,
u.qty
FROM pub.net u
LEFT JOIN sup.members m1
ON m1.id = u.id OR u.id = any(regexp_split_to_array(m1.partnerslist,','))
WHERE u.ic in ('01036') -- part to query
AND m1.id = 'N40' -- vendor to query
The n40_stock table has stock for the vendor with id = N40 and N40's partner vendors (partnerslist) are G01, G06, G21, K17, N49, V02, M16 so I would also want
to query the g01_stock, g06_stock, g21_stock, k17_stock, n49_stock, v02_stock, and m16_stock tables.
I know about the ONLY clause, but is there a way to modify this query to get the data from ONLY the specific inherited tables?
Edit
This decreases the time to under 800ms, but I'd like it less:
WITH cte as (
SELECT partnerslist as a FROM sup.members WHERE id = 'N40'
)
SELECT
(select m2.company from sup.members m2 where m2.id = u.id) as company,
u.id,
u.item,
DATE_PART('day', CURRENT_TIMESTAMP - u.datein::timestamp) AS daysinstock,
u.grade as condition,
u.stockno AS stocknumber,
u.ic,
CASE WHEN u.rprice > 0 THEN
u.rprice
ELSE
NULL
END AS price,
u.qty
FROM pub.net u
WHERE u.ic in ('01036') -- part to query
AND u.id = any(regexp_split_to_array('N40,'||(select a from cte), ','))
I cannot retrieve the company from sup.members in the cte because I need the one from the u.id, which is different when the partner changes in the where clause.

Whether an inherited table is scanned depends on the actual WHERE clause, which the planner compares against each child table's CHECK constraint. Simply inheriting tables is not enough; each child needs a CHECK constraint describing the rows it holds.
https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html
Caveat: you cannot use dynamically computed values here. If the actual value does not appear as a constant in the query text (for example, it comes from a subquery), the planner cannot exclude anything and all inherited tables are checked.
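
As a rough illustration of both points, here is a minimal sketch. The stock parent table, its columns, and the three child tables are hypothetical stand-ins for the real schema, and the exact plans depend on the PostgreSQL version and settings, so treat the comments as expectations rather than guarantees.

```sql
-- Hypothetical parent table and per-vendor children (names are illustrative).
CREATE TABLE stock (id text NOT NULL, ic text, qty integer);
CREATE TABLE n40_stock (CHECK (id = 'N40')) INHERITS (stock);
CREATE TABLE g01_stock (CHECK (id = 'G01')) INHERITS (stock);
CREATE TABLE k17_stock (CHECK (id = 'K17')) INHERITS (stock);

-- 'partition' is the default setting; it enables constraint exclusion
-- for inheritance children.
SET constraint_exclusion = partition;

-- Literal vendor ids: the planner can compare them with each CHECK
-- constraint at plan time, so k17_stock should not appear in the plan.
EXPLAIN
SELECT * FROM stock
WHERE ic = '01036'
  AND id IN ('N40', 'G01');

-- Vendor ids computed at run time: nothing can be ruled out while
-- planning, so every child table is expected to show up in the plan.
EXPLAIN
SELECT * FROM stock
WHERE ic = '01036'
  AND id = ANY (regexp_split_to_array((SELECT 'N40,G01'::text), ','));
```

In practice this suggests fetching the partner list in a first query and then building the second query with the vendor ids written out as literals.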

Update with leftjoin requires to join the update table itself

I want to update a table with values from another table, which do not always exist. So I need to left join the other table. The only way I found is this:
UPDATE lessonentity update
SET title=a.test
FROM lessonentity l
LEFT JOIN (SELECT 'hoho1' test) a ON(true)
where l.lessonid=48552
AND update.lessonid=l.lessonid
My question: Is it possible to left-join another table, without inner-joining (where) the updating-table again?

Yes, but not using an explicit join. In your case, this is sufficient given that a has only one row:
UPDATE lessonentity le
SET title = a.test
FROM (SELECT 'hoho1' test) a
WHERE le.lessonid = 48552;
Normally, there would be an additional condition in the WHERE, connecting a and le, but that is not necessary in this case because a has a single row.
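
For the more common case where a has several rows, the connecting condition might look like the first statement in the sketch below. The temporary lessonentity table and the inline VALUES rows are assumptions made up for illustration. The second statement is a different technique, a correlated scalar subquery, shown because UPDATE ... FROM behaves like an inner join: rows without a match in a are simply not updated.

```sql
-- Hypothetical stand-in for the real table.
CREATE TEMP TABLE lessonentity (lessonid integer PRIMARY KEY, title text);
INSERT INTO lessonentity VALUES (48552, 'old title'), (48553, 'other title');

-- Inner-join behaviour: only lessons with a matching row in a are updated.
UPDATE lessonentity le
SET title = a.test
FROM (VALUES (48552, 'hoho1'), (48553, 'hoho2')) AS a(lessonid, test)
WHERE le.lessonid = a.lessonid;

-- Left-join behaviour: the target row is always visited. When a has no
-- match, the scalar subquery yields NULL and COALESCE keeps the old title.
UPDATE lessonentity le
SET title = COALESCE(
        (SELECT a.test
         FROM (VALUES (48552, 'hoho3')) AS a(lessonid, test)
         WHERE a.lessonid = le.lessonid),
        le.title)
WHERE le.lessonid IN (48552, 48553);

SELECT lessonid, title FROM lessonentity ORDER BY lessonid;
```

Dropping the COALESCE would instead set title to NULL for lessons without a match, which is what a literal LEFT JOIN would produce.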

Postgres: left join with order by and limit 1

I have the situation:
Table1 has a list of companies.
Table2 has a list of addresses.
Table3 is a N relationship of Table1 and Table2, with fields 'begin' and 'end'.
Because companies may move over time, a LEFT JOIN among them results in multiple records for each company.
The begin and end fields are never NULL. For a single company, the latest address can be found with ORDER BY begin DESC, and the older addresses removed with LIMIT 1.
That works fine if the query only has to return 1 company. But I need a query that brings all Table1 records, joined with their current Table2 addresses. Therefore, the removal of outdated data must be done (AFAIK) in the LEFT JOIN's ON clause.
Any idea how I can build the clause so that it does not duplicate Table1 companies and brings the latest address?

Use a dependent subquery with the max() function in the join condition.
Something like in this example:
SELECT *
FROM companies c
LEFT JOIN relationship r
ON c.company_id = r.company_id
AND r."begin" = (
SELECT max("begin")
FROM relationship r1
WHERE c.company_id = r1.company_id
)
LEFT JOIN addresses a -- LEFT, so companies without any address are kept
ON a.address_id = r.address_id
demo: http://sqlfiddle.com/#!15/f80c6/2

Since PostgreSQL 9.3 there is JOIN LATERAL (https://www.postgresql.org/docs/9.4/queries-table-expressions.html), which lets the joined sub-query refer to the tables before it, so it solves your issue in an elegant way:
SELECT * FROM companies c
LEFT JOIN LATERAL ( -- LEFT keeps companies that have no relationship row
SELECT * FROM relationship r
WHERE c.company_id = r.company_id
ORDER BY r."begin" DESC LIMIT 1
) r ON TRUE
LEFT JOIN addresses a ON a.address_id = r.address_id
The disadvantage of this approach is that the indexes of the tables inside LATERAL do not work outside it.

I managed to solve it using a window function:
WITH ranked_relationship AS(
SELECT
*
,row_number() OVER (PARTITION BY fk_company ORDER BY dt_start DESC) as dt_last_addr
FROM relationship
)
SELECT
company.*,
address.*,
dt_last_addr as dt_relationship
FROM
company
LEFT JOIN ranked_relationship as relationship
ON relationship.fk_company = company.pk_company AND dt_last_addr = 1
LEFT JOIN address ON address.pk_address = relationship.fk_address
row_number() creates an integer counter for each record inside each window, and the windows are defined by fk_company. Within each window, the record with the latest date comes first with rank 1, so dt_last_addr = 1 makes sure the JOIN happens only once for each fk_company, using the record with the latest address.
Window functions are very powerful and few people use them; they avoid many complex joins and subqueries!
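
To make the numbering concrete, here is a small sketch that runs row_number() over a few invented relationship rows. The sample values are assumptions; the column names follow the query above.

```sql
-- Hypothetical rows: company 1 moved once, company 2 never moved.
WITH relationship(fk_company, fk_address, dt_start) AS (
    VALUES (1, 10, DATE '2015-01-01'),
           (1, 11, DATE '2018-06-01'),
           (2, 20, DATE '2016-03-15')
)
SELECT fk_company,
       fk_address,
       dt_start,
       -- restart the counter for each company, newest address first
       row_number() OVER (PARTITION BY fk_company
                          ORDER BY dt_start DESC) AS dt_last_addr
FROM relationship
ORDER BY fk_company, dt_last_addr;
```

Company 1 gets address 11 numbered 1 and address 10 numbered 2, while company 2 gets address 20 numbered 1, so filtering on dt_last_addr = 1 keeps exactly one row per company.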

SELECT 0 for multiple rows of join

I have an invoices and a payments table.
For each invoice record (which contains an OriginalInvoiceValue field), there can be multiple payments associated in the payments table. In a select that joins the two tables, each invoice record is of course repeated for each occurrence of an associated payment record.
What I would like though, is to have the OriginalInvoiceValue field returned only once per invoice, and then have it return 0 (or NULL) for each additional occurrence of an associated payment record. (Such that if I were to export the data to excel and sum the OriginalInvoiceValue column, I actually get the real total of all invoices, instead of getting it multiplied by each additional occurrence of a payment).
Is this possible in T-SQL?

On SQL Server 2005 or newer, you can assign row numbers to the individual payments of an invoice and left join to invoices, selecting the actual invoice record for the first payment only. Put whatever ordering you require in the row_number() part; I've chosen PaymentID, but payment date is probably more appropriate.
; with p as (
select *,
row_number() over (partition by InvoiceID
order by PaymentID) rn
from payments
)
select *
from p
left join invoices i
on p.InvoiceID = i.InvoiceID
and p.rn = 1
order by p.InvoiceID, rn
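
If 0 is preferred over NULL for the repeated rows, one possible variant is a CASE expression on the row number. This is only a sketch: the inline invoices and payments rows are invented, the Amount column is an assumption, and the VALUES table constructors need SQL Server 2008 or newer.

```sql
-- Hypothetical data: invoice 1 has two payments, invoice 2 has one.
WITH invoices AS (
    SELECT * FROM (VALUES (1, 100.00), (2, 250.00))
        AS v(InvoiceID, OriginalInvoiceValue)
),
payments AS (
    SELECT * FROM (VALUES (10, 1, 40.00), (11, 1, 60.00), (12, 2, 250.00))
        AS v(PaymentID, InvoiceID, Amount)
),
p AS (
    SELECT payments.*,
           row_number() OVER (PARTITION BY InvoiceID
                              ORDER BY PaymentID) AS rn
    FROM payments
)
SELECT i.InvoiceID,
       p.PaymentID,
       p.Amount,
       -- show the invoice value on the first payment row only
       CASE WHEN p.rn = 1 THEN i.OriginalInvoiceValue ELSE 0 END
           AS OriginalInvoiceValue
FROM p
JOIN invoices i ON i.InvoiceID = p.InvoiceID
ORDER BY i.InvoiceID, p.rn;
```

Summing the OriginalInvoiceValue column of this result gives 350, the real total of the two invoices. Invoices without any payment do not appear, because the query starts from payments.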
