postgres error 42803 with subquery and group by - postgresql

Hi all postgres developer,
below code can run success
select p.* from
(select p.*, count(distinct p1.id) n from TMB p
left join TMB p1 on p.id = p1.pid
left join TUR u on p.id = any(u.jks)
group by p.id) p
join TUR u on u.id = p.uid
but, below code with error message
[42803] ERROR: column "p.xxxx" must appear in the GROUP BY clause or be used in an aggregate function
select p.* from
(select p.*, count(distinct p1.id) n from (select * from TMB) p
left join TMB p1 on p.id = p1.pid
left join TUR u on p.id = any(u.jks)
group by p.id) p
join TUR u on u.id = p.uid
I want to do some where filter on TMB table before left join, so I think can speed up left join.
I think (select * from TMB) is a subquery equal as TMB. I can not understand why this error message. anyone can tell me detail?

The difference is that without the subquery, PostgreSQL can deduce that id is the primary key of tmb, so you need not add all columns of tmb to the GROUP BY clause. With the subquery, PostgreSQL cannot make that deduction, so you have to add all columns.

Related

Is it possible to do a "LIMIT 1" on a left join in Postgres?

I have two tables: one for money and attributes surrounding it (e.g. who earnt it) and a child table for the "ledger" - this contains one or more entries that represent the history of money that has moved.
SELECT SUM(pl.achieved)
FROM payout p
LEFT JOIN payout_ledgers pl ON pl.payout_id = p.id
This query works well when there is only one ledger item, but when more are added the SUM will increase. I want to join only the latest row. So hypothetically:
SELECT SUM(pl.achieved)
FROM payout p
LEFT JOIN payout_ledgers pl ON pl.payout_id = p.id ORDER BY pl.ts DESC LIMIT 1
WHERE ...
ORDER BY ...
LIMIT ...
(which sadly doesn't work)
What I have tried:
Using a subquery works, but is painfully slow given the size of the data set (and other omitted properties and where clauses etc.):
SELECT SUM(pl.achieved)
FROM payout p
LEFT JOIN payout_ledgers pl ON pl.payout_id = p.id AND pl.id = (SELECT id FROM payout_ledgers WHERE payout_id = p.id ORDER BY ts DESC LIMIT 1)
Incidentally, I'm unsure why this subquery is so slow (~12 seconds, as opposed to 150ms with no subquery). I would have expected it to be quicker given that we're only selecting based on the foreign key (payout_id).
Another thing I tried was to do a select from the join - my logic being that if we select from small joined dataset instead of the whole table it would be quicker. However I was met with relation "pl" does not exist error:
SELECT SUM(pl.achieved)
FROM payouts p
LEFT JOIN payout_ledgers pl ON pl.payout_id = p.id
WHERE pl.id = (SELECT id FROM pl ORDER BY ts DESC LIMIT 1)
Thank you in advance for any suggestions. I am also open to suggestions for schema changes that could make this type of logic easier, although my preference would be to try and get the query working since the schema is not easy to change on our production environment.
If you're on Postgres 9.4+, you can use a LEFT JOIN LATERAL (docs)
SELECT SUM(sub.achieved)
FROM payout p
LEFT JOIN LATERAL (SELECT achieved
FROM payout_ledgers pl
WHERE pl.payout_id = p.id
ORDER BY pl.ts DESC LIMIT 1) sub ON true
This will return the sum of the "achieved" field in the most recent entry in payout_ledgers for all payouts.
window functions:
-- using row_number()
SELECT SUM(sss.achieved)
FROM (SELECT pl.achieved
, row_number() OVER (PARTITION BY pl.payout_id, ORDER BY pl.ts DESC)
FROM payouts p
JOIN payout_ledgers pl ON pl.payout_id = p.id
) sss
WHERE sss.rn =1
;
-- using last_value()
SELECT SUM(sss.achieved)
FROM (SELECT
, last_value(achieved) OVER (PARTITION BY pl.payout_id, ORDER BY pl.ts ASC) AS achieved
FROM payouts p
JOIN payout_ledgers pl ON pl.payout_id = p.id
) sss
;
BTW: you do not need the LEFT JOIN (adding no value to the SUM does not change the sum)

Strange Behaviour on Postgresql query

We created a view in Postgres and I am getting strange result.
View Name: event_puchase_product_overview
When I try to get records with *, I get the correct result. but when I try to get specific fields, I get wrong values.
I hope the screens attached here can explain the problem well.
select *
from event_purchase_product_overview
where id = 15065;
select id, departure_id
from event_puchase_product_overview
where id = 15065;
VIEW definition:
CREATE OR REPLACE VIEW public.event_puchase_product_overview AS
SELECT row_number() OVER () AS id,
e.id AS departure_id,
e.type AS event_type,
e.name,
p.id AS product_id,
pc.name AS product_type,
product_date.attribute AS option,
p.upcomming_date AS supply_date,
pr.date_end AS bid_deadline,
CASE
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_hotel'::text) tt)) THEN e.maximum_rooms
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_flight'::text) tt)) THEN e.maximum_seats
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_bike'::text) tt)) THEN e.maximum_bikes
ELSE e.maximum_seats
END AS departure_qty,
CASE
WHEN now()::date > pr.date_end AND po.state::text = 'draft'::text THEN true
ELSE false
END AS is_deadline,
pl.product_qty::integer AS purchased_qty,
pl.comments,
pl.price_unit AS unit_price,
rp.id AS supplier,
po.id AS po_ref,
po.state AS po_state,
po.date_order AS po_date,
po.user_id AS operator,
pl.po_state_line AS line_status
FROM event_event e
LEFT JOIN product_product p ON p.related_departure = e.id
LEFT JOIN product_template pt ON pt.id = p.product_tmpl_id
LEFT JOIN product_category pc ON pc.id = pt.categ_id
LEFT JOIN purchase_order_line pl ON pl.product_id = p.id
LEFT JOIN purchase_order po ON po.id = pl.order_id
LEFT JOIN purchase_order_purchase_requisition_rel prr ON prr.purchase_order_id = po.id
LEFT JOIN purchase_requisition pr ON pr.id = prr.purchase_requisition_id
LEFT JOIN res_partner rp ON rp.id = po.partner_id
LEFT JOIN ( SELECT p_1.id AS product_id,
pav.name AS attribute
FROM product_product p_1
LEFT JOIN product_attribute_value_product_product_rel pa ON pa.prod_id = p_1.id
LEFT JOIN product_attribute_value pav ON pav.id = pa.att_id
LEFT JOIN product_attribute pat ON pat.id = pav.attribute_id
WHERE pat.name::text <> ALL (ARRAY['Date'::character varying, 'Departure'::character varying]::text[])) product_date ON product_date.product_id = p.id
WHERE (p.id IN ( SELECT DISTINCT mrp_bom_line.product_id
FROM mrp_bom_line)) AND p.active
ORDER BY e.id, pt.categ_id, p.id;
If I add new event_event or new product_product I'll get a new definition of row_number in my view, then the column ID of my view is not stable.
at least you can't use row_number as Id of the view,
If you insist to use row_number, you can use the Order By "creation DATE" by this way all new records will be as last lines in the view and this will not change the correspondency between ID (row_number) and other columns.
Hope that helps !
Very likely the execution plan of your query depends on the columns you select. Compare the execution plans!
Your id is generated using the row_number window function. Now window functions are executed before the ORDER BY clause, so the order will depend on the execution plan and hence on the columns you select.
Using row_number without an explicit ordering doesn't make any sense.
To fix that, don't use
row_number() OVER ()
but
row_number() OVER (ORDER BY e.id, pt.categ_id, p.id)
so that you have a reliable ordering.
In addition, you should omit the ORDER BY clause at the end.

What's the difference between these joins?

What's the difference between
SELECT COUNT(*)
FROM TOOL T
LEFT OUTER JOIN PREVENT_USE P ON T.ID = P.TOOL_ID
WHERE
P.ID IS NULL
and
SELECT COUNT(*)
FROM TOOL T
LEFT OUTER JOIN PREVENT_USE P ON T.ID = P.TOOL_ID AND P.ID IS NULL
?
The bottom query is equivalent to
SELECT COUNT(*)
FROM TOOL T
since it is not limiting the result set but rather producing a joined table with a lot of null fields for the right part of the join.
The first query is a left anti join.

how to solve this complicated sql query

these are the five given tables
http://i58.tinypic.com/53wcxe.jpg
this is the recomanded result
http://i58.tinypic.com/2vsrts7.jpg
please help how can i write a query to have this result.
no idea how!!!!
SELECT K.* , COUNT (A.Au_ID) AS AnzahlAuftr
FROM Kunde K
LEFT JOIN Auftrag A ON K.Kd_ID = A.Au_Kd_ID
GROUP BY K.Kd_ID,K.Kd_Firma,K.Kd_Strasse,K.Kd_PLZ,K.Kd_Ort
ORDER BY K.Kd_PLZ DESC;
SELECT COUNT (F.F_ID) AS AnzahlFahrt
FROM Fahrten F
RIGHT JOIN Auftrag A ON A.Au_ID = F.F_Au_ID
SELECT SUM (T.Ts_Strecke) AS SumStrecke
FROM Teilstrecke T
LEFT JOIN Fahrten F ON F.F_ID = T.Ts_F_ID
how to join these 3 in one?
Grouping on Strasse etc. is not necessary and can be quite expensive. What about this approach:
SELECT K.*, ISNULL(Au.AnzahlAuftr,0) AS AnzahlAuftr, ISNULL(Au.AnzahlFahrt,0) AS AnzahlFahrt, ISNULL(Au.SumStrecke,0) AS SumStrecke
FROM Kunde K
LEFT OUTER JOIN
(SELECT A.Au_Kd_ID, COUNT(*) AS AnzahlAuftr, SUM(Fa.AnzahlFahrt1) AS AnzahlFahrt, SUM(Fa.SumStrecke2) AS SumStrecke
FROM Auftrag A LEFT OUTER JOIN
(SELECT F.F_Au_ID, COUNT(*) AS AnzahlFahrt1, SUM(Ts.SumStrecke1) AS SumStrecke2
FROM Fahrten F LEFT OUTER JOIN
(SELECT T.Ts_F_ID, SUM(T.Ts_Strecke) AS SumStrecke1
FROM Teilstrecke T
GROUP BY T.Ts_F_ID) AS Ts
ON Ts.Ts_F_ID = F.F_ID
GROUP BY F.F_Au_ID) AS Fa
ON Fa.F_Au_ID = A.Au_ID
GROUP BY A.Au_Kd_ID) AS Au
ON Au.Au_Kd_ID = K.Kd_ID

How to aggregate calculation in SQL Server?

I have a following script to get the total unit but it gives me an error
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
Do I need to calculate SUM(ta.Qty) outside the main table?
SELECT
ta.ProductName
, SUM(ta.Total)
, SUM(SUM(ta.Qty) * ta.Unit)
FROM
tableA tA
INNER JOIN
tableB tB on tA.ID = tb.TableAID
INNER JOIN
tableC tc on ta.ID = tc.TableAID
INNER JOIN
tableD td on td.ID = tb.TableBID
GROUP BY
ta.ProductName
Here is a query in the AdventureWorks database that produces the same error (but might make some sense):
SELECT v.Name AS Vendor, SUM(SUM(p.ListPrice*d.OrderQty)+h.Freight)
FROM Production.Product p
INNER JOIN Purchasing.PurchaseOrderDetail d ON p.ProductID = d.ProductID
INNER JOIN Purchasing.PurchaseOrderHeader h ON h.PurchaseOrderID = d.PurchaseOrderID
INNER JOIN Purchasing.Vendor v ON v.BusinessEntityID = h.VendorID
GROUP BY v.Name
And here are two ways that I could rewrite that query to avoid the error:
SELECT v.Name AS Vendor, SUM(x.TotalAmount+h.Freight)
FROM (
SELECT PurchaseOrderID, SUM(p.ListPrice*d.OrderQty) AS TotalAmount
FROM Production.Product p
INNER JOIN Purchasing.PurchaseOrderDetail d ON p.ProductID = d.ProductID
GROUP BY PurchaseOrderID
) x
INNER JOIN Purchasing.PurchaseOrderHeader h ON h.PurchaseOrderID = x.PurchaseOrderID
INNER JOIN Purchasing.Vendor v ON v.BusinessEntityID = h.VendorID
GROUP BY v.Name
SELECT v.Name AS Vendor, SUM(x.TotalAmount+h.Freight)
FROM Purchasing.PurchaseOrderHeader h
INNER JOIN Purchasing.Vendor v ON v.BusinessEntityID = h.VendorID
CROSS APPLY (
SELECT SUM(p.ListPrice*d.OrderQty) AS TotalAmount
FROM Production.Product p
INNER JOIN Purchasing.PurchaseOrderDetail d ON p.ProductID = d.ProductID
WHERE d.PurchaseOrderID=h.PurchaseOrderID
) x
GROUP BY v.Name
The first query uses derived tables and the second one uses CROSS APPLY.