Optimise a WITH query in PostgreSQL

I have a working PostgreSQL query, but it is taking a considerable amount of time to execute. I need help optimising it.
So far I have:
Removed inner queries as much as possible.
Removed unnecessary data from the query.
Created a WITH query that fetches the required data up front.
Here is the query:
with data as (
select
e.id,
e.name,
t.barcode,
tt.variant,
t.cost_cents::decimal / 100 as ticket_cost,
t.fee_cents::decimal / 100 as booking_fee
from
tickets t
inner join events e on t.event_id = e.id
inner join ticket_types tt on t.ticket_type_id = tt.id
where
t.status = 2
and e.source in ('source1', 'source2')
)
select
d.name,
count(distinct d.barcode) as issued,
(select count(distinct d2.barcode) from data d2 where d2.id = d.id and d2.variant is null) as sold,
sum(d.ticket_cost) as ticket_revenue,
sum(d.booking_fee) as booking_fees
from
data d
group by
id,
name

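It is better to find the slow parts using EXPLAIN (or EXPLAIN ANALYZE, which also runs the query and reports actual timings).
It shows the estimated cost of every node in the plan.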

You can speed up joins by creating proper indexes.
Also, remove the subquery
(select count(distinct d2.barcode) from data d2 where d2.id = d.id and d2.variant is null)
from the SELECT clause and join data a second time as d2, something like this:
select
d.name,
count(distinct d.barcode) as issued,
count(distinct d2.barcode) as sold,
sum(d.ticket_cost) as ticket_revenue,
sum(d.booking_fee) as booking_fees
from
data d
left join data d2 on (d2.id = d.id and d2.variant is null)
group by
d.id,
d.name
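
One caveat worth checking with a self-join like the one above: every d row is repeated once per matching d2 row, so count(distinct ...) is unaffected but sum(...) can be inflated. The sketch below uses made-up rows and Python's sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example runs anywhere) to compare the self-join with a single pass that counts the NULL-variant barcodes inside the aggregate.

```python
import sqlite3

# Toy stand-in for the "data" CTE: one event, three tickets, two with a NULL variant.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table data (id int, name text, barcode text, variant text, ticket_cost real)"
)
conn.executemany(
    "insert into data values (?, ?, ?, ?, ?)",
    [
        (1, "gig", "b1", None, 10.0),
        (1, "gig", "b2", None, 10.0),
        (1, "gig", "b3", "vip", 10.0),
    ],
)

# Self-join: each d row appears once per matching d2 row (3 x 2 = 6 joined rows).
self_join = conn.execute("""
    select count(distinct d.barcode), count(distinct d2.barcode), sum(d.ticket_cost)
    from data d
    left join data d2 on d2.id = d.id and d2.variant is null
    group by d.id, d.name
""").fetchone()

# Single pass: the CASE yields NULL for non-matching rows, which count() ignores.
single_pass = conn.execute("""
    select count(distinct barcode),
           count(distinct case when variant is null then barcode end),
           sum(ticket_cost)
    from data
    group by id, name
""").fetchone()

print(self_join)    # (3, 2, 60.0): revenue is doubled by the join
print(single_pass)  # (3, 2, 30.0)
```

In PostgreSQL the same single-pass count can be written as count(distinct barcode) filter (where variant is null).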

Related

How to count amount of results in Postgresql ignoring one to many join

I have a pretty big query with multiple joins, and I also need a query with the same joins but without the limit/offset, just to get the total number of rows for pagination.
I'm going to simplify the query here to a single join and omit all where clauses.
I have 5 offers and each offer has 2 events.
Main query:
SELECT o.id AS "id",
o.display_name AS "displayName",
o.offer_hidden AS "offerHidden",
o.offer_type AS "offerType",
array_to_json(o.countries) AS "countries",
string_agg(distinct concat(COALESCE(e.event_id::text, ''), ',, ', COALESCE(e.event_tag::text, ''), ',, ',
COALESCE(e.payout::text, '')), ';;') AS "events"
FROM offer o
INNER JOIN (SELECT e.event_id, e.event_tag, e.payout, e.offer_id FROM event e ORDER BY e.offer_id ASC) e
ON e.offer_id = o.id
GROUP BY o.id
As you can see, I'm using concat and string_agg on the events to get a single record for each offer.
Count query:
SELECT count(1) AS count
FROM offer o
INNER JOIN (SELECT e.event_id, e.event_tag, e.payout, e.offer_id FROM event e ORDER BY e.offer_id ASC) e
ON e.offer_id = o.id
Here I'm trying to make the query as lightweight as possible by omitting all selected columns and using only count, but I get a count of 10 because each offer has 2 events (5*2 = 10).
Question: is it possible to count only by the main table, while still using data from the joins for filtering/ordering?
Updated: I know I can add the same concat and string_agg to the count query, but this query should be lightweight, as it is going to scan all records without limit/offset.
Updated: it looks like I found a possible solution using distinct
SELECT count(distinct o.id) AS count
FROM offer o
INNER JOIN (SELECT e.event_id, e.event_tag, e.payout, e.offer_id FROM event e ORDER BY e.offer_id ASC) e
ON e.offer_id = o.id
but I'm not sure it is the best way to do it.
You should use a semi-join:
SELECT count(*) AS count
FROM offer o
WHERE EXISTS (SELECT 1 FROM event e WHERE e.offer_id = o.id);
That will be faster than using DISTINCT.
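
To make the difference between the three counts concrete, here is a small sketch with the 5 offers and 2 events per offer described in the question. It uses Python's sqlite3 module as a stand-in for PostgreSQL and a reduced two-column schema, both of which are assumptions for illustration. It only checks the results and says nothing about speed, which is best compared with EXPLAIN ANALYZE on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table offer (id integer primary key);
    create table event (event_id integer primary key, offer_id integer);
""")
# 5 offers, each with 2 events.
conn.executemany("insert into offer values (?)", [(i,) for i in range(1, 6)])
conn.executemany(
    "insert into event (offer_id) values (?)",
    [(i,) for i in range(1, 6) for _ in range(2)],
)

# Plain join: one row per (offer, event) pair.
join_count = conn.execute(
    "select count(*) from offer o join event e on e.offer_id = o.id"
).fetchone()[0]

# Join plus DISTINCT: collapses the duplicates after producing them.
distinct_count = conn.execute(
    "select count(distinct o.id) from offer o join event e on e.offer_id = o.id"
).fetchone()[0]

# Semi-join: each offer is counted at most once, however many events match.
exists_count = conn.execute(
    "select count(*) from offer o "
    "where exists (select 1 from event e where e.offer_id = o.id)"
).fetchone()[0]

print(join_count, distinct_count, exists_count)  # 10 5 5
```

Filters on the joined table can go inside the EXISTS subquery, so the count still respects them without multiplying rows.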

Strange Behaviour on Postgresql query

We created a view in Postgres and I am getting strange results.
View name: event_puchase_product_overview
When I select the records with *, I get the correct result, but when I select specific fields, I get wrong values.
I hope the screens attached here can explain the problem well.
select *
from event_puchase_product_overview
where id = 15065;
select id, departure_id
from event_puchase_product_overview
where id = 15065;
VIEW definition:
CREATE OR REPLACE VIEW public.event_puchase_product_overview AS
SELECT row_number() OVER () AS id,
e.id AS departure_id,
e.type AS event_type,
e.name,
p.id AS product_id,
pc.name AS product_type,
product_date.attribute AS option,
p.upcomming_date AS supply_date,
pr.date_end AS bid_deadline,
CASE
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_hotel'::text) tt)) THEN e.maximum_rooms
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_flight'::text) tt)) THEN e.maximum_seats
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_bike'::text) tt)) THEN e.maximum_bikes
ELSE e.maximum_seats
END AS departure_qty,
CASE
WHEN now()::date > pr.date_end AND po.state::text = 'draft'::text THEN true
ELSE false
END AS is_deadline,
pl.product_qty::integer AS purchased_qty,
pl.comments,
pl.price_unit AS unit_price,
rp.id AS supplier,
po.id AS po_ref,
po.state AS po_state,
po.date_order AS po_date,
po.user_id AS operator,
pl.po_state_line AS line_status
FROM event_event e
LEFT JOIN product_product p ON p.related_departure = e.id
LEFT JOIN product_template pt ON pt.id = p.product_tmpl_id
LEFT JOIN product_category pc ON pc.id = pt.categ_id
LEFT JOIN purchase_order_line pl ON pl.product_id = p.id
LEFT JOIN purchase_order po ON po.id = pl.order_id
LEFT JOIN purchase_order_purchase_requisition_rel prr ON prr.purchase_order_id = po.id
LEFT JOIN purchase_requisition pr ON pr.id = prr.purchase_requisition_id
LEFT JOIN res_partner rp ON rp.id = po.partner_id
LEFT JOIN ( SELECT p_1.id AS product_id,
pav.name AS attribute
FROM product_product p_1
LEFT JOIN product_attribute_value_product_product_rel pa ON pa.prod_id = p_1.id
LEFT JOIN product_attribute_value pav ON pav.id = pa.att_id
LEFT JOIN product_attribute pat ON pat.id = pav.attribute_id
WHERE pat.name::text <> ALL (ARRAY['Date'::character varying, 'Departure'::character varying]::text[])) product_date ON product_date.product_id = p.id
WHERE (p.id IN ( SELECT DISTINCT mrp_bom_line.product_id
FROM mrp_bom_line)) AND p.active
ORDER BY e.id, pt.categ_id, p.id;
If you add a new event_event or product_product row, row_number is recomputed for the whole view, so the id column of the view is not stable.
At the very least, you can't use row_number as the id of the view.
If you insist on using row_number, order it by a creation date. That way all new records land at the end of the view and the correspondence between id (row_number) and the other columns does not change.
Hope that helps!
Very likely the execution plan of your query depends on the columns you select. Compare the execution plans!
Your id is generated using the row_number window function. Now window functions are executed before the ORDER BY clause, so the order will depend on the execution plan and hence on the columns you select.
Using row_number without an explicit ordering doesn't make any sense.
To fix that, don't use
row_number() OVER ()
but
row_number() OVER (ORDER BY e.id, pt.categ_id, p.id)
so that you have a reliable ordering.
In addition, you should omit the ORDER BY clause at the end.
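
A minimal sketch of the suggested fix, using Python's sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later) and a hypothetical three-row table in place of the real joins. With an ORDER BY inside OVER, the id depends only on the data, so it is the same whichever columns the outer query selects. The sort key should be unique, otherwise tied rows can still swap ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, much reduced stand-in for the joined tables of the view.
conn.execute("create table t (departure_id int, product_id int)")
conn.executemany("insert into t values (?, ?)", [(2, 20), (1, 11), (1, 10)])

# The numbering is tied to an explicit, unique sort key.
conn.execute("""
    create view overview as
    select row_number() over (order by departure_id, product_id) as id,
           departure_id,
           product_id
    from t
""")

all_cols = conn.execute("select * from overview where id = 2").fetchone()
two_cols = conn.execute("select id, departure_id from overview where id = 2").fetchone()

print(all_cols)  # (2, 1, 11)
print(two_cols)  # (2, 1)
```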

Can't solve this SQL query

I'm having difficulty with a SQL query. I use PostgreSQL.
The task says: show the customers that have placed at least one order that contains products from 3 different categories. The result should have 2 columns, CustomerID and the number of orders. I have written this code, but I don't think it's correct.
select SalesOrderHeader.CustomerID,
count(SalesOrderHeader.SalesOrderID) AS amount_of_orders
from SalesOrderHeader
inner join SalesOrderDetail on
(SalesOrderHeader.SalesOrderID=SalesOrderDetail.SalesOrderID)
inner join Product on
(SalesOrderDetail.ProductID=Product.ProductID)
where SalesOrderDetail.SalesOrderDetailID in
(select DISTINCT count(ProductCategoryID)
from Product
group by ProductCategoryID
having count(DISTINCT ProductCategoryID)>=3)
group by SalesOrderHeader.CustomerID;
Here are the database tables needed for the query:
where SalesOrderDetail.SalesOrderDetailID in
(select DISTINCT count(ProductCategoryID)
is never going to give you a result, because an ID (SalesOrderDetailID) will never logically match a COUNT (count(ProductCategoryID)).
This should get you the output I think you want.
SELECT soh.CustomerID,
COUNT(DISTINCT soh.SalesOrderID) AS amount_of_orders -- DISTINCT: the joins return one row per order line
FROM SalesOrderHeader soh
INNER JOIN SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
INNER JOIN Product p ON sod.ProductID = p.ProductID
GROUP BY soh.CustomerID
HAVING COUNT(DISTINCT p.ProductCategoryID) >= 3 -- HAVING filters the groups, so it follows GROUP BY
Try this:
select CustomerID, count(*) as amount_of_orders from
SalesOrderHeader join
(
-- number of distinct product categories in each order
select SalesOrderID, count(distinct ProductCategoryID) CategoryCount
from SalesOrderDetail JOIN Product using (ProductId)
group by 1
) CatCount using (SalesOrderId)
group by 1
having bool_or(CategoryCount>=3) -- at least one order with CategoryCount >= 3
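
The two readings of the requirement (3 categories within a single order, versus 3 categories across all of a customer's orders) give different answers. The sketch below shows this on made-up rows, with Python's sqlite3 module standing in for PostgreSQL and the tables cut down to the columns the queries use. Both are assumptions for illustration. SQLite has no bool_or, so the per-order check is written as max(category_count) >= 3, which is equivalent here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table SalesOrderHeader (SalesOrderID int, CustomerID int);
    create table SalesOrderDetail (SalesOrderID int, ProductID int);
    create table Product (ProductID int, ProductCategoryID int);
    insert into Product values (1, 1), (2, 2), (3, 3);
    -- Customer 1: order 100 spans 3 categories, order 101 has one.
    -- Customer 2: 3 categories overall, but never within a single order.
    insert into SalesOrderHeader values (100, 1), (101, 1), (200, 2), (201, 2);
    insert into SalesOrderDetail values
        (100, 1), (100, 2), (100, 3), (101, 1), (200, 1), (200, 2), (201, 3);
""")

# Categories counted per order first, then customers with a qualifying order.
per_order = conn.execute("""
    select h.CustomerID, count(*) as amount_of_orders
    from SalesOrderHeader h
    join (
        select d.SalesOrderID, count(distinct p.ProductCategoryID) as category_count
        from SalesOrderDetail d
        join Product p on p.ProductID = d.ProductID
        group by d.SalesOrderID
    ) c on c.SalesOrderID = h.SalesOrderID
    group by h.CustomerID
    having max(c.category_count) >= 3
    order by h.CustomerID
""").fetchall()

# Categories counted across all of a customer's orders.
per_customer = conn.execute("""
    select h.CustomerID, count(distinct h.SalesOrderID) as amount_of_orders
    from SalesOrderHeader h
    join SalesOrderDetail d on d.SalesOrderID = h.SalesOrderID
    join Product p on p.ProductID = d.ProductID
    group by h.CustomerID
    having count(distinct p.ProductCategoryID) >= 3
    order by h.CustomerID
""").fetchall()

print(per_order)     # [(1, 2)]
print(per_customer)  # [(1, 2), (2, 2)]
```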

Postgres join not respecting outer where clause

In SQL Server, I know for sure that the following query;
SELECT things.*
FROM things
LEFT OUTER JOIN (
SELECT thingreadings.thingid, reading
FROM thingreadings
INNER JOIN things on thingreadings.thingid = things.id
ORDER BY reading DESC LIMIT 1) AS readings
ON things.id = readings.thingid
WHERE things.id = '1'
would join against thingreadings only after the WHERE id = 1 had restricted the record set, so it left joins against just one row. However, for performance to be acceptable in Postgres, I have to add the WHERE id = 1 to the INNER JOIN things on thingreadings.thingid = things.id line too.
This isn't ideal. Is it possible to make Postgres understand that I am joining against only one row, without explicitly adding the WHERE clauses everywhere?
An example of this problem can be seen here:
I am trying to recreate the following query in a more efficient way:
SELECT things.id, things.name,
(SELECT thingreadings.id FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1),
(SELECT thingreadings.reading FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1)
FROM things
WHERE id IN (1,2)
http://sqlfiddle.com/#!15/a172c/2
Not really sure why you did all that work. Isn't the inner query enough?
SELECT t.*
FROM thingreadings tr
INNER JOIN things t on tr.thingid = t.id AND t.id = '1'
ORDER BY tr.reading DESC
LIMIT 1;
When you want to select the latest value for each thingID, you can do:
SELECT t.*,a.reading
FROM things t
INNER JOIN (
SELECT t1.*
FROM thingreadings t1
LEFT JOIN thingreadings t2
ON (t1.thingid = t2.thingid AND t1.reading < t2.reading)
WHERE t2.thingid IS NULL
) a ON a.thingid = t.id
The derived table gets you the row with the highest reading for each thingid (compare on id instead of reading if you want the most recent row), then the JOIN gets you the information from the things table for that record.
The where clause in SQL applies to the result set you're requesting, NOT to the join.
What your code is NOT saying: "do this join only for the ID of 1"...
What your code IS saying: "do this join, then pull records out of it where the ID is 1"...
This is why you need the inner where clause. Incidentally, I also think Filipe is right about the unnecessary code.

Subquery in JPA

I am trying to write the following SQL query as a JPA query. The SQL query works (MySQL database), but I don't know how to translate it. I get an error on the token right after the first FROM. There are probably other errors here too, because I was not able to find any guides on how to do sub-queries in the FROM part, aliasing and so on.
SQL query
SELECT tbl.* from (
SELECT u.*, COUNT(u.id) AS question_count FROM app_user AS u
INNER JOIN question AS q ON u.id = q.user_id GROUP BY u.id
) AS tbl ORDER BY tbl.question_count DESC LIMIT 10;
JPA query:
SELECT tbl FROM (SELECT u, COUNT(u.id) question_count FROM User u
INNER JOIN u.questions q ON u.id = q.user_id GROUP BY u.id) tbl
ORDER BY tbl.question_count LIMIT 10
I can't test this with anything right now, but something along the lines of:
final String queryStr = "SELECT u, COUNT(u.id) FROM User u, Questions q WHERE u.id = q.user_id GROUP BY u.id ORDER BY COUNT(u.id) DESC";
Query query = em().createQuery(queryStr);
query.setMaxResults(10);
List<Object[]> results = query.getResultList(); //Index [0] will contain the User-object, [1] will contain Long with result of COUNT(u.id)