What index do I need to create? - postgresql

I have a query which is sometimes really slow; how can I speed it up?
SELECT PRODUCTS.ID,
SPECIALPRODUCTGROUPS."id" AS "isProductGroup",
PRODUCTS."OEM",
PRODUCTS.NAME,
MAIN."stockBalance" AS STOCKBALANCE,
PRODUCTS."minShippingRate",
PRODUCTS."externalId",
ARTICLE,
"categoryId",
BRAND,
PRICES."price" AS "price"
FROM PUBLIC."Products" AS PRODUCTS
INNER JOIN PUBLIC."Prices" AS PRICES ON PRODUCTS.ID = PRICES."productId"
AND PRICES."accountId" = 13576
AND PRICES."price" >= 0
AND PRICES."price" <= 337802
INNER JOIN PUBLIC."RegionalWarehouseStockBalances" AS MAIN ON PRODUCTS.ID = MAIN."productId"
AND MAIN."warehouseId" = 1
AND MAIN."stockBalance" > 0
LEFT JOIN PUBLIC."SpecialProductGroups" AS SPECIALPRODUCTGROUPS ON PRODUCTS."productGroupId" = SPECIALPRODUCTGROUPS."productGroupId"
AND SPECIALPRODUCTGROUPS."accountId" = 13576
AND NOW() < SPECIALPRODUCTGROUPS."finishedAt"
WHERE PRODUCTS."active" = TRUE
ORDER BY BRAND ASC
LIMIT 50
Here is the EXPLAIN output of this query.
I can't paste the EXPLAIN output as text because Stack Overflow complains about the amount of code, so I have added it here: https://explain.depesz.com/s/4UAg
I tried to create indexes on RegionalWarehouseStockBalances, but none of my variants helped.
I am using PostgreSQL 12

You need to run
VACUUM "Prices";
so that the index-only scan has few "heap fetches". That will make all the difference.
Reduce autovacuum_vacuum_scale_factor for that table so that the system vacuums the table frequently.
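For example, a per-table setting might look like this (a sketch; the exact value depends on how quickly the table changes):
ALTER TABLE public."Prices" SET (autovacuum_vacuum_scale_factor = 0.01);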

Related

Select query became very very very slow in postgresql

I have one table which contains 133,072,194 records, and I am trying to execute
SELECT COUNT(test)
FROM mytable
WHERE test = false
but it is taking Execution time: 128320.712 ms.
I already have an index on the test column. Could you please let me know what I can optimize or change so that my query becomes faster?
Because of this, my other select queries are also not working well.
If there are many rows where test is FALSE, you won't be able to get an exact result faster than with a sequential scan, which is slow for big tables.
If only a few rows satisfy the condition, you should create a partial index:
CREATE INDEX mytable_notest_ind ON mytable(id) WHERE NOT test;
(assuming that id is the primary key) and keep mytable autovacuumed often enough that you get an index-only scan.
But usually exact results for queries like this are not required.
You could calculate an estimated count from the table statistics with a query like this:
SELECT t.reltuples
       * (1 - s.null_frac)
       * mcv.freq AS count_false
FROM pg_stats AS s
   CROSS JOIN LATERAL unnest(s.most_common_vals::text::boolean[],
                             s.most_common_freqs) AS mcv(val, freq)
   JOIN pg_class AS t
      ON s.tablename = t.relname
         AND s.schemaname = t.relnamespace::regnamespace::text
WHERE s.tablename = 'mytable'
  AND s.attname = 'test'
  AND mcv.val = FALSE;
That would be very fast.
See my blog post for more considerations about the speed of SELECT count(*).

selecting from a view is taking longer than 30+ minutes

I am working on making this view fast enough to fetch the result set in a reasonable time; at the moment it takes more than 30 minutes, goes parallel, and causes all sorts of pain with increased CPU time. I have identified the problem query, but I can't figure out a way to cut the execution time, either by rewriting the query or by adding an appropriate index if needed. We already have a clustered index on client_id and a non-clustered index on the hash_key column in both tables. The joined tables are also large: close to 238 million records in work_orders and 287,011,570 records in the s_inspections table.
select
wo.client_id,
wo.work_orders_hash_key,
wo.work_order_number,
wo.work_order_id,
si.inspection_id,
si.inspection_name,
si.inspection_detail,
si.master_inspection_id,
si.master_inspection_detail,
si.status_id,
si.exception,
si.inspection_order,
si.comment,
si.[procedure_id],
si.[flag_id],
si.[asset_id],
si.[asset_name],
si.[inspection_status],
si.[is_removed],
si.[response],
row_number() over(partition by si.work_orders_hash_key, si.inspection_id order by si.dss_version desc) rnk
from
datavault.dbo.h_work_orders wo with (readuncommitted)
join datavault.dbo.s_inspections si with (readuncommitted) on wo.client_id = si.client_id and wo.work_orders_hash_key = si.work_orders_hash_key
where
wo.client_id in (7700876368663, 8800387996408)
Below is the estimated execution plan; the query was taking quite some time, so I couldn't provide the actual execution plan.
https://www.brentozar.com/pastetheplan/?id=ryLzvNwUN
Any help would be greatly appreciated.
Your Compute Scalar operator is 59% of your query cost.
I would guess it's this line:
row_number() over(partition by si.work_orders_hash_key, si.inspection_id order by si.dss_version desc) rnk
It's estimating 159,014,000,000,000 rows!
Whack this line (that's a lot of work just to return a row number) and run it again.
Maybe this will work to keep you in business, since the row_number() was the issue. Try:
;with x as (
select
wo.client_id,
wo.work_orders_hash_key,
wo.work_order_number,
wo.work_order_id,
si.inspection_id,
si.inspection_name,
si.inspection_detail,
si.master_inspection_id,
si.master_inspection_detail,
si.status_id,
si.exception,
si.inspection_order,
si.comment,
si.[procedure_id],
si.[flag_id],
si.[asset_id],
si.[asset_name],
si.[inspection_status],
si.[is_removed],
si.[response],
si.dss_version
from
datavault.dbo.h_work_orders wo with (readuncommitted)
join datavault.dbo.s_inspections si with (readuncommitted) on wo.client_id = si.client_id and wo.work_orders_hash_key = si.work_orders_hash_key
where
wo.client_id in (7700876368663, 8800387996408)
)
select
x.client_id,
x.work_orders_hash_key,
x.work_order_number,
x.work_order_id,
x.inspection_id,
x.inspection_name,
x.inspection_detail,
x.master_inspection_id,
x.master_inspection_detail,
x.status_id,
x.exception,
x.inspection_order,
x.comment,
x.[procedure_id],
x.[flag_id],
x.[asset_id],
x.[asset_name],
x.[inspection_status],
x.[is_removed],
x.[response],
row_number() over(partition by x.work_orders_hash_key, x.inspection_id order by x.dss_version desc) rnk
from x;

SQLITE : Optimize ORDER BY Query

All,
I am an iOS developer. Currently we have about 2.5 lakh (250,000) records stored in the database, and we have implemented search functionality on them. Below is the query we are using.
select CustomerMaster.CustomerName ,CustomerMaster.CustomerNumber,
CallActivityList.CallActivityID,CallActivityList.CustomerID,CallActivityList.UserID,
CallActivityList.ActivityType,CallActivityList.Objective,CallActivityList.Result,
CallActivityList.Comments,CallActivityList.CreatedDate,CallActivityList.UpdateDate,
CallActivityList.CallDate,CallActivityList.OrderID,CallActivityList.SalesPerson,
CallActivityList.GratisProduct,CallActivityList.CallActivityDeviceID,
CallActivityList.IsExported,CallActivityList.isDeleted,CallActivityList.TerritoryID,
CallActivityList.TerritoryName,CallActivityList.Hours,UserMaster.UserName,
(FirstName ||' '||LastName) as UserNameFull,UserMaster.TerritoryID as UserTerritory
from
CallActivityList
inner join CustomerMaster
ON CustomerMaster.DeviceCustomerID = CallActivityList.CustomerID
inner Join UserMaster
On UserMaster.UserID = CallActivityList.UserID
where
(CustomerMaster.CustomerName like '%T%' or
CustomerMaster.CustomerNumber like '%T%' or
CallActivityList.ActivityType like '%T%' or
CallActivityList.TerritoryName like '%T%' or
CallActivityList.SalesPerson like '%T%' )
and CallActivityList.IsExported!='2' and CallActivityList.isDeleted != '1'
order by
CustomerMaster.CustomerName
limit 50 offset 0
Without ORDER BY, the query returns results in 0.5 seconds. But when I add ORDER BY, the time increases to 2 seconds.
I have tried indexing, but it does not make any noticeable difference. Can anyone please help? If this should not be done through the query, then how can we make it fast?
Thanks in advance.
This is due to the LIMIT. Without ORDER BY, only 50 records have to be processed, and any 50 will be returned. With ORDER BY, all the records have to be processed in order to determine which ones are the first 50 (in order).
The problem is that the ORDER BY is performed on a column of a joined table. Otherwise you could apply the limit to the main table (I assume it is CallActivityList) first and then join.
SELECT ...
FROM
(SELECT ... FROM CallActivityList ORDER BY ... LIMIT 50 OFFSET 0) AS CAL
INNER JOIN CustomerMaster ON ...
INNER JOIN UserMaster ON ...
ORDER BY ...
This would reduce the cost of joining the tables. If this is not possible, try at least to join CallActivityList with CustomerMaster first, apply the limit to that result, and finally join with UserMaster.
SELECT ...
FROM
(SELECT ...
FROM
CallActivityList
INNER JOIN CustomerMaster ON ...
ORDER BY CustomerMaster.CustomerName
LIMIT 50 OFFSET 0) AS ActCust
INNER JOIN UserMaster ON ...
ORDER BY ...
Also, in order to make the ordering unambiguous, I would include more columns in the ORDER BY, like the call date and the call ID. Otherwise this could result in inconsistent paging.
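For example (a sketch using column names that appear in the query above):
order by CustomerMaster.CustomerName, CallActivityList.CallDate, CallActivityList.CallActivityID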

Sequential scan rather than index scan

I have a bunch of tables in PostgreSQL, and I run a query as follows:
SELECT DISTINCT ON ...some stuff...
FROM "rent_flats" INNER JOIN "rent_flats_linked_users"
ON "rent_flats_linked_users"."rent_flat_id" = "rent_flats"."id"
INNER JOIN "users"
ON "users"."id" = rent_flats_linked_users"."user_id"
INNER JOIN "owners"
ON "owners"."id" = "users"."profile_id" AND "users"."profile_type" = 'Owner'
INNER JOIN "phone_numbers"
ON "phone_numbers"."person_id" = "owners"."id" AND "phone_numbers"."person_type" = 'Owner'
INNER JOIN "phone_number_categories"
ON "phone_number_categories"."id" = "phone_numbers"."phone_number_category_id"
INNER JOIN "localities"
ON "localities"."id" = "rent_flats"."locality_id"
INNER JOIN "regions"
ON "regions"."id" = "localities"."region_id"
INNER JOIN "cities"
ON "cities"."id" = "regions"."city_id"
INNER JOIN "property_types"
ON "property_types"."id" = "rent_flats"."property_type_id"
INNER JOIN "apartment_types"
ON "apartment_types"."id" = "rent_flats"."apartment_type_id"
WHERE "rent_flats"."status" = 3
AND (((extract(epoch from age(current_date,rent_flats.date_added))/86400)::int) IN (cities.short_period,cities.long_period))
AND (phone_number_categories.name IN ('SMS','SMS & Mobile'))
ORDER BY rf_id, phone_numbers.priority ASC
Note: The rent_flats table contains around 5 million rows, rent_flats_linked_users contains around 600k rows, and users contains 350k rows. The other tables are small.
The query takes about 6.8 seconds to execute, and the EXPLAIN ANALYZE output shows that around 50% of the total time goes into sequential scans of the rent_flats, users and rent_flats_linked_users tables, and another 30% into hash joins.
With enable_seqscan set to off, the query takes even longer, ~11 seconds (in this case Hash and Hash Join take up to 97.5% of the time).
Here's the query plan from EXPLAIN ANALYZE.
I have put indexes on the fields involved in the inner joins as well as on the fields involved in the filters, like phone_numbers.priority and cities.short_period and cities.long_period. But I still get a sequential scan. What can be the reasons, and what are possible solutions to speed up the query?
I suspect that if there is a part of that query worth optimising then it is this:
(((extract(epoch from age(current_date,rent_flats.date_added))/86400)::int) IN (cities.short_period,cities.long_period))
You really need to turn that into something like:
rent_flats.date_added in (...)
Then you can index date_added, and maybe index (date_added, status).
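For example, a minimal sketch of the rewrite, assuming short_period and long_period are integer day counts:
AND rent_flats.date_added IN (current_date - cities.short_period,
                              current_date - cities.long_period)
-- together with an index like:
CREATE INDEX ON rent_flats (date_added, status);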
The next step would be to make sure that the join columns are indexed.

Sql Server Union Query Optimization

I have been given a task to optimize the SQL query below. Currently the query is timing out and causing a lot of blocking. I have just started using T-SQL, so please help me optimize the query.
select ExcludedID
from OfferConditions with (NoLock)
where OfferID = 27251
and ExcludedID in (210,223,409,423,447,480,633,...lots and lots of these...,
13346,13362,13380,13396,13407,1,2)
union
select CustomerGroupID as ExcludedID
from CPE_IncentiveCustomerGroups ICG with (NoLock)
inner join CPE_RewardOptions RO with (NoLock)
on RO.RewardOptionID = ICG.RewardOptionID
where RO.IncentiveID = 27251
AND ICG.Deleted = 0 and RO.Deleted = 0
and ExcludedUsers = 1
and CustomerGroupID in (210,223,409,423,447,480,633,...lots and lots of these...,
13346,13362,13380,13396,13407,1,2);
You can try inserting those IDs into a temp table and joining to it instead of using the IN list.
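A minimal sketch of that idea (the temp table name is illustrative, and the full ID list from the original query still has to be loaded into it):
CREATE TABLE #ExcludedIDs (ID int PRIMARY KEY);
INSERT INTO #ExcludedIDs (ID) VALUES (210), (223), (409); -- ...and so on for the rest of the list
select oc.ExcludedID
from OfferConditions oc with (NoLock)
inner join #ExcludedIDs x on x.ID = oc.ExcludedID
where oc.OfferID = 27251;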
The key to solving your problem is NOT to fix the SQL, but to fix the indexes on your tables. For example, you should have a compound index on the OfferConditions table with OfferID and ExcludedID.
When you create the indexes on the other tables, remember that if a field appears in the WHERE clause OR in a join condition, it should be part of your compound index.
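A sketch of such indexes (the names are illustrative; verify the column order against the actual execution plan):
CREATE INDEX IX_OfferConditions_OfferID_ExcludedID
    ON OfferConditions (OfferID, ExcludedID);
CREATE INDEX IX_CPE_RewardOptions_IncentiveID
    ON CPE_RewardOptions (IncentiveID, Deleted, RewardOptionID);
CREATE INDEX IX_CPE_IncentiveCustomerGroups_RewardOptionID
    ON CPE_IncentiveCustomerGroups (RewardOptionID, Deleted, ExcludedUsers, CustomerGroupID);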