SQLITE : Optimize ORDER BY Query - iphone

All,
I am iOS developer. Currently we have stored 2.5 lacks data in database. And we have implemented search functionality on that. Below is the query that we are using.
select CustomerMaster.CustomerName ,CustomerMaster.CustomerNumber,
CallActivityList.CallActivityID,CallActivityList.CustomerID,CallActivityList.UserID,
CallActivityList.ActivityType,CallActivityList.Objective,CallActivityList.Result,
CallActivityList.Comments,CallActivityList.CreatedDate,CallActivityList.UpdateDate,
CallActivityList.CallDate,CallActivityList.OrderID,CallActivityList.SalesPerson,
CallActivityList.GratisProduct,CallActivityList.CallActivityDeviceID,
CallActivityList.IsExported,CallActivityList.isDeleted,CallActivityList.TerritoryID,
CallActivityList.TerritoryName,CallActivityList.Hours,UserMaster.UserName,
(FirstName ||' '||LastName) as UserNameFull,UserMaster.TerritoryID as UserTerritory
from
CallActivityList
inner join CustomerMaster
ON CustomerMaster.DeviceCustomerID = CallActivityList.CustomerID
inner Join UserMaster
On UserMaster.UserID = CallActivityList.UserID
where
(CustomerMaster.CustomerName like '%T%' or
CustomerMaster.CustomerNumber like '%T%' or
CallActivityList.ActivityType like '%T%' or
CallActivityList.TerritoryName like '%T%' or
CallActivityList.SalesPerson like '%T%' )
and CallActivityList.IsExported!='2' and CallActivityList.isDeleted != '1'
order by
CustomerMaster.CustomerName
limit 50 offset 0
Without using 'order by' The query is returning result in 0.5 second. But when i am attaching 'order by', Time is increasing to 2 seconds.
I have tried indexing but it is not making any noticeable change. Any one please help. If we are not going through Query then how can we do it fast.
Thanks in advance.

This is due to the the limit. Without ORDER BY only 50 records have to be processed and any 50 will be returned. With ORDER BY all the records have to be processed in order to determine which ones are the first 50 (in order).
The problem is that the ORDER BY is performed on a joined table. Otherise you could apply the limit on the main table (I assume it is the CallActivityList) first and then join.
SELECT ...
FROM
(SELECT ... FROM CallActivityList ORDER BY ... LIMIT 50 OFFSET 0) AS CAL
INNER JOIN CustomerMaster ON ...
INNER JOIN UserMaster ON ...
ORDER BY ...
This would reduce the costs for joining the tables. If this is not possible, try at least to join CallActivityList with CustomerMaster. Apply the limit to those and finally join with UserMaster.
SELECT ...
FROM
(SELECT ...
FROM
CallActivityList
INNER JOIN CustomerMaster ON ...
ORDER BY CustomerMaster.CustomerName
LIMIT 50 OFFSET 0) AS ActCust
INNER JOIN UserMaster ON ...
ORDER BY ...
Also, in order to make the ordering unambiguous, I would include more columns into the order by, like call date and call id. Otherwise this could result in a inconsistent paging.

Related

Extremely slow planning for query with lot of joins in PostgreSQL

(Postgres v13)
I've got a query which takes 2 - 5 seconds to plan. The query joins my languages table and translations table to get translation results for multiple languages. When I add even more languages/translations to load the execution time is exponentially growing.
select
key0_.id as col_0_0_,
key0_.name as col_1_0_,
(select
count(screenshot60_.id)
from
screenshot screenshot60_
inner join
key key61_
on screenshot60_.key_id=key61_.id
where
key0_.id=key61_.id) as col_2_0_,
languages2_.tag as col_3_0_,
translatio31_.id as col_4_0_,
translatio31_.text as col_5_0_,
translatio31_.state as col_6_0_,
translatio31_.auto as col_7_0_,
translatio31_.mt_provider as col_8_0_,
languages3_.tag as col_11_0_,
translatio32_.id as col_12_0_,
translatio32_.text as col_13_0_,
translatio32_.state as col_14_0_,
translatio32_.auto as col_15_0_,
translatio32_.mt_provider as col_16_0_,
... the same over and over many times ...
languages30_.tag as col_227_0_,
translatio59_.id as col_228_0_,
translatio59_.text as col_229_0_,
translatio59_.state as col_230_0_,
translatio59_.auto as col_231_0_,
translatio59_.mt_provider as col_232_0_,
0 as col_233_0_,
0 as col_234_0_
from
key key0_
inner join
project project1_
on key0_.project_id=project1_.id
inner join
language languages2_
on project1_.id=languages2_.project_id
and (
languages2_.tag='en-US'
)
inner join
language languages3_
on project1_.id=languages3_.project_id
and (
languages3_.tag='es-PE'
)
... many times the same ...
inner join
language languages30_
on project1_.id=languages30_.project_id
and (
languages30_.tag='es-MX'
)
left outer join
translation translatio31_
on key0_.id=translatio31_.key_id
and (
translatio31_.language_id=languages2_.id
)
... many times the same ...
left outer join
translation translatio59_
on key0_.id=translatio59_.key_id
and (
translatio59_.language_id=languages30_.id
)
where
(
key0_.name in (
'base_administrative_notes.desc'
)
)
and
key0_.project_id=836
group by
key0_.id ,
languages2_.tag ,
translatio31_.id ,
languages3_.tag ,
translatio32_.id ,
... many times the same ...
languages30_.tag ,
translatio59_.id
order by
key0_.name asc nulls first,
key0_.id asc nulls first limit 1
The visualised EXPLAIN ANALYSE result: https://explain.dalibo.com/plan/uWS (the full query can be found there as well as raw output from explain (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)).
I found in other threads that this can be caused by using too many indexes on the tables, but I only have a unique index on my translations table on key_id and language_id columns.
EDIT:
I've found out that setting join_collapse_limit to some value between 1 to 5 reduces the planning to under 200ms. Don't know if this is the best solution, but I am going to use it as a workaround for now.
As Laurenz Albe explained, the planner is probably trying to reorder the joins to optimize the query.
With n tables, the number of possible joins order is n! (factorial n).
My suggestion is to :
make sure the order is the best in your query
set that particular parameter to 1 before the query
play the query
reset the parameter
You can check Alicja's slide deck (from slide 22) where she illustrates that particular problem with examples here: https://www.postgresql.eu/events/pgconfeu2017/sessions/session/1617/slides/9/FromMinutesToMilliseconds.pdf

postgresql multiple Common Table Expressions

I have to tables : mutation and ann_commune. The colums code_postal and id_commune are in ann_commune, all the other ones in mutation.
I need to get together the average for the top 3 values of several departments
WITH six as (
SELECT commune as commune_six, AVG(valeur_fonciere) as moyenne_six
FROM mutation
LEFT JOIN ann_commune on mutation.id_commune = ann_commune.id_commune
WHERE code_postal LIKE '06%'
GROUP BY commune_six
ORDER BY moyenne_six DESC
LIMIT 3
),
treize as (
SELECT commune as commune_treize, AVG(valeur_fonciere) as moyenne_treize
FROM mutation
LEFT JOIN ann_commune on mutation.id_commune = ann_commune.id_commune
WHERE code_postal LIKE '13%'
GROUP BY commune_treize
ORDER BY moyenne_treize DESC
LIMIT 3
)
select AVG(moyenne_six)::NUMERIC(7,0) as Moyenne_top3_06,
AVG(moyenne_treize)::NUMERIC(7,0) as Moyenne_top3_13
from six, treize
I am sure there is a way to create a nicer quode, because I have to do the same with 10 averages. I did not find the solutions, but I am sure it exists.
Thank you in advance :).

SQL Server: How to customize one column in Order By clause out of many other columns in order by

I have store procedure which return data fine and it was developed by some one else who now not in touch.
Output now looks like
Here i am attaching a part of the query which return data.
SET #sql = '
Select XX.*,'''' scale,Isnull(AllowComma,''FALSE'') AllowComma,Isnull(AllowedDecimalPlace,''0'') AllowedDecimalPlace,
Isnull(AllowPercentageSign,''FALSE'') AllowPercentageSign,Isnull(CurrencySign,'''') CurrencySign,Isnull(BM_Denominator,'''') BM_Denominator
From
(
---- Broker Detail
Select AA.Section,AA.LineItem,Csm.DisplayInCSM ,AA.BrokerCode Broker,AA.BrokerName,'''' BM_Element,'''' BM_Code,AA.Ord,AA.[Revise Date],AA.LineItemId,
Csm.ID,[FontName],[FontStyle],[FontSize],[UnderLine],[BGColor],[FGColor],[Indent],[Box],[HeadingSubHeading],
'+#PeriodCols+','+#PeriodColsComment +',LineItem_Comment,BrokerName_Comment,Date_Comment
From tblCSM_ModelDetails Csm LEFT OUTER JOIN (
Select b.*,L.ID LineItemId
From #TmpAll_Broker_LI b
INNER JOIN TblLineItemTemplate L ON TickerID='''+#TickerID+''' AND b.LineItem= L.LineItem
) AA ON Csm.LineItemId=AA.LineItemId
WHERE Csm.CSM_ID='+TRIM(CONVERT(CHAR(10),#CSM_Id))+' AND Csm.BMID=0 AND Type !=''SHEET''
UNION
----- Consensus
Select Section, b.LineItem,DisplayInCSM, '''' Broker,'''' BrokerName,'''' BM_Element,'''' BM_Code, Ord,'''' [Revise Date],L.ID LineItemID,
Csm.ID,[FontName],[FontStyle],[FontSize],[UnderLine],[BGColor],[FGColor],[Indent],[Box],[HeadingSubHeading],
'+#PeriodCols+','+#PeriodColsComment +',LineItem_Comment,BrokerName_Comment,Date_Comment
From #TmpZacksCons b
INNER JOIN TblLineItemTemplate L ON TickerID='''+#TickerID+''' AND b.LineItem= L.LineItem
INNER JOIN tblCSM_ModelDetails Csm ON Csm.LineItemID=L.ID
WHERE Csm.CSM_ID='+TRIM(CONVERT(CHAR(10),#CSM_Id))+' AND Csm.BMID=0
---- Blue Metrics
UNION
Select Section, b.LineItem,DisplayInCSM,'''' Broker,'''' BrokerName,BM_Element,Code BM_Code, Ord,'''' [Revise Date],L.ID LineItemID,
Csm.ID,[FontName],[FontStyle],[FontSize],[UnderLine],[BGColor],[FGColor],[Indent],[Box],[HeadingSubHeading],
'+#PeriodCols+','+#PeriodColsComment +',LineItem_Comment,BrokerName_Comment,Date_Comment
From #TmpBM b
INNER JOIN TblLineItemTemplate L ON TickerID='''+#TickerID+''' AND b.LineItem= L.LineItem
INNER JOIN tblCSM_ModelDetails Csm ON Csm.BMID=b.code AND Csm.LineItemID=L.ID
WHERE Csm.CSM_ID='+TRIM(CONVERT(CHAR(10),#CSM_Id))+'
AND Ord IS NOT NULL
) XX
Left Outer Join tblLiConfig ZZ
On XX.Section=ZZ.Section And XX.LineItem=ZZ.LI And ZZ.Ticker='''+#Ticker+'''
Order by ID,Ord,BM_Code,LineItem,BrokerName'
Now broker Name is not coming as alphabetical order and it is the issue.
see this line at the bottom Order by ID,Ord,BM_Code,LineItem,BrokerName
When i try to change this order by like Order by ID,Ord,BM_Code,LineItem,BrokerName IN (SELECT BrokerName FROM #Brokers ORDER BY BrokerName ASC)' then getting error like clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
in my order by there are many columns and data is getting order by that way but i need to show broker name in alphabetical order but i am not being able. so please some one guide me how can i customize this sql.
Here i have not attached my full store procedure code because it is very large. looking for suggestion & help. Thanks
Short version
The ORDER BY is doing what is expected - ordering first by ID, then Ord, then BM_Code, then LineItem, then BrokerName.
Within ID 76187, the next field to order by is Ord - which it sorts from 30911, to 31097.
If it previously ordered by BrokerName, it was only by chance - or that Ord was ordered the same way as BrokerName.
My initial suggestion is to re-order your sort e.g., ORDER BY ID, BM_Code, LineItem, BrokerName, Ord
Longer explanation of issue
In SQL, underlying data is treated as a set and ordering doesn't matter.
For example, if you have a variable #x and you were testing IF #x IN (1,2,3,4,5) will produce the same result as IF #x in (5,4,3,2,1).
In your example, you're putting an ORDER BY into the sub-query you're checking with the IN e.g., ORDER BY ... BrokerName IN (SELECT BrokerName FROM #Brokers ORDER BY BrokerName ASC). The order of that sub-query isn't allowed, and wouldn't do anything anyway.
The only sort that matters (other than for a few things like TOP) is the final sort - when displaying the data.
That being said, even if you removed the ORDER BY in the sub-query, it wouldn't help you with your issue
The SQL is not likely to work anyway - ORDER BY needs a value - you may have needed to make it CASE WHEN BrokerName IN (...) THEN 0 ELSE 1 END
Which also won't help, as the issue is that Ord is sorted before BrokerName anyway.
UPDATE following comment
Fundamentally, the statement that provides the actual report is
SET #sql = '
Select XX.*,'''' scale,Isnull(AllowComma,''FALSE'') AllowComma,Isnull(AllowedDecimalPlace,''0'') AllowedDecimalPlace,
Isnull(AllowPercentageSign,''FALSE'') AllowPercentageSign,Isnull(CurrencySign,'''') CurrencySign,Isnull(BM_Denominator,'''') BM_Denominator
From (<a whole lot of calculations/cpode>) XX
Left Outer Join tblLiConfig ZZ
On XX.Section=ZZ.Section And XX.LineItem=ZZ.LI And ZZ.Ticker='''+#Ticker+'''
Order by ID,Ord,BM_Code,LineItem,BrokerName'
The last line on there provides the ordering of the data coming from this procedure.
To get a different order, you need to change the order of the fields shown - moving BrokerName more towards the start of the list, and Ord towards the end.
e.g.,
SET #sql = '
Select XX.*,'''' scale,Isnull(AllowComma,''FALSE'') AllowComma,Isnull(AllowedDecimalPlace,''0'') AllowedDecimalPlace,
Isnull(AllowPercentageSign,''FALSE'') AllowPercentageSign,Isnull(CurrencySign,'''') CurrencySign,Isnull(BM_Denominator,'''') BM_Denominator
From (<a whole lot of calculations/cpode>) XX
Left Outer Join tblLiConfig ZZ
On XX.Section=ZZ.Section And XX.LineItem=ZZ.LI And ZZ.Ticker='''+#Ticker+'''
Order by ID,BrokerName,BM_Code,LineItem,Ord'
The above probably sorts by BrokerName too early - but it's up to you to determine what you need.

EXISTS in filter returning too many values

I need to write a query that uses EXISTS, rather than IN, so that it will run fast. The filter is being fed so many parameter values that EXISTS seems like the only option. The difference is between a 20+ minute query and a 5 second query.
This is the query I have:
SELECT DISTINCT d.GROUP_NAME
FROM [EMPLOYEE] e JOIN [DATA_FACT] d ON (e.KEY = d.KEY)
WHERE d.DATE BETWEEN #Start and #End
AND EXISTS
(
select '1234567' -- #ID
)
AND e.Location IN (#Location)
ORDER BY d.GROUP_NAME ASC
The problem is that it is returning too many records. Based on the values I'm passing to filter on, I should get 1 row back but instead I am getting 28.
If I remove the EXISTS and add the following then I get the 1 record I need:
AND e.ID IN ('1234567')
Is there a way to fix the query to work with EXISTS so that I get the correct results?
This is essentially what you want if you are going to try to use exists to filter your data_fact table by parameters in your employee table. Not sure how much it's going to improve your performance though when you throw a massive number of employee IDs at it.
SELECT
d.GROUP_NAME
FROM [DATA_FACT] AS d
WHERE d.DATE BETWEEN #Start and #End
AND EXISTS
(
select 1
from EMPLOYEE AS e
WHERE d.[KEY] = e.[KEY]
AND e.[Location] IN (#Location)
AND e.ID IN ('1234567')
)
ORDER BY d.GROUP_NAME ASC

Sequential scan rather than index scan

I have a bunch of tables in postgresql and I run a query as follows
SELECT DISTINCT ON ...some stuff...
FROM "rent_flats" INNER JOIN "rent_flats_linked_users"
ON "rent_flats_linked_users"."rent_flat_id" = "rent_flats"."id"
INNER JOIN "users"
ON "users"."id" = rent_flats_linked_users"."user_id"
INNER JOIN "owners"
ON "owners"."id" = "users"."profile_id" AND "users"."profile_type" = 'Owner'
INNER JOIN "phone_numbers"
ON "phone_numbers"."person_id" = "owners"."id" AND "phone_numbers"."person_type" = 'Owner'
INNER JOIN "phone_number_categories"
ON "phone_number_categories"."id" = "phone_numbers"."phone_number_category_id"
INNER JOIN "localities"
ON "localities"."id" = "rent_flats"."locality_id"
INNER JOIN "regions"
ON "regions"."id" = "localities"."region_id"
INNER JOIN "cities"
ON "cities"."id" = "regions"."city_id"
INNER JOIN "property_types"
ON "property_types"."id" = "rent_flats"."property_type_id"
INNER JOIN "apartment_types"
ON "apartment_types"."id" = "rent_flats"."apartment_type_id"
WHERE "rent_flats"."status" = 3
AND (((extract(epoch from age(current_date,rent_flats.date_added))/86400)::int) IN (cities.short_period,cities.long_period))
AND (phone_number_categories.name IN ('SMS','SMS & Mobile'))
ORDER BY rf_id, phone_numbers.priority ASC
Note: The rent_flats table contains around 5 million rows, and rent_flats_linked_users contains around 600k rows and users contains 350k rows.Other tables are small in size.
The query takes about 6.8 secs to execute and the explain analyses shows that around 50% of the total time goes in sequential scans of the rent_flats, users and rent_flats_linked_users tables and the other 30% in Hash joins.
On setting seq_scan to off...the query takes even longer to ~11 secs (in this case Hash and Hash join take upto 97.5% of the time)
Here's the explain query plan analyses.
I have put indices on the fields involved in the inner joins as well as on fields involved in the filters like phone_numbers.priority and cities.short_period and cities.long_period. But I still get a sequential scan. What can be the reasons and possible solutions to fasten the query?
I suspect that if there is a part of that query worth optimising then it is this:
(((extract(epoch from age(current_date,rent_flats.date_added))/86400)::int) IN (cities.short_period,cities.long_period))
You really need to turn that into something like:
rent_flats.date_added in (...)
Then you can index date_added, and maybe index (date_added, status).
the next step would be to make sure that the join columns are indexed.