Query Optimization. Why did TOAD do this? - tsql

SQL Server 2008. I have this very large query which has a high cost associated with it. TOAD has Query Tuning functionality in it and the only change made was the following:
Before:
LEFT OUTER JOIN (SELECT RIN_EXT.rejected,
RIN_EXT.scar,
RIN.fcreceiver,
RIN.fcitemno
FROM RCINSP_EXT RIN_EXT
INNER JOIN dbo.rcinsp RIN
ON RIN_EXT.fkey_id = RIN.identity_column) RIN1
ON RCI.freceiver = RIN1.fcreceiver
AND RCI.fitemno = RIN1.fcitemno
WHERE RED.[YEAR] = '2009'
After:
LEFT OUTER JOIN (SELECT RIN_EXT.rejected,
RIN_EXT.scar,
RIN.fcreceiver,
RIN.fcitemno
FROM dbo.rcinsp RIN
INNER JOIN RCINSP_EXT RIN_EXT
ON RIN.identity_column = COALESCE (RIN_EXT.fkey_id , RIN_EXT.fkey_id)) RIN1
ON RCI.freceiver = RIN1.fcreceiver
AND RCI.fitemno >= RIN1.fcitemno -- ***** RIGHT HERE
AND RCI.fitemno <= RIN1.fcitemno
WHERE RED.[YEAR] = '2009'
The field is a char(3) field and this is SQL Server 2008.
Any idea why theirs is so much faster than mine?

You didn't show the ON condition in the "Before" query, so I don't know what TOAD changed. However, I'll take a guess about what happened.
The SQL Server query optimizer uses cost estimates to choose the query plan. The cost estimates are based on rowcount estimates. If the rowcount estimates are not accurate, the optimizer might not choose the best plan.
Some rowcount estimates are typically accurate, like those of the form (column = value) for a column with statistics. However, some rowcount estimates can only be guessed at, like (column = othercolumn) if the columns aren't related by a foreign key constraint, or (expression = value), where the expression isn't trivial or involves more than one column.
When statistics don't guide a rowcount estimate, SQL Server uses generic estimates. If you compare the rowcount estimates in an estimated plan to the actual rowcounts in an actual plan, you can sometimes see the problem and "trick" the optimizer into changing its rowcount estimate.
If you add predicates with AND that don't actually restrict the results, you may lower the rowcount estimate if the optimizer can't recognize that they are superfluous. Similarly, if you add predicates with OR that don't actually yield additional results, you may raise the rowcount estimate.
Perhaps here the rowcount estimate was too high, and the extra predicates are correcting it, resulting in better cost estimates for the plans being considered and a better plan choice in the end.
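As a contrived sketch (the table and column names below are made up, not taken from the query above), both statements return the same rows as a plain equality filter, but the redundant predicates can push the optimizer's rowcount estimate in opposite directions:
-- Redundant ANDs: logically a no-op, but may lower the rowcount estimate
SELECT *
FROM dbo.Orders
WHERE OrderStatus =  'OPEN'
  AND OrderStatus >= 'OPEN'
  AND OrderStatus <= 'OPEN';

-- Redundant OR (the second branch is subsumed by the first): may raise the estimate
SELECT *
FROM dbo.Orders
WHERE OrderStatus = 'OPEN'
   OR (OrderStatus = 'OPEN' AND OrderDate = OrderDate);
Whether the estimate actually moves depends on the statistics available and on the version's cardinality estimator, so compare estimated versus actual rowcounts in the plan to verify.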

Looks like an ascending index search-argument thing, since it added a >=. We're not seeing the rest of your query, but evidently there is further information about RCI.fitemno that the tool was able to deduce from it.
It's odd that this:
AND RCI.fitemno >= RIN1.fcitemno -- ***** RIGHT HERE
AND RCI.fitemno <= RIN1.fcitemno
was not turned into this:
AND RCI.fitemno = RIN1.fcitemno
since the two are equivalent.

Adding larger-than and smaller-than predicates to a query is an old trick that sometimes nudges the query optimizer into using an index on that column. So this:
AND RCI.fitemno >= RIN1.fcitemno
AND RCI.fitemno <= RIN1.fcitemno
nudges the database toward using indexes on the fitemno columns of RIN1 and RCI, if any exist. I'm not sure whether temporary indexes get created on the fly when you do this.
I used to do these tricks with a DB2 database, and they worked nicely.

Can't count() a PostgreSql table [duplicate]

I need to know the number of rows in a table to calculate a percentage. If the total count is greater than some predefined constant, I will use the constant value. Otherwise, I will use the actual number of rows.
I can use SELECT count(*) FROM table. But if my constant value is 500,000 and I have 5,000,000,000 rows in my table, counting all rows will waste a lot of time.
Is it possible to stop counting as soon as my constant value is surpassed?
I need the exact number of rows only as long as it's below the given limit. Otherwise, if the count is above the limit, I use the limit value instead and want the answer as fast as possible.
Something like this:
SELECT text,count(*), percentual_calculus()
FROM token
GROUP BY text
ORDER BY count DESC;
Counting rows in big tables is known to be slow in PostgreSQL. The MVCC model requires a full count of live rows for a precise number. There are workarounds to speed this up dramatically if the count does not have to be exact like it seems to be in your case.
(Remember that even an "exact" count is potentially dead on arrival under concurrent write load.)
Exact count
Slow for big tables.
With concurrent write operations, it may be outdated the moment you get it.
SELECT count(*) AS exact_count FROM myschema.mytable;
Estimate
Extremely fast:
SELECT reltuples AS estimate FROM pg_class WHERE relname = 'mytable';
Typically, the estimate is very close. How close depends on whether ANALYZE or VACUUM run often enough, where "enough" is defined by the level of write activity on your table.
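If the estimate has drifted because statistics are stale, refreshing them manually is cheap compared to counting a huge table (using the same table name as above):
ANALYZE mytable;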
Safer estimate
The above ignores the possibility of multiple tables with the same name in one database - in different schemas. To account for that:
SELECT c.reltuples::bigint AS estimate
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'mytable'
AND n.nspname = 'myschema';
The cast to bigint formats the real number nicely, especially for big counts.
Better estimate
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = 'myschema.mytable'::regclass;
Faster, simpler, safer, more elegant. See the manual on Object Identifier Types.
Replace 'myschema.mytable'::regclass with to_regclass('myschema.mytable') in Postgres 9.4+ to get nothing instead of an exception for invalid table names. See:
How to check if a table exists in a given schema
Better estimate yet (for very little added cost)
This does not work for partitioned tables because relpages is always -1 for the parent table (while reltuples contains an actual estimate covering all partitions) - tested in Postgres 14.
You have to add up estimates for all partitions instead.
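A minimal sketch for that case, summing the per-partition estimates via pg_inherits (assuming the schema-qualified parent table from the examples; this only covers one level of partitioning):
SELECT sum(c.reltuples)::bigint AS estimate
FROM   pg_inherits i
JOIN   pg_class c ON c.oid = i.inhrelid
WHERE  i.inhparent = 'myschema.mytable'::regclass;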
We can do what the Postgres planner does. Quoting the Row Estimation Examples in the manual:
These numbers are current as of the last VACUUM or ANALYZE on the
table. The planner then fetches the actual current number of pages in
the table (this is a cheap operation, not requiring a table scan). If
that is different from relpages then reltuples is scaled
accordingly to arrive at a current number-of-rows estimate.
Postgres uses estimate_rel_size defined in src/backend/utils/adt/plancat.c, which also covers the corner case of no data in pg_class because the relation was never vacuumed. We can do something similar in SQL:
Minimal form
SELECT (reltuples / relpages * (pg_relation_size(oid) / 8192))::bigint
FROM pg_class
WHERE oid = 'mytable'::regclass; -- your table here
Safe and explicit
SELECT (CASE WHEN c.reltuples < 0 THEN NULL       -- never vacuumed
             WHEN c.relpages = 0 THEN float8 '0'  -- empty table
             ELSE c.reltuples / c.relpages
        END
        * (pg_catalog.pg_relation_size(c.oid)
           / pg_catalog.current_setting('block_size')::int)
       )::bigint
FROM   pg_catalog.pg_class c
WHERE  c.oid = 'myschema.mytable'::regclass;  -- schema-qualified table here
Doesn't break with empty tables and tables that have never seen VACUUM or ANALYZE. The manual on pg_class:
If the table has never yet been vacuumed or analyzed, reltuples contains -1 indicating that the row count is unknown.
If this query returns NULL, run ANALYZE or VACUUM for the table and repeat. (Alternatively, you could estimate row width based on column types like Postgres does, but that's tedious and error-prone.)
If this query returns 0, the table seems to be empty. But I would ANALYZE to make sure. (And maybe check your autovacuum settings.)
Typically, block_size is 8192. current_setting('block_size')::int covers rare exceptions.
Table and schema qualifications make it immune to any search_path and scope.
Either way, the query consistently takes < 0.1 ms for me.
More Web resources:
The Postgres Wiki FAQ
The Postgres wiki pages for count estimates and count(*) performance
TABLESAMPLE SYSTEM (n) in Postgres 9.5+
SELECT 100 * count(*) AS estimate FROM mytable TABLESAMPLE SYSTEM (1);
Like @a_horse commented, the added clause for the SELECT command can be useful if statistics in pg_class are not current enough for some reason. For example:
No autovacuum running.
Immediately after a large INSERT / UPDATE / DELETE.
TEMPORARY tables (which are not covered by autovacuum).
This only looks at a random n % selection of blocks (1 % in the example) and counts the rows in it. A bigger sample increases the cost and reduces the error, your pick (see the scaled example after this list). Accuracy depends on more factors:
Distribution of row size. If the sampled blocks happen to hold wider-than-usual rows, the count comes out lower than usual, and vice versa.
Dead tuples or a FILLFACTOR occupy space per block. If unevenly distributed across the table, the estimate may be off.
General rounding errors.
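For instance, a 10 % sample reads ten times as many blocks and typically shrinks the error accordingly; just keep the multiplier in sync with the sampled fraction:
SELECT 10 * count(*) AS estimate FROM mytable TABLESAMPLE SYSTEM (10);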
Typically, the estimate from pg_class will be faster and more accurate.
Answer to actual question
First, I need to know the number of rows in that table, if the total count is greater than some predefined constant,
And whether it ...
... is possible to stop the counting the moment the count passes my constant value (and not wait for the count to finish to learn that the row count is greater).
Yes. You can use a subquery with LIMIT:
SELECT count(*) FROM (SELECT 1 FROM token LIMIT 500000) t;
Postgres actually stops counting beyond the given limit: you get an exact and current count for up to n rows (500000 in the example), and n otherwise. It is not nearly as fast as the estimate from pg_class, though.
I did this once in a postgres app by running:
EXPLAIN SELECT * FROM foo;
Then examining the output with a regex, or similar logic. For a simple SELECT *, the first line of output should look something like this:
Seq Scan on uids (cost=0.00..1.21 rows=8 width=75)
You can use the rows=(\d+) value as a rough estimate of the number of rows that would be returned, then only do the actual SELECT COUNT(*) if the estimate is, say, less than 1.5x your threshold (or whatever number you deem makes sense for your application).
Depending on the complexity of your query, this number may become less and less accurate. In fact, in my application, as we added joins and complex conditions, it became so inaccurate that it was completely worthless, even for knowing within a factor of 100 how many rows we'd get back, so we had to abandon that strategy.
But if your query is simple enough that Pg can predict within some reasonable margin of error how many rows it will return, it may work for you.
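If you want to automate that, here is a minimal sketch wrapping the EXPLAIN-parsing idea in a plpgsql function (the name count_estimate and the regex are illustrative; a very similar function circulates on the Postgres wiki):
CREATE OR REPLACE FUNCTION count_estimate(query text)
  RETURNS integer
  LANGUAGE plpgsql AS
$func$
DECLARE
   rec  record;
   rows integer;
BEGIN
   FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
      -- pull the planner's row estimate out of the first plan line
      rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
      EXIT WHEN rows IS NOT NULL;
   END LOOP;
   RETURN rows;
END
$func$;

-- usage: SELECT count_estimate('SELECT 1 FROM token');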
Reference taken from this blog.
You can use the queries below to find the estimated row count.
Using pg_class:
SELECT reltuples::bigint AS EstimatedCount
FROM pg_class
WHERE oid = 'public.TableName'::regclass;
Using pg_stat_user_tables:
SELECT
schemaname
,relname
,n_live_tup AS EstimatedCount
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
How wide is the text column?
With a GROUP BY there's not much you can do to avoid scanning the data (an index scan at the very least).
I'd recommend:
If possible, changing the schema to remove duplication of text data. This way the count will happen on a narrow foreign key field in the 'many' table.
Alternatively, creating a generated column with a HASH of the text, then GROUP BY the hash column (a sketch follows below).
Again, this is to decrease the workload (scan through a narrow column index).
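To illustrate the second option, a rough sketch (Postgres 12+ syntax for generated columns; the column name text_hash is made up, and md5() stands in for whatever hash you prefer):
ALTER TABLE token
  ADD COLUMN text_hash text GENERATED ALWAYS AS (md5(text)) STORED;

CREATE INDEX ON token (text_hash);

-- group on the narrow hash column instead of the wide text column
SELECT text_hash, count(*) AS cnt
FROM   token
GROUP  BY text_hash
ORDER  BY cnt DESC;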
Edit:
Your original question did not quite match your edit. I'm not sure if you're aware that the COUNT, when used with a GROUP BY, will return the count of items per group and not the count of items in the entire table.
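To make that distinction concrete with the token table from the question:
-- count per group: one row per distinct text value
SELECT text, count(*) FROM token GROUP BY text;

-- count of all rows in the table: a single row
SELECT count(*) FROM token;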
You can also just run SELECT MAX(id) FROM <table_name>; change id to whatever the PK of the table is.
In Oracle, you could use rownum to limit the number of rows returned. I am guessing a similar construct exists in other SQL dialects as well. So, for the example you gave, you could limit the number of rows returned to 500001 and apply a count(*) then:
SELECT (case when cnt > 500000 then 500000 else cnt end) myCnt
FROM (SELECT count(*) cnt FROM table WHERE rownum<=500001)
For SQL Server (2005 or above) a quick and reliable method is:
SELECT SUM (row_count)
FROM sys.dm_db_partition_stats
WHERE object_id=OBJECT_ID('MyTableName')
AND (index_id=0 or index_id=1);
Details about sys.dm_db_partition_stats are explained on MSDN.
The query adds up the row counts from all partitions of a (possibly) partitioned table.
index_id = 0 is an unordered table (heap) and index_id = 1 is an ordered table (clustered index).
Even faster (but unreliable) methods are detailed here.

PostgreSQL 11.5 doing sequential scan for SELECT EXISTS query

I have a multi tenant environment where each tenant (customer) has its own schema to isolate their data. Not ideal I know, but it was a quick port of a legacy system.
Each tenant has a "reading" table, with a composite index of 4 columns:
site_code char(8), location_no int, sensor_no int, reading_dtm timestamptz.
When a new reading is added, a function is called which first checks if there has already been a reading in the last minute (for the same site_code.location_no.sensor_no):
IF EXISTS (
    SELECT
    FROM reading r
    WHERE r.site_code = p_site_code
    AND r.location_no = p_location_no
    AND r.sensor_no = p_sensor_no
    AND r.reading_dtm > p_reading_dtm - INTERVAL '1 minute'
)
THEN
    RETURN;
END IF;
Now, bear in mind there are many tenants, all behaving fine except one. In that one tenant, the call is taking nearly half a second rather than the usual few milliseconds, because it is doing a sequential scan on a table with nearly 2 million rows instead of an index scan.
My random_page_cost is set to 1.5.
I could understand a sequential scan if the query might return many rows, but it is only checking for the existence of any.
I've tried ANALYZE on the table, VACUUM FULL, etc but it makes no difference.
If I put "SET LOCAL enable_seqscan = off" before the query, it works perfectly... but it feels wrong, but it will have to be a temporary solution as this is a live system and it needs to work.
What else can I do to help Postgres make what is clearly the better decision of using the index?
EDIT: If I do a similar query manually (outside of a function) it chooses an index.
My guess is that the engine is evaluating the predicate and considers it not selective enough (it thinks too many rows will be returned), so it decides to use a table scan instead.
I would do two things:
Make sure you have the correct index in place:
create index ix1 on reading (site_code, location_no,
sensor_no, reading_dtm);
Trick the optimizer by making the selectivity look better. You can do that by adding the extra [redundant] predicate and r.reading_dtm < :p_reading_dtm:
select 1
from reading r
where r.site_code = :p_site_code
and r.location_no = :p_location_no
and r.sensor_no = :p_sensor_no
and r.reading_dtm > :p_reading_dtm - interval '1 minute'
and r.reading_dtm < :p_reading_dtm

postgres chooses an awful query plan, how can that be fixed

I'm trying to optimize this query:
EXPLAIN ANALYZE
select
dtt.matching_protein_seq_ids
from detected_transcript_translation dtt
join peptide_spectrum_match psm
on psm.detected_transcript_translation_id =
dtt.detected_transcript_translation_id
join peptide_spectrum_match_sequence psms
on psm.peptide_spectrum_match_sequence_id =
psms.peptide_spectrum_match_sequence_id
WHERE
dtt.matching_protein_seq_ids && ARRAY[654819, 294711]
;
When sequential scans are allowed (set enable_seqscan = on), the optimizer chooses a pretty awful plan that runs in 49.85 seconds:
https://explain.depesz.com/s/WKbew
With set enable_seqscan = off, the plan chosen uses the proper indexes and the query runs instantly.
https://explain.depesz.com/s/ISHV
Note that I did run ANALYZE on all three tables...
Your problem is that PostgreSQL cannot estimate the WHERE condition well, so it estimates it as a certain percentage of the estimated total rows, which is way too much.
If you know that there will always be few result rows for a query like this, you can cheat by defining a function:
CREATE OR REPLACE FUNCTION matching_transcript_translations(integer[])
RETURNS SETOF detected_transcript_translation
LANGUAGE SQL
STABLE STRICT
ROWS 2 /* pretend there are always exactly two matching rows */
AS
'SELECT * FROM detected_transcript_translation
WHERE matching_protein_seq_ids && $1';
You could use it like this:
select
dtt.matching_protein_seq_ids
from matching_transcript_translations(ARRAY[654819, 294711]) dtt
join peptide_spectrum_match psm
on psm.detected_transcript_translation_id =
dtt.detected_transcript_translation_id
join peptide_spectrum_match_sequence psms
on psm.peptide_spectrum_match_sequence_id =
psms.peptide_spectrum_match_sequence_id;
Then PostgreSQL should be cheated into thinking that there will be exactly two matching rows.
However, if there are a lot of matching rows, the resulting plan will be even worse than your current plan is…

Is there a logically equivalent and efficient version of this query without using a CTE?

I have a query on a PostgreSQL 9.2 system that takes about 20s in its normal form but only takes ~120ms when using a CTE.
I simplified both queries for brevity.
Here is the normal form (takes about 20s):
SELECT *
FROM tableA
WHERE (columna = 1 OR columnb = 2) AND
atype = 35 AND
aid IN (1, 2, 3)
ORDER BY modified_at DESC
LIMIT 25;
Here is the explain for this query: http://explain.depesz.com/s/2v8
The CTE form (about 120ms):
WITH raw AS (
SELECT *
FROM tableA
WHERE (columna = 1 OR columnb = 2) AND
atype = 35 AND
aid IN (1, 2, 3)
)
SELECT *
FROM raw
ORDER BY modified_at DESC
LIMIT 25;
Here is the explain for the CTE: http://explain.depesz.com/s/uxy
Simply moving the ORDER BY to the outer part of the query reduces the cost by 99%.
I have two questions: 1) Is there a way to construct the first query, without using a CTE, so that it is logically equivalent but more performant? And 2) what does this difference in performance say about how the planner decides how to fetch the data?
Regarding the questions above, are there additional statistics or other planner hints that would help improve the performance of the first query?
Edit: Taking away the limit also causes the query to use a heap scan as opposed to an index scan backwards. Without the LIMIT the query completes in 40ms.
After seeing the effect of the LIMIT I tried with LIMIT 1, LIMIT 2, etc. The query performs in under 100ms when using LIMIT 1 and 10s+ with LIMIT > 1.
After thinking about this some more, question 2 boils down to why does the planner use an index scan backwards in one case and a bitmap heap scan + sort in another logically equivalent case? And how can I "help" the planner use the efficient plan in both cases?
Update:
I accepted Craig's answer because it was the most comprehensive and helpful. The way I ended up solving the problem was by using a query that was practically equivalent though not logically equivalent. At the root of the issue was an index scan backwards of the index on modified_at. In order to inform the planner that this was not a good idea, I added a predicate of the form WHERE modified_at >= NOW() - INTERVAL '1 year'. This included enough data for the application but prevented the planner from going down the backwards index scan path.
This was a much lower impact solution that prevented the need to rewrite the queries using either a sub query or a CTE. YMMV.
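For illustration, that practically-equivalent rewrite would look roughly like this against the simplified query above (the one-year window is whatever covers your application's needs):
SELECT *
FROM tableA
WHERE (columna = 1 OR columnb = 2) AND
      atype = 35 AND
      aid IN (1, 2, 3) AND
      modified_at >= now() - INTERVAL '1 year'
ORDER BY modified_at DESC
LIMIT 25;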
Here's why this is happening, with the following explanation current until at least 9.3 (if you're reading this and on a newer version, check to make sure it hasn't changed):
PostgreSQL doesn't optimize across CTE boundaries. Each CTE clause is run in isolation and its results are consumed by other parts of the query. So a query like:
WITH blah AS (
SELECT * FROM some_table
)
SELECT *
FROM blah
WHERE id = 4;
will cause the full inner query to get executed. PostgreSQL won't "push down" the id = 4 qualification into the inner query. CTEs are "optimization fences" in that regard, which can be either good or bad; it lets you override the planner when you want to, but prevents you from using CTEs as simple syntactic cleanup for a deeply nested FROM subquery chain when you do need push-down.
If you rephrase the above as:
SELECT *
FROM (SELECT * FROM some_table) AS blah
WHERE id = 4;
using a sub-query in FROM instead of a CTE, Pg will push the qual down into the subquery and it'll all run nice and quickly.
As you have discovered, this can also work to your benefit when the query planner makes a poor decision. It appears that in your case a backward index scan of the table is immensely more expensive than a bitmap or index scan of two smaller indexes followed by a filter and sort, but the planner doesn't think it will be, so it plans the query to scan the index.
When you use the CTE, it can't push the ORDER BY into the inner query, so you're overriding its plan and forcing it to use what it thinks is an inferior execution plan - but one that turns out to be much better.
There's a nasty workaround that can be used for these situations called the OFFSET 0 hack, but you should only use it if you can't figure out a way to make the planner do the right thing - and if you have to use it, please boil this down to a self-contained test case and report it to the PostgreSQL mailing list as a possible query planner bug.
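For reference, a sketch of that OFFSET 0 hack applied to the query from the question; the OFFSET 0 in the subquery acts as an optimization fence much like the CTE does:
SELECT *
FROM (
      SELECT *
      FROM tableA
      WHERE (columna = 1 OR columnb = 2) AND
            atype = 35 AND
            aid IN (1, 2, 3)
      OFFSET 0  -- fences the subquery off from the outer ORDER BY / LIMIT
     ) AS fenced
ORDER BY modified_at DESC
LIMIT 25;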
Instead, I recommend first looking at why the planner is making the wrong decision.
The first candidate is stats / estimates problems, and sure enough when we look at your problematic query plan there's a factor of 3500 mis-estimation of the expected result rows. That's big, but not impossibly big, though it's more interesting that you actually only get one row where the planner is expecting a non-trivial row set. That doesn't help us much, though; if the row count is lower than expected that means that choosing to use the index was a better choice than expected.
The main issue looks like it's not using the smaller, more selective indexes sierra_kilo and papa_lima because it sees the ORDER BY and thinks that it'll save more time doing a backward index scan and avoiding the sort than it really does. That makes sense given that there's only one matching row to sort! If it got the expected 3500 rows then it might've made more sense to avoid the sort, though that's still a fairly small rowset to just sort in memory.
Do you set any parameters like enable_seqscan, etc.? If you do, unset them; they're for testing only and totally inappropriate for production use. If you aren't using the enable_ params I think it's worth raising this on the PostgreSQL mailing list pgsql-performance. The anonymized plans make this a bit difficult, though, especially since there's no guarantee that identifiers from one plan refer to the same objects in the other plan, and they don't match what you wrote in the query on the question. You'll want to produce a properly hand-done version where everything matches up before asking on the mailing list.
There's a fairly good chance that you'll need to provide the real values for anyone to help. If you don't want to do that on a public mailing list, there are commercial support providers available. (I should note that I work for one of them, per my profile.)
Just a shot in the dark, but what happens if you run this:
SELECT *
FROM (
SELECT *
FROM tableA
WHERE (columna = 1 OR columnb = 2) AND
atype = 35 AND
aid IN (1, 2, 3)
) AS x
ORDER BY modified_at DESC
LIMIT 25;

record order in T-sql changing with statistics update

I am facing an issue with the record order in the given query.
SELECT
EA.eaid, --int , PK of table1
EA.an, --varchar(max)
EA.dn, --varchar(max)
ET.etid, --int
ET.st --int
FROM dbo.table1 EA
JOIN dbo.table2 ET ON EA.etid = ET.etid
JOIN #tableAttribute TA ON EA.eaid = TA.id -- TA.id is int and is not a PK
ORDER BY ET.st
The value of the ET.st column is the same for all records in the given scenario.
The order of the records returned by the query changes randomly when statistics are updated.
Sometimes it is in the order of EA.eaid and sometimes in the order of TA.id.
Please provide an explanation for such behaviour. How are the statistics affecting the ordering here?
I am using SQL Server 2008 R2.
The order of rows returned from a database query is undefined unless specified by an ORDER BY clause. Since you are only ordering by ET.st and all values of this column are the same, the results will be returned in a non-deterministic order (based on the plan chosen by the optimizer and the order of the indexes used). Updating statistics lets the query optimizer re-evaluate which plan is cheapest; it is likely that the query plan has changed as a result, which is why a different ordering comes out.
It sounds to me like you want to order by something other than ET.st.
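If a stable order matters, make the ORDER BY itself deterministic, for example by adding the primary key as a tie-breaker (a sketch against the query above):
SELECT
    EA.eaid, EA.an, EA.dn, ET.etid, ET.st
FROM dbo.table1 EA
JOIN dbo.table2 ET ON EA.etid = ET.etid
JOIN #tableAttribute TA ON EA.eaid = TA.id
ORDER BY ET.st, EA.eaid;  -- eaid breaks ties, so the order no longer depends on the plan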