I want to know how the optimizer rewrote the query and how to read the execution plan in PostgreSQL
Here is the sample code.
DROP TABLE ords;
CREATE TABLE ords (
ORD_ID INT NOT NULL,
ORD_PROD_ID VARCHAR(2) NOT NULL,
ETC_CONTENT VARCHAR(100));
ALTER TABLE ords ADD CONSTRAINT ords_PK PRIMARY KEY(ORD_ID);
CREATE INDEX ords_X01 ON ords(ORD_PROD_ID);
INSERT INTO ords
SELECT i
,chr(64+case when i <= 10 then i else 26 end)
,rpad('x',100,'x')
FROM generate_series(1,10000) a(i);
SELECT COUNT(*) FROM ords WHERE ORD_PROD_ID IN ('A','B','C');
DROP TABLE delivery;
CREATE TABLE delivery (
ORD_ID INT NOT NULL,
VEHICLE_ID VARCHAR(2) NOT NULL,
ETC_REMARKS VARCHAR(100));
ALTER TABLE delivery ADD CONSTRAINT delivery_PK primary key (ORD_ID, VEHICLE_ID);
CREATE INDEX delivery_X01 ON delivery(VEHICLE_ID);
INSERT INTO delivery
SELECT i
, chr(88 + case when i <= 10 then mod(i,2) else 2 end)
, rpad('x',100,'x')
FROM generate_series(1,10000) a(i);
analyze ords;
analyze delivery;
This is the SQL I am interested in.
SELECT *
FROM ords a
WHERE ( EXISTS (SELECT 1
FROM delivery b
WHERE a.ORD_ID = b.ORD_ID
AND b.VEHICLE_ID IN ('X','Y')
)
OR a.ORD_PROD_ID IN ('A','B','C')
);
Here is the execution plan
| Seq Scan on portal.ords a (actual time=0.038..2.027 rows=10 loops=1) |
| Output: a.ord_id, a.ord_prod_id, a.etc_content |
| Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR ((a.ord_prod_id)::text = ANY ('{A,B,C}'::text[]))) |
| Rows Removed by Filter: 9990 |
| Buffers: shared hit=181 |
| SubPlan 1 |
| -> Index Only Scan using delivery_pk on portal.delivery b (never executed) |
| Index Cond: (b.ord_id = a.ord_id) |
| Filter: ((b.vehicle_id)::text = ANY ('{X,Y}'::text[])) |
| Heap Fetches: 0 |
| SubPlan 2 |
| -> Index Scan using delivery_x01 on portal.delivery b_1 (actual time=0.023..0.025 rows=10 loops=1) |
| Output: b_1.ord_id |
| Index Cond: ((b_1.vehicle_id)::text = ANY ('{X,Y}'::text[])) |
| Buffers: shared hit=8 |
| Planning: |
| Buffers: shared hit=78 |
| Planning Time: 0.302 ms |
| Execution Time: 2.121 ms
I don't know how the optimizer transformed the SQL.
What is the final SQL the optimizer rewrote?
I have only one EXISTS sub-query in the SQL above, why are there two sub-plans?
What does "hashed Sub-Plan 2" mean?
I would appreciate it if anyone share a little knowledge with me.
You have the misconception that the optimizer rewrites the SQL statement. That is not the case. Rewriting the query is the job of the query rewriter, which for example replaces views with their definition. The optimizer comes up with a sequence of execution steps to compute the result. It produces a plan, not an SQL statement.
The optimizer plans two alternatives: either execute subplan 1 for each row found, or execute subplan 2 once (note that it is independent of a), build a hash table from the result and probe that hash for each row found in a.
At execution time, PostgreSQL decides to use the latter strategy, that is why subplan 1 is never executed.
Related
I am trying to reduce the query execution time of the query given below. It joins 3 tables to get the data from very big Postgres tables, I have tried to introduce all the necessary indexes on relevant tables but still, the query is taking too long. The total size of the database is around 2TB.
Query:
explain (ANALYZE, COSTS, VERBOSE, BUFFERS)
with au as (
select tbl2.client, tbl2.uid
from tbl2 where tbl2.client = '123kkjk444kjkhj3ddd'
and (tbl2.property->>'num') IN ('1', '2', '3', '31', '12a', '45', '78', '99')
)
SELECT tbl1.id,
CASE WHEN tbl3.displayname IS NOT NULL THEN tbl3.displayname ELSE tbl1.name END AS name,
tbl1.tbl3number, tbl3.originalname as orgtbl3
FROM table_1 tbl1
inner JOIN au tbl2 ON tbl2.client = '123kkjk444kjkhj3ddd' AND tbl2.uid = tbl1.uid
LEFT JOIN tbl3 ON tbl3.client = '123kkjk444kjkhj3ddd' AND tbl3.originalname = tbl1.name
WHERE tbl1.client = '123kkjk444kjkhj3ddd'
AND tbl1.date_col BETWEEN '2021-08-01T05:32:40Z' AND '2021-08-29T05:32:40Z'
ORDER BY tbl1.date_col DESC, tbl1.sid, tbl1.tbl3number
LIMIT 50000;
I have the above Query running but the query execution even after the index scan is very slow. I have attached the Query plan.
Query Plan:
-> Limit (cost=7272.83..7272.86 rows=14 width=158) (actual time=40004.140..40055.737 rows=871 loops=1)
Output: tbl1.id, (CASE WHEN (tbl3.displayname IS NOT NULL) THEN tbl3.displayname ELSE tbl1.name END), tbl1.tbl3number, tbl3.originalsc
reenname, tbl1.date_col
Buffers: shared hit=249656881 dirtied=32
-> Sort (cost=7272.83..7272.86 rows=14 width=158) (actual time=40004.139..40055.671 rows=871 loops=1)
Output: tbl1.id, (CASE WHEN (tbl3.displayname IS NOT NULL) THEN tbl3.displayname ELSE tbl1.name END), tbl1.tbl3number, tbl3.orig
inalname, tbl1.date_col
Sort Key: tbl1.date_col DESC, tbl1.id, tbl1.tbl3number
Sort Method: quicksort Memory: 142kB
Buffers: shared hit=249656881 dirtied=32
-> Gather (cost=1001.39..7272.56 rows=14 width=158) (actual time=9147.574..40055.005 rows=871 loops=1)
Output: tbl1.id, (CASE WHEN (tbl3.displayname IS NOT NULL) THEN tbl3.displayname ELSE tbl1.name END), tbl1.tbl3number, scree
n.originalname, tbl1.date_col
Workers Planned: 4
Workers Launched: 4
Buffers: shared hit=249656881 dirtied=32
-> Nested Loop Left Join (cost=1.39..6271.16 rows=4 width=158) (actual time=3890.074..39998.436 rows=174 loops=5)
Output: tbl1.id, CASE WHEN (tbl3.displayname IS NOT NULL) THEN tbl3.displayname ELSE tbl1.name END, tbl1.tbl3number, s
creen.originalname, tbl1.date_col
Inner Unique: true
Buffers: shared hit=249656881 dirtied=32
Worker 0: actual time=1844.246..39996.744 rows=182 loops=1
Buffers: shared hit=49568277 dirtied=5
Worker 1: actual time=3569.032..39997.124 rows=210 loops=1
Buffers: shared hit=49968461 dirtied=10
Worker 2: actual time=2444.911..39997.561 rows=197 loops=1
Buffers: shared hit=49991521 dirtied=2
Worker 3: actual time=2445.013..39998.065 rows=110 loops=1
Buffers: shared hit=49670445 dirtied=10
-> Nested Loop (cost=1.12..6269.94 rows=4 width=610) (actual time=3890.035..39997.924 rows=174 loops=5)
Output: tbl1.id, tbl1.name, tbl1.tbl3number, tbl1.date_col
Inner Unique: true
Buffers: shared hit=249655135 dirtied=32
Worker 0: actual time=1844.200..39996.206 rows=182 loops=1
Buffers: shared hit=49567912 dirtied=5
Worker 1: actual time=3568.980..39996.522 rows=210 loops=1
Buffers: shared hit=49968040 dirtied=10
Worker 2: actual time=2444.872..39996.987 rows=197 loops=1
Buffers: shared hit=49991126 dirtied=2
Worker 3: actual time=2444.965..39997.712 rows=110 loops=1
Buffers: shared hit=49670224 dirtied=10
-> Parallel Index Only Scan using idx_sv_cuf8_110523 on public.table_1_110523 tbl1 (cost=0.69..5692.16 rows=220 width=692) (actual time=0.059..1458.129 rows=2922506 loops=5)
Output: tbl1.client, tbl1.id, tbl1.tbl3number, tbl1.date_col, tbl1.id, tbl1.name
Index Cond: ((tbl1.client = '123kkjk444kjkhj3ddd'::text) AND (tbl1.date_col >= '2021-08-01 05:32:40+00'::timestamp with time zone) AND (tbl1.date_col <= '2021-08-29 05:32:40+00'::timestamp with time zone))
Heap Fetches: 0
Buffers: shared hit=538663
Worker 0: actual time=0.059..1479.907 rows=2912875 loops=1
Buffers: shared hit=107477
Worker 1: actual time=0.100..1475.863 rows=2930306 loops=1
Buffers: shared hit=107817
Worker 2: actual time=0.054..1481.032 rows=2925849 loops=1
Buffers: shared hit=107812
Worker 3: actual time=0.058..1477.443 rows=2897544 loops=1
Buffers: shared hit=107047
-> Index Scan using tbl2_pkey_102328 on public.tbl2_102328 tbl2_1 (cost=0.43..2.63 rows=1 width=25) (actual time=0.013..0.013 rows=0 loops=14612531)
Output: tbl2_1.id
Index Cond: (((tbl2_1.id)::text = (tbl1.id)::text) AND ((tbl2_1.client)::text = '123kkjk444kjkhj3ddd'::text))
Filter: ((tbl2_1.property ->> 'num'::text) = ANY ('{"1","2","3","31","12a","45","78","99"}'::text[]))
Rows Removed by Filter: 1
Buffers: shared hit=249116472 dirtied=32
Worker 0: actual time=0.013..0.013 rows=0 loops=2912875
Buffers: shared hit=49460435 dirtied=5
Worker 1: actual time=0.013..0.013 rows=0 loops=2930306
Buffers: shared hit=49860223 dirtied=10
Worker 2: actual time=0.013..0.013 rows=0 loops=2925849
Buffers: shared hit=49883314 dirtied=2
Worker 3: actual time=0.013..0.013 rows=0 loops=2897544
Buffers: shared hit=49563177 dirtied=10
-> Index Scan using tbl3_unikey_104219 on public.tbl3_104219 tbl3 (cost=0.27..0.30 rows=1 width=52) (actual time=0.002..0.002 rows=0 loops=871)
Output: tbl3.client, tbl3.originalname, tbl3.displayname
Index Cond: (((tbl3.client)::text = '123kkjk444kjkhj3ddd'::text) AND ((tbl3.originalname)::text = (tbl1.name)::text))
Buffers: shared hit=1746
Worker 0: actual time=0.002..0.002 rows=0 loops=182
Buffers: shared hit=365
Worker 1: actual time=0.002..0.002 rows=0 loops=210
Buffers: shared hit=421
Worker 2: actual time=0.002..0.002 rows=0 loops=197
Buffers: shared hit=395
Worker 3: actual time=0.002..0.002 rows=0 loops=110
Buffers: shared hit=221
Planning Time: 0.361 ms
Execution Time: 40056.008 ms
Planning Time: 0.589 ms
Execution Time: 40071.485 ms
(89 rows)
Time: 40072.986 ms (00:40.073)
Can this query be further optimized to reduce the query execution time? Thank you in advance for your input.
The table definitions are as follows:
Table "public.tbl1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-------------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
client | character varying(32) | | not null | | extended | |
sid | character varying(32) | | not null | | extended | |
uid | character varying(32) | | | | extended | |
id | character varying(32) | | | | extended | |
tbl3number | integer | | not null | | plain | |
name | character varying(255) | | | | extended | |
date_col | timestamp without time zone | | | | plain | |
Indexes:
idx_sv_cuf8_110523(client,date_col desc,sid,tbl3number)
Table "public.tbl2"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------------------------+-----------------------------+-----------+----------+-------------------------+----------+--------------+-------------
id | character varying(32) | | not null | | extended | |
uid | character varying(255) | | | NULL::character varying | extended | |
client | character varying(32) | | not null | | extended | |
property | jsonb | | | | extended | |
Indexes:
"tbl2_pkey" PRIMARY KEY, btree (uid, client)
--
Table "public.tbl3"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
client | character varying(500) | | not null | | extended | |
originalname | character varying(500) | | | | extended | |
displayname | character varying(500) | | | | extended | |
Indexes:
"tbl3_unikey" UNIQUE CONSTRAINT, btree (client, originalname)
tl;dr: Multicolumn covering indexes.
Query clarity
I have a preference for using a rigid format for queries, so it's easier to see the columns and tables being processed. I removed your CTE and moved its conditions to the main query for the same reason. I also removed the multiple identical client id constants. Here is my rewrite.
SELECT tbl1.id,
COALESCE(tbl3.displayname, tbl1.name) AS name,
tbl1.tbl3number,
tbl3.originalname as orgtbl3
FROM table_1 tbl1
INNER JOIN tbl2
ON tbl2.client = tbl1.client
AND tbl2.uid = tbl1.uid
AND (tbl2.property->>'num') IN ('1', '2', '3', '31', '12a', '45', '78', '99')
LEFT JOIN tbl3
ON tbl3.client = tbl1.client
AND tbl3.originalname = tbl1.name
WHERE tbl1.client = '123kkjk444kjkhj3ddd'
AND tbl1.date_col BETWEEN '2021-08-01T05:32:40Z' AND '2021-08-29T05:32:40Z'
ORDER BY tbl1.date_col DESC, tbl1.sid, tbl1.tbl3number
LIMIT 50000;
ORDER BY ... LIMIT ...
When you ORDER BY then LIMIT, you sometimes force the server to do a lot of datashuffling: sorting your result set then discarding some of it. Can you avoid either the ORDER BY or the LIMIT, or both?
It also may help to use the DESC keyword on the index for the column that's ordered by DESC.
Covering indexes
It's a big query. But I believe judicious choices of multicolumn covering indexes will help speed it up.
You filter tbl by a constant comparison on client and a range scan on date_col. You then use uid and output id, name, and tbl3number. Therefore, this BTREE index will allow an index-only range scan, which generally is fast. (Notice the DESC keyword on date_col. It's an attempt to help your ORDER BY clause.)
CREATE INDEX CONCURRENTLY tbl1_name_num_lookup
ON tbl1 (client, date_col DESC)
INCLUDE (uid, id, name, tbl3number);
From tbl2, you access client and uid, and then use the jsonb column property. So this index will likely help you.
CREATE INDEX CONCURRENTLY tbl2_name_num_lookup
ON tbl2 (client, uid)
INCLUDE (property);
From tbl3, you access it by client and originalname. You output displayname. So this index should help.
CREATE INDEX CONCURRENTLY tbl3_name_num_lookup
ON tbl3 (client, originalname)
INCLUDE (displayname);
Join column type mismatch
You join ON tbl2.uid = tbl1.uid. But those two columns have different data types: character varying(32) in tbl1 and 255 in tbl2. JOIN operations are faster when the ON columns have the same data type. Consider fixing one or the other.
The same goes for ON tbl3.originalname = tbl1.name.
That you have two different things aliased to tbl2 certainly does not enhance the readability of your plan. That your plan is over indented so that we need to keep scrolling left and right to see it doesn't help either.
Why does your plan show (tbl2_1.id)::text = (tbl1.id)::text while your query shows tbl2.uid = tbl1.uid? Is that a bit of mis-anonymization?
Essentially all the times goes to the join between tbl1 and tbl2, so that is the thing you need to optimize. If you eliminate the join to tbl3, that would simplify the EXPLAIN, and so make it easier to understand.
You are hitting tbl2 14 million times but only getting 174 rows. But we can't tell if the index finds one row for each 14 million inputs, but it gets filtered out, or it finds 0 rows on average. Maybe it would be more efficient to reverse the order of that join, which you might be able to do by creating an index on tbl2 (client, (property->>'num'),uid). Or maybe "id" rather than "uid", I don't really know what your true query is.
Your first query uses JSON and operate a filter (restriction) inside the JSON structure to find data :
tbl2.property->>'num'
This part of the WHERE predicate is not "sargable". So the only maner to answer your demand, is to scan every row in the table tbl2 and then for every row to scan the json text stream to find the desired value.
So the iteration is a kind of cross product between row cardinality of the table and parts of the JSON.
There is no way to optimize such a query...
Every time you will introduce an objects (in your query a JSON) which have an iterative comportement while querying it, inside a dataset that can be retrieve using set based algorithms (index, parallelism...) the result is to scan and scan, and scan...
Actually JSON cannot be indexed. PostgreSQL does not accepts JSON indexes nor XML ones, in contrary to DB2, Oracle or SQL Server that are able to create specialized indexes on XML...
I have the following table in my database:
business_db_dev=# \d schedules2
Table "public.schedules2"
Column | Type | Collation | Nullable | Default
-------------+--------------------------------+-----------+----------+----------------------------------------
id | bigint | | not null | nextval('schedules2_id_seq'::regclass)
monday | boolean | | not null |
tuesday | boolean | | not null |
wednesday | boolean | | not null |
thursday | boolean | | not null |
friday | boolean | | not null |
saturday | boolean | | not null |
sunday | boolean | | not null |
start1 | time(0) without time zone | | |
end1 | time(0) without time zone | | |
start2 | time(0) without time zone | | |
end2 | time(0) without time zone | | |
user_id | bigint | | not null |
inserted_at | timestamp(0) without time zone | | not null |
updated_at | timestamp(0) without time zone | | not null |
Indexes:
"schedules2_pkey" PRIMARY KEY, btree (id)
"schedules2_start1_end1_DESC_NULLS_LAST_index" btree (start1, end1 DESC NULLS LAST)
"schedules2_start2_end2_DESC_NULLS_LAST_index" btree (start2, end2 DESC NULLS LAST)
"schedules2_user_id_index" UNIQUE, btree (user_id)
Foreign-key constraints:
"schedules2_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
I also have other tables that I use to do a join with that one (users and strategies) which I will not post here for brevity, but if it is needed you can just ask and I will update the question with their structures too.
Giving this table, I'm trying to do the following query
select u.token
from strategies as st
inner join users as u on (st.user_id = u.id)
inner join schedules2 as sc on (st.user_id = sc.user_id)
where st.exchange = 'binance'
and st.market_pair = 'btc_usdt'
and st.timeframe = 'five_minutes'
and st.name = 'stoch_oscillator'
and st.inputs = '{5,3,3,80,20}'
and (sc.start1 is null or ('13:00:01'::time between sc.start1 and sc.end1) or ('13:00:01'::time between sc.start2 and sc.end2));
Running this query with explain analyze I got this result:
Nested Loop (cost=1.27..215.56 rows=16 width=6) (actual time=0.076..6.050 rows=942 loops=1)
Join Filter: (st.user_id = u.id)
-> Nested Loop (cost=0.98..197.89 rows=17 width=16) (actual time=0.070..3.650 rows=942 loops=1)
-> Index Only Scan using unique_strategy_and_user_id on strategies st (cost=0.69..7.29 rows=80 width=8) (actual time=0.056..1.083 rows=1000 loops=1)
Index Cond: ((exchange = 'binance'::text) AND (market_pair = 'btc_usdt'::text) AND (timeframe = 'five_minutes'::text) AND (name = 'stoch_oscillator'::text) AND (inputs = '{5,3,3,80,20}'::character varying[]))
Heap Fetches: 0
-> Index Scan using schedules2_user_id_index on schedules2 sc (cost=0.29..2.38 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1000)
Index Cond: (user_id = st.user_id)
Filter: ((start1 IS NULL) OR (('13:00:01'::time without time zone >= start1) AND ('13:00:01'::time without time zone <= end1)) OR (('13:00:01'::time without time zone >= start2) AND ('13:00:01'::time without time zone <= end2)))
Rows Removed by Filter: 0
-> Index Scan using users_pkey on users u (cost=0.29..1.03 rows=1 width=14) (actual time=0.002..0.002 rows=1 loops=942)
Index Cond: (id = sc.user_id)
Planning Time: 0.834 ms
Execution Time: 6.130 ms
The important part is this one:
Index Scan using schedules2_user_id_index on schedules2 sc (cost=0.29..2.38 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1000)
Index Cond: (user_id = st.user_id)
Filter: ((start1 IS NULL) OR (('13:00:01'::time without time zone >= start1) AND ('13:00:01'::time without time zone <= end1)) OR (('13:00:01'::time without time zone >= start2) AND ('13:00:01'::time without time zone <= end2)))
As you can see, Postgres is using Filter to check the values start1, end1, start2 and end2, but I was expecting that Postgres would use the two indexes I created for this exact condition:
"schedules2_start1_end1_DESC_NULLS_LAST_index" btree (start1, end1 DESC NULLS LAST)
"schedules2_start2_end2_DESC_NULLS_LAST_index" btree (start2, end2 DESC NULLS LAST)
Removing the join with schedules2 table and its condition basically halves the query time.
So, my question is, why is Postgres using Filter instead of my indexes, and how can I change the query or the indexes itself to optimize this query?
Edit: Note that the values used in the query (like '13:00:01'::time) are just examples, in my system this can be anything.
Index Cond: (user_id = st.user_id)
Filter: ....
Rows Removed by Filter: 0
The index it is already using is already perfect. It found no extra rows which had to be removed by the filter (or at least, so few that they rounded to zero). How could that be improved upon by using more indexes?
This seems like a straightforward question, but I can't find the answer online.
I'm using Postgres 9.4 and have this table:
Table "public.title"
Column | Type | Collation | Nullable | Default
---------------------------------+-------------------------+-----------+----------+-----------------------------------
id | integer | | not null | nextval('title_id_seq'::regclass)
name1 | character varying(1000) | | |
name2 | character varying(1000) | | |
name3 | character varying(1000) | | |
name4 | character varying(1000) | | |
And I have a multicolumn index:
"idx_title_names" btree (name1, name2, name3, name4)
But for OR queries, the index isn't being used:
EXPLAIN ANALYZE SELECT * FROM "title" WHERE ("title"."name1" = 'foo'
OR "title"."name3" = 'foo' OR "title"."name3" = 'foo' OR "title"."name4" = 'foo');
Gather (cost=1000.00..436451.46 rows=659 width=4500) (actual time=561.418..1297.877 rows=3222 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on title (cost=0.00..435385.56 rows=275 width=4500) (actual time=551.627..1286.724 rows=1074 loops=3)
Filter: (((name1)::text = 'foo'::text) OR ((name2)::text = 'foo'::text) OR ((name3)::text = 'foo'::text) OR ((name4)::text = 'foo'::text))
Rows Removed by Filter: 1231911
Planning Time: 0.102 ms
Execution Time: 1298.148 ms
Is this because these indexes don't work with OR queries?
And: if so, is my best bet just to create 4 separate standard indexes?
One option is to create a GIN index on the array of the columns, then use an array operator:
create index on title using gin (array[name1,name2,name3,name4]);
Then use
SELECT *
FROM title
WHERE array[name1,name2,name3,name4] #> array['foo'];
Note that a GIN index is a bit more expensive to maintain than a BTree index.
OR is often a performance problem in SQL.
This index cannot be used for a condition like that.
Your best bet is to create four single-column indexes and hope for a Bitmap Or:
CREATE INDEX ON public.title (name1);
CREATE INDEX ON public.title (name2);
CREATE INDEX ON public.title (name3);
CREATE INDEX ON public.title (name4);
Having index on (col1, col2, col3, etc) it will be used for conditions/ordering on col1, or col1 and col2, or col1, col2 and col3 etc. It will be not used for conditions/ordering only on col3 for example.
Look at this:
# create table t as select random() as a, random() as b from generate_series(1,1000000);
# create index i on t(a,b);
# analyze t;
# explain analyze select * from t where a > 0.9;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=2246.83..8863.15 rows=96826 width=16) (actual time=10.973..28.023 rows=99311 loops=1)
Recheck Cond: (a > '0.9'::double precision)
Heap Blocks: exact=5406
-> Bitmap Index Scan on i (cost=0.00..2222.62 rows=96826 width=0) (actual time=10.251..10.252 rows=99311 loops=1)
Index Cond: (a > '0.9'::double precision)
Planning Time: 0.348 ms
Execution Time: 31.054 ms
# explain analyze select * from t where b > 0.9;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..17906.00 rows=99117 width=16) (actual time=0.015..70.505 rows=100137 loops=1)
Filter: (b > '0.9'::double precision)
Rows Removed by Filter: 899863
Planning Time: 0.090 ms
Execution Time: 73.656 ms
However when you are using or condition the DBMS actually should to perform several queries, for our example select * from t where a > 0.9 or b > 0.9 is equal to select * from t where a > 0.9 (index could be used) and select * from t where b > 0.9 (index could not be used) thus instead of two actions (scan index then scan whole table) DBMS performs only one action (scan whole table)
Hope it explains why your index is not used for your query.
I'm running into an issue in PostgreSQL (version 9.6.10) with indexes not working to speed up a MAX query with a simple equality filter on another column. Logically it seems that a simple multicolumn index on (A, B DESC) should make the query super fast.
I can't for the life of me figure out why I can't get a query to be performant regardless of what indexes are defined.
The table definition has the following:
- A primary key foo VARCHAR PRIMARY KEY (not used in the query)
- A UUID field that is NOT NULL called bar UUID
- A sequential_id column that was created as a BIGSERIAL UNIQUE type
Here's what the relevant columns look like exactly (with names modified for privacy):
Table "public.foo"
Column | Type | Modifiers
----------------------+--------------------------+--------------------------------------------------------------------------------
foo_uid | character varying | not null
bar_uid | uuid | not null
sequential_id | bigint | not null default nextval('foo_sequential_id_seq'::regclass)
Indexes:
"foo_pkey" PRIMARY KEY, btree (foo_uid)
"foo_bar_uid_sequential_id_idx", btree (bar_uid, sequential_id DESC)
"foo_sequential_id_key" UNIQUE CONSTRAINT, btree (sequential_id)
Despite having the index listed above on (bar_uid, sequential_id DESC), the following query requires an index scan and takes 100-300ms with a few million rows in the database.
The Query (get the max sequential_id for a given bar_uid):
SELECT MAX(sequential_id)
FROM foo
WHERE bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f';
The EXPLAIN ANALYZE result doesn't use the proper index. Also, for some reason it checks if sequential_id IS NOT NULL even though it's declared as not null.
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.75..0.76 rows=1 width=8) (actual time=321.110..321.110 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.43..0.75 rows=1 width=8) (actual time=321.106..321.106 rows=1 loops=1)
-> Index Scan Backward using foo_sequential_id_key on foo (cost=0.43..98936.43 rows=308401 width=8) (actual time=321.106..321.106 rows=1 loops=1)
Index Cond: (sequential_id IS NOT NULL)
Filter: (bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f'::uuid)
Rows Removed by Filter: 920761
Planning time: 0.196 ms
Execution time: 321.127 ms
(9 rows)
I can add a seemingly unnecessary GROUP BY to this query, and that speeds it up a bit, but it's still really slow for a query that should be near instantaneous with indexes defined:
SELECT MAX(sequential_id)
FROM foo
WHERE bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f'
GROUP BY bar_uid;
The EXPLAIN (ANALYZE, BUFFERS) result:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=8510.54..65953.61 rows=6 width=24) (actual time=234.529..234.530 rows=1 loops=1)
Group Key: bar_uid
Buffers: shared hit=1 read=11909
-> Bitmap Heap Scan on foo (cost=8510.54..64411.55 rows=308401 width=24) (actual time=65.259..201.969 rows=309023 loops=1)
Recheck Cond: (bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f'::uuid)
Heap Blocks: exact=10385
Buffers: shared hit=1 read=11909
-> Bitmap Index Scan on foo_bar_uid_sequential_id_idx (cost=0.00..8433.43 rows=308401 width=0) (actual time=63.549..63.549 rows=309023 loops=1)
Index Cond: (bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f'::uuid)
Buffers: shared read=1525
Planning time: 3.067 ms
Execution time: 234.589 ms
(12 rows)
Does anyone have any idea what's blocking this query from being on the order of 10 milliseconds? This should logically be instantaneous with the right index defined. It should only require the time to follow links to the leaf value in the B-Tree.
Someone asked:
What do you get for SELECT * FROM pg_stats WHERE tablename = 'foo' and attname = 'bar_uid';?
schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation | most_common_elems | most_common_elem_freqs | elem_count_histogram
------------+------------------------+-------------+-----------+-----------+-----------+------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------+------------------+-------------+-------------------+------------------------+----------------------
public | foo | bar_uir | f | 0 | 16 | 6 | {fa61424d-389f-4e75-ba2d-b77e6bb8491f,5c5dcae9-1b7e-4413-99a1-62fde2b89c32,50b1e842-fc32-4c2c-b00f-4a17c3c1c5fa,7ff1999c-c0ea-b700-343f-9a737f6ad659,f667b353-e199-4890-9ffd-4940ea11fe2c,b24ce968-29fd-4587-ba1f-227036ee3135} | {0.203733,0.203167,0.201567,0.195867,0.1952,0.000466667} | | -0.158093 | | |
(1 row)
Our application has a very slow statement, it takes more than 11 second, so I want to know is there any way to optimize it ?
The SQL statement
SELECT id FROM mapfriends.cell_forum_topic WHERE id in (
SELECT topicid FROM mapfriends.cell_forum_item WHERE skyid=103230293 GROUP BY topicid )
AND categoryid=29 AND hidden=false ORDER BY restoretime DESC LIMIT 10 OFFSET 0;
id
---------
2471959
2382296
1535967
2432006
2367281
2159706
1501759
1549304
2179763
1598043
(10 rows)
Time: 11444.976 ms
Plan
friends=> explain SELECT id FROM friends.cell_forum_topic WHERE id in (
friends(> SELECT topicid FROM friends.cell_forum_item WHERE skyid=103230293 GROUP BY topicid)
friends-> AND categoryid=29 AND hidden=false ORDER BY restoretime DESC LIMIT 10 OFFSET 0;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1443.15..1443.15 rows=2 width=12)
-> Sort (cost=1443.15..1443.15 rows=2 width=12)
Sort Key: cell_forum_topic.restoretime
-> Nested Loop (cost=1434.28..1443.14 rows=2 width=12)
-> HashAggregate (cost=1434.28..1434.30 rows=2 width=4)
-> Index Scan using cell_forum_item_idx_skyid on cell_forum_item (cost=0.00..1430.49 rows=1516 width=4)
Index Cond: (skyid = 103230293)
-> Index Scan using cell_forum_topic_pkey on cell_forum_topic (cost=0.00..4.40 rows=1 width=12)
Index Cond: (cell_forum_topic.id = cell_forum_item.topicid)
Filter: ((NOT cell_forum_topic.hidden) AND (cell_forum_topic.categoryid = 29))
(10 rows)
Time: 1.109 ms
Indexes
friends=> \d cell_forum_item
Table "friends.cell_forum_item"
Column | Type | Modifiers
---------+--------------------------------+--------------------------------------------------------------
id | integer | not null default nextval('cell_forum_item_id_seq'::regclass)
topicid | integer | not null
skyid | integer | not null
content | character varying(200) |
addtime | timestamp(0) without time zone | default now()
ischeck | boolean |
Indexes:
"cell_forum_item_pkey" PRIMARY KEY, btree (id)
"cell_forum_item_idx" btree (topicid, skyid)
"cell_forum_item_idx_1" btree (topicid, id)
"cell_forum_item_idx_skyid" btree (skyid)
friends=> \d cell_forum_topic
Table "friends.cell_forum_topic"
Column | Type | Modifiers
-------------+--------------------------------+-------------------------------------------------------------------------------------
-
id | integer | not null default nextval(('"friends"."cell_forum_topic_id_seq"'::text)::regclass)
categoryid | integer | not null
topic | character varying | not null
content | character varying | not null
skyid | integer | not null
addtime | timestamp(0) without time zone | default now()
reference | integer | default 0
restore | integer | default 0
restoretime | timestamp(0) without time zone | default now()
locked | boolean | default false
settop | boolean | default false
hidden | boolean | default false
feature | boolean | default false
picid | integer | default 29249
managerid | integer |
imageid | integer | default 0
pass | boolean | default false
ischeck | boolean |
Indexes:
"cell_forum_topic_pkey" PRIMARY KEY, btree (id)
"idx_cell_forum_topic_1" btree (categoryid, settop, hidden, restoretime, skyid)
"idx_cell_forum_topic_2" btree (categoryid, hidden, restoretime, skyid)
"idx_cell_forum_topic_3" btree (categoryid, hidden, restoretime)
"idx_cell_forum_topic_4" btree (categoryid, hidden, restore)
"idx_cell_forum_topic_5" btree (categoryid, hidden, restoretime, feature)
"idx_cell_forum_topic_6" btree (categoryid, settop, hidden, restoretime)
Explain analyze
mapfriends=> explain analyze SELECT id FROM mapfriends.cell_forum_topic
mapfriends-> join (SELECT topicid FROM mapfriends.cell_forum_item WHERE skyid=103230293 GROUP BY topicid) as tmp
mapfriends-> on mapfriends.cell_forum_topic.id=tmp.topicid
mapfriends-> where categoryid=29 AND hidden=false ORDER BY restoretime DESC LIMIT 10 OFFSET 0;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------
Limit (cost=1446.89..1446.90 rows=2 width=12) (actual time=18016.006..18016.013 rows=10 loops=1)
-> Sort (cost=1446.89..1446.90 rows=2 width=12) (actual time=18016.001..18016.002 rows=10 loops=1)
Sort Key: cell_forum_topic.restoretime
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=1438.02..1446.88 rows=2 width=12) (actual time=16988.492..18015.869 rows=20 loops=1)
-> HashAggregate (cost=1438.02..1438.04 rows=2 width=4) (actual time=15446.735..15447.243 rows=610 loops=1)
-> Index Scan using cell_forum_item_idx_skyid on cell_forum_item (cost=0.00..1434.22 rows=1520 width=4) (actual time=302.378..15429.782 rows=7133 loops=1)
Index Cond: (skyid = 103230293)
-> Index Scan using cell_forum_topic_pkey on cell_forum_topic (cost=0.00..4.40 rows=1 width=12) (actual time=4.210..4.210 rows=0 loops=610)
Index Cond: (cell_forum_topic.id = cell_forum_item.topicid)
Filter: ((NOT cell_forum_topic.hidden) AND (cell_forum_topic.categoryid = 29))
Total runtime: 18019.461 ms
Could you give us some more information about the tables (the statistics) and the configuration?
SELECT version();
SELECT category, name, setting FROM pg_settings WHERE name IN('effective_cache_size', 'enable_seqscan', 'shared_buffers');
SELECT * FROM pg_stat_user_tables WHERE relname IN('cell_forum_topic', 'cell_forum_item');
SELECT * FROM pg_stat_user_indexes WHERE relname IN('cell_forum_topic', 'cell_forum_item');
SELECT * FROM pg_stats WHERE tablename IN('cell_forum_topic', 'cell_forum_item');
And before getting this data, use ANALYZE.
It looks like you have a problem with an index, this is where all the query spends all it's time:
-> Index Scan using cell_forum_item_idx_skyid on
cell_forum_item (cost=0.00..1434.22
rows=1520 width=4) (actual
time=302.378..15429.782 rows=7133
loops=1)
If you use VACUUM FULL on a regular basis (NOT RECOMMENDED!), index bloat might be your problem. A REINDEX might be a good idea, just to be sure:
REINDEX TABLE cell_forum_item;
And talking about indexes, you can drop a couple of them, these are obsolete:
"idx_cell_forum_topic_6" btree (categoryid, settop, hidden, restoretime)
"idx_cell_forum_topic_3" btree (categoryid, hidden, restoretime)
Other indexes have the same data and can be used by the database as well.
It looks like you have a couple of problems:
autovacuum is turned off or it's way
behind. That last autovacuum was on
2010-12-02 and you have 256734 dead
tuples in one table and 451430 dead
ones in the other.... You have to do
something about this, this is a
serious problem.
When autovacuum is working again, you
have to do a VACUUM FULL and a
REINDEX to force a table rewrite and
get rid of all empty space in your
tables.
after fixing the vacuum-problem, you
have to analyze as well: the database
expects 1520 results but it gets 7133
results. This could be a problem with
statistics, maybe you have to
increase the STATISTICS.
The query itself needs some rewriting
as well: It gets 7133 results but it
needs only 610 results. Over 90% of
the results are lost... And getting
these 7133 takes a lot of time, over
15 seconds. Get rid of the subquery by using a JOIN without the GROUP BY or use EXISTS, also without the GROUP BY.
But first get autovacuum back on track, before you get new or other problems.
the problem isn't due to lack of caching of the query plan but most likely due to the choice of plan due to lack of appropriate indexes