Hello and thanks in advance to anyone who will take some time to answer me.
This is a question to learn how to use indexes efficiently on count queries.
Version: PostgreSQL 12.11
I have a huge table lots (~15M rows) which should have been partitioned by status (an integer column with actual values between 0 and 3), but it was not.
Here is how the data is distributed on it:
-- lots by status
SELECT status, count(*), ROUND((count(*) * 100 / SUM(count(*)) OVER ()), 1) AS "%"
FROM lots
GROUP BY status
ORDER BY count(*) DESC;
 status | count  | %
--------+--------+-----
 2      | ~13.3M | 90%
 0      | ~1.5M  | 10%
 1      | ~6K    | ~0%
 NULL   | ~0.5K  | ~0%
I also have those indexes on it:
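 tablename | indexname              | num_rows      | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
-----------+------------------------+---------------+------------+------------+--------+-----------------+-------------+----------------
 lots      | index_lots_on_status   | 1.4742644e+07 | 5024 MB    | 499 MB     | N      | 3451            | 7060928281  | 134328966
 lots      | pidx_active_lots_on_id | 1.4742644e+07 | 5024 MB    | 38 MB      | Y      | 23491795        | 1496103827  | 2680228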
where pidx_active_lots_on_id is a partial index defined as follows:
CREATE UNIQUE INDEX CONCURRENTLY "pidx_active_lots_on_id" ON "lots" ("id" DESC) WHERE status = 0;
As you can see, the partial index on lots with status = 0 is "only" 38MB (against the 0.5GB of the full status index).
I've introduced the latter index to try to optimise this query:
SELECT count(*) FROM lots WHERE status = 0;
because the count of the lots on status 0 is the most common count case for that table, but for some reason the index seems to be ignored.
I also tried to perform a more specific query:
SELECT count(id) FROM lots WHERE status = 0;
With this second query the index is used, but with worse results.
NOTE: I also ran an ANALYSE lots; after the introduction of the index.
My questions are:
Why is the partial index ignored in the first count case (count(*))?
Why does the second query perform worse?
Detail on plan:
EXPLAIN(ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT COUNT(*) FROM lots WHERE lots.status = 0
Aggregate (cost=539867.77..539867.77 rows=1 width=8) (actual time=16517.790..16517.791 rows=1 loops=1)
Output: count(*)
Buffers: shared hit=79181 read=287729 dirtied=16606 written=7844
I/O Timings: read=14040.416 write=58.453
-> Index Only Scan using index_lots_on_status on public.lots (cost=0.11..539125.83 rows=1483881 width=0) (actual time=0.498..16238.580 rows=1501060 loops=1)
Output: status
Index Cond: (lots.status = 0)
Heap Fetches: 1545139
Buffers: shared hit=79181 read=287729 dirtied=16606 written=7844
I/O Timings: read=14040.416 write=58.453
Planning Time: 1.856 ms
JIT:
Functions: 3
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.466 ms, Inlining 80.076 ms, Optimization 15.797 ms, Emission 12.393 ms, Total 108.733 ms
Execution Time: 16568.670 ms
EXPLAIN(ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT COUNT(id) FROM lots WHERE lots.status = 0
Aggregate (cost=660337.71..660337.72 rows=1 width=8) (actual time=32127.686..32127.687 rows=1 loops=1)
Output: count(id)
Buffers: shared hit=80426 read=334949 dirtied=3 written=75
I/O Timings: read=11365.273 write=22.365
-> Bitmap Heap Scan on public.lots (cost=11304.17..659595.77 rows=1483887 width=4) (actual time=3783.122..30680.836 rows=1501176 loops=1)
Output: id, url, title, ... *(list of all of the 32 columns)*
Recheck Cond: (lots.status = 0)
Heap Blocks: exact=402865
Buffers: shared hit=80426 read=334949 dirtied=3 written=75
I/O Timings: read=11365.273 write=22.365
-> Bitmap Index Scan on pidx_active_lots_on_id (cost=0.00..11229.97 rows=1483887 width=0) (actual time=2534.845..2534.845 rows=1614888 loops=1)
Buffers: shared hit=4866
Planning Time: 0.248 ms
JIT:
Functions: 5
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 1.170 ms, Inlining 56.485 ms, Optimization 474.508 ms, Emission 205.882 ms, Total 738.045 ms
Execution Time: 32169.349 ms
The partial index may be smaller, but the size of the part of the index that needs to be read will be about the same between the two indexes. Skipping the parts of the complete index which are for the wrong "status" will be very efficient.
Your table is obviously not very well vacuumed, based on both the heap fetches and the number of buffers dirtied and written. Try vacuuming the table and then repeating the queries.
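A minimal sketch of the check suggested above, assuming you are allowed to run a manual VACUUM on lots and can afford its I/O on a 5 GB table. The pg_class lookup is an addition for illustration: relallvisible is the number of pages marked all-visible, which is what lets an index-only scan skip heap fetches.

```sql
-- How much of the table is currently marked all-visible?
SELECT relpages, relallvisible
FROM pg_class
WHERE relname = 'lots';

-- Remove dead tuples, update the visibility map, and refresh statistics.
VACUUM (VERBOSE, ANALYZE) lots;

-- Repeat the query and compare "Heap Fetches" and the dirtied/written
-- buffer counts with the original plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM lots WHERE status = 0;
```

If "Heap Fetches" stays close to the row count after the vacuum, something may be preventing pages from being marked all-visible; that would need to be investigated separately and is not something the plans above show.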
Related
So I have this query on a pretty big table:
SELECT * FROM datTable WHERE type='bla'
AND timestamp > (CURRENT_DATE - INTERVAL '1 day')
This query is too slow (about 5 seconds), even though there is an index on type.
So I tried:
SELECT * FROM datTable WHERE type NOT IN ('blu','bli','blo')
AND timestamp > (CURRENT_DATE - INTERVAL '1 day')
This query is much better (about 1 second), but the issue is that I don't want to hardcode the list of excluded types.
So I tried:
with res as (
SELECT * FROM datTable WHERE type NOT IN ('blu','bli','blo')
AND timestamp > (CURRENT_DATE - INTERVAL '1 day')
)
select * from res where type='bla'
And I'm back to bad performance: 5 seconds, same as before.
Any idea how I could trick Postgres into giving the 1-second performance while positively specifying the type I want ('bla')?
EDIT: EXPLAIN ANALYZE for the last request
GroupAggregate (cost=677400.59..677493.09 rows=3595 width=59) (actual time=4789.667..4803.183 rows=3527 loops=1)
Group Key: event_historic.sender
-> Sort (cost=677400.59..677412.48 rows=4756 width=23) (actual time=4789.646..4792.808 rows=68045 loops=1)
Sort Key: event_historic.sender
Sort Method: quicksort Memory: 9469kB
-> Bitmap Heap Scan on event_historic (cost=505379.21..677110.11 rows=4756 width=23) (actual time=4709.494..4769.437 rows=68045 loops=1)
Recheck Cond: (("timestamp" > (CURRENT_DATE - '1 day'::interval)) AND ((type)::text = 'NEAR_TRANSFER'::text))
Heap Blocks: exact=26404
-> BitmapAnd (cost=505379.21..505379.21 rows=44676 width=0) (actual time=4706.080..4706.082 rows=0 loops=1)
-> Bitmap Index Scan on event_historic_timestamp_idx (cost=0.00..3393.89 rows=263109 width=0) (actual time=167.838..167.838 rows=584877 loops=1)
Index Cond: ("timestamp" > (CURRENT_DATE - '1 day'::interval))
-> Bitmap Index Scan on event_historic_type_idx (cost=0.00..501982.69 rows=45316549 width=0) (actual time=4453.071..4453.071 rows=44279973 loops=1)
Index Cond: ((type)::text = 'NEAR_TRANSFER'::text)
Planning Time: 0.385 ms
JIT:
Functions: 10
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 2.505 ms, Inlining 18.102 ms, Optimization 87.745 ms, Emission 44.270 ms, Total 152.622 ms
Execution Time: 4809.099 ms
EDIT 2: After adding the index on (type, timestamp) the result is way faster:
HashAggregate (cost=156685.88..156786.59 rows=8057 width=59) (actual time=95.201..96.511 rows=3786 loops=1)
Group Key: sender
Batches: 1 Memory Usage: 2449kB
Buffers: shared hit=31041
-> Index Scan using typetimestamp on event_historic eh (cost=0.57..156087.67 rows=47857 width=44) (actual time=12.244..55.921 rows=76220 loops=1)
Index Cond: (((type)::text = 'NEAR_TRANSFER'::text) AND ("timestamp" > (CURRENT_DATE - '1 day'::interval)))
Buffers: shared hit=31041
Planning:
Buffers: shared hit=5
Planning Time: 0.567 ms
JIT:
Functions: 10
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 2.543 ms, Inlining 0.000 ms, Optimization 1.221 ms, Emission 10.819 ms, Total 14.584 ms
Execution Time: 99.496 ms
You need a two-column index on ((type::text), timestamp) to make that query fast.
Let me explain the reasoning behind the index order in detail. If type is first in the index, the index scan can start with the first index entry after ('NEAR_TRANSFER', <now - 1 day>) and scan all index entries until it hits the next type, so all the index entries that are found correspond to a result row. If the index order is the other way around, the scan has to start at the first entry after (<now - 1 day>, ...) and read all index entries up to the end of the index. It discards the index entries where type IS DISTINCT FROM 'NEAR_TRANSFER' and fetches the table rows for the remaining index entries. So this scan will fetch the same number of table rows, but has to read more index entries.
It is an old myth that the most selective column should be the first in the index, but it is nonetheless a myth. For the reason described above, you should have the columns that are compared with = first in the index. The selectivity of the columns is irrelevant.
All this is speaking about a single query in isolation. But you always have to consider all the other queries in the workload, and for them it may make a difference how the columns are ordered.
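The reasoning about column order can be tried out on a small scale. The following is only a sketch: demo_events, demo_type_ts and demo_ts_type are hypothetical names, the data is synthetic, and on a table this small the difference will be far less dramatic than on event_historic. The point is to compare the buffer counts of the two index scans.

```sql
-- Hypothetical demo table: 200,000 rows, one per second going back in time,
-- 10% of them with type 'rare'.
CREATE TEMP TABLE demo_events AS
SELECT CASE WHEN g % 10 = 0 THEN 'rare' ELSE 'common' END AS type,
       now() - g * interval '1 second' AS ts
FROM generate_series(1, 200000) AS g;

CREATE INDEX demo_type_ts ON demo_events (type, ts);  -- equality column first
CREATE INDEX demo_ts_type ON demo_events (ts, type);  -- range column first
ANALYZE demo_events;

-- Plan with both indexes available.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM demo_events
WHERE type = 'rare' AND ts > now() - interval '1 hour';

-- Hide the (type, ts) index inside a transaction to see the other order.
BEGIN;
DROP INDEX demo_type_ts;
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM demo_events
WHERE type = 'rare' AND ts > now() - interval '1 hour';
ROLLBACK;
```

With the (ts, type) order, the scan has to walk every entry from the last hour, including the 'common' ones, and discard those that don't match; with (type, ts) it only touches matching entries.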
A single index on timestamp and type might be faster:
CREATE INDEX idx1 ON datTable (timestamp, type);
Or maybe:
CREATE INDEX idx1 ON datTable (type, timestamp);
Check the query plan to see whether the new index is used. You may have to drop an old index as well, and most likely you could drop the old one anyway.
A table raw_data has an index ix_raw_data_timestamp:
CREATE TABLE IF NOT EXISTS public.raw_data
(
ts timestamp without time zone NOT NULL,
log_msg character varying COLLATE pg_catalog."default",
log_image bytea
);
CREATE INDEX IF NOT EXISTS ix_raw_data_timestamp
ON public.raw_data USING btree
(ts ASC NULLS LAST)
TABLESPACE pg_default;
For some reason the index is not used for the following query (and therefore is very slow):
SELECT ts,
log_msg
FROM raw_data
ORDER BY ts ASC
LIMIT 5e6;
The result of EXPLAIN (analyze, buffers, format text) for the query above:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=9752787.07..10336161.14 rows=5000000 width=50) (actual time=789124.600..859046.614 rows=5000000 loops=1)
Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
-> Gather Merge (cost=9752787.07..18421031.89 rows=74294054 width=50) (actual time=789085.442..822547.099 rows=5000000 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
-> Sort (cost=9751787.05..9844654.62 rows=37147027 width=50) (actual time=788203.880..795491.054 rows=1667070 loops=3)
Sort Key: "ts"
Sort Method: external merge Disk: 1758904kB
Worker 0: Sort Method: external merge Disk: 1762872kB
Worker 1: Sort Method: external merge Disk: 1756216kB
Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
-> Parallel Seq Scan on raw_data (cost=0.00..1272131.27 rows=37147027 width=50) (actual time=25.436..119352.861 rows=29717641 loops=3)
Buffers: shared hit=12141 read=888520
Planning Time: 5.240 ms
JIT:
Functions: 7
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.578 ms, Inlining 76.678 ms, Optimization 24.578 ms, Emission 13.060 ms, Total 114.894 ms
Execution Time: 877489.531 ms
(20 rows)
But it is used for this one:
SELECT ts,
log_msg
FROM raw_data
ORDER BY ts ASC
LIMIT 4e6;
EXPLAIN (analyze, buffers, format text) of the query above is:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.57..9408157.15 rows=4000000 width=50) (actual time=15.081..44747.127 rows=4000000 loops=1)
Buffers: shared hit=24775 read=61155
-> Index Scan using ix_raw_data_timestamp on raw_data (cost=0.57..209691026.73 rows=89152864 width=50) (actual time=2.218..16077.755 rows=4000000 loops=1)
Buffers: shared hit=24775 read=61155
Planning Time: 1.306 ms
JIT:
Functions: 3
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.406 ms, Inlining 1.121 ms, Optimization 7.917 ms, Emission 3.721 ms, Total 13.165 ms
Execution Time: 59028.951 ms
(10 rows)
Needless to say that the aim is to get all queries to use the index no matter the size, but I cannot seem to find a solution.
PS:
There are about 89152922 rows in the table.
Edit:
After increasing the memory to 2G (SET work_mem = '2GB';), the query is a little faster (doesn't use disk anymore) but still nowhere as fast:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=5592250.54..6175624.61 rows=5000000 width=50) (actual time=215887.445..282393.743 rows=5000000 loops=1)
Buffers: shared hit=12224 read=888531
-> Gather Merge (cost=5592250.54..14260731.75 rows=74296080 width=50) (actual time=215874.072..247030.062 rows=5000000 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=12224 read=888531
-> Sort (cost=5591250.52..5684120.62 rows=37148040 width=50) (actual time=215854.323..221828.921 rows=1667147 loops=3)
Sort Key: "ts"
Sort Method: top-N heapsort Memory: 924472kB
Worker 0: Sort Method: top-N heapsort Memory: 924379kB
Worker 1: Sort Method: top-N heapsort Memory: 924281kB
Buffers: shared hit=12224 read=888531
-> Parallel Seq Scan on raw_data (cost=0.00..1272141.40 rows=37148040 width=50) (actual time=25.899..107034.903 rows=29717641 loops=3)
Buffers: shared hit=12130 read=888531
Planning Time: 0.058 ms
JIT:
Functions: 7
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.642 ms, Inlining 53.860 ms, Optimization 23.848 ms, Emission 11.768 ms, Total 90.119 ms
Execution Time: 300281.654 ms
(20 rows)
The problem here is that the plan goes through a Parallel Seq Scan and a Gather Merge. The Gather Merge is estimated to take in 74,294,054 rows in order to output 5,000,000. That makes sense, because you say there are 89,152,922 rows in the DB and the query has no condition that limits them.
Why would it choose this plan? Probably because it is forcing materialization to disk because you're over work_mem. So increase your work_mem. If PostgreSQL thinks it can fit all of this in memory and doesn't have to do it on disk, it will run much faster.
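Independently of work_mem, the cost figures in the question's own plans hint at why the plan changes between LIMIT 4e6 and LIMIT 5e6. As a rough sketch, assuming the planner scales the index scan's run cost linearly by the fraction of rows the LIMIT needs (an assumption that does reproduce the 9408157.15 shown in the LIMIT 4e6 plan to within rounding), the same formula can be evaluated for both limits:

```sql
-- Figures copied from the plans in the question:
--   index scan: startup cost 0.57, total cost 209691026.73, 89152864 rows
--   sort-based plan with LIMIT 5e6: total cost 10336161.14
SELECT n AS limit_rows,
       round(0.57 + (209691026.73 - 0.57) * n / 89152864, 2) AS est_index_cost,
       10336161.14 AS sort_plan_cost_at_5e6
FROM (VALUES (4000000::numeric), (5000000::numeric)) AS v(n);
```

For 5,000,000 rows the estimate comes out at roughly 11.76 million, which is above the 10336161.14 of the sort-based plan, so on paper the sort looks cheaper at that limit even though the measured times say otherwise.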
Postgres is using a much heavier Seq Scan on table tracking when an index is available. The first query was the original attempt, which uses a Seq Scan and therefore has a slow query. I attempted to force an Index Scan with an Inner Select, but postgres converted it back to effectively the same query with nearly the same runtime. I finally copied the list from the Inner Select of query two to make the third query. Finally postgres used the Index Scan, which dramatically decreased the runtime. The third query is not viable in a production environment. What will cause postgres to use the last query plan?
(vacuum was used on both tables)
Tables
tracking (worker_id, localdatetime) total records: 118664105
project_worker (id, project_id) total records: 12935
INDEX
CREATE INDEX tracking_worker_id_localdatetime_idx ON public.tracking USING btree (worker_id, localdatetime)
Queries
SELECT worker_id, localdatetime FROM tracking t JOIN project_worker pw ON t.worker_id = pw.id WHERE project_id = 68475018
Hash Join (cost=29185.80..2638162.26 rows=19294218 width=16) (actual time=16.912..18376.032 rows=177681 loops=1)
Hash Cond: (t.worker_id = pw.id)
-> Seq Scan on tracking t (cost=0.00..2297293.86 rows=118716186 width=16) (actual time=0.004..8242.891 rows=118674660 loops=1)
-> Hash (cost=29134.80..29134.80 rows=4080 width=8) (actual time=16.855..16.855 rows=2102 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 115kB
-> Seq Scan on project_worker pw (cost=0.00..29134.80 rows=4080 width=8) (actual time=0.004..16.596 rows=2102 loops=1)
Filter: (project_id = 68475018)
Rows Removed by Filter: 10833
Planning Time: 0.192 ms
Execution Time: 18382.698 ms
SELECT worker_id, localdatetime FROM tracking t WHERE worker_id IN (SELECT id FROM project_worker WHERE project_id = 68475018 LIMIT 500)
Hash Semi Join (cost=6905.32..2923969.14 rows=27733254 width=24) (actual time=19.715..20191.517 rows=20530 loops=1)
Hash Cond: (t.worker_id = project_worker.id)
-> Seq Scan on tracking t (cost=0.00..2296948.27 rows=118698327 width=24) (actual time=0.005..9184.676 rows=118657026 loops=1)
-> Hash (cost=6899.07..6899.07 rows=500 width=8) (actual time=1.103..1.103 rows=500 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 28kB
-> Limit (cost=0.00..6894.07 rows=500 width=8) (actual time=0.006..1.011 rows=500 loops=1)
-> Seq Scan on project_worker (cost=0.00..28982.65 rows=2102 width=8) (actual time=0.005..0.968 rows=500 loops=1)
Filter: (project_id = 68475018)
Rows Removed by Filter: 4493
Planning Time: 0.224 ms
Execution Time: 20192.421 ms
SELECT worker_id, localdatetime FROM tracking t WHERE worker_id IN (322016383,316007840,...,285702579)
Index Scan using tracking_worker_id_localdatetime_idx on tracking t (cost=0.57..4766798.31 rows=21877360 width=24) (actual time=0.079..29.756 rows=22112 loops=1)
Index Cond: (worker_id = ANY ('{322016383,316007840,...,285702579}'::bigint[]))
Planning Time: 1.162 ms
Execution Time: 30.884 ms
... is in place of the 500 id entries used in the query
The same query run on another set of 500 ids:
Index Scan using tracking_worker_id_localdatetime_idx on tracking t (cost=0.57..4776714.91 rows=21900980 width=24) (actual time=0.105..5528.109 rows=117838 loops=1)
Index Cond: (worker_id = ANY ('{286237712,286237844,...,216724213}'::bigint[]))
Planning Time: 2.105 ms
Execution Time: 5534.948 ms
The distribution of "worker_id" within "tracking" seems very skewed. For one thing, one instance of your query 3 returns over 5 times as many rows as the other instance. For another, the estimated number of rows is 100 to 1000 times higher than the actual number. This can certainly lead to bad plans (although it is unlikely to be the complete picture).
What is the actual number of distinct values for worker_id within tracking?
select count(distinct worker_id) from tracking;
What does the planner think this value is?
select n_distinct from pg_stats where tablename='tracking' and attname='worker_id';
If those values are far apart and you force the planner to use a more reasonable value with
alter table tracking alter column worker_id set (n_distinct = <real value>); analyze tracking;
does that change the plans?
If you want to nudge PostgreSQL towards a nested loop join, try the following:
Create an index on tracking that can be used for an index-only scan:
CREATE INDEX ON tracking (worker_id) INCLUDE (localdatetime);
Make sure that tracking is VACUUMed often, so that an index-only scan is effective.
Reduce random_page_cost and increase effective_cache_size so that the optimizer prices index scans lower (but don't use insane values).
Make sure that you have good estimates on project_worker:
ALTER TABLE project_worker ALTER project_id SET STATISTICS 1000;
ANALYZE project_worker;
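The cost settings mentioned above can be tried for a single session before changing anything globally. This is a sketch only: the values 1.1 and '8GB' are assumptions for illustration and should be chosen to match your storage and memory.

```sql
BEGIN;
-- Assumed values, for experimentation only.
SET LOCAL random_page_cost = 1.1;
SET LOCAL effective_cache_size = '8GB';

-- Re-run the original join and check whether the Seq Scan on tracking
-- has been replaced by a nested loop over the index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT worker_id, localdatetime
FROM tracking t
JOIN project_worker pw ON t.worker_id = pw.id
WHERE project_id = 68475018;
ROLLBACK;
```

Because SET LOCAL only lasts until the end of the transaction, the ROLLBACK restores the previous settings.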
I'm making two queries to a contacts table (1854453 total records) and a notes table (956467 total records). Although their query plans are very similar, the notes table query is taking considerably longer to process while the contacts query is really fast. Below are the queries with the query plan:
Contacts query (0.9 ms):
Contact Load (0.9ms) SELECT "contacts".* FROM "contacts" WHERE "contacts"."discarded_at" IS NULL AND "contacts"."firm_id" = $1 ORDER BY id DESC LIMIT $2 [["firm_id", 1], ["LIMIT", 2]]
=> EXPLAIN (ANALYZE,BUFFERS) SELECT "contacts".* FROM "contacts" WHERE "contacts"."discarded_at" IS NULL AND "contacts"."firm_id" = 1 ORDER BY id DESC LIMIT 2;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..11.27 rows=2 width=991) (actual time=5.407..5.412 rows=2 loops=1)
Buffers: shared hit=7 read=70
-> Index Scan Backward using contacts_pkey on contacts (cost=0.43..484798.76 rows=89438 width=991) (actual time=5.406..5.410 rows=2 loops=1)
Filter: ((discarded_at IS NULL) AND (firm_id = 1))
Rows Removed by Filter: 86
Buffers: shared hit=7 read=70
Planning Time: 0.271 ms
Execution Time: 5.440 ms
Notes query (294.5ms):
Note Load (294.5ms) SELECT "notes".* FROM "notes" WHERE "notes"."firm_id" = $1 ORDER BY id DESC LIMIT $2 [["firm_id", 1], ["LIMIT", 2]]
=> EXPLAIN (ANALYZE,BUFFERS) SELECT "notes".* FROM "notes" WHERE "notes"."firm_id" = 1 ORDER BY id DESC LIMIT 2
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..0.88 rows=2 width=390) (actual time=387.278..387.280 rows=2 loops=1)
Buffers: shared hit=29871 read=36815
-> Index Scan Backward using notes_pkey on notes (cost=0.42..115349.39 rows=502862 width=390) (actual time=387.277..387.278 rows=2 loops=1)
Filter: (firm_id = 1)
Rows Removed by Filter: 271557
Buffers: shared hit=29871 read=36815
Planning Time: 5.389 ms
Execution Time: 387.322 ms
Both tables have an index on firm_id, and contacts also has an index on the discarded_at column.
Is the difference in query time due to the number of rows that Postgres has to check? If not, what could account for it? Let me know if any other information is needed.
In both cases PostgreSQL reads the rows in index order to avoid an explicit sort, and keeps discarding rows that don't meet the filter condition until it has found two rows that match.
The difference is that in the first case the goal is reached after discarding only 86 rows, while in the second case 271557 rows have to be discarded first.
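One way to see this effect directly, sketched under the assumption that id is the unique primary key behind notes_pkey, is to count how many rows sit above the second-newest row of the firm in id order:

```sql
-- Rows the backward scan of notes_pkey passes before it reaches the second
-- row with firm_id = 1 (this count includes the first matching row).
SELECT count(*) AS rows_before_second_match
FROM notes
WHERE id > (SELECT id
            FROM notes
            WHERE firm_id = 1
            ORDER BY id DESC
            OFFSET 1 LIMIT 1);
```

The same query against contacts, with the extra discarded_at IS NULL condition in the subquery, should give a much smaller number if the explanation above holds.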
I'm using PostgreSQL server 12 and psql 12.
I have a really huge table (about 6 million tuples) which has several columns. Say it looks like this:
People(
bigint id,
varchar company_type,
bigint company_id,
varchar department_type,
bigint department_id,
......
)
And I have several indexes:
"people_pkey" PRIMARY KEY, btree (id),
"unique_person" UNIQUE, btree (company_type, company_id, department_type, department_id),
"company" btree (company_type, company_id),
"department" btree (department_type, department_id)
Now I have this simple query
EXPLAIN ANALYZE SELECT array(
SELECT DISTINCT my_people.company_id
FROM people AS my_people
WHERE
"my_people"."company_type" = 'Some_company' AND
"my_people"."department_type" = 'Some_department' AND
"my_people"."department_id" = ANY(ARRAY[1,2,3,4,5,6,7])
) a
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=8.60..8.61 rows=1 width=32) (actual time=2.377..2.378 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Unique (cost=0.56..8.60 rows=1 width=8) (actual time=2.373..2.374 rows=0 loops=1)
-> Index Scan using company on people my_people (cost=0.56..8.60 rows=1 width=8) (actual time=2.373..2.373 rows=0 loops=1)
Index Cond: ((company_type)::text = 'Some_company'::text)
Filter: (((department_type)::text = 'Some_department'::text) AND (department_id = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 1189
Planning Time: 0.873 ms
Execution Time: 2.405 ms
The query would be faster if it used the "department" index, but instead it uses "company".
I've tried using pg_hint_plan to force it to use the indexes I want, and the "department" index is much faster than "company".
/*+ IndexScan(my_people department) */ EXPLAIN ANALYZE SELECT array(
SELECT DISTINCT my_people.company_id
FROM people AS my_people
WHERE
"my_people"."company_type" = 'Some_company' AND
"my_people"."department_type" = 'Some_department' AND
"my_people"."department_id" = ANY(ARRAY[1,2,3,4,5,6,7])
) a
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=503435.96..503435.97 rows=1 width=32) (actual time=0.073..0.074 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Unique (cost=503435.95..503435.96 rows=1 width=8) (actual time=0.070..0.071 rows=0 loops=1)
-> Sort (cost=503435.95..503435.96 rows=1 width=8) (actual time=0.070..0.071 rows=0 loops=1)
Sort Key: my_people.id
Sort Method: quicksort Memory: 25kB
-> Index Scan using department on people my_people (cost=0.56..503435.94 rows=1 width=8) (actual time=0.066..0.067 rows=0 loops=1)
Index Cond: (((department_type)::text = 'Some_department'::text) AND (department_id = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Filter: ((Company_type)::text = 'Some_company'::text)
Rows Removed by Filter: 1
Planning Time: 0.252 ms
Execution Time: 0.096 ms
(12 rows)
/*+ IndexScan(my_people unique_person) */ EXPLAIN ANALYZE SELECT array(
SELECT DISTINCT my_people.company_id
FROM people AS my_people
WHERE
"my_people"."company_type" = 'Some_company' AND
"my_people"."department_type" = 'Some_department' AND
"my_people"."department_id" = ANY(ARRAY[1,2,3,4,5,6,7])
) a
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=8.60..8.61 rows=1 width=32) (actual time=1.821..1.822 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Unique (cost=0.56..8.60 rows=1 width=8) (actual time=1.818..1.818 rows=0 loops=1)
-> Index Scan using unique_person on people my_people (cost=0.56..8.60 rows=1 width=8) (actual time=1.817..1.817 rows=0 loops=1)
Index Cond: (((company_type)::text = 'Some_company'::text) AND ((department_type)::text = 'Some_department'::text))
Filter: (department_id = ANY ('{1,2,3,4,5,6,7}'::integer[]))
Rows Removed by Filter: 994
Planning Time: 0.258 ms
Execution Time: 1.842 ms
(9 rows)
Then I thought maybe it's the ScalarArrayOpExpr that makes it inefficient. So I changed the query to this. This is significantly faster but I still have to hint postgres to use "department" index.
/*+ IndexScan(my_people department) */ EXPLAIN ANALYZE SELECT array(
SELECT a.*
FROM unnest(ARRAY[1,2,3,4,5,6,7]) as t(fid)
, LATERAL (
SELECT DISTINCT my_people.id
FROM people AS my_people
WHERE
"my_people"."company_type" = 'Some_company' AND
"my_people"."department_type" = 'Some_department' AND
"my_people"."department_id" = t.fid
) a
) b
;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=3658.94..3658.95 rows=1 width=32) (actual time=0.092..0.094 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Nested Loop (cost=522.67..3658.94 rows=7 width=8) (actual time=0.090..0.091 rows=0 loops=1)
-> Function Scan on unnest t (cost=0.00..0.07 rows=7 width=4) (actual time=0.005..0.006 rows=7 loops=1)
-> Unique (cost=522.67..522.68 rows=1 width=8) (actual time=0.011..0.012 rows=0 loops=7)
-> Sort (cost=522.67..522.67 rows=1 width=8) (actual time=0.011..0.011 rows=0 loops=7)
Sort Key: my_people.id
Sort Method: quicksort Memory: 25kB
-> Index Scan using department on people my_people (cost=0.56..522.66 rows=1 width=8) (actual time=0.010..0.010 rows=0 loops=7)
Index Cond: (((department_type)::text = 'Some_department'::text) AND (department_id = t.fid))
Filter: ((company_type)::text = 'Some_company'::text)
Rows Removed by Filter: 0
Planning Time: 0.248 ms
Execution Time: 0.120 ms
(14 rows)
When using "company", the index scan has the lowest estimated cost, so I think this is why PostgreSQL uses this index. However, this behavior drastically slows my query down, so I would like to know how PostgreSQL determines which index to use.
Update 1:
I tried VACUUM (VERBOSE, ANALYZE) people; and here's the output. It seems like nothing has changed but now my query uses the index I need.
VACUUM (VERBOSE, ANALYZE) people;
INFO: vacuuming "people"
INFO: index "people_pkey" now contains 66865768 row versions in 183343 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.11 s, system: 0.41 s, elapsed: 3.24 s.
INFO: index "unique_person" now contains 66865768 row versions in 867318 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.54 s, system: 1.97 s, elapsed: 6.13 s.
INFO: index "department" now contains 66865768 row versions in 308674 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.21 s, system: 0.67 s, elapsed: 1.32 s.
INFO: "people": found 0 removable, 66865768 nonremovable row versions in 1943422 out of 1943422 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 85284
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 6.00 s, system: 13.11 s, elapsed: 30.15 s.
INFO: vacuuming "pg_toast.pg_toast_1418456"
INFO: index "pg_toast_1418456_index" now contains 2344 row versions in 9 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "pg_toast_1418456": found 0 removable, 2344 nonremovable row versions in 534 out of 534 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 85284
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: analyzing "people"
INFO: "people": scanned 30000 of 1943422 pages, containing 1032582 live rows and 0 dead rows; 30000 rows in sample, 66891419 estimated total rows
VACUUM
The best index here, which you don't currently have, might be one which covers the entire query:
CREATE INDEX idx ON people (company_type, department_type, department_id, company_id);
The two indices you have on the company and department columns each only partially cover the where clause. Maybe that's enough, and Postgres would be happy enough to do a scan of that index and lookup the missing information. The suggestion I gave above covers the where and select clauses, meaning that Postgres would not even have to reference the people table to satisfy your query.
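A sketch of how to check whether the suggested index is actually used for an index-only scan, assuming idx has been created as above. The VACUUM is included because index-only scans depend on the visibility map being up to date.

```sql
-- Index-only scans rely on the visibility map, so vacuum first.
VACUUM (ANALYZE) people;

-- Look for "Index Only Scan using idx" and a low "Heap Fetches" count.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT company_id
FROM people
WHERE company_type = 'Some_company'
  AND department_type = 'Some_department'
  AND department_id = ANY (ARRAY[1,2,3,4,5,6,7]);
```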
Your title is very generic, so I am assuming you are looking for ways to figure such things out in general, not for a canned answer to your specific scenario.
The EXPLAIN ANALYZE for the planner-preferred plan has:
Rows Removed by Filter: 1189
But, the EXPLAIN output doesn't tell you how many rows it expected to remove with the filter, only how many it actually did. (This lack of info is arguably a defect in EXPLAIN ANALYZE). Based on the low total cost estimate, it seems like it did not expect to remove that many. You can run a variant query to see how many rows it expected the index to return (all but 1 of which it silently expected the filter to remove):
explain select * from people where company_type = 'Some_company'
For the index you actually want it to use, the plan is also a bit baffling:
-> Index Scan using department on people my_people (cost=0.56..503435.94 rows=1 width=8) (actual time=0.066..0.067 rows=0 loops=1)
Index Cond: (((department_type)::text = 'Some_department'::text) AND (department_id = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Filter: ((Company_type)::text = 'Some_company'::text)
Rows Removed by Filter: 1
Instead of the cost here seeming too small for the large number of "Removed by Filter", now it seems very much too high for the low number of filtered rows. But again we can only see how many rows it actually removed, not how many it expected to. To see how many it expected this index scan to find before filtering, you can run:
explain select * from people where department_type = 'Some_department' AND department_id = ANY ('{1,2,3,4,5,6,7}'::integer[])
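The row estimates in those plans are derived from the statistics that ANALYZE stores, so it can also help to look at them directly. This is a sketch using the standard pg_stats view; since the update in the question shows the plan changing after VACUUM (VERBOSE, ANALYZE), stale statistics are a plausible (but unconfirmed) cause.

```sql
-- What ANALYZE recorded for the columns used in the WHERE clause.
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'people'
  AND attname IN ('company_type', 'department_type', 'department_id');
```

If 'Some_company' is missing from most_common_vals or its frequency looks far off, the planner's estimate for the "company" index scan is likely to be off as well.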