I have the following PostGIS/PostgreSQL query:
SELECT luc.*
FROM spatial_derived.lucas12 luc,
(SELECT geom
FROM spatial_derived.germany_bld
WHERE state = 'SN') sn
WHERE ST_Contains(sn.geom, luc.geom)
Query plan:
Nested Loop (cost=2.45..53.34 rows=8 width=236) (actual time=1.030..26.751 rows=1282 loops=1)
-> Seq Scan on germany_bld (cost=0.00..2.20 rows=1 width=18399) (actual time=0.023..0.029 rows=1 loops=1)
Filter: ((state)::text = 'SN'::text)
Rows Removed by Filter: 15
-> Bitmap Heap Scan on lucas12 luc (cost=2.45..51.06 rows=8 width=236) (actual time=1.002..26.031 rows=1282 loops=1)
Recheck Cond: (germany_bld.geom ~ geom)
Filter: _st_contains(germany_bld.geom, geom)
Rows Removed by Filter: 499
Heap Blocks: exact=174
-> Bitmap Index Scan on lucas12_geom_idx (cost=0.00..2.45 rows=23 width=0) (actual time=0.419..0.419 rows=1781 loops=1)
Index Cond: (germany_bld.geom ~ geom)
Planning time: 0.536 ms
Execution time: 27.023 ms
which is pretty fast thanks to an index on the geometry columns. However, when I add a buffer to the sn polygon (one big polygon that represents a border line, hence quite a simple feature):
SELECT luc.*
FROM spatial_derived.lucas12 luc,
(SELECT ST_Buffer(geom, 30000) geom
FROM spatial_derived.germany_bld
WHERE state = 'SN') sn
WHERE ST_Contains(sn.geom, luc.geom)
Query plan:
Nested Loop (cost=0.00..13234.80 rows=7818 width=236) (actual time=6221.391..1338380.257 rows=2298 loops=1)
Join Filter: st_contains(st_buffer(germany_bld.geom, 30000::double precision), luc.geom)
Rows Removed by Join Filter: 22637
-> Seq Scan on germany_bld (cost=0.00..2.20 rows=1 width=18399) (actual time=0.018..0.036 rows=1 loops=1)
Filter: ((state)::text = 'SN'::text)
Rows Removed by Filter: 15
-> Seq Scan on lucas12 luc (cost=0.00..1270.55 rows=23455 width=236) (actual time=0.005..25.623 rows=24935 loops=1)
Planning time: 0.271 ms
Execution time: 1338381.079 ms
the query takes forever! I blame it on the missing index on the temporary table sn. The massive decrease in speed can't be caused by ST_Buffer() itself, as it is really fast and the buffered feature is simple.
Two Questions:
1) Am I right?
2) What can I do to reach a speed similar to that of the first query?
I ran into a trap. ST_Buffer() is not the right choice here; ST_DWithin() is, because it can still use the indexes on the geometry columns when performing its bounding-box comparison. The help page for ST_Buffer() clearly warns against using ST_Buffer() for radius searches and recommends ST_DWithin() instead. Since the word "buffer" is used in a lot of GIS software, I didn't consider looking for alternatives.
SELECT luc.*
FROM spatial_derived.lucas12 luc
JOIN spatial_derived.germany_bld sn ON ST_DWithin(sn.geom, luc.geom, 30000)
WHERE sn.state = 'SN'
works and only takes a second (2300 points within that "buffer")!
To check whether you are right, you can leave sn as is and apply ST_Buffer() in the join condition:
SELECT luc.*
FROM spatial_derived.lucas12 luc,
(SELECT geom
FROM spatial_derived.germany_bld
WHERE state = 'SN') sn
WHERE ST_Contains(ST_Buffer(sn.geom, 30000), luc.geom)
Query plan:
Nested Loop (cost=0.00..13234.80 rows=7818 width=236) (actual time=6237.876..1340000.576 rows=2298 loops=1)
Join Filter: st_contains(st_buffer(germany_bld.geom, 30000::double precision), luc.geom)
Rows Removed by Join Filter: 22637
-> Seq Scan on germany_bld (cost=0.00..2.20 rows=1 width=18399) (actual time=0.023..0.038 rows=1 loops=1)
Filter: ((state)::text = 'SN'::text)
Rows Removed by Filter: 15
-> Seq Scan on lucas12 luc (cost=0.00..1270.55 rows=23455 width=236) (actual time=0.004..24.525 rows=24935 loops=1)
Planning time: 0.453 ms
Execution time: 1340001.420 ms
This query will answer your first question, or both, depending on the result.
Update
Your assumption seems to be wrong. ST_Buffer() itself causes the drop in speed.
You seem to join on a much larger set when using ST_Buffer(), so an increase in time is to be expected. You can run EXPLAIN ANALYZE for both queries, with and without ST_Buffer(); it will probably show the same plans with different row counts and a different second cost value...
Related
I have a product_details table with 30+ million records. Product attributes of text type are stored in column Value1.
Front-end (web) users search for product details, and the search is run against column Value1.
create table product_details(
key serial primary key ,
product_key int,
attribute_key int ,
Value1 text[],
Value2 int[],
status text);
I created a GIN index on column Value1 to improve search performance.
Execution time improved a lot for many queries.
Tables and indexes are here
Below is one of the queries the application uses for search.
select p.key from (select x.product_key,
x.value1,
x.attribute_key,
x.status
from product_details x
where value1 IS NOT NULL
) as pr_d
join attribute_type at on at.key = pr_d.attribute_key
join product p on p.key = pr_d.product_key
where value1_search(pr_d.value1) ilike '%B s%'
and at.type = 'text'
and at.status = 'active'
and pr_d.status = 'active'
and 1 = 1
and p.product_type_key=1
and 1 = 1
group by p.key
The query executes in 2 or 3 seconds if we search for %B % or any one- or two-character word. The query plan is below:
Group (cost=180302.82..180302.83 rows=1 width=4) (actual time=49.006..49.021 rows=65 loops=1)
Group Key: p.key
-> Sort (cost=180302.82..180302.83 rows=1 width=4) (actual time=49.005..49.009 rows=69 loops=1)
Sort Key: p.key
Sort Method: quicksort Memory: 28kB
-> Nested Loop (cost=0.99..180302.81 rows=1 width=4) (actual time=3.491..48.965 rows=69 loops=1)
Join Filter: (x.attribute_key = at.key)
Rows Removed by Join Filter: 10051
-> Nested Loop (cost=0.99..180270.15 rows=1 width=8) (actual time=3.396..45.211 rows=69 loops=1)
-> Index Scan using products_product_type_key_status on product p (cost=0.43..4420.58 rows=1413 width=4) (actual time=0.024..1.473 rows=1630 loops=1)
Index Cond: (product_type_key = 1)
-> Index Scan using product_details_product_attribute_key_status on product_details x (cost=0.56..124.44 rows=1 width=8) (actual time=0.026..0.027 rows=0 loops=1630)
Index Cond: ((product_key = p.key) AND (status = 'active'))
Filter: ((value1 IS NOT NULL) AND (value1_search(value1) ~~* '%B %'::text))
Rows Removed by Filter: 14
-> Seq Scan on attribute_type at (cost=0.00..29.35 rows=265 width=4) (actual time=0.002..0.043 rows=147 loops=69)
Filter: ((value_type = 'text') AND (status = 'active'))
Rows Removed by Filter: 115
Planning Time: 0.732 ms
Execution Time: 49.089 ms
But if I search for %B s%, the query takes 75 seconds (a second execution took 63 seconds). The query plan is below.
In this plan, the DB engine did not use the indexes that were used in the plan above. I am not sure why.
Group (cost=8057.69..8057.70 rows=1 width=4) (actual time=62138.730..62138.737 rows=12 loops=1)
Group Key: p.key
-> Sort (cost=8057.69..8057.70 rows=1 width=4) (actual time=62138.728..62138.732 rows=14 loops=1)
Sort Key: p.key
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=389.58..8057.68 rows=1 width=4) (actual time=2592.685..62138.710 rows=14 loops=1)
-> Hash Join (cost=389.15..4971.85 rows=368 width=4) (actual time=298.280..62129.956 rows=831 loops=1)
Hash Cond: (x.attribute_type = at.key)
-> Bitmap Heap Scan on product_details x (cost=356.48..4937.39 rows=681 width=8) (actual time=298.117..62128.452 rows=831 loops=1)
Recheck Cond: (value1_search(value1) ~~* '%B s%'::text)
Rows Removed by Index Recheck: 26168889
Filter: ((value1 IS NOT NULL) AND (status = 'active'))
Rows Removed by Filter: 22
Heap Blocks: exact=490 lossy=527123
-> Bitmap Index Scan on product_details_value1_gin (cost=0.00..356.31 rows=1109 width=0) (actual time=251.596..251.596 rows=2846970 loops=1)
Index Cond: (value1_search(value1) ~~* '%B s%'::text)
-> Hash (cost=29.35..29.35 rows=265 width=4) (actual time=0.152..0.153 rows=269 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 18kB
-> Seq Scan on attribute_type at (cost=0.00..29.35 rows=265 width=4) (actual time=0.010..0.122 rows=269 loops=1)
Filter: ((value_type = 'text') AND (status = 'active'))
Rows Removed by Filter: 221
-> Index Scan using product_pkey on product p (cost=0.43..8.39 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=831)
Index Cond: (key = x.product_key)
Filter: (product_type_key = 1)
Rows Removed by Filter: 1
Planning Time: 0.668 ms
Execution Time: 62138.794 ms
Any suggestions on how to improve the query for the search %B s%, please?
thanks
ilike '%B %' has no usable trigrams in it. The planner knows this, and punishes the pg_trgm index plan so much that the planner then goes with an entirely different plan instead.
But ilike '%B s%' does have one usable trigram in it, ' s'. It turns out that this trigram sucks because it is extremely common in the searched data, but the planner currently has no way to accurately estimate how much it sucks.
Even worse, this large number of matches means your full bitmap can't fit in work_mem, so it goes lossy. Then it needs to recheck all the tuples in any page which contains even one tuple that has the ' s' trigram in it, which looks like it is most of the pages in your table.
The first thing to do is to increase your work_mem to the point you stop getting lossy blocks. If most of your time is spent in the CPU applying the recheck condition, this should help tremendously. If most of your time is spent reading the product_details from disk (so that the recheck has the data it needs to run) then it won't help much. If you had done EXPLAIN (ANALYZE, BUFFERS) with track_io_timing turned on, then we would already know which is which.
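A minimal sketch of that diagnostic session, under stated assumptions: the 256MB value is only a hypothetical starting point to tune against your own data, changing track_io_timing normally requires elevated privileges, and the query is cut down to the single filtered table rather than the full join from the question.

```sql
-- Hypothetical value: raise it until "Heap Blocks" no longer reports lossy pages.
SET work_mem = '256MB';

-- Adds I/O read timings to the plan output (usually needs superuser rights).
SET track_io_timing = on;

-- Simplified stand-in for the application query, restricted to the filtered table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM product_details
WHERE value1_search(value1) ILIKE '%B s%';

-- Return to the server default for the rest of the session.
RESET work_mem;
```

In the resulting plan, compare the exact and lossy counts under "Heap Blocks" with the earlier run, and check the I/O timing lines to see whether the time is going to reading pages or to the recheck itself.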
Another thing you could do is have the application inspect the search parameter, and if it looks like two letters (with or without a space between), then forcibly disable that index usage, or just throw an error if there is no good reason to do that type of search. For example, changing the part of the query to look like this will disable the index:
where value1_search(pr_d.value1)||'' ilike '%B s%'
Another thing would be to rethink your data representation. '%B s%' is a peculiar thing to search for. Why would anyone search for that? Does it have some special meaning within the context of your data, which is not obvious to the outside observer? Maybe you could represent it in a different way that gets along better with pg_trgm.
Finally, you could try to improve the planning for GIN indexes generally by explicitly estimating how many tuples are going to fail recheck (due to inherent lossiness of the index, not due to overrunning work_mem). This would be a major undertaking, and you would be unlikely to see it in production for at least a couple years, if ever.
When selecting MIN on a column in PostgreSQL (11, 12, 13) after a GROUP BY operation on multiple columns, any index created on the grouped columns is not used: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=30e0f341940f4c1fa6013677643a0baf
CREATE TABLE tags (id serial, series int, index int, page int);
CREATE INDEX ON tags (page, series, index);
INSERT INTO tags (series, index, page)
SELECT
ceil(random() * 10),
ceil(random() * 100),
ceil(random() * 1000)
FROM generate_series(1, 100000);
EXPLAIN ANALYZE
SELECT tags.page, tags.series, MIN(tags.index)
FROM tags GROUP BY tags.page, tags.series;
HashAggregate (cost=2291.00..2391.00 rows=10000 width=12) (actual time=108.968..133.153 rows=9999 loops=1)
Group Key: page, series
Batches: 1 Memory Usage: 1425kB
-> Seq Scan on tags (cost=0.00..1541.00 rows=100000 width=12) (actual time=0.015..55.240 rows=100000 loops=1)
Planning Time: 0.257 ms
Execution Time: 133.771 ms
Theoretically, the index should allow the database to seek in steps of (tags.page, tags.series) instead of performing a full scan. This would result in 10,000 processed rows for above dataset instead of 100,000. This link describes the method with no grouped columns.
This answer (as well as this one) suggests using DISTINCT ON with an ordering instead of GROUP BY but that produces this query plan:
Unique (cost=0.42..5680.42 rows=10000 width=12) (actual time=0.066..268.038 rows=9999 loops=1)
-> Index Only Scan using tags_page_series_index_idx on tags (cost=0.42..5180.42 rows=100000 width=12) (actual time=0.064..227.219 rows=100000 loops=1)
Heap Fetches: 100000
Planning Time: 0.426 ms
Execution Time: 268.712 ms
While the index is now being used, it still appears to be scanning the full set of rows. When using SET enable_seqscan=OFF, the GROUP BY query degrades to the same behaviour.
How can I encourage PostgreSQL to use the multi-column index?
If you can pull the set of distinct page,series from another table then you can hack it with a lateral join:
CREATE TABLE pageseries AS SELECT DISTINCT page,series FROM tags ORDER BY page,series;
EXPLAIN ANALYZE SELECT p.*, minindex FROM pageseries p CROSS JOIN LATERAL (SELECT index minindex FROM tags t WHERE t.page=p.page AND t.series=p.series ORDER BY page,series,index LIMIT 1) x;
Nested Loop (cost=0.42..8720.00 rows=10000 width=12) (actual time=0.039..56.013 rows=10000 loops=1)
-> Seq Scan on pageseries p (cost=0.00..145.00 rows=10000 width=8) (actual time=0.012..1.872 rows=10000 loops=1)
-> Limit (cost=0.42..0.84 rows=1 width=12) (actual time=0.005..0.005 rows=1 loops=10000)
-> Index Only Scan using tags_page_series_index_idx on tags t (cost=0.42..4.62 rows=10 width=12) (actual time=0.004..0.004 rows=1 loops=10000)
Index Cond: ((page = p.page) AND (series = p.series))
Heap Fetches: 0
Planning Time: 0.168 ms
Execution Time: 57.077 ms
...but it is not necessarily faster:
EXPLAIN ANALYZE SELECT tags.page, tags.series, MIN(tags.index)
FROM tags GROUP BY tags.page, tags.series;
HashAggregate (cost=2291.00..2391.00 rows=10000 width=12) (actual time=56.177..58.923 rows=10000 loops=1)
Group Key: page, series
Batches: 1 Memory Usage: 1425kB
-> Seq Scan on tags (cost=0.00..1541.00 rows=100000 width=12) (actual time=0.010..12.845 rows=100000 loops=1)
Planning Time: 0.129 ms
Execution Time: 59.644 ms
It would be massively faster IF the number of iterations in the nested loop was small, in other words if there was a low number of distinct (page,series). I'll try with series alone, since that has only 10 distinct values:
CREATE TABLE series AS SELECT DISTINCT series FROM tags;
EXPLAIN ANALYZE SELECT p.*, minindex FROM series p CROSS JOIN LATERAL (SELECT index minindex FROM tags t WHERE t.series=p.series ORDER BY series,index LIMIT 1) x;
Nested Loop (cost=0.29..886.18 rows=2550 width=8) (actual time=0.081..0.264 rows=10 loops=1)
-> Seq Scan on series p (cost=0.00..35.50 rows=2550 width=4) (actual time=0.007..0.010 rows=10 loops=1)
-> Limit (cost=0.29..0.31 rows=1 width=8) (actual time=0.024..0.024 rows=1 loops=10)
-> Index Only Scan using tags_series_index_idx on tags t (cost=0.29..211.29 rows=10000 width=8) (actual time=0.023..0.023 rows=1 loops=10)
Index Cond: (series = p.series)
Heap Fetches: 0
Planning Time: 0.198 ms
Execution Time: 0.292 ms
In this case, definitely worth it, because the query hits only 10/100000 rows. The other queries hit 10000/100000 rows, or 10% of the table, which is above the threshold where an index would really help.
Note that putting the column with lower cardinality first will result in a smaller index:
CREATE INDEX ON tags (series, page, index);
select pg_relation_size( 'tags_page_series_index_idx' );
4284416
select pg_relation_size( 'tags_series_page_index_idx' );
3104768
...but it doesn't make the query any faster.
If this type of stuff is really critical, perhaps try ClickHouse or DolphinDB.
To support that kind of thing PostgreSQL would have to have something like an index skip scan, and it is only efficient to use that if there are few groups.
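A skip scan can be emulated by hand with a recursive CTE that jumps from one (page, series) group to the next. The following is only a sketch of that general technique against the tags table from the question, not something the planner does for you. It assumes page and series contain no NULLs, and with 10,000 groups it still performs 10,000 index probes, so it pays off mainly when there are few groups.

```sql
WITH RECURSIVE groups AS (
    -- Anchor: the first index entry is the minimum of the first group.
    (
        SELECT page, series, index
        FROM tags
        ORDER BY page, series, index
        LIMIT 1
    )
    UNION ALL
    -- Step: jump to the first index entry of the next (page, series) group.
    SELECT nxt.page, nxt.series, nxt.index
    FROM groups g
    CROSS JOIN LATERAL (
        SELECT t.page, t.series, t.index
        FROM tags t
        WHERE (t.page, t.series) > (g.page, g.series)
        ORDER BY t.page, t.series, t.index
        LIMIT 1
    ) nxt
)
SELECT page, series, index AS min_index
FROM groups;
```

The recursion stops on its own once the lateral subquery finds no further group. The row comparison and the ORDER BY both match the column order of the index on (page, series, index), which is what lets each step be a single index probe.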
If the speed of that query is essential, you could consider using a materialized view.
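As a rough sketch of the materialized view approach (the names tags_min_index and min_index are made up for illustration, and the refresh schedule is left to you):

```sql
-- Precompute the aggregate once.
CREATE MATERIALIZED VIEW tags_min_index AS
SELECT page, series, min(index) AS min_index
FROM tags
GROUP BY page, series;

-- A unique index is required for REFRESH ... CONCURRENTLY and also serves lookups.
CREATE UNIQUE INDEX ON tags_min_index (page, series);

-- Re-run whenever the precomputed result may be stale.
REFRESH MATERIALIZED VIEW CONCURRENTLY tags_min_index;

-- Reads now hit the small precomputed relation instead of scanning tags.
SELECT page, series, min_index
FROM tags_min_index
WHERE page = 42;
```

The trade-off is staleness: rows inserted into tags after the last refresh are not reflected until the next one.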
This is the query:
EXPLAIN (analyze, BUFFERS, SETTINGS)
SELECT
operation.id
FROM
operation
RIGHT JOIN(
SELECT uid, did FROM (
SELECT uid, did FROM operation where id = 993754
) t
) parts ON (operation.uid = parts.uid AND operation.did = parts.did)
and EXPLAIN info:
Nested Loop Left Join (cost=0.85..29695.77 rows=100 width=8) (actual time=13.709..13.711 rows=1 loops=1)
Buffers: shared hit=4905
-> Unique (cost=0.42..8.45 rows=1 width=16) (actual time=0.011..0.013 rows=1 loops=1)
Buffers: shared hit=5
-> Index Only Scan using oi on operation operation_1 (cost=0.42..8.44 rows=1 width=16) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (id = 993754)
Heap Fetches: 1
Buffers: shared hit=5
-> Index Only Scan using oi on operation (cost=0.42..29686.32 rows=100 width=24) (actual time=13.695..13.696 rows=1 loops=1)
Index Cond: ((uid = operation_1.uid) AND (did = operation_1.did))
Heap Fetches: 1
Buffers: shared hit=4900
Settings: max_parallel_workers_per_gather = '4', min_parallel_index_scan_size = '0', min_parallel_table_scan_size = '0', parallel_setup_cost = '0', parallel_tuple_cost = '0', work_mem = '256MB'
Planning Time: 0.084 ms
Execution Time: 13.728 ms
Why does the Nested Loop take more time than the sum of its children's times? What can I do about that? The execution time should be less than 1 ms, right?
update:
Nested Loop Left Join (cost=5.88..400.63 rows=101 width=8) (actual time=0.012..0.012 rows=1 loops=1)
Buffers: shared hit=8
-> Index Scan using oi on operation operation_1 (cost=0.42..8.44 rows=1 width=16) (actual time=0.005..0.005 rows=1 loops=1)
Index Cond: (id = 993754)
Buffers: shared hit=4
-> Bitmap Heap Scan on operation (cost=5.45..391.19 rows=100 width=24) (actual time=0.004..0.005 rows=1 loops=1)
Recheck Cond: ((uid = operation_1.uid) AND (did = operation_1.did))
Heap Blocks: exact=1
Buffers: shared hit=4
-> Bitmap Index Scan on ou (cost=0.00..5.42 rows=100 width=0) (actual time=0.003..0.003 rows=1 loops=1)
Index Cond: ((uid = operation_1.uid) AND (did = operation_1.did))
Buffers: shared hit=3
Settings: max_parallel_workers_per_gather = '4', min_parallel_index_scan_size = '0', min_parallel_table_scan_size = '0', parallel_setup_cost = '0', parallel_tuple_cost = '0', work_mem = '256MB'
Planning Time: 0.127 ms
Execution Time: 0.028 ms
Thanks to all of you. When I split the index into btree(id) and btree(uid, did), everything works perfectly. But why can't those columns be used together in one index? Are there any details or rules?
BTW, the SQL is used for real-time calculation; there is some window function code that is not shown here.
The Nested Loop does not take much time actually. The actual time of 13.709..13.711 means that it took 13.709 ms until the first row was ready to be emitted from this node and it took 0.002 ms until it was finished.
Note that the startup time of 13.709 ms includes the time spent in its two child nodes. Both of the child nodes need to emit at least one row before the nested loop can start.
The Unique child began emitting its first (and only) row after 0.011 ms. The Index Only Scan child however only started to emit its first (and only) row after 13.695 ms. This means that most of your actual time spent is in this Index Only Scan.
There is a great answer here which explains the costs and actual times in depth.
Also there is a nice tool at https://explain.depesz.com which calculates an inclusive and exclusive time for each node. Here it is used for your query plan which clearly shows that most of the time is spent in the Index Only Scan.
Since the query is spending almost all of the time in this index only scan, optimizations there will have the most benefit. Creating a separate index for the columns uid and did on the operation table should improve query time a lot.
CREATE INDEX operation_uid_did ON operation(uid, did);
The current execution plan contains 2 index only scans.
A slow one:
-> Index Only Scan using oi on operation (cost=0.42..29686.32 rows=100 width=24) (actual time=13.695..13.696 rows=1 loops=1)
Index Cond: ((uid = operation_1.uid) AND (did = operation_1.did))
Heap Fetches: 1
Buffers: shared hit=4900
And a fast one:
-> Index Only Scan using oi on operation operation_1 (cost=0.42..8.44 rows=1 width=16) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (id = 993754)
Heap Fetches: 1
Buffers: shared hit=5
Both of them use the index oi but have different index conditions. Note how the fast one, which uses id as its index condition, only needs to load 5 pages of data (Buffers: shared hit=5). The slow one needs to load 4900 pages instead (Buffers: shared hit=4900). This indicates that the index is optimized for queries on id but not so much for uid and did. Probably the index oi covers all 3 columns id, uid, did in this order.
A multi-column btree index can only be used efficiently when the query has constraints on the leftmost columns. The official documentation about multi-column indexes explains this in depth.
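To make the rule concrete, here is a sketch that assumes the guess above is right and oi is defined on (id, uid, did). The index definition and the literal values 1 and 2 are illustrative assumptions, not taken from the question.

```sql
-- Assumed shape of the existing index (a no-op if an index named oi already exists).
CREATE INDEX IF NOT EXISTS oi ON operation (id, uid, did);

-- Constrains the leading column: the btree can descend straight to the entry.
SELECT id FROM operation WHERE id = 993754;

-- Skips the leading column: matching entries are scattered across the index,
-- so the scan has to walk the whole index to find them.
SELECT id FROM operation WHERE uid = 1 AND did = 2;

-- With uid and did as leading columns, the second query can also descend directly.
CREATE INDEX operation_uid_did ON operation (uid, did);
```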
Why does the Nested Loop take more time than the sum of its children's times?
Based on your example, it doesn't. Can you elaborate on what makes you think it does?
Anyway, it seems extravagant to visit 4900 pages to fetch 1 tuple. I'm guessing your tables are not getting vacuumed enough.
Although now I prefer Florian's suggestion, that "uid" and "did" are not the leading columns of the index, and that is why it is slow. It is basically doing a full index scan, using the index as a skinny version of the table. It is a shame that EXPLAIN output doesn't make it clear when an index is being used in this fashion, rather than the traditional "jump to a specific part of the index".
So you have a missing index.
I am having problems optimizing a query in PostgreSQL 9.5.14.
select *
from file as f
join product_collection pc on (f.product_collection_id = pc.id)
where pc.mission_id = 7
order by f.id asc
limit 100;
This takes about 100 seconds. If I drop the LIMIT clause, it takes about 0.5 seconds:
With limit:
explain (analyze,buffers) ... -- query exactly as above
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.84..859.32 rows=100 width=457) (actual time=102793.422..102856.884 rows=100 loops=1)
Buffers: shared hit=222430592
-> Nested Loop (cost=0.84..58412343.43 rows=6804163 width=457) (actual time=102793.417..102856.872 rows=100 loops=1)
Buffers: shared hit=222430592
-> Index Scan using file_pkey on file f (cost=0.57..23409008.61 rows=113831736 width=330) (actual time=0.048..28207.152 rows=55858772 loops=1)
Buffers: shared hit=55652672
-> Index Scan using product_collection_pkey on product_collection pc (cost=0.28..0.30 rows=1 width=127) (actual time=0.001..0.001 rows=0 loops=55858772)
Index Cond: (id = f.product_collection_id)
Filter: (mission_id = 7)
Rows Removed by Filter: 1
Buffers: shared hit=166777920
Planning time: 0.803 ms
Execution time: 102856.988 ms
Without limit:
=> explain (analyze,buffers) ... -- query as above, just without limit
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=20509671.01..20526681.42 rows=6804163 width=457) (actual time=456.175..510.596 rows=142055 loops=1)
Sort Key: f.id
Sort Method: quicksort Memory: 79392kB
Buffers: shared hit=37956
-> Nested Loop (cost=0.84..16494851.02 rows=6804163 width=457) (actual time=0.044..231.051 rows=142055 loops=1)
Buffers: shared hit=37956
-> Index Scan using product_collection_mission_id_index on product_collection pc (cost=0.28..46.13 rows=87 width=127) (actual time=0.017..0.101 rows=87 loops=1)
Index Cond: (mission_id = 7)
Buffers: shared hit=10
-> Index Scan using file_product_collection_id_index on file f (cost=0.57..187900.11 rows=169535 width=330) (actual time=0.007..1.335 rows=1633 loops=87)
Index Cond: (product_collection_id = pc.id)
Buffers: shared hit=37946
Planning time: 0.807 ms
Execution time: 569.865 ms
I have copied the database to a backup server so that I may safely manipulate the database without something else changing it on me.
Cardinalities:
Table file: 113,831,736 rows.
Table product_collection: 1370 rows.
The query without LIMIT: 142,055 rows.
SELECT count(*) FROM product_collection WHERE mission_id = 7: 87 rows.
What I have tried:
searching stack overflow
vacuum full analyze
creating two column indexes on file.product_collection_id & file.id. (there already are single column indexes on every field touched.)
creating two column indexes on file.id & file.product_collection_id.
increasing the statistics on file.id & file.product_collection_id, then re-vacuum analyze.
changing various query planner settings.
creating non-materialized views.
walking up and down the hallway while muttering to myself.
None of them seem to change the performance in a significant way.
Thoughts?
UPDATE from OP:
Tested this on PostgreSQL 9.6 & 10.4, and found no significant changes in plans or performance.
However, setting random_page_cost low enough is the only way to get faster performance on the without limit search.
With a default random_page_cost = 4, the without limit:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=9270013.01..9287875.64 rows=7145054 width=457) (actual time=47782.523..47843.812 rows=145697 loops=1)
Sort Key: f.id
Sort Method: external sort Disk: 59416kB
Buffers: shared hit=3997185 read=1295264, temp read=7427 written=7427
-> Hash Join (cost=24.19..6966882.72 rows=7145054 width=457) (actual time=1.323..47458.767 rows=145697 loops=1)
Hash Cond: (f.product_collection_id = pc.id)
Buffers: shared hit=3997182 read=1295264
-> Seq Scan on file f (cost=0.00..6458232.17 rows=116580217 width=330) (actual time=0.007..17097.581 rows=116729984 loops=1)
Buffers: shared hit=3997169 read=1295261
-> Hash (cost=23.08..23.08 rows=89 width=127) (actual time=0.840..0.840 rows=87 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 15kB
Buffers: shared hit=13 read=3
-> Bitmap Heap Scan on product_collection pc (cost=4.97..23.08 rows=89 width=127) (actual time=0.722..0.801 rows=87 loops=1)
Recheck Cond: (mission_id = 7)
Heap Blocks: exact=10
Buffers: shared hit=13 read=3
-> Bitmap Index Scan on product_collection_mission_id_index (cost=0.00..4.95 rows=89 width=0) (actual time=0.707..0.707 rows=87 loops=1)
Index Cond: (mission_id = 7)
Buffers: shared hit=3 read=3
Planning time: 0.929 ms
Execution time: 47911.689 ms
User Erwin's answer below will take me some time to fully understand and generalize to all of the use cases needed. In the meantime we will probably use either a materialized view or just flatten our table structure.
This query is harder for the Postgres query planner than it might look. Depending on cardinalities, data distribution, value frequencies, sizes, ... completely different query plans can prevail and the planner has a hard time predicting which is best. Current versions of Postgres are better at this in several aspects, but it's still hard to optimize.
Since you retrieve only relatively few rows from product_collection, this equivalent query with LIMIT in a LATERAL subquery should avoid performance degradation:
SELECT *
FROM product_collection pc
CROSS JOIN LATERAL (
SELECT *
FROM file f -- big table
WHERE f.product_collection_id = pc.id
ORDER BY f.id
LIMIT 100
) f
WHERE pc.mission_id = 7
ORDER BY f.id
LIMIT 100;
Edit: This results in a query plan with explain (analyze,verbose) provided by the OP:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=30524.34..30524.59 rows=100 width=457) (actual time=13.128..13.167 rows=100 loops=1)
Buffers: shared hit=3213
-> Sort (cost=30524.34..30546.09 rows=8700 width=457) (actual time=13.126..13.152 rows=100 loops=1)
Sort Key: file.id
Sort Method: top-N heapsort Memory: 76kB
Buffers: shared hit=3213
-> Nested Loop (cost=0.57..30191.83 rows=8700 width=457) (actual time=0.060..9.868 rows=2880 loops=1)
Buffers: shared hit=3213
-> Seq Scan on product_collection pc (cost=0.00..69.12 rows=87 width=127) (actual time=0.024..0.336 rows=87 loops=1)
Filter: (mission_id = 7)
Rows Removed by Filter: 1283
Buffers: shared hit=13
-> Limit (cost=0.57..344.24 rows=100 width=330) (actual time=0.008..0.071 rows=33 loops=87)
Buffers: shared hit=3200
-> Index Scan using file_pc_id_index on file (cost=0.57..582642.42 rows=169535 width=330) (actual time=0.007..0.065 rows=33 loops=87)
Index Cond: (product_collection_id = pc.id)
Buffers: shared hit=3200
Planning time: 0.595 ms
Execution time: 13.319 ms
You need these indexes (will help your original query, too):
CREATE INDEX idx1 ON file (product_collection_id, id); -- crucial
CREATE INDEX idx2 ON product_collection (mission_id, id); -- helpful
You mentioned:
two column indexes on file.id & file.product_collection_id.
Etc. But we need it the other way round: id last. The order of index expressions is crucial. See:
Is a composite index also good for queries on the first field?
Rationale: With only 87 rows from product_collection, we only fetch a maximum of 87 x 100 = 8700 rows (fewer if not every pc.id has 100 rows in table file), which are then sorted before picking the top 100. Performance degrades with the number of rows you get from product_collection and with bigger LIMIT.
With the multicolumn index idx1 above, that's 87 fast index scans. The rest is not very expensive.
More optimization is possible, depending on additional information. Related:
Can spatial index help a “range - order by - limit” query
I have the following query:
SELECT "person_dimensions"."dimension"
FROM "person_dimensions"
join users
on users.id = person_dimensions.user_id
where users.team_id = 2
The following is the result of EXPLAIN ANALYZE:
Nested Loop (cost=0.43..93033.84 rows=452 width=11) (actual time=1245.321..42915.426 rows=827 loops=1)
-> Seq Scan on person_dimensions (cost=0.00..254.72 rows=13772 width=15) (actual time=0.022..9.907 rows=13772 loops=1)
-> Index Scan using users_pkey on users (cost=0.43..6.73 rows=1 width=4) (actual time=2.978..3.114 rows=0 loops=13772)
Index Cond: (id = person_dimensions.user_id)
Filter: (team_id = 2)
Rows Removed by Filter: 1
Planning time: 0.396 ms
Execution time: 42915.678 ms
Indexes exist on person_dimensions.user_id and users.team_id, so it is unclear as to why this seemingly simple query would be taking so long.
Maybe it has something to do with team_id being unable to be used in the join condition? Ideas how to speed this up?
EDIT:
I tried this query:
SELECT "person_dimensions"."dimension"
FROM "person_dimensions"
join users ON users.id = person_dimensions.user_id
WHERE users.id IN (2337,2654,3501,56,4373,1060,3170,97,4629,41,3175,4541,2827)
which contains the ids returned by the subquery:
SELECT id FROM users WHERE team_id = 2
The result was 380ms versus 42s as above. I could use this as a workaround, but I am really curious as to what is going on here...
I rebooted my DB server yesterday, and when it came back up this same query was performing as expected with a completely different query plan that used expected indices:
QUERY PLAN
Hash Join (cost=1135.63..1443.45 rows=84 width=11) (actual time=0.354..6.312 rows=835 loops=1)
Hash Cond: (person_dimensions.user_id = users.id)
-> Seq Scan on person_dimensions (cost=0.00..255.17 rows=13817 width=15) (actual time=0.002..2.764 rows=13902 loops=1)
-> Hash (cost=1132.96..1132.96 rows=214 width=4) (actual time=0.175..0.175 rows=60 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> Bitmap Heap Scan on users (cost=286.07..1132.96 rows=214 width=4) (actual time=0.032..0.157 rows=60 loops=1)
Recheck Cond: (team_id = 2)
Heap Blocks: exact=68
-> Bitmap Index Scan on index_users_on_team_id (cost=0.00..286.02 rows=214 width=0) (actual time=0.021..0.021 rows=82 loops=1)
Index Cond: (team_id = 2)
Planning time: 0.215 ms
Execution time: 6.474 ms
Anyone have any ideas why it required a reboot to be aware of all of this? Could it be that manual vacuums were required that hadn't been done in a while, or something like this? Recall I did do an analyze on the relevant tables before the reboot and it didn't change anything.