Postgres 9.6 function performing poorly compared to straight sql

Postgres 9.6 function performing poorly compared to straight sql - postgresql

I have this function, and it works, it gives the most recent b record.
create or replace function most_recent_b(the_a a) returns b as $$
select distinct on (c.a_id) b.*
from c
join b on b.c_id = c.id
where c.a_id = the_a.id
order by c.a_id, b.date desc
$$ language sql stable;
This runs ~5000ms with real data. V.S. the following which runs in 500ms
create or replace function most_recent_b(the_a a) returns b as $$
select distinct on (c.a_id) b.*
from c
join b on b.c_id = c.id
where c.a_id = 1347
order by c.a_id, b.date desc
$$ language sql stable;
The only difference being that I've hard coded a.id with a value 1347 instead of using its param value.
Also running this query without a function also gives me speeds around 500ms
I'm running PostgreSQL 9.6, so the query planner failing in functions results I see suggested elsewhere shouldn't apply to me right?
I'm sure its not the query itself that is the issue, as this is my third iteration at it, different techniques to get this result all result in the same slow down when inside a function.
As requested by #laurenz-albe
Result of EXPLAIN (ANALYZE, BUFFERS) with constant
Unique (cost=60.88..60.89 rows=3 width=463) (actual time=520.117..520.122 rows=1 loops=1)
Buffers: shared hit=14555
-> Sort (cost=60.88..60.89 rows=3 width=463) (actual time=520.116..520.120 rows=9 loops=1)
Sort Key: b.date DESC
Sort Method: quicksort Memory: 28kB
Buffers: shared hit=14555
-> Hash Join (cost=13.71..60.86 rows=3 width=463) (actual time=386.848..520.083 rows=9 loops=1)
Hash Cond: (b.c_id = c.id)
Buffers: shared hit=14555
-> Seq Scan on b (cost=0.00..46.38 rows=54 width=459) (actual time=25.362..519.140 rows=51 loops=1)
Filter: b_can_view(b.*)
Rows Removed by Filter: 112
Buffers: shared hit=14530
-> Hash (cost=13.67..13.67 rows=3 width=8) (actual time=0.880..0.880 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=25
-> Subquery Scan on c (cost=4.21..13.67 rows=3 width=8) (actual time=0.222..0.872 rows=10 loops=1)
Buffers: shared hit=25
-> Bitmap Heap Scan on c c_1 (cost=4.21..13.64 rows=3 width=2276) (actual time=0.221..0.863 rows=10 loops=1)
Recheck Cond: (a_id = 1347)
Filter: c_can_view(c_1.*)
Heap Blocks: exact=4
Buffers: shared hit=25
-> Bitmap Index Scan on c_a_id_c_number_idx (cost=0.00..4.20 rows=8 width=0) (actual time=0.007..0.007 rows=10 loops=1)
Index Cond: (a_id = 1347)
Buffers: shared hit=1
Execution time: 520.256 ms
And this is the result after running six times with the parameter being passed ( it was exactly six times as you predicted :) )
Slow query;
Unique (cost=57.07..57.07 rows=1 width=463) (actual time=5040.237..5040.243 rows=1 loops=1)
Buffers: shared hit=145325
-> Sort (cost=57.07..57.07 rows=1 width=463) (actual time=5040.237..5040.240 rows=9 loops=1)
Sort Key: b.date DESC
Sort Method: quicksort Memory: 28kB
Buffers: shared hit=145325
-> Nested Loop (cost=0.14..57.06 rows=1 width=463) (actual time=912.354..5040.195 rows=9 loops=1)
Join Filter: (c.id = b.c_id)
Rows Removed by Join Filter: 501
Buffers: shared hit=145325
-> Index Scan using c_a_id_idx on c (cost=0.14..9.45 rows=1 width=2276) (actual time=0.378..1.171 rows=10 loops=1)
Index Cond: (a_id = $1)
Filter: c_can_view(c.*)
Buffers: shared hit=25
-> Seq Scan on b (cost=0.00..46.38 rows=54 width=459) (actual time=24.842..503.854 rows=51 loops=10)
Filter: b_can_view(b.*)
Rows Removed by Filter: 112
Buffers: shared hit=145300
Execution time: 5040.375 ms
Its worth noting that I have some strict row level security involved, and I suspect this is why these queries are both slow, however, one is 10 times slower than the other.
I've changed my original table names hopefully my search and replace was good here.

The expensive part of your query execution is the filter b_can_view(b.*), which must come from your row level security definition.
The fast execution:
Seq Scan on b (cost=0.00..46.38 rows=54 width=459)
(actual time=25.362..519.140 rows=51 loops=1)
Filter: b_can_view(b.*)
Rows Removed by Filter: 112
Buffers: shared hit=14530
The slow execution:
Seq Scan on b (cost=0.00..46.38 rows=54 width=459)
(actual time=24.842..503.854 rows=51 loops=10)
Filter: b_can_view(b.*)
Rows Removed by Filter: 112
Buffers: shared hit=145300
The difference is that the scan is executed 10 times in the slow case (loops=10) and touches 10 times as many data blocks.
When using the generic plan, PostgreSQL underestimates how many rows in c will satisfy the condition c.a_id = $1, because it doesn't know that the actual value is 1347, which is more frequent than average.
Since PostgreSQL thinks there will be at most one row from c, it chooses a nested loop join with a sequential scan of b on the inner side.
Now two problems combine:
Calling function b_can_view takes over 3 milliseconds per row (which PostgreSQL doesn't know), which accounts for the half second that a sequential scan of the 163 rows takes.
There are actually 10 rows found in c instead of the predicted 1, so table b is scanned 10 times, and you end up with a query duration of 5 seconds.
So what can you do?
Tell PostgreSQL how expensive the b_can_view is. Use ALTER TABLE to set the COST for that function to 1000 or 10000 to reflect reality. That alone will not be enough to get a faster plan, since PostgreSQL thinks that it has to execute a single sequential scan anyway, but it is a good thing to give the optimizer correct data.
Create an index on b(c_id). That will enable PostgreSQL to avoid a sequential scan of b, which it will try to do once it is aware how expensive the function is.
Also, try to make the function b_can_view cheaper. That will make your experience so much better.

Related

Postgres not using index when ORDER BY and LIMIT when LIMIT above X

I have been trying to debug an issue with postgres where it decides to not use an index when LIMIT is above a specific value.
For example I have a table of 150k rows and when searching with LIMIT of 286 it uses the index while with LIMIT above 286 it does not.
LIMIT 286 uses index
db=# explain (analyze, buffers) SELECT * FROM tempz.tempx AS r INNER JOIN tempz.tempy AS z ON (r.id_tempy=z.id) WHERE z.int_col=2000 AND z.string_col='temp_string' ORDER BY r.name ASC, r.type ASC, r.id ASC LIMIT 286;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..5024.12 rows=286 width=810) (actual time=0.030..0.992 rows=286 loops=1)
Buffers: shared hit=921
-> Nested Loop (cost=0.56..16968.23 rows=966 width=810) (actual time=0.030..0.977 rows=286 loops=1)
Join Filter: (r.id_tempy = z.id)
Rows Removed by Join Filter: 624
Buffers: shared hit=921
-> Index Scan using tempz_tempx_name_type_id_idx on tempx r (cost=0.42..14357.69 rows=173878 width=373) (actual time=0.016..0.742 rows=910 loops=1)
Buffers: shared hit=919
-> Materialize (cost=0.14..2.37 rows=1 width=409) (actual time=0.000..0.000 rows=1 loops=910)
Buffers: shared hit=2
-> Index Scan using tempy_string_col_idx on tempy z (cost=0.14..2.37 rows=1 width=409) (actual time=0.007..0.008 rows=1 loops=1)
Index Cond: (string_col = 'temp_string'::text)
Filter: (int_col = 2000)
Buffers: shared hit=2
Planning Time: 0.161 ms
Execution Time: 1.032 ms
(16 rows)
vs.
LIMIT 287 doing sort
db=# explain (analyze, buffers) SELECT * FROM tempz.tempx AS r INNER JOIN tempz.tempy AS z ON (r.id_tempy=z.id) WHERE z.int_col=2000 AND z.string_col='temp_string' ORDER BY r.name ASC, r.type ASC, r.id ASC LIMIT 287;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=4976.86..4977.58 rows=287 width=810) (actual time=49.802..49.828 rows=287 loops=1)
Buffers: shared hit=37154
-> Sort (cost=4976.86..4979.27 rows=966 width=810) (actual time=49.801..49.813 rows=287 loops=1)
Sort Key: r.name, r.type, r.id
Sort Method: top-N heapsort Memory: 506kB
Buffers: shared hit=37154
-> Nested Loop (cost=0.42..4932.59 rows=966 width=810) (actual time=0.020..27.973 rows=51914 loops=1)
Buffers: shared hit=37154
-> Seq Scan on tempy z (cost=0.00..12.70 rows=1 width=409) (actual time=0.006..0.008 rows=1 loops=1)
Filter: ((int_col = 2000) AND (string_col = 'temp_string'::text))
Rows Removed by Filter: 2
Buffers: shared hit=1
-> Index Scan using tempx_id_tempy_idx on tempx r (cost=0.42..4340.30 rows=57959 width=373) (actual time=0.012..17.075 rows=51914 loops=1)
Index Cond: (id_tempy = z.id)
Buffers: shared hit=37153
Planning Time: 0.258 ms
Execution Time: 49.907 ms
(17 rows)
Update:
This is Postgres 11 and VACUUM ANALYZE is run daily. Also, I have already tried to use CTE to remove the filter but the problem is the sorting specifically
-> Sort (cost=4976.86..4979.27 rows=966 width=810) (actual time=49.801..49.813 rows=287 loops=1)
Sort Key: r.name, r.type, r.id
Sort Method: top-N heapsort Memory: 506kB
Buffers: shared hit=37154
Update 2:
After running VACUUM ANALYZE the database starts using the index for some hours and then it goes back to not using it.

Turns out that I can force Postgres to avoid doing any sort if I run SET enable_sort TO OFF;. This raises the cost of sorting very high which causes the Postgres planner to do index scan instead.
I am not really sure why Postgres thinks that index scan is so costly cost=0.42..14357.69 and thinks sorting is cheaper and ends up choosing that. It is also very weird that immediately after a VACUUM ANALYZE it analyzes the costs correctly but after some hours it goes back to sorting.
With sort off plan is still not optimized as it does materialize and loads stuff into memory but it is still faster than sorting.

postgresql search is slow on type text[] column

I have product_details table with 30+ Million records. product attributes text type data is stored into column Value1.
Front end(web) users search for product details and it will be queried on column Value1.
create table product_details(
key serial primary key ,
product_key int,
attribute_key int ,
Value1 text[],
Value2 int[],
status text);
I created gin index on column Value1 to improve search query performance.
query execution improved a lot for many queries.
Tables and indexes are here
Below is one of query used by application for search.
select p.key from (select x.product_key,
x.value1,
x.attribute_key,
x.status
from product_details x
where value1 IS NOT NULL
) as pr_d
join attribute_type at on at.key = pr_d.attribute_key
join product p on p.key = pr_d.product_key
where value1_search(pr_d.value1) ilike '%B s%'
and at.type = 'text'
and at.status = 'active'
and pr_d.status = 'active'
and 1 = 1
and p.product_type_key=1
and 1 = 1
group by p.key
query is executed in 2 or 3 secs if we search %B % or any single or two char words and below is query plan
Group (cost=180302.82..180302.83 rows=1 width=4) (actual time=49.006..49.021 rows=65 loops=1)
Group Key: p.key
-> Sort (cost=180302.82..180302.83 rows=1 width=4) (actual time=49.005..49.009 rows=69 loops=1)
Sort Key: p.key
Sort Method: quicksort Memory: 28kB
-> Nested Loop (cost=0.99..180302.81 rows=1 width=4) (actual time=3.491..48.965 rows=69 loops=1)
Join Filter: (x.attribute_key = at.key)
Rows Removed by Join Filter: 10051
-> Nested Loop (cost=0.99..180270.15 rows=1 width=8) (actual time=3.396..45.211 rows=69 loops=1)
-> Index Scan using products_product_type_key_status on product p (cost=0.43..4420.58 rows=1413 width=4) (actual time=0.024..1.473 rows=1630 loops=1)
Index Cond: (product_type_key = 1)
-> Index Scan using product_details_product_attribute_key_status on product_details x (cost=0.56..124.44 rows=1 width=8) (actual time=0.026..0.027 rows=0 loops=1630)
Index Cond: ((product_key = p.key) AND (status = 'active'))
Filter: ((value1 IS NOT NULL) AND (value1_search(value1) ~~* '%B %'::text))
Rows Removed by Filter: 14
-> Seq Scan on attribute_type at (cost=0.00..29.35 rows=265 width=4) (actual time=0.002..0.043 rows=147 loops=69)
Filter: ((value_type = 'text') AND (status = 'active'))
Rows Removed by Filter: 115
Planning Time: 0.732 ms
Execution Time: 49.089 ms
But if i search for %B s%, query took 75 secs and below is query plan (second time query execution took 63 sec)
In below query plan, DB engine didn't consider index for scan as in above query plan indexes were used. Not sure why ?
Group (cost=8057.69..8057.70 rows=1 width=4) (actual time=62138.730..62138.737 rows=12 loops=1)
Group Key: p.key
-> Sort (cost=8057.69..8057.70 rows=1 width=4) (actual time=62138.728..62138.732 rows=14 loops=1)
Sort Key: p.key
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=389.58..8057.68 rows=1 width=4) (actual time=2592.685..62138.710 rows=14 loops=1)
-> Hash Join (cost=389.15..4971.85 rows=368 width=4) (actual time=298.280..62129.956 rows=831 loops=1)
Hash Cond: (x.attribute_type = at.key)
-> Bitmap Heap Scan on product_details x (cost=356.48..4937.39 rows=681 width=8) (actual time=298.117..62128.452 rows=831 loops=1)
Recheck Cond: (value1_search(value1) ~~* '%B s%'::text)
Rows Removed by Index Recheck: 26168889
Filter: ((value1 IS NOT NULL) AND (status = 'active'))
Rows Removed by Filter: 22
Heap Blocks: exact=490 lossy=527123
-> Bitmap Index Scan on product_details_value1_gin (cost=0.00..356.31 rows=1109 width=0) (actual time=251.596..251.596 rows=2846970 loops=1)
Index Cond: (value1_search(value1) ~~* '%B s%'::text)
-> Hash (cost=29.35..29.35 rows=265 width=4) (actual time=0.152..0.153 rows=269 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 18kB
-> Seq Scan on attribute_type at (cost=0.00..29.35 rows=265 width=4) (actual time=0.010..0.122 rows=269 loops=1)
Filter: ((value_type = 'text') AND (status = 'active'))
Rows Removed by Filter: 221
-> Index Scan using product_pkey on product p (cost=0.43..8.39 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=831)
Index Cond: (key = x.product_key)
Filter: (product_type_key = 1)
Rows Removed by Filter: 1
Planning Time: 0.668 ms
Execution Time: 62138.794 ms
Any suggestions pls to improve query for search %B s%
thanks

ilike '%B %' has no usable trigrams in it. The planner knows this, and punishes the pg_trgm index plan so much that the planner then goes with an entirely different plan instead.
But ilike '%B s%' does have one usable trigram in it, ' s'. It turns out that this trigram sucks because it is extremely common in the searched data, but the planner currently has no way to accurately estimate how much it sucks.
Even worse, this large number matches means your full bitmap can't fit in work_mem so it goes lossy. Then it needs to recheck all the tuples in any page which contains even one tuple that has the ' s' trigram in it, which looks like it is most of the pages in your table.
The first thing to do is to increase your work_mem to the point you stop getting lossy blocks. If most of your time is spent in the CPU applying the recheck condition, this should help tremendously. If most of your time is spent reading the product_details from disk (so that the recheck has the data it needs to run) then it won't help much. If you had done EXPLAIN (ANALYZE, BUFFERS) with track_io_timing turned on, then we would already know which is which.
Another thing you could do is have the application inspect the search parameter, and if it looks like two letters (with or without a space between), then forcibly disable that index usage, or just throw an error if there is no good reason to do that type of search. For example, changing the part of the query to look like this will disable the index:
where value1_search(pr_d.value1)||'' ilike '%B s%'
Another thing would be to rethink your data representation. '%B s%' is a peculiar thing to search for. Why would anyone search for that? Does it have some special meaning within the context of your data, which is not obvious to the outside observer? Maybe you could represent it in a different way that gets along better with pg_trgm.
Finally, you could try to improve the planning for GIN indexes generally by explicitly estimating how many tuples are going to fail recheck (due to inherent lossiness of the index, not due to overrunning work_mem). This would be a major undertaking, and you would be unlikely to see it in production for at least a couple years, if ever.

PostgreSQL slow order

I have table (over 100 millions records) on PostgreSQL 13.1
CREATE TABLE report
(
id serial primary key,
license_plate_id integer,
datetime timestamp
);
Indexes (for test I create both of them):
create index report_lp_datetime_index on report (license_plate_id, datetime);
create index report_lp_datetime_desc_index on report (license_plate_id desc, datetime desc);
So, my question is why query like
select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34,75)
order by datetime desc
limit 100
Is very slow (~10sec). But query without order statement is fast (milliseconds).
Explain:
explain (analyze, buffers, format text) select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34, 75,374,57123)
limit 100
Limit (cost=0.57..400.38 rows=100 width=316) (actual time=0.037..0.216 rows=100 loops=1)
Buffers: shared hit=103
-> Index Scan using report_lp_id_idx on report r (cost=0.57..44986.97 rows=11252 width=316) (actual time=0.035..0.202 rows=100 loops=1)
Index Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
Buffers: shared hit=103
Planning Time: 0.228 ms
Execution Time: 0.251 ms
explain (analyze, buffers, format text) select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34,75,374,57123)
order by datetime desc
limit 100
Limit (cost=44193.63..44193.88 rows=100 width=316) (actual time=4921.030..4921.047 rows=100 loops=1)
Buffers: shared hit=11455 read=671
-> Sort (cost=44193.63..44221.76 rows=11252 width=316) (actual time=4921.028..4921.035 rows=100 loops=1)
Sort Key: datetime DESC
Sort Method: top-N heapsort Memory: 128kB
Buffers: shared hit=11455 read=671
-> Bitmap Heap Scan on report r (cost=151.18..43763.59 rows=11252 width=316) (actual time=54.422..4911.927 rows=12148 loops=1)
Recheck Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
Heap Blocks: exact=12063
Buffers: shared hit=11455 read=671
-> Bitmap Index Scan on report_lp_id_idx (cost=0.00..148.37 rows=11252 width=0) (actual time=52.631..52.632 rows=12148 loops=1)
Index Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
Buffers: shared hit=59 read=4
Planning Time: 0.427 ms
Execution Time: 4921.128 ms

You seem to have rather slow storage, if reading 671 8kB-blocks from disk takes a couple of seconds.
The way to speed this up is to reorder the table in the same way as the index, so that you can find the required rows in the same or adjacent table blocks:
CLUSTER report_lp_id_idx USING report_lp_id_idx;
Be warned that rewriting the table in this way causes downtime – the table will not be available while it is being rewritten. Moreover, PostgreSQL does not maintain the table order, so subsequent data modifications will cause performance to gradually deteriorate, so that after a while you will have to run CLUSTER again.
But if you need this query to be fast no matter what, CLUSTER is the way to go.

Your two indices do exactly the same thing, so you can remove the second one, it's useless.
To optimize your query, the order of the fields inside the index must be reversed:
create index report_lp_datetime_index on report (datetime,license_plate_id);
BEGIN;
CREATE TABLE foo (d INTEGER, i INTEGER);
INSERT INTO foo SELECT random()*100000, random()*1000 FROM generate_series(1,1000000) s;
CREATE INDEX foo_d_i ON foo(d DESC,i);
COMMIT;
VACUUM ANALYZE foo;
EXPLAIN ANALYZE SELECT * FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100;
Limit (cost=0.42..343.92 rows=100 width=8) (actual time=0.076..9.359 rows=100 loops=1)
-> Index Only Scan Backward using foo_d_i on foo (cost=0.42..40976.43 rows=11929 width=8) (actual time=0.075..9.339 rows=100 loops=1)
Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
Rows Removed by Filter: 9016
Heap Fetches: 0
Planning Time: 0.339 ms
Execution Time: 9.387 ms
Note the index is not used to optimize the WHERE clause. It is used here as a compact and fast way to store references to the rows ordered by date DESC, so the ORDER BY can do an index-only scan and avoid sorting. By adding column id to the index, an index-only scan can be performed to test the condition on id, without hitting the table for every row. Since there is a low LIMIT value it does not need to scan the whole index, it only scans it in date DESC order until it finds enough rows satisfying the WHERE condition to return the result.
It will be faster if you create the index in date DESC order, this could be useful if you use ORDER BY date DESC + LIMIT in other queries too.
You forget that OP's table has a third column, and he is using SELECT *. So that wouldn't be an index-only scan.
Easy to work around. The optimum way to do this query would be an index-only scan to filter on WHERE conditions, then LIMIT, then hit the table to get the rows. For some reason if "select *" is used postgres takes the id column from the table instead of taking it from the index, which results in lots of unnecessary heap fetches for rows whose id is rejected by the WHERE condition.
Easy to work around, by doing it manually. I've also added another bogus column to make sure the SELECT * hits the table.
EXPLAIN (ANALYZE,buffers) SELECT * FROM foo
JOIN (SELECT d,i FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100) f USING (d,i)
ORDER BY d DESC LIMIT 100;
Limit (cost=0.85..1281.94 rows=1 width=17) (actual time=0.052..3.618 rows=100 loops=1)
Buffers: shared hit=453
-> Nested Loop (cost=0.85..1281.94 rows=1 width=17) (actual time=0.050..3.594 rows=100 loops=1)
Buffers: shared hit=453
-> Limit (cost=0.42..435.44 rows=100 width=8) (actual time=0.037..2.953 rows=100 loops=1)
Buffers: shared hit=53
-> Index Only Scan using foo_d_i on foo foo_1 (cost=0.42..51936.43 rows=11939 width=8) (actual time=0.037..2.935 rows=100 loops=1)
Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
Rows Removed by Filter: 9010
Heap Fetches: 0
Buffers: shared hit=53
-> Index Scan using foo_d_i on foo (cost=0.42..8.45 rows=1 width=17) (actual time=0.005..0.005 rows=1 loops=100)
Index Cond: ((d = foo_1.d) AND (i = foo_1.i))
Buffers: shared hit=400
Execution Time: 3.663 ms
Another option is to just add the primary key to the date,license_plate index.
SELECT * FROM foo JOIN (SELECT id FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100) f USING (id) ORDER BY d DESC LIMIT 100;
Limit (cost=1357.98..1358.23 rows=100 width=17) (actual time=3.920..3.947 rows=100 loops=1)
Buffers: shared hit=473
-> Sort (cost=1357.98..1358.23 rows=100 width=17) (actual time=3.919..3.931 rows=100 loops=1)
Sort Key: foo.d DESC
Sort Method: quicksort Memory: 32kB
Buffers: shared hit=473
-> Nested Loop (cost=0.85..1354.66 rows=100 width=17) (actual time=0.055..3.858 rows=100 loops=1)
Buffers: shared hit=473
-> Limit (cost=0.42..509.41 rows=100 width=8) (actual time=0.039..3.116 rows=100 loops=1)
Buffers: shared hit=73
-> Index Only Scan using foo_d_i_id on foo foo_1 (cost=0.42..60768.43 rows=11939 width=8) (actual time=0.039..3.093 rows=100 loops=1)
Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
Rows Removed by Filter: 9010
Heap Fetches: 0
Buffers: shared hit=73
-> Index Scan using foo_pkey on foo (cost=0.42..8.44 rows=1 width=17) (actual time=0.006..0.006 rows=1 loops=100)
Index Cond: (id = foo_1.id)
Buffers: shared hit=400
Execution Time: 3.972 ms
Edit
After thinking about it... since the LIMIT restricts the output to 100 rows ordered by date desc, wouldn't it be nice if we could get the 100 most recent rows for each license_plate_id, put all that into a top-n sort, and only keep the best 100 for all license_plate_ids? That would avoid reading and throwing away a lot of rows from the index. Even if that's much faster than hitting the table, it will still load up these index pages in RAM and clog up your buffers with stuff you don't actually need to keep in cache. Let's use LATERAL JOIN:
EXPLAIN (ANALYZE,BUFFERS)
SELECT * FROM foo
JOIN (SELECT d,i FROM
(VALUES (1),(2),(4),(5),(6),(7),(8),(10),(15),(22),(34),(75)) idlist
CROSS JOIN LATERAL
(SELECT d,i FROM foo WHERE i=idlist.column1 ORDER BY d DESC LIMIT 100) f2
ORDER BY d DESC LIMIT 100
) f3 USING (d,i)
ORDER BY d DESC LIMIT 100;
It's even faster: 2ms, and it uses the index on (license_plate_id,date) instead of the other way around. Also, and this is important, since each subquery in the lateral hits only the index pages that contain rows that will actually be selected, while the previous queries hit much more index pages. So you save on RAM buffers.
If you don't need the index on (date,license_plate_id) and don't want to keep a useless index, that could be interesting since this query doesn't use it. On the other hand, if you need the index on (date,license_plate_id) for something else and want to keep it, then... maybe not.
Please post results for the winning query 🔥

LIMIT with ORDER BY makes query slow

I am having problems optimizing a query in PostgreSQL 9.5.14.
select *
from file as f
join product_collection pc on (f.product_collection_id = pc.id)
where pc.mission_id = 7
order by f.id asc
limit 100;
Takes about 100 seconds. If I drop the limit clause it takes about 0.5:
With limit:
explain (analyze,buffers) ... -- query exactly as above
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.84..859.32 rows=100 width=457) (actual time=102793.422..102856.884 rows=100 loops=1)
Buffers: shared hit=222430592
-> Nested Loop (cost=0.84..58412343.43 rows=6804163 width=457) (actual time=102793.417..102856.872 rows=100 loops=1)
Buffers: shared hit=222430592
-> Index Scan using file_pkey on file f (cost=0.57..23409008.61 rows=113831736 width=330) (actual time=0.048..28207.152 rows=55858772 loops=1)
Buffers: shared hit=55652672
-> Index Scan using product_collection_pkey on product_collection pc (cost=0.28..0.30 rows=1 width=127) (actual time=0.001..0.001 rows=0 loops=55858772)
Index Cond: (id = f.product_collection_id)
Filter: (mission_id = 7)
Rows Removed by Filter: 1
Buffers: shared hit=166777920
Planning time: 0.803 ms
Execution time: 102856.988 ms
Without limit:
=> explain (analyze,buffers) ... -- query as above, just without limit
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=20509671.01..20526681.42 rows=6804163 width=457) (actual time=456.175..510.596 rows=142055 loops=1)
Sort Key: f.id
Sort Method: quicksort Memory: 79392kB
Buffers: shared hit=37956
-> Nested Loop (cost=0.84..16494851.02 rows=6804163 width=457) (actual time=0.044..231.051 rows=142055 loops=1)
Buffers: shared hit=37956
-> Index Scan using product_collection_mission_id_index on product_collection pc (cost=0.28..46.13 rows=87 width=127) (actual time=0.017..0.101 rows=87 loops=1)
Index Cond: (mission_id = 7)
Buffers: shared hit=10
-> Index Scan using file_product_collection_id_index on file f (cost=0.57..187900.11 rows=169535 width=330) (actual time=0.007..1.335 rows=1633 loops=87)
Index Cond: (product_collection_id = pc.id)
Buffers: shared hit=37946
Planning time: 0.807 ms
Execution time: 569.865 ms
I have copied the database to a backup server so that I may safely manipulate the database without something else changing it on me.
Cardinalities:
Table file: 113,831,736 rows.
Table product_collection: 1370 rows.
The query without LIMIT: 142,055 rows.
SELECT count(*) FROM product_collection WHERE mission_id = 7: 87 rows.
What I have tried:
searching stack overflow
vacuum full analyze
creating two column indexes on file.product_collection_id & file.id. (there already are single column indexes on every field touched.)
creating two column indexes on file.id & file.product_collection_id.
increasing the statistics on file.id & file.product_collection_id, then re-vacuum analyze.
changing various query planner settings.
creating non-materialized views.
walking up and down the hallway while muttering to myself.
None of them seem to change the performance in a significant way.
Thoughts?
UPDATE from OP:
Tested this on PostgreSQL 9.6 & 10.4, and found no significant changes in plans or performance.
However, setting random_page_cost low enough is the only way to get faster performance on the without limit search.
With a default random_page_cost = 4, the without limit:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=9270013.01..9287875.64 rows=7145054 width=457) (actual time=47782.523..47843.812 rows=145697 loops=1)
Sort Key: f.id
Sort Method: external sort Disk: 59416kB
Buffers: shared hit=3997185 read=1295264, temp read=7427 written=7427
-> Hash Join (cost=24.19..6966882.72 rows=7145054 width=457) (actual time=1.323..47458.767 rows=145697 loops=1)
Hash Cond: (f.product_collection_id = pc.id)
Buffers: shared hit=3997182 read=1295264
-> Seq Scan on file f (cost=0.00..6458232.17 rows=116580217 width=330) (actual time=0.007..17097.581 rows=116729984 loops=1)
Buffers: shared hit=3997169 read=1295261
-> Hash (cost=23.08..23.08 rows=89 width=127) (actual time=0.840..0.840 rows=87 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 15kB
Buffers: shared hit=13 read=3
-> Bitmap Heap Scan on product_collection pc (cost=4.97..23.08 rows=89 width=127) (actual time=0.722..0.801 rows=87 loops=1)
Recheck Cond: (mission_id = 7)
Heap Blocks: exact=10
Buffers: shared hit=13 read=3
-> Bitmap Index Scan on product_collection_mission_id_index (cost=0.00..4.95 rows=89 width=0) (actual time=0.707..0.707 rows=87 loops=1)
Index Cond: (mission_id = 7)
Buffers: shared hit=3 read=3
Planning time: 0.929 ms
Execution time: 47911.689 ms
User Erwin's answer below will take me some time to fully understand and generalize to all of the use cases needed. In the mean time we will probably use either a materialized view or just flatten our table structure.

This query is harder for the Postgres query planner than it might look. Depending on cardinalities, data distribution, value frequencies, sizes, ... completely different query plans can prevail and the planner has a hard time predicting which is best. Current versions of Postgres are better at this in several aspects, but it's still hard to optimize.
Since you retrieve only relatively few rows from product_collection, this equivalent query with LIMIT in a LATERAL subquery should avoid performance degradation:
SELECT *
FROM product_collection pc
CROSS JOIN LATERAL (
SELECT *
FROM file f -- big table
WHERE f.product_collection_id = pc.id
ORDER BY f.id
LIMIT 100
) f
WHERE pc.mission_id = 7
ORDER BY f.id
LIMIT 100;
Edit: This results in a query plan with explain (analyze,verbose) provided by the OP:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=30524.34..30524.59 rows=100 width=457) (actual time=13.128..13.167 rows=100 loops=1)
Buffers: shared hit=3213
-> Sort (cost=30524.34..30546.09 rows=8700 width=457) (actual time=13.126..13.152 rows=100 loops=1)
Sort Key: file.id
Sort Method: top-N heapsort Memory: 76kB
Buffers: shared hit=3213
-> Nested Loop (cost=0.57..30191.83 rows=8700 width=457) (actual time=0.060..9.868 rows=2880 loops=1)
Buffers: shared hit=3213
-> Seq Scan on product_collection pc (cost=0.00..69.12 rows=87 width=127) (actual time=0.024..0.336 rows=87 loops=1)
Filter: (mission_id = 7)
Rows Removed by Filter: 1283
Buffers: shared hit=13
-> Limit (cost=0.57..344.24 rows=100 width=330) (actual time=0.008..0.071 rows=33 loops=87)
Buffers: shared hit=3200
-> Index Scan using file_pc_id_index on file (cost=0.57..582642.42 rows=169535 width=330) (actual time=0.007..0.065 rows=33 loops=87)
Index Cond: (product_collection_id = pc.id)
Buffers: shared hit=3200
Planning time: 0.595 ms
Execution time: 13.319 ms
You need these indexes (will help your original query, too):
CREATE INDEX idx1 ON file (product_collection_id, id); -- crucial
CREATE INDEX idx2 ON product_collection (mission_id, id); -- helpful
You mentioned:
two column indexes on file.id & file.product_collection_id.
Etc. But we need it the other way round: id last. The order of index expressions is crucial. See:
Is a composite index also good for queries on the first field?
Rationale: With only 87 rows from product_collection, we only fetch a maximum of 87 x 100 = 8700 rows (fewer if not every pc.id has 100 rows in table file), which are then sorted before picking the top 100. Performance degrades with the number of rows you get from product_collection and with bigger LIMIT.
With the multicolumn index idx1 above, that's 87 fast index scans. The rest is not very expensive.
More optimization is possible, depending on additional information. Related:
Can spatial index help a “range - order by - limit” query

How does one interpret the following PostgreSQL query plan

Please, observe:
(Forgot to add order, the plan is updated)
The query:
EXPLAIN ANALYZE
SELECT DISTINCT(id), special, customer, business_no, bill_to_name, bill_to_address1, bill_to_address2, bill_to_postal_code, ship_to_name, ship_to_address1, ship_to_address2, ship_to_postal_code,
purchase_order_no, ship_date::text, calc_discount_text(o) AS discount, discount_absolute, delivery, hst_percents, sub_total, total_before_hst, hst, total, total_discount, terms, rep, ship_via,
item_count, version, to_char(modified, 'YYYY-MM-DD HH24:MI:SS') AS "modified", to_char(created, 'YYYY-MM-DD HH24:MI:SS') AS "created"
FROM invoices o
LEFT JOIN reps ON reps.rep_id = o.rep_id
LEFT JOIN terms ON terms.terms_id = o.terms_id
LEFT JOIN shipVia ON shipVia.ship_via_id = o.ship_via_id
JOIN invoiceItems items ON items.invoice_id = o.id
WHERE items.qty < 5
ORDER BY modified
LIMIT 100
The result:
Limit (cost=2931740.10..2931747.85 rows=100 width=635) (actual time=414307.004..414387.899 rows=100 loops=1)
-> Unique (cost=2931740.10..3076319.37 rows=1865539 width=635) (actual time=414307.001..414387.690 rows=100 loops=1)
-> Sort (cost=2931740.10..2936403.95 rows=1865539 width=635) (actual time=414307.000..414325.058 rows=2956 loops=1)
Sort Key: (to_char(o.modified, 'YYYY-MM-DD HH24:MI:SS'::text)), o.id, o.special, o.customer, o.business_no, o.bill_to_name, o.bill_to_address1, o.bill_to_address2, o.bill_to_postal_code, o.ship_to_name, o.ship_to_address1, o.ship_to_address2, (...)
Sort Method: external merge Disk: 537240kB
-> Hash Join (cost=11579.63..620479.38 rows=1865539 width=635) (actual time=1535.805..131378.864 rows=1872673 loops=1)
Hash Cond: (items.invoice_id = o.id)
-> Seq Scan on invoiceitems items (cost=0.00..78363.45 rows=1865539 width=4) (actual time=0.110..4591.117 rows=1872673 loops=1)
Filter: (qty < 5)
Rows Removed by Filter: 1405763
-> Hash (cost=5498.18..5498.18 rows=64996 width=635) (actual time=1530.786..1530.786 rows=64996 loops=1)
Buckets: 1024 Batches: 64 Memory Usage: 598kB
-> Hash Left Join (cost=113.02..5498.18 rows=64996 width=635) (actual time=0.214..1043.207 rows=64996 loops=1)
Hash Cond: (o.ship_via_id = shipvia.ship_via_id)
-> Hash Left Join (cost=75.35..4566.81 rows=64996 width=607) (actual time=0.154..754.957 rows=64996 loops=1)
Hash Cond: (o.terms_id = terms.terms_id)
-> Hash Left Join (cost=37.67..3800.33 rows=64996 width=579) (actual time=0.071..506.145 rows=64996 loops=1)
Hash Cond: (o.rep_id = reps.rep_id)
-> Seq Scan on invoices o (cost=0.00..2868.96 rows=64996 width=551) (actual time=0.010..235.977 rows=64996 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.044..0.044 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on reps (cost=0.00..22.30 rows=1230 width=36) (actual time=0.027..0.032 rows=4 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.067..0.067 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on terms (cost=0.00..22.30 rows=1230 width=36) (actual time=0.001..0.007 rows=3 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.043..0.043 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on shipvia (cost=0.00..22.30 rows=1230 width=36) (actual time=0.027..0.032 rows=4 loops=1)
Total runtime: 414488.582 ms
This is, obviously, awful. I am pretty new to interpreting query plans and would like to know how to extract the useful performance improvement hints from such a plan.
EDIT 1
Two kinds of entities are involved in this query - invoices and invoice items having the 1-many relationship.
An invoice item specifies the quantity of it within the parent invoice.
The given query returns 100 invoices which have at least one item with the quantity of less than 5.
That should explain why I need DISTINCT - an invoice may have several items satisfying the filter, but I do not want that same invoice returned multiple times. Hence the usage of DISTINCT. However, I am perfectly aware that there may be better means to accomplish the same semantics than using DISTINCT - I am more than willing to learn about them.
EDIT 2
Please, find below the indexes on the invoiceItems table at the time of the query:
CREATE INDEX invoiceitems_invoice_id_idx ON invoiceitems (invoice_id);
CREATE INDEX invoiceitems_invoice_id_name_index ON invoiceitems (invoice_id, name varchar_pattern_ops);
CREATE INDEX invoiceitems_name_index ON invoiceitems (name varchar_pattern_ops);
CREATE INDEX invoiceitems_qty_index ON invoiceitems (qty);
EDIT 3
The advice given by https://stackoverflow.com/users/808806/yieldsfalsehood as to how eliminate DISTINCT (and why) turns out to be a really good one. Here is the new query:
EXPLAIN ANALYZE
SELECT id, special, customer, business_no, bill_to_name, bill_to_address1, bill_to_address2, bill_to_postal_code, ship_to_name, ship_to_address1, ship_to_address2, ship_to_postal_code,
purchase_order_no, ship_date::text, calc_discount_text(o) AS discount, discount_absolute, delivery, hst_percents, sub_total, total_before_hst, hst, total, total_discount, terms, rep, ship_via,
item_count, version, to_char(modified, 'YYYY-MM-DD HH24:MI:SS') AS "modified", to_char(created, 'YYYY-MM-DD HH24:MI:SS') AS "created"
FROM invoices o
LEFT JOIN reps ON reps.rep_id = o.rep_id
LEFT JOIN terms ON terms.terms_id = o.terms_id
LEFT JOIN shipVia ON shipVia.ship_via_id = o.ship_via_id
WHERE EXISTS (SELECT 1 FROM invoiceItems items WHERE items.invoice_id = id AND items.qty < 5)
ORDER BY modified DESC
LIMIT 100
Here is the new plan:
Limit (cost=64717.14..64717.39 rows=100 width=635) (actual time=7830.347..7830.869 rows=100 loops=1)
-> Sort (cost=64717.14..64827.01 rows=43949 width=635) (actual time=7830.334..7830.568 rows=100 loops=1)
Sort Key: (to_char(o.modified, 'YYYY-MM-DD HH24:MI:SS'::text))
Sort Method: top-N heapsort Memory: 76kB
-> Hash Left Join (cost=113.46..63037.44 rows=43949 width=635) (actual time=2.322..6972.679 rows=64467 loops=1)
Hash Cond: (o.ship_via_id = shipvia.ship_via_id)
-> Hash Left Join (cost=75.78..50968.72 rows=43949 width=607) (actual time=0.650..3809.276 rows=64467 loops=1)
Hash Cond: (o.terms_id = terms.terms_id)
-> Hash Left Join (cost=38.11..50438.25 rows=43949 width=579) (actual time=0.550..3527.558 rows=64467 loops=1)
Hash Cond: (o.rep_id = reps.rep_id)
-> Nested Loop Semi Join (cost=0.43..49796.28 rows=43949 width=551) (actual time=0.015..3200.735 rows=64467 loops=1)
-> Seq Scan on invoices o (cost=0.00..2868.96 rows=64996 width=551) (actual time=0.002..317.954 rows=64996 loops=1)
-> Index Scan using invoiceitems_invoice_id_idx on invoiceitems items (cost=0.43..7.61 rows=42 width=4) (actual time=0.030..0.030 rows=1 loops=64996)
Index Cond: (invoice_id = o.id)
Filter: (qty < 5)
Rows Removed by Filter: 1
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.213..0.213 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on reps (cost=0.00..22.30 rows=1230 width=36) (actual time=0.183..0.192 rows=4 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.063..0.063 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on terms (cost=0.00..22.30 rows=1230 width=36) (actual time=0.044..0.050 rows=3 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.096..0.096 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on shipvia (cost=0.00..22.30 rows=1230 width=36) (actual time=0.071..0.079 rows=4 loops=1)
Total runtime: 7832.750 ms
Is it the best I can count on? I have restarted the server (to clean the database caches) and rerun the query without EXPLAIN ANALYZE. It takes almost 5 seconds. Can it be improved even further? I have 65,000 invoices and 3,278,436 invoice items.
EDIT 4
Found it. I was ordering by a computation result, modified = to_char(modified, 'YYYY-MM-DD HH24:MI:SS'). Adding an index on the modified invoice field and ordering by the field itself brings the result to under 100 ms !
The final plan is:
Limit (cost=1.18..1741.92 rows=100 width=635) (actual time=3.002..27.065 rows=100 loops=1)
-> Nested Loop Left Join (cost=1.18..765042.09 rows=43949 width=635) (actual time=2.989..25.989 rows=100 loops=1)
-> Nested Loop Left Join (cost=1.02..569900.41 rows=43949 width=607) (actual time=0.413..16.863 rows=100 loops=1)
-> Nested Loop Left Join (cost=0.87..386185.48 rows=43949 width=579) (actual time=0.333..15.694 rows=100 loops=1)
-> Nested Loop Semi Join (cost=0.72..202470.54 rows=43949 width=551) (actual time=0.017..13.965 rows=100 loops=1)
-> Index Scan Backward using invoices_modified_index on invoices o (cost=0.29..155543.23 rows=64996 width=551) (actual time=0.003..4.543 rows=100 loops=1)
-> Index Scan using invoiceitems_invoice_id_idx on invoiceitems items (cost=0.43..7.61 rows=42 width=4) (actual time=0.079..0.079 rows=1 loops=100)
Index Cond: (invoice_id = o.id)
Filter: (qty < 5)
Rows Removed by Filter: 1
-> Index Scan using reps_pkey on reps (cost=0.15..4.17 rows=1 width=36) (actual time=0.007..0.008 rows=1 loops=100)
Index Cond: (rep_id = o.rep_id)
-> Index Scan using terms_pkey on terms (cost=0.15..4.17 rows=1 width=36) (actual time=0.003..0.004 rows=1 loops=100)
Index Cond: (terms_id = o.terms_id)
-> Index Scan using shipvia_pkey on shipvia (cost=0.15..4.17 rows=1 width=36) (actual time=0.006..0.008 rows=1 loops=100)
Index Cond: (ship_via_id = o.ship_via_id)
Total runtime: 27.572 ms
It is amazing! Thank you all for the help.

For starters, it's pretty standard to post explain plans to http://explain.depesz.com - that'll add some pretty formatting to it, give you a nice way to distribute the plan, and let you anonymize plans that might contain sensitive data. Even if you're not distributing the plan it makes it a lot easier to understand what's happening and can sometimes illustrate exactly where a bottleneck is.
There are countless resources that cover interpreting the details of postgres explain plans (see https://wiki.postgresql.org/wiki/Using_EXPLAIN). There are a lot of little details that get taken in to account when the database chooses a plan, but there are some general concepts that can make it easier. First, get a grasp of the page-based layout of data and indexes (you don't need to know the details of the page format, just how data and indexes get split in to pages). From there, get a feel for the two basic data access methods - full table scans and index scans - and with a little thought it should start to become clear the different situations where one would be preferred to the other (also keep in mind that an index scan isn't even always possible). At that point you can start looking in to some of the different configuration items that affect plan selection in the context of how they might tip the scale in favor of a table scan or an index scan.
Once you've got that down, move on up the plan and read in to the details of the different nodes you find - in this plan you've got a lot of hash joins, so read up on that to start with. Then, to compare apples to apples, disable hash joins entirely ("set enable_hashjoin = false;") and run your explain analyze again. Now what join method do you see? Read up on that. Compare the estimated cost of that method with the estimated cost of the hash join. Why might they be different? The estimated cost of the second plan will be higher than this first plan (otherwise it would have been preferred in the first place) but what about the real time that it takes to run the second plan? Is it lower or higher?
Finally, to address this plan specifically. With regards to that sort that's taking a long time: distinct is not a function. "DISTINCT(id)" does not say "give me all the rows that are distinct on only the column id", instead it is sorting the rows and taking the unique values based on all columns in the output (i.e. it is equivalent to writing "distinct id ..."). You should probably re-consider if you actually need that distinct in there. Normalization will tend to design away the need for distincts, and while they will occasionally be needed, whether they really are super truly needed is not always true.

You begin by chasing down the node that takes the longest, and start optimizing there. In your case, that appears to be
Seq Scan on invoiceitems items
You should add an index there, and problem also to the other tables.
You could also try increasing work_mem to get rid of the external sort.
When you have done that, the new plan will probably look completely differently, so then start over.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse