How to skip partitions when scanning a query with ORDER BY? (PostgreSQL)

I'm trying to query a range-partitioned Postgres table with ORDER BY and LIMIT.
Is there a way to make it scan the partitions in order and skip the remaining partitions once it has reached the limit of 20?
SELECT *
FROM gateway_samples_test
WHERE gateway_id = 14920
ORDER BY timestamp DESC
LIMIT 1
When running in Postgres 11.3:
Limit (cost=2.39..1521.70 rows=1 width=411)
-> Merge Append (cost=2.39..13445838.34 rows=8850 width=411)
Sort Key: gateway_samples_2022_8_28."timestamp" DESC
-> Index Scan Backward using gateway_samples_2022_8_28_timestamp_idx on gateway_samples_old (cost=0.57..12393096.00 rows=8787 width=411)
Filter: (gateway_id = 14920)
-> Index Scan Backward using gateway_samples_2022_8_29_timestamp_idx on gateway_samples_2022_8_29 (cost=0.42..507283.89 rows=28 width=414)
Filter: (gateway_id = 14920)
-> Index Scan Backward using gateway_samples_2022_8_30_timestamp_idx on gateway_samples_2022_8_30 (cost=0.42..471569.21 rows=27 width=414)
Filter: (gateway_id = 14920)
-> Index Scan Backward using gateway_samples_2022_8_31_timestamp_idx on gateway_samples_2022_8_31 (cost=0.29..72649.94 rows=4 width=414)
Filter: (gateway_id = 14920)
-> Index Scan Backward using gateway_samples_2022_9_1_timestamp_idx on gateway_samples_2022_9_1 (cost=0.14..265.54 rows=1 width=974)
Filter: (gateway_id = 14920)
-> Index Scan Backward using gateway_samples_2022_9_2_timestamp_idx on gateway_samples_2022_9_2 (cost=0.14..265.54 rows=1 width=974)
Filter: (gateway_id = 14920)
-> Index Scan Backward using gateway_samples_2022_9_3_timestamp_idx on gateway_samples_2022_9_3 (cost=0.14..265.54 rows=1 width=974)
Filter: (gateway_id = 14920)
-> Index Scan Backward using gateway_samples_default_timestamp_idx on gateway_samples_default (cost=0.14..265.54 rows=1 width=974)
Filter: (gateway_id = 14920)
When running in Postgres 12 (here the first partition contains only a subset of the data):
Limit (cost=271.63..271.63 rows=1 width=449)
-> Sort (cost=271.63..271.78 rows=62 width=449)
Sort Key: gateway_samples_2022_8_28_test."timestamp" DESC
-> Append (cost=0.42..271.32 rows=62 width=449)
-> Index Scan using gateway_samples_2022_8_28_test_gateway_id_idx on gateway_samples_old_test (cost=0.42..109.85 rows=27 width=414)
Index Cond: (gateway_id = 14920)
-> Index Scan using gateway_samples_2022_8_29_test_gateway_id_idx on gateway_samples_2022_8_29_test (cost=0.42..104.92 rows=26 width=414)
Index Cond: (gateway_id = 14920)
-> Bitmap Heap Scan on gateway_samples_2022_8_30_test (cost=4.33..23.60 rows=5 width=414)
Recheck Cond: (gateway_id = 14920)
-> Bitmap Index Scan on gateway_samples_2022_8_30_test_gateway_id_idx (cost=0.00..4.33 rows=5 width=0)
Index Cond: (gateway_id = 14920)
-> Index Scan using gateway_samples_2022_8_31_test_gateway_id_idx on gateway_samples_2022_8_31_test (cost=0.14..8.16 rows=1 width=974)
Index Cond: (gateway_id = 14920)
-> Index Scan using gateway_samples_2022_9_1_test_gateway_id_idx on gateway_samples_2022_9_1_test (cost=0.14..8.16 rows=1 width=974)
Index Cond: (gateway_id = 14920)
-> Index Scan using gateway_samples_2022_9_2_test_gateway_id_idx on gateway_samples_2022_9_2_test (cost=0.14..8.16 rows=1 width=974)
Index Cond: (gateway_id = 14920)
-> Index Scan using gateway_samples_default_test_gateway_id_idx on gateway_samples_default_test (cost=0.14..8.16 rows=1 width=974)
Index Cond: (gateway_id = 14920)
For some reason I see it's not using the timestamp index in the latter plan.

Not in v11.
This will work automatically provided you have an index starting with "timestamp", and you upgrade to at least v12.
It seems like this could work without the index (sorting each partition in turn, starting at the proper end of the range, until it gets its LIMIT), but I guess no one bothered to implement that.
But your new query is different because of the gateway_id = 14920 condition. It can still scan the partitions in order using the timestamp index, but it thinks it will be faster to use the highly selective index on gateway_id instead. You can force it to use the timestamp index, even if it is worse, by setting enable_sort = off. But a better option would be to create a new index that can fulfill both needs simultaneously: (gateway_id, timestamp).
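A minimal sketch of that combined index, assuming you are free to choose the index name (creating it on the partitioned parent in v11 or later cascades to every partition):
-- Hypothetical index name; (gateway_id, "timestamp") lets each partition be
-- filtered on gateway_id and read in timestamp order at the same time.
CREATE INDEX gateway_samples_test_gw_ts_idx
    ON gateway_samples_test (gateway_id, "timestamp");
-- To merely test the timestamp-only plan in one session, as mentioned above:
SET enable_sort = off;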

Related

How to read costs in Postgres explain statements?

Here's an example EXPLAIN output from Postgres:
Aggregate (cost=55881.29..55881.30 rows=1 width=64)
-> Nested Loop (cost=1509.25..55881.28 rows=1 width=32)
-> Nested Loop (cost=1508.82..55880.82 rows=1 width=23)
-> Nested Loop (cost=1508.53..55880.48 rows=1 width=19)
Join Filter: (n.id = ci.person_id)
-> Nested Loop (cost=1508.09..55874.34 rows=1 width=27)
-> Nested Loop (cost=1507.67..55873.73 rows=1 width=23)
-> Nested Loop (cost=1507.24..55865.27 rows=1 width=4)
-> Seq Scan on info_type it (cost=0.00..2.41 rows=1 width=4)
Filter: ((info)::text = 'mini biography'::text)
-> Bitmap Heap Scan on person_info pi (cost=1507.24..55862.85 rows=1 width=8)
Recheck Cond: (info_type_id = it.id)
Filter: ((note)::text = 'Volker Boehm'::text)
-> Bitmap Index Scan on info_type_id_person_info (cost=0.00..1507.24 rows=137974 width=0)
Index Cond: (info_type_id = it.id)
-> Index Scan using name_pkey on name n (cost=0.43..8.46 rows=1 width=19)
Index Cond: (id = pi.person_id)
Filter: (((name_pcode_cf)::text >= 'A'::text) AND ((name_pcode_cf)::text <= 'F'::text) AND (((gender)::text = 'm'::text) OR (((gender)::text = 'f'::text) AND ((name)::text ~~ 'B%'::text))))
-> Index Scan using person_id_aka_name on aka_name an (cost=0.42..0.60 rows=2 width=4)
Index Cond: (person_id = n.id)
Filter: ((name)::text ~~ '%a%'::text)
-> Index Scan using person_id_cast_info on cast_info ci (cost=0.44..4.40 rows=139 width=8)
Index Cond: (person_id = an.person_id)
-> Index Only Scan using linked_movie_id_movie_link on movie_link ml (cost=0.29..0.32 rows=2 width=4)
Index Cond: (linked_movie_id = ci.movie_id)
-> Index Scan using title_pkey on title t (cost=0.43..0.46 rows=1 width=21)
Index Cond: (id = ci.movie_id)
Filter: ((production_year >= 1980) AND (production_year <= 1995))
I understand that cost=a...b means a is the startup cost and b is the total cost. But the total cost of what? The total cost of everything that happens within the node, or just the node's own work? For example, is the cost of a Nested Loop the cost of the nested loop itself, or that cost plus everything that happens inside it (more joins, table scans, etc.)? Thanks.
From the documentation: https://www.postgresql.org/docs/11/using-explain.html
It's important to understand that the cost of an upper-level node includes the cost of all its child nodes. It's also important to realize that the cost only reflects things that the planner cares about. In particular, the cost does not consider the time spent transmitting result rows to the client, which could be an important factor in the real elapsed time; but the planner ignores it because it cannot change it by altering the plan. (Every correct plan will output the same row set, we trust.)
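To see this rule in action on your own system, run EXPLAIN on a simple join over the same tables as in the plan above and compare each node's total cost with the sum of its children plus its own work (the query below is only an illustration, not taken from the question):
-- Costs are shown by default; each upper node's total cost already
-- includes the totals of the nodes nested beneath it.
EXPLAIN
SELECT *
FROM person_info pi
JOIN info_type it ON pi.info_type_id = it.id
WHERE it.info = 'mini biography';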

postgres seq scan on two partitions

When the ch table is queried directly on the rp_fk column, it uses an index for almost all of its partitions.
explain analyze select * from oc.ch ch where rp_fk = 'abc123';
Append (cost=0.14..9469.61 rows=2357 width=10008) (actual time=0.164..0.166 rows=0 loops=1)
-> Index Scan using ch_2016_rp_fk_idx1 on ch_2016 ch (cost=0.14..8.16 rows=1 width=11097) (actual time=0.015..0.016 rows=0 loops=1)
Index Cond: (rp_fk = 'abc123'::bpchar)
-> Index Scan using ch_2017_rp_fk_idx1 on ch_2017 ch_1 (cost=0.14..8.16 rows=1 width=10477) (actual time=0.022..0.022 rows=0 loops=1)
Index Cond: (rp_fk = 'abc123'::bpchar)
-> Index Scan using ch_2018_rp_fk_idx1 on ch_2018 ch_2 (cost=0.56..4.58 rows=1 width=12295) (actual time=0.031..0.031 rows=0 loops=1)
Index Cond: (rp_fk = 'abc123'::bpchar)
-> Index Scan using ch_2019_rp_fk_idx1 on ch_2019 ch_3 (cost=0.69..4.71 rows=1 width=12299) (actual time=0.025..0.025 rows=0 loops=1)
Index Cond: (rp_fk = 'abc123'::bpchar)
-> Index Scan using ch_2020_rp_fk_idx1 on ch_2020 ch_4 (cost=0.69..6030.67 rows=1510 width=10259) (actual time=0.029..0.029 rows=0 loops=1)
Index Cond: (rp_fk = 'abc123'::bpchar)
-> Bitmap Heap Scan on ch_2021 ch_5 (cost=55.08..3392.36 rows=841 width=9566) (actual time=0.018..0.019 rows=0 loops=1)
Recheck Cond: (rp_fk = 'abc123'::bpchar)
-> Bitmap Index Scan on ch_2021_rp_fk_idx1 (cost=0.00..54.87 rows=841 width=0) (actual time=0.016..0.016 rows=0 loops=1)
Index Cond: (rp_fk = 'abc123'::bpchar)
-> Index Scan using ch_2022_rp_fk_idx on ch_2022 ch_6 (cost=0.12..8.14 rows=1 width=12241) (actual time=0.013..0.013 rows=0 loops=1)
Index Cond: (rp_fk = 'abc123'::bpchar)
-> Seq Scan on ch_def ch_7 (cost=0.00..1.05 rows=1 width=12167) (actual time=0.009..0.009 rows=0 loops=1)
Filter: (rp_fk = 'abc123'::bpchar)
Rows Removed by Filter: 4
Planning Time: 24.637 ms
Execution Time: 0.538 ms
But when joining the ch table with the p table, from which it gets the rp_fk, it does a sequential scan on the ch_2018 and ch_2019 partitions. This is where most of the time is spent, whereas the 2020 and 2021 partitions are scanned using an index. Any insight as to why it uses a seq scan on these two partitions? All the partitions have the same indexes and up-to-date VACUUM ANALYZE. random_page_cost is set to 1.
explain analyze select * FROM oc.ch ch JOIN op.p p ON ch.rp_fk = p.p_pk WHERE ch.del_flg = '0'::bpchar and hf_p_num_cd ='112642817-002';
Nested Loop (cost=0.42..13134696.50 rows=96 width=12593) (actual time=193878.564..193878.569 rows=0 loops=1)
-> Index Scan using ix_p_hf_p_num_cd on p p (cost=0.42..8.44 rows=1 width=1420) (actual time=0.025..0.028 rows=1 loops=1)
Index Cond: ((hf_p_num_cd)::text = '112642817-002'::text)
-> Append (cost=0.00..12666318.27 rows=46836978 width=11167) (actual time=193878.532..193878.535 rows=0 loops=1)
-> Seq Scan on ch_2016 ch (cost=0.00..11.57 rows=13 width=11097) (actual time=0.072..0.072 rows=0 loops=1)
Filter: ((del_flg = '0'::bpchar) AND (p.p_pk = rp_fk))
Rows Removed by Filter: 38
-> Index Scan using ch_2017_rp_fk_idx1 on ch_2017 ch_1 (cost=0.14..20.94 rows=40 width=10477) (actual time=0.016..0.016 rows=0 loops=1)
Index Cond: (rp_fk = p.p_pk)
Filter: (del_flg = '0'::bpchar)
-> Seq Scan on ch_2018 ch_2 (cost=0.00..4005273.43 rows=15109962 width=12295) (actual time=25051.825..25051.825 rows=0 loops=1)
Filter: ((del_flg = '0'::bpchar) AND (p.p_pk = rp_fk))
Rows Removed by Filter: 15111645
-> Seq Scan on ch_2019 ch_3 (cost=0.00..8406638.77 rows=31721918 width=12299) (actual time=168826.481..168826.481 rows=0 loops=1)
Filter: ((del_flg = '0'::bpchar) AND (p.p_pk = rp_fk))
Rows Removed by Filter: 31722679
-> Index Scan using ch_2020_rp_fk_idx1 on ch_2020 ch_4 (cost=0.69..14893.73 rows=3729 width=10259) (actual time=0.057..0.057 rows=0 loops=1)
Index Cond: (rp_fk = p.p_pk)
Filter: (del_flg = '0'::bpchar)
-> Bitmap Heap Scan on ch_2021 ch_5 (cost=82.74..5285.74 rows=1314 width=9566) (actual time=0.033..0.033 rows=0 loops=1)
Recheck Cond: (rp_fk = p.p_pk)
Filter: (del_flg = '0'::bpchar)
-> Bitmap Index Scan on ch_2021_rp_fk_idx1 (cost=0.00..82.42 rows=1314 width=0) (actual time=0.022..0.022 rows=0 loops=1)
Index Cond: (rp_fk = p.p_pk)
-> Index Scan using ch_2022_rp_fk_idx on ch_2022 ch_6 (cost=0.12..8.14 rows=1 width=12241) (actual time=0.014..0.014 rows=0 loops=1)
Filter: ((del_flg = '0'::bpchar) AND (p.p_pk = rp_fk))
-> Seq Scan on ch_def ch_7 (cost=0.00..1.06 rows=1 width=12167) (actual time=0.017..0.017 rows=0 loops=1)
Filter: ((del_flg = '0'::bpchar) AND (p.p_pk = rp_fk))
Rows Removed by Filter: 4
Planning Time: 4.183 ms
Execution Time: 193878.918 ms
It seems that 'abc123' is a very rare value for rp_fk, so PostgreSQL plans an index scan.
But there appear to be values that are much more frequent. In the second query, the optimizer does not know which rp_fk value corresponds to hf_p_num_cd = '112642817-002', so it goes with an average estimate, which turns out to be much higher for some partitions. Hence the sequential scans.
I would split the query into two parts and query the partitioned table with the constants found by the first query (see the sketch after the code block below). Then the optimizer knows more and will plan better.
If you are certain that an index scan will always win, you can force the planner's hand:
BEGIN;
SET LOCAL enable_seqscan = off;
SET LOCAL enable_bitmapscan = off;
SELECT ...
COMMIT;
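A sketch of the split-query approach mentioned above; 'abc123' stands in for whatever value the first statement returns:
-- Step 1: resolve the selective predicate to a concrete key.
SELECT p_pk
FROM op.p
WHERE hf_p_num_cd = '112642817-002';

-- Step 2: query the partitioned table with that constant, so the planner
-- can use the per-partition statistics for rp_fk (placeholder value shown).
SELECT *
FROM oc.ch
WHERE del_flg = '0'
  AND rp_fk = 'abc123';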

Postgres query optimizer generates a bad plan after adding another ORDER BY criterion

I'm using the Django ORM with select_related, and it generates a query of the form:
SELECT *
FROM "coupons_coupon"
LEFT OUTER JOIN "coupons_merchant"
ON ("coupons_coupon"."merchant_id" = "coupons_merchant"."slug")
WHERE ("coupons_coupon"."end_date" > '2020-07-10T09:10:28.101980+00:00'::timestamptz AND "coupons_coupon"."published" = true)
ORDER BY "coupons_coupon"."end_date" ASC, "coupons_coupon"."id"
LIMIT 5;
Which is then executed using the following plan:
Limit (cost=4363.28..4363.30 rows=5 width=604) (actual time=21.864..21.865 rows=5 loops=1)
-> Sort (cost=4363.28..4373.34 rows=4022 width=604) (actual time=21.863..21.863 rows=5 loops=1)
Sort Key: coupons_coupon.end_date, coupons_coupon.id
Sort Method: top-N heapsort Memory: 32kB
-> Hash Left Join (cost=2613.51..4296.48 rows=4022 width=604) (actual time=13.918..20.209 rows=4022 loops=1)
Hash Cond: ((coupons_coupon.merchant_id)::text = (coupons_merchant.slug)::text)
-> Seq Scan on coupons_coupon (cost=0.00..291.41 rows=4022 width=261) (actual time=0.007..1.110 rows=4022 loops=1)
Filter: (published AND (end_date > '2020-07-10 09:10:28.10198+00'::timestamp with time zone))
Rows Removed by Filter: 1691
-> Hash (cost=1204.56..1204.56 rows=24956 width=331) (actual time=13.894..13.894 rows=23911 loops=1)
Buckets: 16384 Batches: 4 Memory Usage: 1948kB
-> Seq Scan on coupons_merchant (cost=0.00..1204.56 rows=24956 width=331) (actual time=0.003..4.681 rows=23911 loops=1)
This is a bad execution plan, as the join can be done after the left table has been filtered, ordered and limited. When I remove id from the ORDER BY, it generates an efficient plan, which basically could have been used for the previous query as well.
Limit (cost=0.57..8.84 rows=5 width=600) (actual time=0.013..0.029 rows=5 loops=1)
-> Nested Loop Left Join (cost=0.57..6650.48 rows=4022 width=600) (actual time=0.012..0.028 rows=5 loops=1)
-> Index Scan using coupons_cou_end_dat_a8d5b7_btree on coupons_coupon (cost=0.28..1015.77 rows=4022 width=261) (actual time=0.007..0.010 rows=5 loops=1)
Index Cond: (end_date > '2020-07-10 09:10:28.10198+00'::timestamp with time zone)
Filter: published
-> Index Scan using coupons_merchant_pkey on coupons_merchant (cost=0.29..1.40 rows=1 width=331) (actual time=0.003..0.003 rows=1 loops=5)
Index Cond: ((slug)::text = (coupons_coupon.merchant_id)::text)
Why is this happening? Can the optimizer be nudged to use a similar plan for the former query?
I'm using Postgres 12.
v13 of PostgreSQL, which should be released in the next few months, implements incremental sorting: it can read rows in a pre-sorted order based on prefix columns, then sort just the ties on those prefix column(s) by the remaining column(s), to get a complete sort based on more columns than the index provides. I think that will do more or less what you want.
Limit (cost=2.46..2.99 rows=5 width=21)
-> Incremental Sort (cost=2.46..405.58 rows=3850 width=21)
Sort Key: coupons_coupon.end_date, coupons_coupon.id
Presorted Key: coupons_coupon.end_date
-> Nested Loop Left Join (cost=0.31..253.48 rows=3850 width=21)
-> Index Scan using coupons_coupon_end_date_idx on coupons_coupon (cost=0.15..54.71 rows=302 width=17)
Index Cond: (end_date > '2020-07-10 05:10:28.10198-04'::timestamp with time zone)
Filter: published
-> Index Only Scan using coupons_merchant_slug_idx on coupons_merchant (cost=0.15..0.53 rows=13 width=4)
Index Cond: (slug = coupons_coupon.merchant_id)
Of course, just adding "id" to the current index will work under currently released versions, and even under version 13 it should be more efficient to have the index fully order the rows in the way you need them.
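A sketch of that suggestion for the currently released versions (the index name is illustrative):
-- With (end_date, id) the index delivers rows in exactly the requested
-- ORDER BY, so the LIMIT 5 can be read straight off the index.
CREATE INDEX coupons_coupon_end_date_id_idx
    ON coupons_coupon (end_date, id);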

Why does Postgres choose the wrong execution plan?

I have a simple query
select count(*)
from taxi_order.ta_orders o
inner join public.t_bases b on b.id = o.id_base
where o.c_phone2 = '012356789'
and b.id_organization = 1
and o.c_date_end < '2017-12-01'::date
group by date_trunc('month', o.c_date_end);
Most of the time this query runs fast, in less than 100 ms, but sometimes it runs very slowly, up to 4 seconds, for certain c_phone2, id_organization combinations.
Execution plan for fast case:
HashAggregate (cost=7005.05..7005.62 rows=163 width=8)
Group Key: date_trunc('month'::text, o.c_date_end)
-> Hash Join (cost=94.30..7004.23 rows=163 width=8)
Hash Cond: (o.id_base = b.id)
-> Index Scan using ix_ta_orders_c_phone2 on ta_orders o (cost=0.57..6899.41 rows=2806 width=12)
Index Cond: ((c_phone2)::text = $3)
Filter: (c_date_end < $4)
-> Hash (cost=93.26..93.26 rows=133 width=4)
-> Bitmap Heap Scan on t_bases b (cost=4.71..93.26 rows=133 width=4)
Recheck Cond: (id_organization = $2)
-> Bitmap Index Scan on ix_t_bases_id_organization (cost=0.00..4.68 rows=133 width=0)
Index Cond: (id_organization = $2)
Execution plan for slow case:
HashAggregate (cost=6604.97..6604.98 rows=1 width=8)
Group Key: date_trunc('month'::text, o.c_date_end)
-> Nested Loop (cost=2195.33..6604.97 rows=1 width=8)
-> Bitmap Heap Scan on t_bases b (cost=2.29..7.78 rows=3 width=4)
Recheck Cond: (id_organization = $2)
-> Bitmap Index Scan on ix_t_bases_id_organization (cost=0.00..2.29 rows=3 width=0)
Index Cond: (id_organization = $2)
-> Bitmap Heap Scan on ta_orders o (cost=2193.04..2199.06 rows=3 width=12)
Recheck Cond: (((c_phone2)::text = $3) AND (id_base = b.id) AND (c_date_end < $4))
-> BitmapAnd (cost=2193.04..2193.04 rows=3 width=0)
-> Bitmap Index Scan on ix_ta_orders_c_phone2 (cost=0.00..58.84 rows=3423 width=0)
Index Cond: ((c_phone2)::text = $3)
-> Bitmap Index Scan on ix_ta_orders_id_base_date_end (cost=0.00..2133.66 rows=83472 width=0)
Index Cond: ((id_base = b.id) AND (c_date_end < $4))
Why does the query planner sometimes choose such a slow, ineffective plan?
EDIT
Schema for tables:
create table taxi_order.ta_orders (
id bigserial not null,
id_base integer not null,
c_phone2 character varying(30),
c_date_end timestamp with time zone,
...
CONSTRAINT pk_ta_orders PRIMARY KEY (id),
CONSTRAINT fk_ta_orders_t_bases FOREIGN KEY (id_base) REFERENCES public.t_bases (id)
);
create table public.t_bases (
id serial not null,
id_organization integer not null,
...
CONSTRAINT pk_t_bases PRIMARY KEY (id)
);
ta_orders ~ 100M rows, t_bases ~ 2K rows.
EDIT2
Explain analyze for slow case:
HashAggregate (cost=6355.29..6355.29 rows=1 width=8) (actual time=4075.847..4075.847 rows=1 loops=1)
Group Key: date_trunc('month'::text, o.c_date_end)
-> Nested Loop (cost=2112.10..6355.28 rows=1 width=8) (actual time=114.871..4075.803 rows=2 loops=1)
-> Bitmap Heap Scan on t_bases b (cost=2.29..7.78 rows=3 width=4) (actual time=0.061..0.375 rows=133 loops=1)
Recheck Cond: (id_organization = $2)
Heap Blocks: exact=45
-> Bitmap Index Scan on ix_t_bases_id_organization (cost=0.00..2.29 rows=3 width=0) (actual time=0.045..0.045 rows=133 loops=1)
Index Cond: (id_organization = $2)
-> Bitmap Heap Scan on ta_orders o (cost=2109.81..2115.83 rows=3 width=12) (actual time=30.638..30.638 rows=0 loops=133)
Recheck Cond: (((c_phone2)::text = $3) AND (id_base = b.id) AND (c_date_end < $4))
Heap Blocks: exact=2
-> BitmapAnd (cost=2109.81..2109.81 rows=3 width=0) (actual time=30.635..30.635 rows=0 loops=133)
-> Bitmap Index Scan on ix_ta_orders_c_phone2 (cost=0.00..58.85 rows=3427 width=0) (actual time=0.032..0.032 rows=6 loops=133)
Index Cond: ((c_phone2)::text = $3)
-> Bitmap Index Scan on ix_ta_orders_id_base_date_end (cost=0.00..2050.42 rows=80216 width=0) (actual time=30.108..30.108 rows=94206 loops=133)
Index Cond: ((id_base = b.id) AND (c_date_end < $4))
Explain analyze for fast case:
HashAggregate (cost=7005.05..7005.62 rows=163 width=8) (actual time=0.927..0.928 rows=1 loops=1)
Group Key: date_trunc('month'::text, o.c_date_end)
-> Hash Join (cost=94.30..7004.23 rows=163 width=8) (actual time=0.903..0.913 rows=2 loops=1)
Hash Cond: (o.id_base = b.id)
-> Index Scan using ix_ta_orders_c_phone2 on ta_orders o (cost=0.57..6899.41 rows=2806 width=12) (actual time=0.591..0.604 rows=4 loops=1)
Index Cond: ((c_phone2)::text = $3)
Filter: (c_date_end < $4)
Rows Removed by Filter: 2
-> Hash (cost=93.26..93.26 rows=133 width=4) (actual time=0.237..0.237 rows=133 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> Bitmap Heap Scan on t_bases b (cost=4.71..93.26 rows=133 width=4) (actual time=0.058..0.196 rows=133 loops=1)
Recheck Cond: (id_organization = $2)
Heap Blocks: exact=45
-> Bitmap Index Scan on ix_t_bases_id_organization (cost=0.00..4.68 rows=133 width=0) (actual time=0.044..0.044 rows=133 loops=1)
Index Cond: (id_organization = $2)
I know I can create a separate index for every query to speed it up. But I want to know: what is the reason for choosing the wrong plan? What is wrong with my statistics?
You'd have to give us EXPLAIN (ANALYZE, BUFFERS) output for a definitive answer.
The difference between the plans is that the second plan chooses a nested loop join because it estimates that only very few rows will be selected from t_bases. Since you complain that the query is slow, that estimate is probably wrong, resulting in too many loops over the inner table.
Try to improve your table statistics by running ANALYZE, perhaps after increasing default_statistics_target.
A multi-column index on ta_orders(c_phone2, id_base, c_date_end) would improve the execution time for the nested loop plan.
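A sketch of both suggestions; the per-column statistics target is just one way to get the effect of a higher default_statistics_target, and the concrete value and index name are examples:
-- Collect a larger sample for the skewed column, then refresh statistics.
ALTER TABLE taxi_order.ta_orders ALTER COLUMN c_phone2 SET STATISTICS 1000;
ANALYZE taxi_order.ta_orders;
ANALYZE public.t_bases;

-- Multi-column index matching the inner side of the nested loop.
CREATE INDEX ix_ta_orders_phone2_base_date_end
    ON taxi_order.ta_orders (c_phone2, id_base, c_date_end);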
Not sure, but I can suggest a possible improvement to your query: remove the inner join. You're not selecting anything from that table, so why bother querying it? You should be able to add where o.id_base = ? to your query.
If you want this query to run quickly every time, you should add the following index to ta_orders: (id_base, c_phone2, c_date_end). It's important that the column with the > or < condition is at the end (otherwise Postgres will not be able to make use of it).
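Written out, that index would look something like this (the name is hypothetical):
-- Equality columns first, the range-condition column (c_date_end) last.
CREATE INDEX ix_ta_orders_base_phone2_date_end
    ON taxi_order.ta_orders (id_base, c_phone2, c_date_end);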

Order by ASC 100x faster than Order by DESC? Why?

I have a complex query generated by Hibernate for JBPM. I can't really modify it, and I'm trying to optimize it as much as possible.
I found out that ORDER BY ... DESC is much slower than ORDER BY ... ASC. Do you have any idea why?
PostgreSQL Version : 9.4
Schema : https://pastebin.com/qNZhrbef
Query :
select
taskinstan0_.ID_ as ID1_27_,
taskinstan0_.VERSION_ as VERSION3_27_,
taskinstan0_.NAME_ as NAME4_27_,
taskinstan0_.DESCRIPTION_ as DESCRIPT5_27_,
taskinstan0_.ACTORID_ as ACTORID6_27_,
taskinstan0_.CREATE_ as CREATE7_27_,
taskinstan0_.START_ as START8_27_,
taskinstan0_.END_ as END9_27_,
taskinstan0_.DUEDATE_ as DUEDATE10_27_,
taskinstan0_.PRIORITY_ as PRIORITY11_27_,
taskinstan0_.ISCANCELLED_ as ISCANCE12_27_,
taskinstan0_.ISSUSPENDED_ as ISSUSPE13_27_,
taskinstan0_.ISOPEN_ as ISOPEN14_27_,
taskinstan0_.ISSIGNALLING_ as ISSIGNA15_27_,
taskinstan0_.ISBLOCKING_ as ISBLOCKING16_27_,
taskinstan0_.LOCKED as LOCKED27_,
taskinstan0_.QUEUE as QUEUE27_,
taskinstan0_.TASK_ as TASK19_27_,
taskinstan0_.TOKEN_ as TOKEN20_27_,
taskinstan0_.PROCINST_ as PROCINST21_27_,
taskinstan0_.SWIMLANINSTANCE_ as SWIMLAN22_27_,
taskinstan0_.TASKMGMTINSTANCE_ as TASKMGM23_27_
from JBPM_TASKINSTANCE taskinstan0_, JBPM_VARIABLEINSTANCE stringinst1_, JBPM_PROCESSINSTANCE processins2_, JBPM_VARIABLEINSTANCE variablein3_
where stringinst1_.CLASS_='S'
and taskinstan0_.PROCINST_=processins2_.ID_
and taskinstan0_.ID_=variablein3_.TASKINSTANCE_
and variablein3_.NAME_ = 'NIR'
and taskinstan0_.QUEUE = 'ERT_TPS'
and (processins2_.ORGAPATH_ like '/ERT%')
and taskinstan0_.ISOPEN_= 't'
and variablein3_.ID_=stringinst1_.ID_
order by stringinst1_.STRINGVALUE_ ASC limit '10';
Explain result for ASC :
Limit (cost=1.71..11652.93 rows=10 width=646) (actual time=6.588..82.407 rows=10 loops=1)
-> Nested Loop (cost=1.71..6215929.27 rows=5335 width=646) (actual time=6.587..82.402 rows=10 loops=1)
-> Nested Loop (cost=1.29..6213170.78 rows=5335 width=646) (actual time=6.578..82.363 rows=10 loops=1)
-> Nested Loop (cost=1.00..6159814.66 rows=153812 width=13) (actual time=0.537..82.130 rows=149 loops=1)
-> Index Scan Backward using totoidx10 on jbpm_variableinstance stringinst1_ (cost=0.56..558481.07 rows=11199905 width=13) (actual time=0.018..11.914 rows=40182 loops=1)
Filter: (class_ = 'S'::bpchar)
-> Index Scan using jbpm_variableinstance_pkey on jbpm_variableinstance variablein3_ (cost=0.43..0.49 rows=1 width=16) (actual time=0.002..0.002 rows=0 loops=40182)
Index Cond: (id_ = stringinst1_.id_)
Filter: ((name_)::text = 'NIR'::text)
Rows Removed by Filter: 1
-> Index Scan using jbpm_taskinstance_pkey on jbpm_taskinstance taskinstan0_ (cost=0.29..0.34 rows=1 width=641) (actual time=0.001..0.001 rows=0 loops=149)
Index Cond: (id_ = variablein3_.taskinstance_)
Filter: (isopen_ AND ((queue)::text = 'ERT_TPS'::text))
Rows Removed by Filter: 0
-> Index Only Scan using idx_procin_2 on jbpm_processinstance processins2_ (cost=0.42..0.51 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (id_ = taskinstan0_.procinst_)
Filter: ((orgapath_)::text ~~ '/ERT%'::text)
Heap Fetches: 0
Planning time: 2.598 ms
Execution time: 82.513 ms
Explain result for DESC :
Limit (cost=1.71..11652.93 rows=10 width=646) (actual time=8144.871..8144.986 rows=10 loops=1)
-> Nested Loop (cost=1.71..6215929.27 rows=5335 width=646) (actual time=8144.870..8144.984 rows=10 loops=1)
-> Nested Loop (cost=1.29..6213170.78 rows=5335 width=646) (actual time=8144.858..8144.951 rows=10 loops=1)
-> Nested Loop (cost=1.00..6159814.66 rows=153812 width=13) (actual time=8144.838..8144.910 rows=20 loops=1)
-> Index Scan using totoidx10 on jbpm_variableinstance stringinst1_ (cost=0.56..558481.07 rows=11199905 width=13) (actual time=0.066..2351.727 rows=2619671 loops=1)
Filter: (class_ = 'S'::bpchar)
Rows Removed by Filter: 906237
-> Index Scan using jbpm_variableinstance_pkey on jbpm_variableinstance variablein3_ (cost=0.43..0.49 rows=1 width=16) (actual time=0.002..0.002 rows=0 loops=2619671)
Index Cond: (id_ = stringinst1_.id_)
Filter: ((name_)::text = 'NIR'::text)
Rows Removed by Filter: 1
-> Index Scan using jbpm_taskinstance_pkey on jbpm_taskinstance taskinstan0_ (cost=0.29..0.34 rows=1 width=641) (actual time=0.002..0.002 rows=0 loops=20)
Index Cond: (id_ = variablein3_.taskinstance_)
Filter: (isopen_ AND ((queue)::text = 'ERT_TPS'::text))
-> Index Only Scan using idx_procin_2 on jbpm_processinstance processins2_ (cost=0.42..0.51 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (id_ = taskinstan0_.procinst_)
Filter: ((orgapath_)::text ~~ '/ERT%'::text)
Heap Fetches: 0
Planning time: 2.080 ms
Execution time: 8145.053 ms
Table info:
jbpm_variableinstance 12100592 rows
jbpm_taskinstance 69913 rows
jbpm_processinstance 97546 rows
If you have any ideas, thanks in advance.
This typically only happens when OFFSET and / or LIMIT are involved (as is the case here).
The key difference is this line in the EXPLAIN output for the query with DESC:
Rows Removed by Filter: 906237
Meaning that while the first 10 rows in the index totoidx10 match when scanning backwards (which matches your ASC ordering, obviously), Postgres has to filter ~ 900k rows before it finally finds qualifying rows when scanning the same index forward.
A matching multicolumn index (with the right sort order) might help a lot.
Or, since Postgres chooses an unfavorable query plan, perhaps just updated (or more detailed) table statistics or adjusted cost settings would help.
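A minimal sketch of such an index for the stringinst1_ scan, using the column names from the query above (the index name is made up, and whether it actually wins depends on your data):
-- With class_ leading, the rows for class_ = 'S' are already stored in
-- stringvalue_ order, and a btree can be read in either direction.
CREATE INDEX jbpm_varinst_class_stringvalue_idx
    ON jbpm_variableinstance (class_, stringvalue_);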
Related:
Keep PostgreSQL from sometimes choosing a bad query plan
Optimizing queries on a range of timestamps (two columns)