Postgres hash join batches explosion - postgresql
We are struggling to identify why Postgres uses so many batches to resolve a hash join.
Here is the EXPLAIN ANALYZE output of a problematic execution:
https://explain.dalibo.com/plan/xNJ#plan
Limit (cost=20880.87..20882.91 rows=48 width=205) (actual time=10722.953..10723.358 rows=48 loops=1)
-> Unique (cost=20880.87..21718.12 rows=19700 width=205) (actual time=10722.951..10723.356 rows=48 loops=1)
-> Sort (cost=20880.87..20930.12 rows=19700 width=205) (actual time=10722.950..10722.990 rows=312 loops=1)
Sort Key: titlemetadata_titlemetadata.creation_date DESC, titlemetadata_titlemetadata.id, titlemetadata_titlemetadata.title_type, titlemetadata_titlemetadata.original_title, titlemetadata_titlemetadata.alternative_ids, titlemetadata_titlemetadata.metadata, titlemetadata_titlemetadata.is_adult, titlemetadata_titlemetadata.is_kids, titlemetadata_titlemetadata.last_modified, titlemetadata_titlemetadata.year, titlemetadata_titlemetadata.runtime, titlemetadata_titlemetadata.rating, titlemetadata_titlemetadata.video_provider, titlemetadata_titlemetadata.series_id_id, titlemetadata_titlemetadata.season_number, titlemetadata_titlemetadata.episode_number
Sort Method: quicksort Memory: 872kB
-> Hash Right Join (cost=13378.20..19475.68 rows=19700 width=205) (actual time=1926.352..10709.970 rows=2909 loops=1)
Hash Cond: (t4.titlemetadata_id = t3.id)
Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
Rows Removed by Filter: 63248
-> Seq Scan on video_provider_offer t4 (cost=0.00..5454.90 rows=66290 width=16) (actual time=0.024..57.893 rows=66390 loops=1)
-> Hash (cost=11314.39..11314.39 rows=22996 width=221) (actual time=489.530..489.530 rows=60096 loops=1)
Buckets: 65536 (originally 32768) Batches: 32768 (originally 1) Memory Usage: 11656kB
-> Hash Right Join (cost=5380.95..11314.39 rows=22996 width=221) (actual time=130.024..225.271 rows=60096 loops=1)
Hash Cond: (video_provider_offer.titlemetadata_id = titlemetadata_titlemetadata.id)
-> Seq Scan on video_provider_offer (cost=0.00..5454.90 rows=66290 width=16) (actual time=0.011..32.950 rows=66390 loops=1)
-> Hash (cost=5129.28..5129.28 rows=20133 width=213) (actual time=129.897..129.897 rows=55793 loops=1)
Buckets: 65536 (originally 32768) Batches: 2 (originally 1) Memory Usage: 7877kB
-> Merge Left Join (cost=1.72..5129.28 rows=20133 width=213) (actual time=0.041..93.057 rows=55793 loops=1)
Merge Cond: (titlemetadata_titlemetadata.id = t3.series_id_id)
-> Index Scan using titlemetadata_titlemetadata_pkey on titlemetadata_titlemetadata (cost=1.30..4130.22 rows=20133 width=205) (actual time=0.028..62.949 rows=43921 loops=1)
Filter: ((NOT is_adult) AND (NOT (hashed SubPlan 3)) AND (((title_type)::text = 'MOV'::text) OR ((title_type)::text = 'TVS'::text) OR ((title_type)::text = 'TVP'::text) OR ((title_type)::text = 'EVT'::text)))
Rows Removed by Filter: 14121
SubPlan 3
-> Seq Scan on cable_operator_cableoperatorexcludedtitle u0_2 (cost=0.00..1.01 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=1)
Filter: (cable_operator_id = 54)
-> Index Scan using titlemetadata_titlemetadata_series_id_id_73453db4_uniq on titlemetadata_titlemetadata t3 (cost=0.41..3901.36 rows=58037 width=16) (actual time=0.011..9.375 rows=12887 loops=1)
SubPlan 1
-> Hash Join (cost=44.62..885.73 rows=981 width=8) (actual time=0.486..36.806 rows=5757 loops=1)
Hash Cond: (w2.device_id = w3.id)
-> Nested Loop (cost=43.49..866.20 rows=2289 width=16) (actual time=0.441..33.096 rows=20180 loops=1)
-> Nested Loop (cost=43.06..414.98 rows=521 width=8) (actual time=0.426..9.952 rows=2909 loops=1)
Join Filter: (w1.id = w0.video_provider_id)
-> Nested Loop (cost=42.65..54.77 rows=13 width=24) (actual time=0.399..0.532 rows=15 loops=1)
-> HashAggregate (cost=42.50..42.95 rows=45 width=16) (actual time=0.390..0.403 rows=45 loops=1)
Group Key: v0.id
-> Nested Loop (cost=13.34..42.39 rows=45 width=16) (actual time=0.095..0.364 rows=45 loops=1)
-> Hash Semi Join (cost=13.19..32.72 rows=45 width=8) (actual time=0.084..0.229 rows=45 loops=1)
Hash Cond: (v1.id = u0.id)
-> Seq Scan on cable_operator_cableoperatorprovider v1 (cost=0.00..17.36 rows=636 width=16) (actual time=0.010..0.077 rows=636 loops=1)
-> Hash (cost=12.63..12.63 rows=45 width=8) (actual time=0.046..0.046 rows=45 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Index Scan using cable_operator_cableoperatorprovider_4d6e54b3 on cable_operator_cableoperatorprovider u0 (cost=0.28..12.63 rows=45 width=8) (actual time=0.016..0.035 rows=45 loops=1)
Index Cond: (cable_operator_id = 54)
-> Index Only Scan using video_provider_videoprovider_pkey on video_provider_videoprovider v0 (cost=0.15..0.20 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=45)
Index Cond: (id = v1.provider_id)
Heap Fetches: 45
-> Index Scan using video_provider_videoprovider_pkey on video_provider_videoprovider w1 (cost=0.15..0.25 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=45)
Index Cond: (id = v0.id)
Filter: ((video_provider_type)::text = 'VOD'::text)
Rows Removed by Filter: 1
-> Index Scan using video_provider_offer_da942d2e on video_provider_offer w0 (cost=0.42..27.22 rows=39 width=16) (actual time=0.026..0.585 rows=194 loops=15)
Index Cond: (video_provider_id = v0.id)
Filter: (((end_date > '2021-09-02 19:23:00-03'::timestamp with time zone) OR (end_date IS NULL)) AND (access_criteria && '{vtv_mas,TBX_LOGIN,urn:spkg:tve:fox-premium,urn:tve:mcp,AMCHD,AMC_CONSORCIO,ANIMAL_PLANET,ASUNTOS_PUBLICOS,ASUNTOS_PUBLICOS_CONSORCIO,CINECANALLIVE,CINECANAL_CONSORCIO,DISCOVERY,DISCOVERY_KIDS_CONSORCIO,DISCOVERY_KIDS_OD,DISNEY,DISNEY_CH_CONSORCIO,DISNEY_XD,DISNEY_XD_CONSORCIO,EL_CANAL_HD,EL_CANAL_HD_CONSORCIO,EL_GOURMET_CONSORCIO,ESPN,ESPN2_HD_CONSORCIO,ESPN3_HD_CONSORCIO,ESPNMAS_HD_CONSORCIO,ESPN_BASIC,ESPN_HD_CONSORCIO,ESPN_PLAY,EUROPALIVE,EUROPA_EUROPA,EUROPA_EUROPA_CONSORCIO,FILMANDARTS_DISPOSITIVOS,FILMS_ARTS,FILM_AND_ARTS_CONSORCIO,FOXLIFE,FOX_LIFE_CONSORCIO,FOX_SPORTS_1_DISPOSITIVOS,FOX_SPORTS_2_DISPOSITIVOS,FOX_SPORTS_2_HD_CONSORCIO,FOX_SPORTS_3_DISPOSITIVOS,FOX_SPORTS_3_HD_CONSORCIO,FOX_SPORTS_HD_CONSORCIO,FRANCE24_DISPOSITIVOS,FRANCE_24_CONSORCIO,GOURMET,GOURMET_DISPOSITIVOS,HOME_HEALTH,INVESTIGATION_DISCOVERY,MAS_CHIC,NATGEOKIDS_DISPOSITIVOS,NATGEO_CONSORCIO,NATGEO_DISPOSITIVOS,NATGEO_KIDS_CONSORCIO,PASIONES,PASIONES_CONSORCIO,SVOD_TYC_BASIC,TBX_LOGIN,TCC_2_CONSORCIO,TCC_2_HD,TLC,TVE,TVE_CONSORCIO,TYC_SPORTS_CONSORCIO,VTV_LIVE,clarosports,discoverykids,espnplay_south_alt,urn:spkg:tve:fox-basic,urn:tve:babytv,urn:tve:cinecanal,urn:tve:discoverykids,urn:tve:foxlife,urn:tve:fp,urn:tve:fx,urn:tve:natgeo,urn:tve:natgeokids,urn:tve:natgeowild,urn:tve:thefilmzone}'::character varying(50)[]) AND ((((content_type)::text = 'VOD'::text) AND ((start_date < '2021-09-02 19:23:00-03'::timestamp with time zone) OR (start_date IS NULL))) OR ((content_type)::text = 'LIV'::text)))
Rows Removed by Filter: 5
-> Index Only Scan using video_provider_offer_devices_offer_id_device_id_key on video_provider_offer_devices w2 (cost=0.42..0.81 rows=6 width=16) (actual time=0.004..0.007 rows=7 loops=2909)
Index Cond: (offer_id = w0.id)
Heap Fetches: 17828
-> Hash (cost=1.10..1.10 rows=3 width=8) (actual time=0.029..0.029 rows=2 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on platform_device_device w3 (cost=0.00..1.10 rows=3 width=8) (actual time=0.024..0.027 rows=2 loops=1)
Filter: ((device_code)::text = ANY ('{ANDROID,ott_dual_tcc,ott_k2_tcc}'::text[]))
Rows Removed by Filter: 5
SubPlan 2
-> Hash Join (cost=44.62..885.73 rows=981 width=8) (actual time=0.410..33.580 rows=5757 loops=1)
Hash Cond: (w2_1.device_id = w3_1.id)
-> Nested Loop (cost=43.49..866.20 rows=2289 width=16) (actual time=0.375..29.886 rows=20180 loops=1)
-> Nested Loop (cost=43.06..414.98 rows=521 width=8) (actual time=0.366..9.134 rows=2909 loops=1)
Join Filter: (w1_1.id = w0_1.video_provider_id)
-> Nested Loop (cost=42.65..54.77 rows=13 width=24) (actual time=0.343..0.476 rows=15 loops=1)
-> HashAggregate (cost=42.50..42.95 rows=45 width=16) (actual time=0.333..0.347 rows=45 loops=1)
Group Key: v0_1.id
-> Nested Loop (cost=13.34..42.39 rows=45 width=16) (actual time=0.083..0.311 rows=45 loops=1)
-> Hash Semi Join (cost=13.19..32.72 rows=45 width=8) (actual time=0.076..0.202 rows=45 loops=1)
Hash Cond: (v1_1.id = u0_1.id)
-> Seq Scan on cable_operator_cableoperatorprovider v1_1 (cost=0.00..17.36 rows=636 width=16) (actual time=0.005..0.057 rows=636 loops=1)
-> Hash (cost=12.63..12.63 rows=45 width=8) (actual time=0.038..0.038 rows=45 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Index Scan using cable_operator_cableoperatorprovider_4d6e54b3 on cable_operator_cableoperatorprovider u0_1 (cost=0.28..12.63 rows=45 width=8) (actual time=0.007..0.020 rows=45 loops=1)
Index Cond: (cable_operator_id = 54)
-> Index Only Scan using video_provider_videoprovider_pkey on video_provider_videoprovider v0_1 (cost=0.15..0.20 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=45)
Index Cond: (id = v1_1.provider_id)
Heap Fetches: 45
-> Index Scan using video_provider_videoprovider_pkey on video_provider_videoprovider w1_1 (cost=0.15..0.25 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=45)
Index Cond: (id = v0_1.id)
Filter: ((video_provider_type)::text = 'VOD'::text)
Rows Removed by Filter: 1
-> Index Scan using video_provider_offer_da942d2e on video_provider_offer w0_1 (cost=0.42..27.22 rows=39 width=16) (actual time=0.022..0.536 rows=194 loops=15)
Index Cond: (video_provider_id = v0_1.id)
Filter: (((end_date > '2021-09-02 19:23:00-03'::timestamp with time zone) OR (end_date IS NULL)) AND (access_criteria && '{vtv_mas,TBX_LOGIN,urn:spkg:tve:fox-premium,urn:tve:mcp,AMCHD,AMC_CONSORCIO,ANIMAL_PLANET,ASUNTOS_PUBLICOS,ASUNTOS_PUBLICOS_CONSORCIO,CINECANALLIVE,CINECANAL_CONSORCIO,DISCOVERY,DISCOVERY_KIDS_CONSORCIO,DISCOVERY_KIDS_OD,DISNEY,DISNEY_CH_CONSORCIO,DISNEY_XD,DISNEY_XD_CONSORCIO,EL_CANAL_HD,EL_CANAL_HD_CONSORCIO,EL_GOURMET_CONSORCIO,ESPN,ESPN2_HD_CONSORCIO,ESPN3_HD_CONSORCIO,ESPNMAS_HD_CONSORCIO,ESPN_BASIC,ESPN_HD_CONSORCIO,ESPN_PLAY,EUROPALIVE,EUROPA_EUROPA,EUROPA_EUROPA_CONSORCIO,FILMANDARTS_DISPOSITIVOS,FILMS_ARTS,FILM_AND_ARTS_CONSORCIO,FOXLIFE,FOX_LIFE_CONSORCIO,FOX_SPORTS_1_DISPOSITIVOS,FOX_SPORTS_2_DISPOSITIVOS,FOX_SPORTS_2_HD_CONSORCIO,FOX_SPORTS_3_DISPOSITIVOS,FOX_SPORTS_3_HD_CONSORCIO,FOX_SPORTS_HD_CONSORCIO,FRANCE24_DISPOSITIVOS,FRANCE_24_CONSORCIO,GOURMET,GOURMET_DISPOSITIVOS,HOME_HEALTH,INVESTIGATION_DISCOVERY,MAS_CHIC,NATGEOKIDS_DISPOSITIVOS,NATGEO_CONSORCIO,NATGEO_DISPOSITIVOS,NATGEO_KIDS_CONSORCIO,PASIONES,PASIONES_CONSORCIO,SVOD_TYC_BASIC,TBX_LOGIN,TCC_2_CONSORCIO,TCC_2_HD,TLC,TVE,TVE_CONSORCIO,TYC_SPORTS_CONSORCIO,VTV_LIVE,clarosports,discoverykids,espnplay_south_alt,urn:spkg:tve:fox-basic,urn:tve:babytv,urn:tve:cinecanal,urn:tve:discoverykids,urn:tve:foxlife,urn:tve:fp,urn:tve:fx,urn:tve:natgeo,urn:tve:natgeokids,urn:tve:natgeowild,urn:tve:thefilmzone}'::character varying(50)[]) AND ((((content_type)::text = 'VOD'::text) AND ((start_date < '2021-09-02 19:23:00-03'::timestamp with time zone) OR (start_date IS NULL))) OR ((content_type)::text = 'LIV'::text)))
Rows Removed by Filter: 5
-> Index Only Scan using video_provider_offer_devices_offer_id_device_id_key on video_provider_offer_devices w2_1 (cost=0.42..0.81 rows=6 width=16) (actual time=0.003..0.006 rows=7 loops=2909)
Index Cond: (offer_id = w0_1.id)
Heap Fetches: 17828
-> Hash (cost=1.10..1.10 rows=3 width=8) (actual time=0.015..0.015 rows=2 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on platform_device_device w3_1 (cost=0.00..1.10 rows=3 width=8) (actual time=0.010..0.011 rows=2 loops=1)
Filter: ((device_code)::text = ANY ('{ANDROID,ott_dual_tcc,ott_k2_tcc}'::text[]))
Rows Removed by Filter: 5
Planning time: 8.255 ms
Execution time: 10723.830 ms
(100 rows)
The weird part is that the same query sometimes uses just a single batch. Here is an example: https://explain.dalibo.com/plan/zTv#plan
Here is the work_mem being used:
show work_mem;
work_mem
----------
8388kB
(1 row)
I'm not interested in changing the query to make it faster, but in understanding why the behavior differs.
I've found this thread, which seems related, but I don't quite understand what it is discussing: https://www.postgresql.org/message-id/flat/CA%2BhUKGKWWmf%3DWELLG%3DaUGbcugRaSQbtm0tKYiBut-B2rVKX63g%40mail.gmail.com
Can anyone tell me why the behavior differs? The underlying data is the same in both cases.
If the hash is done in memory, there will only be a single batch.
When the batch count differs from the original number (for example "Batches: 32768 (originally 1)"), Postgres chose at run time to increase the number of batches in order to reduce memory consumption.
You might find this EXPLAIN glossary useful (disclaimer: I'm one of the authors). Its page on Hash Batches also links to the PostgreSQL source code, which is very nicely documented in plain English.
While not a perfect heuristic, you can see that the memory required for the operations with multiple batches is around or above your work_mem setting (8388kB): 7877kB for the hash with 2 batches and 11656kB for the one with 32768. The figure can be lower than work_mem because operations that spill to disk generally require less memory overall.
I'm not 100% sure why, in your exact case, one was chosen over the other, but there do appear to be some slight row estimate differences, which might be a good place to start.
As of PostgreSQL 13 there is also now a hash_mem_multiplier setting that can be used to give more memory to hashes without doing so for other operations (like sorts).
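A minimal sketch of trying that setting for one session, assuming PostgreSQL 13 or later (the plan in the question may come from an older release, where the setting does not exist). The value 4.0 is an arbitrary illustration, not a recommendation, and the commented-out EXPLAIN stands for the statement from the question:

```sql
-- Assumes PostgreSQL 13+; 4.0 is an arbitrary example value.
SHOW work_mem;                  -- reported as 8388kB in the question
SET hash_mem_multiplier = 4.0;  -- hash tables may use up to work_mem * 4.0
-- Re-run the statement in this session and compare the "Batches:" lines:
-- EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;  -- the statement from the question
RESET hash_mem_multiplier;      -- back to the configured default
```

SET only affects the current session, so this is a low-risk way to check whether extra hash memory brings the join back to a single batch.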
We were able to solve the problem just by running VACUUM FULL ANALYZE;.
After that, everything started to work as expected (https://explain.depesz.com/s/eoqH#html)
Side note: we were not aware that we should do this on a daily basis.
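The post does not say which part of VACUUM FULL ANALYZE made the difference. If stale planner statistics were the cause (an assumption, although several row estimates in the slow plan are far from the actual counts), a plain ANALYZE may be enough, and it avoids the table rewrite and exclusive lock that VACUUM FULL implies. A sketch using two table names taken from the plan:

```sql
-- Refresh planner statistics without rewriting the tables.
ANALYZE titlemetadata_titlemetadata;
ANALYZE video_provider_offer;

-- Check when (auto)vacuum and (auto)analyze last ran on those tables.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('titlemetadata_titlemetadata', 'video_provider_offer');
```

If last_autoanalyze is NULL or very old, it may be worth checking that autovacuum is enabled before scheduling manual runs.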
Related
Bitmap heap scan slow with same condition as index scan
I have a query with joins to rather large tables, but do not understand the slow performance of it. Especially this part of the query plan seems weird to me (complete plan and query below):
-> Bitmap Heap Scan on order_line (cost=65.45..11521.37 rows=3228 width=20) (actual time=22.555..7764.120 rows=6250 loops=12)
Recheck Cond: (product_id = catalogue_product.id)
Heap Blocks: exact=71735
Buffers: shared hit=55299 read=16686
-> Bitmap Index Scan on order_line_product_id_e620902d (cost=0.00..64.65 rows=3228 width=0) (actual time=21.532..21.532 rows=6269 loops=12)
Index Cond: (product_id = catalogue_product.id)
Buffers: shared hit=143 read=107
Why does it need to recheck product_id = catalogue_product.id, which is the same as in the index, and then take so much time? As far as I understand, a recheck is needed if a) only part of the condition can be covered by the index or b) the bitmap is too big and must be compressed - but then there should be a lossy=x entry, right?
Complete query:
SELECT ("order_order"."date_placed" AT TIME ZONE 'UTC')::date, "partner_partner"."odoo_id", "catalogue_product"."odoo_id", SUM("order_line"."quantity") AS "orders"
FROM "order_line"
INNER JOIN "order_order" ON ("order_line"."order_id" = "order_order"."id")
INNER JOIN "catalogue_product" ON ("order_line"."product_id" = "catalogue_product"."id")
INNER JOIN "partner_stockrecord" ON ("order_line"."stockrecord_id" = "partner_stockrecord"."id")
INNER JOIN "partner_partner" ON ("partner_stockrecord"."partner_id" = "partner_partner"."id")
WHERE (("order_order"."date_placed" AT TIME ZONE 'UTC')::date IN ('2022-11-22'::DATE) AND "catalogue_product"."odoo_id" IN (6241, 6499, 6500, 49195, 44753, 44754, 53427, 6452, 44755, 44787, 6427, 6428) AND "partner_partner"."odoo_id" IS NOT NULL AND NOT ("order_order"."status" IN ('Pending', 'PaymentDeclined', 'Canceled')))
GROUP BY ("order_order"."date_placed" AT TIME ZONE 'UTC')::date, "partner_partner"."odoo_id", "catalogue_product"."odoo_id", "order_line"."id"
ORDER BY "order_line"."id" ASC
Complete plan:
GroupAggregate (cost=141002.93..141003.41 rows=16 width=24) (actual time=93629.346..93629.369 rows=52 loops=1)
Group Key: order_line.id, ((timezone('UTC'::text, order_order.date_placed))::date), partner_partner.odoo_id, catalogue_product.odoo_id
Buffers: shared hit=56537 read=16693
-> Sort (cost=141002.93..141002.97 rows=16 width=20) (actual time=93629.331..93629.335 rows=52 loops=1)
Sort Key: order_line.id, partner_partner.odoo_id, catalogue_product.odoo_id
Sort Method: quicksort Memory: 29kB
Buffers: shared hit=56537 read=16693
-> Hash Join (cost=2319.22..141002.61 rows=16 width=20) (actual time=859.917..93629.204 rows=52 loops=1)
Hash Cond: (partner_stockrecord.partner_id = partner_partner.id)
Buffers: shared hit=56537 read=16693
-> Nested Loop (cost=2318.11..141001.34 rows=16 width=24) (actual time=859.853..93628.903 rows=52 loops=1)
Buffers: shared hit=56536 read=16693
-> Hash Join (cost=2317.69..140994.41 rows=16 width=24) (actual time=859.824..93627.791 rows=52 loops=1)
Hash Cond: (order_line.order_id = order_order.id)
Buffers: shared hit=56328 read=16693
-> Nested Loop (cost=108.94..138731.32 rows=20700 width=20) (actual time=1.566..93206.434 rows=74999 loops=1)
Buffers: shared hit=55334 read=16686
-> Bitmap Heap Scan on catalogue_product (cost=43.48..87.52 rows=12 width=8) (actual time=0.080..0.183 rows=12 loops=1)
Recheck Cond: (odoo_id = ANY ('{6241,6499,6500,49195,44753,44754,53427,6452,44755,44787,6427,6428}'::integer[]))
Heap Blocks: exact=11
Buffers: shared hit=35
-> Bitmap Index Scan on catalogue_product_odoo_id_c5e41bad (cost=0.00..43.48 rows=12 width=0) (actual time=0.072..0.072 rows=12 loops=1)
Index Cond: (odoo_id = ANY ('{6241,6499,6500,49195,44753,44754,53427,6452,44755,44787,6427,6428}'::integer[]))
Buffers: shared hit=24
-> Bitmap Heap Scan on order_line (cost=65.45..11521.37 rows=3228 width=20) (actual time=22.555..7764.120 rows=6250 loops=12)
Recheck Cond: (product_id = catalogue_product.id)
Heap Blocks: exact=71735
Buffers: shared hit=55299 read=16686
-> Bitmap Index Scan on order_line_product_id_e620902d (cost=0.00..64.65 rows=3228 width=0) (actual time=21.532..21.532 rows=6269 loops=12)
Index Cond: (product_id = catalogue_product.id)
Buffers: shared hit=143 read=107
-> Hash (cost=2194.42..2194.42 rows=1147 width=12) (actual time=365.766..365.766 rows=1313 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 73kB
Buffers: shared hit=994 read=7
-> Index Scan using order_date_placed_utc_date_idx on order_order (cost=0.43..2194.42 rows=1147 width=12) (actual time=0.050..365.158 rows=1313 loops=1)
Index Cond: ((timezone('UTC'::text, date_placed))::date = '2022-11-22'::date)
Filter: ((status)::text <> ALL ('{Pending,PaymentDeclined,Canceled}'::text[]))
Rows Removed by Filter: 253
Buffers: shared hit=994 read=7
-> Index Scan using partner_stockrecord_pkey on partner_stockrecord (cost=0.41..0.43 rows=1 width=8) (actual time=0.017..0.017 rows=1 loops=52)
Index Cond: (id = order_line.stockrecord_id)
Buffers: shared hit=208
-> Hash (cost=1.05..1.05 rows=5 width=8) (actual time=0.028..0.028 rows=5 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on partner_partner (cost=0.00..1.05 rows=5 width=8) (actual time=0.013..0.015 rows=5 loops=1)
Filter: (odoo_id IS NOT NULL)
Buffers: shared hit=1
Planning time: 3.275 ms
Execution time: 93629.781 ms
It doesn't have to do any rechecks. That line in the plan comes from the planner, not from the run-time part. (You can tell because if you just do EXPLAIN without ANALYZE, the line still appears.) At planning time, it doesn't know whether any of the bitmap will overflow, so it has to be prepared to do the recheck, even if that turns out not to be necessary at run time. The slowness almost certainly comes from the time spent reading 16686 random pages, which could be made clear by turning on track_io_timing.
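A sketch of how that check could look for one session. It assumes a role allowed to change track_io_timing (by default a superuser), and the commented-out EXPLAIN stands for the statement from the question:

```sql
-- Assumes sufficient privileges to change track_io_timing.
SET track_io_timing = on;
-- Re-run the statement with buffer statistics enabled:
-- EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;  -- the statement from the question
-- Nodes that read from disk should then carry an "I/O Timings:" line.
RESET track_io_timing;
```

Comparing the reported read time on the Bitmap Heap Scan with its total actual time would show how much of the runtime is spent waiting on disk.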
Query is slow with indexes - how to understand execution plans?
I need help understanding why my query is slower with an index than without one. I ran EXPLAIN ANALYZE, and below are the execution plans: option 1 with the index, and option 2 without it. Can someone explain why the index makes performance worse in these plans? PS. When I add 10 million rows to the table (original size 2M), the situation turns in favor of the index, and in that case the query with the index is 3x faster.
OPTION 1 WITH INDEX FOR LEFT JOIN invoice_id+acct_level ON TABLE cost_invoice_facepage AND CONDITION (cdb.invoice_id = invoice_id) AND (acct_level = 1)
Append (cost=48.87..38583.97 rows=163773 width=371) (actual time=1.269..1516.564 rows=379129 loops=1)
-> Nested Loop (cost=48.87..10520.11 rows=36504 width=362) (actual time=1.268..5.986 rows=579 loops=1)
-> Hash Left Join (cost=44.66..9918.22 rows=507 width=322) (actual time=1.160..5.497 rows=579 loops=1)
Hash Cond: (cd.gl_string_id = gs.id)
-> Nested Loop Left Join (cost=0.85..9873.07 rows=507 width=262) (actual time=0.485..4.473 rows=579 loops=1)
Filter: ((c.gl_rule_type IS NULL) OR ((cd.charge_id IS NOT NULL) AND (c.gl_rule_type_id <> ALL ('{60,70}'::integer[]))))
-> Index Scan using cost_invoice_charge_invoice_id_idx on cost_invoice_charge c (cost=0.43..1204.53 rows=1188 width=243) (actual time=0.467..2.664 rows=579 loops=1)
Index Cond: (invoice_id = 14517)
Filter: ((chg_amt <> '0'::numeric) AND ((gl_rule_type IS NULL) OR (gl_rule_type_id <> ALL ('{60,70}'::integer[]))))
Rows Removed by Filter: 3364
-> Index Scan using "gl_charge_detail.charge_id->cost_invoice_info_only.id" on gl_charge_detail cd (cost=0.42..7.28 rows=1 width=27) (actual time=0.002..0.002 rows=1 loops=579)
Index Cond: (c.id = charge_id)
-> Hash (cost=31.69..31.69 rows=969 width=64) (actual time=0.657..0.657 rows=969 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 103kB
-> Seq Scan on gl_strings gs (cost=0.00..31.69 rows=969 width=64) (actual time=0.026..0.389 rows=969 loops=1)
-> Materialize (cost=4.22..145.78 rows=72 width=44) (actual time=0.000..0.000 rows=1 loops=579)
-> Hash Left Join (cost=4.22..145.42 rows=72 width=44) (actual time=0.100..0.102 rows=1 loops=1)
Hash Cond: (f.vendor_id = vn.id)
-> Nested Loop (cost=0.57..141.57 rows=72 width=31) (actual time=0.027..0.029 rows=1 loops=1)
-> Index Scan using cost_invoice_header_id_idx on cost_invoice_header ch (cost=0.29..8.31 rows=1 width=4) (actual time=0.012..0.013 rows=1 loops=1)
Index Cond: (id = 14517)
Filter: (status_code <> ALL ('{100,101,102,490}'::integer[]))
-> Index Scan using "invoice_id+acct_level" on cost_invoice_facepage f (cost=0.29..132.55 rows=72 width=31) (actual time=0.013..0.013 rows=1 loops=1)
Index Cond: ((invoice_id = 14517) AND (acct_level = 1))
-> Hash (cost=2.73..2.73 rows=73 width=17) (actual time=0.061..0.061 rows=73 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on appdata_vendor_common vn (cost=0.00..2.73 rows=73 width=17) (actual time=0.020..0.038 rows=73 loops=1)
-> Hash Left Join (cost=2276.48..25607.26 rows=127269 width=374) (actual time=204.117..1486.717 rows=378550 loops=1)
Hash Cond: (f_1.vendor_id = vn_1.id)
-> Nested Loop Left Join (cost=2272.84..25250.14 rows=127269 width=361) (actual time=204.072..1328.491 rows=378550 loops=1)
-> Gather (cost=2272.55..24385.32 rows=2663 width=338) (actual time=204.055..335.965 rows=378550 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Left Join (cost=1272.55..23119.02 rows=1110 width=338) (actual time=127.365..321.126 rows=126183 loops=3)
Hash Cond: (cdb.gl_string_id = gs_1.id)
-> Hash Join (cost=1228.74..23072.30 rows=1110 width=278) (actual time=126.126..263.315 rows=126183 loops=3)
Hash Cond: (cdb.charge_id = c_1.id)
-> Parallel Seq Scan on gl_charge_detail_ban cdb (cost=0.00..20581.15 rows=480915 width=43) (actual time=0.270..109.543 rows=384732 loops=3)
-> Hash (cost=1194.13..1194.13 rows=2769 width=239) (actual time=7.232..7.232 rows=3929 loops=3)
Buckets: 4096 Batches: 1 Memory Usage: 635kB
-> Index Scan using cost_invoice_charge_invoice_id_idx on cost_invoice_charge c_1 (cost=0.43..1194.13 rows=2769 width=239) (actual time=0.070..4.686 rows=3929 loops=3)
Index Cond: (invoice_id = 14517)
Filter: (chg_amt <> '0'::numeric)
Rows Removed by Filter: 14
-> Hash (cost=31.69..31.69 rows=969 width=64) (actual time=1.127..1.127 rows=969 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 103kB
-> Seq Scan on gl_strings gs_1 (cost=0.00..31.69 rows=969 width=64) (actual time=0.165..0.714 rows=969 loops=3)
-> Index Scan using "invoice_id+acct_level" on cost_invoice_facepage f_1 (cost=0.29..0.31 rows=1 width=31) (actual time=0.001..0.002 rows=1 loops=378550)
Index Cond: ((cdb.invoice_id = invoice_id) AND (acct_level = 1))
-> Hash (cost=2.73..2.73 rows=73 width=17) (actual time=0.035..0.035 rows=73 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on appdata_vendor_common vn_1 (cost=0.00..2.73 rows=73 width=17) (actual time=0.014..0.021 rows=73 loops=1)
Planning Time: 3.636 ms
Execution Time: 1550.844 ms
OPTION 2 WITHOUT INDEXES
Append (cost=48.58..43257.20 rows=163773 width=371) (actual time=7.965..831.408 rows=379129 loops=1)
-> Nested Loop (cost=48.58..12251.68 rows=36504 width=362) (actual time=7.965..14.476 rows=579 loops=1)
-> Hash Left Join (cost=44.66..9918.22 rows=507 width=322) (actual time=0.588..6.245 rows=579 loops=1)
Hash Cond: (cd.gl_string_id = gs.id)
-> Nested Loop Left Join (cost=0.85..9873.07 rows=507 width=262) (actual time=0.245..5.442 rows=579 loops=1)
Filter: ((c.gl_rule_type IS NULL) OR ((cd.charge_id IS NOT NULL) AND (c.gl_rule_type_id <> ALL ('{60,70}'::integer[]))))
-> Index Scan using cost_invoice_charge_invoice_id_idx on cost_invoice_charge c (cost=0.43..1204.53 rows=1188 width=243) (actual time=0.231..3.003 rows=579 loops=1)
Index Cond: (invoice_id = 14517)
Filter: ((chg_amt <> '0'::numeric) AND ((gl_rule_type IS NULL) OR (gl_rule_type_id <> ALL ('{60,70}'::integer[]))))
Rows Removed by Filter: 3364
-> Index Scan using "gl_charge_detail.charge_id->cost_invoice_info_only.id" on gl_charge_detail cd (cost=0.42..7.28 rows=1 width=27) (actual time=0.003..0.003 rows=1 loops=579)
Index Cond: (c.id = charge_id)
-> Hash (cost=31.69..31.69 rows=969 width=64) (actual time=0.331..0.331 rows=969 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 103kB
-> Seq Scan on gl_strings gs (cost=0.00..31.69 rows=969 width=64) (actual time=0.017..0.183 rows=969 loops=1)
-> Materialize (cost=3.93..1877.35 rows=72 width=44) (actual time=0.013..0.013 rows=1 loops=579)
-> Hash Left Join (cost=3.93..1876.99 rows=72 width=44) (actual time=7.370..7.698 rows=1 loops=1)
Hash Cond: (f.vendor_id = vn.id)
-> Nested Loop (cost=0.29..1873.14 rows=72 width=31) (actual time=7.307..7.635 rows=1 loops=1)
-> Index Scan using cost_invoice_header_id_idx on cost_invoice_header ch (cost=0.29..8.31 rows=1 width=4) (actual time=0.011..0.013 rows=1 loops=1)
Index Cond: (id = 14517)
Filter: (status_code <> ALL ('{100,101,102,490}'::integer[]))
-> Seq Scan on cost_invoice_facepage f (cost=0.00..1864.12 rows=72 width=31) (actual time=7.293..7.619 rows=1 loops=1)
Filter: ((invoice_id = 14517) AND (acct_level = 1))
Rows Removed by Filter: 40340
-> Hash (cost=2.73..2.73 rows=73 width=17) (actual time=0.045..0.045 rows=73 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on appdata_vendor_common vn (cost=0.00..2.73 rows=73 width=17) (actual time=0.022..0.028 rows=73 loops=1)
-> Hash Left Join (cost=4248.29..28548.92 rows=127269 width=374) (actual time=234.692..789.334 rows=378550 loops=1)
Hash Cond: (cdb.invoice_id = f_1.invoice_id)
-> Gather (cost=2272.55..24385.32 rows=2663 width=338) (actual time=216.507..376.349 rows=378550 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Left Join (cost=1272.55..23119.02 rows=1110 width=338) (actual time=128.932..389.669 rows=126183 loops=3)
Hash Cond: (cdb.gl_string_id = gs_1.id)
-> Hash Join (cost=1228.74..23072.30 rows=1110 width=278) (actual time=127.984..308.092 rows=126183 loops=3)
Hash Cond: (cdb.charge_id = c_1.id)
-> Parallel Seq Scan on gl_charge_detail_ban cdb (cost=0.00..20581.15 rows=480915 width=43) (actual time=0.163..117.001 rows=384732 loops=3)
-> Hash (cost=1194.13..1194.13 rows=2769 width=239) (actual time=8.779..8.779 rows=3929 loops=3)
Buckets: 4096 Batches: 1 Memory Usage: 635kB
-> Index Scan using cost_invoice_charge_invoice_id_idx on cost_invoice_charge c_1 (cost=0.43..1194.13 rows=2769 width=239) (actual time=0.050..5.563 rows=3929 loops=3)
Index Cond: (invoice_id = 14517)
Filter: (chg_amt <> '0'::numeric)
Rows Removed by Filter: 14
-> Hash (cost=31.69..31.69 rows=969 width=64) (actual time=0.829..0.829 rows=969 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 103kB
-> Seq Scan on gl_strings gs_1 (cost=0.00..31.69 rows=969 width=64) (actual time=0.184..0.534 rows=969 loops=3)
-> Hash (cost=1804.87..1804.87 rows=13670 width=44) (actual time=18.101..18.101 rows=13705 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 1198kB
-> Hash Left Join (cost=3.64..1804.87 rows=13670 width=44) (actual time=0.075..14.009 rows=13705 loops=1)
Hash Cond: (f_1.vendor_id = vn_1.id)
-> Seq Scan on cost_invoice_facepage f_1 (cost=0.00..1763.26 rows=13670 width=31) (actual time=0.017..6.216 rows=13705 loops=1)
Filter: (acct_level = 1)
Rows Removed by Filter: 26636
-> Hash (cost=2.73..2.73 rows=73 width=17) (actual time=0.052..0.052 rows=73 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on appdata_vendor_common vn_1 (cost=0.00..2.73 rows=73 width=17) (actual time=0.013..0.027 rows=73 loops=1)
Planning Time: 3.365 ms
Execution Time: 863.941 ms
Look at the line that is driving the iteration over the index scan:
Gather (cost=2272.55..24385.32 rows=2663 width=338) (actual time=204.055..335.965 rows=378550 loops=1)
It thinks the index scan will get iterated 2663 times (with a different value of invoice_id for each one), but it really gets iterated 378550 times (this latter number is where the 'loops' field on the index scan comes from), a difference of roughly 140-fold. Every time you hit the index, you need to re-descend from the root to the leaf, locking and unlocking pages as you go. While this is not terribly expensive, it does add up if you do it 378550 times. It becomes faster to process the table in bulk into a private hash table. But since the estimated row count is so wrong, PostgreSQL doesn't realize that in this case.
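One way to test this reasoning, sketched here as an assumption rather than a recommended fix, is to discourage nested loops for a single session and compare the resulting plan and timing with the index still in place. enable_nestloop is a standard planner setting; the commented-out EXPLAIN stands for the statement from the question:

```sql
-- Diagnostic only: discourages (does not forbid) nested-loop joins.
SET enable_nestloop = off;
-- Re-run the statement and compare against OPTION 1:
-- EXPLAIN (ANALYZE) SELECT ... ;  -- the statement from the question
RESET enable_nestloop;
```

If the planner then picks the hash join from OPTION 2 and the runtime drops, that supports the misestimate explanation; the setting is meant for investigation, not for permanent use.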
Recursive query slow on strange conditions
The following query is part of a much bigger one that runs perfectly fast on a filled DB, but on a nearly empty one it is very slow. In this simplified form, it takes ~400ms to execute, but if you remove either line (1) or lines (2) and (3) then it takes ~35ms. Why? And how do I make it work normally?
Some background about the DB:
DB is VACUUMed and ANALYZEd
ctract is empty
contrats contains only 2 rows, none of which has an idtypecontrat IN (4,5), so tmpctr1 is empty
copyrightad contains 280 rows, only one matches the filters idoeu=13 and role IN ('E','CE')
in all cases, the query returns ONE row (the one returned by the first part of the recursive CTE)
line (1) is absolutely not used in this version, but removing it hides the problem for some reason
WITH RECURSIVE tmpctr1 AS (
SELECT ced.idad AS cedant, ced.idclient
FROM contrats c
JOIN CtrAct ced ON c.idcontrat=ced.idcontrat AND ced.isassignor
JOIN CtrAct ces ON c.idcontrat=ces.idcontrat AND NOT COALESCE(ces.isassignor,FALSE) --(1)
WHERE idtypecontrat IN (4,5)
)
,rec1 AS (
SELECT ca.idoeu,ca.idad AS chn,1 AS idclient, 1 AS level
FROM copyrightad ca
WHERE ca.role IN ('E','CE') AND ca.idoeu = 13
UNION
SELECT r.idoeu,0, 0, r.level+1
FROM rec1 r
LEFT JOIN tmpctr1 c ON r.chn=c.cedant
LEFT JOIN tmpctr1 c2 ON r.idclient=c2.idclient -- (2)
WHERE r.level<20
AND (c.cedant is not null
OR c2.cedant is not null --(3)
)
)
select * from rec1
Query plan #1 : slow
QUERY PLAN
CTE Scan on rec1 (cost=1662106.61..2431078.65 rows=38448602 width=16) (actual time=384.975..398.182 rows=1 loops=1)
CTE tmpctr1
-> Hash Join (cost=36.06..116.37 rows=148225 width=8) (actual time=0.009..0.010 rows=0 loops=1)
Hash Cond: (c.idcontrat = ces.idcontrat)
-> Hash Join (cost=1.04..28.50 rows=385 width=16) (actual time=0.009..0.009 rows=0 loops=1)
Hash Cond: (ced.idcontrat = c.idcontrat)
-> Seq Scan on ctract ced (cost=0.00..25.40 rows=770 width=12) (actual time=0.008..0.008 rows=0 loops=1)
Filter: isassignor
-> Hash (cost=1.02..1.02 rows=1 width=4) (never executed)
-> Seq Scan on contrats c (cost=0.00..1.02 rows=1 width=4) (never executed)
Filter: (idtypecontrat = ANY ('{4,5}'::integer[]))
-> Hash (cost=25.40..25.40 rows=770 width=4) (never executed)
-> Seq Scan on ctract ces (cost=0.00..25.40 rows=770 width=4) (never executed)
Filter: (NOT COALESCE(isassignor, false))
CTE rec1
-> Recursive Union (cost=0.00..1661990.25 rows=38448602 width=16) (actual time=384.973..398.179 rows=1 loops=1)
-> Seq Scan on copyrightad ca (cost=0.00..8.20 rows=2 width=16) (actual time=384.970..384.981 rows=1 loops=1)
Filter: (((role)::text = ANY ('{E,CE}'::text[])) AND (idoeu = 13))
Rows Removed by Filter: 279
-> Merge Left Join (cost=21618.01..89301.00 rows=3844860 width=16) (actual time=13.193..13.193 rows=0 loops=1)
Merge Cond: (r.idclient = c2.idclient)
Filter: ((c_1.cedant IS NOT NULL) OR (c2.cedant IS NOT NULL))
Rows Removed by Filter: 1
-> Sort (cost=3892.89..3905.86 rows=5188 width=16) (actual time=13.179..13.180 rows=1 loops=1)
Sort Key: r.idclient
Sort Method: quicksort Memory: 25kB
-> Hash Right Join (cost=0.54..3572.76 rows=5188 width=16) (actual time=13.170..13.171 rows=1 loops=1)
Hash Cond: (c_1.cedant = r.chn)
-> CTE Scan on tmpctr1 c_1 (cost=0.00..2964.50 rows=148225 width=4) (actual time=0.011..0.011 rows=0 loops=1)
-> Hash (cost=0.45..0.45 rows=7 width=16) (actual time=13.150..13.150 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> WorkTable Scan on rec1 r (cost=0.00..0.45 rows=7 width=16) (actual time=13.138..13.140 rows=1 loops=1)
Filter: (level < 20)
-> Materialize (cost=17725.12..18466.25 rows=148225 width=8) (actual time=0.008..0.008 rows=0 loops=1)
-> Sort (cost=17725.12..18095.68 rows=148225 width=8) (actual time=0.007..0.007 rows=0 loops=1)
Sort Key: c2.idclient
Sort Method: quicksort Memory: 25kB
-> CTE Scan on tmpctr1 c2 (cost=0.00..2964.50 rows=148225 width=8) (actual time=0.000..0.000 rows=0 loops=1)
Planning Time: 0.270 ms
JIT:
Functions: 53
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 5.064 ms, Inlining 4.491 ms, Optimization 236.336 ms, Emission 155.206 ms, Total 401.097 ms
Execution Time: 403.549 ms
Query plan #2 : fast : line (1) is hidden
QUERY PLAN
CTE Scan on rec1 (cost=240.86..245.90 rows=252 width=16) (actual time=0.030..0.058 rows=1 loops=1)
CTE tmpctr1
-> Hash Join (cost=1.04..28.50 rows=385 width=8) (actual time=0.001..0.001 rows=0 loops=1)
Hash Cond: (ced.idcontrat = c.idcontrat)
-> Seq Scan on ctract ced (cost=0.00..25.40 rows=770 width=12) (actual time=0.001..0.001 rows=0 loops=1)
Filter: isassignor
-> Hash (cost=1.02..1.02 rows=1 width=4) (never executed)
-> Seq Scan on contrats c (cost=0.00..1.02 rows=1 width=4) (never executed)
Filter: (idtypecontrat = ANY ('{4,5}'::integer[]))
CTE rec1
-> Recursive Union (cost=0.00..212.35 rows=252 width=16) (actual time=0.029..0.056 rows=1 loops=1)
-> Seq Scan on copyrightad ca (cost=0.00..8.20 rows=2 width=16) (actual time=0.027..0.041 rows=1 loops=1)
Filter: (((role)::text = ANY ('{E,CE}'::text[])) AND (idoeu = 13))
Rows Removed by Filter: 279
-> Hash Right Join (cost=9.97..19.91 rows=25 width=16) (actual time=0.013..0.013 rows=0 loops=1)
Hash Cond: (c2.idclient = r.idclient)
Filter: ((c_1.cedant IS NOT NULL) OR (c2.cedant IS NOT NULL))
Rows Removed by Filter: 1
-> CTE Scan on tmpctr1 c2 (cost=0.00..7.70 rows=385 width=8) (actual time=0.000..0.000 rows=0 loops=1)
-> Hash (cost=9.81..9.81 rows=13 width=16) (actual time=0.009..0.009 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Right Join (cost=0.54..9.81 rows=13 width=16) (actual time=0.008..0.008 rows=1 loops=1)
Hash Cond: (c_1.cedant = r.chn)
-> CTE Scan on tmpctr1 c_1 (cost=0.00..7.70 rows=385 width=4) (actual time=0.001..0.001 rows=0 loops=1)
-> Hash (cost=0.45..0.45 rows=7 width=16) (actual time=0.003..0.003 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> WorkTable Scan on rec1 r (cost=0.00..0.45 rows=7 width=16) (actual time=0.002..0.002 rows=1
loops=1) Filter: (level < 20) Planning Time: 0.330 ms Execution Time: 0.094 ms Query plan #3 : fast : lines (2) and (3) are hidden QUERY PLAN CTE Scan on rec1 (cost=1829.46..2907.50 rows=53902 width=16) (actual time=0.050..0.074 rows=1 loops=1) CTE rec1 -> Recursive Union (cost=0.00..1829.46 rows=53902 width=16) (actual time=0.049..0.072 rows=1 loops=1) -> Seq Scan on copyrightad ca (cost=0.00..8.20 rows=2 width=16) (actual time=0.046..0.067 rows=1 loops=1) Filter: (((role)::text = ANY ('{E,CE}'::text[])) AND (idoeu = 13)) Rows Removed by Filter: 279 -> Hash Join (cost=30.45..74.32 rows=5390 width=16) (actual time=0.003..0.003 rows=0 loops=1) Hash Cond: (c.idcontrat = ced.idcontrat) -> Hash Join (cost=1.04..28.50 rows=385 width=8) (actual time=0.002..0.002 rows=0 loops=1) Hash Cond: (ces.idcontrat = c.idcontrat) -> Seq Scan on ctract ces (cost=0.00..25.40 rows=770 width=4) (actual time=0.002..0.002 rows=0 loops=1) Filter: (NOT COALESCE(isassignor, false)) -> Hash (cost=1.02..1.02 rows=1 width=4) (never executed) -> Seq Scan on contrats c (cost=0.00..1.02 rows=1 width=4) (never executed) Filter: (idtypecontrat = ANY ('{4,5}'::integer[])) -> Hash (cost=29.08..29.08 rows=27 width=12) (never executed) -> Hash Join (cost=0.54..29.08 rows=27 width=12) (never executed) Hash Cond: (ced.idad = r.chn) -> Seq Scan on ctract ced (cost=0.00..25.40 rows=766 width=8) (never executed) Filter: (isassignor AND (idad IS NOT NULL)) -> Hash (cost=0.45..0.45 rows=7 width=12) (never executed) -> WorkTable Scan on rec1 r (cost=0.00..0.45 rows=7 width=12) (never executed) Filter: (level < 20) Planning Time: 0.310 ms Execution Time: 0.179 ms PostgreSQL 12.2 Edit: the same query on the same DB on PostgreSQL 11.6 runs fast (still highly over-estimating rows on some parts) so I guess this is a regression.
Why?
The immediate reason for the big difference in query execution time is "Just-in-Time compilation", which is active by default in Postgres 12. The slow plan shows it directly: its JIT section reports Total 401.097 ms out of an execution time of 403.549 ms. Quoting the release notes:
    Enable Just-in-Time (JIT) compilation by default, if the server has been built with support for it (Andres Freund)
    Note that this support is not built by default, but has to be selected explicitly while configuring the build.
Turn it off in your session and test again:
    SET jit = off;
But JIT only amplifies the underlying problem: estimates in the query plan are way off, which leads Postgres to assume a huge number of rows resulting from the joins in CTE tmpctr1, and to assume that JIT would pay off. Related: Keep PostgreSQL from sometimes choosing a bad query plan
You asserted that ...
    DB is VACUUMed and ANALYZEd
    ctract is empty
But Postgres expects to find 770 rows in a sequential scan (note the estimated rows=770 against the actual rows=0):
    -> Seq Scan on ctract ced (cost=0.00..25.40 rows=770 width=12) (actual time=0.008..0.008 rows=0 loops=1)
       Filter: isassignor
The number 770 comes directly from pg_class.reltuples, meaning that statistic is completely out of date. Maybe you relied on autovacuum but something kept it from kicking in, or its settings are not aggressive enough? Run this manually and retry:
    ANALYZE ctract;
There is probably more potential to optimize, but I stopped processing here. In a populated database, indexes will help a lot. Are you aware that partial or expression indexes can help with customized statistics? See:
    Index that is not used, yet influences query
    Get count estimates from pg_class.reltuples for given conditions
About (1):
    JOIN CtrAct ces ON c.idcontrat=ces.idcontrat AND NOT COALESCE(ces.isassignor,FALSE) --(1)
Try replacing it with the equivalent:
    JOIN CtrAct ces ON c.idcontrat=ces.idcontrat AND ces.isassignor IS NOT TRUE
It's clearer in any case. The convoluted expression may prevent index usage or better estimates (not the problem here).
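The checks suggested in this answer can be run as one short session. This is only a sketch: it assumes the table is reachable as ctract (the unquoted CtrAct from the question folds to lower case), and the pg_stat_user_tables lookup and the SHOW statements are additions for diagnosis that are not part of the original answer.

```sql
-- 1. Compare the row count stored for the planner with the real one.
SELECT relname, reltuples, relpages
FROM   pg_class
WHERE  relname = 'ctract';

SELECT count(*) AS actual_rows FROM ctract;

-- 2. Check when statistics were last gathered, manually or by autovacuum.
SELECT relname, last_analyze, last_autoanalyze, last_vacuum, last_autovacuum
FROM   pg_stat_user_tables
WHERE  relname = 'ctract';

-- 3. Refresh the statistics by hand, as the answer recommends.
ANALYZE ctract;

-- 4. Inspect the JIT settings, then disable JIT for this session only.
SHOW jit;
SHOW jit_above_cost;   -- plans costed above this threshold are JIT-compiled
SET jit = off;

-- 5. Re-run the problem query under EXPLAIN (ANALYZE) and compare timings.
```

If the query is fast again with jit = off but the row estimates are still far too high, the misestimate remains the thing to fix; disabling JIT only hides its cost.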
select max(id) from joined table inefficient query plan
I have a view req_res which joins 2 tables, Request and Response, inner joined on requestId. Request has a primary key on the ID column.
When I run (Query 1):
    select max(ID) from req_res where ID > 1000000 and ID < 2000000;
Explain plan: hash join: Index scan of Request.ID and sequential scan of Response.request_id
query duration: 30s
When I narrow the range to 900k (Query 2):
    select max(ID) from req_res where ID > 1000000 and ID < 1900000;
Plan: nested loop: Index scan of Request.ID and Index only scan of Response.request_id
query duration: 3s
When I take the first query and disable hash joins with set enable_hashjoin=off; I get a Merge join plan. When I also disable merge joins with set enable_mergejoin=off; I get a nested loop, which completes in 3 seconds (instead of 30 using the hash join).
The Request table holds ~70 million records. Most of the requests have a response counterpart, but some of them don't.
Version: PostgreSQL 10.10
req_res DDL:
    CREATE OR REPLACE VIEW public.req_res AS
    SELECT req.id, res.req_id, res.body::character varying(500), res.time, res.duration, res.balance, res.header::character varying(100), res.created_at
    FROM response res
    JOIN request req ON req.req_id = res.req_id;
Query 1 Plan:
Aggregate (cost=2834115.70..2834115.71 rows=1 width=8) (actual time=154709.729..154709.730 rows=1 loops=1) Buffers: shared hit=467727 read=685320 dirtied=214, temp read=240773 written=239751 -> Hash Join (cost=2493060.64..2831172.33 rows=1177346 width=8) (actual time=143800.101..154147.080 rows=1198706 loops=1) Hash Cond: (req.req_id = res.req_id) Buffers: shared hit=467727 read=685320 dirtied=214, temp read=240773 written=239751 -> Append (cost=0.44..55619.59 rows=1177346 width=16) (actual time=0.957..2354.648 rows=1200001 loops=1) Buffers: shared hit=438960 read=32014 -> Index Scan using "5_5_req_pkey" on _hyper_2_5_chunk rs (cost=0.44..19000.10 rows=399803 width=16) (actual time=0.956..546.231 rows=399999 loops=1) Index Cond: ((id >= 49600001) AND (id <= 
50800001)) Buffers: shared hit=178872 read=10742 -> Index Scan using "7_7_req_pkey" on _hyper_2_7_chunk rs_1 (cost=0.44..36619.50 rows=777543 width=16) (actual time=0.037..767.744 rows=800002 loops=1) Index Cond: ((id >= 49600001) AND (id <= 50800001)) Buffers: shared hit=260088 read=21272 -> Hash (cost=1367864.98..1367864.98 rows=68583298 width=8) (actual time=143681.850..143681.850 rows=68568554 loops=1) Buckets: 262144 Batches: 512 Memory Usage: 7278kB Buffers: shared hit=28764 read=653306 dirtied=214, temp written=233652 -> Append (cost=0.00..1367864.98 rows=68583298 width=8) (actual time=0.311..99590.021 rows=68568554 loops=1) Buffers: shared hit=28764 read=653306 dirtied=214 -> Seq Scan on _hyper_3_2_chunk wt (cost=0.00..493704.44 rows=24941244 width=8) (actual time=0.309..14484.420 rows=24950147 loops=1) Buffers: shared hit=661 read=243631 -> Seq Scan on _hyper_3_6_chunk wt_1 (cost=0.00..503935.04 rows=24978804 width=8) (actual time=0.334..14487.931 rows=24963020 loops=1) Buffers: shared hit=168 read=253979 -> Seq Scan on _hyper_3_8_chunk wt_2 (cost=0.00..370225.50 rows=18663250 width=8) (actual time=0.327..10837.291 rows=18655387 loops=1) Buffers: shared hit=27935 read=155696 dirtied=214 Planning time: 3.986 ms Execution time: 154709.859 ms Query 2 Plan: Finalize Aggregate (cost=2634042.50..2634042.51 rows=1 width=8) (actual time=5525.626..5525.627 rows=1 loops=1) Buffers: shared hit=8764620 read=12779 -> Gather (cost=2634042.29..2634042.50 rows=2 width=8) (actual time=5525.609..5525.705 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=8764620 read=12779 -> Partial Aggregate (cost=2633042.29..2633042.30 rows=1 width=8) (actual time=5515.507..5515.508 rows=1 loops=3) Buffers: shared hit=8764620 read=12779 -> Nested Loop (cost=0.88..2632023.83 rows=407382 width=8) (actual time=5.383..5261.979 rows=332978 loops=3) Buffers: shared hit=8764620 read=12779 -> Append (cost=0.44..40514.98 rows=407383 width=16) (actual time=0.035..924.498 
rows=333334 loops=3) Buffers: shared hit=446706 -> Parallel Index Scan using "5_5_req_pkey" on _hyper_2_5_chunk rs (cost=0.44..16667.91 rows=166585 width=16) (actual time=0.033..169.854 rows=133333 loops=3) Index Cond: ((id >= 49600001) AND (id <= 50600001)) Buffers: shared hit=190175 -> Parallel Index Scan using "7_7_req_pkey" on _hyper_2_7_chunk rs_1 (cost=0.44..23847.07 rows=240798 width=16) (actual time=0.039..336.091 rows=200001 loops=3) Index Cond: ((id >= 49600001) AND (id <= 50600001)) Buffers: shared hit=256531 -> Append (cost=0.44..6.33 rows=3 width=8) (actual time=0.011..0.011 rows=1 loops=1000001) Buffers: shared hit=8317914 read=12779 -> Index Only Scan using "2_2_response_pkey" on _hyper_3_2_chunk wt (cost=0.44..2.11 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1000001) Index Cond: (req_id = req.req_id) Heap Fetches: 0 Buffers: shared hit=3000005 -> Index Only Scan using "6_6_response_pkey" on _hyper_3_6_chunk wt_1 (cost=0.44..2.11 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1000001) Index Cond: (req_id = req.req_id) Heap Fetches: 192906 Buffers: shared hit=3551440 read=7082 -> Index Only Scan using "8_8_response_pkey" on _hyper_3_8_chunk wt_2 (cost=0.44..2.10 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=443006) Index Cond: (req_id = req.req_id) Heap Fetches: 162913 Buffers: shared hit=1766469 read=5697 Planning time: 0.839 ms Execution time: 5525.814 ms
Explain postgres query, why is the query that much longer with WHERE and LIMIT
I'm using postgres v9.6.5. I have a query which seems not that complicated and was wondering why is it so "slow" (it's not really that slow, but I don't have a lot of data actually - like a few thousand rows). Here is the query: SELECT o0.* FROM "orders" AS o0 JOIN "balances" AS b1 ON b1."id" = o0."balance_id" JOIN "users" AS u3 ON u3."id" = b1."user_id" WHERE (u3."partner_id" = 3) ORDER BY o0."id" DESC LIMIT 10; And that's query plan: Limit (cost=0.43..12.84 rows=10 width=148) (actual time=0.062..53.866 rows=4 loops=1) -> Nested Loop (cost=0.43..4750.03 rows=3826 width=148) (actual time=0.061..53.864 rows=4 loops=1) Join Filter: (b1.user_id = u3.id) Rows Removed by Join Filter: 67404 -> Nested Loop (cost=0.43..3945.32 rows=17856 width=152) (actual time=0.025..38.457 rows=16852 loops=1) -> Index Scan Backward using orders_pkey on orders o0 (cost=0.29..897.80 rows=17856 width=148) (actual time=0.016..11.558 rows=16852 loops=1) -> Index Scan using balances_pkey on balances b1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=16852) Index Cond: (id = o0.balance_id) -> Materialize (cost=0.00..1.19 rows=3 width=4) (actual time=0.000..0.000 rows=4 loops=16852) -> Seq Scan on users u3 (cost=0.00..1.18 rows=3 width=4) (actual time=0.023..0.030 rows=4 loops=1) Filter: (partner_id = 3) Rows Removed by Filter: 12 Planning time: 0.780 ms Execution time: 54.053 ms I actually tried without LIMIT and I got quite different plan: Sort (cost=874.23..883.80 rows=3826 width=148) (actual time=11.361..11.362 rows=4 loops=1) Sort Key: o0.id DESC Sort Method: quicksort Memory: 26kB -> Hash Join (cost=3.77..646.55 rows=3826 width=148) (actual time=11.300..11.346 rows=4 loops=1) Hash Cond: (o0.balance_id = b1.id) -> Seq Scan on orders o0 (cost=0.00..537.56 rows=17856 width=148) (actual time=0.012..8.464 rows=16852 loops=1) -> Hash (cost=3.55..3.55 rows=18 width=4) (actual time=0.125..0.125 rows=24 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Hash Join 
(cost=1.21..3.55 rows=18 width=4) (actual time=0.046..0.089 rows=24 loops=1) Hash Cond: (b1.user_id = u3.id) -> Seq Scan on balances b1 (cost=0.00..1.84 rows=84 width=8) (actual time=0.011..0.029 rows=96 loops=1) -> Hash (cost=1.18..1.18 rows=3 width=4) (actual time=0.028..0.028 rows=4 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on users u3 (cost=0.00..1.18 rows=3 width=4) (actual time=0.014..0.021 rows=4 loops=1) Filter: (partner_id = 3) Rows Removed by Filter: 12 Planning time: 0.569 ms Execution time: 11.420 ms And also without WHERE (but with LIMIT): Limit (cost=0.43..4.74 rows=10 width=148) (actual time=0.023..0.066 rows=10 loops=1) -> Nested Loop (cost=0.43..7696.26 rows=17856 width=148) (actual time=0.022..0.065 rows=10 loops=1) Join Filter: (b1.user_id = u3.id) Rows Removed by Join Filter: 139 -> Nested Loop (cost=0.43..3945.32 rows=17856 width=152) (actual time=0.009..0.029 rows=10 loops=1) -> Index Scan Backward using orders_pkey on orders o0 (cost=0.29..897.80 rows=17856 width=148) (actual time=0.007..0.015 rows=10 loops=1) -> Index Scan using balances_pkey on balances b1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=10) Index Cond: (id = o0.balance_id) -> Materialize (cost=0.00..1.21 rows=14 width=4) (actual time=0.001..0.001 rows=15 loops=10) -> Seq Scan on users u3 (cost=0.00..1.14 rows=14 width=4) (actual time=0.005..0.007 rows=16 loops=1) Planning time: 0.286 ms Execution time: 0.097 ms As you can see, without WHERE it's much faster. Can someone provide me with some information where can I look for explanations for those plans to better understand them? And also what can I do to make those queries faster (or I shouldn't worry cause with like 100 times more data they will still be fast enough? - 50ms is fine for me tbh)
PostgreSQL expects it to be fastest to scan orders in the requested sort order (backward on orders_pkey) and stop as soon as it has found 10 rows whose users entry satisfies the WHERE condition. However, the data distribution defeats that: only 4 rows match, fewer than the LIMIT of 10, so the scan can never stop early and reads all 16852 orders. Since PostgreSQL doesn't know how values correlate across tables, there is not much you can do to improve that estimate. You can force PostgreSQL to plan the query without the LIMIT clause like this: SELECT * FROM (<your query without ORDER BY and LIMIT> OFFSET 0) q ORDER BY id DESC LIMIT 10; With a top-N sort this should perform better.
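Applied to the query from the question, the rewrite could look like the following. It is a sketch that reuses the question's table and column names; whether it actually beats the original plan should be checked with EXPLAIN (ANALYZE) on real data.

```sql
SELECT *
FROM  (
   SELECT o0.*
   FROM   "orders"   AS o0
   JOIN   "balances" AS b1 ON b1."id" = o0."balance_id"
   JOIN   "users"    AS u3 ON u3."id" = b1."user_id"
   WHERE  u3."partner_id" = 3
   OFFSET 0            -- keeps the subquery from being flattened into the outer query
   ) q
ORDER  BY q."id" DESC  -- sort and limit are applied only after the join has been planned
LIMIT  10;
```

OFFSET 0 does not change the result. It is used here only so that the join is planned without the LIMIT, leaving the sort and the limit to run on the small join result.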