Query statistics on postgres - postgresql

I am facing a problem with a specific query on PostgreSQL.
Look at the EXPLAIN output:
-> Nested Loop Left Join (cost=21547.86..87609.16 rows=123 width=69) (actual time=28.997..562.299 rows=32710 loops=1)
-> Hash Join (cost=21547.30..87210.72 rows=123 width=53) (actual time=28.913..74.682 rows=32710 loops=1)
Hash Cond: (registry.id = profile.registry_id)
-> Bitmap Heap Scan on registry (cost=726.99..66218.46 rows=65503 width=53) (actual time=5.123..32.794 rows=66496 loops=1)
Recheck Cond: ((tenant_id = 1009469) AND active AND (excluded_at IS NULL))
Heap Blocks: exact=12563
-> Bitmap Index Scan on registry_tenant_id_excluded_at (cost=0.00..710.61 rows=65503 width=0) (actual time=3.589..3.589 rows=66496 loops=1)
Index Cond: (tenant_id = 1009469)
-> Hash (cost=20202.82..20202.82 rows=49399 width=16) (actual time=23.738..23.738 rows=32710 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2046kB
-> Index Only Scan using profile_tenant_id_registry_id on profile (cost=0.56..20202.82 rows=49399 width=16) (actual time=0.019..19.173 rows=32710 loops=1)
Index Cond: (tenant_id = 1009469)
Heap Fetches: 29493
It misestimates the hash join, even though both scans are accurate.
I already tried boosting the statistics target on the related columns, but the estimate only moved from 117 to 123 rows, so I guess this is not the issue.
Why is it misestimating so badly?
The nested loop causes a lot of work for the database.

It looks like rows with the same tenant_id also mostly have the same value for registry_id/registry.id, but the planner doesn't understand that. It thinks that registry_id = registry.id will be true as often for the actually selected rows as it would be for randomly selected pairs of rows.
I don't think there is anything you can do about this.
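You can at least see the numbers the estimate is built from. For an equi-join, the planner's selectivity is roughly 1/max(n_distinct) of the two join columns, which assumes the tenant's rows are spread evenly across registry ids. A sketch of how to inspect those statistics, using the table and column names from the plan above:

```sql
-- Inspect the per-column statistics the join estimate is based on.
SELECT tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE (tablename = 'profile'  AND attname = 'registry_id')
   OR (tablename = 'registry' AND attname = 'id');
```

If n_distinct is large on both sides, the estimated join output for one tenant's rows will be small, regardless of how accurate the per-table scans are.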

Related

PostgreSQL: Sequential scan despite having indexes

I have the following two tables.
person_addresses
address_normalization
The person_addresses table has a field named address_id as the primary key and address_normalization has the corresponding field address_id which has an index on it.
Now, when I explain the following query, I see a sequential scan.
SELECT
count(*)
FROM
mp_member2.person_addresses pa
JOIN mp_member2.address_normalization an ON
an.address_id = pa.address_id
WHERE
an.sr_modification_time >= 1550692189468;
-- Result: 2654
Please refer to the following screenshot.
You see that there is a sequential scan after the hash join. I'm not sure I understand this part: why would a sequential scan follow a hash join?
And as seen in the query above, the set of records returned is also low.
Is this expected behaviour or am I doing something wrong?
Update #1: I also have indices on the sr_modification_time fields of both the tables
Update #2: Full execution plan
Aggregate (cost=206944.74..206944.75 rows=1 width=0) (actual time=2807.844..2807.844 rows=1 loops=1)
Buffers: shared hit=4629 read=82217
-> Hash Join (cost=2881.95..206825.15 rows=47836 width=0) (actual time=0.775..2807.160 rows=2654 loops=1)
Hash Cond: (pa.address_id = an.address_id)
Buffers: shared hit=4629 read=82217
-> Seq Scan on person_addresses pa (cost=0.00..135924.93 rows=4911993 width=8) (actual time=0.005..1374.610 rows=4911993 loops=1)
Buffers: shared hit=4588 read=82217
-> Hash (cost=2432.05..2432.05 rows=35992 width=18) (actual time=0.756..0.756 rows=1005 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 41kB
Buffers: shared hit=41
-> Index Scan using mp_member2_address_normalization_mod_time on address_normalization an (cost=0.43..2432.05 rows=35992 width=18) (actual time=0.012..0.424 rows=1005 loops=1)
Index Cond: (sr_modification_time >= 1550692189468::bigint)
Buffers: shared hit=41
Planning time: 0.244 ms
Execution time: 2807.885 ms
Update #3: I tried with a newer timestamp and it used an index scan.
EXPLAIN (
ANALYZE
, buffers
, format TEXT
) SELECT
COUNT(*)
FROM
mp_member2.person_addresses pa
JOIN mp_member2.address_normalization an ON
an.address_id = pa.address_id
WHERE
an.sr_modification_time >= 1557507300342;
-- count: 1364
Query Plan:
Aggregate (cost=295.48..295.49 rows=1 width=0) (actual time=2.770..2.770 rows=1 loops=1)
Buffers: shared hit=1404
-> Nested Loop (cost=4.89..295.43 rows=19 width=0) (actual time=0.038..2.491 rows=1364 loops=1)
Buffers: shared hit=1404
-> Index Scan using mp_member2_address_normalization_mod_time on address_normalization an (cost=0.43..8.82 rows=14 width=18) (actual time=0.009..0.142 rows=341 loops=1)
Index Cond: (sr_modification_time >= 1557507300342::bigint)
Buffers: shared hit=14
-> Bitmap Heap Scan on person_addresses pa (cost=4.46..20.43 rows=4 width=8) (actual time=0.004..0.005 rows=4 loops=341)
Recheck Cond: (address_id = an.address_id)
Heap Blocks: exact=360
Buffers: shared hit=1390
-> Bitmap Index Scan on idx_mp_member2_person_addresses_address_id (cost=0.00..4.46 rows=4 width=0) (actual time=0.003..0.003 rows=4 loops=341)
Index Cond: (address_id = an.address_id)
Buffers: shared hit=1030
Planning time: 0.214 ms
Execution time: 2.816 ms
That is the expected behavior: without a usable index, after building the hash the database has to scan the whole set and check each row for the sr_modification_time value.
You should create:
an index on (sr_modification_time)
or a composite index on (address_id, sr_modification_time)
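A sketch of the suggested indexes (the index names are hypothetical; adjust the table to wherever sr_modification_time actually lives in your schema):

```sql
-- Single-column index on the filter column:
CREATE INDEX an_sr_modification_time_idx
    ON mp_member2.address_normalization (sr_modification_time);

-- Or the composite variant, so the join key and the filter share one index:
CREATE INDEX an_addr_id_mod_time_idx
    ON mp_member2.address_normalization (address_id, sr_modification_time);
```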

Discrepancy between Explain Analyze Actual Row Scanned vs. Total Row in Table

Recently we experienced a performance problem in Production Aurora PG cluster. This is an EXPLAIN ANALYZE of the query.
The majority of the time is spent on Bitmap Index Scan on job_stage (cost=0.00..172.93 rows=9666 width=0) (actual time=238.410..238.410 rows=2019444 loops=1), where 2,019,444 rows are scanned. However, what troubles me is that there are only about 70k rows in this table. Autovacuum is turned on, but the RDS instance was recently overloaded by another issue, and we suspect that autovacuum was running behind. If that is the case, would it explain our observation that the number of rows scanned exceeds the actual row count of the table?
Nested Loop (cost=229.16..265.28 rows=1 width=464) (actual time=239.815..239.815 rows=0 loops=1)
-> Nested Loop (cost=228.62..252.71 rows=1 width=540) (actual time=239.814..239.814 rows=0 loops=1)
Join Filter: (job.scanner_uuid = scanner_resource_pool.resource_uuid)
Rows Removed by Join Filter: 1
-> Index Scan using scanner_resource_pool_scanner_index on scanner_resource_pool (cost=0.41..8.43 rows=1 width=115) (actual time=0.017..0.019 rows=1 loops=1)
Index Cond: ((box_uuid = '5d8a7e0c-23ff-4853-bb6d-ffff6a38afa7'::text) AND (scanner_uuid = '9be9ac50-de05-4ddd-9545-ddddc484dce'::text))
-> Bitmap Heap Scan on job (cost=228.22..244.23 rows=4 width=464) (actual time=239.790..239.791 rows=1 loops=1)
Recheck Cond: ((box_uuid = '5d8a7e0c-23ff-4853-bb6d-ffff6a38afa7'::text) AND (stage = 'active'::text))
Rows Removed by Index Recheck: 6
Heap Blocks: exact=791
-> BitmapAnd (cost=228.22..228.22 rows=4 width=0) (actual time=238.913..238.913 rows=0 loops=1)
-> Bitmap Index Scan on job_box_status (cost=0.00..55.04 rows=1398 width=0) (actual time=0.183..0.183 rows=899 loops=1)
Index Cond: (box_uuid = '5d8a7e0c-23ff-4853-bb6d-ffff6a38afa7'::text)
-> Bitmap Index Scan on job_stage (cost=0.00..172.93 rows=9666 width=0) (actual time=238.410..238.410 rows=2019444 loops=1)
Index Cond: (stage = 'active'::text)
-> Index Only Scan using uc_box_uuid on scanner (cost=0.54..12.56 rows=1 width=87) (never executed)
Index Cond: ((box_uuid = '5d8a7e0c-23ff-4853-bb6d-ffff6a38afa7'::text) AND (uuid = '9be9ac50-de05-4ddd-9545-ddddc484dce'::text))
Heap Fetches: 0
Planning time: 1.274 ms
Execution time: 239.876 ms
I found my answer by confirming with AWS: if autovacuum is running behind, the EXPLAIN ANALYZE result may show this discrepancy, because the index still contains entries for dead rows that have not been cleaned up yet.
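A Bitmap Index Scan counts index entries even when they point to dead rows, so a table with 70k live rows can still report millions of scanned index rows after heavy churn. A quick way to check whether vacuum has fallen behind, sketched against the standard statistics views:

```sql
-- How many dead tuples are pending, and when vacuum last ran:
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'job';

-- A manual vacuum removes the dead index entries the bitmap scan was reading:
VACUUM (VERBOSE) job;
```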

Postgres Query Optimization w/ simple join

I have the following query:
SELECT "person_dimensions"."dimension"
FROM "person_dimensions"
join users
on users.id = person_dimensions.user_id
where users.team_id = 2
The following is the result of EXPLAIN ANALYZE:
Nested Loop (cost=0.43..93033.84 rows=452 width=11) (actual time=1245.321..42915.426 rows=827 loops=1)
-> Seq Scan on person_dimensions (cost=0.00..254.72 rows=13772 width=15) (actual time=0.022..9.907 rows=13772 loops=1)
-> Index Scan using users_pkey on users (cost=0.43..6.73 rows=1 width=4) (actual time=2.978..3.114 rows=0 loops=13772)
Index Cond: (id = person_dimensions.user_id)
Filter: (team_id = 2)
Rows Removed by Filter: 1
Planning time: 0.396 ms
Execution time: 42915.678 ms
Indexes exist on person_dimensions.user_id and users.team_id, so it is unclear why this seemingly simple query is taking so long.
Maybe it has something to do with team_id not being usable in the join condition? Any ideas how to speed this up?
EDIT:
I tried this query:
SELECT "person_dimensions"."dimension"
FROM "person_dimensions"
join users ON users.id = person_dimensions.user_id
WHERE users.id IN (2337,2654,3501,56,4373,1060,3170,97,4629,41,3175,4541,2827)
which contains the ids returned by the subquery:
SELECT id FROM users WHERE team_id = 2
The result was 380 ms versus 42 s as above. I could use this as a workaround, but I am really curious about what is going on here...
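For reference, the same workaround can be written without hard-coding the ids by inlining the subquery (a sketch; whether the planner picks the fast plan for it depends on its estimates, which is exactly the open question here):

```sql
-- Equivalent to the hand-expanded IN list, but computed at run time:
SELECT pd.dimension
FROM person_dimensions pd
WHERE pd.user_id IN (SELECT id FROM users WHERE team_id = 2);
```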
I rebooted my DB server yesterday, and when it came back up this same query was performing as expected with a completely different query plan that used expected indices:
QUERY PLAN
Hash Join (cost=1135.63..1443.45 rows=84 width=11) (actual time=0.354..6.312 rows=835 loops=1)
Hash Cond: (person_dimensions.user_id = users.id)
-> Seq Scan on person_dimensions (cost=0.00..255.17 rows=13817 width=15) (actual time=0.002..2.764 rows=13902 loops=1)
-> Hash (cost=1132.96..1132.96 rows=214 width=4) (actual time=0.175..0.175 rows=60 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> Bitmap Heap Scan on users (cost=286.07..1132.96 rows=214 width=4) (actual time=0.032..0.157 rows=60 loops=1)
Recheck Cond: (team_id = 2)
Heap Blocks: exact=68
-> Bitmap Index Scan on index_users_on_team_id (cost=0.00..286.02 rows=214 width=0) (actual time=0.021..0.021 rows=82 loops=1)
Index Cond: (team_id = 2)
Planning time: 0.215 ms
Execution time: 6.474 ms
Does anyone have any ideas why it required a reboot to become aware of all of this? Could it be that manual vacuums that hadn't been done in a while were required, or something like that? Recall that I did run ANALYZE on the relevant tables before the reboot and it didn't change anything.

Extremely slow query in PostgreSQL (order by multi col)

Please let me know if you need the table definitions. As I'm sure is obvious, I have several tables holding information about a user (partydevicestatus, groupgps); each has a foreign key relationship to the user held in partyrelationship, and each row has an identifier of the "group" that user is in.
With this query, I simply want to, for a particular group (in this example, 6) get the user details, position and device info for each user.
I can clearly see from the explain that the Sort is the issue here, due to having 2 columns with a lot of data. However, I have an index on both of the columns being sorted on, and it has yielded no improvement. I'm almost certain this is a terribly optimised query, but I am not experienced enough with PostgreSQL to find a better one.
SELECT DISTINCT ON("public".groupgps.groupmember)
"public".groupgps.groupgps,
"public".groupgps.groupmember,
"public".groupgps.messagetype,
"public".groupgps.lat,
"public".groupgps.lon,
"public".groupgps.date_stamp,
"public".partyrelationship.to_party,
"public".partyrelationship.to_name,
"public".partyrelationship.image_url,
"public".partyrelationship.partyrelationship,
"public".partydevicestatus.connection_type,
"public".partydevicestatus.battery_level,
"public".partydevicestatus.charging_state,
"public".partydevicestatus.timestamp
FROM "public".groupgps
INNER JOIN "public".partyrelationship
ON "public".partyrelationship.partyrelationship = "public".groupgps.groupmember
INNER JOIN "public".partysettings
ON "public".partysettings.groupmember = "public".groupgps.groupmember
LEFT JOIN "public".partydevicestatus
ON "public".partydevicestatus.groupmember_id = "public".groupgps.groupmember
WHERE "public".partyrelationship.from_party = 6
AND "public".partysettings.gps_tracking_enabled = true
ORDER BY "public".groupgps.groupmember, "public".groupgps.date_stamp DESC
Explain Result
Unique (cost=1368961.43..1390701.85 rows=25 width=192) (actual time=24622.609..27043.061 rows=4 loops=1)
-> Sort (cost=1368961.43..1379831.64 rows=4348083 width=192) (actual time=24622.601..26604.659 rows=2221853 loops=1)
Sort Key: groupgps.groupmember, groupgps.date_stamp DESC
Sort Method: external merge Disk: 431400kB
-> Hash Left Join (cost=50.64..87013.93 rows=4348083 width=192) (actual time=0.499..3011.806 rows=2221853 loops=1)
Hash Cond: (groupgps.groupmember = partydevicestatus.groupmember_id)
-> Hash Join (cost=31.66..29732.32 rows=77101 width=167) (actual time=0.153..2242.950 rows=109041 loops=1)
Hash Cond: (groupgps.groupmember = partyrelationship.partyrelationship)
-> Seq Scan on groupgps (cost=0.00..24372.00 rows=1217200 width=50) (actual time=0.005..1933.528 rows=1217025 loops=1)
-> Hash (cost=31.48..31.48 rows=14 width=125) (actual time=0.141..0.141 rows=5 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Join (cost=10.31..31.48 rows=14 width=125) (actual time=0.092..0.138 rows=5 loops=1)
Hash Cond: (partysettings.groupmember = partyrelationship.partyrelationship)
-> Seq Scan on partysettings (cost=0.00..20.75 rows=75 width=8) (actual time=0.003..0.038 rows=75 loops=1)
Filter: gps_tracking_enabled
-> Hash (cost=9.79..9.79 rows=42 width=117) (actual time=0.076..0.076 rows=42 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> Seq Scan on partyrelationship (cost=0.00..9.79 rows=42 width=117) (actual time=0.007..0.058 rows=42 loops=1)
Filter: (from_party = 6)
Rows Removed by Filter: 181
-> Hash (cost=12.88..12.88 rows=488 width=29) (actual time=0.341..0.341 rows=489 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 41kB
-> Seq Scan on partydevicestatus (cost=0.00..12.88 rows=488 width=29) (actual time=0.023..0.163 rows=489 loops=1)
Planning time: 0.878 ms
Execution time: 27218.016 ms
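The plan shows an external merge sort writing 431 MB to disk before the Unique node. A single composite index matching the DISTINCT ON / ORDER BY keys is the usual way to avoid this kind of sort; two separate single-column indexes cannot provide the combined order. A sketch (hypothetical index name):

```sql
-- Matches ORDER BY groupmember, date_stamp DESC, so rows can be
-- read pre-sorted instead of sorting ~2.2M joined rows on disk:
CREATE INDEX groupgps_member_date_idx
    ON public.groupgps (groupmember, date_stamp DESC);
```

Whether the planner uses it also depends on the joins; filtering groupgps down before joining (rather than sorting the joined result) is the other direction worth exploring.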

How does one interpret the following PostgreSQL query plan

Please, observe:
(Forgot to add order, the plan is updated)
The query:
EXPLAIN ANALYZE
SELECT DISTINCT(id), special, customer, business_no, bill_to_name, bill_to_address1, bill_to_address2, bill_to_postal_code, ship_to_name, ship_to_address1, ship_to_address2, ship_to_postal_code,
purchase_order_no, ship_date::text, calc_discount_text(o) AS discount, discount_absolute, delivery, hst_percents, sub_total, total_before_hst, hst, total, total_discount, terms, rep, ship_via,
item_count, version, to_char(modified, 'YYYY-MM-DD HH24:MI:SS') AS "modified", to_char(created, 'YYYY-MM-DD HH24:MI:SS') AS "created"
FROM invoices o
LEFT JOIN reps ON reps.rep_id = o.rep_id
LEFT JOIN terms ON terms.terms_id = o.terms_id
LEFT JOIN shipVia ON shipVia.ship_via_id = o.ship_via_id
JOIN invoiceItems items ON items.invoice_id = o.id
WHERE items.qty < 5
ORDER BY modified
LIMIT 100
The result:
Limit (cost=2931740.10..2931747.85 rows=100 width=635) (actual time=414307.004..414387.899 rows=100 loops=1)
-> Unique (cost=2931740.10..3076319.37 rows=1865539 width=635) (actual time=414307.001..414387.690 rows=100 loops=1)
-> Sort (cost=2931740.10..2936403.95 rows=1865539 width=635) (actual time=414307.000..414325.058 rows=2956 loops=1)
Sort Key: (to_char(o.modified, 'YYYY-MM-DD HH24:MI:SS'::text)), o.id, o.special, o.customer, o.business_no, o.bill_to_name, o.bill_to_address1, o.bill_to_address2, o.bill_to_postal_code, o.ship_to_name, o.ship_to_address1, o.ship_to_address2, (...)
Sort Method: external merge Disk: 537240kB
-> Hash Join (cost=11579.63..620479.38 rows=1865539 width=635) (actual time=1535.805..131378.864 rows=1872673 loops=1)
Hash Cond: (items.invoice_id = o.id)
-> Seq Scan on invoiceitems items (cost=0.00..78363.45 rows=1865539 width=4) (actual time=0.110..4591.117 rows=1872673 loops=1)
Filter: (qty < 5)
Rows Removed by Filter: 1405763
-> Hash (cost=5498.18..5498.18 rows=64996 width=635) (actual time=1530.786..1530.786 rows=64996 loops=1)
Buckets: 1024 Batches: 64 Memory Usage: 598kB
-> Hash Left Join (cost=113.02..5498.18 rows=64996 width=635) (actual time=0.214..1043.207 rows=64996 loops=1)
Hash Cond: (o.ship_via_id = shipvia.ship_via_id)
-> Hash Left Join (cost=75.35..4566.81 rows=64996 width=607) (actual time=0.154..754.957 rows=64996 loops=1)
Hash Cond: (o.terms_id = terms.terms_id)
-> Hash Left Join (cost=37.67..3800.33 rows=64996 width=579) (actual time=0.071..506.145 rows=64996 loops=1)
Hash Cond: (o.rep_id = reps.rep_id)
-> Seq Scan on invoices o (cost=0.00..2868.96 rows=64996 width=551) (actual time=0.010..235.977 rows=64996 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.044..0.044 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on reps (cost=0.00..22.30 rows=1230 width=36) (actual time=0.027..0.032 rows=4 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.067..0.067 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on terms (cost=0.00..22.30 rows=1230 width=36) (actual time=0.001..0.007 rows=3 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.043..0.043 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on shipvia (cost=0.00..22.30 rows=1230 width=36) (actual time=0.027..0.032 rows=4 loops=1)
Total runtime: 414488.582 ms
This is, obviously, awful. I am pretty new to interpreting query plans and would like to know how to extract the useful performance improvement hints from such a plan.
EDIT 1
Two kinds of entities are involved in this query - invoices and invoice items having the 1-many relationship.
An invoice item specifies the quantity of it within the parent invoice.
The given query returns 100 invoices which have at least one item with the quantity of less than 5.
That should explain why I need DISTINCT: an invoice may have several items satisfying the filter, but I do not want the same invoice returned multiple times. However, I am perfectly aware that there may be better ways to accomplish the same semantics than using DISTINCT - I am more than willing to learn about them.
EDIT 2
Please, find below the indexes on the invoiceItems table at the time of the query:
CREATE INDEX invoiceitems_invoice_id_idx ON invoiceitems (invoice_id);
CREATE INDEX invoiceitems_invoice_id_name_index ON invoiceitems (invoice_id, name varchar_pattern_ops);
CREATE INDEX invoiceitems_name_index ON invoiceitems (name varchar_pattern_ops);
CREATE INDEX invoiceitems_qty_index ON invoiceitems (qty);
EDIT 3
The advice given by https://stackoverflow.com/users/808806/yieldsfalsehood on how to eliminate DISTINCT (and why) turns out to be a really good one. Here is the new query:
EXPLAIN ANALYZE
SELECT id, special, customer, business_no, bill_to_name, bill_to_address1, bill_to_address2, bill_to_postal_code, ship_to_name, ship_to_address1, ship_to_address2, ship_to_postal_code,
purchase_order_no, ship_date::text, calc_discount_text(o) AS discount, discount_absolute, delivery, hst_percents, sub_total, total_before_hst, hst, total, total_discount, terms, rep, ship_via,
item_count, version, to_char(modified, 'YYYY-MM-DD HH24:MI:SS') AS "modified", to_char(created, 'YYYY-MM-DD HH24:MI:SS') AS "created"
FROM invoices o
LEFT JOIN reps ON reps.rep_id = o.rep_id
LEFT JOIN terms ON terms.terms_id = o.terms_id
LEFT JOIN shipVia ON shipVia.ship_via_id = o.ship_via_id
WHERE EXISTS (SELECT 1 FROM invoiceItems items WHERE items.invoice_id = id AND items.qty < 5)
ORDER BY modified DESC
LIMIT 100
Here is the new plan:
Limit (cost=64717.14..64717.39 rows=100 width=635) (actual time=7830.347..7830.869 rows=100 loops=1)
-> Sort (cost=64717.14..64827.01 rows=43949 width=635) (actual time=7830.334..7830.568 rows=100 loops=1)
Sort Key: (to_char(o.modified, 'YYYY-MM-DD HH24:MI:SS'::text))
Sort Method: top-N heapsort Memory: 76kB
-> Hash Left Join (cost=113.46..63037.44 rows=43949 width=635) (actual time=2.322..6972.679 rows=64467 loops=1)
Hash Cond: (o.ship_via_id = shipvia.ship_via_id)
-> Hash Left Join (cost=75.78..50968.72 rows=43949 width=607) (actual time=0.650..3809.276 rows=64467 loops=1)
Hash Cond: (o.terms_id = terms.terms_id)
-> Hash Left Join (cost=38.11..50438.25 rows=43949 width=579) (actual time=0.550..3527.558 rows=64467 loops=1)
Hash Cond: (o.rep_id = reps.rep_id)
-> Nested Loop Semi Join (cost=0.43..49796.28 rows=43949 width=551) (actual time=0.015..3200.735 rows=64467 loops=1)
-> Seq Scan on invoices o (cost=0.00..2868.96 rows=64996 width=551) (actual time=0.002..317.954 rows=64996 loops=1)
-> Index Scan using invoiceitems_invoice_id_idx on invoiceitems items (cost=0.43..7.61 rows=42 width=4) (actual time=0.030..0.030 rows=1 loops=64996)
Index Cond: (invoice_id = o.id)
Filter: (qty < 5)
Rows Removed by Filter: 1
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.213..0.213 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on reps (cost=0.00..22.30 rows=1230 width=36) (actual time=0.183..0.192 rows=4 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.063..0.063 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on terms (cost=0.00..22.30 rows=1230 width=36) (actual time=0.044..0.050 rows=3 loops=1)
-> Hash (cost=22.30..22.30 rows=1230 width=36) (actual time=0.096..0.096 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on shipvia (cost=0.00..22.30 rows=1230 width=36) (actual time=0.071..0.079 rows=4 loops=1)
Total runtime: 7832.750 ms
Is this the best I can hope for? I restarted the server (to clear the database caches) and reran the query without EXPLAIN ANALYZE. It takes almost 5 seconds. Can it be improved even further? I have 65,000 invoices and 3,278,436 invoice items.
EDIT 4
Found it. I was ordering by a computation result, modified = to_char(modified, 'YYYY-MM-DD HH24:MI:SS'). Adding an index on the modified invoice field and ordering by the field itself brings the result to under 100 ms!
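The fix, sketched (the index name matches the one visible in the final plan; sorting on the raw timestamp column lets the planner walk the index backward instead of sorting the to_char() results):

```sql
CREATE INDEX invoices_modified_index ON invoices (modified);

-- And in the query, order by the column itself, not the formatted string:
-- ORDER BY o.modified DESC
-- LIMIT 100
```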
The final plan is:
Limit (cost=1.18..1741.92 rows=100 width=635) (actual time=3.002..27.065 rows=100 loops=1)
-> Nested Loop Left Join (cost=1.18..765042.09 rows=43949 width=635) (actual time=2.989..25.989 rows=100 loops=1)
-> Nested Loop Left Join (cost=1.02..569900.41 rows=43949 width=607) (actual time=0.413..16.863 rows=100 loops=1)
-> Nested Loop Left Join (cost=0.87..386185.48 rows=43949 width=579) (actual time=0.333..15.694 rows=100 loops=1)
-> Nested Loop Semi Join (cost=0.72..202470.54 rows=43949 width=551) (actual time=0.017..13.965 rows=100 loops=1)
-> Index Scan Backward using invoices_modified_index on invoices o (cost=0.29..155543.23 rows=64996 width=551) (actual time=0.003..4.543 rows=100 loops=1)
-> Index Scan using invoiceitems_invoice_id_idx on invoiceitems items (cost=0.43..7.61 rows=42 width=4) (actual time=0.079..0.079 rows=1 loops=100)
Index Cond: (invoice_id = o.id)
Filter: (qty < 5)
Rows Removed by Filter: 1
-> Index Scan using reps_pkey on reps (cost=0.15..4.17 rows=1 width=36) (actual time=0.007..0.008 rows=1 loops=100)
Index Cond: (rep_id = o.rep_id)
-> Index Scan using terms_pkey on terms (cost=0.15..4.17 rows=1 width=36) (actual time=0.003..0.004 rows=1 loops=100)
Index Cond: (terms_id = o.terms_id)
-> Index Scan using shipvia_pkey on shipvia (cost=0.15..4.17 rows=1 width=36) (actual time=0.006..0.008 rows=1 loops=100)
Index Cond: (ship_via_id = o.ship_via_id)
Total runtime: 27.572 ms
It is amazing! Thank you all for the help.
For starters, it's pretty standard to post explain plans to http://explain.depesz.com - that'll add some pretty formatting to it, give you a nice way to distribute the plan, and let you anonymize plans that might contain sensitive data. Even if you're not distributing the plan it makes it a lot easier to understand what's happening and can sometimes illustrate exactly where a bottleneck is.
There are countless resources that cover interpreting the details of postgres explain plans (see https://wiki.postgresql.org/wiki/Using_EXPLAIN). There are a lot of little details that get taken into account when the database chooses a plan, but there are some general concepts that can make it easier. First, get a grasp of the page-based layout of data and indexes (you don't need to know the details of the page format, just how data and indexes get split into pages). From there, get a feel for the two basic data access methods - full table scans and index scans - and with a little thought it should start to become clear in which situations one is preferred over the other (also keep in mind that an index scan isn't even always possible). At that point you can start looking into some of the configuration items that affect plan selection, in terms of how they might tip the scale in favor of a table scan or an index scan.
Once you've got that down, move up the plan and read into the details of the different nodes you find - in this plan you've got a lot of hash joins, so read up on that to start with. Then, to compare apples to apples, disable hash joins entirely ("set enable_hashjoin = false;") and run your explain analyze again. Now what join method do you see? Read up on that. Compare the estimated cost of that method with the estimated cost of the hash join. Why might they be different? The estimated cost of the second plan will be higher than the first plan (otherwise it would have been preferred in the first place), but what about the real time it takes to run the second plan? Is it lower or higher?
Finally, to address this plan specifically. With regard to the sort that's taking a long time: DISTINCT is not a function. "DISTINCT(id)" does not say "give me all the rows that are distinct on only the column id"; instead, it sorts the rows and takes the unique values based on all columns in the output (i.e. it is equivalent to writing "distinct id ..."). You should probably reconsider whether you actually need that DISTINCT at all. Normalization tends to design away the need for DISTINCT, and while it is occasionally needed, it can often be avoided by restructuring the query.
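To illustrate the point with the tables from this question: the parentheses change nothing, while DISTINCT ON is the PostgreSQL form that actually deduplicates on a single column:

```sql
-- These two are identical: DISTINCT always applies to the whole select list.
SELECT DISTINCT(id), customer FROM invoices;
SELECT DISTINCT id,   customer FROM invoices;

-- DISTINCT ON keeps one row per id (PostgreSQL extension; the ORDER BY
-- must start with the DISTINCT ON expression):
SELECT DISTINCT ON (id) id, customer
FROM invoices
ORDER BY id, modified DESC;
```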
You begin by chasing down the node that takes the longest and start optimizing there. In your case, that appears to be
Seq Scan on invoiceitems items
You should add an index there, and probably to the other tables as well.
You could also try increasing work_mem to get rid of the external sort.
When you have done that, the new plan will probably look completely different, so then start over.
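The work_mem change can be tried per-session before committing to anything global (the value below is a guess sized to the 537 MB on-disk sort reported in the plan; large values are risky as a server-wide setting because every sort in every backend may use that much):

```sql
SET work_mem = '600MB';  -- session-local override, hypothetical size

-- Re-run the query and check whether the Sort node now reports
-- an in-memory method (e.g. "quicksort") instead of "external merge":
EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;
```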