I don't understand the EXPLAIN result of a slow query - PostgreSQL

I have a slow query (> 1s). Here is the result of an explain analyze on that query:
Nested Loop Left Join (cost=0.42..32275.13 rows=36 width=257) (actual time=549.409..1106.044 rows=2 loops=1)
Join Filter: (answer.lt_surveyee_survey_id = lt_surveyee_survey.id)
-> Index Scan using lt_surveyee_survey_id_key on lt_surveyee_survey (cost=0.42..8.44 rows=1 width=64) (actual time=0.108..0.111 rows=1 loops=1)
Index Cond: (id = 'xxxxx'::citext)
-> Seq Scan on answer (cost=0.00..32266.24 rows=36 width=230) (actual time=549.285..1105.910 rows=2 loops=1)
Filter: (lt_surveyee_survey_id = 'xxxxx'::citext)
Rows Removed by Filter: 825315
Planning time: 0.592 ms
Execution time: 1106.124 ms
The xxxxx parts of the result are UUID-like values. I did not build that database, so I have no clue right now. Here is the query:
EXPLAIN ANALYZE SELECT
lt_surveyee_survey.id
-- +Some other fields
FROM lt_surveyee_survey
LEFT JOIN answer ON answer.lt_surveyee_survey_id = lt_surveyee_survey.id
WHERE lt_surveyee_survey.id = 'xxxxx';

Your JOIN is causing the performance drop according to the EXPLAIN ANALYZE output. You can see that there are 2 different lookups: the index scan took a fraction of a millisecond, while the scan on answer took over a second.
The difference is indicated by the beginning of the lines: Index Scan and Seq Scan, where Seq is short for sequential, meaning that all of the rows had to be checked by the DBMS to process the JOIN. The reason why sequential scans would occur is a missing index on the column being checked (answer.lt_surveyee_survey_id in your case).
Adding an index should solve the performance issue.
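A minimal sketch of such an index (the index name is just an example):
CREATE INDEX answer_lt_surveyee_survey_id_idx
    ON answer (lt_surveyee_survey_id);
With that in place, the planner can replace the sequential scan that filtered out 825315 rows with an index lookup on answer.lt_surveyee_survey_id.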

Related

Very slow query planning time with many indexes

I have a table "nodes" with a JSONB-column "data", in which I store various types of information.
The JSON includes pieces of text in different languages that need to be frequently searched by end-users. Per language, I therefore create about 4 indices similar to the following (usually with a separate search dictionary for that language):
CREATE INDEX nodes_label_sv_idx
ON nodes
USING GIN (to_tsvector('swedish_text', data #>> '{label,sv}'));
This works fine when only 2 or 3 languages are present, but when adding 20 more languages (each with 4 indices for that language's path into the JSON), the query planner becomes extremely slow for some queries (180 ms), even though those queries are still executing very fast (less than 1ms). The table currently contains about 50K records.
Weird thing is, those queries are simple joins on other columns of the table (unrelated to the "data" column), so the language-related indices are completely irrelevant. Also, the more language-related indices I drop, the faster the planner becomes again.
I completely understand that the planner needs to check all (150+) indices for potential relevance, but 180ms is extreme. Anybody have a suggestion? By the way, the problem only seems to occur when using a view (not when directly using the query underlying the view).
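For reference, the total number of indexes on the table can be counted from the catalog, e.g.:
SELECT count(*) AS index_count FROM pg_indexes WHERE tablename = 'nodes';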
I am using PostgreSQL 13 on Mac & Linux.
Edit:
query:
EXPLAIN (ANALYZE, BUFFERS)
select 1
from ca_nodes can
where (can.owner_id = 168 or can.aco_id = 0)
limit 1;
underlying view:
CREATE VIEW ca_nodes AS
SELECT n.nid, n.owner_id, c.aco_id
FROM nodes n inner join acos c on n.nid = c.nid;
explain (analyze, buffers) output:
Limit (cost=0.58..32.45 rows=1 width=4) (actual time=0.038..0.039 rows=1 loops=1)
Buffers: shared hit=6
-> Merge Join (cost=0.58..15136.78 rows=475 width=4) (actual time=0.037..0.037 rows=1 loops=1)
Merge Cond: (n.nid = c.nid)
Join Filter: ((n.owner_id = 168) OR (c.aco_id = 0))
Buffers: shared hit=6
-> Index Scan using nodes_pkey on nodes n (cost=0.29..12094.35 rows=47604 width=8) (actual time=0.017..0.017 rows=1 loops=1)
Buffers: shared hit=3
-> Index Scan using acos_nid_idx on precalculated_acos c (cost=0.29..2090.35 rows=47604 width=8) (actual time=0.014..0.014 rows=1 loops=1)
Buffers: shared hit=3
Planning:
Buffers: shared hit=83
Planning Time: 180.392 ms
Execution Time: 0.079 ms

Why do seq/index scans take so long when running query after a while? How to make it fast?

Problem:
I have a query that joins three tables. Whenever I run this query after a while (say 24 hours), it takes a long time to execute. But from then on, it executes really fast (~70x faster). I want to know why it takes so long to execute the first time, and how to solve it.
Table conditions:
The tables are property_2, property_attribute_2, and property_address_2, each of which is a partition of a bigger table (property, property_attribute, and property_address respectively). Also, rows in property_attribute_2 and property_address_2 reference property_2 through the column property_id. These columns (property.id, property_attribute_2.property_id, and property_address_2.property_id) are all indexed.
The query is:
select * from public.property_2 a
inner join public.property_attribute_2 b on a.id = b.property_id
left join public.property_address_2 c on a.id=c.property_id
The query plan when I run it after a while is:
Hash Right Join (cost=670010.33..983391.75 rows=2477776 width=185) (actual time=804159.499..1065892.338 rows=2477924 loops=1)
Hash Cond: (c.property_id = a.id)
-> Seq Scan on property_address_2 c (cost=0.00..131660.48 rows=4257948 width=72) (actual time=289.781..247906.955 rows=4257973 loops=1)
-> Hash (cost=595483.13..595483.13 rows=2477776 width=117) (actual time=803833.183..803833.185 rows=2477921 loops=1)
Buckets: 32768 Batches: 128 Memory Usage: 3165kB
-> Hash Join (cost=94193.96..595483.13 rows=2477776 width=117) (actual time=98061.326..802753.642 rows=2477921 loops=1)
Hash Cond: (a.id = b.property_id)
-> Seq Scan on property_2 a (cost=0.00..265463.84 rows=6176884 width=105) (actual time=1349.284..696922.438 rows=4272433 loops=1)
-> Hash (cost=48702.76..48702.76 rows=2477776 width=20) (actual time=95497.307..95497.308 rows=2477921 loops=1)
Buckets: 65536 Batches: 64 Memory Usage: 2624kB
-> Seq Scan on property_attribute_2 b (cost=0.00..48702.76 rows=2477776 width=20) (actual time=464.476..94126.890 rows=2477921 loops=1)
Planning time: 4.034 ms
Execution time: 1065995.827 ms
And the query plan after the first run is:
Hash Right Join (cost=670010.33..983391.75 rows=2477776 width=185) (actual time=8828.873..13764.283 rows=2477924 loops=1)
Hash Cond: (c.property_id = a.id)
-> Seq Scan on property_address_2 c (cost=0.00..131660.48 rows=4257948 width=72) (actual time=0.050..1411.877 rows=4257973 loops=1)
-> Hash (cost=595483.13..595483.13 rows=2477776 width=117) (actual time=8826.620..8826.623 rows=2477921 loops=1)
Buckets: 32768 Batches: 128 Memory Usage: 3165kB
-> Hash Join (cost=94193.96..595483.13 rows=2477776 width=117) (actual time=1356.224..7925.850 rows=2477921 loops=1)
Hash Cond: (a.id = b.property_id)
-> Seq Scan on property_2 a (cost=0.00..265463.84 rows=6176884 width=105) (actual time=0.034..2652.013 rows=4272433 loops=1)
-> Hash (cost=48702.76..48702.76 rows=2477776 width=20) (actual time=1354.828..1354.829 rows=2477921 loops=1)
Buckets: 65536 Batches: 64 Memory Usage: 2624kB
-> Seq Scan on property_attribute_2 b (cost=0.00..48702.76 rows=2477776 width=20) (actual time=0.023..630.081 rows=2477921 loops=1)
Planning time: 1.181 ms
Execution time: 13872.977 ms
Also worth noting that I have a couple of other Postgres databases on this machine and different jobs use different tables on these databases on a regular basis.
If a cold cache is the problem, as seems to be the case, you can warm it up before running the query. Postgres ships with the additional module pg_prewarm, which provides a range of tools to populate the cache.
Instructions on how to set it up are here:
PostgreSQL: Force data into memory
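In short, the module only needs to be installed once per database (assuming the contrib modules are available on the server):
CREATE EXTENSION IF NOT EXISTS pg_prewarm;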
Then you run something like:
SELECT pg_prewarm('public.property_2', 'prefetch');
SELECT pg_prewarm('public.property_attribute_2', 'prefetch');
SELECT pg_prewarm('public.property_address_2', 'prefetch');
Of course, if you always run the same SELECT query without filter predicates, you might as well just run the same query to populate the cache, without using the fancy module. Possibly scheduled with a cron job?
... are all indexed.
As you can see in the EXPLAIN output, your indexes go unused. You fetch all rows without a filter predicate, so indexes typically won't help. And you do it with SELECT *, i.e. you get all columns from all joined tables, so index-only scans are out, too. You might improve performance by listing only the columns you actually need in the SELECT list.
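For example, instead of SELECT * (the column names below are hypothetical placeholders):
SELECT a.id, b.some_attribute, c.some_address_field   -- only the columns you actually need (hypothetical names)
FROM public.property_2 a
INNER JOIN public.property_attribute_2 b ON a.id = b.property_id
LEFT JOIN public.property_address_2 c ON a.id = c.property_id;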
Obviously, more RAM (and proper configuration for PostgreSQL buffer cache) can help, too.
Or you might be able to reduce RAM requirements with VACUUM (FULL) or, possibly, with an optimized table definition with proper column types and order. Not just for the tables at hand, also for other tables competing for the same resources (thereby evicting "your" blocks from the cache). See:
Calculating and saving space in PostgreSQL
The difference must be caching: the first time, the data are read from disk, in subsequent runs they are found in RAM. Run EXPLAIN (ANALYZE, BUFFERS) with track_io_timing = on to confirm that.
However, it seems that either your I/O system is really slow or your tables are quite bloated. EXPLAIN (ANALYZE, BUFFERS) would show how many blocks are read, so you would know.
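Something like this in the session you test from (changing track_io_timing may require superuser privileges):
SET track_io_timing = on;   -- adds I/O timing figures to the plan output
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM public.property_2 a
INNER JOIN public.property_attribute_2 b ON a.id = b.property_id
LEFT JOIN public.property_address_2 c ON a.id = c.property_id;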
If bloat is indeed your problem, VACUUM (FULL) would help.
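For example (be aware that VACUUM (FULL) rewrites the whole table and holds an exclusive lock while doing so):
VACUUM (FULL, ANALYZE) public.property_2;
VACUUM (FULL, ANALYZE) public.property_attribute_2;
VACUUM (FULL, ANALYZE) public.property_address_2;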

Transaction is 20x slower on production server

On my development server the test transaction (a series of updates etc.) runs in about 2 minutes. On the production server it takes about 25 minutes.
The server reads the file and inserts records. It starts out fast but then goes slower and slower as it progresses. There is an aggregate table update for each record that gets inserted and it is that update that progressively slows down. That aggregate update does query the table being written to with the inserts.
The config is only different in max_worker_processes (dev 8, prod 16), shared_buffers (dev 128MB, prod 512MB), and wal_buffers (dev 4MB, prod 16MB).
I've tried tweaking a few configs and also dumped the whole database and re-did initdb just in case it was not upgraded (to 9.6) correctly. Nothing's worked.
I'm hoping that someone with experience in this could tell me what to look for.
Edit: After receiving some comments I was able to figure out what is going on and get a workaround going, but I think there has to be a better way. First, what is happening is this:
At first there is no data in the table for the relevant index values, and PostgreSQL works out this plan. Note that there is data in the table, just nothing with the relevant "businessIdentifier" or "transactionNumber" values.
Aggregate (cost=16.63..16.64 rows=1 width=4) (actual time=0.031..0.031 rows=1 loops=1)
-> Nested Loop (cost=0.57..16.63 rows=1 width=4) (actual time=0.028..0.028 rows=0 loops=1)
-> Index Scan using transactionlinedateindex on "transactionLine" ed (cost=0.29..8.31 rows=1 width=5) (actual time=0.028..0.028 rows=0 loops=1)
Index Cond: ((("businessIdentifier")::text = '36'::text) AND ("reconciliationNumber" = 4519))
-> Index Scan using transaction_pkey on transaction eh (cost=0.29..8.31 rows=1 width=9) (never executed)
Index Cond: ((("businessIdentifier")::text = '36'::text) AND (("transactionNumber")::text = (ed."transactionNumber")::text))
Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
Planning time: 0.915 ms
Execution time: 0.100 ms
Then, as data gets inserted, it becomes a really bad plan: 474 ms in this example. It needs to execute thousands of times depending on what is uploaded, so 474 ms is bad.
Aggregate (cost=16.44..16.45 rows=1 width=4) (actual time=474.222..474.222 rows=1 loops=1)
-> Nested Loop (cost=0.57..16.44 rows=1 width=4) (actual time=474.218..474.218 rows=0 loops=1)
Join Filter: ((eh."transactionNumber")::text = (ed."transactionNumber")::text)
-> Index Scan using transaction_pkey on transaction eh (cost=0.29..8.11 rows=1 width=9) (actual time=0.023..0.408 rows=507 loops=1)
Index Cond: (("businessIdentifier")::text = '37'::text)
Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
-> Index Scan using transactionlineprovdateindex on "transactionLine" ed (cost=0.29..8.31 rows=1 width=5) (actual time=0.934..0.934 rows=0 loops=507)
Index Cond: (("businessIdentifier")::text = '37'::text)
Filter: ("reconciliationNumber" = 4519)
Rows Removed by Filter: 2520
Planning time: 0.848 ms
Execution time: 474.278 ms
VACUUM ANALYZE fixes it. But you cannot run VACUUM ANALYZE until after the transaction is committed. After VACUUM ANALYZE, PostgreSQL uses a different plan and it's back down to 0.1 ms.
Aggregate (cost=16.63..16.64 rows=1 width=4) (actual time=0.072..0.072 rows=1 loops=1)
-> Nested Loop (cost=0.57..16.63 rows=1 width=4) (actual time=0.069..0.069 rows=0 loops=1)
-> Index Scan using transactionlinedateindex on "transactionLine" ed (cost=0.29..8.31 rows=1 width=5) (actual time=0.067..0.067 rows=0 loops=1)
Index Cond: ((("businessIdentifier")::text = '37'::text) AND ("reconciliationNumber" = 4519))
-> Index Scan using transaction_pkey on transaction eh (cost=0.29..8.31 rows=1 width=9) (never executed)
Index Cond: ((("businessIdentifier")::text = '37'::text) AND (("transactionNumber")::text = (ed."transactionNumber")::text))
Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
Planning time: 1.134 ms
Execution time: 0.141 ms
My workaround is to commit after about 100 inserts, run VACUUM ANALYZE, and then continue. The only problem is that if something in the remainder of the data fails and is rolled back, those 100 records will still have been inserted.
Is there a better way to handle this? Should I just upgrade to version 10 or 11 of PostgreSQL, and would that help?
There is an aggregate table update for each record that gets inserted and it is that update that progressively slows down.
Here is an idea: change the workflow to (1) import the external data into a staging table using the COPY interface, (2) index and ANALYZE that data, (3) run the final UPDATE with all required joins/groupings to do the actual transformation and update the aggregate table.
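A minimal sketch of that workflow; the staging table, file path, and the final aggregate statement are placeholders:
BEGIN;
-- (1) bulk-load the external data into a staging table
CREATE TEMP TABLE staging (LIKE "transactionLine");
COPY staging FROM '/path/to/upload.csv' WITH (FORMAT csv, HEADER);
-- (2) index and ANALYZE the staged data
CREATE INDEX ON staging ("businessIdentifier", "reconciliationNumber");
ANALYZE staging;
-- (3) one set-based pass instead of per-row aggregate updates
INSERT INTO "transactionLine" SELECT * FROM staging;
-- UPDATE <aggregate table> ... FROM (SELECT ... FROM staging GROUP BY ...) s WHERE ...;
COMMIT;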
All of that could be done in one, long transaction - if needed.
Only if the whole thing is locking some vital database objects for too long, you should consider splitting this into separate transactions / batches (processing data partitioned in some generic way, by date/time or by ID).
But you cannot run Vacuum analyze until after the transaction is committed.
To get updated statistics for query planning, you only need ANALYZE, not VACUUM.
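Unlike VACUUM, plain ANALYZE can also be run inside a transaction block, so you could refresh the statistics mid-transaction, e.g.:
ANALYZE "transactionLine";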

Postgresql 9.x: Index to optimize `xpath_exists` (XMLEXISTS) queries

We have queries of the form
select sum(acol)
where xpath_exists('/Root/KeyValue[Key="val"]/Value//text()', xmlcol)
What index can be built to speed up the WHERE clause?
A btree index created using
create index idx_01 using btree(xpath_exists('/Root/KeyValue[Key="val"]/Value//text()', xmlcol))
does not seem to be used at all.
EDIT
With enable_seqscan set to off, the query using xpath_exists is much faster (by an order of magnitude) and the plan clearly shows the corresponding index being used (the btree index built with xpath_exists).
Any clue why PostgreSQL would not use the index and instead attempts a much slower sequential scan?
Since I do not want to disable sequential scanning globally, I am back to square one and happily welcome suggestions.
EDIT 2 - Explain plans
See below - Cost of first plan (seqscan off) is slightly higher but processing time much faster
b2box=# set enable_seqscan=off;
SET
b2box=# explain analyze
Select count(*)
from B2HEAD.item
where cluster = 'B2BOX' and ( ( xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()', content) ) ) offset 0 limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=22766.63..22766.64 rows=1 width=0) (actual time=606.042..606.042 rows=1 loops=1)
-> Aggregate (cost=22766.63..22766.64 rows=1 width=0) (actual time=606.039..606.039 rows=1 loops=1)
-> Bitmap Heap Scan on item (cost=1058.65..22701.38 rows=26102 width=0) (actual time=3.290..603.823 rows=4085 loops=1)
Filter: (xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()'::text, content, '{}'::text[]) AND ((cluster)::text = 'B2BOX'::text))
-> Bitmap Index Scan on item_counter_01 (cost=0.00..1052.13 rows=56515 width=0) (actual time=2.283..2.283 rows=4085 loops=1)
Index Cond: (xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()'::text, content, '{}'::text[]) = true)
Total runtime: 606.136 ms
(7 rows)
plan on explain.depesz.com
b2box=# set enable_seqscan=on;
SET
b2box=# explain analyze
Select count(*)
from B2HEAD.item
where cluster = 'B2BOX' and ( ( xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()', content) ) ) offset 0 limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=22555.71..22555.72 rows=1 width=0) (actual time=10864.163..10864.163 rows=1 loops=1)
-> Aggregate (cost=22555.71..22555.72 rows=1 width=0) (actual time=10864.160..10864.160 rows=1 loops=1)
-> Seq Scan on item (cost=0.00..22490.45 rows=26102 width=0) (actual time=33.574..10861.672 rows=4085 loops=1)
Filter: (xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()'::text, content, '{}'::text[]) AND ((cluster)::text = 'B2BOX'::text))
Rows Removed by Filter: 108945
Total runtime: 10864.242 ms
(6 rows)
plan on explain.depesz.com
Planner cost parameters
Cost of first plan (seqscan off) is slightly higher but processing time much faster
This tells me that your random_page_cost and seq_page_cost are probably wrong. You're likely on storage with fast random I/O - either because most of the database is cached in RAM or because you're using SSD, SAN with cache, or other storage where random I/O is inherently fast.
Try:
SET random_page_cost = 1;
SET seq_page_cost = 1.1;
to greatly reduce the cost parameter differences and then re-run. If that does the job, consider changing those params in postgresql.conf.
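If it helps and you prefer not to edit postgresql.conf, the same settings can also be applied per database (the database name b2box is taken from your prompt):
ALTER DATABASE b2box SET random_page_cost = 1;
ALTER DATABASE b2box SET seq_page_cost = 1.1;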
Your row-count estimates are reasonable, so it doesn't look like a planner mis-estimation problem or a problem with bad table statistics.
Incorrect query
Your query is also incorrect. OFFSET 0 LIMIT 1 without an ORDER BY will produce unpredictable results unless you're guaranteed to have exactly one match, in which case the OFFSET ... LIMIT ... clauses are unnecessary and can be removed entirely.
You're usually much better off phrasing such queries as SELECT max(...) or SELECT min(...) where possible; PostgreSQL will tend to be able to use an index to just pluck off the desired value without doing an expensive table scan or an index scan and sort.
Tips
BTW, for future questions the PostgreSQL wiki has some good information in the performance category and a guide to asking Slow query questions.

Trivial order by double type: performance crash

Characters:
id BIGINT
geo_point POINT (PostGIS)
stroke_when TIMESTAMPTZ (indexed!)
stroke_when_second DOUBLE PRECISION
PostgreSQL 9.1, PostGIS 2.0.
1. Query:
SELECT ST_AsText(geo_point)
FROM lightnings
ORDER BY stroke_when DESC, stroke_when_second DESC
LIMIT 1
Total runtime: 31100.911 ms !
EXPLAIN (ANALYZE on, VERBOSE off, COSTS on, BUFFERS on):
Limit (cost=169529.67..169529.67 rows=1 width=144) (actual time=31100.869..31100.869 rows=1 loops=1)
Buffers: shared hit=3343 read=120342
-> Sort (cost=169529.67..176079.48 rows=2619924 width=144) (actual time=31100.865..31100.865 rows=1 loops=1)
Sort Key: stroke_when, stroke_when_second
Sort Method: top-N heapsort Memory: 17kB
Buffers: shared hit=3343 read=120342
-> Seq Scan on lightnings (cost=0.00..156430.05 rows=2619924 width=144) (actual time=1.589..29983.410 rows=2619924 loops=1)
Buffers: shared hit=3339 read=120342
2. Selecting another field:
SELECT id
FROM lightnings
ORDER BY stroke_when DESC, stroke_when_second DESC
LIMIT 1
Total runtime: 2144.057 ms.
EXPLAIN (ANALYZE on, VERBOSE off, COSTS on, BUFFERS on):
Limit (cost=162979.86..162979.86 rows=1 width=24) (actual time=2144.013..2144.014 rows=1 loops=1)
Buffers: shared hit=3513 read=120172
-> Sort (cost=162979.86..169529.67 rows=2619924 width=24) (actual time=2144.011..2144.011 rows=1 loops=1)
Sort Key: stroke_when, stroke_when_second
Sort Method: top-N heapsort Memory: 17kB
Buffers: shared hit=3513 read=120172
-> Seq Scan on lightnings (cost=0.00..149880.24 rows=2619924 width=24) (actual time=0.056..1464.904 rows=2619924 loops=1)
Buffers: shared hit=3509 read=120172
3. Correct optimization:
SELECT id
FROM lightnings
ORDER BY stroke_when DESC
LIMIT 1
Total runtime: 0.044 ms
EXPLAIN (ANALYZE on, VERBOSE off, COSTS on, BUFFERS on):
Limit (cost=0.00..3.52 rows=1 width=16) (actual time=0.020..0.020 rows=1 loops=1)
Buffers: shared hit=5
-> Index Scan Backward using lightnings_idx on lightnings (cost=0.00..9233232.80 rows=2619924 width=16) (actual time=0.018..0.018 rows=1 loops=1)
Buffers: shared hit=5
As you can see, there are two bad and very different problems here, even though the query is quite primitive and the SQL optimizer could use an index:
Even if the optimizer doesn't use the index, why does selecting ST_AsText(geo_point) instead of id take so much more time? There is only one row in the result!
The index on the first ORDER BY column cannot be used when an unindexed field is also present in the ORDER BY. Note that in practice there are only a few rows for each second in the database.
Of course, the above is a simplified query, extracted from a more complex construction. Usually I select rows by date range, applying complicated filters.
PostgreSQL can't use your index to produce values in the desired order for the first two queries. When two or more rows have identical stroke_when values they are returned from the index scan in arbitrary order, so deciding the correct order for those rows would require a secondary sorting pass. Because the PostgreSQL executor doesn't have a facility to perform that secondary sort on top of an index scan, it falls back to a full sort.
If you regularly need to query the table with that order then replace your current index with a composite index that includes both columns.
You can transform your current query into a form that explicitly restricts the secondary sort to the rows with the largest value of stroke_when:
SELECT ST_AsText(geo_point)
FROM lightnings
WHERE stroke_when = (SELECT max(stroke_when) FROM lightnings)
ORDER BY stroke_when_second DESC
LIMIT 1;
A first step could be to create a composite index on {stroke_when, stroke_when_second}.
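A minimal sketch (the index name is just an example):
CREATE INDEX lightnings_when_idx
    ON lightnings (stroke_when, stroke_when_second);
Scanned backwards, such an index matches ORDER BY stroke_when DESC, stroke_when_second DESC, so the LIMIT 1 can be answered without a full sort.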