I'm trying to figure out what is causing our Aurora RDS Postgres database's performance to deteriorate over time.
I have a fairly big table in a multi-tenant architecture, but SELECTs on this table became slow for only one tenant.
I have tried running REINDEX on the table and on the schema, and I have also run VACUUM on that table, but I can't see any improvement.
Strangely, if I do a backup and restore of this schema (so the tables are the same size), it is really fast for a few days, then it starts to deteriorate again.
There are a few Spark batch jobs that ingest data and populate the schema. I don't know their details, but I assume something happens in the ingestion process.
What I'm trying to understand is: what does a backup and restore do beyond REINDEX and VACUUM? What am I missing?
EXPLAIN (ANALYZE, BUFFERS) SELECT...
"Limit (cost=0.56..14527.17 rows=1 width=99) (actual time=12056.841..12056.842 rows=1 loops=1)"
" Buffers: shared hit=62835 read=34334"
" -> Index Scan using idx_namekey on delqmst (cost=0.56..7379519.03 rows=508 width=99) (actual time=12056.840..12056.840 rows=1 loops=1)"
" Filter: ((upper((dmacctg)::text) ~~ '6'::text) AND (upper((dmacct)::text) ~~ '0000000000025454448'::text))"
" Rows Removed by Filter: 793350"
" Buffers: shared hit=62835 read=34334"
"Planning time: 11.435 ms"
"Execution time: 12056.876 ms"
As you can see, it takes around 12 seconds; after a backup and restore of the schema, it takes around 1 second.
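One way to see what accumulates between restores is to measure bloat and dead tuples directly. A hedged sketch (the pgstattuple extension is standard contrib; the table name delqmst is taken from the plan above, so adjust to your schema):

```sql
-- Check physical bloat of the slow tenant's table (requires the pgstattuple extension)
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT * FROM pgstattuple('delqmst');

-- Check dead-tuple counts and when autovacuum/autoanalyze last ran
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'delqmst';
```

If dead_tuple_percent grows in step with the slowdown, the Spark ingestion jobs are likely updating or deleting rows faster than autovacuum can keep up; a restore rewrites the table compactly, which would explain why it is fast for a few days.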
Related
In a location-based app, there's a specific query which has to run fast:
SELECT count(*) FROM users
WHERE earth_box(ll_to_earth(40.71427000, -74.00597000), 50000) #> ll_to_earth(latitude, longitude)
However, after copying the database with Postgres' own tools:
pg_dump dummy_users > dummy_users.dump
createdb slow_db
psql slow_db < dummy_users.dump
the query takes 2.5 seconds on slow_db instead of 0.5 seconds!
The planner chooses a different plan in slow_db, e.g.
Explain analyze on slow_db:
"Aggregate (cost=10825.18..10825.19 rows=1 width=8) (actual time=2164.396..2164.396 rows=1 loops=1)"
" -> Bitmap Heap Scan on users (cost=205.45..10818.39 rows=2714 width=0) (actual time=26.188..2155.680 rows=122836 loops=1)"
" Recheck Cond: ('(1281995.9045467733, -4697354.822067326, 4110397.4955141144),(1381995.648489849, -4597355.078124251, 4210397.23945719)'::cube #> (ll_to_earth(latitude, longitude))::cube)"
" Rows Removed by Index Recheck: 364502"
" Heap Blocks: exact=57514 lossy=33728"
" -> Bitmap Index Scan on distance_index (cost=0.00..204.77 rows=2714 width=0) (actual time=20.068..20.068 rows=122836 loops=1)"
" Index Cond: ((ll_to_earth(latitude, longitude))::cube <# '(1281995.9045467733, -4697354.822067326, 4110397.4955141144),(1381995.648489849, -4597355.078124251, 4210397.23945719)'::cube)"
"Planning Time: 1.002 ms"
"Execution Time: 2164.807 ms"
explain analyze on the origin db:
"Aggregate (cost=8807.01..8807.02 rows=1 width=8) (actual time=239.524..239.525 rows=1 loops=1)"
" -> Index Scan using distance_index on users (cost=0.41..8801.69 rows=2130 width=0) (actual time=0.156..233.760 rows=122836 loops=1)"
" Index Cond: ((ll_to_earth(latitude, longitude))::cube <# '(1281995.9045467733, -4697354.822067326, 4110397.4955141144),(1381995.648489849, -4597355.078124251, 4210397.23945719)'::cube)"
"Planning Time: 3.928 ms"
"Execution Time: 239.546 ms"
For both tables there's an index on the location which was created in the exact same way:
CREATE INDEX distance_index ON users USING gist (ll_to_earth(latitude, longitude));
I've tried running maintenance tools (ANALYZE/VACUUM etc.) before and after running that query, with or without the index; it doesn't help!
Both DBs run on the exact same machine (so same Postgres server, same distribution, same configuration).
The data in both DBs is the same (one single table) and isn't changing.
The Postgres version is 12.8.
psql's \l output for those databases:
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-------------+----------+----------+---------+-------+-----------------------
dummy_users | yoni | UTF8 | en_IL | en_IL |
slow_db | yoni | UTF8 | en_IL | en_IL |
What is going on?
(Thanks to Laurenz Albe) After running
SET enable_bitmapscan = off; and SET enable_seqscan = off; on the slow database, I ran the query again; here is the EXPLAIN (ANALYZE, BUFFERS) output:
"Aggregate (cost=11018.63..11018.64 rows=1 width=8) (actual time=213.544..213.545 rows=1 loops=1)"
" Buffers: shared hit=11667 read=110537"
" -> Index Scan using distance_index on users (cost=0.41..11011.86 rows=2711 width=0) (actual time=0.262..207.164 rows=122836 loops=1)"
" Index Cond: ((ll_to_earth(latitude, longitude))::cube <# '(1282077.0159892815, -4697331.573647572, 4110397.4955141144),(1382076.7599323571, -4597331.829704497, 4210397.23945719)'::cube)"
" Buffers: shared hit=11667 read=110537"
"Planning Time: 0.940 ms"
"Execution Time: 213.591 ms"
Manual VACUUM / ANALYZE after restore
After restoring a new database, there are no column statistics yet. Normally, autovacuum will kick in eventually, but since "data [...] isn't changing", autovacuum wouldn't be triggered.
For the same reason (data isn't changing), I suggest running this once after restoring your single table:
VACUUM (ANALYZE, FREEZE) users;
You might as well run FREEZE for a table that's never changed.
(FULL isn't necessary, since there are no dead tuples in a freshly restored table.)
Explanation for the plan change
With everything else being equal, I suspect at least two major problems:
Bad column statistics
Bad database configuration (the more severe problem)
See:
Keep PostgreSQL from sometimes choosing a bad query plan
In the slow DB, Postgres expects rows=2714, while it expects rows=2130 in the fast one. The difference may not seem huge, but may well be enough to tip Postgres over to the other query plan (that turns out to be inferior).
Seeing that Postgres actually finds rows=122836, both estimates are bad; the one in the slow DB is actually less bad. But the bitmap scan turns out to be slower than the index scan, even with many more qualifying rows than expected. (!) So your database configuration is most probably way off.
The main problem is typically the default random_page_cost of 4, while a realistic setting for a fully cached, read-only table is much closer to 1, maybe 1.1 to allow for some additional cost. A couple of other settings also encourage index scans, like effective_cache_size. Start here:
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
Estimates are just that: estimates. And column statistics are also just that: statistics. They are not exact, but subject to random variation. You might increase the statistics target to improve the validity of the column statistics.
Cheap random reads favor index scans and discourage bitmap index scans.
More qualifying rows favor a bitmap index scan; fewer favor a plain index scan. See:
Postgres not using index when index scan is much better option
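A minimal sketch of the suggested tuning, assuming a mostly cached, read-only database (the values are illustrative, not prescriptive):

```sql
-- Cheapen random reads to reflect cached/SSD storage (default is 4)
ALTER DATABASE slow_db SET random_page_cost = 1.1;
-- Tell the planner how much data the OS cache can hold (illustrative value)
ALTER DATABASE slow_db SET effective_cache_size = '8GB';

-- Optionally gather finer column statistics (default target is 100), then re-analyze
ALTER TABLE users ALTER COLUMN latitude SET STATISTICS 500;
ALTER TABLE users ALTER COLUMN longitude SET STATISTICS 500;
ANALYZE users;
```

Note that ALTER DATABASE ... SET only takes effect for new sessions, so reconnect before re-testing the query.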
I have a table with 4707838 rows. When I run the following query on this table it takes around 9 seconds to execute.
SELECT json_agg(
         json_build_object(
           'accessorId', p."accessorId",
           'mobile', json_build_object(
             'enabled', p.mobile,
             'settings', json_build_object(
               'proximityAccess', p."proximity",
               'tapToAccess', p."tapToAccess",
               'clickToAccessRange', p."clickToAccessRange",
               'remoteAccess', p."remote")),
           'card', json_build_object('enabled', p."card"),
           'fingerprint', json_build_object('enabled', p."fingerprint"))
       ) AS permissions
FROM permissions AS p
WHERE p."accessPointId" = 99;
The output of explain analyze is as follows:
Aggregate (cost=49860.12..49860.13 rows=1 width=32) (actual time=9011.711..9011.712 rows=1 loops=1)
Buffers: shared read=29720
I/O Timings: read=8192.273
-> Bitmap Heap Scan on permissions p (cost=775.86..49350.25 rows=33991 width=14) (actual time=48.886..8704.470 rows=36556 loops=1)
Recheck Cond: ("accessPointId" = 99)
Heap Blocks: exact=29331
Buffers: shared read=29720
I/O Timings: read=8192.273
-> Bitmap Index Scan on composite_key_accessor_access_point (cost=0.00..767.37 rows=33991 width=0) (actual time=38.767..38.768 rows=37032 loops=1)
Index Cond: ("accessPointId" = 99)
Buffers: shared read=105
I/O Timings: read=32.592
Planning Time: 0.142 ms
Execution Time: 9012.719 ms
This table has a btree index on the accessorId column and a composite index on (accessorId, accessPointId).
Can anyone tell me why this query is slow even though it uses an index?
Over 90% of the time is spent waiting to get data from disk. At 3.6 ms per read, that is pretty fast for a hard drive (suggesting that much of the data was already in the filesystem cache, or that some of the reads brought in neighboring data that was also eventually required; that is, sequential reads rather than just random reads), but slow for an SSD.
If you set enable_bitmapscan=off and clear the cache (or pick a not recently used "accessPointId" value) what performance do you get?
How big is the table? If you are reading a substantial fraction of the table and think you are not getting as much benefit from sequential reads as you should be, you can try making your OS's read-ahead settings more aggressive. On Linux that is something like sudo blockdev --setra ...
You could put all columns referred to by the query into the index to enable index-only scans, but given the number of columns you are using, that might be impractical. You would want "accessPointId" to be the first column in the index. By the way, is the index really on (accessorId, accessPointId)? It looks to me like "accessPointId" is actually the first column in that index, not the second.
You could cluster the table by an index which has "accessPointId" as the first column. That would group the related records together for faster access. But note that it is a slow operation and takes a strong lock on the table while it is running, and future data going into the table won't be clustered, only the current data.
You could try to increase effective_io_concurrency so that you can have multiple io requests outstanding at a time. How effective this is will depend on your hardware.
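The suggestions above could be sketched like this (the index name and setting values are assumptions, not tested against this schema):

```sql
-- An index led by "accessPointId", matching the query's WHERE clause
CREATE INDEX permissions_access_point_idx ON permissions ("accessPointId");

-- Physically order the table by that index so matching rows are contiguous.
-- Caution: CLUSTER takes an ACCESS EXCLUSIVE lock and rewrites the table.
CLUSTER permissions USING permissions_access_point_idx;
ANALYZE permissions;

-- Allow more concurrent I/O requests (useful on SSDs; tune to your hardware)
SET effective_io_concurrency = 200;
```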
The query below is taking a long time to run. How can I optimize it to handle more records? I have run EXPLAIN ANALYZE for this query and attached the output below.
This is an existing query, created as a view, that takes hours to return a result.
I have run VACUUM, ANALYZE and REINDEX on these two tables, but no luck.
select st_tr.step_trail_id,
st_tr.test_id,
st_tr.trail_id,
st_tr.step_name,
filter.regular_expression as filter_expression,
filter.order_of_occurrence as filter_order,
filter.match_type as filter_match_type,
null as begins_with,
null as ends_with,
null as input_source,
null as pattern_expression,
null as pattern_matched,
null as pattern_status,
null as pattern_order,
'filter' as record_type
from tab_report_step st_tr,
tab_report_filter filter
where st_tr.st_tr_id = filter.st_tr_id;
Query plan:
Hash Join (cost=446852.58..1176380.76 rows=6353676 width=489) (actual time=16641.953..47270.831 rows=6345360 loops=1)
Buffers: shared hit=1 read=451605 dirtied=5456 written=5424, temp read=154080 written=154074
-> Seq Scan on tab_report_filter filter (cost=0..24482.76 rows=6353676 width=161) (actual time=0.041..8097.233 rows=6345360 loops=1)
Buffers: shared read=179946 dirtied=4531 written=4499
-> Hash (cost=318817.7..318817.7 rows=4716070 width=89) (actual time=16627.291..16627.291 rows=4709040 loops=1)
Buffers: shared hit=1 read=271656 dirtied=925 written=925, temp written=47629
-> Seq Scan on tab_report_step st_tr (cost=0..318817.7 rows=4716070 width=89) (actual time=0.059..10215.484 rows=4709040 loops=1)
Buffers: shared hit=1 read=271656 dirtied=925 written=925
You have not run a plain VACUUM on these tables. Perhaps you ran VACUUM (FULL), but certainly not a plain VACUUM.
There are two things that can be improved:
Make sure that no pages have to be dirtied or written while you read them. The writes most likely happen because this is the first time you read the rows, and PostgreSQL sets hint bits.
Running VACUUM (without FULL) would have fixed that. Also, if you repeat the experiment, you shouldn't see those dirtied and written buffers any more.
Give the query more memory by increasing work_mem. The hash does not fit in work_mem and spills to disk, which causes extra disk reads and writes, which is bad for performance.
Since you join two big tables with no restricting WHERE conditions and have a lot of result rows, this query will never be fast.
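Both fixes can be sketched as follows (the work_mem value is an assumption; size it to your available RAM and expected concurrency):

```sql
-- Set hint bits once, so future reads don't dirty or write any buffers
VACUUM tab_report_step;
VACUUM tab_report_filter;

-- Give this session enough memory so the hash fits without spilling to disk
SET work_mem = '512MB';
```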
I'm running Postgres 11.
I have a table with 1,000,000 (1 million) rows, and each row has a size of 40 bytes (it contains 5 columns). That is equal to 40MB.
When I execute the following (directly on the DB via DBeaver, DataGrip etc., not called via Node, Python etc.):
SELECT * FROM TABLE
it takes 40 seconds the first time (isn't this very slow, even for a first run?).
The CREATE statement of my tables:
CREATE TABLE public.my_table_1 (
c1 int8 NOT NULL GENERATED ALWAYS AS IDENTITY,
c2 int8 NOT NULL,
c3 timestamptz NULL,
c4 float8 NOT NULL,
c5 float8 NOT NULL,
	CONSTRAINT my_table_1_pkey PRIMARY KEY (c1)
);
CREATE INDEX my_table_1_c3_idx ON public.my_table_1 USING btree (c3);
CREATE UNIQUE INDEX my_table_1_c2_idx ON public.my_table_1 USING btree (c2);
On 5 random tables: EXPLAIN (ANALYZE, BUFFERS) select * from [table_1...2,3,4,5]
Seq Scan on table_1 (cost=0.00..666.06 rows=34406 width=41) (actual time=0.125..7.698 rows=34406 loops=1)
Buffers: shared read=322
Planning Time: 15.521 ms
Execution Time: 10.139 ms
Seq Scan on table_2 (cost=0.00..9734.87 rows=503187 width=41) (actual time=0.103..57.698 rows=503187 loops=1)
Buffers: shared read=4703
Planning Time: 14.265 ms
Execution Time: 74.240 ms
Seq Scan on table_3 (cost=0.00..3486217.40 rows=180205440 width=41) (actual time=0.022..14988.078 rows=180205379 loops=1)
Buffers: shared hit=7899 read=1676264
Planning Time: 0.413 ms
Execution Time: 20781.303 ms
Seq Scan on table_4 (cost=0.00..140219.73 rows=7248073 width=41) (actual time=13.638..978.125 rows=7247991 loops=1)
Buffers: shared hit=7394 read=60345
Planning Time: 0.246 ms
Execution Time: 1264.766 ms
Seq Scan on table_5 (cost=0.00..348132.60 rows=17995260 width=41) (actual time=13.648..2138.741 rows=17995174 loops=1)
Buffers: shared hit=82 read=168098
Planning Time: 0.339 ms
Execution Time: 2730.355 ms
When I add a LIMIT 1,000,000 on table_5 (it contains about 18 million rows):
Limit (cost=0.00..19345.79 rows=1000000 width=41) (actual time=0.007..131.939 rows=1000000 loops=1)
Buffers: shared hit=9346
-> Seq Scan on table_5(cost=0.00..348132.60 rows=17995260 width=41) (actual time=0.006..68.635 rows=1000000 loops=1)
Buffers: shared hit=9346
Planning Time: 0.048 ms
Execution Time: 164.133 ms
When I add a WHERE clause between two dates (I monitored the query below with DataDog, and the results are here (max. ~31K rows/sec when fetching): https://www.screencast.com/t/yV0k4ShrUwSd):
Seq Scan on table_5 (cost=0.00..438108.90 rows=17862027 width=41) (actual time=0.026..2070.047 rows=17866766 loops=1)
Filter: (('2018-01-01 00:00:00+04'::timestamp with time zone < matchdate) AND (matchdate < '2020-01-01 00:00:00+04'::timestamp with time zone))
Rows Removed by Filter: 128408
Buffers: shared hit=168180
Planning Time: 14.820 ms
Execution Time: 2673.171 ms
All tables have a unique index on the c3 column.
The size of the database is like 500GB in total.
The server has 16 cores and 112GB M2 memory.
I have tried to optimize Postgres system variables, like work_mem (1GB), shared_buffers (50GB), effective_cache_size (20GB), but it doesn't seem to change anything (I know the settings have been applied, because I can see a big difference in the amount of idle memory the server has allocated).
I know the database is too big for all data to be in memory. But is there anything I can do to boost the performance / speed of my query?
Make sure CreatedDate is indexed.
Make sure CreatedDate is using the date column type. This will be more efficient on storage (just 4 bytes), performance, and you can use all the built in date formatting and functions.
Avoid select * and only select the columns you need.
Use YYYY-MM-DD ISO 8601 format. This has nothing to do with performance, but it will avoid a lot of ambiguity.
The real problem is likely that you have thousands of tables with which you regularly make unions of hundreds of tables. This indicates a need to redesign your schema to simplify your queries and get better performance.
Unions and date-range checks suggest a lot of redundancy. Perhaps you've partitioned your tables by date manually. Postgres has its own built-in table partitioning, which might help.
Without more detail that's all I can say. Perhaps ask another question about your schema.
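If the schema really consists of many per-date tables glued together with UNIONs, native range partitioning (available since Postgres 10) is the usual replacement. A sketch with hypothetical table and column names:

```sql
-- One logical table, partitioned by date range, instead of many UNIONed tables
CREATE TABLE measurements (
    created_at timestamptz NOT NULL,
    value      float8 NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE measurements_2019 PARTITION OF measurements
    FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
CREATE TABLE measurements_2020 PARTITION OF measurements
    FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');

-- A query with a date filter scans only the relevant partitions
SELECT count(*) FROM measurements
WHERE created_at >= '2019-06-01' AND created_at < '2019-07-01';
```

The planner prunes partitions outside the WHERE range, so the query touches only the data it needs, without any hand-written UNION.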
Without seeing EXPLAIN (ANALYZE, BUFFERS), all we can do is speculate.
But we can do some pretty good speculation.
Cluster the tables on the index on CreatedDate. This will allow the data to be accessed more sequentially, allowing more read-ahead (though this might not help much for some kinds of storage). If the tables have a high write load, they may not stay clustered, so you would have to recluster them occasionally. If they are static, this could be a one-time event.
Get more RAM. If you want to perform as if all the data was in memory, then get all the data into memory.
Get faster storage, like top-notch SSD. It isn't as fast as RAM, but much faster than HDD.
I have a table (let's call it table_a). It has around 15 million rows. It's a simple table with a primary key.
Recently I created a backup table (let's call it table_a_bkp) and moved 12 million rows out of table_a.
I used simple SQL (DELETE/INSERT) to perform the task.
Both tables have the same structure and use the same tablespace.
The query speed of table_a didn't improve even though its row count dropped to a bit over 2 million.
In fact, table_a_bkp (12M rows) even has faster query speed than table_a (2M rows).
Checking pg_stat_all_tables, both tables seem to have been auto-vacuumed and analyzed after the deletion.
I expected table_a's query speed to improve, since it holds much less data now...
DB Version : PostgreSQL 9.1 hosted on Linux
EXPLAIN (backup table is faster than 1st table even rows is much larger) :
EXPLAIN (ANALYZE TRUE, COSTS TRUE, BUFFERS TRUE) select count(*) from txngeneral
"Aggregate (cost=742732.94..742732.95 rows=1 width=0) (actual time=73232.598..73232.599 rows=1 loops=1)"
" Buffers: shared hit=8910 read=701646"
" -> Seq Scan on txngeneral (cost=0.00..736297.55 rows=2574155 width=0) (actual time=17.614..72763.873 rows=2572550 loops=1)"
" Buffers: shared hit=8910 read=701646"
"Total runtime: 73232.647 ms"
EXPLAIN (ANALYZE TRUE, COSTS TRUE, BUFFERS TRUE) select count(*) from txngeneral_bkp
"Aggregate (cost=723840.13..723840.14 rows=1 width=0) (actual time=57134.270..57134.270 rows=1 loops=1)"
" Buffers: shared hit=96 read=569895"
" -> Seq Scan on txngeneral_bkp (cost=0.00..693070.30 rows=12307930 width=0) (actual time=5.436..54889.543 rows=12339180 loops=1)"
" Buffers: shared hit=96 read=569895"
"Total runtime: 57134.321 ms"
Resolved: VACUUM FULL did speed up table scan.
You should VACUUM ANALYZE your original table. (Probably a full VACUUM will be needed.) Since the new table was created and populated all at once, its tuples form more or less contiguous blocks, while the tuples of the original table are spread all over the disk.
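In line with the accepted fix, a minimal sketch:

```sql
-- Rewrite table_a compactly and refresh its statistics.
-- Caution: VACUUM FULL takes an ACCESS EXCLUSIVE lock while it rewrites the table.
VACUUM FULL ANALYZE table_a;
```

A plain VACUUM only marks dead space as reusable; after deleting 12 of 15 million rows, the table file stays the same size and a sequential scan still reads all of it, which is why only the full rewrite restored the scan speed.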