I am using PostgreSQL on Cloud SQL on GCP.
One table receives inserts almost continuously (theoretically more than 10 million rows per day).
Autovacuum ran when the data reached about 10 billion rows.
Also, while autovacuum was running, other processes could effectively be used by only a single user.
Perhaps this is the effect of vacuum freeze.
What is the exact cause?
I also figured that each run would finish sooner if autovacuum processed a smaller amount of data more frequently, so I modified the parameters as follows:
autovacuum_freeze_max_age : 2 billion -> 100 million
autovacuum_multixact_freeze_max_age : 2 billion -> 100 million
Are there any parameters that need to be modified to further increase performance?
Yes, those are the correct settings to make anti-wraparound autovacuum run more often, so that individual runs are smaller.
You can further improve matters for this table if you set vacuum_freeze_min_age to 0, so that all rows are frozen when autovacuum runs.
Note that you can set these parameters on a single table like this:
ALTER TABLE tab SET (
    autovacuum_freeze_max_age = 100000000,
    autovacuum_multixact_freeze_max_age = 100000000,
    vacuum_freeze_min_age = 0
);
That is better, because other tables in your database may be served better with the default settings for these parameters.
Note that an easy alternative to all this is to upgrade to PostgreSQL v13 or better, where autovacuum will run more often on insert-only tables for exactly this reason.
As always with VACUUM, setting maintenance_work_mem high will improve performance.
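For a manual run, a minimal sketch could look like this ("tab" is the table from the ALTER TABLE example above and the 1GB value is only illustrative; autovacuum itself uses autovacuum_work_mem if that is set, otherwise maintenance_work_mem):
SET maintenance_work_mem = '1GB';
VACUUM (FREEZE, VERBOSE) tab;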
Related
I have set maintenance_work_mem to 50GB and am running a manual VACUUM on a 200GB non-partitioned table with 10 indexes. It has been running for 3 days and still hasn't completed.
I know that this many indexes is bad, but I don't understand why the vacuum hasn't finished.
Note: no blocking observed.
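One hedged way to see where such a long-running VACUUM is spending its time (assuming PostgreSQL 9.6 or later) is the progress view; index_vacuum_count in particular shows how many index cleanup passes have happened, which with 10 indexes is usually where the time goes:
SELECT pid,
       relid::regclass AS table_name,
       phase,
       heap_blks_scanned,
       heap_blks_total,
       index_vacuum_count,
       num_dead_tuples,
       max_dead_tuples
FROM pg_stat_progress_vacuum;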
I know that count(*) in Postgres is generally slow; however, I have a database where it's super slow. I'm talking about minutes, even hours.
There are approximately 40M rows in the table, which consists of 29 columns (most of them are text, 4 are double precision). There is an index on one column which should be unique, and I've already run VACUUM FULL. It took around one hour to complete, but with no observable results.
The database uses a dedicated server with 32GB RAM. I set shared_buffers to 8GB and work_mem to 80MB but saw no speed improvement. I'm aware there are techniques to get an approximate count or to use an external table to keep the count, but I'm not interested in the count specifically; I'm more concerned about performance in general, since right now it's awful. When I run the count there are no CPU peaks or anything. Could someone point out where to look? Can it be that the data is structured so badly that 40M rows are too much for Postgres to handle?
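A hedged first diagnostic, assuming a placeholder table name of big_table: check how many buffers the count actually touches, and compare the table's physical size with its live and dead row counts, which usually reveals whether bloat is the culprit.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM big_table;

SELECT pg_size_pretty(pg_table_size('big_table')) AS table_size,
       n_live_tup,
       n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'big_table';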
I have a use case where 100,000 rows per minute are inserted, and at the same time, on the other end, a few threads take rows and delete them from my table. So it will definitely create a lot of dead tuples in my table.
My autovacuum configuration is:
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = -1
From pg_stat_user_tables I can see that autovacuum is running on my table, but within a few hours my disk (500 GB) fills up and I'm not able to insert any new rows.
On the second try, I changed the following configuration:
autovacuum_naptime = 60min
autovacuum_vacuum_cost_delay = 0
This time my simulation and autovacuum run well and the maximum disk usage is 180 GB.
My doubt is: if I change autovacuum_vacuum_cost_delay to zero ms, how does autovacuum free the dead tuple space so that Postgres can reuse it? And why does it not work as intended when I set the value to 20 ms?
If I change autovacuum_vacuum_cost_delay to zero ms, how does autovacuum free the dead tuple space so that Postgres can reuse it?
The space freed up by vacuum is recorded in the free space map, from where it gets handed out for re-use by future INSERTs.
Another detail to add: in 9.6, the free space map is only vacuumed once the entire table itself has been completely vacuumed, so the freed-up space is not findable until then. If the VACUUM never makes it to the very end, because it is too slow or gets interrupted, then the space it is freeing up will not be reused for INSERTs. This was improved in v11.
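If you want to see what the free space map currently advertises for a table, here is a small sketch using the contrib extension pg_freespacemap (my_table is a placeholder name):
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;

SELECT count(*) AS pages_with_free_space,
       pg_size_pretty(sum(avail)::bigint) AS total_free
FROM pg_freespace('my_table')
WHERE avail > 0;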
Why is it not working as intended if I set the value to 20 ms?
Because vacuum can't keep up at that value. The default values for PostgreSQL are often suitable only for smaller servers, which yours doesn't seem to be. It is appropriate and advisable to change the defaults in this situation. Note that in v12 the default was lowered from 20 to 2 (and its type was correspondingly changed from int to float, so you can now specify the value with more precision).
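If you are on a version before v12, you could apply that lower value yourself; a hedged sketch using ALTER SYSTEM (editing postgresql.conf directly works just as well):
SHOW autovacuum_vacuum_cost_delay;

ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
SELECT pg_reload_conf();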
To summarize, your app creates tons of dead tuples and autovacuum can't keep up. Possible solutions:
This sounds more like a task queue than a regular table. Perhaps a PostgreSQL table is not ideal for this specific use case; consider a solution such as RabbitMQ or Redis instead.
Create time-based range partitions and purge old partitions once they're empty, while disabling autovacuum on this table alone. Consider not deleting rows at all and just purging old partitions, if you can identify which partitions have already been handled.
Tweak the autovacuum settings so that it works constantly, without any naps or interference (see the sketch below). Increasing maintenance_work_mem could help speed up autovacuum too. Perhaps you'll find out that you've reached your hard drive's limits; in that case, you will have to optimize the storage so that it can accommodate those expensive INSERT+DELETE+autovacuum operations.
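A hedged sketch of such per-table settings for a high-churn queue table (queue_table is a placeholder name, and the threshold is a starting point rather than a tuned value): vacuum after a fixed number of dead tuples instead of a fraction of the table, and remove the cost delay for this table only.
ALTER TABLE queue_table SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold = 10000,
    autovacuum_vacuum_cost_delay = 0
);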
Well, the default value is 2 ms (see Autovacuum), so your 20 ms value is high:
autovacuum_vacuum_cost_delay (floating point)
"Specifies the cost delay value that will be used in automatic VACUUM operations. If -1 is specified, the regular vacuum_cost_delay value will be used. If this value is specified without units, it is taken as milliseconds. The default value is 2 milliseconds. This parameter can only be set in the postgresql.conf file or on the server command line; but the setting can be overridden for individual tables by changing table storage parameters."
As explained here under Vacuum:
"
vacuum_cost_delay (floating point)
The amount of time that the process will sleep when the cost limit has been exceeded. If this value is specified without units, it is taken as milliseconds. The default value is zero, which disables the cost-based vacuum delay feature. Positive values enable cost-based vacuuming.
When using cost-based vacuuming, appropriate values for vacuum_cost_delay are usually quite small, perhaps less than 1 millisecond. While vacuum_cost_delay can be set to fractional-millisecond values, such delays may not be measured accurately on older platforms. On such platforms, increasing VACUUM's throttled resource consumption above what you get at 1ms will require changing the other vacuum cost parameters. You should, nonetheless, keep vacuum_cost_delay as small as your platform will consistently measure; large delays are not helpful.
"
We have a main Postgres server and a read replica on RDS.
We constantly write and update data covering the last couple of days.
Reading from the read replica works fine when looking at older data, but reading data from the last couple of days, which we keep updating on the main server, is painfully slow.
Queries that take 2-3 minutes on old data can timeout after 20 minutes when querying data from the last day or two.
Looking at the monitors like CPU I don't see any extra load on the read replica.
Is there a solution for this?
You are accessing over 65 buffers for every visible row found in the index scan (and over 500 buffers for each row actually returned by the index scan, since 90% are filtered out by the mmsi criterion).
One issue is that your index is not as selective as it could be. If you had the index on (day, mmsi) rather than just (day), it should be about 10 times faster.
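A sketch of the suggested composite index (the index name is only illustrative; CONCURRENTLY avoids blocking writes while it builds):
CREATE INDEX CONCURRENTLY simplified_blips_day_mmsi_idx
    ON simplified_blips (day, mmsi);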
But it also looks like you have a massive amount of bloat.
You are probably not vacuuming the table often enough. With your described UPDATE pattern, all the vacuum needs are accumulating in the newest data, but the activity counters are evaluated based on the full table size, so autovacuum is not done often enough to suit the needs of the new data. You could lower the scale factor for this table:
ALTER TABLE simplified_blips SET (autovacuum_vacuum_scale_factor = 0.01);
Or if you partition the data based on "day", then the partitions for newer days will naturally get vacuumed more often because the occurrence of updates will be judged against the size of each partition, it won't get diluted out by the size of all the older inactive partitions. Also, each vacuum run will take less work, as it won't have to scan all of the indexes of the entire table, just the indexes of the active partitions.
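A minimal sketch of day-based range partitioning (PostgreSQL 10 or later); only the day and mmsi columns come from the question, the rest is illustrative:
CREATE TABLE simplified_blips_part (
    day     date   NOT NULL,
    mmsi    bigint NOT NULL,
    payload text
) PARTITION BY RANGE (day);

CREATE TABLE simplified_blips_2024_06_01
    PARTITION OF simplified_blips_part
    FOR VALUES FROM ('2024-06-01') TO ('2024-06-02');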
As suggested, the problem was bloat.
When you update a record in an MVCC database like Postgres, the database creates a new version of the record containing the updated data.
After the update you end up with a "dead record" (AKA a dead tuple).
Once in a while the database runs autovacuum and cleans the dead tuples out of the table.
Usually autovacuum should be fine, but if your table is really large and updated often, you should consider making the autovacuum thresholds and scale factors more aggressive.
I have a table that had 80K writes per minute.
I made a change that reduced the load on this table (now it's 40K writes per minute).
Currently I can see an autovacuum process that has been running on my table for more than 3 days(!): autovacuum: VACUUM ANALYZE table (to prevent wraparound).
Should I run VACUUM on the table manually? Will it finish as long as I'm not stopping the writes?
Should I terminate this "autovacuum" process?
Any advice will be highly appreciated!
Run the VACUUM manually for now. If the transaction wrap-around vacuum doesn't finish in time, your server will go down.
Set autovacuum_vacuum_cost_delay to 0 for this table using ALTER TABLE. Then autovacuum will process that table as fast as it can.
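A sketch of both suggestions, with big_table standing in for your table name:
ALTER TABLE big_table SET (autovacuum_vacuum_cost_delay = 0);

VACUUM (VERBOSE) big_table;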