Why did autoanalyze not run when autovacuum ran? - postgresql

Team,
We recently found that for one of our tables autovacuum is up to date but autoanalyze is not. The parameters are left at their defaults.
How can this happen? If the autovacuum conditions were satisfied, shouldn't autoanalyze have run as well?
Please share some insights on this.
Thanks

Autovacuum and autoanalyze are triggered by different conditions and use different statistics counters:
Autovacuum starts running if the number of dead tuples (n_dead_tup in pg_stat_all_tables) exceeds a threshold (by default, 50 plus 20% of reltuples from pg_class). From v13 on, there is a similar condition with n_ins_since_vacuum, the number of tuples inserted since the last VACUUM.
Autoanalyze starts running if the number of changed tuples (n_mod_since_analyze in pg_stat_all_tables) exceeds a threshold (by default, 50 plus 10% of reltuples from pg_class).
Whenever autovacuum or autoanalyze has completed, the respective statistics counter is reset to 0.
So there is no direct connection between autovacuum and autoanalyze runs (except that both are connected to table modifications, unless it is an anti-wraparound vacuum run).
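The two trigger conditions above can be sketched as follows. This is a simplified model, not the server's actual code: it assumes the default settings, ignores anti-wraparound vacuums and per-table overrides, and the TableStats class and the counter values at the bottom are made up for illustration.

```python
from dataclasses import dataclass

# Default settings (assumed unchanged, as in the question).
AUTOVACUUM_VACUUM_THRESHOLD = 50
AUTOVACUUM_VACUUM_SCALE_FACTOR = 0.2
AUTOVACUUM_ANALYZE_THRESHOLD = 50
AUTOVACUUM_ANALYZE_SCALE_FACTOR = 0.1
AUTOVACUUM_VACUUM_INSERT_THRESHOLD = 1000      # v13 and later
AUTOVACUUM_VACUUM_INSERT_SCALE_FACTOR = 0.2    # v13 and later


@dataclass
class TableStats:
    """Hypothetical container for the counters mentioned in the answer."""
    reltuples: float              # pg_class.reltuples
    n_dead_tup: int               # pg_stat_all_tables.n_dead_tup
    n_mod_since_analyze: int      # pg_stat_all_tables.n_mod_since_analyze
    n_ins_since_vacuum: int = 0   # pg_stat_all_tables.n_ins_since_vacuum


def needs_autovacuum(t: TableStats) -> bool:
    # Dead-tuple condition, plus the insert-driven condition from v13 on.
    dead_limit = (AUTOVACUUM_VACUUM_THRESHOLD
                  + AUTOVACUUM_VACUUM_SCALE_FACTOR * t.reltuples)
    insert_limit = (AUTOVACUUM_VACUUM_INSERT_THRESHOLD
                    + AUTOVACUUM_VACUUM_INSERT_SCALE_FACTOR * t.reltuples)
    return t.n_dead_tup > dead_limit or t.n_ins_since_vacuum > insert_limit


def needs_autoanalyze(t: TableStats) -> bool:
    # Uses a different counter and a different threshold than autovacuum.
    limit = (AUTOVACUUM_ANALYZE_THRESHOLD
             + AUTOVACUUM_ANALYZE_SCALE_FACTOR * t.reltuples)
    return t.n_mod_since_analyze > limit


# Made-up counters: many dead tuples, but few changes since the last analyze.
table = TableStats(reltuples=1_000_000, n_dead_tup=250_000,
                   n_mod_since_analyze=60_000)
print(needs_autovacuum(table), needs_autoanalyze(table))
```

Because each check reads its own counter, one can be true while the other is false, which is the situation described in the question.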

Related

Sudden Increase in row exclusive locks and connection exhaustion in PostgreSQL

I have a scenario that repeats itself every few hours: there is a sudden increase in row exclusive locks in my PostgreSQL database. Meanwhile, some queries do not seem to be answered in time, which exhausts the connections until PostgreSQL no longer accepts new clients. After 2-3 minutes, the lock and connection counts drop and the system returns to its normal state.
I wonder whether autovacuum can be the root cause of this. I see that analyze and vacuum (not VACUUM FULL) take about 20 seconds to complete on one of the tables. My application runs INSERT, SELECT, UPDATE and DELETE operations, and no DDL commands (ALTER TABLE, DROP TABLE, CREATE INDEX, ...). Can autovacuum conflict with queries from my application and make them wait until the vacuum has completed? Or is it all the fault of the application and my bad design? I should add that one of my tables has a jsonb column that holds relatively large data for each row (roughly 10 MB).
I have attached an image from my monitoring application that shows the sudden increase in row exclusive locks.
ROW EXCLUSIVE locks are perfectly harmless; they are taken on tables against which DML statements run. Your graph reveals nothing. You should set log_lock_waits = on and log_min_duration_statement to a reasonable value. Perhaps you can spot something in the logs. Also, watch out for long running transactions.

postgres autovacuum retry mechanism

In postgres, suppose autovacuum runs but for some reason is not able to vacuum a table, for example when hot_standby_feedback is set and there are long running queries on the standby. Say tab1 has been updated, which triggers autovacuum; meanwhile a long running query on the standby sends this information to the primary, which then skips the vacuum on tab1.
Since autovacuum was skipped for tab1, when does autovacuum vacuum the table again? Or will it not run on it again, so that we need to vacuum the table manually? Basically, does autovacuum retry tables that could not be vacuumed the first time?
Autovacuum does not get skipped due to hot_standby_feedback. It still runs, it just might not accomplish anything if no rows can be removed. If this is the case, then pg_stat_all_tables.n_dead_tup does not get decremented, which means that the table will probably get autovacuumed again the next time the database is assessed for autovacuuming, as the stats that make it eligible have not changed. On an otherwise idle system, this will happen about once every however long it takes to scan the not-all-visible part of the table, rounded up to the next increment of autovacuum_naptime.
It might be a good idea (although the use case is narrow enough that I doubt it) to suppress repeat autovacuuming of a table until the horizon has advanced far enough to make it worthwhile, but currently there is no code to do this.
Note that this differs from INSERT-driven autovacuums. There, n_ins_since_vacuum does get reset, even if none of the tuples were marked all-visible. So that vacuum will not run again until the table crosses some other threshold that makes it eligible.

Stop autovacuum on old partition

We're having some issues when autovacuum triggers on one of our large tables (~100 GB).
Our ETL jobs only hit the last three partitions of this table but, from my understanding, when autovacuum is triggered on a partition the whole table is vacuumed, and this causes the ETL job to wait until it has finished.
So far I've tried setting autovacuum_vacuum_scale_factor to both a higher and a lower value, which yields approximately the same execution time for our job.
Since a rather large number of INSERTs/UPDATEs are performed on this table, I would like to set a low autovacuum_vacuum_scale_factor on the last three partitions but prevent the vacuuming of older partitions.
We already use a vacuum script that runs every evening, so I'm thinking about setting autovacuum_enabled to off on the older partitions and letting the script handle the vacuum on these tables, but I'm not sure if that's the right way to treat this problem.
Other parameters I've stumbled upon are vacuum_freeze_min_age and autovacuum_freeze_max_age, but I'm not sure I understand how to use them.

Postgres auto-vacuum doesn't reclaim the dead tuples space causes disk full issue

My use case inserts 100,000 rows per minute while, at the other end, a few threads pick up rows and delete them from the table. So it definitely creates a lot of dead tuples in my table.
My auto-vacuum configurations are
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = -1
From pg_stat_user_tables I can see that autovacuum is running on my table, but within a few hours my disk (500 GB) is full and I can no longer insert new rows.
On the second try, I changed the following settings:
autovacuum_naptime = 60min
autovacuum_vacuum_cost_delay = 0
This time my simulation and autovacuum run well, and the maximum disk usage is 180 GB.
My question is: if I change autovacuum_vacuum_cost_delay to zero ms, how does autovacuum free the dead tuples' space, and how does PG reuse it? And why does it not work as intended if I set the value to 20 ms?
"If I change autovacuum_vacuum_cost_delay to zero ms, how does autovacuum free the dead tuples' space, and how does PG reuse it?"
The space freed up by vacuum is recorded in the free space map, from where it gets handed out for re-use by future INSERTs.
Another detail to add: in 9.6 the free space map is only vacuumed once the entire table itself is completely vacuumed, so the freed-up space is not findable until then. If the VACUUM never makes it to the very end, because it is too slow or gets interrupted, then the space it is freeing up will not be reused for INSERTs. This was improved in v11.
"Why does it not work as intended if I set the value to 20 ms?"
Because vacuum can't keep up at that value. The default values for PostgreSQL are often suitable only for smaller servers, which yours doesn't seem to be. It is appropriate and advisable to change the defaults in this situation. Note that in v12, the default was lowered from 20 ms to 2 ms (and its type was correspondingly changed from int to float, so you can now specify the value with more precision).
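To get a feel for why 20 ms is too slow here, the cost-based throttle can be approximated with a little arithmetic. The sketch below is a rough upper bound under stated assumptions: a cost limit of 200 (what autovacuum_vacuum_cost_limit = -1 falls back to when vacuum_cost_limit is at its default), a cost of 20 per dirtied page (the default vacuum_cost_page_dirty), 8 kB pages, a single worker, and no time spent on the work itself. The function names are made up for illustration.

```python
PAGE_SIZE_BYTES = 8192  # assumed default block size


def max_pages_per_second(cost_limit: int, cost_delay_ms: float,
                         cost_per_page: int) -> float:
    """Upper bound on pages processed per second under the cost-based delay.

    The worker accumulates cost until it reaches cost_limit, then sleeps for
    cost_delay_ms. The time spent doing the actual work is ignored here.
    """
    if cost_delay_ms <= 0:
        return float("inf")  # a delay of 0 disables throttling
    rounds_per_second = 1000.0 / cost_delay_ms
    return rounds_per_second * cost_limit / cost_per_page


def max_mib_per_second(cost_limit: int, cost_delay_ms: float,
                       cost_per_page: int) -> float:
    pages = max_pages_per_second(cost_limit, cost_delay_ms, cost_per_page)
    return pages * PAGE_SIZE_BYTES / (1024 * 1024)


# Assumed defaults: cost limit 200, 20 cost units per dirtied page.
for delay_ms in (20, 2, 0):
    rate = max_mib_per_second(200, delay_ms, 20)
    print(f"cost_delay={delay_ms} ms -> at most {rate:.1f} MiB/s of dirtied pages")
```

Under these assumptions a 20 ms delay caps vacuum at a few MiB per second of dirtied pages, which is far below what a table receiving 100,000 inserts and deletes per minute needs. With several autovacuum workers running, the cost budget is shared among them, so the per-table rate can be lower still.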
To summarize, your app creates tons of dead tuples and autovacuum can't keep up. Possible solutions:
This sounds more like a task queue than a regular table. Perhaps a PostgreSQL table is not ideal for this specific use case. Use a solution such as RabbitMQ/Redis instead.
Create time-based range partitions and purge old partitions once they're empty, while disabling autovacuum on this table alone. Consider not deleting rows at all and just purging old partitions if you can identify handled partitions.
Tweak the autovacuum settings so that it works constantly, without any naps or interference. Increasing maintenance_work_mem could help speed up autovacuum too. Perhaps you'll find out that you've reached your hard drive's limits. In that case, you will have to optimize the storage so that it can accommodate those expensive INSERT+DELETE+autovacuum operations.
Well, the default value of autovacuum_vacuum_cost_delay is 2 ms, so your 20 ms value is high:
autovacuum_vacuum_cost_delay (floating point)
"Specifies the cost delay value that will be used in automatic VACUUM operations. If -1 is specified, the regular vacuum_cost_delay value will be used. If this value is specified without units, it is taken as milliseconds. The default value is 2 milliseconds. This parameter can only be set in the postgresql.conf file or on the server command line; but the setting can be overridden for individual tables by changing table storage parameters."
As explained in the documentation on cost-based vacuum delay:
vacuum_cost_delay (floating point)
"The amount of time that the process will sleep when the cost limit has been exceeded. If this value is specified without units, it is taken as milliseconds. The default value is zero, which disables the cost-based vacuum delay feature. Positive values enable cost-based vacuuming.
When using cost-based vacuuming, appropriate values for vacuum_cost_delay are usually quite small, perhaps less than 1 millisecond. While vacuum_cost_delay can be set to fractional-millisecond values, such delays may not be measured accurately on older platforms. On such platforms, increasing VACUUM's throttled resource consumption above what you get at 1ms will require changing the other vacuum cost parameters. You should, nonetheless, keep vacuum_cost_delay as small as your platform will consistently measure; large delays are not helpful."

auto vacuum vs vacuum in postgresql

PostgreSQL has the VACUUM functionality for reclaiming the space occupied by dead tuples. Autovacuum is on by default and runs according to its configuration settings.
When I check last_vacuum and last_autovacuum in the output of pg_stat_all_tables, autovacuum has never run for most of the tables in the database, even though they have a sizeable number of dead tuples (more than 1K). We also have a time window of 2-3 hours when these tables are rarely used.
Below are autovacuum settings for my database
below is the output of pg_stat_all_tables
Is it a good idea to depend only on autovacuum?
Are any special settings required for autovacuum to function properly?
Should we set up a manual vacuum? Should we use both in parallel or just turn off autovacuum and use manual vacuum only?
You should definitely use autovacuum.
Are there any autovacuum processes running currently?
Does a manual VACUUM on such a table succeed?
Set log_autovacuum_min_duration = 0 to get information about autovacuum processing in the logs.
If system activity is too high, autovacuum may not be able to keep up. In this case it is advisable to configure autovacuum to be more aggressive, e.g. by setting autovacuum_vacuum_cost_limit = 1000.
https://www.postgresql.org/docs/current/static/routine-vacuuming.html
PostgreSQL databases require periodic maintenance known as vacuuming.
For many installations, it is sufficient to let vacuuming be performed
by the autovacuum daemon, which is described in Section 24.1.6. You
might need to adjust the autovacuuming parameters described there to
obtain best results for your situation. Some database administrators
will want to supplement or replace the daemon's activities with
manually-managed VACUUM commands, which typically are executed
according to a schedule by cron or Task Scheduler scripts.
Vacuum creates significant I/O; adjust https://www.postgresql.org/docs/current/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST to fit your needs.
You can also set autovacuum settings per table, to be more "custom": https://www.postgresql.org/docs/current/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
The above will give you an idea of why your 1K dead tuples might not be enough for autovacuum and how to change that.
A manual VACUUM is a perfect solution for a one-time run, while for running the system I'd definitely rely on the autovacuum daemon.
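As a rough illustration of the per-table tuning mentioned above, the sketch below estimates which autovacuum_vacuum_scale_factor a table would need so that autovacuum triggers at a chosen number of dead tuples. It assumes the default base threshold of 50 and the default scale factor of 0.2; the function names, the table size and the table name in the final comment are hypothetical.

```python
def default_trigger_point(reltuples: float, base_threshold: int = 50,
                          scale_factor: float = 0.2) -> float:
    """Dead tuples needed before autovacuum considers the table."""
    return base_threshold + scale_factor * reltuples


def scale_factor_for_target(reltuples: float, target_dead_tuples: int,
                            base_threshold: int = 50) -> float:
    """Scale factor at which the trigger point equals target_dead_tuples."""
    if reltuples <= 0:
        raise ValueError("reltuples must be positive")
    return max(0.0, (target_dead_tuples - base_threshold) / reltuples)


# Hypothetical table with 100,000 rows and about 1,000 dead tuples.
rows = 100_000
print("default trigger point:", default_trigger_point(rows))
print("scale factor to trigger near 1,000 dead tuples:",
      scale_factor_for_target(rows, 1_000))

# The result could then be applied per table, for example:
#   ALTER TABLE some_table SET (autovacuum_vacuum_scale_factor = 0.01);
```

With the defaults, a table of that size needs roughly 20,000 dead tuples before autovacuum looks at it, so 1K dead tuples alone would not trigger a run.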