I'm working on part of a system where tables need to be locked IN SHARE ROW EXCLUSIVE MODE NOWAIT.
The problem is that the autovacuum daemon (autovacuum: ANALYZE) kicks in on a table immediately after it has been worked on, stopping the next process from taking a lock IN SHARE ROW EXCLUSIVE MODE NOWAIT.
The tables are large, so the vacuum takes a while. I could check for this before the locking transaction starts and use pg_cancel_backend to stop the autovacuum worker. Are there any consequences to this? The other option is to schedule vacuuming manually, but everywhere you read it's best to avoid that. Are there any parameters I have missed that would tweak autovacuum so it stops behaving like this?
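Something like this is roughly what I have in mind (just a sketch; "mytable" stands in for the real table, and it assumes a version where pg_stat_activity has the pid and query columns, i.e. 9.2 or later):

    -- Cancel a running autovacuum worker on the table before trying to take the lock
    SELECT pg_cancel_backend(pid)
    FROM   pg_stat_activity
    WHERE  query LIKE 'autovacuum:%mytable%';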
Thanks
If you keep autovacuum from working, you will incur table bloat and eventually risk downtime when PostgreSQL shuts down to avoid data loss due to transaction ID wraparound.
Disabling autovacuum for this table (ALTER TABLE ... SET (autovacuum_enabled = off)) and regularly scheduling a manual VACUUM is an option.
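For example, something along these lines (a sketch; "mytable" is a placeholder for your table):

    -- Turn autovacuum off for just this table (and its TOAST table)
    ALTER TABLE mytable SET (autovacuum_enabled = off, toast.autovacuum_enabled = off);

    -- ...and run something like this from cron during a quiet window
    VACUUM (ANALYZE) mytable;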
But it would be best to change your procedure so that you don't have to lock tables explicitly all the time.
I have a scenario that repeats itself every few hours: there is a sudden increase in ROW EXCLUSIVE locks in the PostgreSQL database. At the same time some queries are not answered in time, which exhausts the connections so that PostgreSQL does not accept new clients anymore. After 2-3 minutes the locks and connection counts drop and the system returns to its normal state.
I wonder if autovacuum could be the root cause of this. I can see that ANALYZE and VACUUM (not VACUUM FULL) take about 20 seconds to complete on one of the tables. My application runs INSERT, SELECT, UPDATE and DELETE, and there are no DDL commands (ALTER TABLE, DROP TABLE, CREATE INDEX, ...) going on. Can autovacuum conflict with queries from my application and make them wait until the vacuum has completed? Or is it all my application's and my bad design's fault? I should mention that one of my tables has a jsonb column that holds relatively large data per row (roughly 10 MB).
I have attached an image from the monitoring application that shows the sudden increase in ROW EXCLUSIVE locks.
ROW EXCLUSIVE locks are perfectly harmless; they are taken on tables against which DML statements run. Your graph reveals nothing. You should set log_lock_waits = on and log_min_duration_statement to a reasonable value. Perhaps you can spot something in the logs. Also, watch out for long running transactions.
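For example (assuming a version with ALTER SYSTEM, i.e. 9.4 or later; otherwise set the same parameters in postgresql.conf - the 1s threshold is just a starting point):

    -- Log lock waits longer than deadlock_timeout and statements running longer than 1s
    ALTER SYSTEM SET log_lock_waits = on;
    ALTER SYSTEM SET log_min_duration_statement = '1s';
    SELECT pg_reload_conf();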
In Postgres, suppose autovacuum runs but for some reason cannot actually clean anything up - for example, when hot_standby_feedback is set and there are long running queries on the standby. Say tab1 has been updated and that triggers autovacuum; meanwhile a long running query on the standby sends its feedback to the primary, which effectively makes the vacuum of tab1 skip those rows.
Since the vacuum of tab1 was effectively skipped, when does autovacuum vacuum the table again? Or will it not run autovacuum on it again, so that we would need to run VACUUM on that table manually? Basically, does autovacuum retry tables that could not be vacuumed the first time?
Autovacuum does not get skipped due to hot_standby_feedback. It still runs, it just might not accomplish anything if no rows can be removed. In that case pg_stat_all_tables.n_dead_tup does not get decremented, which means the table will probably get autovacuumed again the next time the database is assessed for autovacuuming, since the stats that make it eligible have not changed. On an otherwise idle system, this will happen about once every however long it takes to scan the not-all-visible part of the table, rounded up to the next increment of autovacuum_naptime.
It might be a good idea (although the use case is narrow enough that I doubt it) to suppress repeat autovacuuming of a table until the horizon has advanced far enough to make it worthwhile, but currently there is no code to do this.
Note that this differs from INSERT-driven autovacuums. There, n_ins_since_vacuum does get reset, even if none of the tuples were marked all-visible. So that vacuum will not get run again until the table crosses some other threshold that makes it eligible.
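To watch this, you can check the statistics for the table, e.g. (n_ins_since_vacuum only exists in v13 and later):

    -- Is the table still eligible for another autovacuum run?
    SELECT relname, n_dead_tup, n_ins_since_vacuum, last_vacuum, last_autovacuum
    FROM   pg_stat_all_tables
    WHERE  relname = 'tab1';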
I have a table that had 80K writes per minute.
I made a change that reduced the load on this table (it's now 40K writes per minute).
Currently I can see an autovacuum that has been running on my table for more than 3 days(!): autovacuum: VACUUM ANALYZE table (to prevent wraparound).
Should I run VACUUM on the table manually? Will it ever finish as long as I don't stop the writes?
Should I terminate this "autovacuum" process?
Any advice will be highly appreciated!
Run the VACUUM manually for now. If the anti-wraparound vacuum doesn't finish in time, your server will go down.
Set autovacuum_vacuum_cost_delay to 0 for this table using ALTER TABLE. Then autovacuum will process that table as fast as it can.
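For example (a sketch; "mytable" is a placeholder for your table):

    -- Let autovacuum process this table at full speed
    ALTER TABLE mytable SET (autovacuum_vacuum_cost_delay = 0);

    -- Or run the vacuum by hand; a manual VACUUM is not throttled by default
    VACUUM (VERBOSE, ANALYZE) mytable;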
PostgreSQL provides VACUUM to reclaim the space occupied by dead tuples. Autovacuum is on by default and runs according to the configuration settings.
When I check last_vacuum and last_autovacuum in pg_stat_all_tables, autovacuum has never run for most of the tables in the database, even though they have plenty of dead tuples (more than 1K). We also get a 2-3 hour window when these tables are rarely used.
Below are the autovacuum settings for my database.
Below is the output of pg_stat_all_tables.
Is it a good idea to depend only on autovacuum?
Are there any special settings required for autovacuum to function properly?
Should we set up manual vacuuming? Should we use both in parallel, or just turn off autovacuum and use manual VACUUM only?
You should definitely use autovacuum.
Are there any autovacuum processes running currently?
Does a manual VACUUM on such a table succeed?
Set log_autovacuum_min_duration = 0 to get information about autovacuum processing in the logs.
If system activity is too high, autovacuum may not be able to keep up. In this case it is advisable to configure autovacuum to be more aggressive, e.g. by setting autovacuum_vacuum_cost_limit = 1000.
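For example (a sketch; ALTER SYSTEM needs 9.4 or later, otherwise edit postgresql.conf):

    -- Are any autovacuum workers running right now?
    SELECT pid, query, xact_start
    FROM   pg_stat_activity
    WHERE  query LIKE 'autovacuum:%';

    -- Log every autovacuum run and allow autovacuum to do more work per round
    ALTER SYSTEM SET log_autovacuum_min_duration = 0;
    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
    SELECT pg_reload_conf();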
https://www.postgresql.org/docs/current/static/routine-vacuuming.html
PostgreSQL databases require periodic maintenance known as vacuuming. For many installations, it is sufficient to let vacuuming be performed by the autovacuum daemon, which is described in Section 24.1.6. You might need to adjust the autovacuuming parameters described there to obtain best results for your situation. Some database administrators will want to supplement or replace the daemon's activities with manually-managed VACUUM commands, which typically are executed according to a schedule by cron or Task Scheduler scripts.
VACUUM creates significant I/O; adjust https://www.postgresql.org/docs/current/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST to fit your needs.
You can also set autovacuum settings per table, to be more "custom": https://www.postgresql.org/docs/current/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
The above will give you an idea of why your 1K dead tuples might not be enough to trigger autovacuum and how to change that.
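As a rough illustration (default values; "mytable" is a placeholder for one of your tables):

    -- With the defaults, a table is only autovacuumed once
    --   n_dead_tup > autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
    -- i.e. 50 + 0.2 * 1,000,000 = 200,050 dead tuples for a 1M-row table,
    -- so 1K dead tuples is far below the trigger point.

    -- Lower the trigger point for a single table:
    ALTER TABLE mytable SET (
        autovacuum_vacuum_scale_factor = 0.01,
        autovacuum_vacuum_threshold    = 500
    );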
A manual VACUUM is a perfect solution for a one-time run, but to keep the system running I'd definitely rely on the autovacuum daemon.
I have found a bug in my application code where I start a transaction but never commit or roll back. The connection is used periodically, just reading some data every 10s or so. In pg_stat_activity its state is reported as "idle in transaction", and its backend_start time is over a week ago.
What is the impact of this on the database? Does it cause additional CPU and RAM usage? Will it impact other connections? How long can it persist in this state?
I'm using PostgreSQL 9.1 and 9.4.
Since you only SELECT, the impact is limited. It is more severe for any write operations, where the changes are not visible to any other transaction until committed - and lost if never committed.
It does cost some RAM and permanently occupies one of your allowed connections (which may or may not matter).
One of the more severe consequences of very long running transactions: it blocks VACUUM from doing its job, since there is still an old transaction that can see old rows. The system will start bloating.
In particular, SELECT acquires an ACCESS SHARE lock (the least blocking of all) on all referenced tables. This does not interfere with other DML commands like INSERT, UPDATE or DELETE, but it does block commands that need an ACCESS EXCLUSIVE lock, such as most forms of ALTER TABLE, DROP TABLE, TRUNCATE or VACUUM FULL. See "Table-level Locks" in the manual.
It can also interfere with various replication solutions and lead to transaction ID wraparound in the long run if it stays open long enough / you burn enough XIDs fast enough. More about that in the manual on "Routine Vacuuming".
Blocking effects can mushroom if other transactions are blocked from committing and those have acquired locks of their own. Etc.
You can keep transactions open (almost) indefinitely - until the connection is closed (which also happens when the server is restarted, obviously).
But never leave transactions open longer than needed.
There are two major impacts to the system.
The tables that have been used in those transactions:
are not vacuumed, which means they are not "cleaned up" and their statistics aren't updated, which might lead to bad (= slow) execution plans
cannot be changed using ALTER TABLE
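To find such sessions (and end them if you have to), something like this works on 9.2 and later, where the column is called state; on 9.1 you would look for current_query = '<IDLE> in transaction' instead:

    -- Sessions that have been sitting in an open transaction
    SELECT pid, usename, xact_start, now() - xact_start AS open_for
    FROM   pg_stat_activity
    WHERE  state = 'idle in transaction'
    ORDER  BY xact_start;

    -- If necessary, end one of them (this rolls back its open transaction):
    -- SELECT pg_terminate_backend(<pid>);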