Postgres: "vacuum" command does not clean up dead tuples - postgresql

We have a Postgres database in Amazon RDS. Initially, we needed to load a large amount of data quickly, so autovacuum was turned off, per the best-practice suggestion from Amazon. Recently I noticed some performance issues when running queries, and then realized the database had not been vacuumed for a long time. As it turns out, many tables have lots of dead tuples.
Surprisingly, even after I manually ran VACUUM on some of the tables, it did not seem to remove these dead tuples at all. VACUUM FULL takes too long to finish and usually ends up timing out after running a whole night.
Why does the VACUUM command not work? What are my other options? Restarting the instance?

Use VACUUM (VERBOSE) to get detailed statistics about what it is doing and why.
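For instance (a sketch; mytable stands in for one of the affected tables):
-- Reports, among other things, how many dead row versions could not yet
-- be removed and the oldest xmin that is holding them back:
VACUUM (VERBOSE) mytable;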
There are three reasons why dead tuples cannot be removed:
There is a long-running transaction that has not been closed. You can find the culprits with:
SELECT pid, datname, usename, state, backend_xmin
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;
You can get rid of a transaction with pg_cancel_backend() or pg_terminate_backend().
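For example (a sketch; 12345 stands in for a pid returned by the query above):
-- Politely ask the backend to cancel its current query first:
SELECT pg_cancel_backend(12345);
-- If that is not enough (e.g. an idle-in-transaction session), kill it:
SELECT pg_terminate_backend(12345);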
There are prepared transactions that have not been committed or rolled back. You can find them with:
SELECT gid, prepared, owner, database, transaction
FROM pg_prepared_xacts
ORDER BY age(transaction) DESC;
Use COMMIT PREPARED or ROLLBACK PREPARED to close them.
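For example (a sketch; 'my_gid' stands in for a gid returned by the query above):
-- Close an abandoned prepared transaction; use COMMIT PREPARED instead
-- if its work should be kept:
ROLLBACK PREPARED 'my_gid';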
There are replication slots that are no longer in use. Find them with:
SELECT slot_name, slot_type, database, xmin
FROM pg_replication_slots
ORDER BY age(xmin) DESC;
Use pg_drop_replication_slot() to delete an unused replication slot.
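For example (a sketch; 'unused_slot' stands in for a slot_name returned by the query above):
-- Drop the stale slot so it no longer holds back the xmin horizon:
SELECT pg_drop_replication_slot('unused_slot');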

https://dba.stackexchange.com/a/77587/30035 explains why not all dead tuples are removed.
To keep VACUUM FULL from timing out, set statement_timeout = 0.
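A minimal sketch (mytable is a placeholder):
-- Disable the statement timeout for this session only, then run the
-- long VACUUM FULL:
SET statement_timeout = 0;
VACUUM FULL mytable;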
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_BestPractices.html#CHAP_BestPractices.PostgreSQL recommends disabling autovacuum only for the duration of a database restore; beyond that, they definitely recommend using it:
Important
Not running autovacuum can result in an eventual required outage to
perform a much more intrusive vacuum operation.
Canceling all sessions and vacuuming the table should take care of the previously accumulated dead tuples (regarding your suggestion to restart the cluster). But what I suggest you do first is switch autovacuum back on. Better yet, control vacuum per table rather than for the whole cluster, using storage parameters such as autovacuum_vacuum_threshold (set via ALTER TABLE); reference here: https://www.postgresql.org/docs/current/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
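A sketch of such per-table tuning (table name and values are illustrative, not a recommendation):
-- Vacuum this table once ~1000 rows are dead, regardless of table size:
ALTER TABLE mytable SET (
    autovacuum_enabled = true,
    autovacuum_vacuum_threshold = 1000,
    autovacuum_vacuum_scale_factor = 0.0
);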

Related

PSQL: VACUUM ANALYZE is showing incorrect oldest xmin

When I run vacuum verbose on a table, the result is showing an oldest xmin value of 9696975, as shown below:
table_xxx: found 0 removable, 41472710 nonremovable row versions in 482550 out of 482550 pages
DETAIL: 41331110 dead row versions cannot be removed yet, oldest xmin: 9696975
There were 0 unused item identifiers.
But when I check pg_stat_activity, there are no entries with a backend_xmin value that matches this oldest xmin.
Below is the response I get when I run the query:
SELECT backend_xmin
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;
Response:
backend_xmin
------------
10134695
10134696
10134696
10134696
10134696
The issue I am facing is that the vacuum is not removing any dead tuples from the table.
I tried the methods mentioned in this post, but they didn't help.
Edit:
The PostgreSQL version is 13.6, running in an Aurora cluster.
A row is only completely dead when no live transaction can see it anymore, i.e. no transaction that was started before the row was updated or deleted is still running. That does not necessarily involve any locks at all. The mere existence of a long-running transaction can block VACUUM from cleaning up.
So the system view to consult is pg_stat_activity. Look for zombie transactions that you can kill. Then VACUUM can proceed.
Old prepared transactions can also block for the same reason. You can check pg_prepared_xacts for those.
Of course, VACUUM only runs on the primary server, not on replica (standby) instances, in case streaming replication has been set up.
Related:
Long running function locking the database?
What are the consequences of not ending a database transaction?
What does backend_xmin and backend_xid represent in pg_stat_activity?
Do postgres autovacuum properties persist for DB replications?
Apart from old transactions, there are some other things that can hold the “xmin horizon” back:
stale replication slots (see pg_replication_slots)
abandoned prepared transactions (see pg_prepared_xacts)
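A combined diagnostic sketch that checks all three horizon holders in one pass (column lists abbreviated):
-- Long-running backends, abandoned prepared transactions, and stale
-- replication slots, ordered by how far back they hold the xmin horizon:
SELECT 'backend' AS holder, pid::text AS id, age(backend_xmin) AS xmin_age
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
UNION ALL
SELECT 'prepared xact', gid, age(transaction)
FROM pg_prepared_xacts
UNION ALL
SELECT 'replication slot', slot_name::text, age(xmin)
FROM pg_replication_slots
WHERE xmin IS NOT NULL
ORDER BY xmin_age DESC;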


is it safe to enable autovacuum for a table in PostgreSQL

I am a newbie in PostgreSQL (version 9.2) database development. While looking at one of my tables, I saw an option called autovacuum.
Many of my tables contain 20,000+ rows. For testing purposes, I've altered one of those tables like below:
ALTER TABLE theTable SET (
    autovacuum_enabled = true
);
So, I wish to know the benefits/disadvantages (if any) of autovacuuming a table.
Autovacuum is enabled by default in current versions of Postgres (and has been for a while). It's generally a good thing to have enabled for performance and other reasons.
Before autovacuum existed, you had to vacuum tables explicitly yourself (via cron jobs that executed psql commands to vacuum them, or similar) in order to get rid of dead tuples, etc. Postgres has managed this for you via autovacuum for a while now.
In some cases, with tables that have immense churn (i.e. very high rates of insertions and deletions), I have found it necessary to still vacuum explicitly via cron to keep the dead tuple count low and performance high, because autovacuum doesn't kick in fast enough. But this is something of a niche case.
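For such a high-churn table, the explicit job can be as simple as the following sketch (database name, table name, and schedule are placeholders):
-- Run from cron, e.g. every 10 minutes:
--   */10 * * * * psql -d mydb -c 'VACUUM ANALYZE high_churn_table;'
VACUUM ANALYZE high_churn_table;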
More info: http://www.postgresql.org/docs/current/static/runtime-config-autovacuum.html

postgresql 9.2 never vacuumed and analyze

I have been given a Postgres 9.2 DB of around 20 GB in size.
I looked through the database and saw that vacuum and/or analyze have never been run on any of its tables.
Autovacuum is on, and the transaction wraparound limit is very far away (only 1% of the way there).
I know nothing about the data activity (number of deletes, inserts, updates), but I see that it uses a lot of indexes and sequences.
My question is:
does the lack of vacuum and/or analyze affect data integrity (for example, a SELECT not returning all the rows that match, whether read from the table or via an index)? The speed of queries and writes doesn't matter.
is it possible that after the vacuum and/or analyze the same query gives a different answer than it would have given before the vacuum/analyze command?
I'm fairly new to PG, thank you for your help!!
Regards,
Figaro88
Running vacuum and/or analyze will not change the result set produced by any SELECT operation (unless there is a bug in PostgreSQL). They may affect the order of results if you do not supply an ORDER BY clause.
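A sketch of that ordering caveat (table and column names are placeholders):
-- Without ORDER BY, row order is an implementation detail; a VACUUM
-- (or any resulting plan change) may legitimately change it:
SELECT id, name FROM users;             -- order not guaranteed
SELECT id, name FROM users ORDER BY id; -- order guaranteed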