I found this query to see if autovacuum is disabled on a given table. I have autovacuum and the statistics collector enabled on the PostgreSQL server.
SELECT reloptions FROM pg_class WHERE relname='my_table';
reloptions
----------------------------
{autovacuum_enabled=false}
(1 row)
But what I get is just a NULL value for the given table. Does that mean autovacuum is not enabled on any of the tables I query? Please advise.
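For reference, a NULL reloptions simply means that no per-table storage parameters are set, so the table follows the global autovacuum setting. A minimal sketch to list only the tables that explicitly disable autovacuum:
SELECT relname, reloptions
FROM pg_class
WHERE 'autovacuum_enabled=false' = ANY (reloptions);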
Related
I've got autovacuum_freeze_max_age, which is 200,000,000 by default.
And in theory I found the rule that the anti-wraparound autovacuum starts when:
If age(relfrozenxid) > autovacuum_freeze_max_age
But when is the usual autovacuum started? How can I determine the moment:
When is a usual autovacuum on a table started?
When does an autovacuum become an anti-wraparound autovacuum? Really only after age(relfrozenxid) > autovacuum_freeze_max_age?
As the documentation states, normal autovacuum is triggered
if the number of tuples obsoleted since the last VACUUM exceeds the “vacuum threshold”, the table is vacuumed. The vacuum threshold is defined as:
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
where the vacuum base threshold is autovacuum_vacuum_threshold, the vacuum scale factor is autovacuum_vacuum_scale_factor, and the number of tuples is pg_class.reltuples.
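For example, with the defaults autovacuum_vacuum_threshold = 50 and autovacuum_vacuum_scale_factor = 0.2, a table with 10,000 rows is autovacuumed once more than 50 + 0.2 * 10,000 = 2,050 tuples have been obsoleted.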
The table is also vacuumed if the number of tuples inserted since the last vacuum has exceeded the defined insert threshold, which is defined as:
vacuum insert threshold = vacuum base insert threshold + vacuum insert scale factor * number of tuples
where the vacuum insert base threshold is autovacuum_vacuum_insert_threshold, and vacuum insert scale factor is autovacuum_vacuum_insert_scale_factor.
The second part applies only to PostgreSQL v13 and later.
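To inspect the current values of these parameters, a quick sketch (the two insert-related settings only exist in v13 and later):
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_vacuum_threshold',
               'autovacuum_vacuum_scale_factor',
               'autovacuum_vacuum_insert_threshold',
               'autovacuum_vacuum_insert_scale_factor');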
Furthermore,
If the relfrozenxid value of the table is more than vacuum_freeze_table_age transactions old, an aggressive vacuum is performed to freeze old tuples and advance relfrozenxid; otherwise, only pages that have been modified since the last vacuum are scanned.
So an autovacuum worker run that was triggered by the normal mechanism can run as an anti-wraparound VACUUM if there are old enough rows in the table.
Finally,
Tables whose relfrozenxid value is more than autovacuum_freeze_max_age transactions old are always vacuumed
So if a table with old live tuples is never autovacuumed during normal processing, a special anti-wraparound autovacuum run is triggered for it, even if autovacuum is disabled. Such a run is also forced if there are multixacts older than autovacuum_multixact_freeze_max_age (see the documentation). From PostgreSQL v14 on, if an unfrozen row in a table is older than vacuum_failsafe_age, an anti-wraparound autovacuum will skip index cleanup for faster processing.
Yes, this is pretty complicated.
I made a query which shows the dead tuples (when a plain autovacuum will start) and when the anti-wraparound vacuum will kick in:
WITH dead_tup AS (
    SELECT st.schemaname || '.' || st.relname AS tablename,
           st.n_dead_tup AS dead_tup,
           current_setting('autovacuum_vacuum_threshold')::int8
               + current_setting('autovacuum_vacuum_scale_factor')::float * c.reltuples
               AS max_dead_tup,
           current_setting('autovacuum_vacuum_threshold')::int8
               + current_setting('autovacuum_vacuum_scale_factor')::float * c.reltuples
               - st.n_dead_tup AS left_for_tr_vacuum,
           st.last_autovacuum,
           c.relnamespace,
           c.oid
    FROM pg_stat_all_tables st
    JOIN pg_class c ON c.oid = st.relid
    WHERE c.relkind IN ('r', 'm', 't')
      AND st.schemaname NOT LIKE 'pg_temp%'
)
SELECT c.oid::regclass AS table,
       current_setting('autovacuum_freeze_max_age')::int8 - age(c.relfrozenxid) AS xid_left,
       pg_relation_size(c.oid) AS relsize,
       dt.dead_tup,
       dt.max_dead_tup,
       dt.left_for_tr_vacuum,
       dt.last_autovacuum
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN dead_tup dt ON dt.relnamespace = c.relnamespace AND dt.oid = c.oid
WHERE c.relkind IN ('r', 'm', 't')
  -- AND age(c.relfrozenxid)::int8 > current_setting('autovacuum_freeze_max_age')::int8 * 0.8
  AND n.nspname NOT LIKE 'pg_temp%'
ORDER BY 2;
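Note that pg_class.reltuples is only an estimate maintained by VACUUM and ANALYZE, so the computed thresholds are approximate; also, per-table reloptions (if set) override the current_setting() values used here.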
There is a table which has 200 rows, but the number of live tuples showing there is much higher (around 60K).
select count(*) from subscriber_offset_manager;
count
-------
200
(1 row)
SELECT schemaname, relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'subscriber_offset_manager'
ORDER BY n_dead_tup;
schemaname | relname | n_live_tup | n_dead_tup
------------+---------------------------+------------+------------
public | subscriber_offset_manager | 61453 | 5
(1 row)
But as seen from pg_stat_activity and pg_locks, we are not able to find any open connection.
SELECT query, state, locktype, mode
FROM pg_locks
JOIN pg_stat_activity USING (pid)
WHERE relation::regclass = 'subscriber_offset_manager'::regclass;
query | state | locktype | mode
-------+-------+----------+------
(0 rows)
I also tried a full vacuum on this table; below are the results:
Every time, no rows are removed.
Sometimes all the live tuples become dead tuples.
Here is the output:
vacuum FULL VERBOSE ANALYZE subscriber_offset_manager;
INFO: vacuuming "public.subscriber_offset_manager"
INFO: "subscriber_offset_manager": found 0 removable, 67920 nonremovable row versions in 714 pages
DETAIL: 67720 dead row versions cannot be removed yet.
CPU 0.01s/0.06u sec elapsed 0.13 sec.
INFO: analyzing "public.subscriber_offset_manager"
INFO: "subscriber_offset_manager": scanned 710 of 710 pages, containing 200 live rows and 67720 dead rows; 200 rows in sample, 200 estimated total rows
VACUUM
SELECT schemaname, relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'subscriber_offset_manager'
ORDER BY n_dead_tup;
schemaname | relname | n_live_tup | n_dead_tup
------------+---------------------------+------------+------------
public | subscriber_offset_manager | 200 | 67749
and after 10 seconds:
SELECT schemaname, relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'subscriber_offset_manager'
ORDER BY n_dead_tup;
schemaname | relname | n_live_tup | n_dead_tup
------------+---------------------------+------------+------------
public | subscriber_offset_manager | 68325 | 132
How our app queries this table
Our application generally selects some rows and, based on some business calculation, updates them.
The select query -- select based on some id:
select * from subscriber_offset_manager where shard_id = 1;
The update query -- update some other column for this selected shard id.
Around 20 threads do this in parallel, and one thread works on only one row.
The app is written in Java and we are using Hibernate for DB operations.
PostgreSQL version is 9.3.24.
One more interesting observation: when I stop my Java app and then do a full vacuum, it works fine (the number of rows and live tuples become equal). So something goes wrong when we select and update continuously from the Java app.
Problem/Issue
These live tuples sometimes become dead tuples and after some time come back to life again.
Due to the above behaviour, selects from the table take time and increase the load on the server, as lots of live/dead tuples are there.
I know three things that keep VACUUM from doing its job:
Long running transactions.
Prepared transactions that did not get committed.
Stale replication slots.
See my blog post for details.
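A minimal sketch to check all three suspects (the intervals are arbitrary examples, and pg_replication_slots only exists from PostgreSQL 9.4 on):
-- long-running transactions
SELECT pid, state, xact_start FROM pg_stat_activity
WHERE xact_start < now() - interval '1 hour';
-- prepared transactions that were never committed or rolled back
SELECT gid, prepared, owner FROM pg_prepared_xacts;
-- stale replication slots
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;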
I found the issue. ☺
To understand the issue, consider the following flow:
Thread 1 -
Opens a Hibernate session
Makes some queries on Table-A
Selects from subscriber_offset_manager
Updates subscriber_offset_manager
Closes the session
Many threads of type Thread-1 run in parallel.
Thread 2 -
These threads also run in parallel:
Opens a Hibernate session
Makes some select queries on Table-A
Does not close the session (session leak).
Temporary solution: if I close all those connections made by Thread-2 using pg_cancel_backend, then vacuuming starts working. We have recreated the issue many times, tried this solution, and it worked.
Now, the following doubts are still not answered:
Why is Postgres not showing any lock data related to the table "subscriber_offset_manager"?
The issue does not reproduce when, instead of running Thread-2, we run the selects on Table-A using psql.
Why does Postgres behave like this with JDBC?
Some more mind-blowing observations:
Even if we run queries on "subscriber_offset_manager" in a different session, the issue still occurs.
We found many instances where Thread-2 was working on some third table "Table-C" and the issue still occurred.
The state of all these transactions in pg_stat_activity is "idle in transaction".
@Erwin Brandstetter and @Laurenz Albe, please advise if you know of a bug related to Postgres/JDBC.
There might be locks after all; your query might be misleading:
SELECT query, state, locktype, mode
FROM pg_locks
JOIN pg_stat_activity USING (pid)
WHERE relation = 'subscriber_offset_manager'::regclass;
pg_locks.pid can be NULL, in which case the join would eliminate those rows. The manual for Postgres 9.3:
Process ID of the server process holding or awaiting this lock, or null if the lock is held by a prepared transaction
Bold emphasis mine. (Still the same in pg 10.)
Do you get anything for the simple query?
SELECT * FROM pg_locks
WHERE relation = 'subscriber_offset_manager'::regclass;
This could explain why VACUUM complains:
DETAIL: 67720 dead row versions cannot be removed yet.
This, in turn, would point to problems in your application logic / queries, locking more rows than necessary.
My first idea would be long running transactions, where even a simple SELECT (acquiring a lowly ACCESS SHARE lock) can block VACUUM from doing its job. 20 threads in parallel might chain up and lock out VACUUM indefinitely. Keep your transactions (and their locks) as brief as possible. And make sure your queries are optimized and don't lock more rows than necessary.
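To spot such offenders, a sketch against pg_stat_activity (the interval is an arbitrary example):
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND xact_start < now() - interval '5 minutes'
ORDER BY xact_start;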
One more thing to note: transaction isolation levels SERIALIZABLE or REPEATABLE READ make it much harder for VACUUM to clean up. Default READ COMMITTED mode is less restrictive, but VACUUM can still be blocked as discussed.
Related:
What are the consequences of not ending a database transaction?
Postgres UPDATE … LIMIT 1
VACUUM VERBOSE outputs, nonremovable “dead row versions cannot be removed yet”?
I am running Postgres 9.3 on macOS and I have a database which grew out of control. I used to have a table with one column that stored large data. Then I noticed that the DB size grew to around 19 GB just because of a pg_toast table. I removed the column in question and ran VACUUM to shrink the DB again, but it remained the same size. So how can I shrink the database size?
SELECT nspname || '.' || relname AS "relation"
,pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;
results in
pg_toast.pg_toast_700305 | 18 GB
pg_toast.pg_toast_700305_index | 206 MB
public.catalog_hotelde_images | 122 MB
public.routes | 120 MB
VACUUM VERBOSE ANALYZE pg_toast.pg_toast_700305;
INFO: vacuuming "pg_toast.pg_toast_700305"
INFO: index "pg_toast_700305_index" now contains 9601330 row versions in 26329 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.06s/0.02u sec elapsed 0.33 sec.
INFO: "pg_toast_700305": found 0 removable, 0 nonremovable row versions in 0 out of 2393157 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.06s/0.07u sec elapsed 0.37 sec.
VACUUM
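As an aside, the table that owns a given TOAST relation can be found via pg_class.reltoastrelid; a sketch (the number in a TOAST table's name is the OID of its owning table, so SELECT 700305::regclass; gives the same answer):
SELECT c.oid::regclass AS owning_table
FROM pg_class c
WHERE c.reltoastrelid = 'pg_toast.pg_toast_700305'::regclass;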
Structure of the routes table:
id serial NOT NULL,
origin_id integer,
destination_id integer,
total_time integer,
total_distance integer,
speed_id integer,
uid bigint,
created_at timestamp without time zone,
updated_at timestamp without time zone,
CONSTRAINT routes_pkey PRIMARY KEY (id)
You can use one of the two types of vacuuming: standard or full.
standard:
VACUUM table_name;
full:
VACUUM FULL table_name;
Keep in mind that VACUUM FULL locks the table it is working on until it's finished.
You may want to perform standard vacuum more frequently on tables with frequent update/delete activity. It may not reclaim as much space as VACUUM FULL does, but you can keep running operations like SELECT, INSERT, UPDATE, and DELETE concurrently, and it takes less time to complete.
In my case, when pg_toast (along with other tables) got out of control, standard VACUUM made a slight difference but was not enough. I used VACUUM FULL to reclaim more disk space, which was very slow on large relations. I decided to tune autovacuum and use standard VACUUM more often on my tables which are updated frequently.
If you need to use VACUUM FULL, you should do it when your users are less active.
Also, do not turn off autovacuum.
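If VACUUM FULL is too disruptive, per-table storage parameters can make the standard autovacuum kick in earlier on hot tables; a hedged sketch (the table name and values are illustrative only):
ALTER TABLE catalog_hotelde_images SET (
    autovacuum_vacuum_scale_factor = 0.05, -- vacuum after ~5% dead tuples instead of the default 20%
    autovacuum_vacuum_threshold = 200
);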
You can get some additional information by adding verbose to your commands:
VACUUM FULL VERBOSE table_name;
Try the following:
VACUUM FULL;
I have big tables on which I only do inserts and selects, so when autovacuum runs on these tables the system is very slow. I have switched off autovacuum for specific tables:
ALTER TABLE ag_event_20141004_20141009 SET (autovacuum_enabled = false, toast.autovacuum_enabled = false);
ALTER TABLE ag_event_20141014_20141019 SET (autovacuum_enabled = false, toast.autovacuum_enabled = false);
Some time after this, I see:
select pid, waiting, xact_start, query_start, query from pg_stat_activity order by query_start;
18092 | f | 2014-11-04 22:21:05.95512+03 | 2014-11-04 22:21:05.95512+03 | autovacuum: VACUUM public.ag_event_20141004_20141009 (to prevent wraparound)
19877 | f | 2014-11-04 22:22:05.889182+03 | 2014-11-04 22:22:05.889182+03 | autovacuum: VACUUM public.ag_event_20141014_20141019 (to prevent wraparound)
What shall I do to switch off autovacuuming for these tables entirely?
The key here is:
(to prevent wraparound)
This means Postgres must autovacuum in order to free up transaction identifiers.
You cannot entirely disable this type of autovacuum, but you can reduce its frequency by tuning the autovacuum_freeze_max_age and vacuum_freeze_min_age parameters.
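As a possible workaround (a sketch, assuming the tables really are insert-only): run a manual VACUUM FREEZE on them during a quiet period. Freezing advances relfrozenxid, so the forced anti-wraparound autovacuum has nothing left to do until the XID age builds up again:
VACUUM FREEZE ag_event_20141004_20141009;
VACUUM FREEZE ag_event_20141014_20141019;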