How can I tell if PostgreSQL's Autovacuum is running on UNIX? - postgresql

How can one tell if the autovacuum daemon in Postgres 9.x is running and maintaining the database cluster?

PostgreSQL 9.3
Determine if Autovacuum is Running
This is specific to Postgres 9.3 on UNIX.
For Windows, see this question.
Query Postgres System Table
SELECT
schemaname, relname,
last_vacuum, last_autovacuum,
vacuum_count, autovacuum_count -- not available on 9.0 and earlier
FROM pg_stat_user_tables;
Grep System Process Status
$ ps -axww | grep autovacuum
24352 ?? Ss 1:05.33 postgres: autovacuum launcher process (postgres)
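A plain `grep autovacuum` can also list the grep command itself, and as far as I know newer servers show the title as `autovacuum launcher` without the trailing `process`. A minimal sketch that filters `ps` output read from stdin and copes with both; the function names are invented for illustration:

```shell
# Hypothetical helpers: filter `ps` output (read from stdin) for the
# autovacuum launcher. The [a] keeps the pattern from matching the grep
# process's own command line.
filter_autovacuum_launcher() {
    grep -E 'postgres: .*[a]utovacuum launcher'
}

# Succeeds (exit status 0) if the input contains a launcher line.
autovacuum_launcher_running() {
    filter_autovacuum_launcher >/dev/null
}

# Usage:
#   ps -axww | autovacuum_launcher_running && echo "launcher is up"
```

The exit status makes this easy to drop into a monitoring script or cron check.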
Grep Postgres Log
# grep autovacuum /var/log/postgresql
LOG: autovacuum launcher started
LOG: autovacuum launcher shutting down
If you want to know more about the autovacuum activity, set log_min_messages to DEBUG1..DEBUG5. The SQL command VACUUM VERBOSE will output information at log level INFO.
Regarding the autovacuum daemon, the Postgres docs state:
In the default configuration, autovacuuming is enabled and the related configuration parameters are appropriately set.
See Also:
http://www.postgresql.org/docs/current/static/routine-vacuuming.html
http://www.postgresql.org/docs/current/static/runtime-config-autovacuum.html

I'm using:
select count(*) from pg_stat_activity where query like 'autovacuum:%';
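in collectd to know how many autovacuum workers are running concurrently.
You may need to create a SECURITY DEFINER function like this, since an unprivileged monitoring role cannot see the query text of other users' sessions: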
CREATE OR REPLACE FUNCTION public.pg_autovacuum_count() RETURNS bigint
AS 'select count(*) from pg_stat_activity where query like ''autovacuum:%'';'
LANGUAGE SQL
STABLE
SECURITY DEFINER;
and call that from collectd.
Before PostgreSQL 9.2, the "query" column was named "current_query", so adjust the statement to match your version.
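The column rename happened in PostgreSQL 9.2, so a monitoring script that has to work against several server versions can pick the column from `server_version_num`. A rough sketch; the helper names are made up, and connection options are left out:

```shell
# Hypothetical helper: print the pg_stat_activity column that holds the
# query text for a given server_version_num (e.g. 90105 for 9.1.5).
activity_query_column() {
    if [ "$1" -ge 90200 ]; then
        echo query
    else
        echo current_query
    fi
}

# Build the counting statement for that server version.
autovacuum_count_sql() {
    col=$(activity_query_column "$1")
    printf "SELECT count(*) FROM pg_stat_activity WHERE %s LIKE 'autovacuum:%%';\n" "$col"
}

# Usage (connection options omitted):
#   ver=$(psql -At -c 'SHOW server_version_num')
#   psql -At -c "$(autovacuum_count_sql "$ver")"
```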

You can also run pg_activity to see the currently running queries on your database. I generally leave a terminal open with this running most of the time anyway as it's very useful.

Set log_autovacuum_min_duration to the duration you want; any autovacuum run that takes longer than that will be logged.

Related

Postgres 9.6 -> 14 using pglogical, autovacuum not running

We're upgrading our Postgresql from 9.6 to 14, using pglogical (latest installed via yum). The replication is working fine without errors. What we are not seeing, however, is any autovacuum activity on the v14 database, even though we continue to see normal autovacuum activity on the v9.6 database. Also, strangely, the dead tuple counts do not change on the v14 database and are mostly 0. I did run VACUUM ANALYZE on the v14 database.
The command we are using to see autovacuum activity is
SELECT relname, last_vacuum, last_autovacuum, last_autoanalyze FROM pg_stat_user_tables;
The command we are using to see dead tuple counts is
SELECT relname, n_dead_tup FROM pg_stat_user_tables;
There's nothing in the logs except checkpoint notifications. Here is one line picked at random:
2022-09-22 11:59:46 PDT [2877]: [15846-1] user=,db=,app=,client= LOG: checkpoint complete: wrote 38220 buffers (0.9%); 0 WAL file(s) added, 0 removed, 17 recycled; write=269.923 s, sync=0.025 s, total=269.962 s; sync files=264, longest=0.007 s, average=0.001 s; distance=313936 kB, estimate=329901 kB
The v14 database is streaming to another v14 database acting as a replica.
Is this expected behavior?
After experimenting, it's clear that the n_dead_tup counts are not updated while pglogical replication is running. This also means that autovacuum never runs while pglogical replication is running. Restarting the v14 node causes the n_dead_tup counts to be updated and does trigger autovacuum, but that is a one-time event (the counts are not updated again until another restart).
Once you disable pglogical, the n_dead_tup counts are immediately updated, and autovacuum starts to work again as expected (even without a restart).
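This matches how autovacuum decides when to run: according to the PostgreSQL documentation, a table is vacuumed once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × reltuples. If n_dead_tup stays at 0, that condition is never met. A rough sketch of the check, assuming the default settings (50 and 0.2) and using invented function names:

```shell
# Vacuum threshold = base + scale_factor * reltuples.
# Defaults assumed here: autovacuum_vacuum_threshold = 50,
# autovacuum_vacuum_scale_factor = 0.2.
vacuum_threshold() {
    awk -v n="$1" -v base="${2:-50}" -v scale="${3:-0.2}" \
        'BEGIN { printf "%d\n", base + scale * n }'
}

# Succeeds if n_dead_tup ($1) exceeds the threshold for reltuples ($2).
would_autovacuum() {
    [ "$1" -gt "$(vacuum_threshold "$2")" ]
}

# Usage:
#   vacuum_threshold 1000          # prints 250
#   would_autovacuum 0 1000 || echo "no autovacuum: n_dead_tup too low"
```

Per-table storage parameters can override both settings, so treat the defaults here as an assumption about the cluster in question.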

Why is Postgres idle transaction not terminated?

I have a long running idle query that is not automatically terminated.
I have set both the max timeouts to 2h (very long I know)
> select name,setting from pg_settings where name='statement_timeout' OR name='idle_in_transaction_session_timeout';
name | setting
-------------------------------------+---------
idle_in_transaction_session_timeout | 7200000
statement_timeout | 7200000
However I have this idle query (not idle_in_transaction) that is leftover from an application that crashed
> SELECT pid, age(clock_timestamp(), query_start), state, usename, query
FROM pg_stat_activity
WHERE query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;
17117 | 02:11:40.795487 | idle | ms1-user | select distinct ....
Postgres 11.13 running on AWS Aurora
Can anyone explain why/what's missing?
As the name suggests, idle_in_transaction_session_timeout does not terminate idle sessions, only sessions that are "idle in transaction". To terminate idle sessions, you can use idle_session_timeout, introduced in PostgreSQL v14.
In your case, the problem is the TCP keepalive settings. With the default keepalive settings on Linux (7200 seconds of idle time, then 9 probes 75 seconds apart), it takes the server a little over 2 hours and 11 minutes of connection inactivity to figure out that the other end of the connection is no longer there. So wait a few minutes more :^)
If you want to reduce the time, you can set the PostgreSQL parameters tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count if Amazon allows you to do that. If they don't, complain.
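The detection time follows from those three parameters: the connection has to sit idle for tcp_keepalives_idle seconds, then tcp_keepalives_count probes are sent tcp_keepalives_interval seconds apart before the peer is declared dead. A small sketch of that arithmetic (the function name is invented for illustration):

```shell
# Seconds until a dead peer is detected:
#   idle time + probe interval * probe count
# Arguments mirror tcp_keepalives_idle, tcp_keepalives_interval and
# tcp_keepalives_count, in that order.
keepalive_detection_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# Usage: 60 s idle, probes every 10 s, 6 probes
#   keepalive_detection_seconds 60 10 6    # prints 120
```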

Postgresql 9.2.1 failed to initialize after full vacuum in standalone backend mode

In PostgreSQL version 9.2.1, the database stopped accepting commands to avoid wraparound data loss. The following error occurred in pg_log:
ERROR: database is not accepting commands to avoid wraparound data loss in database "XXX"
HINT: Stop the postmaster and use a standalone backend to vacuum that database.
You might also need to commit or roll back old prepared transactions.
I executed VACUUM FULL for the database XXX in standalone backend mode. After that I tried to restart PostgreSQL, and now the server is rejecting connections; pg_isready also reports that the host is rejecting connections.
Is there anything I have to do after completing the vacuum process? What are the possible reasons for the Postgres server failing to start? Thanks in advance.
In single user mode, run
SELECT datname, datfrozenxid FROM pg_database;
to see which databases need to be vacuumed (those with the smallest values).
Run VACUUM (FREEZE) in these databases, not VACUUM (FULL).

Autovacuum is not running on Openshift Online Postgres cartridge

I have Postgres 9.2 on my Openshift Online cartridge. Using pgAdmin3, I have enabled (by ticking the box) the autovacuum setting for postgresql.conf. However, autovacuum does not seem to be running.
Here is what I have:
ps -ef | grep -i vacuum
No autovacuum process is shown.
In the psql console, SHOW autovacuum; says that its value is ON.
In the psql console, SELECT schemaname, relname, last_vacuum, last_autovacuum FROM pg_stat_user_tables; gives no value in the last_vacuum and last_autovacuum columns, even though I did a manual vacuum via the Maintenance function in pgAdmin3.
The properties tab on the db in pgAdminIII shows an AUTOVACUUM value of 'not running'.
What am I missing?
EDIT
I also cannot access the postgresql.conf on Openshift Online when trying to find the file on the server - hoping to manually edit the file instead of using pgAdminIII.
-- Found this https://www.openshift.com/forums/openshift/how-do-i-set-maxpreparedtransactions-on-my-postgresql-cartridge I am now able to view/edit my postgresql.conf. Apparently the autovacuum is on already so the conf has the right setting.
When I issue pg_ctl restart -m fast, I get:
LOG: could not bind socket for statistics collector: Permission denied
LOG: trying another address for the statistics collector
LOG: could not bind socket for statistics collector: Permission denied
LOG: trying another address for the statistics collector
LOG: could not bind socket for statistics collector: Cannot assign requested address
LOG: trying another address for the statistics collector
LOG: could not bind socket for statistics collector: Cannot assign requested address
LOG: disabling statistics collector for lack of working socket
WARNING: autovacuum not started because of misconfiguration
HINT: Enable the "track_counts" option.
LOG: database system was shut down at 2014-04-22 09:58:19 GMT
LOG: database system is ready to accept connections
Though track_counts is already set to on in postgresql.conf
Sorry for being so stupid but any help or pointers are much appreciated.
Thank you in advance.
I ran into a similar issue and found a helpful hint in this discussion:
... for some insane reason, openshit disabled localhost, and autovacuum only connects to localhost, I suppose it makes sense that they wouldn't want to be trying to vacuum a remote db... but openshit breaks autovacuum.
One solution I've found (and that I'll probably use) is to manually add a cron job that does a forced vacuum. Here is a batch script that looks promising, but be careful with the side effects that a forced vacuum might involve (depending on your app, of course).
Patching postgres to use the OPENSHIFT_PG_HOST environment variable instead of localhost seems to solve the problem: pgstat.patch.

Force client disconnect using PostgreSQL

Is there a way to force clients to disconnect from PostgreSQL? I'm looking for the equivalent of DB2's force application all.
I'd like to do this on my development box because when I've got database consoles open, I can't load a database dump. I have to quit them first.
Kills idle processes in PostgreSQL 8.4:
SELECT procpid, (SELECT pg_terminate_backend(procpid)) as killed from pg_stat_activity
WHERE current_query LIKE '<IDLE>';
Combine pg_terminate_backend function and the pg_stat_activity system view.
This SO answer beautifully explains (full quote from araqnid between the horizontal rules, then me again):
To mark database 'applogs' as not accepting new connections:
update pg_database set datallowconn = false where datname = 'applogs';
Another possibility would be to revoke 'connect' access on the database for the client role(s).
Disconnect users from database = kill backend. So to disconnect all other users from "applogs" database, for example:
select pg_terminate_backend(procpid)
from pg_stat_activity
where datname = 'applogs' and procpid <> pg_backend_pid();
Once you've done both of those, you are the only user connected to 'applogs'. Although there might actually be a delay before the backends actually finish disconnecting?
Update from MarkJL: There is indeed a delay before the backends finish disconnecting.
Now me again: That being said, mind that the procpid column was renamed to pid in PostgreSQL 9.2, so use pid on 9.2 and later.
I think that this is much more helpful than the answer by Milen A. Radev which, while technically the same, does not come with usage examples and real-life suggestions.
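Because the column is procpid before 9.2 and pid from 9.2 on, a script that targets several server versions can build the statement from `server_version_num`. A minimal sketch; the helper name is invented, and the database name is interpolated as-is, so it is only suitable for trusted names without quotes:

```shell
# Hypothetical helper: print a statement that terminates every other
# backend connected to database $1. $2 is server_version_num; servers
# older than 9.2 (90200) call the column procpid instead of pid.
# Note: $1 is not escaped, so pass only trusted database names.
terminate_others_sql() {
    if [ "$2" -ge 90200 ]; then col=pid; else col=procpid; fi
    printf "SELECT pg_terminate_backend(%s) FROM pg_stat_activity WHERE datname = '%s' AND %s <> pg_backend_pid();\n" \
        "$col" "$1" "$col"
}

# Usage (assumes a superuser connection; the database name is a placeholder):
#   psql -U postgres -c "$(terminate_others_sql applogs 90300)"
```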
I post my answer because I couldn't use any of them in my script, server 9.3:
psql -U postgres -c "SELECT pid, (SELECT pg_terminate_backend(pid)) as killed from pg_stat_activity WHERE datname = 'my_database_to_alter';"
In the next line, you can do anything you want with 'my_database_to_alter'. As you can see, you perform the query from the "postgres" database, which exists in almost every PostgreSQL installation.
Running it as a superuser, from outside the problem database itself, worked perfectly for me.
Probably a more heavy-handed approach than should be used, but:
for x in $(ps -eF | grep -E "[p]ostgres.*idle" | awk '{print $2}'); do kill "$x"; done
I found this thread on the mailing list. It suggests using SIGTERM to cause the clients to disconnect.
Not as clean as db2 force application all.