Why is Postgres idle transaction not terminated? - postgresql

I have a long running idle query that is not automatically terminated.
I have set both the max timeouts to 2h (very long I know)
> select name,setting from pg_settings where name='statement_timeout' OR name='idle_in_transaction_session_timeout';
name | setting
-------------------------------------+---------
idle_in_transaction_session_timeout | 7200000
statement_timeout | 7200000
However I have this idle query (not idle_in_transaction) that is leftover from an application that crashed
> SELECT pid, age(clock_timestamp(), query_start), state, usename, query
FROM pg_stat_activity
WHERE query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;
17117 | 02:11:40.795487 | idle | ms1-user | select distinct ....
Postgres 11.13 running on AWS Aurora
Can anyone explain why/what's missing?

As the name suggests, idle_in_transaction_session_timeout does not terminate idle sessions, but sessions that are "idle in transaction". For the latter, you can use idle_session_timeout introduced in PostgreSQL v14.
In your case, the problem are the TCP keepalive settings. With the default keepalive settings on Linux, it takes the server around 2 hours and 14.5 minutes to figure out that the other end of the connection is no longer there. So wait a few minutes more :^)
If you want to reduce the time, you can set the PostgreSQL parameters tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count if Amazon allows you to do that. If they don't, complain.

Related

How to Write PostgreSQL Stored Procedure to find and kill idle query?

Need to write PostgreSQL Stored Procedure to find and kill Idle queries. Below are the query to find and kill PID:
To find list of PID:
SELECT pid FROM pg_stat_activity where datname='dbdataanalytics' and state='idle' and state_change<=current_date-1
To kill The PID:
SELECT pg_terminate_backend(pid) FROM idle_connections
But I need to run this every day automatically. Please help me write stored procedure.
You should upgrade to PostgreSQL v14 and set the idle_session_timeout parameter. Even better would be to fix the connection leak in your application.
To terminate all connections that have been idle for more then 5 minutes, tun
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
AND state_change < current_timestamp - INTERVAL '5 minutes';

postgres "idle in transaction" for 13 hours

We recently saw a few queries "idle in transaction" for quite some time
pid | usename | state | duration | application_name | wait_event | wait_event_type
------+---------+---------------------+----------+------------------+------------+----------------
31620 | results | idle in transaction | 12:52:23 | bin/rails | |
That is almost 13 hours idle in transaction.
Any idea what causes them to get stuck in idle, or how to dig deeper? We did notice some OOM errors for background jobs.
There are also a lot of "idle" queries, but thanks for the comments, those seem to be fine:
In postgresql "idle in transaction" with all locks granted #LaurenzAlbe was pointing out the idle session timeout configuration option as a band-aid, but I'd rather understand this issue than hide it.
thanks!
PS: our application is ruby on rails and we use a mix of active record and custom SQL
EDIT: original title was "idle in transaction", the queries are actually just idle most of the time and not in transaction, sorry about that
EDIT #2: found the 13 hour idle in transaction process
These sessions are actually all idle, so they are no problem.
idle is significantly different from idle in transaction: the latter is an open transaction that holds locks and blocks VACUUM, the first is harmless.
The OOM errors must have a different reason.
You should configure the machine so that
shared_buffers + max_connections * work_mem <= available RAM

How can I tell if PostgreSQL's Autovacuum is running on UNIX?

How can one tell if the autovacuum daemon in Postgres 9.x is running and maintaining the database cluster?
PostgreSQL 9.3
Determine if Autovacuum is Running
This is specific to Postgres 9.3 on UNIX.
For Windows, see this question.
Query Postgres System Table
SELECT
schemaname, relname,
last_vacuum, last_autovacuum,
vacuum_count, autovacuum_count -- not available on 9.0 and earlier
FROM pg_stat_user_tables;
Grep System Process Status
$ ps -axww | grep autovacuum
24352 ?? Ss 1:05.33 postgres: autovacuum launcher process (postgres)
Grep Postgres Log
# grep autovacuum /var/log/postgresql
LOG: autovacuum launcher started
LOG: autovacuum launcher shutting down
If you want to know more about the autovacuum activity, set log_min_messages to DEBUG1..DEBUG5. The SQL command VACUUM VERBOSE will output information at log level INFO.
Regarding the Autovacuum Daemon, the Posgres docs state:
In the default configuration, autovacuuming is enabled and the related configuration parameters are appropriately set.
See Also:
http://www.postgresql.org/docs/current/static/routine-vacuuming.html
http://www.postgresql.org/docs/current/static/runtime-config-autovacuum.html
I'm using:
select count(*) from pg_stat_activity where query like 'autovacuum:%';
in collectd to know how many autovacuum are running concurrently.
You may need to create a security function like this:
CREATE OR REPLACE FUNCTION public.pg_autovacuum_count() RETURNS bigint
AS 'select count(*) from pg_stat_activity where query like ''autovacuum:%'';'
LANGUAGE SQL
STABLE
SECURITY DEFINER;
and call that from collectd.
In earlier Postgres, "query" was "current_query" so change it according to what works.
You can also run pg_activity to see the currently running queries on your database. I generally leave a terminal open with this running most of the time anyway as it's very useful.
set log_autovacuum_min_duration to the time length that you desire and the autovacuum execution exceeds the time length will be logged.

Is there a timeout for idle PostgreSQL connections?

1 S postgres 5038 876 0 80 0 - 11962 sk_wai 09:57 ? 00:00:00 postgres: postgres my_app ::1(45035) idle
1 S postgres 9796 876 0 80 0 - 11964 sk_wai 11:01 ? 00:00:00 postgres: postgres my_app ::1(43084) idle
I see a lot of them. We are trying to fix our connection leak. But meanwhile, we want to set a timeout for these idle connections, maybe max to 5 minute.
It sounds like you have a connection leak in your application because it fails to close pooled connections. You aren't having issues just with <idle> in transaction sessions, but with too many connections overall.
Killing connections is not the right answer for that, but it's an OK-ish temporary workaround.
Rather than re-starting PostgreSQL to boot all other connections off a PostgreSQL database, see: How do I detach all other users from a postgres database? and How to drop a PostgreSQL database if there are active connections to it? . The latter shows a better query.
For setting timeouts, as #Doon suggested see How to close idle connections in PostgreSQL automatically?, which advises you to use PgBouncer to proxy for PostgreSQL and manage idle connections. This is a very good idea if you have a buggy application that leaks connections anyway; I very strongly recommend configuring PgBouncer.
A TCP keepalive won't do the job here, because the app is still connected and alive, it just shouldn't be.
In PostgreSQL 9.2 and above, you can use the new state_change timestamp column and the state field of pg_stat_activity to implement an idle connection reaper. Have a cron job run something like this:
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'regress'
AND pid <> pg_backend_pid()
AND state = 'idle'
AND state_change < current_timestamp - INTERVAL '5' MINUTE;
In older versions you need to implement complicated schemes that keep track of when the connection went idle. Do not bother; just use pgbouncer.
In PostgreSQL 9.6, there's a new option idle_in_transaction_session_timeout which should accomplish what you describe. You can set it using the SET command, e.g.:
SET SESSION idle_in_transaction_session_timeout = '5min';
In PostgreSQL 9.1, the idle connections with following query. It helped me to ward off the situation which warranted in restarting the database. This happens mostly with JDBC connections opened and not closed properly.
SELECT
pg_terminate_backend(procpid)
FROM
pg_stat_activity
WHERE
current_query = '<IDLE>'
AND
now() - query_start > '00:10:00';
if you are using postgresql 9.6+, then in your postgresql.conf you can set
idle_in_transaction_session_timeout = 30000 (msec)
There is a timeout on broken connections (i.e. due to network errors), which relies on the OS' TCP keepalive feature. By default on Linux, broken TCP connections are closed after ~2 hours (see sysctl net.ipv4.tcp_keepalive_time).
There is also a timeout on abandoned transactions, idle_in_transaction_session_timeout and on locks, lock_timeout. It is recommended to set these in postgresql.conf.
But there is no timeout for a properly established client connection. If a client wants to keep the connection open, then it should be able to do so indefinitely. If a client is leaking connections (like opening more and more connections and never closing), then fix the client. Do not try to abort properly established idle connections on the server side.
A possible workaround that allows to enable database session timeout without an external scheduled task is to use the extension pg_timeout that I have developped.
Another option is set this value "tcp_keepalives_idle". Check more in documentation https://www.postgresql.org/docs/10/runtime-config-connection.html.

How to close idle connections in PostgreSQL automatically?

Some clients connect to our postgresql database but leave the connections opened.
Is it possible to tell Postgresql to close those connection after a certain amount of inactivity ?
TL;DR
IF you're using a Postgresql version >= 9.2
THEN use the solution I came up with
IF you don't want to write any code
THEN use arqnid's solution
IF you don't want to write any code
AND you're using a Postgresql version >= 14
THEN use Laurenz Albe's solution
For those who are interested, here is the solution I came up with, inspired from Craig Ringer's comment:
(...) use a cron job to look at when the connection was last active (see pg_stat_activity) and use pg_terminate_backend to kill old ones.(...)
The chosen solution comes down like this:
First, we upgrade to Postgresql 9.2.
Then, we schedule a thread to run every second.
When the thread runs, it looks for any old inactive connections.
A connection is considered inactive if its state is either idle, idle in transaction, idle in transaction (aborted) or disabled.
A connection is considered old if its state stayed the same during more than 5 minutes.
There are additional threads that do the same as above. However, those threads connect to the database with different user.
We leave at least one connection open for any application connected to our database. (rank() function)
This is the SQL query run by the thread:
WITH inactive_connections AS (
SELECT
pid,
rank() over (partition by client_addr order by backend_start ASC) as rank
FROM
pg_stat_activity
WHERE
-- Exclude the thread owned connection (ie no auto-kill)
pid <> pg_backend_pid( )
AND
-- Exclude known applications connections
application_name !~ '(?:psql)|(?:pgAdmin.+)'
AND
-- Include connections to the same database the thread is connected to
datname = current_database()
AND
-- Include connections using the same thread username connection
usename = current_user
AND
-- Include inactive connections only
state in ('idle', 'idle in transaction', 'idle in transaction (aborted)', 'disabled')
AND
-- Include old connections (found with the state_change field)
current_timestamp - state_change > interval '5 minutes'
)
SELECT
pg_terminate_backend(pid)
FROM
inactive_connections
WHERE
rank > 1 -- Leave one connection for each application connected to the database
If you are using PostgreSQL >= 9.6 there is an even easier solution. Let's suppose you want to delete all idle connections every 5 minutes, just run the following:
alter system set idle_in_transaction_session_timeout='5min';
In case you don't have access as superuser (example on Azure cloud), try:
SET SESSION idle_in_transaction_session_timeout = '5min';
But this latter will work only for the current session, that most likely is not what you want.
To disable the feature,
alter system set idle_in_transaction_session_timeout=0;
or
SET SESSION idle_in_transaction_session_timeout = 0;
(by the way, 0 is the default value).
If you use alter system, you must reload configuration to start the change and the change is persistent, you won't have to re-run the query anymore if, for example, you will restart the server.
To check the feature status:
show idle_in_transaction_session_timeout;
Connect through a proxy like PgBouncer which will close connections after server_idle_timeout seconds.
From PostgreSQL v14 on, you can set the idle_session_timeout parameter to automatically disconnect client sessions that are idle.
If you use AWS with PostgreSQL >= 9.6, you have to do the following:
Create custom parameter group
go to RDS > Parameter groups > Create parameter group
Select the version of PSQL that you use, name it 'customParameters' or whatever and add description 'handle idle connections'.
Change the idle_in_transaction_session_timeout value
Fortunately it will create a copy of the default AWS group so you only have to tweak the things that you deem not suitable for your use-case.
Now click on the newly created parameter group and search 'idle'.
The default value for 'idle_in_transaction_session_timeout' is set to 24 hours (86400000 milliseconds). Divide this number by 24 to have hours (3600000) and then you have to again divide 3600000 by 4, 6 or 12 depending on whether you want the timeout to be respectively 15, 10 or 5 minutes (or equivalently multiply the number of minutes x 60000, so value 300 000 for 5 minutes).
Assign the group
Last, but not least, change the group:
go to RDS, select your DB and click on 'Modify'.
Now under 'Database options' you will find 'DB parameter group', change it to the newly created group.
You can then decide if you want to apply the modifications immediately (beware of downtime).
I have the problem of denied connections as there are too much clients connected on Postgresql 12 server (but not on similar projects using earlier 9.6 and 10 versions) and Ubuntu 18.
I wonder if those settings
tcp_keepalives_idle
tcp_keepalives_interval
could be more relevant than
idle_in_transaction_session_timeout
idle_in_transaction_session_timeout indeed closes only the idle connections from failed transactions, not the inactive connections whose statements terminate correctly...
the documentation reads that these socket-level settings have no impact with Unix-domain sockets but it could work on Ubuntu.
Up to PostgreSQL 13, you can use my extension pg_timeout.