I support an application hosted by a small business: a web-based Ruby on Rails app with a PostgreSQL database on the backend.
Postgres is set up for replication to an off-site standby server, which as far as I can tell is working fine; when I query the remote server it shows that it's in recovery, etc.
From the 'master' server:
postgres=# table pg_stat_replication ;
  pid  | usesysid | usename | application_name |  client_addr   | client_hostname | client_port |         backend_start         |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
-------+----------+---------+------------------+----------------+-----------------+-------------+-------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
 18660 |  1281085 | rep     | postgresql2      | 192.168.81.155 |                 |       43824 | 2017-05-07 11:42:43.150057-04 | streaming | 3/B5243418    | 3/B5243418     | 3/B5243418     | 3/B5243150      |             1 | sync
(1 row)
...and on the 'slave':
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
postgres=# select now() - pg_last_xact_replay_timestamp() AS replication_delay;
replication_delay
-------------------
01:02:14.885511
(1 row)
I understand the process involved should I have to promote my remote slave DB to the role of master. The problem is that on two or three occasions now the network link to the remote slave server has gone down, and the application completely "freezes up" (e.g. a page loads but will not allow users to log on), despite the fact that the master DB is still up and running.
I have WAL archiving enabled to make sure that when something like this happens the data is preserved until the link is restored and the transaction logs can be sent. But I don't understand why my master pgsql instance seems to lock up just because the slave instance goes offline. That kind of defeats the entire concept of replication, so I assume I must be doing something wrong?
The most likely explanation is that you are using synchronous replication with just two nodes.
Is synchronous_standby_names set on the master server?
If the only synchronous standby server is not available, no transaction can commit on the master, and data modifying transactions will “hang”, which would explain the behaviour you observe.
For synchronous replication you need at least two standbys (slaves), so that losing one does not block commits on the master.
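To confirm this and, if needed, fall back to asynchronous replication until the link is back, a minimal sketch (assuming a version with ALTER SYSTEM; otherwise edit postgresql.conf and reload):
-- Check whether a synchronous standby is configured:
SHOW synchronous_standby_names;
-- If it is non-empty (e.g. 'postgresql2'), every commit waits for that standby.
-- To fall back to asynchronous replication while the link is down:
ALTER SYSTEM SET synchronous_standby_names = '';
SELECT pg_reload_conf();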
I have a long-running idle query that is not automatically terminated.
I have set both of the max timeouts to 2h (very long, I know):
> select name,setting from pg_settings where name='statement_timeout' OR name='idle_in_transaction_session_timeout';
name | setting
-------------------------------------+---------
idle_in_transaction_session_timeout | 7200000
statement_timeout | 7200000
However, I have this idle query (not idle in transaction) that is left over from an application that crashed:
> SELECT pid, age(clock_timestamp(), query_start), state, usename, query
FROM pg_stat_activity
WHERE query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;
  pid  |       age       | state | usename  |        query
-------+-----------------+-------+----------+----------------------
 17117 | 02:11:40.795487 | idle  | ms1-user | select distinct ....
Postgres 11.13 running on AWS Aurora
Can anyone explain why/what's missing?
As the name suggests, idle_in_transaction_session_timeout does not terminate idle sessions, only sessions that are "idle in transaction". To terminate sessions that are merely idle, you need idle_session_timeout, which was introduced in PostgreSQL v14.
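On v14 or later that would look like the sketch below; it does not help on the v11 Aurora instance from the question, and the two-hour value simply mirrors the settings shown above:
-- PostgreSQL v14+ only: terminate sessions idle (outside a transaction) for more than 2h.
ALTER SYSTEM SET idle_session_timeout = '2h';
SELECT pg_reload_conf();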
In your case, the problem is the TCP keepalive settings. With the default keepalive settings on Linux, it takes the server around 2 hours and 14.5 minutes to figure out that the other end of the connection is no longer there. So wait a few minutes more :^)
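If you don't want to wait, you can terminate the leftover backend by hand (the pid is taken from the pg_stat_activity output above):
SELECT pg_terminate_backend(17117);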
If you want to reduce the time, you can set the PostgreSQL parameters tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count if Amazon allows you to do that. If they don't, complain.
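A sketch of what that could look like; the values are only illustrative, and on Aurora/RDS they would have to go into a DB parameter group rather than ALTER SYSTEM:
ALTER SYSTEM SET tcp_keepalives_idle = 60;      -- seconds of inactivity before the first probe
ALTER SYSTEM SET tcp_keepalives_interval = 10;  -- seconds between probes
ALTER SYSTEM SET tcp_keepalives_count = 6;      -- unanswered probes before the connection is dropped
SELECT pg_reload_conf();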
We have hot_standby replication configured for our PostgreSQL DB; the master is for read/write and the replica is a read-only server.
When I fetch the vacuum and analyze stats on both master and replica (slave) using the following query
#########vacuum and analyze stats
SELECT relname,last_autovacuum,last_autoanalyze,autovacuum_count,autoanalyze_count FROM pg_stat_user_tables;
#########vacuum and analyze stats
I get the following data on the slave server:
MyTestDB=# SELECT relname,last_autovacuum,last_autoanalyze,autovacuum_count,autoanalyze_count FROM pg_stat_user_tables;
relname | last_autovacuum | last_autoanalyze | autovacuum_count | autoanalyze_count
-----------------------------------+-----------------+------------------+------------------+-------------------
Table1 | | | 0 | 0
Table2 | | | 0 | 0
The question is: does autovacuum/autoanalyze apply to the slave server? If so, shouldn't it show some stats, such as autovacuum_count and autoanalyze_count?
Note: as per this thread in the Postgres forum, the effects of VACUUM and ANALYZE are automatically replicated to the slave.
VACUUM and ANALYZE won't run on standby servers (the results of these operations on the primary are replicated along with all other data), so you will never see statistics that show that they have run.
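To see the autovacuum activity, run the same query on the primary; a sketch, with the table names taken from the question's output:
-- On the primary: autovacuum/autoanalyze statistics are maintained only here.
SELECT relname, last_autovacuum, last_autoanalyze, autovacuum_count, autoanalyze_count, n_dead_tup
FROM pg_stat_user_tables
WHERE relname IN ('Table1', 'Table2');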
We recently saw a few queries "idle in transaction" for quite some time
pid | usename | state | duration | application_name | wait_event | wait_event_type
------+---------+---------------------+----------+------------------+------------+----------------
31620 | results | idle in transaction | 12:52:23 | bin/rails | |
That is almost 13 hours idle in transaction.
Any idea what causes them to get stuck in idle, or how to dig deeper? We did notice some OOM errors for background jobs.
There are also a lot of "idle" queries, but thanks for the comments, those seem to be fine:
In postgresql "idle in transaction" with all locks granted, @LaurenzAlbe pointed out the idle session timeout configuration option as a band-aid, but I'd rather understand this issue than hide it.
thanks!
PS: our application is Ruby on Rails and we use a mix of Active Record and custom SQL
EDIT: original title was "idle in transaction", the queries are actually just idle most of the time and not in transaction, sorry about that
EDIT #2: found the 13 hour idle in transaction process
These sessions are actually all idle, so they are no problem.
idle is significantly different from idle in transaction: the latter is an open transaction that holds locks and blocks VACUUM, while the former is harmless.
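A sketch of a monitoring query for spotting genuine idle-in-transaction sessions, using only standard pg_stat_activity columns:
-- Sessions that are idle inside an open transaction, oldest transaction first.
SELECT pid, usename, application_name,
       now() - xact_start AS xact_age,
       now() - state_change AS idle_for
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;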
The OOM errors must have a different reason.
You should configure the machine so that
shared_buffers + max_connections * work_mem <= available RAM
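To check the current values on your server:
SHOW shared_buffers;
SHOW work_mem;
SHOW max_connections;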
I'm still a relative newbie to Postgresql, so pardon if this is simple ignorance.
I've set up an active/read-only Pacemaker cluster of Postgres v9.4 per the Cluster Labs documentation.
I'm trying to verify that both databases are indeed in sync. I'm doing the dump on both hosts and checking the diff between the output. The command I'm using is:
pg_dump -U myuser mydb > dump-node-1.sql
Pacemaker shows the database status as 'sync' and querying Postgres directly also seems to indicate the sync is good... (Host .59 is my read-only standby node)
psql -c "select client_addr,sync_state from pg_stat_replication;"
+---------------+------------+
| client_addr | sync_state |
+---------------+------------+
| 192.16.111.59 | sync |
+---------------+------------+
(1 row)
However, when I do a dump on the read-only host, I end up with 'public.' added to the front of all my table names. So table foo on the master node dumps as 'foo', whereas on the read-only node it dumps as 'public.foo'. I don't understand why this is happening; I had set up a 9.2 PostgreSQL cluster similarly and didn't see this issue. I don't have tables in the public schema on the master node...
Hope someone can help me understand what is going on.
Much appreciated!
Per a_horse_with_no_name, the security updates in 9.4.18 changed the way the dump is written compared to 9.4.15. I didn't catch that one node was still running an older version. The command that identified the problem was his suggestion to run:
psql -c "select version();"
I have xlog questions that I'm not sure about.
1) I have two servers that were once slaves. How can I know if they were slaves of the same master? Is it possible to check whether they were split from the same source in the past? I know pg_rewind can check this, but is it possible to check it easily without running pg_rewind in dry-run mode?
2) Is it true that if pg_last_xlog_replay_location is empty this server was never a slave?
3) Is it possible to know from the database itself to which master the slave is connected? I know to get this info from the recovery.conf or from the process attributes, but is it written in some system tables as well?
Thanks
Avi
were slaves of the same master
Indirectly. You can compare select xmin, ctid, oid, datname from pg_database. Of course, dropping and recreating the postgres and template databases will change those values, so this is very unreliable. But if you check them and find that ALL identifiers match, there's a good chance that the databases have the same source.
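Written out as a query to run on both former slaves:
-- Compare the rows from both servers; identical xmin/ctid/oid values for every
-- database suggest (but do not prove) a common origin.
SELECT xmin, ctid, oid, datname FROM pg_database;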
A more reliable and sophisticated method is comparing the timeline history files. E.g., if both ex-slaves are on the same timeline, as in the case below (timeline 4):
-bash-4.2$ psql -d 'dbname=replication replication=true sslmode=require' -U replica -h 1.1.1.1 -c 'IDENTIFY_SYSTEM'
Password for user replica:
systemid | timeline | xlogpos
---------------------+----------+--------------
9999384298900975599 | 4 | F79/275B2328
(1 row)
you can check the timeline history:
-bash-4.2$ psql -d 'dbname=replication replication=true sslmode=require' -U replica -h 1.1.1.1 -c 'TIMELINE_HISTORY 4'
Password for user replica:
filename | content
------------------+------------------------------------------------------
00000004.history | 1 9E/C3000090 no recovery target specified+
| +
| 2 C1/5A000090 no recovery target specified+
| +
| 3 A52/DB2F98B8 no recovery target specified+
|
(1 row)
If both servers have the same timeline and the same xlog position at which that timeline was created, you can say with much reliability, I believe, that they came from the same source.
empty pg_last_xlog_replay_location
I would say so. It was never a slave and was never recovered from WAL. At least I don't know how to reset pg_last_xlog_replay_location on a promoted master...
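For reference, the check itself (this is the pre-v10 name; from v10 on the function is pg_last_wal_replay_lsn()):
-- Returns NULL on a server that has never replayed WAL.
SELECT pg_last_xlog_replay_location();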
system tables to tell to which master the slave is connected
Nothing suitable comes to mind. If you are a superuser you can read recovery.conf even without shell access; if you're not, you probably would not be able to select from such a view anyway...
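A sketch of reading it from SQL (pg_read_file is superuser-only by default and resolves relative paths against the data directory, which is where recovery.conf lives on a pre-v12 standby):
-- Superuser only: show the standby's recovery settings, including primary_conninfo.
SELECT pg_read_file('recovery.conf');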