pg_basebackup fails with "too many connections for role "replication""

I am trying to set up a standby server and keep getting this error. My primary server has more than enough connections to handle the load:
listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 10
max_connections=100
checkpoint_segments = 8
wal_keep_segments = 8
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/archive/%f'
This is the command that fails on the standby server:
pg_basebackup -h ${MASTER_PORT_5432_TCP_ADDR} -D ${PGDATA} -U ${REP_USER} -vPw --xlog-method=stream
I don't understand why this happens.
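One thing worth checking, as a hedged sketch rather than a confirmed diagnosis: this error message refers to the per-role connection limit (the role's CONNECTION LIMIT, stored in pg_roles.rolconnlimit), not to max_connections or max_wal_senders. With --xlog-method=stream, pg_basebackup opens a second connection to stream WAL alongside the backup, so a role limited to one connection would not be enough. The question does not show the role definition, so the limit values below are assumptions, and the helper names are hypothetical.

```python
def connections_needed(xlog_method):
    """Connections pg_basebackup opens: 'stream' adds a second one for WAL."""
    return 2 if xlog_method == "stream" else 1


def fits_role_limit(rolconnlimit, in_use, needed):
    """Return True if the role can open `needed` more connections.

    rolconnlimit follows the pg_roles convention: -1 means no limit.
    """
    if rolconnlimit < 0:
        return True
    return in_use + needed <= rolconnlimit


# Assumed example: a replication role created with CONNECTION LIMIT 1.
needed = connections_needed("stream")
print("needed:", needed, "fits limit 1:", fits_role_limit(1, 0, needed))
print("needed:", needed, "fits limit 2:", fits_role_limit(2, 0, needed))
```

If the role limit turns out to be the cause, raising it or using --xlog-method=fetch (one connection) are the two obvious options.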

pgbench benchmark rises using pgbouncer

I am running Postgres 13 with PgBouncer.
When I run a pgbench test and raise the number of client connections and the number of queries,
tps also rises. Why?
For example:
pgbench -U postgres -h .... -p 6544 datafactory -c 700 -j 8 -t 50 -S
latency average = 1591.633 ms
tps = 439.799988 (including connections establishing)
tps = 442.163375 (excluding connections establishing)
pgbench -U postgres -h ... -p 6544 datafactory -c 700 -j 8 -t 100 -S
latency average = 1286.131 ms
tps = 544.268178 (including connections establishing)
tps = 545.953341 (excluding connections establishing)
pgbench -U postgres -h ... -p 6544 datafactory -c 700 -j 8 -t 300 -S
latency average = 1246.031 ms
tps = 561.783731 (including connections establishing)
tps = 562.399700 (excluding connections establishing)
Finally, if I lower the number of queries, tps drops:
pgbench -U postgres -h .. -p 6544 datafactory -c 700 -j 8 -t 10 -S
latency average = 8633.526 ms
tps = 81.079273 (including connections establishing)
tps = 81.465337 (excluding connections establishing)
So my question is: why does tps rise? I thought that tps should fall as the load increases.
If I connect to the database directly on port 5433, without PgBouncer, the benchmark result does fall as the load increases.
Additional information:
All 3 (pgbench, pgbouncer, db) are running on one machine.
When I connect directly to port 5433, PgBouncer does not hold the connection.
Here are examples of connecting directly to the database on port 5433:
pgbench -U postgres -h ... -p 5433 datafactory -c 250 -j 8 -t 500
latency average = 2155.394 ms
tps = 115.988055 (including connections establishing)
tps = 116.134037 (excluding connections establishing)
pgbench -U postgres -h ... -p 5433 datafactory -c 250 -j 8 -t 700
latency average = 835.555 ms
tps = 299.202467 (including connections establishing)
tps = 299.228977 (excluding connections establishing)
Here it fails:
pgbench -U postgres -h ... -p 5433 datafactory -c 250 -j 8 -t 1000
WARNING: terminating connection because of crash of another server process
pgbench -U postgres -h ... -p 5433 datafactory -c 250 -j 8 -t 700
connection to database "datafactory" failed:
FATAL: the database system is in recovery mode
PgBouncer configuration:
listen_port = 6544
listen_addr = '*'
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
auth_proxy = on
auth_failure_threshold = 3
auth_inactivity_period = 60
auth_last_size = 10
log_audit = 1
logfile = /pgerrorlogs/tkldd-ldisu0001/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
admin_users = pgbouncer
max_client_conn = 1000
pool_mode = transaction
min_pool_size = 0
default_pool_size = 30
max_db_connections = 30
max_user_connections = 30
ignore_startup_parameters = extra_float_digits
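To put the PgBouncer runs above in perspective, here is a small sketch that recomputes the total work and the approximate wall-clock time of each run from the reported figures. It assumes only that total transactions = clients x transactions per client and that pgbench's "latency average" for these runs is clients / tps. The function name is hypothetical.

```python
# Figures copied from the pgbench runs through PgBouncer (port 6544) above.
runs = [
    # (clients, transactions per client, reported tps incl. connection setup)
    (700, 10, 81.079273),
    (700, 50, 439.799988),
    (700, 100, 544.268178),
    (700, 300, 561.783731),
]


def summarize(clients, tx_per_client, tps):
    """Return (total transactions, elapsed seconds, average latency in ms)."""
    total_tx = clients * tx_per_client
    elapsed_s = total_tx / tps            # time the whole run took
    latency_ms = 1000.0 * clients / tps   # matches pgbench's latency average
    return total_tx, elapsed_s, latency_ms


for clients, tx, tps in runs:
    total, elapsed, latency = summarize(clients, tx, tps)
    print(f"-t {tx:>3}: {total:>6} transactions, ~{elapsed:.1f} s, ~{latency:.1f} ms")
```

The -t 10 run covers only 7,000 transactions in total, yet takes about as long as the -t 50 run. One plausible reading (not confirmed anywhere in this thread) is that a roughly fixed startup cost, such as 700 clients connecting and authenticating, dominates the short runs and is amortized in the longer ones, while default_pool_size = 30 caps the number of queries that actually run concurrently on the server.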

PgBouncer - Server DNS lookup failed

I am trying to test a PgBouncer connection with pgbench on a PostgreSQL server, but PgBouncer closes the connection with "server DNS lookup failed", and pgbench prints this message:
pgbench -c 10 -t 10 -C -f C:\Users\Administrator\Downloads\query.sql -U postgres -p 6432 -n tags
Password:
pgbench: error: connection to server at "localhost" (::1), port 6432 failed: FATAL: client_login_timeout (server down)
This is observed in PgBouncer log:
LOG C-01715ff0: tags/postgres#[::1]:52768 login attempt: db=tags user=postgres tls=no
WARNING DNS lookup failed: localhost: result=11001
LOG S-0174d218: tags/postgres#(bad-af):0 closing because: server DNS lookup failed (age=0s)
I'm running PostgreSQL-14.5 on Windows Server 2019
PgBouncer Config:
[databases]
postgres = host=localhost port=5435
tags = dbname=postgres host=localhost port=5435 user=postgres password=Admin#!23 auth_user=postgres
[pgbouncer]
logfile = C:\Program Files (x86)\PgBouncer\log\pgbouncer.log
pidfile = C:\Program Files (x86)\PgBouncer\log\pgbouncer.pid
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = C:\Program Files (x86)\PgBouncer\etc\userlist.txt
admin_users = postgres
stats_users = postgres
pool_mode = session
max_client_conn = 100
default_pool_size = 20
I tried the solutions suggested in this post, such as setting
listen_addresses = '*' in postgresql.conf
but the issue was not resolved. I also searched the internet for a solution and did not find one.
Am I missing anything? Please suggest a solution to this issue. Thanks in advance.
I was able to connect to the Postgres DB after changing the [databases] configuration in pgbouncer.ini to: tags = host=127.0.0.1 port=5435 auth_user=postgres dbname=postgres.
It turned out that the issue was that localhost had not been added to pg_hba.conf.
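The workaround above replaces a hostname with an IP literal, which means PgBouncer no longer has to resolve anything for that entry. As a rough illustration, the sketch below parses a [databases] line and reports whether its host value would need a DNS lookup. The parser is a simplification (it assumes space-separated key=value pairs with no quoting), and the function names are hypothetical.

```python
import ipaddress


def parse_db_entry(line):
    """Split 'name = key=value key=value ...' into (name, params dict)."""
    name, _, rest = line.partition("=")
    params = dict(item.split("=", 1) for item in rest.split())
    return name.strip(), params


def needs_dns(host):
    """True if host is a name to resolve, False for IP literals and socket paths."""
    if host.startswith("/"):  # Unix socket directory, no lookup involved
        return False
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        return True


for entry in (
    "tags = dbname=postgres host=localhost port=5435 auth_user=postgres",
    "tags = host=127.0.0.1 port=5435 auth_user=postgres dbname=postgres",
):
    name, params = parse_db_entry(entry)
    print(name, params["host"], "needs DNS lookup:", needs_dns(params["host"]))
```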

Run PostgreSQL streaming replication synchronous and asynchronous simultaneously

I am using PostgreSQL 14 with Ubuntu as my OS.
I have set up database replication in synchronous mode.
Now I want to add another server, and the relationship between the primary and the new standby node should be asynchronous.
Another problem: how do I set application_name for different nodes? Both of my standby servers report the same application_name=14/main.
Can anyone help me solve these issues?
Yes, you can run both simultaneously.
You should have this configuration in your primary node:
listen_addresses = '*'
port = 5432
wal_level = hot_standby
max_wal_senders = 16
wal_keep_size = 512MB # replaces wal_keep_segments = 32 (32 x 16 MB), which was removed in PostgreSQL 13
synchronous_commit = on
synchronous_standby_names = 'pgsql_0_node_0'
Restart the node to apply the changes:
$ systemctl restart postgresql-14
Create the replication role (in psql):
postgres=# CREATE ROLE replication_user WITH LOGIN PASSWORD 'PASSWORD' REPLICATION;
And configure this in your standby nodes:
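Both (also create an empty standby.signal file in the data directory; since PostgreSQL 12 this file replaces standby_mode = 'on'):
wal_level = hot_standby
hot_standby = on
Sync: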
promote_trigger_file='/tmp/failover_5432.trigger'
recovery_target_timeline=latest
primary_conninfo='application_name=pgsql_0_node_0 host=PRIMARY_NODE port=5432 user=replication_user password=PASSWORD'
Replace PRIMARY_NODE, user, and password with the correct values.
Async:
promote_trigger_file='/tmp/failover_5432.trigger'
recovery_target_timeline=latest
primary_conninfo='application_name=pgsql_0_node_1 host=PRIMARY_NODE port=5432 user=replication_user password=PASSWORD'
Replace PRIMARY_NODE, user, and password with the correct values.
Restart the node to apply the changes:
$ systemctl restart postgresql-14
Then you can run this on your primary node to see the replication nodes:
postgres=# SELECT pid,usename,application_name,state,sync_state FROM pg_stat_replication;
  pid  |     usename      | application_name |   state   | sync_state
-------+------------------+------------------+-----------+------------
 10951 | replication_user | pgsql_0_node_1   | streaming | async
 10952 | replication_user | pgsql_0_node_0   | streaming | sync
(2 rows)
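The sync_state column above follows from matching each standby's application_name against synchronous_standby_names on the primary. A simplified sketch of that matching is shown below. It assumes the plain comma-separated form with one synchronous standby, and it ignores the FIRST/ANY syntax, the '*' wildcard, and case-insensitive matching. The function name is hypothetical.

```python
def classify(sync_names, connected):
    """Map each connected standby's application_name to a sync_state.

    sync_names: value of synchronous_standby_names (simple list form).
    connected:  application_name values of currently streaming standbys.
    """
    names = [n.strip().strip('"') for n in sync_names.split(",") if n.strip()]
    states = {}
    sync_chosen = False
    for name in names:  # earlier names have higher priority
        if name not in connected:
            continue
        if not sync_chosen:
            states[name] = "sync"
            sync_chosen = True
        else:
            states[name] = "potential"
    for app in connected:  # standbys not listed at all stay asynchronous
        states.setdefault(app, "async")
    return states


print(classify("pgsql_0_node_0", ["pgsql_0_node_1", "pgsql_0_node_0"]))
```

This is also why the two standbys need distinct application_name values in primary_conninfo: if both report the same name, the primary cannot tell them apart for this purpose.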

Cannot connect Barman to PostgreSQL 12

I have two Ubuntu 20.04 VMs on VMware, each with Postgres 12 installed:
pgprimary on IP 192.168.1.131
pgbackup on IP 192.168.1.130
The barman CLI tools are installed on pgprimary.
barman is installed on pgbackup.
I want to back up data from pgprimary on pgbackup.
On each machine I created
2 Linux sudoer users:
useradd barman
useradd streaming_barman
I also created the same two users as Postgres users:
createuser --superuser --replication -P barman
createuser --superuser --replication -P streaming_barman
Here are the relevant parts of the configuration files.
On pgprimary
postgresql.conf
listen_addresses = '*' # what IP address(es) to listen on;
port = 5432
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/12/arc/%f'
wal_level = replica
restore_command = 'cp /var/lib/postgresql/12/arc/%f %p'
recovery_target_time = '2021-03-24 16:18:11.319298+05:30'
recovery_target_inclusive = false
pg_hba.conf
local all postgres peer
# TYPE DATABASE USER ADDRESS METHOD
local all all peer
host all all 127.0.0.1/32 md5
host all all ::1/128 md5
#local replication all peer
#host replication all 127.0.0.1/32 md5
#host replication all ::1/128 md5
# FOR TESTING
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
I also ran:
firewall-cmd --permanent --add-port=5432/tcp
firewall-cmd --reload
========================
On pgbackup
sudo cat <<'EOF' >> /etc/barman.d/pgprimary.conf
[pgprimary]
description = "Example of PostgreSQL Database (Streaming-Only)"
conninfo = host=192.168.1.131 user=barman dbname=training
streaming_conninfo = host=192.168.1.131 user=streaming_barman dbname=training
backup_method = postgres
streaming_archiver = on
slot_name = barman
create_slot = auto
EOF
.pgpass
cat <<'EOF' >>~/.pgpass
pgprimary:*:*:barman:barman
pgprimary:*:*:streaming_barman:barman
EOF
Then I ran:
barman cron
Output:
Starting WAL archiving for server pgprimary
Starting streaming archiver for server pgprimary
barman check pgprimary
Then I get this error:
[13643] barman.utils WARNING: Failed opening the requested log file. Using standard error instead.
Server pgprimary:
2021-10-30 21:39:15,982 [13643] barman.server ERROR: Check 'WAL archive' failed for server 'pgprimary'
WAL archive: FAILED (please make sure WAL shipping is setup)
2021-10-30 21:39:37,006 [13643] barman.postgres WARNING: Error retrieving PostgreSQL status: connection to server at "192.168.131" (192.168.0.131), port 5432 failed: Connection refused
2021-10-30 21:39:58,021 [13643] barman.server ERROR: Check 'check timeout' failed for server 'pgprimary'
check timeout: FAILED (barman check command timed out)
Why can't Barman connect to the server?
UPDATE:
psql -h 192.168.1.131 -U barman -d training
Password for user barman:
psql (12.8 (Ubuntu 12.8-0ubuntu0.20.04.1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.
I can also connect to the server via netstat.
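One detail in the barman check output above may be worth a look: the failed connection went to "192.168.131" (192.168.0.131), while pgprimary is at 192.168.1.131. A three-part IPv4 string is valid shorthand in which the last part fills the remaining two bytes, so a missing octet silently points at a different host. The sketch below shows the expansion with Python's socket module. Where such an address would come from is only a guess, since the pgprimary.conf posted above contains the full address.

```python
import socket


def expand_ipv4(text):
    """Expand a (possibly shorthand) IPv4 string to dotted-quad form."""
    return socket.inet_ntoa(socket.inet_aton(text))


for addr in ("192.168.131", "192.168.1.131"):
    print(addr, "->", expand_ipv4(addr))
```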

Postgres replication not starting due to wal error

I am using postgres version 9.3.2 on two servers, one master and one slave.
I am setting up replication as follows:-
On master:-
sudo -u postgres psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'FOO';"
Edit postgresql.conf
listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 32
checkpoint_segments = 8
wal_keep_segments = 100
Edit pg_hba.conf
hostssl replication replicator <SLAVE IP>/32 md5
On Slave:-
Edit postgresql.conf
wal_level = hot_standby
max_wal_senders = 3
checkpoint_segments = 8
wal_keep_segments = 8
hot_standby = on
Run
sudo service postgresql stop
sudo -u postgres rm -rf /var/lib/postgresql/9.3/main
sudo -u postgres pg_basebackup -h <MASTER IP> -D /var/lib/postgresql/9.3/main -U replicator -v -P
Create /var/lib/postgresql/9.3/main/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=<MASTER IP> port=5432 user=replicator password=FOO sslmode=require'
trigger_file = '/tmp/postgresql.trigger'
Run:-
sudo service postgresql restart
When I restart postgres on the slave I get this error message:-
LOG: database system was shut down at 2015-01-14 09:10:50 GMT
2015-01-14 09:11:01 GMT [16741-2] LOG: entering standby mode
2015-01-14 09:11:01 GMT [16741-3] WARNING: WAL was generated with wal_level=minimal, data may be missing
2015-01-14 09:11:01 GMT [16741-4] HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
2015-01-14 09:11:01 GMT [16741-5] FATAL: hot standby is not possible because wal_level was not set to "hot_standby" on the master server
2015-01-14 09:11:01 GMT [16741-6] HINT: Either set wal_level to "hot_standby" on the master, or turn off hot_standby here.
2015-01-14 09:11:01 GMT [16740-1] LOG: startup process (PID 16741) exited with exit code 1
2015-01-14 09:11:01 GMT [16740-2] LOG: aborting startup due to startup process failure
... failed!
Why is this happening? I have checked and rechecked that wal_level is set to hot_standby on the master, and running "show all" on the master confirms this. I am at a loss as to what I am doing wrong here.
You have to restart the primary database again so that the current WAL file can be replayed on the standby, since this WAL file was generated when wal_level=archive.