Postgres replication not starting due to wal error - postgresql

I am using postgres version 9.3.2 on two servers one master, one primary.
I am setting up replication as follows:-
On master:-
sudo -u postgres psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'FOO’;"
Edit postgresql.conf
listen_address = '*'
wal_level = hot_standby
max_wal_senders = 32
checkpoint_segments = 8
wal_keep_segments = 100
Edit pg_hba.conf
hostssl replication replicator <SLAVE IP>/32 md5
On Slave:-
Edit postgresql.conf
wal_level = hot_standby
max_wal_senders = 3
checkpoint_segments = 8
wal_keep_segments = 8
hot_standby = on
Run
sudo service postgresql stop
sudo -u postgres rm -rf /var/lib/postgresql/9.3/main
sudo -u postgres pg_basebackup -h <MASTER IP> -D /var/lib/postgresql/9.3/main -U replicator -v -P
CREATE /var/lib/postgresql/9.3/main/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=<MASTER IP> port=5432 user=replicator password=FOO sslmode=require'
trigger_file = '/tmp/postgresql.trigger'
Run:-
sudo service postgresql restart
When I restart postgres on the slave I get this error message:-
LOG: database system was shut down at 2015-01-14 09:10:50 GMT 2015-01-14 09:11:01 GMT [16741-2]
LOG: entering standby mode 2015-01-14 09:11:01 GMT [16741-3] WARNING: WAL was generated with wal_level=minimal, data may be missing 2015-01-14 09:11:01 GMT [16741-4] HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup. 2015-01-14 09:11:01 GMT [16741-5]
FATAL: hot standby is not possible because wal_level was not set to "hot_standby" on the master server 2015-01-14 09:11:01 GMT [16741-6] HINT: Either set wal_level to "hot_standby" on the master, or turn off hot_standby here. 2015-01-14 09:11:01 GMT [16740-1]
LOG: startup process (PID 16741) exited with exit code 1 2015-01-14 09:11:01 GMT [16740-2] LOG: aborting startup due to startup process failure ... failed!
Why is this happening? I have checked and rechecked that on the master wal_level is set to hot_standby. On the master running "show all" shows that this is the case? I am at a loss as to what I am doing wrong here.

You have to restart primary database again to let the current WAL file replayed on the standby, since this WAL file was generated when wal_level=archive.

Related

Cannot connect Barman to PostgreSQL 12

I have 2 ubuntu-20.04 VM on VMWARE with Postgres 12 installed on each
pgprimary on ip 192.168.1.131
pgbackup on ip 192.168.1.130
barman CLI tools are installed on pgprimary
barman is installed on pgbackup
I want to backup data from pgprimary on pgbackupsame 2 users as Postgress users
on each machine I created
2 Linux sudoist users
useradd barman
useradd streaming_barman
also created the same two user as Postgress users
createuser --superuser --replication -P barman
createuser --superuser --replication -P streaming_barman
here are relevant parts on the configuration files
On pgprimary
postgressql.conf
listen_addresses = '*' # what IP address(es) to listen on;
port = 5432
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/12/arc/%f'
wal_level = replica
restore_command = 'cp /var/lib/postgresql/12/arc/%f %p'
recovery_target_time = '2021-03-24 16:18:11.319298+05:30'
recovery_target_inclusive = false
pg_hba.conf
local all postgres peer
# TYPE DATABASE USER ADDRESS METHOD
local all all peer
host all all 127.0.0.1/32 md5
host all all ::1/128 md5
#local replication all peer
#host replication all 127.0.0.1/32 md5
#host replication all ::1/128 md5
# FOR TESTING
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
also did
firewall-cmd --permanent --add-port=5432/tcp
firewall-cmd --reload
========================
con
On pgbackup
sudo cat <<'EOF' >> /etc/barman.d/pgprimary.conf
[pgprimary]
description = "Example of PostgreSQL Database (Streaming-Only)"
conninfo = host=192.168.1.131 user=barman dbname=training
streaming_conninfo = host=192.168.1.131 user=streaming_barman dbname=training
backup_method = postgres
streaming_archiver = on
slot_name = barman
create_slot = auto
EOF
pg_hba.conf
cat <<'EOF' >>~/.pgpass
pgprimary:*:*:barman:barman
pgprimary:*:*:streaming_barman:barman
EOF
Then I did
barman cron
Output
Starting WAL archiving for server pgprimary
Starting streaming archiver for server pgprimary
barman check pgprimary
Then I get this error
[13643] barman.utils WARNING: Failed opening the requested log file. Using standard error instead.
Server pgprimary:
2021-10-30 21:39:15,982 [13643] barman.server ERROR: Check 'WAL archive' failed for server 'pgprimary'
WAL archive: FAILED (please make sure WAL shipping is setup)
2021-10-30 21:39:37,006 [13643] barman.postgres WARNING: Error retrieving PostgreSQL status: connection to server at "192.168.131" (192.168.0.131), port 5432 failed: Connection refused
2021-10-30 21:39:58,021 [13643] barman.server ERROR: Check 'check timeout' failed for server 'pgprimary'
check timeout: FAILED (barman check command timed out)
Why cannot connect barman to the server ?
UPDATE:
psql -h 192.168.1.131 -U barman -d training
Password for user barman:
psql (12.8 (Ubuntu 12.8-0ubuntu0.20.04.1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.
I also can connect to server via netstat

Running postgres container getting superuser password error?

I am trying to set up a postgres container to start and run initializing the creation of a table. I've succeeded with the straight image from docker but now that I am trying to extend the image a little to create tables when it's produced and I can't get it running. Based off what I've read here How to create User/Database in script for Docker Postgres, this is what I have:
Dockerfile:
FROM library/postgres
COPY init.sql /docker-entrypoint-initdb.d/
init.sql:
CREATE TABLE incident_disposition (
incident_disposition_code VARCHAR,
incident_disposition_code_description VARCHAR
);
From what I understand, FROM library . . . pulls the postgres image from docker hub and the COPY pushes my init.sql script into the entry point so there is no need for a big dockerfile correct?
I then build the image no issue:
Build
docker build -t my_postgres_image .
But when I run I get the issues:
Run
docker run --name testing my_postgres_image --publish 8000:8080 --detach -e POSTGRES_PASSWORD=postgres -d postgres
Errors from logs
Error: Database is uninitialized and superuser password is not specified.
You must specify POSTGRES_PASSWORD to a non-empty value for the
superuser. For example, "-e POSTGRES_PASSWORD=password" on "docker run".
You may also use "POSTGRES_HOST_AUTH_METHOD=trust" to allow all
connections without a password. This is *not* recommended.
See PostgreSQL documentation about "trust":
https://www.postgresql.org/docs/current/auth-trust.html
Attempt from comment:
docker container logs testing
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /var/lib/postgresql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
pg_ctl -D /var/lib/postgresql/data -l logfile start
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
waiting for server to start....2020-03-26 14:06:51.064 UTC [46] LOG: starting PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2020-03-26 14:06:51.072 UTC [46] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-03-26 14:06:51.108 UTC [47] LOG: database system was shut down at 2020-03-26 14:06:50 UTC
2020-03-26 14:06:51.119 UTC [46] LOG: database system is ready to accept connections
done
server started
/usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/init.sql
CREATE TABLE
2020-03-26 14:06:51.231 UTC [46] LOG: received fast shutdown request
waiting for server to shut down....2020-03-26 14:06:51.232 UTC [46] LOG: aborting any active transactions
2020-03-26 14:06:51.233 UTC [46] LOG: background worker "logical replication launcher" (PID 53) exited with exit code 1
2020-03-26 14:06:51.234 UTC [48] LOG: shutting down
2020-03-26 14:06:51.290 UTC [46] LOG: database system is shut down
done
server stopped
PostgreSQL init process complete; ready for start up.
2020-03-26 14:06:51.345 UTC [1] LOG: starting PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2020-03-26 14:06:51.345 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2020-03-26 14:06:51.345 UTC [1] LOG: listening on IPv6 address "::", port 5432
2020-03-26 14:06:51.361 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-03-26 14:06:51.387 UTC [64] LOG: database system was shut down at 2020-03-26 14:06:51 UTC
2020-03-26 14:06:51.398 UTC [1] LOG: database system is ready to accept connections
2020-03-26 14:07:27.715 UTC [72] ERROR: relation "incident_disposition" does not exist at character 15
2020-03-26 14:07:27.715 UTC [72] STATEMENT: select * from incident_disposition;
In addition to comments
Due to recent docker image's updates postgres images do not allow to connect to DB without a password from anywhere. So you need to specify username/password
docker run -p 8000:8080 -e POSTGRES_PASSWORD=postgres --name testing -d my_postgres_image
Or if you still don't want to use password, you can just set POSTGRES_HOST_AUTH_METHOD=trust environment variable:
docker run -p 8000:8080 -e POSTGRES_HOST_AUTH_METHOD=trust --name testing -d my_postgres_image
It is a typical Initialization scripts issue.
You can file the full explaination in postgresql docker page. https://hub.docker.com/_/postgres
Here is the brief intro:
1. One common problem is that if one of your /docker-entrypoint-initdb.d scripts fails (which will cause the entrypoint script to exit) and your orchestrator restarts the container with the already initialized data directory, it will not continue on with your scripts.
note:
in your case, you may need clean the historical docker containers(stopped) by
step 1: docker ps |grep
step 2: docker rm -f -v
Or if you are using docker-compose, the historical orchestrator could be easily removed by docker-compose down -v.

Couldn't restore postgres v11 from pg_basebackup

I'm making a backup from my postgresql database (v11) using pg_basebackup like this:
pg_basebackup -h localhost -p 5432 -U postgres -D /tmp/backup -Ft -z -Xs -P
The backup process is completed just fine without any errors.Then as a test I'd like to recover from the backup and for that I do the following steps:
Shutdown the postgres instance.
Delete everything from /var/lib/postgresql/data folder
rm -rf /var/lib/postgresl/data/*
Untar the base archive there:
tar xvf /tmp/backup/base.tar.gz -C /var/lib/postgresql/data
Untar the wal archives to a temporary directory
tar xvf /tmp/backup/pg_wal.tar.gz -C /tmp/archived_wals
Create a file recovery.conf in the /var/lib/postgresql/data folder:
touch /var/lib/postgresql/data/recovery.conf
chown postgres:postgres /var/lib/postgresql/data/recovery.conf
Specify the recovery command in the recovery.conf file:
restore_command = 'cp /path/archived_wals/%f "%p" '
Restart the postgres instance, and wait for the recovery to be finished.
However when postgres starts up I get the following error in the logs:
2020-02-04 11:34:52.599 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2020-02-04 11:34:52.599 UTC [1] LOG: listening on IPv6 address "::", port 5432
2020-02-04 11:34:52.613 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-02-04 11:34:52.709 UTC [18] LOG: database system was interrupted; last known up at 2020-02-04 08:39:54 UTC
cp: can't stat '/tmp/archived_wals/00000002.history': No such file or directory
2020-02-04 11:34:52.735 UTC [18] LOG: starting archive recovery
2020-02-04 11:34:52.752 UTC [18] LOG: invalid checkpoint record
2020-02-04 11:34:52.752 UTC [18] FATAL: could not locate required checkpoint record
2020-02-04 11:34:52.752 UTC [18] HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/data/backup_label".
2020-02-04 11:34:52.753 UTC [1] LOG: startup process (PID 18) exited with exit code 1
2020-02-04 11:34:52.753 UTC [1] LOG: aborting startup due to startup process failure
2020-02-04 11:34:52.769 UTC [1] LOG: database system is shut down
I guess I'm missing something, maybe I need to do additional configuration to postgres to be able to restore a pg_basebackup backup?
Try this:
/pgsql-11/bin/pg_basebackup -D /pgdata/pg11/data -X stream --waldir=/pglog/pg11/wal_log/ -P -c fast -h remoteserver\localserver -U username
Let me know if that works.
After all, it was my bad.
I enclosed the destination placeholder within % instead of double quotes in recovery.conf.
However the way I described for the backup and restore is valid and since I corrected the typo it works just fine for me, so I leave the question as it is for further reference.

pg_basebackup fails with " too many connections for role "replication""

I am trying to set up a standby server and keep getting this error. My primary server has more than enough connections to handle the load:
listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 10
max_connections=100
checkpoint_segments = 8
wal_keep_segments = 8
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/archive/%f'
This is the command that fails on the standby server:
pg_basebackup -h ${MASTER_PORT_5432_TCP_ADDR} -D ${PGDATA} -U ${REP_USER} -vPw --xlog-method=stream
I don't understand why this happens.

Postresql 9.3 replication not starting after pg_basebackup completes

I am trying to create a hot_standby server, and I receive the following error after pg_basebackup completes. Notice I use a shell script, replicator.sh, to start the replication. Can anyone give me some insight?
My specs:
Debian Wheezy 7.6
Postgresql 9.3
Database size: ~115GB
Error:
postgres#database-master:/etc/postgresql/9.3/main$ sh replicator.sh
Stopping PostgreSQL
[ ok ] Stopping PostgreSQL 9.3 database server: main.
Cleaning up old cluster directory
Starting base backup as replicator
Password:
113720266/113720266 kB (100%), 1/1 tablespace
NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
pg_basebackup: base backup completed
Starting Postgresql
[....] Starting PostgreSQL 9.3 database server: main[....] The PostgreSQL server failed to start.
Please check the log output: 2014-09-11 17:56:33 UTC LOG: database system was interrupted; last
known up at 2014-09-11 16:54:29 UTC 2014-09-11 17:56:33 UTC LOG: creating missing WAL directory
"pg_xlog/archive_status" 2014-09-11 17:56:33 UTC LOG: incomplete startup packet 2014-09-11 17:56:33
UTC LOG: invalid checkpoint record 2014-09-11 17:56:33 UTC FATAL: could not locate required
checkpoint record 2014-09-11 17:56:33 UTC HINT: If you are not restoring from a backup, try
removing the file "/var/lib/p[FAILesql/9.3/main/backup_label". 2014-09-11 17:56:33 UTC LOG: startup
process (PID 21972) exited with exit code 1 2014-09-11 17:56:33 UTC LOG: aborting startup due to
startup process failure ... failed! failed!
Contents of replicator.sh:
#!/bin/bash
echo Stopping PostgreSQL
/etc/init.d/postgresql stop
echo Cleaning up old cluster directory
rm -rf /var/lib/postgresql/9.3/main
echo Starting base backup as replicator
pg_basebackup -h 123.456.789.123 -D /var/lib/postgresql/9.3/main -U replicator -v -P
echo Writing recovery.conf file
sudo -u postgres bash -c "cat > /var/lib/postgresql/9.3/main/recovery.conf <<- _EOF1_
standby_mode = 'on'
primary_conninfo = 'host=123.456.789.123 port=5432 user=replicator password=XXXXX sslmode=require'
trigger_file = '/tmp/postgresql.trigger'
_EOF1_
"
echo Starting Postgresql
/etc/init.d/postgresql start
Thank you,
Jake
My best guess from the above is that the pg_basebackup failed and your shell script doesn't check for error return codes or use set -e to automatically abort after errors, so it just carried on regardless.
It's also possible that you don't have WAL archiving configured, or don't have a restore_command set in the replica. In that case, the transaction logs required to start the base backup will not be available and startup will fail.
I strongly recommend that you:
Use pg_basebackup -X stream so that the required transaction logs get copied along with the backup; and
Use set -e in your shell script, or test for errors with a suitable if ! pg_basebackup .... ; then block.