WAL archive: FAILED (please make sure WAL shipping is setup) - postgresql

I am trying to configure Barman for backups. When I run barman check replica, I keep getting:
Server replica:
WAL archive: FAILED (please make sure WAL shipping is setup)
PostgreSQL: OK
superuser: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: FAILED (interval provided: 1 day, latest backup age: No available backups)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: FAILED (have 0 backups, expected at least 2)
ssh: OK (PostgreSQL server)
not in recovery: FAILED (cannot perform exclusive backup on a standby)
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: OK
I am using PostgreSQL 9.6 and Barman 2.1. I am not sure what the issue is; could someone help?
Here is my Barman server configuration:
description = "Database backup"
conninfo = host=<db-ip> user=postgres dbname=db
backup_method = rsync
ssh_command = ssh postgres@<db-ip>
archiver = on

barman check tries to confirm that archiving is set up correctly by asserting that there's actually something in the archive. However, WAL segments are generally only archived once they're filled up, and if your server is idle, this is never going to happen.
To work around this, Barman provides a command to force a segment switch, wait for the completed WAL to show up, and then archive it immediately:
barman switch-xlog --force --archive replica
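The command above can be wrapped so that the check is re-run as soon as the forced segment has been archived. This is a minimal sketch, assuming the server is named replica as in the question. The function name force_archive_and_check and the BARMAN override variable are hypothetical; they are not part of Barman and exist only so the flow can be dry-run.

```shell
#!/usr/bin/env bash
# BARMAN is a hypothetical override (not a Barman setting); it defaults to the real CLI.
BARMAN="${BARMAN:-barman}"

# Force a WAL switch, wait for the segment to be archived, then re-run the check.
force_archive_and_check() {
    local server="$1"
    "$BARMAN" switch-xlog --force --archive "$server" || return 1
    "$BARMAN" check "$server"
}
```

Calling force_archive_and_check replica should clear the WAL archive line if shipping is configured correctly. The other FAILED lines in the question (no backups yet, server in recovery) are separate conditions and are not affected by this.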

In brief
Barman's incoming_wals_directory and the archive_command in postgresql.conf do not match.
Details
Another possible cause is a mismatch between:
Barman's incoming_wals_directory
the archive_command in postgresql.conf
Shell commands to check:
barman@backup $ barman show-server pg | grep incoming_wals_directory
# output1
# > incoming_wals_directory: /var/lib/barman/pg/incoming
postgres@pg $ cat /etc/postgresql/10/main/postgresql.conf | grep archive_command
# output2
# > archive_command = 'rsync -a %p barman@staging:/var/lib/barman/pg/incoming/%f'
The path must be the same in output1 and output2.
If the two differ, make them match, and reload or restart PostgreSQL afterward so that a changed archive_command takes effect.
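The comparison can also be done mechanically instead of by eye. The following is a rough sketch, assuming the archive_command has the rsync form shown above (user@host:/dir/%f); the helper names incoming_dir, archive_dir and wal_paths_match are hypothetical.

```shell
#!/usr/bin/env bash
# Reads `barman show-server <name>` output on stdin and prints the incoming directory.
incoming_dir() {
    sed -n 's/^[[:space:]]*incoming_wals_directory:[[:space:]]*//p'
}

# Reads postgresql.conf on stdin and prints the remote directory targeted by an
# rsync-style archive_command of the form '... user@host:/some/dir/%f'.
# Commented-out lines (starting with #) are ignored.
archive_dir() {
    sed -n "s|^[[:space:]]*archive_command[[:space:]]*=.*:\(/[^']*\)/%f'.*|\1|p"
}

# Succeeds only when both paths are non-empty and identical.
wal_paths_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}
```

On the Barman host, barman show-server pg | incoming_dir prints the first path; on the database host, archive_dir < /etc/postgresql/10/main/postgresql.conf prints the second. Passing both to wal_paths_match tells you whether they agree.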

Related

Unable to do backup using barman due to systemid error

I am trying to take a backup with the command barman backup pg, but it shows an error like
ERROR: Impossible to start the backup. Check the log for more details, or run 'barman check pg'
When I then ran barman check pg, I found another error:
systemid coherence: FAILED. Next I checked the systemid of Postgres recorded on the Barman side and found that it is different.
What needs to be done in this case?
I removed the identity.json file from Barman, and that somehow solved my issue. But I am not sure whether that is the right way to solve it.
What is identity.json actually used for? I am looking for an expert opinion.
Server pg:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
PostgreSQL streaming: OK
wal_level: OK
replication slot: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (interval provided: 1 day, latest backup age: 2 hours, 57 minutes, 55 seconds)
backup minimum size: OK (876.1 MiB)
wal maximum age: OK (no last_wal_maximum_age provided)
wal size: OK (31.5 KiB)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 3 backups, expected at least 1)
ssh: OK (PostgreSQL server)
systemid coherence: FAILED (the system Id of the connected PostgreSQL server changed, stored in "/var/lib/barman/pg/identity.json")
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: OK
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: FAILED (duplicates: 50)

Recover Postgresql pgBarman

I've set up a PostgreSQL DB and I want to back it up.
I have one server with my main DB and one with Barman.
The whole setup is working; I can back up my DB with Barman.
I just don't understand how to recover my DB to an exact point in time between the backups that I take every day.
barman@ubuntu:~$ barman check main-db-server
WARNING: No backup strategy set for server 'main-db-server' (using default 'exclusive_backup').
WARNING: The default backup strategy will change to 'concurrent_backup' in the future. Explicitly set 'backup_options' to silence this warning.
Server main-db-server:
PostgreSQL: OK
is_superuser: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (interval provided: 1 day, latest backup age: 9 minutes, 59 seconds)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 6 backups, expected at least 0)
ssh: OK (PostgreSQL server)
not in recovery: OK
systemid coherence: OK (no system Id available)
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: OK
And when I back up my DB:
barman@ubuntu:~$ barman backup main-db-server
WARNING: No backup strategy set for server 'main-db-server' (using default 'exclusive_backup').
WARNING: The default backup strategy will change to 'concurrent_backup' in the future. Explicitly set 'backup_options' to silence this warning.
Starting backup using rsync-exclusive method for server main-db-server in /var/lib/barman/main-db-server/base/20210427T150505
Backup start at LSN: 0/1C000028 (00000005000000000000001C, 00000028)
Starting backup copy via rsync/SSH for 20210427T150505
Copy done (time: 2 seconds)
Asking PostgreSQL server to finalize the backup.
Backup size: 74.0 MiB. Actual size on disk: 34.9 KiB (-99.95% deduplication ratio).
Backup end at LSN: 0/1C0000C0 (00000005000000000000001C, 000000C0)
Backup completed (start time: 2021-04-27 15:05:05.289717, elapsed time: 11 seconds)
Processing xlog segments from file archival for main-db-server
00000005000000000000001B
00000005000000000000001C
00000005000000000000001C.00000028.backup
I don't know how to restore my DB to a point in time between two backups :/
Thanks

Fatal error starting postgres

I'm unfamiliar with how to use postgres and need some help. I'm currently running OSX Yosemite.
When I start postgres I get this:
pg_ctl: could not start server
Examine the log output.
There was an error executing [start] on postgres. Check /Users/work/git/proj/var/log/postgres.log for details.
createuser: could not connect to database postgres: FATAL: could not open relation mapping file "global/pg_filenode.map": No such file or directory
The log is below.
When I try to stop postgres I get this:
Postgres not running
And when I run ps -ef |grep postgres I get this:
20010 13398 1 0 Jul07 ? 00:00:00 /usr/pgsql-9.3/bin/postgres -h -k /Users/work/git/proj/var/pg
20010 13399 13398 0 Jul07 ? 00:00:09 postgres: logger process
20010 13401 13398 0 Jul07 ? 00:00:10 postgres: checkpointer process
20010 13402 13398 0 Jul07 ? 00:00:00 postgres: writer process
20010 13403 13398 0 Jul07 ? 00:00:00 postgres: wal writer process
20010 13404 13398 0 Jul07 ? 00:00:36 postgres: autovacuum launcher process
20010 13405 13398 0 Jul07 ? 00:00:02 postgres: stats collector process
20010 18112 17723 0 10:22 pts/0 00:00:00 grep postgres
What does this all mean and how could I possibly fix this?
log text
Postgres data dir doesn't exist. Creating
The files belonging to this database system will be owned by user "rose.smith".
This user must also own the server process.
The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /Users/work/git/proj/postgres ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
creating configuration files ... ok
creating template1 database in /Users/work/git/proj/postgres/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
/usr/pgsql-9.3/bin/postgres -D /Users/work/git/proj/postgres
or
/usr/pgsql-9.3/bin/pg_ctl -D /Users/work/git/proj/postgres -l logfile start
waiting for server to start....< 2015-06-04 17:24:57.966 GMT >LOG: redirecting log output to logging collector process
< 2015-06-04 17:24:57.966 GMT >HINT: Future log output will appear in directory "pg_log".
done
server started
waiting for server to shut down.... done
server stopped
waiting for server to start....< 2015-06-04 18:10:18.044 GMT >LOG: redirecting log output to logging collector process
< 2015-06-04 18:10:18.044 GMT >HINT: Future log output will appear in directory "pg_log".
done
server started
"/Users/work/git/proj/var/log/postgres.log" 413L, 20935C
after running /usr/pgsql-9.3/bin/postgres -D /Users/work/git/proj/postgres
< 2015-07-08 14:40:36.331 GMT >FATAL: lock file "postmaster.pid" already exists
< 2015-07-08 14:40:36.331 GMT >HINT: Is another postmaster (PID 18145) running in data directory "/Users/work/git/proj/postgres"?
I can't speak to why this worked after trying these commands just a few minutes ago, but it is now working. Good luck to anyone else with the same problem.
stop postgres
killall postgres
remove postgres database with rm -rf postgres
start postgres
This website was helpful. I think my problem may have been the same as his.
I had deleted ~/Library/Containers/com.heroku.postgres or ~/Application Support/Postgres/ while the Postgres.app was still running. The old version was still running since I deleted the pid file, and it didn't know how to shut it down.
Source: https://github.com/PostgresApp/PostgresApp/issues/96
I faced the same issue. I solved the problem with the following commands.
If you installed PostgreSQL using Homebrew:
rm /usr/local/var/postgres/postmaster.pid
pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log start
Hope this helps you!
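Removing postmaster.pid is only safe when the server that wrote it is really gone; in the question's ps output a postgres process was still running. A cautious sketch that checks first is shown here. It assumes the first line of postmaster.pid holds the postmaster's PID and that it runs as the user who owns the server (kill -0 cannot probe another user's processes). The function name pid_file_is_stale is hypothetical.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) only when the PID file exists and names a process that is gone.
pid_file_is_stale() {
    local pid_file="$1" pid
    [ -f "$pid_file" ] || return 1
    pid="$(head -n 1 "$pid_file")"
    # Refuse to guess if the first line is not a plain number.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

With the Homebrew paths from the answer above, this could be used as: pid_file_is_stale /usr/local/var/postgres/postmaster.pid && rm /usr/local/var/postgres/postmaster.pid. If the check fails, stop the running server first rather than deleting the file.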

PostgreSQL 9.3 replication not starting after pg_basebackup completes

I am trying to create a hot_standby server, and I receive the following error after pg_basebackup completes. Note that I use a shell script, replicator.sh, to start the replication. Can anyone give me some insight?
My specs:
Debian Wheezy 7.6
Postgresql 9.3
Database size: ~115GB
Error:
postgres@database-master:/etc/postgresql/9.3/main$ sh replicator.sh
Stopping PostgreSQL
[ ok ] Stopping PostgreSQL 9.3 database server: main.
Cleaning up old cluster directory
Starting base backup as replicator
Password:
113720266/113720266 kB (100%), 1/1 tablespace
NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
pg_basebackup: base backup completed
Starting Postgresql
[....] Starting PostgreSQL 9.3 database server: main[....] The PostgreSQL server failed to start. Please check the log output:
2014-09-11 17:56:33 UTC LOG: database system was interrupted; last known up at 2014-09-11 16:54:29 UTC
2014-09-11 17:56:33 UTC LOG: creating missing WAL directory "pg_xlog/archive_status"
2014-09-11 17:56:33 UTC LOG: incomplete startup packet
2014-09-11 17:56:33 UTC LOG: invalid checkpoint record
2014-09-11 17:56:33 UTC FATAL: could not locate required checkpoint record
2014-09-11 17:56:33 UTC HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.3/main/backup_label".
2014-09-11 17:56:33 UTC LOG: startup process (PID 21972) exited with exit code 1
2014-09-11 17:56:33 UTC LOG: aborting startup due to startup process failure
... failed! failed!
Contents of replicator.sh:
#!/bin/bash
echo Stopping PostgreSQL
/etc/init.d/postgresql stop
echo Cleaning up old cluster directory
rm -rf /var/lib/postgresql/9.3/main
echo Starting base backup as replicator
pg_basebackup -h 123.456.789.123 -D /var/lib/postgresql/9.3/main -U replicator -v -P
echo Writing recovery.conf file
sudo -u postgres bash -c "cat > /var/lib/postgresql/9.3/main/recovery.conf <<- _EOF1_
standby_mode = 'on'
primary_conninfo = 'host=123.456.789.123 port=5432 user=replicator password=XXXXX sslmode=require'
trigger_file = '/tmp/postgresql.trigger'
_EOF1_
"
echo Starting Postgresql
/etc/init.d/postgresql start
Thank you,
Jake
My best guess from the above is that the pg_basebackup failed and your shell script doesn't check for error return codes or use set -e to automatically abort after errors, so it just carried on regardless.
It's also possible that you don't have WAL archiving configured, or don't have a restore_command set in the replica. In that case, the transaction logs needed to bring the base backup to a consistent state will not be available and startup will fail.
I strongly recommend that you:
Use pg_basebackup -X stream so that the required transaction logs get copied along with the backup; and
Use set -e in your shell script, or test for errors with a suitable if ! pg_basebackup .... ; then block.
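Both recommendations can be combined in the backup step of the script. This is a minimal sketch rather than a drop-in replacement for replicator.sh: the function name take_base_backup and the PG_BASEBACKUP override variable are hypothetical, and the replicator user is taken from the question. Note that -X stream opens a second connection to the primary, so max_wal_senders there has to allow for it.

```shell
#!/bin/bash
# PG_BASEBACKUP is a hypothetical override so the step can be dry-run;
# it defaults to the real binary.
PG_BASEBACKUP="${PG_BASEBACKUP:-pg_basebackup}"

# Take the base backup and stop the caller if it fails.
take_base_backup() {
    local primary_host="$1" data_dir="$2"
    # -X stream copies the WAL generated during the backup alongside it,
    # so the standby does not depend on a WAL archive to start up.
    if ! "$PG_BASEBACKUP" -h "$primary_host" -D "$data_dir" -U replicator -X stream -v -P; then
        echo "pg_basebackup failed; not starting PostgreSQL" >&2
        return 1
    fi
}
```

In the script this would be called as take_base_backup <primary-host> /var/lib/postgresql/9.3/main || exit 1, before recovery.conf is written and the server is started.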

log shipping error postgres

I was setting up log shipping from PostgreSQL 9.0.4 (Red Hat) to 9.0.6 (Fedora 14),
but I received an error:
HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
LOG: entering standby mode
LOG: restored log file "000000010000000200000065" from archive
LOG: record with zero length at 2/65000100
WARNING: WAL was generated with wal_level=minimal, data may be missing
HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
FATAL: hot standby is not possible because wal_level was not set to "hot_standby" on the master server
HINT: Either set wal_level to "hot_standby" on the master, or turn off hot_standby here.
LOG: startup process (PID 9438) exited with exit code 1
LOG: aborting startup due to startup process failure
ls ../archive/
000000010000000200000051 000000010000000200000059 00000001000000020000005F.00000020.backup
000000010000000200000052 000000010000000200000059.00000020.backup 000000010000000200000060
000000010000000200000053 00000001000000020000005A 000000010000000200000061
000000010000000200000054 00000001000000020000005B 000000010000000200000061.00000020.backup
000000010000000200000055 00000001000000020000005B.00000020.backup 000000010000000200000062
000000010000000200000055.00000020.backup 00000001000000020000005C 000000010000000200000063
000000010000000200000056 00000001000000020000005D 000000010000000200000064
000000010000000200000057 00000001000000020000005E 000000010000000200000065
000000010000000200000058 00000001000000020000005F
ls pg_xlog
000000010000000200000061.00000020.backup 000000010000000200000067 00000001000000020000006A archive_status
000000010000000200000065 000000010000000200000068 00000001000000020000006B RECOVERYXLOG
000000010000000200000066 000000010000000200000069 00000001000000020000006C
cat recovery.conf
### RECOVERY
standby_mode = 'on'
restore_command = 'cp -i /var/lib/pgsql/9.0/archive/%f %p'
When I remove the recovery.conf file from the data/ directory
and turn off hot_standby in the postgresql.conf file, I can start Postgres and select data.
I want the secondary Postgres to start in hot_standby mode.
Can anyone tell me how to fix this?
Please check postgresql.conf on your master database. According to your log:
WARNING: WAL was generated with wal_level=minimal, data may be missing
HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
FATAL: hot standby is not possible because wal_level was not set to "hot_standby" on the master server
HINT: Either set wal_level to "hot_standby" on the master, or turn off hot_standby here.
The message is pretty informative. You should either set wal_level = hot_standby on the master database (and consider taking a new full backup after turning this on), or set hot_standby = off on the standby side (this change requires no further steps).
In fact, per the documentation, maintaining a standby requires wal_level to be either archive or hot_standby.
If you have activated your standby by removing recovery.conf and starting the cluster, then you should re-create the standby from the latest full backup.
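Before rebuilding the standby, it can help to confirm what the master is actually running with. The following is a small sketch, assuming psql can reach the master and that the 9.0 value names apply (minimal, archive, hot_standby). The function name master_supports_hot_standby and the PSQL override variable are hypothetical.

```shell
#!/usr/bin/env bash
# PSQL is a hypothetical override so the check can be dry-run; it defaults to psql.
PSQL="${PSQL:-psql}"

# Exit 0 if the master reports wal_level = hot_standby, 1 if it reports
# anything else, 2 if the query itself failed. Extra arguments (for example
# -h <master-host> -U postgres) are passed through to psql.
master_supports_hot_standby() {
    local level
    # -A (unaligned) and -t (tuples only) make psql print just the value.
    level="$("$PSQL" -At -c 'SHOW wal_level' "$@")" || return 2
    [ "$level" = "hot_standby" ]
}
```

If the function reports failure, change wal_level on the master, restart it, and take a new base backup before starting the standby with hot_standby = on.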