PostgreSQL online backup: tar: Error exit delayed from previous errors

Hi everybody,
I use pg_start_backup and pg_stop_backup to back up a PostgreSQL database. Sometimes the backup log says tar: Error exit delayed from previous errors, and it also says tar: /data/pgsql/5432/base/21796/25283: file changed as we read it. How can I avoid these messages?
Also, is the backup still valid for recovery?
My operational process is:
1. select pg_start_backup('labe');
2. tar czvf data.tar.gz /data/pgsql/5432 --exclude /data/pgsql/5432/pg_xlog
3. select pg_stop_backup();
My second question: does anyone use pg_basebackup to back up PostgreSQL? Is it equivalent to the pg_start_backup/pg_stop_backup approach I tested?
Many thanks.

"file changed as we read it" is just a warning, and the backup is OK provided pg_start_backup has been called. To silence the warning when using GNU tar, you may add the option:
--warning=no-file-changed
See http://www.gnu.org/software/tar/manual/html_section/warnings.html
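For example, step 2 of the procedure above could become the following (same paths as in the question; note that GNU tar may still exit with status 1 when a file changed during the read, so a backup script should treat that exit code as non-fatal):
tar czvf data.tar.gz --warning=no-file-changed /data/pgsql/5432 --exclude /data/pgsql/5432/pg_xlog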
pg_basebackup is another way of taking a hot backup. It differs mostly by not needing file access on the db server (it uses a PostgreSQL connection to get the data), and providing some specific options related to WAL files.
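As a hedged sketch of that alternative (the host, user, and target directory here are made up), a base backup that also streams the WAL it needs could be taken with:
pg_basebackup -h dbserver -U backupuser -D /backups/base -X stream -P
With -X stream, the WAL generated during the backup is included, so no separate pg_start_backup/pg_stop_backup calls and no pg_xlog exclusion are needed; -P just reports progress.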

Does anyone use pg_basebackup to back up their PostgreSQL database?
And can pg_basebackup take a full backup every day, with the archived WAL used for point-in-time recovery?

Related

Could not open file "pg_clog/0000": No such file or directory

I am getting the following error while accessing a Postgres database:
ERROR: could not access status of transaction 69675
DETAIL: Could not open file "pg_clog/0000": No such file or directory.
I didn't do anything with the pg_clog folder, but the 0000 file is not there.
Is there any way to recover that file or in any way to fix this issue?
Any help would be appreciated.
You are experiencing database corruption, and you should restore from a backup. You should try to figure out what happened to the database so you can prevent it in the future.
Is your storage reliable?
Are you using dangerous settings like fsync = off?
Were there any crashes recently?
Are you really running 9.1? If yes, you shouldn't do that, as it is out of support.
Are there any files in the pg_clog directory? There should be.
Did you have an out-of-space problem recently that may have led someone to remove files from a “log” directory?
As stated in the previous response, you're better off restoring from a backup. However, while restoring data on a test server (where we had been experimenting with full vacuum and needed to return the database to an earlier state), I discovered that the metadata for those transaction files is not stored in the same location as the data. In cases where data integrity isn't critical, such as a test database, you can get away with creating empty files for the missing transaction logs like this:
dd if=/dev/zero of=/path/to/db/pg_clog/xxxx bs=256k count=1   # create a zero-filled 256 kB segment (the standard pg_clog segment size)
chown postgres.postgres /path/to/db/pg_clog/xxxx              # hand ownership to the postgres user
chmod go-rwX /path/to/db/pg_clog/xxxx                         # remove group/other access
There may be multiple missing files, but if it's just a few files this is an alternative to consider.

Postgres Continuous Archiving and Point-in-Time Recovery (PITR)

I am trying to setup Continuous Archiving and Point-in-Time Recovery (PITR) in Postgres. When I go through the documentation it says:
The archive command should generally be designed to refuse to overwrite any pre-existing archive file. This is an important safety feature to preserve the integrity of your archive in case of administrator error (such as sending the output of two different servers to the same archive directory).
But I see that the same WAL file changes multiple times when I open a connection and make changes from time to time. For example, when I first connect to the database and make some changes (like deleting or inserting rows), a WAL file named 000000010000000000000090 is created and my archive_command runs immediately. My archive_command is
test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f
This is based on the documentation: the command copies the file only if it does not already exist in the archive directory. The first time, the condition passes and the file is copied. But when I make further changes over the same connection (I even see the same issue when I reconnect from the same PC), the original WAL file is modified again, and the next copy fails because the file already exists in the archive.
If this is allowed to happen, we may lose some changes in the backup. Does anyone know a solution, so that a new file is created for every change instead of the old file being modified?
I am using Postgres version 10.2 on my local computer (Mac).
Does that really happen to you? Because it shouldn't.
PostgreSQL writes transaction logs in “WAL files” (WAL for Write Ahead Log) of 16MB size.
Whenever a WAL file is full, the log is switched to a new WAL file, and the old WAL file is archived with archive_command.
If archive_command completes with an exit status of 0 (success), the WAL file is recycled, otherwise archiving is retried until it succeeds. Failures will be logged.
Anyway, as long as there are no errors, each WAL file will only be archived once.
The behavior you describe shouldn't happen.
Check the PostgreSQL log to see if any errors were reported from archive_command. Once you fix the error condition, normal operation will resume.
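For reference, a minimal archiving setup in postgresql.conf, using the same archive directory as the question, looks like:
archive_mode = on
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
To test archiving without waiting for a 16MB segment to fill, you can force a switch with SELECT pg_switch_wal(); (that is the function's name on PostgreSQL 10, the version in the question; before 10 it was called pg_switch_xlog()).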

Backup postgresql WAL logs

I am trying to configure database backups in PostgreSQL with pg_basebackup and WAL logs.
For now I take a full backup once a week and want to back up the WAL logs too. But, as I understand it, PostgreSQL writes them all the time. So how can I copy them and be sure that they are not corrupted?
Thanks
You set archive_command to a shell command that copies the WAL file to a safe archive location, so that burden is mostly on you.
When PostgreSQL runs archive_command, it assumes that the WAL file is not corrupted. Only a PostgreSQL bug or a bug in the storage system could cause a corrupted WAL segment.
There is no better protection against PostgreSQL bugs than always running the latest bugfix release, and you can invest in storage hardware that will at least detect failure.
You can also write your archive_command with a certain amount of paranoia, e.g. by comparing the md5sum of the WAL segment and its archive copy.
Another idea is to write two copies of the WAL file to different storage systems.
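As a sketch of that kind of paranoia (the script name and archive path are assumptions, not a standard tool), an archive_command wrapper that verifies the copy could look like this, configured as archive_command = '/usr/local/bin/archive_wal.sh %p %f':
#!/bin/bash
# archive_wal.sh -- copy a WAL segment to the archive and verify the copy.
# $1 is %p (path to the segment), $2 is %f (its file name).
SRC="$1"
DEST="/mnt/server/archivedir/$2"
# Refuse to overwrite an existing archive file (nonzero exit makes PostgreSQL retry).
[ -f "$DEST" ] && exit 1
cp "$SRC" "$DEST" || exit 1
# Compare checksums of source and copy; remove the bad copy on mismatch.
if [ "$(md5sum < "$SRC")" != "$(md5sum < "$DEST")" ]; then
    rm -f "$DEST"
    exit 1
fi
exit 0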

Roll-forward is required following the Restore

I have three different databases for my different environments (hsprd, hstst, hstrn). hsprd is my production environment with live data.
Every so often, a request comes through to restore production data to hstrn or hstst. I typically run this command (after stopping, then dropping the db):
db2 restore db hsprd taken at 20140331180002 to /dbs into hstrn newlogpath /dbs/log/hstrn without rolling forward;
When running this, I receive this message:
SQL2537N Roll-forward is required following the Restore.
Could someone advise how to fix this?
Thanks.
edit: My backups are here:
(/home/dbtmp/backups)> ll
total 22791416
-rwxrwxr-x 1 hsprd cics 11669123072 Mar 31 18:03 HSPRD.0.hsprd.NODE0000.CATN0000.20140331180002.001
After restoring my database while omitting the without rolling forward clause, I receive this message when trying to query the database:
SQL1117N A connection to or activation of database "HSTRN" cannot be made
because of ROLL-FORWARD PENDING. SQLSTATE=57019
When I try to roll forward with this command, I receive this response:
(/home/dbtmp/backups)> db2 rollforward db hstrn to end of backup and complete;
SQL4970N Roll-forward recovery on database "HSTRN" cannot reach the specified
stop point (end-of-log or point-in-time) on database partition(s) "0".
Roll-forward recovery processing has halted on log file "S0006353.LOG".
The first error suggests that you are restoring an online backup, which must be rolled forward. Alternatively, use an offline backup image, then you can include the without rolling forward option.
The second error means that you need to issue the ROLLFORWARD command before you can use the database restored from an online backup.
Finally the third error means that the ROLLFORWARD command is unable to find the logs required for it to succeed. Assuming the logs are included in the backup image, you'll need to specify the LOGTARGET option on the RESTORE command to extract them, presumably to the NEWLOGPATH location.
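Putting those three points together, a sketch using the paths from the question (assuming the logs are included in the online backup image) could be:
db2 restore db hsprd taken at 20140331180002 to /dbs into hstrn logtarget /dbs/log/hstrn newlogpath /dbs/log/hstrn
db2 rollforward db hstrn to end of backup and complete overflow log path (/dbs/log/hstrn)
LOGTARGET extracts the archived logs from the backup image into that directory, and OVERFLOW LOG PATH points the ROLLFORWARD command at them.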

DB2: not able to restore from backup

I am using the command
db2 restore db S18 from /users/intadm/s18backup/ taken at 20110913113341 on /users/db2inst1/ dbpath on /users/db2inst1/ redirect without rolling forward
to restore a database from a backup file located in /users/intadm/s18backup/.
The command gives this output:
SQL1277W A redirected restore operation is being performed. Table space
configuration can now be viewed and table spaces that do not use automatic
storage can have their containers reconfigured.
DB20000I The RESTORE DATABASE command completed successfully.
When I try to connect to the restored DB (by executing 'db2 connect to S18'), I get this message:
SQL0752N Connecting to a database is not permitted within a logical unit of
work when the CONNECT type 1 setting is in use. SQLSTATE=0A001
When I try to connect to the DB with a viewer like SQuirreL, the error is:
DB2 SQL Error: SQLCODE=-1119, SQLSTATE=57019, SQLERRMC=S18, DRIVER=3.57.82
which means that an 'error occurred during a restore function or a restore is still in progress' (from the IBM DB2 manuals).
How can I resolve this and connect to the restored database?
UPD: I've executed db2ckbkp on the backup file and it did not identify any issues with the backup file itself.
without rolling forward can only be used when restoring from an offline backup. Was your backup taken offline? If not, you'll need to use roll forward.
When you do a redirected restore, you are telling DB2 that you want to change the locations of the data files in the database you are restoring.
The first step you show above will execute very quickly.
Normally, after you execute this statement, you would issue one or more SET TABLESPACE CONTAINERS commands to set the new locations of each data file. It's not mandatory to issue these statements, but there's no point in specifying the redirect option in your RESTORE DATABASE command if you're not changing anything.
Then you would issue the RESTORE DATABASE S18 CONTINUE command, which actually reads the data from the backup image and writes it to the data files. (The keyword to finish a redirected restore is CONTINUE; COMPLETE belongs to the ROLLFORWARD command.)
If you did not execute RESTORE DATABASE S18 CONTINUE, then your restore process is incomplete, and it makes sense that you can't connect to the database.
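For illustration, the complete redirected-restore sequence would look like this; the tablespace ID and container path in the middle step are hypothetical (inspect the real ones with db2 list tablespace containers for <id> while the restore is pending):
db2 restore db S18 from /users/intadm/s18backup/ taken at 20110913113341 on /users/db2inst1/ dbpath on /users/db2inst1/ redirect
db2 "set tablespace containers for 2 using (path '/users/db2inst1/ts2')"
db2 restore db S18 continue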
What I did and what has worked:
Executed:
db2 restore db S18 from /users/intadm/s18backup/ taken at 20110913113341 on /<path with sufficient disk space> dbpath on /<path with sufficient disk space>
Earlier I got warnings that some table spaces were not moved. When I pointed dbpath at a partition with sufficient disk space, the warnings disappeared.
After that, as I have an online backup, I issued:
db2 rollforward db S18 to end of logs and complete
That's it! Now I'm able to connect.