I tried to synchronize the source and target clusters by running pg_rewind with the following command, and I know for certain that the contents of the source and target clusters are no longer the same.
/usr/pgsql-12/bin/pg_rewind --source-server="192.168.100.100 user=postgres password=mypassword" -D /var/lib/pgsql/12/data --progress
but pg_rewind gave the following message:
pg_rewind: source and target cluster are on the same timeline
pg_rewind: no rewind required
I don't understand how the contents of the pg_wal and base directories can be different between source and target, yet pg_rewind didn't realize that!
pg_rewind only undoes data modifications on the target server later than the latest common checkpoint.
Modifications that have happened on the source server after the latest common checkpoint are ignored – these will be recovered anyway when the target server becomes a standby of the source server.
So the target server probably was shut down cleanly before the source server got promoted.
The message about the timeline is coincidental, it is not the cause of the second message.
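If you want to verify what pg_rewind saw, you can compare the control data of the two clusters; a minimal sketch (assuming the same data directory layout on both servers as in your command) is:

/usr/pgsql-12/bin/pg_controldata -D /var/lib/pgsql/12/data | grep -E 'TimeLineID|checkpoint location'

Run this on both the source and the target to see whether the timelines and latest checkpoint locations actually diverge.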
I am new to Postgres and was trying to simulate a PostgreSQL cluster, so I installed two nodes running the latest Postgres version, acting as active / hot standby. The master configuration includes:
archive_mode = on
archive_command = 'test ! -f /data/%f && cp %p /data/%f'
and the slave configuration includes:
primary_slot_name = 'standby_db2_slot'
hot_standby = on
with the other related settings left at their defaults.
My question is: if the standby was down for some time and the master crashes, how do I recover the data from my archived WAL files, and how do I get the last WAL file that the master was writing to before it crashed?
You could copy the files from the archive (if it is still available) into the replica's pg_wal folder. Or more typically, you would set restore_command to copy each of them from the archive upon request.
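For example, in the replica's recovery.conf (assuming the /data archive location from your archive_command):

restore_command = 'cp /data/%f %p'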
how to get the last wal file that the master was writing to before crashing?
If it was a hard crash where the master's storage was irreparably destroyed, you likely can't get it. That is why streaming is great: it copies the data stream in near-real time to minimize loss. And if it was a soft crash, why are you trying to promote the replica anyway, rather than just turning the master back on? If the master's storage was only partially destroyed, then just copy this last file to the archive manually.
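If the crashed master's disk is still readable, one rough way to salvage its remaining WAL segments (a sketch; the data directory path is an assumption, and /data is the archive location from your archive_command) is to copy them into the archive by hand:

cp -n /var/lib/pgsql/data/pg_wal/0000* /data/    # -n: don't overwrite segments already archived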
I am very new to Postgres and, being new, I got stuck at a point and need some help. Please pardon me if you find it silly.
I am setting up pgpool HA, and at the Postgres level I have streaming replication between 3 nodes of postgresql-9.5: 1 master and 2 slaves.
I was trying to configure auto failover, but when I switched back to my original master and restarted the Postgres service, I got the following errors:
slave 1-highest timeline 1 of the primary is behind recovery timeline 11
slave 2-highest timeline 1 of the primary is behind recovery timeline 10
slave 3-highest timeline 1 of the primary is behind recovery timeline 3
I tried deleting the pg_xlog files on the slaves and copying all the files from the master's pg_xlog into the slaves, and then did an rsync.
I also did a pg_rewind, but it says:
target server needs to use either data checksums or wal_log_hints = on
(I have wal_log_hints = on set in postgresql.conf already)
I've tried doing a pg_basebackup, but since the database server on the slaves is still starting up, it's not able to connect to the server.
Is there any way to bring the master and the slaves onto the same timeline?
In my case, it happened because (experimentally) I had updated the standby database tables, and when I set up master-standby streaming replication again I got the same errors.
So once again I cleaned out the whole standby database directory and copied the master database over with a command like
"pg_basebackup -P -R -X stream -c fast -h 10.10.40.105 -U postgres -D standby/"
I think something is wrong in your pgpool configuration. What tool have you been using for management of replication and master-slave control? Is it postmaster or repmgr?
I was configuring pgpool with 3 data nodes using the tutorial at http://jensd.be/591/linux/setup-a-redundant-postgresql-database-with-repmgr-and-pgpool and had done it correctly.
Also, you can learn about auto failover here.
(This question is obviously a duplicate of this one, so I'll repeat the answer here as well.)
I'm not sure what exactly you mean by "when i switched back to my original master", but it looks like you are doing the worst possible thing in PostgreSQL streaming replication: introducing a second master.
The most important thing you should know about PostgreSQL replication is that once a failover is performed, you cannot simply "switch back to the original master". There is now a new master in the cluster, and the existence of two masters will cause damage.
After a slave is promoted to master, the only way for you to re-join the old master is to:
Destroy it (delete the data directory);
Join it as a slave.
If you want it to be the master again, continue with the following:
Let it run for a while as a slave so that it can sync the data;
Kill the temporary master and fail over to the old master;
Rejoin the temporary master as a slave.
You cannot simply switch master servers! A master can be created ONLY by failover (promoting a slave).
You should also know that whenever you are performing failover (whenever the master is changed), all slaves (except for the one that is promoted) need to be reconfigured to target the new master.
I suggest reading this tutorial; it'll help.
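To make the re-join steps above concrete, here is a rough sketch (the host name, replication user and data directory are placeholders; pg_basebackup -R writes a recovery.conf that points at the new master):

pg_ctl -D /var/lib/pgsql/9.5/data stop -m fast    # stop the old master
rm -rf /var/lib/pgsql/9.5/data/*                  # destroy its data directory
pg_basebackup -h new-master -U replicator -D /var/lib/pgsql/9.5/data -X stream -R -P
pg_ctl -D /var/lib/pgsql/9.5/data start           # it now runs as a slave of the new master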
Using postgres 9.3.
I'm a bit confused on the proper usage of pg_archivecleanup.
I'm using both streaming replication and backup with continuous archiving for PITR recovery.
I don't think I can configure pg_archivecleanup in recovery.conf on the standby as it wouldn't achieve anything. The master is not archiving to a location accessible to the standby. The master is archiving to a location on its local disk, and then those archives and the associated backup are being rsync'd to a large backup disk.
So, it seems the solution would be to run pg_archivecleanup in "standalone" mode on the master, such as:
/usr/lib/postgresql/9.3/bin/pg_archivecleanup -d /archive 0000000100000010000000F0.00000028.backup
So, I'd do a cron job that would run the pg_archivecleanup command for any .backup files which are older than the latest one, and then delete those backup files, leaving only the latest one.
Is my understanding and plan correct?
If you want to retain only WAL segments after the latest base backup, you simply run pg_archivecleanup in standalone mode for the latest .backup file (not for those older than the latest).
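A nightly cron job for that could look roughly like this (a sketch; the archive path matches your example, and the way the newest .backup file is picked is the part to double-check):

latest=$(ls /archive/*.backup | sort | tail -n 1)
/usr/lib/postgresql/9.3/bin/pg_archivecleanup /archive "$(basename "$latest")"

pg_archivecleanup then removes every segment older than the one named in that .backup file.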
But do you really want to have only one available backup? First of all, you won't be able to restore to a point in time before the last backup. Second, it makes sense to keep several backups just in case (corruption, etc.).
And it seems strange to archive segments to local disk and then rsync them elsewhere. Why not put your rsync (followed by sync to flush OS buffers to disk) into archive_command? This ensures that a segment won't be removed from pg_xlog before it reaches the destination.
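For example, something along these lines in postgresql.conf (host and path are placeholders; test the failure behaviour before relying on it):

archive_command = 'rsync -a %p backuphost:/backup/wal/%f && ssh backuphost sync'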
I have three different databases for my different environments (hsprd, hstst, hstrn). hsprd is my production environment with live data.
Every so often, a request comes through to restore production data to hstrn or hstst. I typically run this command (after stopping, then dropping the db):
db2 restore db hsprd taken at 20140331180002 to /dbs into hstrn newlogpath /dbs/log/hstrn without rolling forward;
When running this, I receive this message:
SQL2537N Roll-forward is required following the Restore.
Could someone advise how to fix this?
Thanks.
edit: My backups are here:
(/home/dbtmp/backups)> ll
total 22791416
-rwxrwxr-x 1 hsprd cics 11669123072 Mar 31 18:03 HSPRD.0.hsprd.NODE0000.CATN0000.20140331180002.001
After restoring my database and omitting "without rolling forward", I receive this message when trying to query the database:
SQL1117N A connection to or activation of database "HSTRN" cannot be made
because of ROLL-FORWARD PENDING. SQLSTATE=57019
When I try to roll forward with this command, I receive this response:
(/home/dbtmp/backups)> db2 rollforward db hstrn to end of backup and complete;
SQL4970N Roll-forward recovery on database "HSTRN" cannot reach the specified
stop point (end-of-log or point-in-time) on database partition(s) "0".
Roll-forward recovery processing has halted on log file "S0006353.LOG".
The first error suggests that you are restoring an online backup, which must be rolled forward. Alternatively, use an offline backup image; then you can include the WITHOUT ROLLING FORWARD option.
The second error means that you need to issue the ROLLFORWARD command before you can use the database restored from an online backup.
Finally the third error means that the ROLLFORWARD command is unable to find the logs required for it to succeed. Assuming the logs are included in the backup image, you'll need to specify the LOGTARGET option on the RESTORE command to extract them, presumably to the NEWLOGPATH location.
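Putting that together, a sketch of the whole sequence could be (the LOGTARGET directory /dbs/logs_from_backup is just an example path):

db2 "restore db hsprd taken at 20140331180002 to /dbs into hstrn logtarget /dbs/logs_from_backup newlogpath /dbs/log/hstrn"
db2 "rollforward db hstrn to end of backup and complete overflow log path (/dbs/logs_from_backup)"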
My company's website uses a PostgreSQL database. In our data center we have a master DB and a few read-only slave DB's, and we use Londiste for continuous replication between them.
I would like to setup another read-only slave DB for reporting purposes, and I'd like this slave to be in a remote location (outside the data center). This slave doesn't need to be 100% up-to-date. If it's up to 24 hours old, that's fine. Also, I'd like to minimize the load I'm putting on the master DB. Since our master DB is busy during the day and idle at night, I figure a good idea (if possible) is to get the reporting slave caught up once each night.
I'm thinking about using log shipping for this, as described on
http://www.postgresql.org/docs/8.4/static/continuous-archiving.html
My plan is:
Set up WAL archiving on the master DB
Produce a full DB snapshot and copy it to the remote location
Restore the DB and get it caught up
Go into steady state where:
DAYTIME -- the DB falls behind but people can query it
NIGHT -- I copy over the day's worth of WAL files and get the DB caught up
Note: the key here is that I only need to copy a full DB snapshot one time. Thereafter I should only have to copy a day's worth of WAL files in order to get the remote slave caught up again.
Since I haven't done log-shipping before I'd like some feedback / advice.
Will this work? Does PostgreSQL support this kind of repeated recovery?
Do you have other suggestions for how to set up a remote semi-fresh read-only slave?
thanks!
--S
Your plan should work.
As Charles says, warm standby is another possible solution. It's supported since 8.2 and has relatively low performance impact on the primary server.
Warm Standby is documented in the Manual: PostgreSQL 8.4 Warm Standby
The short procedure for configuring a standby server is as follows. For full details of each step, refer to previous sections as noted.
Set up primary and standby systems as near identically as possible, including two identical copies of PostgreSQL at the same release level.
Set up continuous archiving from the primary to a WAL archive located in a directory on the standby server. Ensure that archive_mode, archive_command and archive_timeout are set appropriately on the primary (see Section 24.3.1).
Make a base backup of the primary server (see Section 24.3.2), and load this data onto the standby.
Begin recovery on the standby server from the local WAL archive, using a recovery.conf that specifies a restore_command that waits as described previously (see Section 24.3.3).
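For the last step, the "restore_command that waits" is usually implemented with pg_standby on 8.x; a sketch of the standby's recovery.conf (the archive directory and trigger file are placeholders) would be:

restore_command = 'pg_standby -d -s 2 -t /tmp/pgsql.trigger /var/lib/pgsql/wal_archive %f %p %r 2>> standby.log'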
To achieve only nightly syncs, your archive_command should exit with a non-zero exit status during daytime.
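For instance, archive_command could call a small wrapper script (purely a sketch; the hours, archive path and script name are assumptions):

archive_command = '/usr/local/bin/archive_wal_nightly.sh %p %f'

where archive_wal_nightly.sh looks like:

#!/bin/sh
# archive_wal_nightly.sh <path-to-wal-segment> <wal-file-name>
# Copy WAL only between 22:00 and 05:59; at any other time exit non-zero so
# PostgreSQL keeps the segment in pg_xlog and retries it later.
case "$(date +%H)" in
    2[2-3]|0[0-5]) exec cp "$1" /mnt/wal_archive/"$2" ;;
    *)             exit 1 ;;
esac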
Additional information:
Postgres Wiki about Warm Standby
Blog Post Warm Standby Setup
9.0's built-in WAL streaming replication is designed to accomplish something that should meet your goals -- a warm or hot standby that can accept read-only queries. Have you considered using it, or are you stuck on 8.4 for now?
(Also, the upcoming 9.1 release is expected to include an updated/rewritten version of pg_basebackup, a tool for creating the initial backup point for a fresh slave.)
Update: PostgreSQL 9.1 will include the ability to pause and resume streaming replication with a simple function call on the slave.
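On the slave that is done with (function names as added in 9.1):

psql -c "SELECT pg_xlog_replay_pause();"     # stop applying WAL; queries see a frozen snapshot
psql -c "SELECT pg_xlog_replay_resume();"    # start applying WAL and catch up again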