This is specifically about maintaining confidence in replication solutions: that you could fail over to the other server without data loss, or, in a master-master situation, that you would know within a reasonable amount of time if one of the databases had fallen out of sync.
Are there any tools out there for this, or do people generally depend on the replication system itself to warn about inconsistencies? I'm currently most familiar with PostgreSQL WAL shipping in a master-standby setup, but am considering a master-master setup with something like PgPool. However, that solution is a little less directly tied to PostgreSQL itself (my basic understanding is that it provides the connection an app would use, intercepting the various SQL statements and sending them on to whatever servers are in its pool), so it got me thinking more about actually verifying data consistency.
Specific requirements:
I'm not talking about just table structure. I'd want to know that the actual record data is the same, so that I'd know if records were corrupted or missing (in which case, I would re-initialize the bad database with a recent backup + WAL files before bringing it back into the pool).
Databases are on the order of 30-50 GB. I doubt that raw SELECT queries would work very well.
I don't see the need for real-time checking (though it would, of course, be nice). Hourly or even daily would be better than nothing.
Block-level checking wouldn't work. It would be two databases with independent storage.
Or is this type of verification simply not realistic?
You can check the current WAL locations on both machines.
If they report the same value, the standby has received and replayed everything the primary has written, so your underlying databases are consistent with each other.
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (do it on primary host)
pg_current_xlog_location
--------------------------
0/2000000
(1 row)
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (do it on standby host)
pg_last_xlog_receive_location
-------------------------------
0/2000000
(1 row)
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (do it on standby host)
pg_last_xlog_replay_location
------------------------------
0/2000000
(1 row)
You can also check this with the help of the WAL sender and WAL receiver processes:
[do it on primary] $ ps -ef | grep sender
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000
[do it on standby] $ ps -ef | grep receiver
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000
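Since the question mentions hourly or daily checks, a small script along these lines could automate the comparison (a rough sketch using the example hosts above; note that the two positions can differ briefly on a busy primary even when replication is healthy):
primary_loc=$(psql -At -h 192.168.0.10 -c "SELECT pg_current_xlog_location()")
standby_loc=$(psql -At -h 192.168.0.20 -c "SELECT pg_last_xlog_replay_location()")
# -A (unaligned) and -t (tuples only) make psql print just the bare value
if [ "$primary_loc" != "$standby_loc" ]; then
    echo "WARNING: standby lagging behind primary ($primary_loc vs $standby_loc)"
fi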
If you are looking at the whole table, you should be able to do something like this (assuming the table quite easily fits in RAM):
SELECT md5(array_to_string(array_agg(mytable ORDER BY id), ' '))
FROM mytable;
That will give you a hash of the tuple representation of the table.
Note that you could break this down by ranges, etc. Depending on the type of replication you could even break it down by page range (for streaming replication).
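For example, broken down by id ranges it might look like this (a sketch only; mytable, id, and the 100000-row bucket size are placeholders). Run the identical query on both servers and compare the hashes bucket by bucket:
SELECT id / 100000 AS bucket,
       md5(array_to_string(array_agg(mytable ORDER BY id), ' ')) AS bucket_hash
FROM mytable
GROUP BY id / 100000
ORDER BY bucket;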
I'm running a looooong pg_restore process of a database with 70 tables and 800 GB. The process has been running for 5 days now. I'm monitoring some aspects of the process to estimate how long it will take, but some things are missing, which is why I'm asking.
I ran pg_dump with parameters -F d -j 10; the dump took about 12 hours. I noticed each of the 10 processes took responsibility for a single table from start to end. After finishing a table, the same process (PID) moved on to another table not taken by another process.
Running pg_restore is taking much longer (5 days and still going). The main reason is that I'm restoring to an external NAS drive mounted over NFS, and that drive is very slow compared to a local hard drive. This is NOT a problem; I'll migrate the data back from the NAS to the original hard drive once I reformat that drive and install the new operating system.
I'm doing two things to monitor progress:
In a separate terminal I launch du -sh /var/lib/pgsql and evaluate the disk space consumed in the new installation. It has to reach, more or less, the same space the original database was using.
In a separate terminal I launch ps -fu postgres and see several pg_restore processes running. Each one of them is linked to another process of the form postgres: postgres {dbname} [local] {command}, where {dbname} is the database name and {command} varies. Initially there was a COPY command, which I think was used to restore the table contents. I also saw some CREATE INDEX commands re-creating the indexes of that table, and now I see ALTER TABLE commands, though I don't know exactly what for.
At this time, all processes are just doing ALTER TABLE and the overall used space almost matches the initial space, but the process does not end (and it has been 5 days now).
So I'm asking whether someone with more experience can tell me what pg_restore is doing with the ALTER TABLE commands, and if there is any other mechanism to estimate how long it will take.
Thanks!
Ignacio
The ALTER TABLE statements at the end of a pg_restore create primary and unique keys as well as foreign key constraints. They could also be attaching partitions, but that is normally very fast.
Look into pg_stat_progress_create_index if you have a recent enough PostgreSQL version (you didn't say); then you can monitor the progress of the primary and unique key indexes being created.
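For example (a sketch; the view exists in PostgreSQL 12 and later, and blocks_done/blocks_total only give a rough progress indication):
SELECT pid,
       relid::regclass AS table_being_indexed,
       phase,
       blocks_done,
       blocks_total
FROM pg_stat_progress_create_index;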
I have a PostgreSQL DB that is about 6 TB. I want to transfer this database to another server, using for example pg_dumpall. The problem I have is that I only have a 1 TB HD. What can I do to copy this database to the other new server, which has enough space? Let's suppose I cannot get another HD. Is there the possibility of doing partial backup files, uploading them to the new server, erasing the HD, and getting another batch of backup files until the transfer is complete?
This works here (proof of concept):
shell-command issued from the receiving side
the remote side dumps through the network connection
the local psql just accepts the commands from this connection
the data is never stored in a physical file
(for brevity, I only sent the table definitions, not the actual data: --schema-only)
You could have some problems with users and tablespaces (these are global to an installation in Postgres); pg_dumpall will dump and restore these too, IIRC.
#!/bin/bash
# run on the receiving side: dump the remote schema and pipe it straight into the local psql
remote=10.224.60.103
dbname=myremotedbname
pg_dump -h ${remote} --schema-only -c -C ${dbname} | psql
#eof
As suggested above, if you have a fast network connection between source and destination, you can do it without any extra disk.
However, for a 6 TB DB (which includes indexes, I assume), using the archive dump format (-Fc) could yield a database dump of less than 1 TB.
Regarding the "by parts" question: yes, it is possible using the table pattern (-t, --table):
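For example, run from the destination server, something like this streams a compressed custom-format dump straight into pg_restore without writing an intermediate file anywhere (a sketch only; the host and database names are placeholders, the target database is assumed to exist already, and pg_restore cannot use parallel -j jobs when reading from a pipe):
pg_dump -Fc -h 10.224.60.103 mydb | pg_restore -d mydb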
pg_dump -t TABLE_NAME ...
You can also exclude tables using -T, --exclude-table:
pg_dump -T TABLE_NAME ...
The above options (-t, -T) can be specified multiple times and can even be combined.
They also support patterns for specifying the tables:
pg_dump -t 'employee_*' ...
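Putting it together, a "by parts" transfer could look roughly like this (a sketch; table patterns, host, and database names are placeholders). Each part is dumped, restored to the new server, and deleted before the next part is dumped, so the local 1 TB disk only ever holds one part:
pg_dump -Fc -t 'employee_*' mydb > part1.dump
pg_restore -h newserver -d mydb part1.dump && rm part1.dump
pg_dump -Fc -T 'employee_*' mydb > part2.dump
pg_restore -h newserver -d mydb part2.dump && rm part2.dump
Note that dependencies between the parts (foreign keys, for instance) may need attention, e.g. restoring referenced tables first or adding constraints afterwards.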
Postgres's default storage location is on my C: drive. I would like to restore a backup to another database, but access it via the same Postgres server instance. The issue is that the DB is too big to be restored on the same C: drive. Would it be possible to tell Postgres that the second database should be restored and placed at another location/drive (while keeping the first one where it is)? Like database1 on my C: drive and database2 on my D: drive?
Otherwise, the second-best solution would be to install 2 separate Postgres instances, but that also seems a bit overkill?
That should be entirely achievable if you've used the Postgres pg_dump command.
The pg_dump command does not create the database, so you create it yourself first. Use CREATE TABLESPACE to specify the location.
CREATE TABLESPACE secondspace LOCATION 'D:\postgresdata';
CREATE DATABASE seconddb TABLESPACE secondspace;
This creates an empty database on the D: drive.
Then the standard restore from a pg_dump should work:
psql seconddb < dumpfile
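If the backup was instead made with pg_dump's custom format (-Fc), the equivalent would be pg_restore rather than psql (just an aside, assuming a custom-format archive named dumpfile):
pg_restore -d seconddb dumpfile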
Replication
Sounds like you need database replication.
There are several ways to do this with Postgres, one built-in, and other approaches using add-on libraries.
Built-in replication feature
The built-in replication feature is likely to suit your needs. See the manual. In this approach, you have an instance of Postgres running on your primary server, doing reads and writes of your data. On a second server, an entirely separate computer, you run another instance of Postgres known as the replica. You first set up the replica by doing a full backup of your database on the first server and restoring it to the second server.
Next you configure the replication feature. The replica needs to know it is playing the role of a replica rather than a regular database server. And the primary server needs to know the replica exists, so that every database change, every insert, modification, and deletion, can be communicated.
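As a rough illustration only (a sketch assuming a 9.x-era setup, to match the function and file names used elsewhere in this discussion; paths and values are placeholders, and newer versions use different settings and no recovery.conf):
On the primary, in postgresql.conf:
wal_level = hot_standby
archive_mode = on
archive_command = 'cp %p /path/to/wal_archive/%f'   # ship each finished WAL file somewhere the replica can read it
On the replica, in recovery.conf:
standby_mode = 'on'
restore_command = 'cp /path/to/wal_archive/%f %p'   # pull shipped WAL files back in
primary_conninfo = 'host=primary.example.com user=replicator'   # only needed if also using streaming replication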
WAL
This communication happens via WAL files.
With the Write-Ahead Log (WAL) feature in Postgres, the database writes all changes first to the WAL, and only after that is complete does it write to the actual database files. In case of a crash, power outage, or other failure, the database upon restarting can detect a transaction left incomplete. If incomplete, the transaction is rolled back, and the database server can try again by consulting the "To-Do" list of work recorded in the WAL.
Every so often the current WAL file is closed, and a new WAL file is created to take over the work. With replication enabled, the closed WAL file is copied to the replica. The replica then incorporates that WAL file, following the same "To-Do" list of changes written in it. So all changes are made to the replica database exactly as they were made to the primary database. Your replica is an exact match of the primary, except for a slight lag in time. The replica is always just one WAL file behind the progress of the primary.
In times of trouble, the replica serves as a warm standby. You can shut down the primary, then tell the replica that it is now the primary. You can even configure the replica to be a hot standby, meaning it will automatically take over when the primary seems to have failed. There are pros and cons to hot standby.
Offload read-only queries
As a bonus feature, the replica can be used for read-only queries. If your database is heavily used, you can offload some of the work burden from your primary to the replica. Any queries that do not require the absolute latest information can be shifted by connecting to the replica rather than the original. For example, a quarterly sales report likely does not need the latest data stored in the active WAL file that has not yet arrived on the replica.
Physical replication means all databases are copied
Caveat: This built-in replication feature is physical replication. This means all the changes to the entire Postgres installation (formally known as a cluster, not to be confused with a hardware cluster) are copied to the replica. If you use one Postgres server to serve multiple databases, all those databases must be replicated – you cannot pick and choose which get copied over. There may be alternative replication features in the future related to logical replication.
More to learn
I am being brief here. The topics of replication, high-availability, and disaster-recovery are broad and complex, too much for an Answer on Stack Overflow.
Tip: This kind of Question might have been better asked on the sister site, DBA.StackExchange.com.
I have xlog questions that I'm not sure about.
1) I have two servers that were once slaves. How can I know if they were slaves of the same master? Is it possible to check if they were split from the same source in the past? I know pg_rewind knows how to check this, but is it possible to easily check it without running pg_rewind in dry-run mode?
2) Is it true that if pg_last_xlog_replay_location is empty this server was never a slave?
3) Is it possible to know from the database itself to which master the slave is connected? I know how to get this info from recovery.conf or from the process attributes, but is it written in some system tables as well?
Thanks
Avi
were slaves of the same master
Indirectly. You can compare the output of
SELECT xmin, ctid, oid, datname FROM pg_database;
Of course, dropping and recreating the postgres and template databases will change those values, so this is very unreliable. But if you check them and find that ALL identifiers match, there's a good chance that the databases have the same source.
A more reliable and sophisticated method is comparing the timeline history file. E.g., check whether both ex-slaves have the same timeline - in the case below, 4:
-bash-4.2$ psql -d 'dbname=replication replication=true sslmode=require' -U replica -h 1.1.1.1 -c 'IDENTIFY_SYSTEM'
Password for user replica:
systemid | timeline | xlogpos
---------------------+----------+--------------
9999384298900975599 | 4 | F79/275B2328
(1 row)
you can check timelines history:
-bash-4.2$ psql -d 'dbname=replication replication=true sslmode=require' -U replica -h 1.1.1.1 -c 'TIMELINE_HISTORY 4'
Password for user replica:
filename | content
------------------+------------------------------------------------------
00000004.history | 1 9E/C3000090 no recovery target specified+
| +
| 2 C1/5A000090 no recovery target specified+
| +
| 3 A52/DB2F98B8 no recovery target specified+
|
(1 row)
If both servers have the same timeline and the same xlog position at which the timeline was created, you can say with much reliability, I believe, that they came from the same source.
empty pg_last_xlog_replay_location
I would say so. It was never a slave and was never recovered from WALs. At least I don't know how to reset pg_last_xlog_replay_location on a promoted master...
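A quick way to check this on a given server (a sketch; on a server that has never replayed WAL the function returns NULL):
SELECT pg_last_xlog_replay_location() IS NULL AS never_replayed;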
system tables to tell to which master the slave is connected
Nothing suitable comes to my mind. If you are a superuser, you can read recovery.conf even without shell access; if you're not, you probably would not be able to select from such a view anyway...
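For the superuser case, a quick illustration (recovery.conf lives in the data directory, and pg_read_file is restricted to superusers here):
SELECT pg_read_file('recovery.conf');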
I'm trying to move a database from server1 to server2. I read the Postgres documentation, and I think everything is right, except that after I dumped the DB from server1, moved it, and restored it on server2, the sizes are different.
Server1
SELECT pg_size_pretty(pg_database_size('db_name'));
pg_size_pretty
----------------
118 MB
(1 row)
Server2
select pg_size_pretty(pg_database_size('db_name'));
pg_size_pretty
----------------
81 MB
(1 row)
I made the dump with the -a -Fc -Z9 flags and restored with pg_restore -U user -c -d db_name dump_file.dump
My questions are:
Why are the sizes different?
What is the correct approach to move a database like this if the application that accesses the DB is a Rails one? (I mean, I want a restore that doesn't affect future Rails migrations.)
Do you have other ideas? Other documentation that I can read?
Thank you for reading this.
This is fine and normal.
Dump and reload produces a more compact database because there's no dead space in the tables and the b-tree indexes are newly built, so they're packed and well balanced. You'll find the size is the same or much closer if you run:
VACUUM FULL;
REINDEX DATABASE mydb;
on the main DB.
On a side note, though, I strongly recommend restoring using the -1 option to pg_restore unless you need parallel restore. That way you'll either get an empty DB or a complete restore. Of course, you should also always check the return codes from pg_dump and pg_restore.
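A rough sketch of that suggestion (names taken from the question; checking the exit status this way is just one option):
pg_restore -U user -1 -d db_name dump_file.dump
if [ $? -ne 0 ]; then
    echo "pg_restore failed; the single transaction was rolled back" >&2
fi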
No comment on the Rails part; I don't know what you're referring to. Please don't post multi-part questions like this: they're hard to answer definitively, and you get different "correct" answers for different parts. Post a new SO question for each new question.