Validating the Postgres upgrade using logical replication

Currently I am trying to upgrade my PostgreSQL 9.1 to 10 using logical replication. As 9.1 does not support native logical replication, I tried Slony-I and made a replica successfully.
P.S. The replica I created above uses a sample dump from a year ago, which is only 800 MB.
Now I have a few questions.
How can I validate that the replica has replicated all the data successfully? A few people suggested putting the master into maintenance mode (a short downtime) and comparing the last N rows of every table in both databases.
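For illustration, a crude version of that comparison could look like the sketch below (a minimal sketch only; the hostnames, user, and database name are placeholders, and it assumes both servers are reachable from one machine):

    #!/bin/bash
    # Compare per-table row counts between the old master and the new replica.
    # The connection strings are placeholders.
    OLD="host=old-master dbname=mydb user=postgres"
    NEW="host=new-replica dbname=mydb user=postgres"

    TABLES=$(psql "$OLD" -At -c "SELECT schemaname || '.' || tablename
                                 FROM pg_tables
                                 WHERE schemaname NOT IN ('pg_catalog', 'information_schema');")

    for t in $TABLES; do
        old_count=$(psql "$OLD" -At -c "SELECT count(*) FROM $t;")
        new_count=$(psql "$NEW" -At -c "SELECT count(*) FROM $t;")
        [ "$old_count" = "$new_count" ] || echo "MISMATCH $t: $old_count vs $new_count"
    done

Matching counts alone would not prove the contents are identical, which is part of what I'm unsure about.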
I tried with 800 MB. Will there be any problems when I try with 100+ GB?
Please share your personal experience here. I have been documenting the things that could go wrong so that I can anticipate the next course of action.

You may use the Data Validator that ships with the trial version of EDB Postgres Replication Server to validate the data between the old and new PostgreSQL databases.
You can read the details in the Data Validator Document.
To download EDB Replication Server, please follow this link: EDB Replication Server
Disclosure: I work for EnterpriseDB (EDB)

Related

I want to set up streaming replication between two different PostgreSQL versions

My primary server is PostgreSQL 9.4 and the secondary server is PostgreSQL 13. I followed all the steps, but while restarting the secondary server I get the error "An old version of the database format was found. You need to dump and reload before using PostgreSQL 13." How should I resolve it?
You cannot have streaming replication between different PostgreSQL major versions, and you cannot have logical replication with versions older than v10.
You will have to use trigger-based replication, for example with Slony-I.
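For orientation, a two-node Slony-I setup script (run through slonik) looks roughly like the sketch below; the cluster name, hosts, and table are placeholders, so check the Slony-I documentation for the authoritative syntax:

    #!/bin/bash
    # Rough sketch of a two-node Slony-I setup; all names and conninfo
    # values below are placeholders.
    slonik <<'EOF'
    cluster name = upgrade_cluster;
    node 1 admin conninfo = 'dbname=mydb host=old-primary user=slony';
    node 2 admin conninfo = 'dbname=mydb host=new-replica user=slony';

    init cluster (id = 1, comment = 'old primary');
    store node (id = 2, comment = 'new replica', event node = 1);
    store path (server = 1, client = 2, conninfo = 'dbname=mydb host=old-primary user=slony');
    store path (server = 2, client = 1, conninfo = 'dbname=mydb host=new-replica user=slony');

    create set (id = 1, origin = 1, comment = 'all tables');
    set add table (set id = 1, origin = 1, id = 1,
                   fully qualified name = 'public.my_table');

    subscribe set (id = 1, provider = 1, receiver = 2, forward = no);
    EOF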

Is it possible to upgrade Postgres 9.6 to 10 with zero downtime? What approaches can be followed?

I'm trying to upgrade my Postgres database from 9.6 to 10 without any downtime. Can this be done?
As "a_horse_with_no_name" mentioned logical replication is a very good choice in your situation.
The only problem is that 9.6 does not have a built-in implementation yet, so you would have to use the "pglogical" extension on both databases. I found a description here - https://rosenfeld.herokuapp.com/en/articles/infrastructure/2017-11-10-upgrading-postgresql-from-9-6-to-10-with-minimal-downtime-using-pglogical - skip the parts about Docker and see how pglogical works.
The only slight complication is that pglogical must be added to the "shared_preload_libraries" parameter and the PostgreSQL service must be restarted, which can sometimes be difficult in production...
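A minimal sketch of that step, assuming a Debian-style layout (the config path, service name, and database name are assumptions):

    # Enable pglogical; the restart is unavoidable because of shared_preload_libraries.
    # Appending works because the last occurrence of a parameter wins in postgresql.conf.
    echo "shared_preload_libraries = 'pglogical'" >> /etc/postgresql/9.6/main/postgresql.conf
    systemctl restart postgresql
    psql -d mydb -c "CREATE EXTENSION pglogical;"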
I did quite a lot of testing with pglogical (here are some notes - http://postgresql.freeideas.cz/pglogical-postgresql-9-6-small-hints-debian/), although in the end I never used pglogical in production, so I have no experience with long-term usage.
But I presume some problems will be similar to those in the built-in implementation of logical replication in PG 10 and 11. So here are my notes from my current usage of built-in logical replication on PG 11 - http://postgresql.freeideas.cz/setting-logical-replication-is-not-entirely-straight-forward/ - maybe something in them will help you.
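For comparison, the built-in mechanism in PG 10+ boils down to a publication on the source and a subscription on the target; a minimal sketch with placeholder hosts and names:

    # On the publisher (source) database:
    psql -h publisher -d mydb -c "CREATE PUBLICATION my_pub FOR ALL TABLES;"
    # On the subscriber (target) database:
    psql -h subscriber -d mydb -c "
        CREATE SUBSCRIPTION my_sub
            CONNECTION 'host=publisher dbname=mydb user=repuser'
            PUBLICATION my_pub;"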
My recommendations for you would be:
Make a hot backup copy of your PG 9.6 database on another machine or a cloud VM with exactly the same OS and, if possible, the same disk types and configuration, using pg_basebackup (see the sketch after this list) - you can find some inspiration here:
http://postgresql.freeideas.cz/pg_basebackup-bash-script-backup-archiving-google-storage/
http://postgresql.freeideas.cz/pg_basebackup-experiences/
Or, if you already use pg_basebackup for tar backups of your DB, restore the latest backup on another machine or VM (http://postgresql.freeideas.cz/pg_basebackup-pgbarman-restore-tar-backup/).
Start it as a normal server (not as a hot standby) and test pglogical on this copy of your DB against a test installation of PG 10. Test as close to the production environment as possible, including at least simulated DML operations of similar intensity; this will show you the differences in load on the machine/VM.
I highly recommend setting up monitoring, for example using Telegraf + InfluxDB + Grafana (the easiest implementation in my opinion), so you can analyze CPU and memory usage later; this can be a crucial part of production usage!
After hopefully short and successful tests, implement it and celebrate your success :-) And please write about your experiences, because I believe a lot of people would welcome it.
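A minimal pg_basebackup sketch for the first step above (host, user, and target directory are placeholders; a replication user and a matching pg_hba.conf entry are assumed to exist):

    # Stream a full copy of the 9.6 primary into an empty data directory on the test VM.
    pg_basebackup -h old-primary -U replicator \
        -D /var/lib/postgresql/9.6/main \
        -X stream -P -v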

How do I restore a database dump to a Citus cluster?

While restoring a (pg_dump-produced) database dump, I get the following error:
Cannot execute COPY FROM on a distributed table on master node
How can I work around this?
COPY support was added in Citus 5.1, which was released in May 2016 and is available in the official PostgreSQL Linux package repositories (PGDG).
Are you trying to load data from pg_dump output? Creating distributed tables is slightly different from creating regular tables and requires picking a partition column and a partitioning method. Take a look at the docs for more information on both.
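A minimal sketch of the order of operations (the table, column, and file names are placeholders; recent Citus releases use create_distributed_table, while the 5.x series used master_create_distributed_table plus master_create_worker_shards):

    # Declare the table as distributed first, then load the data into it.
    psql -d mydb -c "SELECT create_distributed_table('my_table', 'tenant_id');"
    psql -d mydb -c "\copy my_table FROM 'my_table.csv' WITH CSV"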

Migrate database from Heroku to AWS

I want to migrate our Postgres DB from Heroku to our own Postgres instance on AWS.
I have tried using pg_dump and pg_restore to do the migration, and it works, but it takes a really long time. Our database size is around 20 GB.
What's the best way to do the migration with minimal downtime?
If you mean AWS RDS PostgreSQL:
pg_dump and pg_restore
I know you don't like it, but you don't really have other options. With a lot of hoop-jumping you might be able to do it with Londiste or Slony-I on a nearby EC2 instance, but it'd be ... interesting. That's not the most friendly way to do an upgrade, to say the least.
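That said, a directory-format dump with parallel jobs usually cuts the time considerably; a sketch with placeholder connection details:

    # Parallel dump from Heroku, parallel restore into RDS; the hosts and the
    # $HEROKU_DATABASE_URL variable are placeholders. (-j needs pg_dump 9.3+.)
    pg_dump "$HEROKU_DATABASE_URL" -Fd -j 4 -f /tmp/mydb.dump
    pg_restore -h myinstance.rds.amazonaws.com -U master -d mydb -j 4 /tmp/mydb.dump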
What you should be able to do is ship WAL into RDS PostgreSQL and/or stream replication logs; however, Amazon doesn't support this.
Hopefully Amazon will adopt some parts of 9.4's logical replication and logical changeset extraction features, or better yet the BDR project, but I wouldn't hold my breath.
If you mean AWS EC2
If you're running your own EC2 instance with Pg, use replication and then promote the standby to be the new master.
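The cutover itself is then just promoting the standby once it has caught up; a sketch, with a placeholder data directory:

    # Promote the caught-up EC2 standby to be the new master.
    pg_ctl promote -D /var/lib/postgresql/9.3/main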

How to salvage data from Heroku Postgres

We are using Heroku Postgres with Ruby on Rails 3.2.
A few days ago, we deleted important data by mistake using 'heroku run db:load' with a misconfigured data.yml; that is, it dropped the tables and recreated them with almost no data.
The most recent available backup is from 2 weeks ago, so we lost 2 weeks of data.
So we need to recover not from PG Backups/pg_dump but from PostgreSQL's system data files.
I think the only way to recover the data is to restore it from the xlog or archive files, but of course we don't have the superuser/replication-role permissions needed to copy the Postgres database on Heroku (or Amazon EC2) to a local server.
Has anyone confronted such a case and resolved the problem?
Your only option is the backups provided by the PgBackups service (if you had that running). If not, Heroku support might have more options available.
At a minimum, you will have some data loss, but you can guarantee you won't do it again ;)
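To make good on that guarantee, scheduled backups can be turned on from the Heroku CLI; a sketch with a placeholder app name (older toolbelt versions exposed this through the pgbackups addon commands instead):

    # Capture a backup now and schedule a daily one.
    heroku pg:backups:capture --app my-app
    heroku pg:backups:schedule DATABASE_URL --at '02:00 UTC' --app my-app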