I want to migrate our postgres db from heroku to our own postgres on AWS.
I have tried using pg_dump and pg_restore to do the migration and it works; but it takes a really long time to do this. Our database size is around 20GB.
What's the best way to do the migration with minimal downtime?
If you mean AWS RDS PostgreSQL:
pg_dump and pg_restore
I know you don't like it, but you don't really have other options. With a lot of hoop jumping you might be able to do it with Londiste or Slony-I on a nearby EC2 instance, but it'd be ... interesting. That's not the most friendly way to do an upgrade, to say the least.
What you should be able to do is ship WAL into RDS PostgreSQL, and/or stream replication logs. However Amazon don't support this.
Hopefully Amazon will adopt some part of 9.4's logical replication and logical changeset extraction features, or better yet the BDR project - but I wouldn't hold my breath.
If you mean AWS EC2
If you're running your own EC2 instance with Pg, use replication then promote the standby into the new master.
Related
We want to migrate an external postgresql database into amazon RDS however for some time we need to keep both of them working and in sync. I have found ways of doing it but only with RDS being the master and not the Slave. Is there a good and viable solution which could help us?
There is AWS service Database Migration Service which can be used to migrate external databases to Amazon AWS RDS.
https://docs.aws.amazon.com/dms/latest/userguide/Welcome.html
It supports postgresql 9.4 and higher.
There are different task types and you need to use Ongoing Replication to keep them sync.
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Task.CDC.html
To setup that task you need to enable Logical Replication for source database
I'm using postgres 9.5.7 in RDS and want to create a slave/read replica on an EC2 box. I've figured out how to get logical replication working on RDS and am able to use pg_recvlogical to tap into the replication slot on the EC2 box.
My challenge now has been that, unfortunately, RDS doesn't support pglogical and it seems that I'm left with either test_decoding or wal2json for my output formats. Is there something out there that knows how to take either of those formats and turn them into SQL that can be executed on the slave?
Most of the guides I've found online only go as far as getting pg_recvlogical working, and don't take that extra last step of showing how to actually get those changes into the slave database.
perhaps you want to check https://wiki.postgresql.org/wiki/Logical_Decoding_Plugins#decoder_raw
this plugins output a sql statements that can be run in slave postgres
I have a requirement of replicate data from AWS RDS Postgres(9.6) Database to On-Premise Postgres(9.5) Database. I have found stuff about replication from On-premise to On-premise. But How can we implement it for AWS RDS to On-premise?
I do this using Bucardo.
Check-out this: https://bucardo.org/Bucardo/
With Bucardo you can replicate RDS postgres instance to a slave postgres present somewhere, only configuring slave, so without the needs to configure RDS stuff.
Also you can do this with zero downtime.
Anyway I am not sure this will work using different versions of Postgresql. You should use same version if possible. I tested it with 9.4.x and it is working.
UPDATE
I can confirm that this is working also using different version of Postgres, for example I was able to replicate with these versions:
AWS RDS postgresql 9.4.x
On-premise postgresql 9.6.x
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html#PostgreSQL.Concepts.General.FeatureSupport.LogicalReplication
Beginning with PostgreSQL version 9.4, PostgreSQL supports the
streaming of WAL changes using logical replication slots. Amazon RDS
supports logical replication for a PostgreSQL DB instance version
9.4.9 and higher and 9.5.4 and higher. Using logical replication, you can set up logical replication slots on your instance and stream
database changes through these slots to a client like pg_recvlogical.
Logical slots are created at the database level and support
replication connections to a single database.
mind possible problems eg https://dba.stackexchange.com/questions/173267/aws-rds-postgres-logical-replication
I am considering using AWS Aurora, however I am concerned for being locked into AWS indefinitely. So I am wondering how difficult it would be to transfer data from Aurora to my own Postgres database.
Thanks!
This is a very valid concern. Firstly, there is no seamless migration like there is from Postgres to Aurora. Following, needs to be considered:
How to do it: You will have to take a dump of your aurora db and then import it into postgres.
Because of 1 above; you cannot have concurrent CURD operations running on your aurora during migration. Hence, you need to shut down all products connecting to your aurora till you migrate to Postgres. Hence, there will be downtime.
Because of 2 ; Depending on size of your DB; it might take few mins ( few GB of data ) to many hours if you have huge DB.
Hence, you need to consider how much data you have and how much downtime you can live with if you want to migrate back to Postgres.
what's a good way to backup a Postgres DB (running on Amazon RDS).
The built in snapshoting from RDS is by default daily and you can not export the snapshots. Besides that, it can take quite a long time to import a snapshot.
Is there a good service that takes dumps on a regular basis and stores them on e.g. S3? We don't want to spin up and maintain a ec2 instance which does that.
Thank you!
I want the backups to be automated, so I would prefer to have dedicated service for that.
Your choices:
run pg_dump from an EC2 instance on a schedule. This is a great use case for Spot instances.
restore a snapshot to a new RDS instance, then run pg_dump as above. This reduces database load.
Want to run a RDS snapshot more often than daily? Kick it off manually.
These are all automateable. For "free" (low effort on your part) you get daily snapshots. I agree, I wish they could be sent to S3.
SOLUTION: Now you can do a pg_dumpall and dump all Postgres databases on a single AWS RDS Instance.
It has caveats and so its better to read the post before going ahead and compiling your own version of pg_dumpall for this. Details here.