How to set up an on-premises PostgreSQL replica of a master database in AWS RDS PostgreSQL?

I need to check whether an exact copy of a master database in AWS RDS can be created on premises.
I have already established connectivity between the on-premises network and AWS, and I have tested data migration using pg_dump. But I cannot work out how to create the replica without using DMS; for security reasons we are not allowed to use DMS. Is there any other way to implement this?
Any help will be much appreciated.

It appears that your goal is disaster recovery.
Amazon RDS offers a few options for this:
Amazon RDS Snapshots are a backup of the database, stored in a region. If your database is in an Availability Zone that fails, the snapshot can be restored as a new database in another AZ. All AZs are physically separate data centers, much like your own data center is physically separate from an AWS data center.
Snapshots can also be copied to other Regions, which would guarantee a separation distance between data centers.
Multi-AZ Amazon RDS Databases keep a second copy of the data in another AZ and can switch over to the alternate AZ without losing any data. This is faster than restoring a snapshot, but costs twice as much since two separate database servers are deployed.
These options would be easier to manage than replicating your data to an on-premises system. A Multi-AZ deployment will automatically start the secondary instance, so your app can continue operating with only a short delay and no data loss. This is much better than what you could offer by failing over to an on-premises system.

Related

Migrate data from Citus to RDS

Since Citus is not going to be available as a Managed Service in AWS, I am trying to move the database to RDS (not the whole history, only the transactional portion, as an OLTP database). The migration from Citus is not clear because the data does not reside on a single node. I want to check what options we might have to move data from Citus to RDS.
Amazon DMS: This option is good for the supported databases (PostgreSQL) but we do not know what behavior this will have in Citus from the distributed nature of the engine. Has someone migrated the data to S3, to another DB or something in these lines?
I saw this paper from AWS https://d1.awsstatic.com/whitepapers/aws-cloud-data-ingestion-patterns-practices.pdf?did=wp_card&trk=wp_card on how to ingest data from different sources, and DMS seems like a good option, but I do not know the internals of Citus well enough to tell whether we would get all the data and capture the CDC changes correctly.
A Custom migration: Via a support ticket, we can access the S3 buckets that Citus uses for Disaster recovery where the WAL logs are available and we could use something like WAL-G to take those logs and replicate them in a Postgres instance. The issue here is that this is a very custom migration and the development time might be too high.
Is there any other option for moving data from Citus to RDS or Aurora in AWS? What looks like a good path for the database migration? All the documents describe moving data the other way around, from Aurora or RDS to Citus.
Sumedh from Citus Cloud here. Please go ahead and open a support ticket with us to further investigate solutions. We can evaluate if using DMS is a viable approach for your use-case.

Can you use AWS DMS to move Aurora DB from one account to another?

I am trying to migrate an Aurora cluster from one of our accounts to another. We do not have a lot of write requests and the database itself is quite small, but we have decided to minimize the downtime anyway.
I have looked into several options:
Use a snapshot: stop writes to the source DB, take a snapshot, share it, and restore it in the other account. This would definitely introduce some downtime.
Use Aurora cloning: stop writes to the source DB, clone the cluster into the target account, and switch to the target DB. According to AWS, cloning is much faster than taking and restoring a snapshot, so the downtime should be shorter.
I am not sure whether I can use DMS for this, as I did not find useful docs or tutorials about moving Aurora across accounts. I am also not sure whether DMS will sync write requests to the target DB during the migration.
If DMS cannot sync live, then I should probably use Bucardo for a live migration.
Looking at the docs, AWS Aurora with PostgreSQL compatibility is allowed as source & target endpoints. So, answering your question, yes it's possible.
Obviously, your source Aurora DB should be accessible from the target account. Check that the DB endpoint is public and that the traffic is not restricted by network ACL or security group rules.
Also, if you want to enable ongoing replication, you need to grant the rds_replication (or rds_superuser) role to the source database user. Link to the docs.
We actually ended up using DMS for this migration. What we did was:
Take a snapshot of the source DB in the original account.
Share the snapshot with the target account and restore it there. (You have to use a snapshot to migrate things like triggers, custom types, sequences, etc.)
Set up connectivity (such as VPC peering or security groups) between the two accounts.
Set up DMS in the source account (endpoints, replication instance, task).
Write SQL to temporarily disable or delete constraints, triggers, etc. that may cause errors when loading the source data.
Use DMS to load the source data and enable ongoing replication.
Re-enable or re-add the constraints, triggers, etc.
Run post-migration tests.
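The disable and re-enable steps above could be scripted roughly as follows. This is a minimal sketch, not the SQL that was actually run for this migration: the table names and helper functions are hypothetical, and it assumes PostgreSQL's ALTER TABLE ... DISABLE TRIGGER USER, which only covers user-defined triggers, so foreign keys and other constraints would need separate handling.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def trigger_statements(tables, enable):
    """Build one ALTER TABLE statement per table to switch user triggers on or off."""
    action = "ENABLE" if enable else "DISABLE"
    return [
        f"ALTER TABLE {quote_ident(table)} {action} TRIGGER USER;"
        for table in tables
    ]


# Hypothetical table names, for illustration only.
tables = ["orders", "order_items"]

before_load = trigger_statements(tables, enable=False)  # run before the DMS load
after_load = trigger_statements(tables, enable=True)    # run once the target has caught up

for statement in before_load + after_load:
    print(statement)
```

Generating both lists from the same table list makes it harder to forget to re-enable a trigger that was disabled before the load.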

Real-time sync between a local Postgres instance and an Azure cloud Postgres instance

I need to set up a real-time sync process between an on-premises PostgreSQL instance and a cloud PostgreSQL instance. Please let me know what options are available to achieve this.
Do I have to use a specific tool, or can it be managed through replication?
Please advise.
Use PgPool
http://www.pgpool.net/mediawiki/index.php/Main_Page
from their web page:
pgpool-II can manage multiple PostgreSQL servers. Using the replication function enables creating a realtime backup on 2 or more physical disks, so that the service can continue without stopping servers in case of a disk failure.

How to set up a cross-region replica of AWS RDS for PostgreSQL

I have an RDS for PostgreSQL setup in Asia and would like to have a read copy in the US.
Unfortunately, I just found on the official site that only RDS for MySQL has cross-region replicas, not RDS for PostgreSQL.
I also saw this page, which introduces other ways to migrate data into and out of RDS for PostgreSQL.
Short of paying for an EC2 instance and installing PostgreSQL on it myself in the US, is there any way to synchronize data from the Asia RDS instance to a US RDS instance?
It all depends on the purpose of your replication. Is it to provide a local data source and avoid network latency?
Assuming that your goal is to have cross-region replication, you have a couple of options.
Custom EC2 Instances
You can create your own EC2 instances and install PostgreSQL so you can customize replication behavior.
I've documented configuring master-slave replication with PostgreSQL on my blog: http://thedulinreport.com/2015/01/31/configuring-master-slave-replication-with-postgresql/
Of course, you lose some of the benefits of AWS RDS, namely automated multi-AZ redundancy, etc., and now all of a sudden you have to be responsible for maintaining your configuration. This is far from perfect.
Two-Phase Commit
An alternative is to build replication into your application. One approach is to use a database driver that can do this, or to implement your own two-phase commit. If you are using Java, some ideas are described here: JDBC - Connect Multiple Databases
Use SQS to uncouple database writes
Ok, so this one is the one I would personally prefer. For all of your database writes you should use SQS and have background writer processes that take messages off the queue.
You will need a writer in the Asia region and a writer in the US region. To publish to SQS across regions, you can use an SNS configuration that publishes each message onto multiple queues: http://docs.aws.amazon.com/sns/latest/dg/SendMessageToSQS.html
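The fan-out idea can be sketched without touching AWS at all. The following is an in-memory illustration only: Topic and drain are hypothetical stand-ins for an SNS topic and a background writer, the standard-library queue.Queue stands in for SQS, and plain dicts stand in for the two databases.

```python
import queue


class Topic:
    """Stand-in for an SNS topic: copies every message to each subscribed queue."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, q):
        self.subscribers.append(q)

    def publish(self, message):
        for q in self.subscribers:
            q.put(message)


def drain(q, database):
    """Stand-in for a background writer: apply every queued write to one database."""
    while True:
        try:
            key, value = q.get_nowait()
        except queue.Empty:
            return
        database[key] = value


asia_queue, us_queue = queue.Queue(), queue.Queue()
writes = Topic()
writes.subscribe(asia_queue)
writes.subscribe(us_queue)

# The application publishes writes instead of touching a database directly.
writes.publish(("order:1", "created"))
writes.publish(("order:1", "paid"))

asia_db, us_db = {}, {}
drain(asia_queue, asia_db)  # writer running in the Asia region
drain(us_queue, us_db)      # writer running in the US region
```

Each region applies the same ordered stream of writes independently, which is why the two copies converge as long as no message is lost or applied out of order.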
Of course, unlike a two-phase commit, this approach is subject to bugs, and it is possible for your US database to get out of sync. You will need to implement a reconciliation process. A simple one is a pg_dump from Asia and a pg_restore into the US on a weekly basis to re-sync it, for instance. Another approach is something like a Cassandra read-repair: on every 10th read from your US database, spin up a background process that runs the same query against the Asia database, and if the results differ, kick off a process to replay some messages.
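The read-repair idea might look something like this. It is a rough sketch under stated assumptions: ReadRepairReader and its callbacks are hypothetical names, dict lookups stand in for real queries, and the check runs inline here, whereas in practice you would hand it to a background process so the read is not delayed.

```python
class ReadRepairReader:
    """Serve reads from the local copy and spot-check every Nth read against the primary."""

    def __init__(self, query_local, query_primary, on_mismatch, every=10):
        self.query_local = query_local
        self.query_primary = query_primary
        self.on_mismatch = on_mismatch  # e.g. kick off a message replay
        self.every = every
        self.reads = 0

    def read(self, query):
        result = self.query_local(query)
        self.reads += 1
        if self.reads % self.every == 0:
            expected = self.query_primary(query)
            if expected != result:
                self.on_mismatch(query)
        return result


# Dicts stand in for the two databases; the US copy has missed an update.
asia_db = {"order:1": "paid"}
us_db = {"order:1": "created"}
repairs = []

reader = ReadRepairReader(us_db.get, asia_db.get, repairs.append, every=10)
results = [reader.read("order:1") for _ in range(10)]
```

The caller always gets the local result; the mismatch callback only records that a repair is needed, so a slow or unreachable primary never changes what a read returns.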
This approach is common, actually, and I've seen it used on Wall St.
So, pick your battle: either you create your own EC2 instances and take ownership of configuration and devops (yuck), implement a two-phase commit that guarantees consistency, or relax consistency requirements and use SQS and asynchronous writers.
This is now directly supported by RDS.
Example of creating a cross region replica using the CLI:
aws rds create-db-instance-read-replica \
--db-instance-identifier DBInstanceIdentifier \
--region us-west-2 \
--source-db-instance-identifier arn:aws:rds:us-east-1:123456789012:db:my-postgres-instance

Replicate data from one RDS server to another

Can we replicate data from one RDS server to another? Or can we set up a master-slave relationship between two RDS servers?
Should we replicate data from a non-RDS instance to an RDS instance?
RDS can replicate from an external MySQL server and can also be the master of an external slave. Whether you "should" do it depends on your use case.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.Procedural.Importing.External.Repl.html
While I guess you could set up replication between two RDS instances yourself, I don't see why you would, since starting an RDS read replica takes just a few clicks in the AWS console or an API call.
It is possible to replicate data from RDS to RDS. It is also possible to replicate data from RDS to some other MySQL server.
Steps:
Create your own EC2 server and install MySQL.
Change the configuration to replicate data.
That will require additional work to manage the EC2 instance if your data grows beyond the server's limits.
You would then have to redo all the manual replication setup, as we can't increase the storage on the EC2 server.
RDS provides an easy mechanism to create a read replica with a few clicks. (Note: a replica is a fairly costly option.)
But going that way saves the manual work, and the salary, of a person who would otherwise manage the database and repeat these setups regularly.
If you are using a PostgreSQL database on RDS, then you can use Bucardo for asynchronous replication. You need to create an EC2 instance to run it, or you can use a local system, but that will not be fast enough.
Use the following tutorial if you want to use bucardo.
https://www.installvirtual.com/how-to-install-bucardo-for-postgres-replication/
I think you can use a snapshot to clone another RDS database.