Replicate postgres database to Redshift - Ongoing replication - postgresql

I have several Postgres databases that need to be replicated as-is to a single AWS Redshift cluster.
We have currently set up AWS DMS tasks for this. However, we keep encountering issues such as the source database filling up, problems with large columns, and, most importantly, the DMS issue where new columns with defaults added on the Postgres databases do not replicate during ongoing replication.
So, are there any other ways that we can set up this ongoing replication?

Related

Postgres pg_dumpall consistency across databases that are backed up

Is there a way to back up all of the databases on a hyperscaler-managed Postgres server as of a certain point in time, in order to maintain data consistency between the databases, with either pg_dumpall, pg_dump, or something else?
Background:
With the utilization of micro-services, an application may have many databases associated with it on a single hyperscaler-managed Postgres server. The hyperscalers do perform a functional snapshot backup; however, when a hyperscaler-managed Postgres server is accidentally deleted, the Postgres backups are lost as well. These hyperscalers provide locks to prevent accidental deletion of a Postgres server and mention that their support teams can be contacted to restore a deleted server; nevertheless, we still had a Postgres server get deleted. We were able to recover by contacting the hyperscaler's support team, but we would like to have a second way of backing up a hyperscaler-managed Postgres server.
I realize that the micro-services should be able to auto-recover to a data-consistent point, but the reality is that many of the micro-services have not been designed or written to that requirement. I really do not want to get into the aspect of micro-service design and want to keep this a DBA backup question.
pg_dumpall will not take a consistent backup across all databases. Each database backup will be consistent, but the snapshots for the backups of the different databases will be taken at different times.
If you need a consistent backup across several databases in a single cluster, use an online file system backup with pg_basebackup.
You can use pg_basebackup, which most PostgreSQL DBAs end up using on a daily basis, whether via scripts or manually. It creates a base backup of the cluster, which can help with recovery in multiple situations. Because it takes an online backup of the database, it is very useful in production.
You can review the documentation for more details.
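A minimal sketch of such a backup, assuming shell access and a user with the REPLICATION privilege (the host, user, and target directory below are placeholders, and managed services may not expose a replication connection at all):

    # Take an online, cluster-wide base backup. -X stream also copies the
    # WAL needed to make the backup consistent; -P prints progress.
    pg_basebackup -h db.example.com -U replication_user \
        -D /backups/base_$(date +%Y%m%d) -X stream -P

Because pg_basebackup copies the whole cluster in one pass, every database in it is restored to the same point in time, which is exactly the cross-database consistency pg_dumpall cannot provide.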

RDS postgres - replicate a partitioned table to another RDS postgres

I have a partitioned table in RDS1 (version 10) that I want to replicate to another RDS Postgres (version 10).
I'd like to know if it's possible. If so, what method should I apply?
The data should be replicated in real time, with very low latency. Partitions in the master are added and removed on a daily basis.
The motivation for the replication is to reduce load on the master DB.
Thanks!

How safe is it to use multiple Postgres schemas?

Heroku warns against using multiple schemas with Postgres, but does not specify the numerous operational problems this causes.
As posted on Heroku docs:
The most common use case for using multiple schemas in a database is building a software-as-a-service application wherein each customer has their own schema. While this technique seems compelling, we strongly recommend against it as it has caused numerous cases of operational problems. For instance, even a moderate number of schemas (> 50) can severely impact the performance of Heroku's database snapshots tool, PG Backups.
I think the problem with backups can be solved by adding a follower DB.
I have 60 tables per schema, so with 1,000 schemas I will have 60,000 tables. How will this impact database performance? What kinds of problems can I expect while scaling?
The first issues with running a large volume of schemas and/or tables don't typically interfere with a running database. The major problem operators encounter is that they become unable to create logical backups of the database. Running heroku pg:backups is likely to fail, as is running pg_dump manually. Typically, this is the error you'll see in the attempted backup logs:
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction
The large volume of locks needed ends up causing OOM conditions for the database. This isn't always a problem on Heroku: if you're using a production database, you can rely on their point-in-time recovery option as your DR solution. That being said, exporting the data off Heroku will be difficult if you can't run a logical backup, since they don't currently support external replication. It's less than ideal, but hypothetically you could try to dump the database schema by schema in order to avoid the OOM conditions, as in the sketch below.
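A hedged sketch of that schema-by-schema dump, assuming direct psql access (the connection string and schema filter here are placeholders, not anything Heroku-specific):

    # Dump each schema separately so that no single pg_dump run has to
    # take locks on all 60,000 tables at once. $DATABASE_URL is assumed
    # to be a standard Postgres connection string.
    for schema in $(psql "$DATABASE_URL" -At -c \
        "SELECT nspname FROM pg_namespace
         WHERE nspname NOT LIKE 'pg_%' AND nspname <> 'information_schema'"); do
        pg_dump "$DATABASE_URL" --schema="$schema" \
            --format=custom --file="backup_${schema}.dump"
    done

Each per-schema dump is only consistent with itself, so this is an escape hatch for getting data out, not a substitute for a consistent whole-database backup.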

Replication of AWS RDS Postgresql into On-Premise Postgresql

I have a requirement to replicate data from an AWS RDS Postgres (9.6) database to an on-premise Postgres (9.5) database. I have found material about replication from on-premise to on-premise, but how can we implement it from AWS RDS to on-premise?
I do this using Bucardo.
Check out Bucardo: https://bucardo.org/Bucardo/
With Bucardo you can replicate an RDS Postgres instance to a slave Postgres located anywhere, configuring only the slave, so there is no need to configure anything on the RDS side.
You can also do this with zero downtime.
That said, I am not sure this will work across different versions of PostgreSQL; you should use the same version if possible. I tested it with 9.4.x and it is working.
UPDATE
I can confirm that this also works across different versions of Postgres; for example, I was able to replicate between these versions:
AWS RDS postgresql 9.4.x
On-premise postgresql 9.6.x
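As a rough sketch of what the Bucardo 5 setup looks like (the database names, hosts, and credentials below are placeholders; check the Bucardo docs for the exact flow):

    # Register the RDS source and the on-premise target with Bucardo,
    # which itself runs next to the target database.
    bucardo add db src dbname=mydb host=myinstance.rds.amazonaws.com user=bucardo pass=secret
    bucardo add db dst dbname=mydb host=localhost user=bucardo pass=secret

    # Group the tables to replicate and create a source->target sync.
    bucardo add all tables db=src relgroup=mygroup
    bucardo add sync mysync relgroup=mygroup dbs=src:source,dst:target

    # Start the replication daemon.
    bucardo start

Note that Bucardo is trigger-based, which is why it can work against RDS without any replication-level configuration on the AWS side.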
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html#PostgreSQL.Concepts.General.FeatureSupport.LogicalReplication
Beginning with PostgreSQL version 9.4, PostgreSQL supports the streaming of WAL changes using logical replication slots. Amazon RDS supports logical replication for a PostgreSQL DB instance version 9.4.9 and higher and 9.5.4 and higher. Using logical replication, you can set up logical replication slots on your instance and stream database changes through these slots to a client like pg_recvlogical. Logical slots are created at the database level and support replication connections to a single database.
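A minimal sketch of that slot-plus-client flow, assuming rds.logical_replication is enabled in the instance's parameter group (the host, user, database, and slot name are placeholders):

    # Create a logical replication slot using the built-in test_decoding
    # output plugin, then stream its changes to stdout.
    psql -h myinstance.rds.amazonaws.com -U master_user -d mydb -c \
        "SELECT pg_create_logical_replication_slot('my_slot', 'test_decoding');"
    pg_recvlogical -h myinstance.rds.amazonaws.com -U master_user -d mydb \
        --slot my_slot --start -f -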
Mind possible problems, e.g. https://dba.stackexchange.com/questions/173267/aws-rds-postgres-logical-replication

Migration from AWS Aurora to a local Postgres 9.6 database

I am considering using AWS Aurora; however, I am concerned about being locked into AWS indefinitely. So I am wondering how difficult it would be to transfer data from Aurora to my own Postgres database.
Thanks!
This is a very valid concern. Firstly, there is no seamless migration like there is from Postgres to Aurora. The following needs to be considered:
1. How to do it: you will have to take a dump of your Aurora DB and then import it into Postgres (a sketch follows below).
2. Because of 1 above, you cannot have concurrent CRUD operations running on your Aurora during the migration. Hence, you need to shut down all products connecting to your Aurora until you have migrated to Postgres, so there will be downtime.
3. Because of 2, and depending on the size of your DB, it might take a few minutes (a few GB of data) to many hours if you have a huge DB.
Hence, you need to consider how much data you have and how much downtime you can live with if you want to migrate back to Postgres.
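As a minimal sketch of that dump-and-import step (the endpoints, names, and credentials are placeholders, and the target database is assumed to already exist):

    # 1. Dump the Aurora (PostgreSQL-compatible) database. Custom format
    #    enables a parallel restore later.
    pg_dump -h mycluster.cluster-abc123.us-east-1.rds.amazonaws.com \
        -U master_user -d mydb --format=custom --file=mydb.dump

    # 2. Restore into the local Postgres 9.6 server with parallel jobs
    #    to shorten the downtime window.
    pg_restore -h localhost -U postgres -d mydb --jobs=4 mydb.dump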