Predict if a snapshot is corrupted - postgresql

I am working with Aurora postgres on AWS ans I have snapshots for the DB cluster.
I want to predict in case of a disaster, if the snapshot will work as expected to restore the cluster or not, is there any way I can regularly check the integrity of the snapshot before the disaster happens?

Related

Backup and Restore AWS RDS Aurora cluster

I would like to backup every single PostgreSql database of my AWS RDS Cluster (Aurora DB Engine). Are there some managed tools (like Veeam or N2WS) or best practices, how to backup and restore a single database or schema from AWS S3?
Many thanks
You can use automatic backup combined with manual backup for Aurora PostgreSql database. For automatic backup, the max retention period is 35 days, and support any point in time restore and recovery. However, if you need a backup beyond the backup retention period (35 days), you can also take a snapshot of the data in your cluster volume.
If you use third-party tools, such as Veeam, it will also invoke AWS RDS snapshot API to take the backup, so the underly mechanism is the same.
You can also use the pg_dump utility for backing up the RDS for PostgreSQL database, and run pg_dump on read replica to minimize the performance impact to the primary database.

Postgres incremental backups per db (not per cluster)

According to the WAL Point-in-time recovery and incremental backup documentation, pg_basebackup and wals are used to take backups and incremental backups of a running PostgreSQL cluster.
Is there a way to take incremental backups per database instead of the whole cluster ?
Thanks !

Migration from AWS Aurora to a local Postgres 9.6 database

I am considering using AWS Aurora, however I am concerned for being locked into AWS indefinitely. So I am wondering how difficult it would be to transfer data from Aurora to my own Postgres database.
Thanks!
This is a very valid concern. Firstly, there is no seamless migration like there is from Postgres to Aurora. Following, needs to be considered:
How to do it: You will have to take a dump of your aurora db and then import it into postgres.
Because of 1 above; you cannot have concurrent CURD operations running on your aurora during migration. Hence, you need to shut down all products connecting to your aurora till you migrate to Postgres. Hence, there will be downtime.
Because of 2 ; Depending on size of your DB; it might take few mins ( few GB of data ) to many hours if you have huge DB.
Hence, you need to consider how much data you have and how much downtime you can live with if you want to migrate back to Postgres.

Migrate database from Heroku to AWS

I want to migrate our postgres db from heroku to our own postgres on AWS.
I have tried using pg_dump and pg_restore to do the migration and it works; but it takes a really long time to do this. Our database size is around 20GB.
What's the best way to do the migration with minimal downtime?
If you mean AWS RDS PostgreSQL:
pg_dump and pg_restore
I know you don't like it, but you don't really have other options. With a lot of hoop jumping you might be able to do it with Londiste or Slony-I on a nearby EC2 instance, but it'd be ... interesting. That's not the most friendly way to do an upgrade, to say the least.
What you should be able to do is ship WAL into RDS PostgreSQL, and/or stream replication logs. However Amazon don't support this.
Hopefully Amazon will adopt some part of 9.4's logical replication and logical changeset extraction features, or better yet the BDR project - but I wouldn't hold my breath.
If you mean AWS EC2
If you're running your own EC2 instance with Pg, use replication then promote the standby into the new master.

MongoDB Backups - Is it safe to snapshot only the dbpath volume?

Assumption: Single MongoDB instance.
I have tested a backup and restore using an EBS snapshot of only the volume storing my data (dbpath) and NOT the /logs or /journal volumes. The restore seems to work fine and the data is available.
Are there any risks or downsides to doing this? In other words, do I lose anything if I don't have a backup snapshot of the /logs and /journal volumes?
Backing up if journal and dbpath are on separate EBS volumes
If your /journal directory is on a different EBS volume from your dbpath, the only way to get a consistent backup would be to use db.fsyncLock() to ensure there are no pending write operations. The fsyncLock() command has the side effect of blocking all writes to your database, so typically you would only want to use this approach if you are backing up from a secondary in a replica set (rather than a sole mongod, as per your assumption in the question description).
Backing up if journal and dbpath are on the same EBS volumes
If the journal and dbpath are on the same EBS volume you can get a consistent backup using EBS snapshots.
Do you need to backup the log directory?
Strictly speaking, you do not need to backup the logs. For troubleshooting purposes it can be useful to rotate the logs and keep a few days of recent log files.
I have tested a backup and restore using an EBS snapshot of only the volume storing my data (dbpath) and NOT the /logs or /journal volumes. The restore seems to work fine and the data is available.
This approach will be fine, until it isn't -- that fateful day when you want to need to restore from backup and realise that your last n backups are unusable as you try them one at a time, or perhaps encounter unexpected errors days after you assumed a restored database was OK. If you don't backup the journal file this is effectively the same as running without journaling, and the recommended recovery procedures involve running a repair before restarting. The risk isn't so much about changes that haven't been flushed from the journal, but rather the unlucky timing if the power goes out in the middle of a write to the data files leaving things in an inconsistent state with no recovery information (aka: the journal).
If you're going to take backups, definitely follow the correct procedure to remove unnecessary risk.
For more information see EC2 Backup and Restore in the MongoDB manual.