GCP Cloud SQL Postgres Transaction Log Backup / Recovery - google-cloud-sql

I would like to recover (restore) a GCP Cloud SQL Postgres instance up to the last 15 minutes for DR purposes (RPO is 15 minutes). That means the database must be backed up (typically the transaction log) every 15 minutes. Is this possible in Cloud SQL Postgres, and if so, what's the process?
In addition, I am concerned about someone or an application bug deleting data. This just happened to us on another system. Ideally, it would be very beneficial to restore the DB backup and the 15-minute incremental transaction log backups to another DB and pick and choose the data that needs to be recovered. Is this possible?

the database must be backed up (typically the transaction log) every 15 minutes.
You cannot back up the transaction log; you can only back up instance data. To back up your instance every 15 minutes, the automatic backup is not enough, so you must use on-demand backups and trigger one every 15 minutes.
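As a rough illustration (the instance name and schedule are placeholders, not from the question), an on-demand backup can be triggered on a schedule with the gcloud CLI, for example from cron:

    # Crontab entry: take an on-demand Cloud SQL backup every 15 minutes.
    # "my-postgres" is a placeholder instance name; gcloud must be authenticated.
    */15 * * * * gcloud sql backups create --instance=my-postgres --async

Keep in mind that on-demand backups are full instance backups and are not deleted automatically, so you would also need your own cleanup process.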
restore the DB backup and the 15-minute incremental transaction log backups to another DB
Yes, you can restore an instance backup to another instance.
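A sketch with placeholder instance names (the target instance must already exist and be compatible with the backup):

    # List backups on the source instance, note the BACKUP_ID, then restore it
    # into a different instance.
    gcloud sql backups list --instance=source-instance
    gcloud sql backups restore BACKUP_ID \
        --restore-instance=target-instance \
        --backup-instance=source-instance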
pick and choose the data that needs to be recovered.
Not in an easy way. You could implement this by not creating an on-demand backup, but instead exporting a SQL dump file or a CSV file and then running your own process to extract the data you need.
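A sketch of that export-based approach (bucket, instance and database names are placeholders):

    # Export a full SQL dump, or a CSV of just the rows you care about, to a
    # Cloud Storage bucket, then import into a scratch instance to cherry-pick data.
    gcloud sql export sql source-instance gs://my-bucket/dump.sql --database=mydb
    gcloud sql export csv source-instance gs://my-bucket/orders.csv \
        --database=mydb --query="SELECT * FROM orders WHERE order_date >= current_date - 1"
    gcloud sql import sql scratch-instance gs://my-bucket/dump.sql --database=mydb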
Having said that, and given your reference to DR (disaster recovery), I would like to point out that Cloud SQL has automatic failover replicas through its High Availability configuration. Also, for replication you could use read replicas.
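For completeness, both can be set up with the CLI as well (a sketch with placeholder instance names):

    # Enable high availability (regional failover) on an existing instance,
    # and create a read replica of it.
    gcloud sql instances patch my-postgres --availability-type=REGIONAL
    gcloud sql instances create my-postgres-replica --master-instance-name=my-postgres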

Related

FiveTran connects with PostgreSQL database restored every day

I have set up a Fivetran connector to connect a PostgreSQL database on an EC2 server to Snowflake. The connection seems to work (no error), but the data is not really updated.
On the EC2 server, every day a script pulls down the latest dump of our app production database and restores it on the EC2 server, and then the Fivetran connector is expected to sync the database to Snowflake. But the data after the first setup date is not synced to Snowflake. Could FiveTran be used in such a setup? If so, do you know what may be causing the sync to fail?
Could FiveTran be used in such a setup?
Yes, but it's not ideal.
If so, do you know what may be causing the sync to fail?
It's hard to answer this question without more context; however, Fivetran uses the database log to replicate your DB (the WAL in the case of PostgreSQL), so if you restore the DB every single day, Fivetran will lose track of the changes and will need to re-sync the whole database.
The point made by NickW is completely valid: why not replicate directly from the production DB? I assume the answer is along the lines of the data you need to modify. You can use column blocking and/or hashing to prevent sensitive data from being transferred, or to obfuscate it before it's flushed to Snowflake.
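If you want to see what the connector has to work with on the source after a restore, a couple of quick checks on the EC2 PostgreSQL can help (host, user and database names are placeholders); log-based syncs of this kind generally depend on wal_level and on a replication slot being present:

    # Check the WAL level and whether a logical replication slot is present/active.
    psql -h my-ec2-host -U postgres -d mydb -c "SHOW wal_level;"
    psql -h my-ec2-host -U postgres -d mydb \
        -c "SELECT slot_name, plugin, active FROM pg_replication_slots;"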

Cloud SQL - Growing each day, but not replicating

I've had a replica slave set up for about two weeks now. It has been failing replication due to configuration issues, but it is still growing each day by about the same amount as the master (about 5 GB a day).
Until today, binary logs were disabled. And if I go to Monitoring -> slave instance, under Backup Configuration, it says "false".
How do I determine why this is growing each day?
I noticed in the InnoDB Pages Read/Write section of Monitoring that there are upticks of writes each day, but no reads. But what is it writing to? The DB hasn't changed, and there are no binary logs.
I noticed in the docs, it says "Point-in-time recovery is enabled by default when you create a new Cloud SQL instance."
But there has never been a "Backup" listed in the Operations list on the instance, and when I run gcloud sql instances describe my-instance, it's not listed under backupConfiguration.
The issue you are having is most likely caused by point-in-time recovery, which constantly increases your storage usage. In the instance's backup settings you can keep automated backups enabled while disabling point-in-time recovery. Once you disable it, the binary logs will be deleted and you will notice an immediate reduction in storage usage.
Here are the steps to disable Point-in-time recovery:
Select your instance
Select Backups
Under Settings, select Edit
Uncheck the box for point-in-time recovery
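The same change can be made with the gcloud CLI (the instance name is a placeholder; which flag applies depends on the database engine):

    # MySQL instances: point-in-time recovery is backed by binary logging.
    gcloud sql instances patch my-instance --no-enable-bin-log
    # PostgreSQL instances: disable point-in-time recovery directly.
    gcloud sql instances patch my-instance --no-enable-point-in-time-recovery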
To explain point-in-time recovery further, the Google Cloud SQL documentation (for both PostgreSQL and MySQL) describes it as follows:
It is necessary to archive the WAL files for instances it is enabled on. This archiving is done automatically on the backend and will consume storage space (even if the instance is idle); consequently, using this feature causes increased storage usage on your DB instance.

PostgreSQL / WAL-archiving: can I leave archive_command empty when doing image snapshot backups?

I have a PostgreSQL 9.5 instance running on an Azure VM. As described here, I must specify a pre- and a post-script to tell Azure: "Yes, I've taken care of putting the VM in a state where the entire VM/blob can be backed up as a snapshot that can be restored as a working new VM" and "Now I'm done", so that Azure will flag the backup as application consistent.
In terms of PostgreSQL, I have read the docs on continuous archiving, which explain why and how to enable WAL archiving to allow for backups. And here comes my question:
If I set archive_mode = on and wal_level = archive, can I leave the archive_command empty, and does this even make sense? Or - should I do some kind of archiving here (like e.g. copying the log segments to another location / disk), and is this archiving necessary to ensure a working database upon restoring the VM in my scenario?
I only need to tell PostgreSQL: "Wait a minute / hold your data writes (or whatever goes on) while I create a snapshot of the entire VM". The plan is to execute pg_start_backup() before, take the snapshot, and then run pg_stop_backup().
I do realize this method (if it's even valid) is essentially a file-system-level backup, and according to the docs the postgres service must be shut down for a file-system backup to be valid. Elsewhere I've read that calling pg_start_backup() should be enough to guarantee a valid stand-alone physical backup.
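Concretely, the pre/post scripts I have in mind would be something like this (a sketch; assumes passwordless local access for the postgres user):

    # prescript.sh - run by Azure before the snapshot is taken
    psql -U postgres -c "SELECT pg_start_backup('azure_vm_snapshot', true);"

    # postscript.sh - run by Azure after the snapshot completes
    psql -U postgres -c "SELECT pg_stop_backup();"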
If the snapshots you plan to take are truly atomic, that is, the restored snapshot represents the state of the file system at some point in time, you can just restart the database from such a snapshot, and it will perform crash recovery and come up in a consistent state.
In that case, there is no need to care about WAL archiving or backup mode. You could set archive_mode = off and not worry about it.
If the snapshot is not truly atomic, or you want point-in-time-recovery (the ability to restore the database to a point in time between backups), you need WAL archiving set up and running, because you need the WALs to restore the database to a consistent state.
In that case archive_mode must be on and archive_command must be a command that returns success only if the WAL file has been archived successfully. If even one WAL file is missing between your last backup and the point in time to which you want to restore, the restore will not work.
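A minimal example of such a configuration in postgresql.conf, following the pattern in the PostgreSQL documentation (the archive directory is a placeholder and must already exist and be writable by the postgres user):

    wal_level = archive
    archive_mode = on
    # Returns non-zero if the file already exists or the copy fails.
    archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'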

RDS snapshot restore taking too long

As part of our blue-green deployment strategy we are snapshotting the prod RDS instance and then restoring this snapshot into a new instance, applying DB migrations after it and linking the new green application to it.
Our RDS instance has 100 GB of storage, but our DB uses only 10 MB at the moment.
Taking a snapshot takes roughly < 2 minutes
Restoring from the Snapshot takes 25 minutes!
25 minutes for the restore is too long, considering users are forced to stay in read-only mode for this entire period and that our DB size is less than 10 MB at the moment.
I am wondering if this restore time is the usual time for Amazon RDS or if we are doing something wrong.
Amazon RDS Postgres.
Multi AZ: Yes
Instance Class: Medium
General Purpose (SSD)
IOPS: disabled.
After some experimentation we were able to reduce the restore time from 25 minutes to 5 minutes. This was due to the fact that RDS first restores the snapshot (in our case this took 5 minutes) and afterwards applies the Multi-AZ change to the new instance (which was taking about 20 minutes).
Previously we were waiting for the DB to finish the Multi-AZ change and reach status="available" before continuing with our deployment, but after contacting AWS, they confirmed that it is safe to start using the new instance even while it is being modified to apply the Multi-AZ change. So we now continue our deployment process as soon as the restored instance's status changes from "creating" to "modifying".
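A rough sketch of that flow with the AWS CLI (identifiers are placeholders):

    # Restore the snapshot into the new (green) instance.
    aws rds restore-db-instance-from-db-snapshot \
        --db-instance-identifier green-db \
        --db-snapshot-identifier prod-snapshot \
        --multi-az

    # Poll the status; per AWS support, the instance can be used as soon as it
    # moves from "creating" to "modifying" (while the Multi-AZ change is applied).
    aws rds describe-db-instances --db-instance-identifier green-db \
        --query 'DBInstances[0].DBInstanceStatus' --output text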
As correctly said, this solution might not scale very well, but at the moment this is not a concern as we are not expecting this DB to grow significantly.
We consider this approach to be very safe, as any DB schema changes won't affect the live DB and we can safely test the whole green stack before switching to prod. The only caveat here is that the application needs to be in read-only mode, so as not to lose information between the blue and green environments.

MongoDB partial backups

We have a 5-node replica set on our development server. We are looking for a way to allow developers to back up a subset of data in a MongoDB database and restore it to their local development environments.
We have looked into the clonedb and the mongodump utils, but both only allow for a backup/dump of the complete database. Due to the possible size of the database, we need an option that allows us to limit the data being backed up or restored.
Does anyone know of a utility or another way to achieve this?
I just stumbled upon this question again and decided to add a description of our backup strategy we opt in for:
Our current backup strategy for this MongoDB server consists of two setups: a backup via a delayed passive secondary node, and a daily backup using mongodump (which takes journalling and the oplog into account).
Besides our normal production nodes, we have set up another secondary node with a priority of 0 (this can either be on its own server or piggyback off another mongo server using a separate port), hidden set to true, and a delay of 7200 seconds (2 hours). This slave is there for "butter fingers": when someone accidentally drops a database or clears a collection, we have 2 hours before those changes replicate to this passive secondary. The passive secondary can NOT be used for reading or writing; its role is simply to be a backup node. We also use this node for the nightly backup to prevent unnecessary overhead on any of the other nodes.
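A sketch of how such a member can be configured from the mongo shell (the member index, host and port are placeholders; on recent MongoDB versions the field is secondaryDelaySecs instead of slaveDelay):

    # Run against the primary: make member 4 hidden, non-electable and delayed by 2 hours.
    mongo --eval '
      cfg = rs.conf();
      cfg.members[4].priority = 0;
      cfg.members[4].hidden = true;
      cfg.members[4].slaveDelay = 7200;
      rs.reconfig(cfg);
    '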
The nightly backup is set to run every night at 23:00 via a cron job. The command simply executes a script set up in /opt/auto-mongo-backup. This script can be found at https://github.com/jaconel/automongobackup (originally found at https://github.com/micahwedemeyer/automongobackup). It allows a single nightly cron entry to cover weekly and monthly backups as well. Backups are saved to /var/backups/mongodb.
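For illustration, the nightly job is just a crontab entry along these lines (the exact script filename under /opt/auto-mongo-backup is an assumption):

    # Run the backup script every night at 23:00.
    0 23 * * * /opt/auto-mongo-backup/automongobackup.sh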
Hope this helps someone out.