Cloud SQL (Postgres) Backup and Restore - google-cloud-sql

I understand that Cloud SQL ( Postgres) on-demand backup is incremental. when you restore an instance using this backup, the existing data is wiped off before the instance is restored with all new data. in other words, the "backup" process is incremental but there is no way to restore only a specific incremental backup into an instance
Please, can you confirm if the aforementioned understanding is right?

Indeed, Cloud SQL backups are incremental. Taken from the Documentation:
Backups for Second Generation instances are incremental; they contain
only data that has changed since the previous backup was taken. This
means that your oldest backup is a similar size to your database, but
the sizes of subsequent backups depend on the rate of change of your
data. When the oldest backup is deleted, the size of the next oldest
backup increases so that a full backup still exists.
Yet, Cloud SQL stores up to seven automated backups for each instance. This in fact allows you to restore to any specific backup, but of course, you will be delete all the data on the instance in order to restore the one in the backup.
If you are asking if it will be possible to restore only the incremental differences of an specific backup, then no, it is not possible. And it is also meant that way by the concept of Incremental Backups. You see, by definition, the incremental backup must have all the backups before. So, by restoring an "specific incremental backup into an instance" you are restoring the fullback + all the incremental backups up to the incremental backup you are requesting.

Related

Interbase Backup Validation

We have a custom backup solution utilizing the ibx controls in Delphi to perform nightly automatic backups. As part of our current validation for a successful backup, we read the output logs generated by the backup looking for the "closing file, committing and finishing" verbiage that's last in the log file. Additionally we perform a full restore to a separate area to ensure the ibk file is valid. That's turning out to be problematic in terms of available drive space so looking for other ideas to make sure the backup is successful.
How else might we ensure that our ibk file is valid?
Jeff,
Not sure what your database size or backup file size is, and if they are too big for the remaining disk space. Can you share the database and backup size details?
Older InterBase (2017 and earlier) had a way for the command line tool, gbak, to pipe the output from the backup to another gbak process restoring from the backup. This would allow you to save the disk space on the backup file. But, since you are using the IBX backup/restore service, this is not possible. Also, InterBase 2020 has a different backup format which requires random (not sequential) write access to the backup file, thereby not allowing any pipe output even via the 'gbak' command line tool.
Here are a couple of ways to "reduce" the disk storage requirements that may work for you.
** Backup file **
You can have the InterBase backup service (from your application) store the target backup file in an external storage medium (HDD, USB stick etc., or a SAN disk/network file share). The backup/restore service can read/write backup file(s) from network shares/external medium.
** Restored database **
When restoring the database you can use the service parameter option UseAllSpace (http://docwiki.embarcadero.com/Libraries/Sydney/en/IBX.IBServices.TRestoreOptions), equivalent to gbak option "-use_all_space". This will save you about 20% space on restored data pages.
Turn off index creation, thereby reducing page consumption (possibly quite a bit depending on your index definitions). But, you will lose index validation because of this. "DeactivateIndexes" option (gbak option "-inactive") in the same page above.
Restore the database to a remote InterBase server with its own storage medium, or, to an attached USB stick or SAN disk. Since you are using the restored database only for validating the backup file, you can have this restored database on a slower I/O medium or a slower server over the network.

How to restore multiple databases on a single Google Cloud SQL for PostgreSQL instance to the same point in time?

I have 2 separate PostgreSQL databases in Google Cloud SQL for PostgreSQL. The databases should be separate due to security concerns, but they can reside in the same database instance. When restoring from a backup, it is important for me to have the databases restored to the same point in time, due to data relations.
I tried to understand how the backup works. I tried to research the backup documentation and restore one, but found no clue.
Can I be sure that restoring an instance would bring all its databases to the same point in time?
If you restore a Cloud SQL instance backup, it will restore all the databases in that instance to the point when backup was taken. One thing to note here is that if you restore the backup in the same instance it will overwrite the current data with the one in backup.
You can see the General tips about performing a restore to further help you understand the restoration.

DB2 Online Restore but Without Roll Forward?

I read a lot of documentations for db2 restore but I could not find how to perform online restore from the last database backup but without roll forwarding of logs?
I will appreciate command example.
On example my last online backup is made 1st february. I want to do ONLINE RESTORE of that backup but without logs after 1st February (similar with offline restore option WITHOUT ROLL FORWARD).
I am using db2 9.7
Thank you in advance
The database backup contains a snapshot of the tablespaces, and they may not be in stable state. Roll-forward is always required (unless you want to take insane risks by forcing DB2 to start using potentially corrupt data) to reach the nearest stable state.
If you are asking your question because you want manageable database backup dumps without having to worry about shipping logs etc, use the INCLUDE LOGS option when taking the backup. It will include in the backup file the minimum set of transaction logs that would be required for reaching stable state. When restoring you could then use the LOGS to extract them and then ROLLFORWARD DATABASE for the required typical 0-x seconds (depending on your database transactions).
A lazy dba would probably just use the RECOVER DB SAMPLE TO 2013-02-01-00.00.00 and allow the DB2 to worry about all the details. It will automatically fetch the required database backup and transaction files (even from the backup tapes etc if you set them up correctly), and do everything for you - as long as you don't attempt to manually manage them.

Backing up the DB vs. backing up the VM

We're serving a Django/Postgres site running on a VM hypervisor. We're now trying to figure out our back up strategy and have two probable options:
Back up the DB directly using pg_dump
Back up the VM directly by copying the VM image
I'm with the latter as I think, I could simply back up everything that has to do with the site. I'm not sure whether I have to shut down the VM for this though.
What is a better and more recommended way of backing up a DB? Are there any reasons for not using the VM backup?
Thanks
The question basically boils down to, can you consider a hot copy of PostgreSQL's data files a backup?
The answer is: not really. PostgreSQL tries very hard through the use of WAL to ensure that its files are in a consistent state all the time and that it can survive a power failure, but starting it up from a copy of these files puts PostgreSQL into recovery mode. If the backup happened at the wrong second and PostgreSQL can't recover from the state of these files, your backup is useless. You don't want your backup/restore mechanism to depend on the recovery mechanism (unless you're dealing with "crash only" software, which PostgreSQL is not).
The probability of PostgreSQL not being able to recover from these files is not high, but it's not zero either. The probability of PostgreSQL not being able to load an SQL dump that it made, on the other hand, is zero. I prefer backup choices with lower probabilities of failure. pg_dump was designed for doing backups.
PostgreSQL recommends using pg_dump for backups, as a file system (or VM) backup requires the database to be shut down (and has other drawbacks):
http://www.postgresql.org/docs/8.1/static/backup-file.html
Edit: Also, a pg_dump backup will be significantly smaller than a filesystem dump of the same database.
There is an additional option. With PostgreSQL you can make an online backup that allows you to snapshot the file system and maintain consistency. You can see details here:
http://www.postgresql.org/docs/9.0/static/continuous-archiving.html
We use this exact method for making backups when we run PostgreSQL in a VM.

Best method for PostgreSQL incremental backup

I am currently using pg_dump piped to gzip piped to split. But the problem with this is that all output files are always changed. So checksum-based backup always copies all data.
Are there any other good ways to perform an incremental backup of a PostgreSQL database, where a full database can be restored from the backup data?
For instance, if pg_dump could make everything absolutely ordered, so all changes are applied only at the end of the dump, or similar.
Update: Check out Barman for an easier way to set up WAL archiving for backup.
You can use PostgreSQL's continuous WAL archiving method. First you need to set wal_level=archive, then do a full filesystem-level backup (between issuing pg_start_backup() and pg_stop_backup() commands) and then just copy over newer WAL files by configuring the archive_command option.
Advantages:
Incremental, the WAL archives include everything necessary to restore the current state of the database
Almost no overhead, copying WAL files is cheap
You can restore the database at any point in time (this feature is called PITR, or point-in-time recovery)
Disadvantages:
More complicated to set up than pg_dump
The full backup will be much larger than a pg_dump because all internal table structures and indexes are included
Does not work well for write-heavy databases, since recovery will take a long time.
There are some tools such as pitrtools and omnipitr that can simplify setting up and restoring these configurations. But I haven't used them myself.
Also check out http://www.pgbackrest.org
pgBackrest is another backup tool for PostgreSQL which you should be evaluating as it supports:
parallel backup (tested to scale almost linearly up to 32 cores but can probably go much farther..)
compressed-at-rest backups
incremental and differential (compressed!) backups
streaming compression (data is compressed only once at the source and then transferred across the network and stored)
parallel, delta restore (ability to update an older copy to the latest)
Fully supports tablespaces
Backup rotation and archive expiration
Ability to resume backups which failed for some reason
etc, etc..
Another method is to backup to plain text and use rdiff to create incremental diffs.