As part of our blue-green deployment strategy, we snapshot the production RDS instance, restore the snapshot into a new instance, apply the DB migrations to it, and then point the new green application at it.
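A rough sketch of this flow with boto3 (the instance and snapshot names are placeholders):

```python
import boto3

rds = boto3.client("rds")

# Snapshot the production instance (fast, under 2 minutes in our case).
rds.create_db_snapshot(
    DBInstanceIdentifier="prod-db",         # placeholder name
    DBSnapshotIdentifier="bluegreen-snap",  # placeholder name
)
rds.get_waiter("db_snapshot_available").wait(
    DBSnapshotIdentifier="bluegreen-snap",
)

# Restore the snapshot into a fresh instance for the green stack.
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="green-db",        # placeholder name
    DBSnapshotIdentifier="bluegreen-snap",
    MultiAZ=True,
)
```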
Our RDS instance has 100 GB of allocated storage, but our DB currently uses only about 10 MB.
Taking a snapshot takes less than 2 minutes.
Restoring from the snapshot takes 25 minutes!
25 minutes for the restore is too long, considering that users are forced to stay in read-only mode for the whole period and that our DB is currently smaller than 10 MB.
I am wondering whether this restore time is typical for Amazon RDS or whether we are doing something wrong.
Amazon RDS PostgreSQL
Multi-AZ: yes
Instance class: medium
Storage: General Purpose (SSD)
Provisioned IOPS: disabled
After some experimentation we were able to reduce the restore time from 25 minutes to 5 minutes. It turns out that RDS first restores the snapshot (in our case this took about 5 minutes) and only afterwards applies the Multi-AZ change to the new instance (this was taking around 20 minutes).
Previously we waited for the DB to finish the Multi-AZ change and reach status "available" before continuing with our deployment, but after contacting AWS they confirmed that it is safe to start using the new instance even while it is still being modified to apply the Multi-AZ change. So we now continue our deployment process as soon as the restored instance's status changes from "creating" to "modifying".
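A minimal sketch of that wait with boto3 (the instance identifier is a placeholder):

```python
import time
import boto3

rds = boto3.client("rds")

def wait_until_usable(instance_id):
    """Poll until the restored instance leaves 'creating'.

    Per AWS support, the instance is already safe to use while the
    Multi-AZ change is still being applied (status 'modifying').
    """
    while True:
        status = rds.describe_db_instances(
            DBInstanceIdentifier=instance_id
        )["DBInstances"][0]["DBInstanceStatus"]
        if status != "creating":
            return status
        time.sleep(15)

wait_until_usable("green-db")  # placeholder identifier
```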
This solution, as correctly pointed out, might not scale very well, but at the moment that is not a concern, as we do not expect this DB to grow significantly.
We consider this approach very safe, as any DB schema changes won't affect the live DB and we can safely test the whole green stack before switching it to production. The only caveat is that the application needs to be in read-only mode, so as not to lose information between the blue and green environments.
I want to create a free tier clone of a production AWS RDS PostgreSQL instance. As far as I understand, these are the different ways:
1. create a snapshot of the production DB and restore it on a t2.micro
2. create a read replica of the production DB using a t2.micro and then detach it as an independent database
3. create a free tier database and restore a database dump of the production DB
Option 3 is my last preference.
The problem is that while creating a read replica or restoring from a snapshot, AWS doesn't explicitly let you choose the free tier template. I just want to know whether restoring to a t2.micro without any advanced features like autoscaling, performance monitoring, etc. is equivalent to the free tier or not. I read here that the key thing with an AWS production DB is that AWS provisions a secondary database to fall back on in the event of a failure of the primary database or of the Availability Zone in which the database is running.
AWS Free Tier doesn't actually care about the kind of service you use. Per their website, you simply get 750 instance hours per month for a db.t2.micro.
You can use these in any service you see fit and the discount will be applied automatically for the first 12 months.
Looking at the pricing page for RDS PostgreSQL, I can see that these instances are no longer listed, which seems odd. The t2 instance family is fairly old, so AWS is probably trying to phase it out, but you can typically still provision older instance types through the API directly even when they are not available in the console.
So what you want to do is create your db.t2.micro instance using one of the SDKs or the AWS CLI and restore from a snapshot. Alternatively, you can create a read replica from the CLI and set its instance class to db.t2.micro; detaching it from the main cluster later should work.
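For example, a sketch with boto3 (all identifiers are placeholders; I haven't verified that every region still accepts db.t2.micro):

```python
import boto3

rds = boto3.client("rds")

# Option 1: restore a snapshot straight onto a db.t2.micro.
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="prod-clone",     # placeholder
    DBSnapshotIdentifier="prod-snapshot",  # placeholder
    DBInstanceClass="db.t2.micro",
    MultiAZ=False,                         # single AZ, free tier style
)

# Option 2: create a read replica on a db.t2.micro, then promote it
# to a standalone instance once it has caught up.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="prod-replica",   # placeholder
    SourceDBInstanceIdentifier="prod-db",  # placeholder
    DBInstanceClass="db.t2.micro",
)
rds.promote_read_replica(DBInstanceIdentifier="prod-replica")
```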
The production-ready stuff refers to the Multi-AZ deployment, which is good for production use, but a t2.micro seems like a bad choice for anything production-related, so I'm going to assume you're not planning to do that.
I would like to know how much downtime I should expect when rebooting a Multi-AZ RDS instance without failover. I want to apply some static parameter changes. The instance is a t2.medium with 100 GiB of storage, running PostgreSQL 9.6. I am not looking for an exact number, just an estimate.
I recently rebooted a db.t3.micro RDS PostgreSQL 10.6 instance without failover, and it took about one minute for the status to become available again (DB size 18 GB).
In my experience that would take about 5 to 10 minutes, but I'm not sure whether the DB would be down and inaccessible the whole time. Please don't make any critical business decisions based on my estimate, though. If this is a critical issue, you should create a copy of the database from a snapshot and test the parameter changes on the copy.
If you are concerned about limiting the downtime, just add a replica, make the changes and let it fail over, then once the changes are done remove the replica.
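If you trigger the reboot via boto3, the failover behaviour is explicit (the instance name is a placeholder):

```python
import boto3

rds = boto3.client("rds")

# Reboot without failing over to the standby; pending static
# parameter changes are applied during the reboot.
rds.reboot_db_instance(
    DBInstanceIdentifier="my-db",  # placeholder
    ForceFailover=False,
)
```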
I would like to recover (restore) a GCP Cloud SQL for PostgreSQL instance to within the last 15 minutes for DR purposes (the RPO is 15 minutes). That means the database must be backed up (typically the transaction log) every 15 minutes. Is this possible in Cloud SQL for PostgreSQL, and if so, what is the process?
In addition, I am concerned about someone or an application bug deleting data; this just happened to us on another system. Ideally, it would be very beneficial to restore the DB backup plus the 15-minute incremental transaction log backups to another DB and pick and choose the data that needs to be recovered. Is this possible?
"the database must be backed up (typically the transaction log) every 15 minutes."
You cannot back up the transaction log; you can only back up instance data. To back up your instance every 15 minutes, the automatic backups would not be enough, so you must use on-demand backups and trigger one every 15 minutes.
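A rough sketch of triggering such an on-demand backup from Python using the Cloud SQL Admin API via google-api-python-client (project and instance names are placeholders); you could run this every 15 minutes from cron or Cloud Scheduler:

```python
from googleapiclient import discovery

# Uses application default credentials.
service = discovery.build("sqladmin", "v1beta4")

# Trigger one on-demand backup; schedule this every 15 minutes
# to approximate the 15-minute RPO.
service.backupRuns().insert(
    project="my-project",    # placeholder
    instance="my-instance",  # placeholder
    body={},
).execute()
```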
"restore the DB backup and the 15 minute incremental transaction log backups to another DB"
Yes, you can restore an instance backup to another instance.
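A sketch with the same API (all IDs are placeholders; the target instance must already exist):

```python
from googleapiclient import discovery

service = discovery.build("sqladmin", "v1beta4")

# Restore a backup taken on one instance into a different instance.
service.instances().restoreBackup(
    project="my-project",        # placeholder
    instance="target-instance",  # placeholder, must already exist
    body={
        "restoreBackupContext": {
            "backupRunId": 1234567890,        # placeholder backup run ID
            "instanceId": "source-instance",  # placeholder
        }
    },
).execute()
```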
"pick and choose the data that needs to be recovered."
Not in an easy way. You could implement something like this by not creating an on-demand backup but instead exporting a SQL dump file or a CSV file and then running your own process to extract the data you need.
Having said that, and given your reference to DR (disaster recovery), I would like to point out that Cloud SQL has automatic failover replicas, called High Availability. For replication, you can also use read replicas.
I am using RDS to run a PostgreSQL server (9.6.3), and this morning a backup was automatically kicked off. It is still running 6 hours later, which seems absurd. The database is not that big (~600 GB), and as far as I can tell this is the first time I've experienced this problem. The machine is relatively beefy (db.m4.2xlarge), so these backups should take a lot less than 6 hours.
I am also surprised that a backup would be kicked off at 5:30 AM, which seems awfully close to standard business hours.
Any ideas?
You scheduled the 5:30 AM backup window; Amazon didn't randomly kick it off at that time. Look at your RDS instance's settings and you will see the backup window that was defined when you created the instance.
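You can check (and move) the window with boto3, for example (the identifier and window are placeholders):

```python
import boto3

rds = boto3.client("rds")

# Inspect the currently configured backup window (times are UTC).
inst = rds.describe_db_instances(
    DBInstanceIdentifier="my-db"  # placeholder
)["DBInstances"][0]
print(inst["PreferredBackupWindow"])

# Move the window away from business hours, e.g. to 08:00-08:30 UTC.
rds.modify_db_instance(
    DBInstanceIdentifier="my-db",
    PreferredBackupWindow="08:00-08:30",
)
```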
An RDS backup is like an EBS snapshot, so it shouldn't depend on the server's CPU at all, and it should not affect server performance either.
You should look into migrating to Amazon Aurora now that its PostgreSQL compatibility is out of beta. Among other benefits, you will get extremely fast snapshot creation with Aurora.
Sometimes things like this get "stuck" due to an issue behind the scenes. If that happens, all you can do is open a support ticket with AWS to get it fixed.
We have a 5-node replica set on our development server. We are looking for a way to allow developers to back up a subset of the data in a MongoDB database and restore it to their local development environments.
We have looked into the clonedb and mongodump utilities, but both only allow a backup/dump of the complete database. Given the potential size of the database, we need an option that lets us limit the data being backed up or restored.
Does anyone know of a utility or a way to achieve this?
I just stumbled upon this question again and decided to add a description of the backup strategy we opted for:
Our current backup strategy for this MongoDB server consists of two setups: backup via a delayed, passive secondary node, and a daily backup using mongodump (which takes journalling and the oplog into play).
Besides our normal production nodes, we have set up another secondary node with a priority of 0 (it can either run on its own server or piggyback on another mongo server using a separate port), hidden set to true, and a delay of 7200 seconds (2 hours). This node is there for "butter fingers": when someone accidentally drops a database or clears a collection, we have 2 hours before those changes replicate to this passive secondary. The passive secondary can NOT be used for reading or writing; its role is simply to be a backup node. We also use this node for the nightly backup, to avoid unnecessary overhead on any of the other nodes.
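Adding such a member can be scripted, for example with pymongo (hostnames are placeholders; note that on MongoDB versions before 5.0 the delay field is called slaveDelay rather than secondaryDelaySecs):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://primary-host:27017")  # placeholder host

# Fetch the current replica set config and append the delayed,
# hidden backup member.
cfg = client.admin.command("replSetGetConfig")["config"]
cfg["version"] += 1
cfg["members"].append({
    "_id": max(m["_id"] for m in cfg["members"]) + 1,
    "host": "backup-host:27018",  # placeholder host/port
    "priority": 0,                # can never become primary
    "hidden": True,               # invisible to client reads
    "secondaryDelaySecs": 7200,   # 2 hours behind; "slaveDelay" pre-5.0
})
client.admin.command({"replSetReconfig": cfg})
```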
The nightly backup is set to run every night at 23:00 via a cron job. The command simply executes a script set up in /opt/auto-mongo-backup. The script can be found at https://github.com/jaconel/automongobackup (originally found at https://github.com/micahwedemeyer/automongobackup). It allows a single nightly cron entry to cover weekly and monthly backups as well. Backups are saved to /var/backups/mongodb.
Hope this helps someone out.