Sandbox version for AWS Redshift

I have been using Redshift for a few months and I like it, but now I need to add some tests around it and I am not sure what the most cost-effective way of doing that is. The only thing I can think of is using a single-node Redshift cluster as a sandbox, but that seems too costly even if I only run it during testing.

Databases in Redshift cannot 'see' each other and cross-database queries are not supported. Therefore we simply have 'development', 'test' and 'production' databases on the same cluster.
When we're ready to push to production we:
take a snapshot
drop production
rename test to production
This generally works fine for us because we find Redshift to be over-provisioned on storage, i.e., filling our nodes to their maximum storage capacity would not give acceptable performance anyway, so there is plenty of spare space for the extra databases. (A rough sketch of the snapshot/drop/rename steps follows the note below.)
NOTE: You cannot drop your "master" database defined when the cluster was created. If you are using that as your primary database you will have to unload your cluster and recreate it for this approach to be viable.
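For illustration, a minimal sketch of the snapshot/drop/rename flow using boto3 and psycopg2 might look like the following. The cluster, database, and snapshot names are placeholders; adapt the connection details to your own cluster.

# Sketch of the promote-test-to-production flow described above.
# Cluster, snapshot and database names are hypothetical.
import boto3
import psycopg2

redshift = boto3.client("redshift")

# 1. Take a manual snapshot before touching anything, and wait for it.
redshift.create_cluster_snapshot(
    SnapshotIdentifier="pre-promote-snapshot",   # placeholder
    ClusterIdentifier="my-cluster",              # placeholder
)
redshift.get_waiter("snapshot_available").wait(
    SnapshotIdentifier="pre-promote-snapshot"
)

# 2. Connect to a database other than the ones being dropped/renamed
#    (e.g. the development database), since you cannot drop or rename
#    the database you are currently connected to.
conn = psycopg2.connect(
    host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="development", user="admin", password="...",
)
conn.autocommit = True  # DROP DATABASE cannot run inside a transaction
with conn.cursor() as cur:
    cur.execute("DROP DATABASE production;")
    cur.execute("ALTER DATABASE test RENAME TO production;")
conn.close()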

I got this answer from the AWS Redshift forum: "There is no way of creating a sandbox version of Redshift. We'll add this to our backlog of feature requests."

Related

Migrate data from Citus to RDS

Since Citus is not going to be available as a managed service on AWS, I am trying to move the database to RDS (not the whole history, only the transactional portion, as an OLTP store). The migration away from Citus is not straightforward because the data does not reside on a single node. I want to check what options we might have to move data from Citus to RDS.
Amazon DMS: This option is good for the supported databases (PostgreSQL), but we do not know how it will behave with Citus given the distributed nature of the engine. Has anyone migrated the data to S3, to another DB, or something along these lines?
I saw this paper from AWS https://d1.awsstatic.com/whitepapers/aws-cloud-data-ingestion-patterns-practices.pdf?did=wp_card&trk=wp_card on how to ingest data from different sources, and DMS seems like a good option, but I do not know the internals of Citus well enough to tell whether we would get all the data and capture CDC correctly.
A custom migration: Via a support ticket we can access the S3 buckets that Citus uses for disaster recovery, where the WAL logs are available, and we could use something like WAL-G to take those logs and replay them into a Postgres instance. The issue here is that this is a very custom migration and the development time might be too high.
Is there any other option to move data from Citus to RDS or Aurora in AWS? What does a good path for this migration look like? All the documents describe moving data the other way around, from Aurora or RDS to Citus.
Sumedh from Citus Cloud here. Please go ahead and open a support ticket with us to further investigate solutions. We can evaluate if using DMS is a viable approach for your use-case.
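If DMS does turn out to be viable for your use case, a minimal boto3 sketch of the setup might look like the one below. It simply treats the Citus coordinator as a plain PostgreSQL source and an RDS instance as the target; whether DMS sees all the distributed data and captures CDC correctly is exactly the open question above, and every identifier, hostname and ARN here is a placeholder.

# Hypothetical DMS setup: Citus coordinator as a PostgreSQL source,
# RDS PostgreSQL as the target. All names/ARNs are placeholders.
import boto3

dms = boto3.client("dms")

source = dms.create_endpoint(
    EndpointIdentifier="citus-coordinator-src",
    EndpointType="source",
    EngineName="postgres",
    ServerName="citus-coordinator.example.com",
    Port=5432,
    DatabaseName="app",
    Username="migration_user",
    Password="...",
)

target = dms.create_endpoint(
    EndpointIdentifier="rds-postgres-tgt",
    EndpointType="target",
    EngineName="postgres",
    ServerName="mydb.xxxxxxxx.us-east-1.rds.amazonaws.com",
    Port=5432,
    DatabaseName="app",
    Username="migration_user",
    Password="...",
)

# Full load of all tables; switch to "full-load-and-cdc" only if CDC
# against the Citus coordinator proves to work for your schema.
dms.create_replication_task(
    ReplicationTaskIdentifier="citus-to-rds-full-load",
    SourceEndpointArn=source["Endpoint"]["EndpointArn"],
    TargetEndpointArn=target["Endpoint"]["EndpointArn"],
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:EXAMPLE",
    MigrationType="full-load",
    TableMappings=(
        '{"rules":[{"rule-type":"selection","rule-id":"1","rule-name":"1",'
        '"object-locator":{"schema-name":"%","table-name":"%"},'
        '"rule-action":"include"}]}'
    ),
)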

How to set up a replication instance in on-premises Postgres for a master database in AWS RDS Postgres?

I have a requirement to check whether an exact copy of a master database in AWS RDS can be created on premises or not.
I have already established the connectivity between on-prem and AWS, and have also checked data migration using pg_dump. But I cannot figure out how to create the replica without using DMS. For security reasons we are not supposed to use DMS, so is there any other way to implement this?
Any help will be much appreciated.
It appears that your goal is disaster recovery.
Amazon RDS offers a few options for this:
Amazon RDS Snapshots are a backup of the database, stored in a region. If your database is in an Availability Zone that fails, the snapshot can be restored as a new database in another AZ. All AZs are physically separate data centers, much like your own data center is physically separate from an AWS data center.
Snapshots can also be copied to other Regions, which would guarantee a separation distance between data centers.
Multi-AZ Amazon RDS databases keep a second copy of the data in another AZ and can switch over to the alternate AZ without losing any data. This is faster than restoring a snapshot, but costs twice as much since two separate database servers are deployed.
These options would be easier to manage than replicating your data to an on-premises system. A Multi-AZ deployment will automatically start the secondary instance, so your app can continue operating with only a short delay and no data loss. This is much better than you could offer if you failed over to an on-premises system.
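As a concrete illustration of the snapshot option, a minimal boto3 sketch is shown below: it takes a manual snapshot and copies it to a second region for extra separation. Instance names, snapshot identifiers, regions and the account ID are placeholders.

# Sketch: manual RDS snapshot plus a cross-region copy (placeholders throughout).
import boto3

rds = boto3.client("rds", region_name="us-east-1")
rds.create_db_snapshot(
    DBSnapshotIdentifier="prod-dr-snapshot",
    DBInstanceIdentifier="my-postgres-instance",
)
rds.get_waiter("db_snapshot_available").wait(
    DBSnapshotIdentifier="prod-dr-snapshot"
)

# The copy is issued from a client in the destination region.
rds_dr = boto3.client("rds", region_name="eu-west-1")
rds_dr.copy_db_snapshot(
    SourceDBSnapshotIdentifier=(
        "arn:aws:rds:us-east-1:123456789012:snapshot:prod-dr-snapshot"
    ),
    TargetDBSnapshotIdentifier="prod-dr-snapshot-eu",
    SourceRegion="us-east-1",
)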

Can you use AWS DMS to move Aurora DB from one account to another?

I am trying to migrate an Aurora cluster from one of our accounts to another. We do not actually have a lot of write requests and the database itself is quite small, but we have decided to minimize the downtime anyway.
I have looked into several options:
Use a snapshot: stop writes to the source DB, take a snapshot, then share and restore it in the other account. This would definitely introduce some downtime.
Use Aurora cloning: stop writes to the source DB, clone the cluster into the target account and switch over to the target DB. According to AWS, cloning is much faster than taking and restoring a snapshot, so the downtime should be shorter.
I am not sure whether I can use DMS to do this, as I did not find useful docs/tutorials about moving Aurora across accounts. Also, I am not sure whether DMS will sync any write requests to the target DB during migration.
If DMS cannot sync live, then I should probably use Bucardo for a live migration.
Looking at the docs, AWS Aurora with PostgreSQL compatibility is allowed as source & target endpoints. So, answering your question, yes it's possible.
Obviously, your source Aurora DB should be accessible from the target account. Check that the DB endpoint is public and that the traffic is not restricted by ACL rules or security group rules.
Also, if you want to enable ongoing replication, you need to grant the rds_replication (or rds_superuser) role to the source database user. Link to the docs.
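For reference, granting that role could look like the small psycopg2 sketch below; the connection details and the user name are placeholders.

# Grant the replication role mentioned above to the DMS source user
# (host, credentials and user name are placeholders).
import psycopg2

conn = psycopg2.connect(
    host="source-aurora.cluster-xxxxxxxx.us-east-1.rds.amazonaws.com",
    port=5432, dbname="app", user="postgres", password="...",
)
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("GRANT rds_replication TO dms_user;")
conn.close()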
We actually ended up using DMS for this migration. What we did was:
Take a snapshot of the source DB in the original account.
Share the snapshot with the target account and restore it there. (You have to use a snapshot to migrate things like triggers, custom types, sequences, etc.; see the sketch after this list.)
Set up connectivity (e.g. VPC peering and security group rules) between the two accounts.
Set up DMS in the source account (endpoints, replication instance, task).
Write SQL to temporarily disable/delete constraints, triggers, etc. that may cause errors when loading the source data.
Use DMS to load the source data and enable ongoing replication.
Add the constraints, triggers, etc. back.
Run post-migration tests.
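A minimal boto3 sketch of steps 1-2 might look like the following. Cluster names, snapshot identifiers, the engine and the account IDs are placeholders, and the restored cluster still needs DB instances added to it afterwards.

# Sketch of steps 1-2: share a manual Aurora cluster snapshot with the
# target account and restore it there (all identifiers are placeholders).
import boto3

# In the source account:
rds_src = boto3.client("rds")
rds_src.create_db_cluster_snapshot(
    DBClusterSnapshotIdentifier="pre-migration-snap",
    DBClusterIdentifier="my-aurora-cluster",
)
rds_src.get_waiter("db_cluster_snapshot_available").wait(
    DBClusterSnapshotIdentifier="pre-migration-snap"
)
rds_src.modify_db_cluster_snapshot_attribute(
    DBClusterSnapshotIdentifier="pre-migration-snap",
    AttributeName="restore",
    ValuesToAdd=["222233334444"],   # target account ID (placeholder)
)

# In the target account (separate credentials/session):
rds_tgt = boto3.client("rds")
rds_tgt.restore_db_cluster_from_snapshot(
    DBClusterIdentifier="my-aurora-cluster-copy",
    SnapshotIdentifier=(
        "arn:aws:rds:us-east-1:111122223333:cluster-snapshot:pre-migration-snap"
    ),
    Engine="aurora-postgresql",
)
# Note: restoring a cluster snapshot creates only the cluster; add reader/
# writer instances to it with create_db_instance before using it.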

How to downsize an AWS RDS instance to free tier

I want to create a free-tier clone of a production AWS RDS PostgreSQL database. As per my understanding, the following are the different ways:
create a snapshot of the production DB and restore it on t2.micro
create a read replica of the production DB using t2.micro and then detach it as independent database
create a free tier database and restore a database dump of the production db
Option 3 is my last preference.
The problem is that while creating a read replica or restoring from a snapshot, AWS doesn't explicitly let you choose the free-tier template. I just want to know whether restoring to a t2.micro without any advanced features like autoscaling, performance monitoring, etc. is equivalent to the free tier or not. I read here that the key thing with an AWS production DB is that AWS provisions a secondary database to fall back on in the event of a failure of the primary database or of the Availability Zone in which the database is running.
AWS Free Tier doesn't actually care about the kind of service you use. Per their website you just get 750 instance hours per month for a db.t2.micro.
You can use these in any service you see fit and the discount will be applied automatically for the first 12 months.
Looking at the pricing page for RDS Postgres I can see that these instances aren't listed anymore, which seems weird. The t2 instance family is fairly old, so they're probably trying to phase it out, but typically you can provision older instance types using the API directly if they're not available in the Console.
So what you want to do is create your db.t2.micro instance using one of the SDKs or the AWS CLI and restore from a snapshot. Alternatively you can create a read replica from the CLI and set the class to db.t2.micro; promoting (detaching) it from the source instance later should work.
The production-ready stuff refers to the Multi-AZ deployment, which is good for production use, but for anything production-related a t2.micro seems like a bad choice, so I'm going to assume you're not planning to do that.
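A minimal boto3 sketch of the two approaches above (restore a snapshot onto a db.t2.micro, or create a db.t2.micro read replica and promote it) might look like this; all identifiers are placeholders, and single-AZ keeps the instance closest to the free-tier configuration.

# Sketch: two ways to get a db.t2.micro clone of a production RDS instance
# (instance and snapshot identifiers are placeholders).
import boto3

rds = boto3.client("rds")

# Option 1: restore a production snapshot onto a db.t2.micro.
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="prod-clone-micro",
    DBSnapshotIdentifier="prod-snapshot",
    DBInstanceClass="db.t2.micro",
    MultiAZ=False,
)

# Option 2: create a db.t2.micro read replica, wait for it, then promote it
# to a standalone instance.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="prod-replica-micro",
    SourceDBInstanceIdentifier="prod-postgres",
    DBInstanceClass="db.t2.micro",
)
rds.get_waiter("db_instance_available").wait(
    DBInstanceIdentifier="prod-replica-micro"
)
rds.promote_read_replica(DBInstanceIdentifier="prod-replica-micro")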

Will Serverless support AWS DocumentDB?

I work in a company that's using Serverless to build cloud-native applications and services. Today we use DynamoDB and SQL Databases with AWS Aurora.
We want to go with DocumentDB for our next application, but we could not find anything about Serverless and AWS DocumentDB. Does Serverless support AWS DocumentDB? If not, are there any plans to support it in the future?
Serverless supports any AWS resources that you can define using CloudFormation. As per the Serverless docs here:
Define your AWS resources in a property titled resources. What goes in
this property is raw CloudFormation template syntax, in YAML...
The YAML for creating a DocumentDB cluster is going to look something like:
resources:
  Resources:
    DBCluster:
      Type: "AWS::DocDB::DBCluster"
      DeletionPolicy: Delete
      Properties:
        DBClusterIdentifier: "MyCluster"
        MasterUsername: "MasterUser"
        MasterUserPassword: "Password1234!"
    DBInstance:
      Type: "AWS::DocDB::DBInstance"
      Properties:
        DBClusterIdentifier: "MyCluster"
        DBInstanceIdentifier: "MyInstance"
        DBInstanceClass: "db.r4.large"
      DependsOn: DBCluster
You can find the other CloudFormation resources that you can define in the resources parameter of your Serverless.yaml here.
DocumentDB is not a serverless service. You need to manage the backend server to use it.
Please refer to this blog: https://blogs.itemis.com/en/serverless-services-on-aws; you can see it is not in the list of "SERVERLESS SERVICES ON AWS".
No, this won't support serverless; if you really want that you can go with DynamoDB. You can also compare the differences below if that helps you decide.
DocumentDB
MongoDB is supported by this database, which makes it easier to learn.
Stored procedures are needed here; data retrieval and aggregation are done with their help.
Document size is limited to 16 MB, and storage scales up to 64 TB of data.
Daily backups are managed by the service itself and can be restored whenever required.
It is costly: you pay around $200/month even if you use only a few database instances or only use them for a few hours.
AWS is not involved in where user credentials are stored, as they are stored directly in the DB.
Available only in specific regions.
Can be easily migrated out of AWS into any MongoDB deployment.
In case of primary node failure, the service promotes a read replica to primary. Multi-AZ has to be configured by the user. Backups can be copied across regions.
DynamoDB
MongoDB is not directly supported, and it is not easy to migrate from MongoDB to DynamoDB.
Stored procedures are not needed here, which makes things easier for users.
There is no practical limit on total table size; storage scales up to whatever the user requires.
Automatic daily backups are not available, so backups have to be triggered explicitly by the user; they can be restored whenever needed.
There is an initial cost associated with it, but the overall cost is lower. On-demand pricing is also available, where light usage can come to as little as about $1/month, and 25 GB of storage is provided for free at the first stage.
AWS controls user access to the database through Identity and Access Management (IAM), where authentication and authorization are enforced at a fine-grained level as well.
Available in all regions.
Cannot be easily migrated out of AWS into MongoDB; you need to write code to transform the data.
Supports global tables, which protect users against regional failure. Data is automatically replicated across multiple AZs within a single region.