Will Serverless support AWS DocumentDB?

I work in a company that's using Serverless to build cloud-native applications and services. Today we use DynamoDB and SQL Databases with AWS Aurora.
We want to go with DocumentDB for our next application, but we could not find anything about Serverless and AWS DocumentDB. Does Serverless support AWS DocumentDB? If not, are there any plans to support it in the future?

Serverless supports any AWS resources that you can define using CloudFormation. As per the Serverless docs here:
Define your AWS resources in a property titled resources. What goes in
this property is raw CloudFormation template syntax, in YAML...
The YAML for creating a DocumentDB cluster is going to look something like:
resources:
  Resources:
    DBCluster:
      Type: "AWS::DocDB::DBCluster"
      DeletionPolicy: Delete
      Properties:
        DBClusterIdentifier: "MyCluster"
        MasterUsername: "MasterUser"
        MasterUserPassword: "Password1234!"
    DBInstance:
      Type: "AWS::DocDB::DBInstance"
      Properties:
        DBClusterIdentifier: "MyCluster"
        DBInstanceIdentifier: "MyInstance"
        DBInstanceClass: "db.r4.large"
      DependsOn: DBCluster
You can find the other CloudFormation resources that you can define in the resources property of your serverless.yml here.
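Once the cluster exists, a function in the same service can talk to it with a regular MongoDB driver. Below is a minimal, hypothetical sketch of a Lambda handler using pymongo; the environment variable names, database/collection names and CA bundle path are assumptions, and the Lambda would have to run inside the cluster's VPC with TLS enabled:

import os
import pymongo

# Connect to the DocumentDB cluster defined in the resources section above.
# DOCDB_ENDPOINT / DOCDB_PASSWORD are assumed environment variables; the
# rds-combined-ca-bundle.pem has to be shipped with the deployment package.
client = pymongo.MongoClient(
    host=os.environ["DOCDB_ENDPOINT"],
    port=27017,
    username="MasterUser",
    password=os.environ["DOCDB_PASSWORD"],
    tls=True,
    tlsCAFile="rds-combined-ca-bundle.pem",
    retryWrites=False,  # DocumentDB does not support retryable writes
)

def handler(event, context):
    # Trivial read just to prove connectivity.
    doc = client["mydb"]["mycollection"].find_one()
    return {"statusCode": 200, "body": str(doc)}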

DocumentDB is not a serverless service. You still have to provision and manage the backing instances to use it.
Please refer to this blog: https://blogs.itemis.com/en/serverless-services-on-aws; you can see that it is not in the list of "SERVERLESS SERVICES ON AWS".

No, DocumentDB is not serverless. If you really want a serverless option you can go with DynamoDB. The differences between the two are summarized below.
DocumentDB
MongoDB is supported by this database, which makes it easy to learn.
Stored procedures are needed here; data retrieval and data aggregation are done with their help.
Document size is limited to 16 MB, and storage scales up to 64 TB of data.
Daily backups are managed by the database itself and can be restored whenever required.
It is costly: you end up paying around $200/month even if you use only a few database instances or only use them for a few hours.
AWS is not involved in storing user credentials; they are stored in the DB directly.
Available only in specific regions.
Can easily be migrated out of AWS to any MongoDB deployment.
In case of primary node failure, the service promotes a read replica to primary. Multi-AZ has to be configured by the user. Backups can be copied across regions.
DynamoDB
MongoDB is not directly supported, and it is not easy to migrate from MongoDB to DynamoDB.
Stored procedures are not needed, which makes the process easier for users.
There is no practical limit on total storage; it scales up to whatever the user requires (individual items, however, are capped at 400 KB).
Automatic daily backups are not provided by default; backups have to be triggered explicitly by the user and can be restored whenever needed.
There is an initial cost, but the overall cost is lower. On-demand pricing is also available, where light usage can come to as little as about $1/month, and the first 25 GB of storage is free.
AWS controls access to the database through Identity and Access Management (IAM), where authentication and authorization are handled down to a fine-grained level.
Available in all regions.
Cannot easily be migrated out of AWS to MongoDB; you need to write code to transform the data.
Supports global tables, which protect users against regional failure. Data is automatically replicated across multiple AZs within a single region.

Related

Can you use AWS DMS to move Aurora DB from one account to another?

I am trying to migrate an Aurora cluster from one of our accounts to another. We actually do not have a lot of write requests and the database itself is quite small, but we still want to minimize the downtime.
I have looked into several options:
Use a snapshot: stop writes to the source DB, take a snapshot, share it and restore it in the other account. This would definitely introduce some downtime.
Use Aurora cloning: stop writes to the source DB, clone the cluster into the target account and switch over to the target DB. According to AWS, cloning is much faster than taking and restoring a snapshot, so the downtime should be shorter.
I am not sure if I can use DMS to do this, as I did not find useful docs/tutorials about moving Aurora across accounts. Also, I am not sure whether DMS will sync any write requests to the target DB during the migration.
If DMS cannot sync live, then I should probably use Bucardo for a live migration.
Looking at the docs, AWS Aurora with PostgreSQL compatibility is allowed as source & target endpoints. So, answering your question, yes it's possible.
Obviously, your source Aurora DB should be accessible from the target account. Check that the DB endpoint is public and that the traffic is not restricted by ACL rules or SG rules.
Also, if you want to enable ongoing replication, you need to grant rds_replication (or rds_superuser) role to the source database user. Link to the docs.
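The grant itself is a single statement run against the source database. Here is a minimal sketch using psycopg2; the host, credentials and the dms_user name are placeholders:

import psycopg2

# Grant the replication role to the user DMS will connect as.
conn = psycopg2.connect(
    host="source-cluster.xxxx.us-east-1.rds.amazonaws.com",
    port=5432, dbname="mydb", user="postgres", password="...",
)
conn.autocommit = True
conn.cursor().execute("GRANT rds_replication TO dms_user;")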
We actually ended up using DMS for this migration. What we did was:
Take a snapshot of the source DB in the original account.
Share the snapshot with the target account and restore it over there. (You have to use a snapshot to migrate things like triggers, custom types, sequences, etc.)
Set up connectivity (VPC peering, security groups, etc.) between the two accounts.
Set up DMS in the source account (endpoints, replication instance, task); a rough boto3 sketch of these pieces follows this list.
Write SQL to temporarily disable/drop constraints, triggers, etc. that may cause errors while the source data is being loaded.
Use DMS to load the source data and enable ongoing replication.
Add the constraints, triggers, etc. back.
Run post-migration tests.
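For reference, here is a rough boto3 sketch of the DMS pieces from the setup and load steps. All identifiers, hostnames, credentials and the replication instance ARN are placeholders, and the table mapping simply includes everything:

import json
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Source and target endpoints (both Aurora PostgreSQL).
source = dms.create_endpoint(
    EndpointIdentifier="aurora-source",
    EndpointType="source",
    EngineName="aurora-postgresql",
    ServerName="source-cluster.xxxx.us-east-1.rds.amazonaws.com",
    Port=5432, DatabaseName="mydb", Username="dms_user", Password="...",
)
target = dms.create_endpoint(
    EndpointIdentifier="aurora-target",
    EndpointType="target",
    EngineName="aurora-postgresql",
    ServerName="target-cluster.yyyy.us-east-1.rds.amazonaws.com",
    Port=5432, DatabaseName="mydb", Username="dms_user", Password="...",
)

# Full load plus ongoing replication (CDC) of every table.
dms.create_replication_task(
    ReplicationTaskIdentifier="aurora-cross-account",
    SourceEndpointArn=source["Endpoint"]["EndpointArn"],
    TargetEndpointArn=target["Endpoint"]["EndpointArn"],
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:EXAMPLE",
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection", "rule-id": "1", "rule-name": "all",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)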

How to downsize an AWS RDS instance to free tier

I want to create a free tier clone of a production AWS RDS PostgreSQL instance. As per my understanding, the following are the different ways:
create a snapshot of the production DB and restore it on a t2.micro
create a read replica of the production DB using a t2.micro and then detach it as an independent database
create a free tier database and restore a database dump of the production DB
Option 3 is my last preference.
The problem is that while creating a read replica or restoring from a snapshot, AWS doesn't explicitly let you choose the free tier template. I just want to know whether restoring to a t2.micro without any advanced features like autoscaling, performance monitoring, etc. is equivalent to the free tier or not. I read here that the key thing with an AWS production DB is that AWS provisions a secondary database to fall back on in the event of a failure of the primary database or of the Availability Zone in which it is running.
AWS Free Tier doesn't actually care about the kind of service you use. Per their website you just get 750 instance hours per month for a db.t2.micro.
You can use these in any service you see fit and the discount will be applied automatically for the first 12 months.
Looking at the pricing page for RDS Postgres I can see that these instances aren't listed anymore, which seems weird. The t2 instance family is fairly old, so they're probably trying to phase it out, but typically you can provision older instance types using the API directly even if they're not available in the Console.
So what you want to do is create your db.t2.micro instance using one of the SDKs or the AWS CLI and restore from a snapshot. Alternatively you can create a read replica from the CLI and set the class to db.t2.micro. Later detaching that from the main cluster should work.
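As an illustration, here is a minimal boto3 sketch of both approaches; all identifiers are placeholders, and whether db.t2.micro is accepted depends on the engine version you restore:

import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Option 1: restore the production snapshot onto a db.t2.micro.
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="prod-clone-free-tier",
    DBSnapshotIdentifier="prod-snapshot-2024-01-01",
    DBInstanceClass="db.t2.micro",
    MultiAZ=False,                # Multi-AZ is not covered by the free tier
    PubliclyAccessible=False,
)

# Option 2: create a read replica on a db.t2.micro and promote it later.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="prod-replica-free-tier",
    SourceDBInstanceIdentifier="prod-db",
    DBInstanceClass="db.t2.micro",
)
# rds.promote_read_replica(DBInstanceIdentifier="prod-replica-free-tier")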
The production-ready stuff refers to the Multi-AZ deployment, which is good for production use, but for anything production-related a t2.micro seems like a bad choice, so I'm going to assume you're not planning to do that.

How can I deploy Mongo database on AWS?

I am building my own web app which requires a huge database. I want to build and manage my own MongoDB database on AWS rather than using MongoDB Atlas. Which will save more cost? Should I go for MongoDB Atlas at all? What advantages would it have over my own database?
There are pros and cons for both approaches:
Running MongoDB on AWS
Pros:
Complete control over how you run the database and how resources are allocated on the server. This could even be together with an application server on the same EC2 instance depending on your traffic and load. This might help with cost saving if your database is huge but isn't likely to see much traffic.
Cons:
You will be responsible for ensuring database availability and applying security patches as and when they are available. You may also have to setup firewalls and protect the EC2 instance and database in other ways that would be trivial to do on a hosted service like Atlas.
Data sharding and clustering can be a real pain to manage by yourself.
Running on Atlas
Pros:
Completely managed service where you don't have to be concerned about performance optimization or scalability. You pay for the service and MongoDB takes care of the rest.
You can focus on building a great application instead of spending your time on administering the database and the EC2 instance on which the database runs.
Cons:
You will be constrained by the options offered by Atlas. For most use cases this should be fine, but if you really want a specific change, it would be difficult to implement if MongoDB doesn't already support it as part of Atlas.
Think of it as running your application on EC2 vs. buying an on-premises server and running your application on that.
Being a managed service, costs might also be higher if your database does not see much traffic.
HOSTING yourself: You can get one or more AWS EC2 instances (which are VMs) where you install and run MongoDB yourself and manage it however you want, making sure that you spin up more instances when the workload grows and that instances are up and running at all times to enable high availability.
Cost (high) - Management responsibilities (lots) - Full MongoDB functionality
MongoDB Atlas is a managed service: you don't need to worry about management tasks like scaling your database or keeping it highly available when one or more instances die. You pay a fairly low cost for it; it is run by MongoDB themselves on AWS, Azure and Google Cloud.
Cost (low) - Management responsibilities (some) - Full MongoDB functionality
Now AWS has its own MongoDB-compatible database called DocumentDB. This is also a managed database, so you don't need to worry about scalability, high availability, etc. It is only available on AWS, so it is super simple and convenient.
Cost (low) - Management responsibilities (minimal) - Limited MongoDB functionality

MongoDB on Azure worker role

I'm developing an application using SignalR to manage WebSockets and allow my clients to talk to each other.
I'm planning to host this back office on an Azure worker role. As my SignalR requests carry data that is most of the time saved to the database, I'm wondering whether NoSQL's MongoDB, instead of the classic SQL Server/Entity Framework pair, would be a good approach.
Assuming that my application's data types will mostly be strings, I think MongoDB will be a reliable and performant solution, and it will allow me to get rid of Azure's SQL database costs.
For information, the Azure worker role will be running on a machine with the following hardware: 1 core CPU, 3.5 GB RAM and 50 GB SSD storage.
Do you think I'm off to a good start with this architecture?
Thanks
Do you think I'm off to a good start with this architecture?
In a word, no.
A user asked a similar question regarding running Redis on Worker Roles - Setting up Redis on Azure cloud service worker role - all of the content on that Q/A is relevant in the MongoDb context.
I'd suggest that you read my answer as it goes into more detail, but as an overview of why this is a bad architectural approach:
You cannot guarantee when a Worker Role will be restarted by the Azure Service Fabric.
In a real-world implementation of Mongo, you would run multiple nodes within a cluster, with a single Worker Role (as you have suggested in your question) this won't be possible.
You will need to manage your MongoDb installation within the Worker Role and they simply aren't designed for this.
If you are really fixed on using Mongo, I would suggest that you use a hosted solution such as MongoLabs (as suggested in earlier answers), or consider hosting it on Azure IaaS VMs.
If you are not fixed on using Mongo, I would sincerely suggest that you look at Azure DocumentDb (also suggested above), Microsoft's Azure NoSQL offering - I have used it in several production systems already and it is certainly a capable NoSQL solution; granted, it may not have all of the features available with MongoDb.
If you are looking at a NoSQL solution for caching of data (i.e. not long term storage), I would suggest you take a look at Azure Redis Cache, which is a very capable Redis offering.
Azure has its own native NoSQL document database called DocumentDB; have you had a look at it? If I were you I would use DocumentDB unless you have some special requirements that you have not mentioned, but from the little requirement info you have posted, DocumentDB would do just fine. It is not identical to MongoDB in terms of basic functionality; see this article for a comparison between Azure DocumentDB and MongoDB.

Sandbox version for AWS RedShift

I have been using Redshift for a few months and I like it. But I need to add some tests around it and I am not sure what the most cost-effective way of doing that is. The only thing I can think of is using a single-node Redshift cluster as a sandbox, but that seems too costly even if I only use it during testing.
Databases in Redshift cannot 'see' each other and cross-database queries are not supported. Therefore we simply have 'development', 'test' and 'production' databases on the same cluster.
When we're ready to push to production we:
take a snapshot
drop production
rename test to production
This generally works fine for us because we find Redshift to be over-provisioned on storage, i.e., filling our nodes to their maximum storage capacity does not provide acceptable performance.
NOTE: You cannot drop your "master" database defined when the cluster was created. If you are using that as your primary database you will have to unload your cluster and recreate it for this approach to be viable.
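For anyone scripting this flow, here is a minimal sketch using boto3 and psycopg2; the cluster, database and user names are placeholders, and you must be connected to a database other than the one being dropped or renamed:

import boto3
import psycopg2

# 1. Take a manual snapshot of the cluster first.
boto3.client("redshift", region_name="us-east-1").create_cluster_snapshot(
    SnapshotIdentifier="pre-release-snapshot",
    ClusterIdentifier="my-redshift-cluster",
)

# 2. Connect to a database you are NOT about to drop or rename.
conn = psycopg2.connect(
    host="my-redshift-cluster.xxxx.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="development", user="admin", password="...",
)
conn.autocommit = True  # DROP/ALTER DATABASE cannot run inside a transaction
cur = conn.cursor()

# 3. Drop the old production database and promote test in its place.
cur.execute("DROP DATABASE production;")
cur.execute("ALTER DATABASE test RENAME TO production;")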
I got the answer from AWS RedShift forum: "There is no way of creating a sandbox version of Redshift. We'll add this to our backlog of feature requests"