Does data in MongoDB provisioned on EC2 get replicated during Auto Scaling? - mongodb

I want to deploy a server on Amazon EC2 with the MongoDB master database on the EC2 instance itself. On average I would have around 5-6 EC2 instances running in parallel, scaled by an Amazon Auto Scaling group.
Because the database is updated frequently and all instances sit behind an Elastic Load Balancer, it is hard to predict which user's data is in which instance's database. With this approach, am I assured of data consistency in MongoDB across the instances while scaling up and down? If this is not a good approach, please suggest alternatives.

When using Amazon autoscaling, new EC2 instances will be created from a root AMI image (for example, with an empty database).
As data is added to your database, that data is not synced back to the AMI image. So when a second EC2 instance is launched due to a scaling event, that new EC2 instance will have its own blank database, because it will be based on the same root AMI image (with the blank database).
The two databases will not know about each other and no syncing will occur. Also, at any time, any of the EC2 instances may be deleted due to a scale-down event. So any data on that instance may be lost.
Separate your web layer from the database layer: use autoscaling to scale your web layer, but don't use autoscaling for your data layer.
MongoDB has its own form of clustering for load balancing and high availability (replica sets and sharding). Use it rather than rolling your own with autoscaling.
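As a rough illustration of the separation described above, the autoscaled web instances can all point at one fixed MongoDB replica set instead of a local database. The sketch below only builds the connection string; the host names, the replica set name `rs0`, and the database name `app` are hypothetical placeholders, not values from the question.

```python
def build_replica_set_uri(hosts, replica_set, database="app"):
    """Return a MongoDB connection string that lists every replica set member."""
    if not hosts:
        raise ValueError("at least one host is required")
    seed_list = ",".join(hosts)
    return f"mongodb://{seed_list}/{database}?replicaSet={replica_set}"


# Hypothetical, fixed database hosts that live outside the Auto Scaling group.
DB_HOSTS = ["db1.internal:27017", "db2.internal:27017", "db3.internal:27017"]

uri = build_replica_set_uri(DB_HOSTS, "rs0")
print(uri)
```

Every web instance launched from the AMI would read the same URI (for example from its configuration), so scaling the web tier up or down never creates or destroys a database.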

It is not standard practice to couple your web server with the database server. Here is what I would suggest.
Implement load balancing on your web servers as well as your MongoDB instances, so that, for the sake of argument, you have 4 web servers and 4 MongoDB servers.
To load balance the MongoDB servers, you can either go with a master-slave tier where each MongoDB server is a master as well as a slave (so that all instances have their data synced), or look into sharding.

Related

MongoDB Atlas scaling with AWS regions

I have 6 EC2 instances in different AWS regions. I have only one MongoDB server on MongoDB Atlas (AWS). I am having latency issues while updating data in the database. What is the best way to scale my MongoDB server with my 6 EC2 instances?
One option is to set up VPC peering between your EC2 VPCs and MongoDB Atlas (AWS), since MongoDB Atlas now supports VPC peering.
If you are connecting to MongoDB over the public-facing network, it might be slow.
Instances launched in a different AWS region can connect using inter-region VPC peering.
So if you are following the traditional approach of whitelisting IPs in ACLs, try this instead; it should reduce the latency. Peer the VPCs of all regions (make sure there are no IP range conflicts) and connect over private connections.
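The "no IP conflict" check can be done before any peering is created. This is a minimal sketch using Python's standard `ipaddress` module; the region names and CIDR blocks are made-up examples, so substitute the real ranges of your VPCs and of the Atlas VPC.

```python
import ipaddress
from itertools import combinations


def find_cidr_conflicts(vpc_cidrs):
    """Return pairs of VPC names whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in vpc_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical CIDR blocks, one per VPC to be peered.
vpcs = {
    "us-east-1": "10.0.0.0/16",
    "eu-west-1": "10.1.0.0/16",
    "ap-south-1": "10.0.128.0/20",  # sits inside the us-east-1 range
    "atlas": "192.168.248.0/21",
}

conflicts = find_cidr_conflicts(vpcs)
print(conflicts)
```

Any pair reported here would have to be re-addressed before peering, because peered VPCs cannot have overlapping ranges.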
EDIT : 1
There is no additional penalty with MongoDB in terms of using it out of region, but most database protocols are not optimized for high latency conditions. You might be much better off setting up a read replica in other regions.
You can read this: https://www.mongodb.com/blog/post/optimizing-fast-responsive-reads-cross-region-replication-mongodb-atlas
EDIT : 2
If you can't push your database to multiple regions (by using read replicas, for example), then you should consider putting CloudFront in front of your application (website) so that requests can be cached in the different regions.
It won't technically improve the latency between the application and the database, but in terms of your users' perception of performance, it will feel somewhat faster.

How can I deploy Mongo database on AWS?

I am building my own web app, which requires a huge database. I want to build and manage my own MongoDB database on AWS rather than using MongoDB Atlas. Which would be more cost effective? Should I go for MongoDB Atlas instead, and what would be its advantages over running my own database?
There are pros and cons for both approaches:
Running MongoDB on AWS
Pros:
Complete control over how you run the database and how resources are allocated on the server. This could even be together with an application server on the same EC2 instance depending on your traffic and load. This might help with cost saving if your database is huge but isn't likely to see much traffic.
Cons:
You will be responsible for ensuring database availability and applying security patches as and when they are available. You may also have to set up firewalls and protect the EC2 instance and database in other ways that would be trivial to do on a hosted service like Atlas.
Data sharding and clustering can be a real pain to manage by yourself.
Running on Atlas
Pros:
Completely managed service where you don't have to be concerned about performance optimization or scalability. You pay for the service and MongoDB takes care of the rest.
You can focus on building a great application instead of spending your time on administering the database and the EC2 instance on which the database runs.
Cons:
You will be constrained by the options offered by Atlas. For most use cases this should be fine, but if you really want a specific change, it would be difficult to implement if MongoDB doesn't already support it as part of Atlas.
Think running your application on EC2 vs buying a server on-premise and running your application on that.
Being a managed service, costs might also be higher if your database does not see much traffic.
Hosting it yourself: you can get one or more AWS EC2 instances (which are VMs), install and run MongoDB on them, and manage it however you want, making sure that you spin up more instances when the workload grows and that enough instances are running at all times for high availability.
Cost (high) - Management responsibilities (lots) - Full MongoDB functionality
MongoDB Atlas is a managed service, so you don't need to worry about management tasks such as scaling your database or maintaining high availability when one or more instances die. You pay a comparatively low cost for it. It is run by MongoDB themselves on AWS, Azure, and Google Cloud.
Cost (low) - Management responsibilities (some) - Full MongoDB functionality
AWS now has its own MongoDB-compatible database called DocumentDB. It is also a managed database, so you don't need to worry about scalability, high availability, etc. It is only available on AWS, where it is simple and convenient to use.
Cost (low) - Management responsibilities (minimal) - Limited MongoDB functionality

Mongodb clone to another cluster

I have a MongoDB cluster deployed on the managed cloud service Atlas, with Continuous Backup enabled.
What I want to do is:
1) Use the existing backup.
2) Use this backup to create a similar cluster (with the same data from the backup).
3) Automate this process so that every day my new cluster is brought up to date from the original cluster.
Note: the original cluster holds production data. The idea behind cloning it is to have a database with the same data on which I can plug in any analytics tools and run different operations without affecting production data and load.
So far I have found mongodump and mongorestore. But mongodump puts load on the production database even though my backup is enabled. I want to use the existing backup to clone the data into another cluster.
Since you are deployed on Atlas, your deployment must already be a replica set.
Here are 2 solutions:
You only need to read data: connect your tools to a secondary server (ideally a dedicated one with priority 0, so that it never becomes primary).
You need to read and write data: on the same secondary as above, run your mongodump command with the --oplog option. This way you dump your data from a read-only server and avoid slowing down your main servers.
For this last case, the solution lies in backup strategies; take a look at the docs to learn more.
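A hedged sketch of the second option: the helper below only assembles a `mongodump` command line that reads from a secondary and includes the oplog. It does not run anything. The host names, replica set name, and output directory are hypothetical, and whether `--oplog` is permitted depends on your cluster tier and user privileges.

```python
import shlex


def mongodump_command(uri, out_dir):
    """Build a mongodump invocation that dumps from a secondary with the oplog."""
    return [
        "mongodump",
        f"--uri={uri}",
        "--readPreference=secondary",  # keep the read load off the primary
        "--oplog",                     # capture writes that happen during the dump
        f"--out={out_dir}",
    ]


# Hypothetical replica set members and dump location.
cmd = mongodump_command(
    "mongodb://db1.internal:27017,db2.internal:27017/?replicaSet=rs0",
    "/backups/daily",
)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` from a daily scheduled job, followed by a `mongorestore --oplogReplay` into the analytics cluster.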
Atlas has an offering for this purpose called analytics nodes.
An analytics node is a read replica of your database. It does not interfere with your production traffic, which makes it safer.
You can also connect the BI Connector to this node and build your analytics platform on it.
We used Redash.
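To send only analytics traffic to such a node, the connection string can carry a read preference and a replica set tag. The sketch below assumes the Atlas predefined tag `nodeType:ANALYTICS`; the cluster host name and database name are hypothetical, and credentials are deliberately left out.

```python
from urllib.parse import urlencode


def analytics_uri(cluster_host, database):
    """Build a connection string that targets analytics-tagged secondaries."""
    options = urlencode(
        {
            "readPreference": "secondary",
            "readPreferenceTags": "nodeType:ANALYTICS",
        },
        safe=":",  # keep the colon in the tag readable
    )
    return f"mongodb+srv://{cluster_host}/{database}?{options}"


# Hypothetical Atlas cluster host and database.
uri = analytics_uri("cluster0.example.mongodb.net", "reporting")
print(uri)
```

A BI tool configured with this URI would read from the analytics node while the application keeps using its normal connection string.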

Where does MongoDB Atlas fit in my nodejs app?

I have an Express app using MongoDB up and running locally. I am looking at options to deploy it and am not clear on how MongoDB Atlas fits in. I am planning on just deploying the Express app and database to an EC2 instance. Is that all right? Or do I need a separate instance for Mongo to run on? MongoDB Atlas offers M2, M5, M10, etc. as options for nodes. I am very new to backend work and want to know whether those would be separate from my EC2 instance, or whether those would be the EC2 instance running my Express app for clients to connect to as well.
Mongo Atlas is a standalone hosted MongoDB instance. It's a separate server, or typically a cluster of several servers, that only runs MongoDB. You'd run your Express app on an EC2 instance and have it talk over the network to the Mongo Atlas instance on another server.
The advantage is that you don't have to worry about installing or handholding Mongo, about configuring a redundant Mongo cluster, or about upgrades and backups. Generally, separating the database server from the application server also means easier long-term maintenance of both. If your Express server doesn't store any data itself, then it is entirely disposable in case of emergencies, while you can be assured* that the critical data stored in your database is well cared for.
* As far as your contract with Atlas stipulates that the data is being cared for…

where to put mongos on AWS to ensure High Availability

I have a mongos that routes my queries to two different Mongo clusters running on two different EC2 instances, so that if one EC2 instance goes down, I have a backup.
The challenge is: where should I put my mongos query router? I do not want to put it on a third EC2 instance upstream, because EC2 instances can fail and break. I've had this happen to me. EC2 instances do not recover on their own and spin themselves up again, right? If the EC2 instance that my mongos query router is on goes down, then all the redundancy built behind it for high availability becomes irrelevant.
So is there another amazon service (like an ec2) that is small, and would only be dedicated to one server (a mongos query distributor), that can spin itself up again if it goes down due to hardware failures, or auto-grow its own RAM and disk-space to give the mongos query router more resources due to software consuming system resources?
Looks like ec2 instances can auto-recover now via
https://aws.amazon.com/blogs/aws/new-auto-recovery-for-amazon-ec2/
So having a mongos query router on a small EC2 instance with auto-scaling and auto-recovery should be safe.
Also, though it is not an out-of-the-box solution for EC2 instances, it looks like you can now also scale up the RAM and disk size of your EC2 instance by using custom CloudWatch alarms to trigger these actions via
http://aws.amazon.com/code/8720044071969977
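For the auto-recovery part, the mechanism is a CloudWatch alarm on the system status check whose action is the EC2 recover action. The sketch below only builds the alarm parameters as a plain dictionary; the instance ID, region, alarm name, and the period and threshold values are illustrative assumptions, not settings from the linked post.

```python
def recovery_alarm_params(instance_id, region):
    """Return CloudWatch alarm settings that trigger EC2 auto-recovery."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,               # seconds per evaluation window (assumed)
        "EvaluationPeriods": 2,     # two failed minutes in a row (assumed)
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


# Hypothetical instance hosting the mongos router.
params = recovery_alarm_params("i-0123456789abcdef0", "us-east-1")
print(params["AlarmActions"][0])
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`.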