Mongodb cluster with aws cloud formation and auto scaling - mongodb

I've been investigating creating my own MongoDB cluster in AWS. The AWS MongoDB template provides some good starting points. However, it doesn't cover auto scaling or what happens when a node goes down. For example, say I have 1 primary and 2 secondary nodes, the primary goes down and auto scaling kicks in. How would I add the newly launched MongoDB instance to the replica set?
If you look at the template, it uses an init.sh script to check whether the node being launched is a primary node, waits for all other nodes to exist and then creates a replica set with their IP addresses on the primary. When the replica set is configured initially, all the nodes already exist.
Not only that, but my Node app uses Mongoose. Part of the database connection allows you to specify multiple nodes. How would I keep track of what's currently up and running? (I guess I could use DynamoDB, but I'm not sure.)
What's the usual flow if an instance goes down? Do people generally manually re-configure clusters if this happens?
Any thoughts? Thanks.

This is a very good question and I went through this very painful journey myself recently. I am writing a fairly extensive answer here in the hope that some of these thoughts of running a MongoDB cluster via CloudFormation are useful to others.
I'm assuming that you're creating a MongoDB production cluster as follows: -
3 config servers (micro / small instances can work here)
At least 1 shard consisting of e.g. 2 (primary & secondary) shard instances (minimum or large) with large disks configured for data / log / journal disks.
arbiter machine for voting (micro probably OK).
i.e. https://docs.mongodb.org/manual/core/sharded-cluster-architectures-production/
Like yourself, I initially tried the AWS MongoDB CloudFormation template that you posted in the link (https://s3.amazonaws.com/quickstart-reference/mongodb/latest/templates/MongoDB-VPC.template) but to be honest it was far, far too complex, i.e. it's 9,300 lines long and sets up multiple servers (replica shards, configs, arbiters, etc). Running the CloudFormation template took ages and it kept failing (e.g. after 15 minutes), which meant the servers all terminated and I had to start over, which was really frustrating / time consuming.
The solution I went for in the end (which I'm super happy with) was to create separate templates for each type of MongoDB server in the cluster e.g.
MongoDbConfigServer.template (template to create config servers - run this 3 times)
MongoDbShardedReplicaServer.template (template to create replica - run 2 times for each shard)
MongoDbArbiterServer.template (template to create arbiter - run once for each shard)
NOTE: templates available at https://github.com/adoreboard/aws-cloudformation-templates
The idea then is to bring up each server in the cluster individually, i.e. 3 config servers, 2 sharded replica servers (for 1 shard) and an arbiter. You can then add custom parameters into each of the templates e.g. the parameters for the replica server could include: -
InstanceType e.g. t2.micro
ReplicaSetName e.g. s1r (shard 1 replica)
ReplicaSetNumber e.g. 2 (used with ReplicaSetName to create name e.g. name becomes s1r2)
VpcId e.g. vpc-e4ad2b25 (not a real VPC obviously!)
SubnetId e.g. subnet-2d39a157 (not a real subnet obviously!)
GroupId (name of existing MongoDB group Id)
Route53 (boolean to add a record to an internal DNS - best practices)
Route53HostedZone (if boolean is true then ID of internal DNS using Route53)
The really cool thing about CloudFormation is that these custom parameters can have (a) a useful description for people running it, (b) special types (e.g. the console renders a prefiltered combobox when you run the template, so mistakes are harder to make) and (c) default values. Here's an example: -
"Route53HostedZone": {
  "Description": "Route 53 hosted zone for updating internal DNS (Only applicable if the parameter [ UpdateRoute53 ] = \"true\")",
  "Type": "AWS::Route53::HostedZone::Id",
  "Default": "YA3VWJWIX3FDC"
},
This makes running the CloudFormation template an absolute breeze as a lot of the time we can rely on the default values and only tweak a couple of things depending on the server instance we're creating (or replacing).
As well as parameters, each of the 3 templates mentioned earlier has a "Resources" section which creates the instance. We can also do cool things via the "AWS::CloudFormation::Init" section, e.g.
"Resources": {
  "MongoDbConfigServer": {
    "Type": "AWS::EC2::Instance",
    "Metadata": {
      "AWS::CloudFormation::Init": {
        "configSets" : {
          "Install" : [ "Metric-Uploading-Config", "Install-MongoDB", "Update-Route53" ]
        },
The "configSets" in the previous example shows that creating a MongoDB server isn't simply a matter of creating an AWS instance and installing MongoDB on it but also we can (a) install CloudWatch disk / memory metrics (b) Update Route53 DNS etc. The idea is you want to automate things like DNS / Monitoring etc as much as possible.
IMO, creating a template, and therefore a stack for each server has the very nice advantage of being able to replace a server extremely quickly via the CloudFormation web console. Also, because we have a server-per-template it's easy to build the MongoDB cluster up bit by bit.
My final bit of advice on creating the templates would be to copy what works for you from other GitHub MongoDB CloudFormation templates e.g. I used the following to create the replica servers to use RAID10 (instead of the massively more expensive AWS provisioned IOPS disks).
https://github.com/CaptainCodeman/mongo-aws-vpc/blob/master/src/templates/mongo-master.template
In your question you mentioned auto-scaling - my preference would be to add a shard / replace a broken instance manually (auto-scaling makes sense with web containers e.g. Tomcat / Apache but a MongoDB cluster should really grow slowly over time). However, monitoring is very important, especially the disk sizes on the shard servers, to alert you when disks are filling up (so you can either add a new shard or delete data). Monitoring can be achieved fairly easily using AWS CloudWatch metrics / alarms or using the MongoDB MMS service.
If a node goes down, e.g. one of the replicas in a shard, then you can simply kill the server, recreate it using your CloudFormation template and the data will sync across automatically from the other replica set members. This is my normal flow if an instance goes down and generally no re-configuration is necessary. I've wasted far too many hours in the past trying to fix servers - sometimes lucky / sometimes not. My backup strategy now is to run a mongodump of the important collections of the database once a day via a crontab, zip it up and upload it to AWS S3. This means if the nuclear option happens (complete database corruption) we can recreate the entire database and mongorestore in an hour or 2.
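As a rough sketch of the daily backup job described above, the Python below only builds the command lines that a cron-driven script could run; it does not execute anything. The database name, collection names, bucket name and archive naming scheme are all made up for illustration, and the `--gzip` / `--archive` options assume mongodump 3.2 or later.

```python
from datetime import date


def backup_commands(db, collections, bucket, day, host="localhost:27017"):
    """Build one mongodump command and one S3 upload command per collection.

    The archive naming scheme and the S3 key layout are assumptions for this
    sketch, not something prescribed by MongoDB or AWS.
    """
    stamp = day.isoformat()
    commands = []
    for coll in collections:
        archive = f"{db}.{coll}.{stamp}.archive.gz"
        # Dump a single collection into a gzip-compressed archive file.
        commands.append([
            "mongodump", "--host", host, "--db", db,
            "--collection", coll, "--gzip", f"--archive={archive}",
        ])
        # Upload the archive to S3 with the AWS CLI.
        commands.append([
            "aws", "s3", "cp", archive, f"s3://{bucket}/{stamp}/{archive}",
        ])
    return commands


# Hypothetical database, collections and bucket.
for cmd in backup_commands("shop", ["orders", "users"], "my-mongo-backups", date(2016, 1, 31)):
    print(" ".join(cmd))
```

A real job would pass each list to something like subprocess.run and stop on the first failure. Restoring one of these archives would then be a matter of mongorestore with the matching `--gzip --archive=<file>` options.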
However, if you create a new shard (because you're running out of space) configuration is necessary. For example, if you are adding a new Shard 3 you would create 2 replica nodes (e.g. primary with name => mongo-s3r1 / secondary with name => mongo-s3r2) and 1 arbiter (e.g. with name mongo-s3r-arb), then you'd connect via a MongoDB shell to a mongos (MongoDB router) and run this command: -
sh.addShard("s3r/mongo-s3r1.internal.mycompany.com:27017,mongo-s3r2.internal.mycompany.com:27017")
NOTE: - This command assumes you are using private DNS via Route53 (best practice). You can simply use the private IPs of the 2 replicas in the addShard command but I have been very badly burned with this in the past (e.g. several months back all the AWS instances were restarted and new private IPs were generated for all of them). Fixing the MongoDB cluster took me 2 days as I had to reconfigure everything manually - whereas changing the IPs in Route53 takes a few seconds ... ;-)
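If you end up adding shards more than once, it can help to generate the addShard argument rather than type it by hand. Here is a minimal Python sketch, assuming the mongo-s<shard>r<n> naming scheme and the internal.mycompany.com domain from the example above; both helper functions are hypothetical and only build strings.

```python
def replica_hostnames(shard_number, replica_count, domain):
    """Return DNS names following the assumed mongo-s<shard>r<n>.<domain> scheme."""
    return [
        f"mongo-s{shard_number}r{n}.{domain}"
        for n in range(1, replica_count + 1)
    ]


def shard_connection_string(replica_set, hosts, port=27017):
    """Return 'replicaSetName/host1:port,host2:port' as passed to sh.addShard()."""
    if not hosts:
        raise ValueError("at least one replica host is required")
    members = ",".join(f"{host}:{port}" for host in hosts)
    return f"{replica_set}/{members}"


hosts = replica_hostnames(3, 2, "internal.mycompany.com")
# Print the shell command to paste into a mongo shell connected to a mongos.
print(f'sh.addShard("{shard_connection_string("s3r", hosts)}")')
```

The printed line matches the command shown above, so it can be pasted into the mongo shell as-is.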
You could argue we should also add the addShard command to another CloudFormation template but IMO this adds unnecessary complexity because it has to know about a server which has a MongoDB router (mongos) and connect to that to run the addShard command. Therefore I simply run this after the instances in a new MongoDB shard have been created.
Anyways, that's my rather rambling thoughts on the matter. The main thing is that once you have the templates in place your life becomes much easier and defo worth the effort! Best of luck! :-)

Related

How to read from specific instance of a documentdb cluster

I am having a replica lag issue with DocumentDB. I am trying to write some data to a collection and read it back at the same time, but because I am using a distributed system, I am not able to read the already written data from the replicas.
Here's the cluster design.
So, is it possible to read from the primary instance in nodejs or is it possible to read from a specific instance?
How big is the replication lag? It might be worth investigating the cause for the lag, maybe bigger instances are needed or queries have to be optimized.
If your application can't tolerate eventual consistency, or read-after-write consistency is required, then use readPreference: primaryPreferred to instruct the driver to read from the primary instance when available. However, in this case, the replicas will not be used to scale the read traffic horizontally.
Amazon DocumentDB has other endpoints too:
reader endpoint - points to replica instances, it's found in the configuration section of the cluster (console or aws cli describe-db-clusters command)
instance endpoint - each instance has its own endpoint, it's found in the instances section (console or aws cli describe-db-instances command)
The best practice is to connect as replica set, using the readPreference parameter to adjust the preference. Instance endpoints can be useful when, for example, there's a need for large analytics queries and a bigger instance is deployed, temporarily, to run them.
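To make the replica set connection with a read preference concrete, here is a small Python sketch that only builds the connection string. The cluster endpoint and credentials are placeholders, TLS options are left out for brevity, and rs0 is assumed as the replica set name. The resulting URI is driver-agnostic, so the same string should work from a Node.js driver.

```python
from urllib.parse import quote_plus, urlencode


def documentdb_uri(user, password, cluster_endpoint,
                   read_preference="primaryPreferred", port=27017):
    """Build a replica-set style connection string with a read preference.

    The endpoint and credentials passed in are placeholders; the replica set
    name rs0 is an assumption of this sketch.
    """
    options = {"replicaSet": "rs0", "readPreference": read_preference}
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{cluster_endpoint}:{port}/?{urlencode(options)}"


# Hypothetical endpoint and credentials, for illustration only.
print(documentdb_uri("app", "p@ss", "docdb.example.internal"))
print(documentdb_uri("app", "p@ss", "docdb.example.internal", "secondaryPreferred"))
```

Switching the second argument between primaryPreferred and secondaryPreferred is then the only change needed to trade read-after-write consistency against read scaling.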

MongoDB data replication in Kubernetes

I've been configuring pods in Kubernetes to hold a mongodb and golang image each, with a service to load-balance. The major issue I am facing is data replication between databases. Replication controllers/replicasets do not seem to do what the name implies: they create blank-slate copies rather than replicas of existing/currently running pods. I cannot seem to find any examples or clear answers on how Kubernetes addresses this, or whether it does at all.
For example, data insertions being sent by the Go program are going to automatically load balance to one of X replicated instances of mongodb by the service. This poses problems since they will all be maintaining separate documents without any relation to one another once Kubernetes begins to balance the connections among other pods. Is there a way to address this in Kubernetes, or does it require a complete re-write of the Go code to expect data replication among numerous available databases?
Sorry, I'm relatively new to Kubernetes and couldn't seem to find much information regarding this.
You're right, a replica set is not a replica of another container, it's just a container with the same configuration spun up within the same logical unit.
A replica set (or deployment, which is the resource you should be using now) will have multiple pods, and it's up to you, the operator, to configure the mongodb part.
I would recommend reading this example of how to set up a replica set with multiple mongodb containers:
https://medium.com/google-cloud/mongodb-replica-sets-with-kubernetes-d96606bd9474#.e8y706grr

MongoDb preparing for Sharded Clusters

We are currently setting up our mongodb environment for production. At the moment we only have one dedicated mongodb database server. We will expand this in the near future with a 2nd server and I already indicated to the management that for the ideal situation we should get a 3rd server as well.
Since I already know we're going to use sharding and replication in the near future I want to be prepared for it.
The idea I have now is to start now with the Development Configuration (as mongo's documentation names it).
When our second server becomes available I would like to expand this setup to a configuration with 2 configuration servers and 2 shards (replica sets).
And of course when our third server comes available have the fully functional sharded cluster configuration.
While reading mongo's documentation I was alarmed by the note that the Development setup should not be used in production.
MongoDb Development Configuration
Keeping in mind that we will add more servers soon, would it be a bad idea to set up the Development Configuration now so we can easily add the 2nd server to the cluster when it becomes available?
After setting up the 'development sharded setup' I've found my answer. Of course I'm happy to share in case anybody runs into the same questions as I did when starting with this.
In my case, it was OK to start with the development setup until my new servers arrived. It was a temporary situation and when my new servers arrived I was able to easily expand my replica sets. There are a number of reasons why this isn't advised for production:
To state the obvious, there is no replication yet. Since I was running shards on one machine there is a single point of failure. If the machine, or one node goes down, the cluster won't work anymore.
Now this part is interesting. After I added a second server, I did have primary and secondary nodes. Primary nodes were used for writing and secondaries for reading. I had eliminated the issue that there was no replication AND my data had a higher availability. However, I noticed with the 2-member replica sets that if one member of the replica set went down (even if this was a secondary), the primary stepped down to a secondary node as well. This has to do with the voting mechanism that MongoDB uses: a primary needs a majority of the votes, and 1 out of 2 is not a majority. See Markus' more detailed answer on this. Since there are no more primaries in the replica set, the cluster won't function anymore. If I were to use an arbiter I could eliminate this problem as well.
When you have a 3-member replica set, automatic failover kicks in. Whenever a node goes down, another primary is elected automatically and the cluster will continue performing as before.
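The difference between the 2-member and 3-member behaviour comes down to the majority rule. A tiny Python sketch of that arithmetic (the function names are my own, and it assumes every member has one vote):

```python
def votes_needed(voting_members):
    """Strict majority of the voting members in a replica set."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable_members):
    """True if the reachable members can still form a majority."""
    return reachable_members >= votes_needed(voting_members)


for total in (2, 3):
    reachable = total - 1  # simulate one member going down
    print(
        f"{total} members, 1 down: need {votes_needed(total)} votes, "
        f"have {reachable}, primary possible: {can_elect_primary(total, reachable)}"
    )
```

With 2 members, losing either one leaves 1 vote out of the 2 needed, so the survivor cannot stay primary. With 3 members (or 2 plus an arbiter), 2 votes remain and a primary can still be elected.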
During my tests I also got to a point where one of my mongod.exe instances stopped working due to an "Out of memory" exception. I was running a cluster with 3 replica sets, meaning every machine had at least 4 mongod.exe processes running (3 for the replica set shards and one for the configuration server replica set). Besides having a query which wasn't optimized yet, I also noticed that the WiredTiger storage engine by default can use up to 50% of (RAM minus 1 GB) for its cache, per process. Perhaps it wasn't the best choice to have multiple replica set shards on one machine, but I was able to eliminate the problem by capping the WiredTiger memory usage.
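To see why several mongod processes on one machine run into trouble, here is a rough Python calculation. It assumes the documented default for MongoDB 3.4 and later (the larger of 50% of (RAM minus 1 GB) and 256 MB); the even split across processes is just one possible policy I'm using for illustration.

```python
def default_wiredtiger_cache_gb(total_ram_gb):
    """Default WiredTiger cache size, assuming the MongoDB 3.4+ rule:
    the larger of 50% of (RAM - 1 GB) and 256 MB."""
    return max(0.5 * (total_ram_gb - 1.0), 0.25)


def capped_cache_gb(total_ram_gb, mongod_processes):
    """Hypothetical policy: share one default-sized budget evenly across
    all mongod processes running on the same machine."""
    return default_wiredtiger_cache_gb(total_ram_gb) / mongod_processes


ram_gb = 16
processes = 4  # e.g. 3 shard members plus 1 config server member
print("default per process:", default_wiredtiger_cache_gb(ram_gb), "GB")
print("uncapped worst case:", default_wiredtiger_cache_gb(ram_gb) * processes, "GB")
print("capped per process:", capped_cache_gb(ram_gb, processes), "GB")
```

The capped value is what you would then set per process via storage.wiredTiger.engineConfig.cacheSizeGB in the config file (or --wiredTigerCacheSizeGB on the command line).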
I hope this answer helps anybody who's starting to set up replication and sharding for MongoDb.

Setting up mongo replication in production

How do you set up MongoDB replication in production environments? I started using CloudFormation with this template but it crashes halfway. I want to set up Mongo so that it has one primary and two secondaries.
I haven't found a good tutorial for how to setup Mongo replication.
Some other questions I have are:
How does failover work? If I have three EC2 instances, each with Mongo, and the primary fails, another instance becomes the primary, but how do my clients (PyMongo and Scala Mongo) know the IP address of the new primary?
Let's say the primary goes down for 1 hour and there are 2,000 writes. When it comes back up, how does it get updated? Do I need a script for this?
I am trying to do this with flask PyMongo
I ended up testing this on my local machine. Here is what I found.
Failover is handled on the client side through the connection URI: you list all your replica set members in the Mongo URI, and when PyMongo connects it checks which one is the primary and writes to that one.
When the failed node comes back up, it syncs with the others so that all members hold the same records.
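A minimal sketch of such a URI, built in plain Python. The IP addresses, database name and replica set name are placeholders, and the helper function is my own:

```python
def replica_set_uri(hosts, replica_set, database="", port=27017):
    """Build a MongoDB URI that lists every replica set member as a seed."""
    if not hosts:
        raise ValueError("at least one host is required")
    seeds = ",".join(f"{host}:{port}" for host in hosts)
    return f"mongodb://{seeds}/{database}?replicaSet={replica_set}"


# Placeholder addresses for the three EC2 instances.
uri = replica_set_uri(["10.0.0.1", "10.0.0.2", "10.0.0.3"], "rs0", "app")
print(uri)

# With PyMongo installed, the URI would be passed straight to the client:
#   from pymongo import MongoClient
#   client = MongoClient(uri)
```

Because every member is listed, the driver can still find the current primary even if the instance that used to be primary is the one that is down.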
Readthedocs has step by step manual on setting up MongoDB cluster on different platforms, including AWS EC2:
https://mongodb-documentation.readthedocs.io/en/latest/ecosystem/tutorial/install-mongodb-on-amazon-ec2.html#deploy-a-multi-node-replica-set
To provide your clients with working mongo instance you can employ several different strategies. For example:
Set up Route53 failover. Route53 will monitor the health of the primary node and change the DNS record to point to a secondary in case of failure.
Use service discovery. Consul, etcd, ZooKeeper and doozerd are worth exploring.
A MongoDB node that fails and then comes back will receive the latest data from the other nodes; that's just what a replica set does.

MongoDB sharding: mongos and configuration servers together?

We want to create a MongoDB sharded cluster (v. 2.4). The official documentation recommends having 3 config servers.
However, the policies of our company won't allow us to get 3 extra servers for this purpose. Since we have already 3 application servers (1 web node, 2 process nodes) we are considering to put the configuration servers in the same application servers, with the mongos. Availability is not critical for us.
What do you think about this configuration? Can we face some problem or is it discouraged for some reason?
Given that availability is not critical for your use case, I would say it should be fine to place the config servers on the same application servers as the mongos.
If one of the process nodes is down, you will lose: 1 x mongos, 1 application server and 1 config server. During this downtime, the other two config servers will be read-only, which means there won't be balancing of shards, modifications to the cluster config etc., although your other two mongos should still be operational (CRUD-wise). If your web node is down, then you have a bigger problem to deal with.
If two of the nodes are down (2 process nodes, or 1 web server and process node), again, you would have bigger problem to deal with. i.e. Your applications are probably not going to work anyway.
Having said that, please consider the capacity of these nodes to be able to handle a mongos, an application server and a config server. i.e. CPU, RAM, network connections, etc.
I would recommend testing the deployment architecture in a development/staging cluster first under your typical workload and use case.
Also see Sharded Cluster High Availability for more info.
Lastly, I would recommend checking out MongoDB v3.2, which is the current stable release. The config servers in v3.2 are modelled as a replica set; see Sharded Cluster config servers for more info.