Setting up mongo replication in production - mongodb

How do you setup mongodb replication in production environments? I started using cloud formation with this template but it crashes half way. I want to setup mongo so that it has one primary and two replications.
I haven't found a good tutorial for how to setup Mongo replication.
Some other questions I have are:
How does the failover work, if I have three Ec2 instances each with mongo and the primary fails. Another instance becomes the primary but how does my client PyMongo and Scala Mongo know the IP address of the new primary.
Lets say the primary goes down for 1 hour and there are 2,000 writes. When it goes back up, how does the primary gets updated. Do I need a script for this?
I am trying to do this with flask PyMongo

I ended up testing this on my local machine here is what I found.
Failover is done by the client, in the Mongo URI you specify all your replications and when PyMongo connects to it. He checks to see which one is the primary and writes to that one.
When the database goes back up they all sync to match the same records in the all the databases.

Readthedocs has step by step manual on setting up MongoDB cluster on different platforms, including AWS EC2:
https://mongodb-documentation.readthedocs.io/en/latest/ecosystem/tutorial/install-mongodb-on-amazon-ec2.html#deploy-a-multi-node-replica-set
To provide your clients with working mongo instance you can employ several different strategies. For example:
Set up Route53 failover. Route53 will monitor health instance of primary node, and change DNS record to point to secondary in case of failure.
Use service discovery. Consul, etc, ZooKeeper and doozerd are worth exploring.
In case of failing and then coming back a mongodb node will receive latest data from other nodes — that's just what replica set does.

Related

Mongo Replica Endpoint

I have some questions about the mongo replica
mongo replica
If I make 1 primary and 2 secondary MongoDB for replication. So I have 3 endpoints to 3 different DB and my apps can only write on primary DB. what if suddenly my primary shutdown and secondary DB take over the primary. Then how to automatically change the endpoint in my apps? should I use mongos (mongo routes)? but it needs sharding if I remember correctly.
Thank you.
All nodes in a replica set work together to have identical data. Secondary nodes may lag behind the primary, but you don't get "3 different DB". There is only one database of which copies exist on each node.
All MongoDB drivers know to monitor replica set members and discover which is the primary one automatically. You need to configure some drivers to do so by providing the replica set name, others do it automatically by default when they connect to a replica set node. Look up "connecting to replica set" in your driver documentation.
In a proper connection string you will provide all three RS members, e.g.
mongodb://mongodb0.example.com:27017,mongodb1.example.com:27017,mongodb2.example.com:27017/?replicaSet=myRepl
The client will detect the PRIMARY and will use it. I guess, most drivers will re-connect automatically if the PRIMARY node changes.
Most drivers will detect the PRIMARY automatically if you provide the ReplicaSet name, i.e.
mongodb://mongodb0.example.com:27017/?replicaSet=myRepl
would connect to the PRIMARY even if it is not mongodb0.example.com. However, if mongodb0.example.com does not run, then you don't connect at all. So, it is beneficial to provide all ReplicaSet members in the connection string.
See Connection String URI Format
mongos is needed only to connect to a Sharded Cluster.

MongoDB replica with Offline Database

I would like to have Mongo on a server, then to replicate this onto a laptop.
The laptop is needed to leave the network and still be able to read/write, and once back on the network sync these changes with the primary.
At the same time I need the VM (Primary), to still be accessible (read/write).
So when each device is not talking to each other then for them to make themselves primary.
I have set up a very basic replica, Primary on a VM and the Secondary on the machine running the VM. In all examples I have seen it recommends having 3 servers for the replica, but I only need 2!
A couple of questions:
Is this possible with Mongo? If not then any suggestions!
When I turn off the Network adapter on the VM(Primary), the secondary doesn't seem to want to become the primary.
Is it possible to run 2 instance of Mongo, and then use the other instance as the 3 member of the replica.
Any advice would be great, thanks.
MongoDB always needs even number servers. in your case it should be like one primary one secondary and one arbiter. arbiter server instance is responsible for electing new server as primary when current primary goes down.

deploying mongodb on google cloud platform?

Hello all actually for my startup i am using google cloud platform, now i am using app engine with node.js this part is working fine but now for database, as i am mongoDB i saw this for mongoDB https://console.cloud.google.com/launcher/details/click-to-deploy-images/mongodb?q=mongo now when i launched it on my server now it created three instances in my compute engine but now i don't know which is primary instance and which is secondary, also one more thing as i read that primary instance should be used for writing data and secondary for reading, now when i will query my database should i provide secondary instance url and for updating/inserting data in my mongodb database should i provide primary instance url otherwise which url should i use for CRUD operations on my mongodb database ?? also after launcing this do i have to make any changes in any conf file or in any file manually or they already done that for me ?? Also do i have to make instance groups of all three instances or not ??
Please if any one of you think i have not done any research on this or its not a valid stackoverflow question then i am so sorry google cloud platform is very much new that's why there is not much documentation on it also this is my first time here in deploying my code on servers that's why i am completely noob in this field Thanks Anyways please help me ut of here guys.
but now i don't know which is primary instance and which is secondary,
Generally the Cloud Launcher will name the primary with suffix -1 (dash one). For example by default it would create mongodb-1-server-1 instance as the primary.
Although you can also discover which one is the primary by running rs.status() on any of the instances via the mongo shell. As an example:
mongo --host <External instance IP> --port <Port Number>
You can get the list of external IPs of the instances using gcloud. For example:
gcloud compute instances list
By default you won't be able to connect straight away, you need to create a firewall rule for the compute engines to open port(s). For example:
gcloud compute firewall-rules create default-allow-mongo --allow tcp:<PORT NUMBER> --source-ranges 0.0.0.0/0 --target-tags mongodb --description "Allow mongodb access to all IPs"
Insert a sensible port number, please avoid using the default value. You may also want to limit the source IP ranges. i.e. your office IP. See also Cloud Platform: Networking
i read that primary instance should be used for writing data and secondary for reading,
Generally replication is to provide redundancy and high availability. Where the primary instance is being used to read and write, and secondaries act as replicas to provide a level of fault tolerance. i.e. the loss of primary server.
See also:
MongoDB Replication.
Replication Read Preference.
MongoDB Sharding.
now when i will query my database should i provide secondary instance url and for updating/inserting data in my mongodb database should i provide primary instance url otherwise which url should i use for CRUD operations on my mongodb database
You can provide both in MongoDB URI and the driver will figure out where to read/write. For example in your Node.js you could have:
mongodb://<instance 1>:<port 1>,<instance 2>:<port 2>/<database name>?replicaSet=<replica set name>
The default replica set name set by Cloud Launcher is rs0. Also see:
Node Driver: URI.
Node Driver: Read Preference.
also after launcing this do i have to make any changes in any conf file or in any file manually or they already done that for me ?? Also do i have to make instance groups of all three instances or not ??
This depends on your application use case, but if you are launching through click and deploy the MongoDB config should all be taken care of.
For a complete guide please follow tutorial : Deploy MongoDB with Node.js. I would also recommend to check out MongoDB security checklist.
Hope that helps.

mongodb Single Point of Failure

I am aware that mongodb has a master-slave architecture.
Therefore, I was thinking that the master would be the single point of failure in mongoDB since it takes care of all the requests and sends it to the slave nodes. However, when the master fails, then a new master is reelected from the slaves. Therefore I need some clarification on where the single point of failure lies.
Does mongoDB have a single point of failure? Is it in the master node?
Thanks,
MongoDB can be set up in a way that there is no single point of failure (at least none specific to MongoDB).
When you set up replication as suggested (which includes primary, secondary and an arbiter on a 3rd server), the secondary will take the role of the primary when it goes down. Keep in mind that this only works when the applications know both the primary and the secondary (how to make it aware depends on the driver).
When you have a sharded cluster, the mongo router process (mongos) and the config servers becomes additional possible points of failure, but you can also set up reduntant routers and config servers. To send the clients to another mongos server when theirs goes down, you need a 3rd party load-balancing solution.
For a proper production MongoDB setup with clustering, MongoDB Inc. suggests:
At least 2 mongos routers
Exactly 3 config servers
3 servers per shard (primary, secondary and arbiter), where the arbiters do not necessarily need dedicated servers and can share hardware with the routers, config servers, members of a different replica-set or app servers.

One Shard with Multiple Mongos

Can we have this type of configuration?
Two server running the following things each-
1.Mongo Config Server.
2.Mongo Router.
3.Application.
Total 4 EC2 servers-
First Server-Running the web application & mongos.
Second Server-Running the web application & mongos.
Third Server-Running the First Shard with complete DB(Say for
example Demo).
Forth Server-Running The Second Shard with complete DB(Say for
example Demo).
Both the Mongos should point to one shard named Shard1?
Yes, You can have multiple mongos instances running against a single shard. Think of the mongos instances as clients for the sharded cluster which have to run as a daemon process in order to keep metadata and heartbeats up to date.
Edit: as for having a complete DB, this is only possible for a single DB. You can have one DB on shard1 and the other DB on shard2, for example. but you can never have a single complete DB on two shards. To achieve the goal of having db1 on shard1 and db2 on shard2, you simply make the respective shard the primary shard of the respective database and don't shard any collection. Please read the docs for the movePrimary command for details.
A bit OOT:
However, running a single config server is strongly advised against, and for a good reason. If the single config server goes down or gets corrupted, your cluster will be impossible to use - and recreating the sharded cluster will not an easy task to be done. And it's going to be a lengty process. So please, use three config servers.*