I know MongoDB suggests a minimum of 3 members for a replica set.
Can I use two servers to run MongoDB with 4 replica set members in order to prevent write failure when one node is down?
My idea is to run two instances on each server to work around the minimum:
MongoDB server 1
Member 1 (Primary)
Member 2 (Secondary)
MongoDB server 2
Member 3 (Secondary)
Member 4 (Arbiter)
If one server goes down, the set still has two members running.
Can anyone comment on my idea?
I would like to know whether there are any issues/risks to consider.
Thanks!
This is a bad idea. Your suggested configuration would be counter-productive for two main reasons:
Running two instances of MongoDB on the same hosts will leave them competing for memory, CPU, disk I/O, etc., so both will run slowly and awkwardly.
If one of the hosts goes down, the two nodes on the other will not be able to continue the replica set: with only two of four nodes running, they lack the majority necessary to elect a primary.
MongoDB is designed to run across multiple hosts; even if you could find a way to shoe-horn a functioning replica set onto only two hosts, you would be working against that design and struggling to make it work effectively.
Really, your best option is to run a single mongod instance on each of your two hosts, and to run an arbiter on a third (separate) host; that normal 3-host configuration is an effective and straightforward way of achieving both data replication and high availability.
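A minimal sketch of initiating that 3-host configuration from the mongo shell, assuming each mongod was started with --replSet rs0 (the hostnames here are placeholders):

    rs.initiate({
      _id: "rs0",
      members: [
        { _id: 0, host: "host1.example.com:27017" },                      // data-bearing
        { _id: 1, host: "host2.example.com:27017" },                      // data-bearing
        { _id: 2, host: "arbiter.example.com:27017", arbiterOnly: true }  // arbiter on the third host
      ]
    })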
Please read https://docs.mongodb.com/manual/replication/
An even number of members in a set is not recommended. The only reason to add an arbiter is to resolve this: it increases the number of members by 1 to make it odd. It makes no sense to add 2 arbiters at all.
Similarly, there is no point in putting an arbiter on the same server as any other member. It is quite lightweight, so you can co-locate it with an application server, for example.
With 2 DB servers, the best replica set configuration would be:
Mongodb 1
Master
Mongodb 2
Secondary
Any other server
Arbiter
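If the primary and secondary are already running as a set, a sketch of adding the arbiter from a shell connected to the primary (the hostname is a placeholder):

    // Adds a voting, non-data-bearing member on the third server
    rs.addArb("arbiter.example.com:27017")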
The simple answer is: no, you cannot do it.
The reason is "majority". If you lose one of those two servers, you don't have a majority of votes online.
This is the reason why 3 is the minimum. It can be 3 data-bearing nodes, or 2 data-bearing nodes and one arbiter. Each of them has one vote, and if you lose one of them, the replica set still has 2/3 of the votes, which is a majority. The same goes for 3/5 or 4/7.
1/2 (or 2/4) is not a majority.
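The arithmetic behind this, as a quick sketch in shell-style JavaScript:

    // Votes needed for a majority of an N-member set: floor(N/2) + 1
    function majority(n) { return Math.floor(n / 2) + 1; }
    majority(2)  // 2 -> lose 1 of 2 and the survivor's 1 vote < 2: no primary
    majority(3)  // 2 -> lose 1 of 3 and the survivors' 2 votes >= 2: still a primary
    majority(4)  // 3 -> lose 2 of 4 and the survivors' 2 votes < 3: no primary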
Related
I came across the MongoDB official site explaining why a replica set should have an odd number of members. I also came across the term Arbiter on the same site; based on my understanding, an arbiter will not be elected as primary, but it does participate in elections (from https://docs.mongodb.com/manual/core/replica-set-arbiter/).
There is also a post related to arbiters, Why do we need an 'arbiter' in MongoDB replication?, which relates this to the CAP theorem and complicates things further.
First of all, why do we need an odd number of members? And can someone explain what this arbiter is and what its role in a replica set is, in plain English?
Thanks in advance.
In short: it is to stop the two normal nodes of the replica set getting into a split-brain situation if they lose contact with each other.
MongoDB replica sets are designed so that, if one or more members go down or lose contact, the other members can keep going as long as between them they have a majority. The majority clause is important: without it, you might have a situation where the network is split in two, the nodes on each side of the partition each think they're still carrying on the replica set, and they end up with different sets of data.
So to avoid the split-brain problem, the nodes of a replica set will not continue if they can't command an absolute majority. Suppose, for example, you have a replica set with just two nodes.
If they lose communication, the outcome is symmetrical:
Each one will reason the same way:
realise it has lost communication with the other
assess whether it is possible to keep the replica set going
realise that 1 node (out of 2) does not constitute a majority
revert to Secondary mode
The difference an Arbiter makes
If there is a third node, then even if the two main nodes lose contact with each other then there will still be one of them in contact with the arbiter. This allows the two main nodes to make different decisions, and keep the replica set going while avoiding the split-brain problem.
Consider the following example of a 3-node replica set: data-bearing nodes A and B, plus an arbiter.
Whichever way the network partition goes, one node will still be in contact with the arbiter; suppose, for example, that node A is cut off while node B can still reach the arbiter.
Node A will:
realise it can contact neither node B nor the arbiter
assess whether it is possible to keep the replica set going
realise that 1 node (out of 3) does not constitute a majority
revert to Secondary mode
Whereas node B is able to react differently:
realise it cannot contact node A, but still has contact with the arbiter
assess whether it is possible to keep the replica set going
realise that 2 nodes (out of 3) do constitute a majority
take over as Primary
This also illustrates how you should deploy an arbiter to get that benefit:
try to put the arbiter on a system independent of both data-bearing nodes, to maximise the chance that it can still communicate with either of them during network problems
it doesn't need to store data, so you don't need high-spec hardware
Just 1 arbiter is enough to break the deadlock; you don't get any benefit from multiple arbiters
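To see these roles at runtime, a sketch using the shell's rs.status() helper (hostnames are placeholders):

    // Print each member's address and current state
    rs.status().members.forEach(function (m) {
        print(m.name + "  " + m.stateStr);
    });
    // typical output for this example:
    //   nodeA.example.com:27017  PRIMARY
    //   nodeB.example.com:27017  SECONDARY
    //   arbiter.example.com:27017  ARBITER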
Take the example of a 2-member replica set: in the event of a network partition, i.e., the 2 members losing touch with each other, who gets to become the primary? There would be a tie, and a need for a tie-breaker. That is not the case with a 3-member replica set: the partition that contains two nodes wins, and one of those two becomes primary. This is the basis of the requirement for an odd number of nodes. As for the arbiter, it happens to be lightweight, so one can save money by running it on a smaller machine: we do not expect it to hold any data, we just need it to be present to vote for a primary.
I am looking for best practices to help me decide how many replica set members MongoDB needs. I am aware of the MongoDB docs that talk about things like having an odd number of nodes, when the need for an arbiter arises, etc.
In our case the read load won't be high enough for reads to become a bottleneck, nor are we targeting sharding at this moment. However, we are going to run Mongo in a Docker swarm, and there could be multiple instances of certain services trying to write. Our swarm cluster most likely won't be very big either.
So how do I find logical answers to these:
Why not create one local mongo instance per physical node and tie it to that?
For any number of physical nodes, as long as read/write is not a bottleneck, 3 or 5 replicas are always going to be ideal for fault recovery and high availability. But why is 3 or 5 a good number? Why not 7 if I have, say, 10 physical nodes?
I am trying to find some good reads to be able to decide on how to arrive at a number. Any pointers?
To give you an answer: it all depends on many criteria.
What is your budget?
How big is your data?
What do you want to use your replica sets for?
etc...
As an example, in my case:
We have 3 Data Centers across the country
One of them is very small
We found our sweet spot in terms of number of nodes at 5:
1 Primary + 1 Secondary in DC1
1 Arbiter in DC2
2 secondaries in DC3
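A sketch of what that 5-member layout could look like as a replica set configuration (the hostnames and the priority value are assumptions, not the poster's actual settings):

    rs.initiate({
      _id: "rs0",
      members: [
        { _id: 0, host: "dc1-a.example.com:27017", priority: 2 },          // DC1, preferred primary
        { _id: 1, host: "dc1-b.example.com:27017" },                       // DC1, secondary
        { _id: 2, host: "dc2-arb.example.com:27017", arbiterOnly: true },  // DC2 (the small one), arbiter
        { _id: 3, host: "dc3-a.example.com:27017" },                       // DC3, secondary
        { _id: 4, host: "dc3-b.example.com:27017" }                        // DC3, secondary
      ]
    })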
I currently have 2 physical servers and one arbiter configured as a replica set. I would like to try sharding with this configuration. I know it is possible to run two mongod instances on the same server, one as master of replica set 1, the other as slave of replica set 2: can these two processes (master of replica set 1 and slave of replica set 2) point to the same database? Isn't there a danger of some sort of loop?
Hmm, I am unsure whether you know what replication really is.
All members in the replica set will share the same database(s), they will replicate the database(s) between them and maintain them.
Replicas are exactly that, they are copies of each other, including database.
I suggest you read: http://docs.mongodb.org/manual/replication/
Edit
There could be another meaning here: pointing to the same files, since you mention running the master and slave on the same node.
First off, running two replica set members on the same node is pointless. You get no benefit, and if anything a performance problem, since the disk I/O is now taking double the strain it normally would.
So I would begin by saying that your idea would be a really bad design even if it were feasible, which it is not: the physical data files cannot be locked by two mongod processes at once.
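Concretely, each mongod must own its data directory exclusively; a sketch with placeholder paths and ports:

    # Two members on one box need separate dbpaths and ports; a second
    # mongod pointed at /data/rs1 would refuse to start because of the
    # mongod.lock file already held in that directory.
    mongod --replSet rs1 --port 27017 --dbpath /data/rs1
    mongod --replSet rs2 --port 27018 --dbpath /data/rs2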
I intend to check MongoDB performance by running an application on 8 servers.
1. Here http://docs.mongodb.org/manual/tutorial/deploy-shard-cluster/ I read:
In production deployments, you must deploy exactly three config server
instances, each running on different servers to assure good uptime and
data safety. In test environments, you can run all three instances on
a single server.
What if I want to use optimally the resources of 8 servers (+ 1 dedicated server for the application)? Do I start 1 config server instance per server?
2. I see here http://docs.mongodb.org/manual/core/replication-introduction/
that using replica sets with 3 mongod instances (with each mongod instance on a different server) is the way to go? Is this the optimal scenario when it comes to having 8 servers?
3. How many replica sets would I use when I have 8 servers? 1 per server (8 servers == 8 replica sets == 3 mongod instances per server, from different replica sets)?
4. Is there any best-practices documentation regarding this type of optimization?
Kind Regards,
Despot
What if I want to use optimally the resources of 8 servers (+ 1 dedicated server for the application)?
That's not an optimal way to plan; there is no way you know that you NEED 7 shards for your data.
Do I start 1 config server instance per server?
No, you are hardcoded to three.
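A sketch of that three-config-server layout, with placeholder hostnames and the default config server port (note that newer MongoDB versions run config servers as a replica set instead):

    # One config server on each of three separate machines
    mongod --configsvr --port 27019 --dbpath /data/configdb
    # mongos pointed at all three
    mongos --configdb cfg1.example.com:27019,cfg2.example.com:27019,cfg3.example.com:27019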
Is this the optimal scenario when it comes to having 8 servers?
No, it is the minimum; you would ideally want more members, especially one bridging partitions, ensuring all the while that you have an odd number of nodes so that one side of a partition can still hold a majority.
Normally your replica set would also include at least one extra member intended for backups, typically using a slaveDelay of maybe a day.
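A sketch of adding such a backup member from the shell (the hostname is a placeholder; slaveDelay is the legacy option name, renamed secondaryDelaySecs in MongoDB 5.0):

    // Delayed members should be priority 0 and hidden, so they are
    // never elected primary and never serve application reads.
    var cfg = rs.conf();
    cfg.members.push({
        _id: 5,                            // any unused member _id
        host: "backup.example.com:27017",
        priority: 0,
        hidden: true,
        slaveDelay: 86400                  // stay one day behind the primary
    });
    rs.reconfig(cfg);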
How many replica sets would I use when I have 8 servers?
Assuming (guessing) you want to use 7 shards you would have 7 replica sets, one per shard.
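Registering those shards with mongos would then look roughly like this (the replica set and host names are placeholders):

    // Run against mongos; one call per shard's replica set
    sh.addShard("shard1/shard1-a.example.com:27017")
    sh.addShard("shard2/shard2-a.example.com:27017")
    // ...and so on for the remaining shards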
3 mongod instances per server from different replica sets
That would be a bad idea. You do not want to place replica set members on the same server as each other; you might as well be using no replication.
I would seriously plan more and check if you really need 7 shards, I highly doubt it.
Consider the following setup:
There are 2 physical servers which are set up as a regular MongoDB replica set (including an arbiter process, so automatic failover will work correctly).
Now, as far as I understand, most actual work will be done on the primary server, while the secondary mostly just works to keep its dataset in sync.
Would it be reasonable to introduce sharding into this setup by creating another replica set on the same 2 servers, so that each server runs one mongod process as a primary and one as a secondary?
The expected result would be that both servers share the workload of actual queries/inserts while both are up. If one server fails, the whole setup should fail over gracefully and keep running until the failed server is restored.
Are there any downsides to this setup, apart from the overall overhead in setup and number of processes (mongos/config servers/arbiters)?
That would definitely work. I'd asked a question in the #mongodb IRC channel a bit ago as to whether or not it was a bad idea to run multiple mongod processes on a single machine. The answer was "as long as you have the RAM/CPU/bandwidth, go nuts".
It's worth noting that if you're looking for high-performance reads, and don't mind writes being a bit slower, you could:
Do your writes in "safe mode", where the write doesn't return until it's been propagated to N servers (in this case, where N is the number of servers in the replica set, so all of them)
Set the driver-appropriate flag in your connection code to allow reading from slaves.
This would get you a clustered setup similar to MySQL - write once on the master, but any of the slaves is eligible for a read. In a circumstance where you have many more reads than writes (say, an order of magnitude), this may be higher performance, but I don't know how it'd behave when a node goes down (since writes may stall trying to write to 3 nodes, but only 2 are up, etc - that would need testing).
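In today's terms, "safe mode" is a write concern; a sketch of both settings from the mongo shell (the orders collection is a placeholder):

    // Wait until all 3 members of a 3-member set acknowledge the write;
    // this is the mode that can stall if a member is down.
    db.orders.insertOne({ sku: "abc", qty: 1 }, { writeConcern: { w: 3 } })
    // Allow this query to be served by a secondary
    db.orders.find().readPref("secondaryPreferred")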
One thing to note is that while both machines are up, your queries are being split between them. When one goes down, all queries will go to the remaining machine thus doubling the demands placed on it. You'd have to make sure your machines could withstand a sudden doubling of queries.
In that situation, I'd reconsider sharding in the first place, and just make it an un-sharded replica set of 2 machines (+1 arbiter).
You are missing one crucial detail: if you have a sharded setup with only two physical nodes and one dies, all your data is gone. This is because you don't have any redundancy below the sharding layer (the recommended way is for each shard to be a replica set).
What you said about the replica set however is true: you can run it on two shared-nothing nodes and have an additional arbiter. However, the recommended setup would be 3 nodes: one primary and two secondaries.
http://www.markus-gattol.name/ws/mongodb.html#do_i_need_an_arbiter