MongoDB replica set initial sync - mongodb

Doing an initial sync to a secondary is a very time-consuming process. I haven't found anywhere in the MongoDB docs whether the primary can accept write operations during the initial sync, or whether that is not recommended. Is it safe to keep the primary operational (for writes) during this process?
Thanks

In order for a primary to accept a write, there has to be at least a quorum of voting replica set members available to vote, and they have to vote for the same primary. For instance, for a 3-member replica set you need at least 2.
A secondary that is in initial sync should be in the RECOVERING state and, according to the documentation (http://docs.mongodb.org/manual/reference/replica-states/), can vote:
3 RECOVERING: Can vote. Members either perform startup self-checks, or transition from completing a rollback or resync.
Now, should you? I think that depends on how many members were in the set before. If you've been running with 2 data nodes and 1 arbiter, whether running with only 1 data node for a while is acceptable is something only you can answer: yes, it's riskier, but what's your alternative, being down completely?
If you have 3 data nodes and 1 is down for an initial sync, I don't see much of an issue unless you have very high data redundancy needs.
If you are starting from only 1 node and you are transitioning into a replica set, well, you are no worse off than you were before.
Above all else, always make certain you have at least 3 members in your replica set, preferably with at least 2 data nodes, and generally speaking an odd number of voters.
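If you want to sanity-check this on your own deployment, a minimal mongo shell sketch (host names will differ) that lists each member's current state and its voting configuration could look like this:

    // Ask the primary for every member's current state (e.g. PRIMARY, SECONDARY, STARTUP2, RECOVERING).
    rs.status().members.forEach(function (m) {
        print(m.name + "  state=" + m.stateStr + "  health=" + m.health);
    });
    // The voting configuration lives in the replica set config.
    rs.conf().members.forEach(function (m) {
        print(m.host + "  votes=" + m.votes + "  priority=" + m.priority);
    });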

Related

How MongoDB detects majority in PSA architecture?

Consider a replica set with 3 nodes (2 data nodes and one arbiter, i.e. PSA). When one of my data nodes goes down for some reason and I bring it back, it is in state STARTUP2 while it syncs with the primary. During this time I lose my change stream, because although my replica set has 2 data nodes, I don't have a majority of nodes to read from.
How can I handle this issue?
I also read this MongoDB doc. Is it possible to set the primary node's priority value higher than the secondary node's (the one that is syncing itself with the primary)? Can I have a majority by doing this even when my secondary node is in the STARTUP2 state?
There are technically two types of majority. As I call them, they're the "election majority" and the "data majority".
Arbiters are supposed to help with the "election majority": they help maintain primary availability in a PSA architecture should the S go down. However, they are not part of the "data majority".
"Data majority", in contrast, are both for voting and acknowledging majority-read and majority-write.
Change streams by design only return documents that have been committed to the "data majority" of voting nodes, because a write that has propagated to them will not be rolled back. It would be confusing if a change stream declared that a document was written, the write was then rolled back, and the stream had to issue a "no wait, scratch that, the write didn't happen".
Thus, by their nature, arbiters are not compatible with majority-read and majority-write scenarios such as change streams or transactions. However, arbiters still have their place in a replica set, provided you know what to expect from them.
See What is the default mongod write concern in which version? for a more complete explanation of write concerns and the effect of having arbiters.
A secondary in STARTUP2 is not a secondary yet. It may vote in elections, but it won't acknowledge majority writes since it's still starting up.
In terms of change streams, since in a PSA architecture the "data majority" is practically only the PS part of the PSA, none of the data-bearing nodes can be offline if majority reads and writes are to be maintained.
The best solution is to replace the arbiter with an actual data-bearing node. This way, you can have majority writes and majority reads, and you can have one node down and still maintain a majority.
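If you do go that route, the change is a standard remove/add reconfiguration; a rough mongo shell sketch follows, where the host names are placeholders for your own servers:

    // Run against the primary: drop the arbiter from the set...
    rs.remove("arbiter.example.net:27017");
    // ...and add a third data-bearing member in its place.
    rs.add({ host: "data3.example.net:27017", priority: 1, votes: 1 });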

Requires simple explanation on Arbiter's role in a given mongoDB replica set

I came across the MongoDB official site explaining having an odd number of members in a replica set. I also heard of the term Arbiter from the same site; based on my understanding, it will not be elected as primary but it does participate in elections (from https://docs.mongodb.com/manual/core/replica-set-arbiter/).
There is also a post related to the Arbiter, Why do we need an 'arbiter' in MongoDB replication?, which then relates to the CAP theorem, which makes things even more complicated.
First of all, why do we need to make the number of members odd? Also, can someone explain to me what this Arbiter is and what its role is in a given replica set, in simple layman's English?
Thanks in advance.
In short: it is to stop the two normal nodes of the replica set getting into a split-brain situation if they lose contact with each other.
MongoDB replica sets are designed so that, if one or more members go down or lose contact, the other members are able to keep going as long as between them they have a majority. The majority clause is important: without it, you might have a situation where the network is split in two, the nodes on each side of the partition think they're still carrying on the replica set, and they end up with different sets of data.
So to avoid the split-brain problem, the nodes of a replica set will not continue if they can't command an absolute majority. Take the example of a replica set with just two nodes.
If they lose communication, the outcome is symmetrical:
Each one will reason the same way:
realise it has lost communication with the other
assess whether it is possible to keep the replica set going
realise that 1 node (out of 2) does not constitute a majority
revert to Secondary mode
The difference an Arbiter makes
If there is a third node, then even if the two main nodes lose contact with each other, one of them will still be in contact with the arbiter. This allows the two main nodes to make different decisions, keeping the replica set going while avoiding the split-brain problem.
Consider a 3-node replica set consisting of two data nodes, A and B, plus an arbiter.
Whichever way the network partition goes, one data node will still be in contact with the arbiter; suppose it is node B.
Node A will:
realise it can contact neither node B nor the arbiter
assess whether it is possible to keep the replica set going
realise that 1 node (out of 3) does not constitute a majority
revert to Secondary mode
Whereas node B is able to react differently:
realise it cannot contact node A, but still has contact with the arbiter
assess whether it is possible to keep the replica set going
realise that 2 nodes (out of 3) do constitute a majority
take over as Primary
This also illustrates how you should deploy an arbiter to get that benefit (see the one-line sketch after this list):
try to put the arbiter on a system independent of both the data-bearing nodes, to maximise the chance of it still being able to communicate with either throughout network problems
it doesn't need to store data, so you don't need high-spec hardware
Just 1 arbiter is enough to break the deadlock; you don't get any benefit from multiple arbiters
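For completeness, adding a single arbiter to an existing set is a one-liner in the mongo shell; the host name below is only a placeholder:

    // Run against the current primary; the arbiter stores no data, it only votes.
    rs.addArb("arbiter.example.net:27017");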
Take the example of a 2-member replica set: in the event of a network partition, i.e. the 2 members lose touch with each other, who gets to become the primary? There will be a tie, and a need for a tie-breaker. That would not be the case with a 3-member replica set: the partition that contains two nodes will win, and one of them will become primary. That is the basis of the requirement for an odd number of nodes in a replica set. As for an arbiter, it happens to be lightweight, so I guess one can save money by using a smaller machine, since we do not expect it to hold any data; we just need it to be present to vote for a primary.

Two nodes MongoDB replica set without arbiter

Is it possible to create a MongoDB replica set consisting of only 1 primary and 1 secondary member?
I would like to have a delayed replica set member that copies data from the primary with a delay of 24 hours. I know I can put an arbiter on one of the servers (primary or secondary; I know this is not advised, but my only wish is to run this configuration on two servers) and it would run fine, but I want to know if it is possible to leave the arbiter out completely.
Short answer: don't.
Long answer: the way automatic failover works in MongoDB is that a replica set needs a qualified majority to successfully elect a new primary. Delayed members do have votes in elections. So if either of your nodes fails, the replica set finds out that it doesn't have this majority, and the current primary steps down even though it didn't fail itself. So what you essentially do is double the chance of making your replica set fail. An arbiter is a very cheap process in terms of RAM usage, CPU, and even disk space when run with --smallfiles --nojournal --noprealloc (or the equivalent options set in the config file). Note that the mentioned options are safe to use, since an arbiter essentially only checks the heartbeats of the data-bearing nodes. You could put the arbiter on the application server, for example.
Disclaimer: the following procedure is strongly discouraged. Proceed at your own risk.
You could set the votes of the delayed server to 0. This way, the undelayed node will call for an election in case the delayed member fails, come to the conclusion that it is the only voting node of the replica set online and that it has the majority of votes (1/1), and it will continue to work as expected. This course of action needs some attention, as you will have an even number of votes again if you add a member to the replica set later, which makes it necessary to reconfigure the replica set. It also has serious implications in network fragmentation scenarios. Again: use at your own risk.
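A minimal sketch of that reconfiguration in the mongo shell; the member index is an assumption about your layout, and note that a member with votes: 0 must also have priority: 0 (which a delayed member usually has anyway):

    var cfg = rs.conf();
    // Assume members[2] is the delayed secondary; adjust the index to match your config.
    cfg.members[2].votes = 0;
    cfg.members[2].priority = 0;   // non-voting members must have priority 0
    rs.reconfig(cfg);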
Yes, it is possible but not recommended. The caveat of this approach is that there is no automatic failover.
If your primary goes down, you will have to manually make the other server the primary.
If you are keeping your secondary only as a mirror of your primary and you are fine with manual failover, then it should work for you.
More info here:
http://openmymind.net/Does-My-Replica-Set-Need-An-Arbiter/
Yes, you can, and all you really need to do is set the member so that it is not eligible to become primary.
There is documentation on how to make sure a member cannot be elected as primary here: http://docs.mongodb.org/manual/tutorial/configure-secondary-only-replica-set-member/
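That tutorial essentially boils down to setting the member's priority to 0 so it can never be elected; a rough mongo shell sketch, where the member index is an assumption:

    var cfg = rs.conf();
    // Assume members[1] is the secondary that should never become primary.
    cfg.members[1].priority = 0;
    rs.reconfig(cfg);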
In this case, the best option is to add an arbiter. I tried with votes before, but on a 2-node replica set you can have some issues with sync.

Why do we need an 'arbiter' in MongoDB replication?

Assume we set up MongoDB replication without an arbiter. If the primary is unavailable, the replica set will elect a secondary to be primary. So I think it's a kind of implicit arbiter, since the replica set will elect a primary automatically.
So I am wondering why we need a dedicated arbiter node. Thanks!
I created a spreadsheet to better illustrate the effect of Arbiter nodes in a Replica Set.
It basically comes down to these points (a small arithmetic sketch follows the list):
With an RS of 2 data nodes, losing 1 server brings you below your voting minimum (which is "greater than N/2"). An arbiter solves this.
With an RS of even numbered data nodes, adding an Arbiter increases your fault tolerance by 1 without making it possible to have 2 voting clusters due to a split.
With an RS of odd numbered data nodes, adding an Arbiter would allow a split to create 2 isolated clusters with "greater than N/2" votes and therefore a split brain scenario.
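As a quick illustration of the "greater than N/2" arithmetic behind these points (plain JavaScript that also runs in the mongo shell, not a MongoDB API):

    for (var n = 2; n <= 5; n++) {
        var majority = Math.floor(n / 2) + 1;
        print(n + " voting members -> " + majority + " votes needed for a majority, tolerates " + (n - majority) + " down");
    }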
Elections are explained [in poor] detail here. In that document it states that an RS can have 50 members (an even number) but only 7 voting members. I emphasize "states" because it does not explain how it works. To me it seems that if a split happens with 4 members (all voting) on one side and 46 members (3 voting) on the other, you'd rather have the 46 elect a primary and the 4 become a read-only cluster. But that's exactly what limited voting prevents: in that situation you will actually have a 4-member cluster with a primary and a 46-member cluster that is read-only. Explaining how that makes sense is out of the scope of this question and beyond my knowledge.
It's necessary to have an arbiter in a replica set for the reasons below:
Replication is more reliable if the replica set has an odd number of members. If there is an even number of members, it's better to add an arbiter to the replica set.
Arbiters do not hold any data; they are there just to vote in elections when there is a node failure.
An arbiter is a lightweight process and does not consume many hardware resources.
Arbiters only exchange user credential data between the members of the replica set, and that traffic is encrypted.
Votes during elections, heartbeats, and configuration data are not encrypted when communicated between the members of the replica set.
It is better to run the arbiter on a separate machine, rather than alongside one of the replica set members, to retain high availability.
Hope this helps!
This really comes down to the CAP theorem, whereby if there is an equal number of servers on either side of the partition, the database cannot maintain CAP (Consistency, Availability, and Partition tolerance). An Arbiter is specifically designed to create an "imbalance", or majority, on one side so that a primary can be elected in that case.
If you get an even number of nodes on either side MongoDB will not elect a primary and your set will not accept writes.
Edit
By either side I mean, for example, 2 on one side and 2 on the other. My English wasn't easy to understand there.
So really what I mean is both sides.
Edit
Wikipedia presents quite a good case for explaining CAP: http://en.wikipedia.org/wiki/CAP_theorem
Arbiters are an optional mechanism to allow voting to succeed when you have an even number of mongods deployed in a replica set. Arbiters are lightweight and meant to be deployed on a server that is NOT a dedicated mongo replica, i.e. a server whose primary role is some other task, like a redis server. Since they're light, they won't interfere (noticeably) with the system's resources.
From the docs:
An arbiter does not have a copy of the data set and cannot become a primary. Replica sets may have arbiters to add a vote in elections for primary. Arbiters allow replica sets to have an uneven number of members, without the overhead of a member that replicates data.
http://docs.mongodb.org/manual/core/replica-set-arbiter/
http://docs.mongodb.org/manual/core/replica-set-elections/#replica-set-elections

MongoDB rollback in replica set

Suppose you have a three node replica set. Node 1 is the primary. Node 2 is a secondary, Node 3 is a secondary running with a delay of 10 seconds. All writes to the database are issued with w=majority and j=1 (by which we mean that the getLastError call has those values set).
A write operation (could be insert or update) is initiated from your application at time=0. At time=5 seconds, the primary, Node 1, goes down for an hour and another node is elected primary.
Will there be a rollback of data when Node 1 comes back up? Choose the best answer.
Always yes
Always no
Maybe, it depends on whether Node 3 has processed the write.
Maybe, it depends on whether Node 2 has processed the write.
Any help would be greatly appreciated.
I am going to change my answer to 4; however, it should be 2 with w=majority. You could have an edge case whereby a wtimeout on an operation is returned and the operation did not get acked by the majority of the set. These problems should be very rare or almost never happen, but they are something to keep in mind.
Since a majority of the nodes (1 & 2) will ack the write, if node 1 goes down, node 2 should have its operations and be up to speed; as such, node 1 should not need to roll back to node 2's state. Instead, node 1 will play catch-up.
The journal is not so important for determining whether a rollback will happen or not.
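To make the write-concern side of this concrete, here is a hedged mongo shell sketch (the collection name and document are made up; this is the modern equivalent of the getLastError settings described in the question):

    db.orders.insertOne(
        { item: "abc", qty: 1 },
        { writeConcern: { w: "majority", j: true, wtimeout: 5000 } }
    );
    // If this acknowledgement comes back, a majority of data-bearing members have the write,
    // so it will not be rolled back when the primary later fails over.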
Please read this relevant excerpt from MongoDB document: "A rollback does not occur if the write operations replicate to another member of the replica set before the primary steps down and if that member remains available and accessible to a majority of the replica set."
I think this is a question from a MongoDB exam, but the answer is easy to see:
Maybe, it depends on whether Node 2 has processed the write.