How to turn off Mongo (in Replica Set) properly without DB down? - mongodb

My server setup is like this:
2 servers. MongoDB runs as a replica set across both servers; each server is one node.
Then I have my Node.js server connecting to MongoDB.
What happened was: when I killed the secondary server (shut the server down), MongoDB on the primary was still up, but the Node.js server then had connection issues with MongoDB. Even after I added the server back, it didn't work. I use mongoose and connect-mongo.
So, what happened? How do I shut down a Mongo node properly?

If you have a replica set with 2 nodes, when one node goes down the other will demote itself to secondary. If you aren't connecting with slaveOk true, then you won't be able to read (and in either case you won't be able to write).
This is a safety measure imposed by MongoDB, which requires that a majority (meaning more than half) of a replica set's members be able to see one another in order to ensure that a primary can be safely elected. If a majority cannot be seen, the nodes in the minority cannot know whether the "other half" have elected a primary. Having two primaries at the same time would be Very Bad (TM), as that could lead to conflicting updates.
In situations where you only want to run two nodes, you can also run an arbiter to break ties in the case that one node goes down or becomes otherwise invisible to the replica set. An arbiter is a normal mongod process, but does not store any data -- essentially it only participates in elections, and is idle otherwise. In a replica set with 2 "normal" nodes and one arbiter, either one of the two data-holding nodes can go down without losing a majority.
For more, see the MongoDB documentation on replica sets and the documentation on arbiters.
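The majority rule described above can be sketched in a few lines of JavaScript. This is only an illustration of the arithmetic, not MongoDB's implementation; the function names majorityOf and canElectPrimary are hypothetical, and the sketch assumes every member (including an arbiter) has exactly one vote.

```javascript
// Hypothetical helpers illustrating the majority rule; not part of MongoDB.

// Smallest number of votes that is strictly more than half of totalVotes.
function majorityOf(totalVotes) {
  return Math.floor(totalVotes / 2) + 1;
}

// A primary can be elected (or remain primary) only while a majority of
// the voting members can see one another.
function canElectPrimary(totalVotes, reachableVotes) {
  return reachableVotes >= majorityOf(totalVotes);
}

// Two data nodes, one goes down: 1 of 2 votes is not a majority.
console.log(canElectPrimary(2, 1)); // false

// Two data nodes plus an arbiter, one data node goes down: 2 of 3 votes.
console.log(canElectPrimary(3, 2)); // true
```

With only two voting members, losing either one leaves the survivor short of a majority, which matches the behaviour described in the question.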

If your primary is still primary after you take down the secondary, it's a Node.js driver issue. In any case, you should always have an arbiter when you have an even number of replica nodes; the "why" is well documented in MongoDB's docs.
In case this is a Node.js issue, which version of node-mongodb-native are you using? I had some different replica set issues 2 months ago, but they have been fixed in the latest versions. The last replica set issue in the driver was closed on the 9th of September; you should give it a try with the latest tagged version (V0.9.6.18 as I'm writing this).

Related

Spring Boot Mongo DB Replica Set not working as expected [duplicate]

I am investigating using MongoDB ReplicaSet for high availability.
But I just discovered that in a ReplicaSet with 3 nodes, if the PRIMARY mongod is the only one left (that is, the 2 other mongod instances died or were shut down), then after several seconds it switches its role to SECONDARY and no longer accepts writes. That makes a Replica Set worth less than a single instance.
I know and understand PRIMARY election, but if the PRIMARY role is fixed to a server (by setting its priority to, say, 10) and (for example, due to network problems) the other servers become inaccessible, why does the main server just give up?!
Tested with 2.4.8 on Windows (mongodb-win32-x86_64-2008plus-2.4.8) and Linux (CentOS) and 2.0.x on Linux
BOUNTY STARTED:
If the replica set gives up when the PRIMARY feels alone, what are the alternatives to ensure 100% availability? Or maybe some special configuration is needed for this case. The current implementation makes a ReplicaSet fragile in case of network problems.
UPDATED:
Alas, I did not mention earlier the scenario where #3 goes down (PRIMARY and SECONDARY are left)
and then, after a while, SECONDARY goes down. Then PRIMARY really just "gives up", because it is already known that #3 has been unavailable for some time. This was actually tested in my test environment.
var rsconfig = {"_id":"rs4","members":[{"_id":0,"host":"localhost:27041","priority":10},{"_id":1,"host":"localhost:27042"},{"_id":2,"host":"localhost:27043","arbiterOnly":true}]}
printjson(rsconfig)
rs.initiate(rsconfig)
We initially thought of putting SECONDARY and #3 (that is, the ARBITER) on the same server,
but because of the question in the title, we cannot use such a configuration.
Thanks to Alan Spencer for first explaining the logic that MongoDB takes.
This is expected: since the majority of the members are down, MongoDB does not assume the last remaining member is consistent.
When you have a majority of the members down there are a couple of options: http://docs.mongodb.org/manual/tutorial/reconfigure-replica-set-with-unavailable-members/
You say that when the primary is cut off from the other two nodes it should stay up, otherwise write availability is lost, but that's not necessarily the case. If the other two nodes are actually up and on the other side of the network partition, then they have elected a new primary (as two out of three are a majority) and it is that primary that is accepting new writes.
If the previous primary continued to accept writes, you would have potentially conflicting data which there is no mechanism to resolve. Since a MongoDB replica set is a single-primary architecture (as opposed to a multi-master system), the election mechanism ensures that there cannot be two primaries at the same time.
From the point of view of the two secondaries, a network partition is the same as the primary being unavailable, and from the primary's point of view, a network partition is indistinguishable from "both other nodes are down". It steps down because, in case of a network partition, there may already be another primary on the other side of it, and stepping down ensures there cannot be two primaries.
It is not the case that the "replica set" gives up when the primary feels alone. The reason the primary steps down when it feels alone is precisely to preserve the integrity of the replica set as a whole. It is also not true that setting a high priority score fixes a role to a node: a primary can only be elected via consensus among a majority, and all priority scores do is influence the election when all other things are equal.
I highly recommend the excellent "call me maybe" series as reading to understand the challenges of write availability in a distributed system: http://aphyr.com/posts/281-call-me-maybe-carly-rae-jepsen-and-the-perils-of-network-partitions
Just to chime in on the answers. The behavior in this scenario is expected. MongoDB uses a leader election algorithm to elect the new leader. So if there is no majority you cannot elect a leader and hence no writes.
Your only option at the point where 2 nodes are down is to reconfigure your replica set as a 1-node replica set to make it writable. You can do this using the rs.reconfig command with just one server. However, please note that this should be only a temporary, emergency configuration. In the long run you should have an odd number of total nodes (3+) in your replica set configuration.
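As a rough sketch of that emergency step, the helper below derives a one-member configuration from an existing one. keepOnlyMember is a hypothetical name, and the sample config simply reuses the hosts from the question. In the mongo shell on the surviving node, the result would then be passed to rs.reconfig with the force option, since there is no primary left to accept a normal reconfiguration.

```javascript
// Hypothetical helper: returns a copy of a replica set config that keeps
// only the member with the given host. The input config is not modified.
function keepOnlyMember(config, host) {
  const survivors = config.members.filter((m) => m.host === host);
  if (survivors.length !== 1) {
    throw new Error("expected exactly one member with host " + host);
  }
  return Object.assign({}, config, { members: survivors });
}

// Sample config modelled on the one in the question; on a real node this
// would come from rs.conf().
const current = {
  _id: "rs4",
  members: [
    { _id: 0, host: "localhost:27041", priority: 10 },
    { _id: 1, host: "localhost:27042" },
    { _id: 2, host: "localhost:27043", arbiterOnly: true },
  ],
};

const emergency = keepOnlyMember(current, "localhost:27041");
console.log(JSON.stringify(emergency));

// In the mongo shell on the surviving node (assumption: run by hand, and
// only as a temporary measure):
//   rs.reconfig(emergency, { force: true })
```

Once the other members are reachable again, they would need to be added back so the set returns to an odd number of voting members.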
Try to use arbiters. Most documents say to use just one, but in your case you need to win the election.
From http://docs.mongodb.org/manual/core/replica-set-architectures/ :
Fault tolerance for a replica set is the number of members that can
become unavailable and still leave enough members in the set to elect
a primary. In other words, it is the difference between the number of
members in the set and the majority needed to elect a primary. Without
a primary, a replica set cannot accept write operations. Fault
tolerance is an effect of replica set size, but the relationship is
not direct.
More on elections: http://docs.mongodb.org/manual/core/replica-set-elections/
More on arbiters: http://docs.mongodb.org/manual/faq/replica-sets/#how-many-arbiters-do-replica-sets-need
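The quoted definition of fault tolerance can be worked through as arithmetic. The sketch below is illustrative only; faultTolerance is a hypothetical helper and it assumes each member has exactly one vote.

```javascript
// Hypothetical helper: fault tolerance is the number of members minus the
// majority needed to elect a primary.
function faultTolerance(members) {
  const majority = Math.floor(members / 2) + 1;
  return members - majority;
}

for (let n = 1; n <= 6; n++) {
  console.log(n + " members -> tolerates " + faultTolerance(n) + " down");
}
```

Going from 3 to 4 members, or from 5 to 6, does not raise the number of failures the set can survive, which is what the quote means by the relationship being "not direct".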

Mongodb Replicaset on AZURE with an Arbiter

I want to use MongoDB with replication; I created a VM with 2 secondary nodes and 1 arbiter:
1 Primary
2 Secondary
1 Arbiter
I'm trying to understand how this system works, so I have some questions:
1) According to information "If a replica set has an even number of members, add an arbiter." I added an arbiter. But I'm not sure if I have done it correctly. Does this even number apply to secondaries or to all members in total?
2) What is this arbiter doing? I don't actually understand its job.
3) I created public IP addresses for each VM, in order to connect to them from outside. I successfully connected from my application, using this connection string:
mongodb://username:password@vm0:27017,vm1:27017,vm2:27017/dbname?replicaSet=xxx&readPreference=primaryPreferred
I didn't add the arbiter to this connection string. Should I add it or not?
4) When I shut down the primary machine, one of the secondary machines successfully became primary, as I expected. There is no problem in this case; but when I shut down the second primary machine, my application throws an error. The second secondary node has not become primary. Why is this happening?
5) If all VMs are working but I shut down the arbiter, my application again throws an error and I cannot connect to the db. I'm trying this because I'm thinking of the case where something goes wrong on the arbiter machine, or it has to be shut down in the future for maintenance or some other problem.
Maybe this is because I didn't understand the role of an arbiter, and I suspect my thinking is wrong, but why doesn't it convert a secondary machine into an arbiter? And why does the whole system stop working when I shut down the arbiter?
Thanks.
1) If you have 1 Primary and 2 Secondaries, you have 3 members in your replica set. Therefore you should not be adding an arbiter. You already have an odd number of nodes.
2) An arbiter is a node which doesn't hold data and can't be elected as Primary. It is only used to elect a new Primary if the current Primary goes down.
For example, say you have 1 Primary and 1 Secondary. The replica set has 2 members. If the primary goes down, the replica set will attempt to vote to elect a new Primary. In order for a node to be elected, it needs to win over half the votes. But if the Secondary votes for itself, it will only get 1 out of 2 votes. That's not more than half so it will not be elected. Thus the replica set will not be able to elect a new Primary and your whole replica set will go down.
To fix this, you can add an arbiter to the replica set. This is usually a much smaller machine since it doesn't need to hold data. It just has one job, voting for the Secondary to be the new Primary in the case of elections.
But, since you already have 3 data-bearing nodes, you won't want to add an arbiter. You can read more about arbiters here.
3) You can add arbiters to connection strings but in general you won't need to. Adding the data-bearing nodes is just fine. That's what people usually do.
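For illustration, here is a small sketch that assembles such a connection string from only the data-bearing hosts. buildMongoUri is a hypothetical helper, not a driver API, and the password below is made up to show why credentials should be percent-encoded; the host names, database name and options are the ones from the question.

```javascript
// Hypothetical helper that assembles a replica set connection string.
// Credentials are percent-encoded so characters such as "@" or "/" in a
// password cannot be mistaken for URI delimiters.
function buildMongoUri(opts) {
  const credentials =
    encodeURIComponent(opts.user) + ":" + encodeURIComponent(opts.password);
  const hosts = opts.hosts.join(",");
  const query =
    "replicaSet=" + encodeURIComponent(opts.replicaSet) +
    "&readPreference=" + encodeURIComponent(opts.readPreference);
  return "mongodb://" + credentials + "@" + hosts + "/" + opts.db + "?" + query;
}

const uri = buildMongoUri({
  user: "username",
  password: "p@ss/word", // made-up password containing reserved characters
  hosts: ["vm0:27017", "vm1:27017", "vm2:27017"], // data-bearing members only
  db: "dbname",
  replicaSet: "xxx",
  readPreference: "primaryPreferred",
});
console.log(uri);
```

The arbiter is simply left out of the hosts list; the driver discovers the rest of the topology from the members it can reach.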
4) You have 4 members in the replica set. You took down 2 of them. That means there are only 2 votes left. The final secondary won't be able to get more than 50% of the votes so no Primary will be elected.
In general, testing two nodes going down is overkill. You probably want a 3 member replica set. Each member should be in a different availability zone (Availability Set in Azure). If two nodes go down your replica set will be unavailable. But two nodes going down at the same time is very unlikely if all nodes are in different availability zones. So don't worry too much about more than one node going down. If that's a real concern (in most applications it really isn't), you want to make a 5 member replica set.
5) That's weird. This sounds like your replica set might be configured incorrectly. As I said, you don't need an arbiter anyway. So you could just try setting it up again without the arbiter and see if it works. Open a new question if you're still having issues. Make sure to include the output of running rs.status() in your question.

Two nodes MongoDB replica set without arbiter

Is it possible to create a MongoDB replica set consisting of only 1 primary and 1 secondary member?
I would like to have a delayed replica set member that copies data from the primary with a delay of 24 hours. I know I can put an arbiter on one of the servers (primary or secondary; I know this is not advised, but my only wish is to run this configuration on two servers) and it would run fine, but I want to know whether it is possible to drop the arbiter completely.
It would look like this:
Short answer: don't.
Long answer: the way automatic failover works in MongoDB is that a replica set needs a qualified majority to successfully elect a new primary. Delayed members do have votes in elections. So if either of your nodes fails, the replica set finds that it doesn't have this majority, and the current primary steps down even if it was not the node that failed. So what you essentially do is double the chances of your replica set failing. An arbiter is a very cheap process in terms of RAM usage, CPU and even disk space when run with --smallfiles --nojournal --noprealloc or the equivalent options set in the config file. Note that these options are safe to use for an arbiter, since it essentially only checks the heartbeats of the data-bearing nodes. You could put the arbiter on the application server, for example.
Disclaimer: the following procedure is strongly discouraged. Proceed at your own risk.
You could set the votes of the delayed server to 0. This way, if the delayed member fails, the undelayed node will call for an election, conclude that it is the only node of the replica set online and that it holds the majority of votes (1/1), and continue to work as expected. This course of action needs some attention: if you add a member to the replica set later, you will have an even number of votes again, which makes it necessary to reconfigure the replica set. It also has serious implications when the network is partitioned. Again: use at your own risk.
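A sketch of what such a configuration might look like, with the same caveat that it is discouraged. The set name and host names are assumptions made up for illustration, and totalVotes is a hypothetical helper. The delay field is called slaveDelay in older MongoDB versions and secondaryDelaySecs from MongoDB 5.0 on.

```javascript
// Hypothetical two-member config: one normal node, and one delayed node
// that has no vote and can never become primary.
const config = {
  _id: "rs0", // assumed set name
  members: [
    { _id: 0, host: "primary.example.net:27017" }, // assumed host
    {
      _id: 1,
      host: "delayed.example.net:27017", // assumed host
      priority: 0,       // never eligible to become primary
      hidden: true,      // not visible to client reads
      slaveDelay: 86400, // 24 hours in seconds (secondaryDelaySecs in 5.0+)
      votes: 0,          // does not count toward the majority
    },
  ],
};

// Sum the votes in a config; a member without a votes field counts as 1.
function totalVotes(cfg) {
  return cfg.members.reduce(
    (sum, m) => sum + (m.votes === undefined ? 1 : m.votes),
    0
  );
}

const votes = totalVotes(config);
const majority = Math.floor(votes / 2) + 1;
console.log("total votes:", votes, "majority needed:", majority);
```

With a single vote in the set, the undelayed node always holds the majority on its own, which is both why this works and why it offers no protection during a network partition.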
Yes, it is possible but not recommended. The caveat of this approach is that there is no automatic failover.
If your primary goes down, then you will have to manually make the other server primary.
If you are keeping your secondary only as a mirror of your primary and you are fine with manual failover, then it should work for you.
More info here:
http://openmymind.net/Does-My-Replica-Set-Need-An-Arbiter/
Yes you can and all you really need to do is set the member to not be eligible for primary.
There is documentation on how to make sure a member cannot be elected as primary here: http://docs.mongodb.org/manual/tutorial/configure-secondary-only-replica-set-member/
In this case, the best option is to add an arbiter. I tried using votes before, but on a two-node replica set you can have some issues with sync.

What is the advantage to explicitly connecting to a Mongo Replica Set?

Obviously, I know why to use a replica set in general.
But, I'm confused about the difference between connecting directly to the PRIMARY mongo instance and connecting to the replica set. Specifically, if I am connecting to Mongo from my node.js app using Mongoose, is there a compelling reason to use connectSet() instead of connect()? I would assume that the failover benefits would still be present with connect(), but perhaps this is where I am wrong...
The reason I ask is that, in mongoose, the connectSet() method seems to be less documented and less widely used. Yet I cannot imagine a scenario where you would NOT want to connect to the set, since it is recommended to always run Mongo on a 3x+ replica set...
If you connect only to the primary then you get failover (that is, if the primary fails, there will be a brief pause until a new master is elected). Replication within the replica set also makes backups easier. A downside is that all writes and reads go to the single primary (a MongoDB replica set only has one primary at a time), so it can be a bottleneck.
Allowing connections to slaves, on the other hand, allows you to scale for reads (not for writes; those still have to go to the primary). Your throughput is no longer limited by the spec of the machine running the primary node but can be spread across the slaves. However, you now have a new problem of stale reads; that is, there is a chance that you will read stale data from a slave.
Now think hard about how your application behaves. Is it read-heavy? How much does it need to scale? Can it cope with stale data in some circumstances?
Incidentally, the point of a minimum 3 members in the replica set is to offer resiliency and safe replication, not to provide multiple nodes to connect to. If you have 3 nodes and you lose one, you still have enough nodes to elect a new primary and have replication to a backup node.