MongoDB replica set with 3 servers but only 2 locations

I have 3 MongoDB nodes configured as one shard (1 primary and 2 secondaries), but I only have 2 data centers. If I host 1 node in DataCenterA and 2 nodes in DataCenterB, and DataCenterB goes down, is there any way to get the node in DataCenterA to serve both reads and writes instead of being read-only, and to have the replica set recover once the other nodes are back online?
I understand that the best practice is to have a third location and host one node in each, but if I only have 2 locations available, is there any way to make this work?
Thanks a lot.

Yes, you can: remove the node in data center A from the replica set and restart it as a standalone node. You can add it back once the other servers in the replica set are up again.
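For completeness, here is a minimal sketch of an alternative that avoids restarting the node as a standalone: MongoDB also documents a forced reconfiguration for the case where a majority of members is unavailable. The hostname below is a placeholder; run this in a shell connected to the surviving node in DataCenterA.

```javascript
// Forced reconfiguration sketch: keep only the reachable member so it can
// be elected primary and accept writes while DataCenterB is down.
cfg = rs.conf()
cfg.members = cfg.members.filter(m => m.host === "dca-node.example.net:27017")
rs.reconfig(cfg, { force: true })

// When DataCenterB is back online, rs.add() its nodes again; they will
// resync from this member.
```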

MongoDB failover when 2 nodes of a 3-node replica set go down

I need to set up a mongo replica set across two data centers.
For the sake of testing, I set up a replica set of 3 nodes, planning to put 2 nodes on the local site (the primary and a secondary) and another standby on the other site.
However, if I take down the primary and one of the standbys, the remaining standby stays a secondary and is not promoted to primary as I expected.
Reading about it in other questions here, it looks like the only solution is to use an arbiter on a third site, which is quite problematic.
As a temporary solution, is there a way to force this standalone secondary to become a primary?
In order to elect a PRIMARY, a majority of all members must be up.
With 2 of the 3 nodes down, the single remaining node is not a majority. Typically the data center itself does not crash; usually you "only" lose the connection to it.
You can do the following.
Put 2 nodes in the first data center and 1 node in the second data center. In this setup the first data center acts as the primary site and must not fail; the second data center may fail.
Another setup is to put one node in each data center and an ARBITER on a third site. This third site does not need to be a full-blown data center; the MongoDB ARBITER is a very lightweight process and does not store any data, so it can run on a small host somewhere in your IT network. Of course, it must have a connection to both data centers.
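As a minimal sketch of that second setup, assuming the two data-bearing nodes (one per data center) are already configured as a replica set and using a placeholder hostname for the third site, the arbiter could be added from the shell of the current primary:

```javascript
// Add a lightweight, non-data-bearing arbiter on a third site so the
// replica set keeps a voting majority when either data center is lost.
// The hostname is a placeholder.
rs.addArb("arbiter.thirdsite.example.net:27017")

// Confirm the set now has three voting members (two data-bearing, one arbiter).
rs.status().members.forEach(m => print(m.name, m.stateStr))
```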

How to achieve replication of partial data for the following use case

I want to build a system with the following data replication requirements.
In the image attached:
Node 1 has 2 entities: Entity 1 and Entity 2.
Each entity has multiple rows of data, say (Row1, Row2, Row3).
Node 2 and Node 3 are full replicas of Node 1 and are possibly in the same data center.
Node 4 sits in a different place altogether and has only Row1 from Entity 1 and Entity 2.
Node 5 sits in another place and has only Row2 from Entity 1 and Entity 2.
The idea is that Node 4 and Node 5 will be in the geographic vicinity of the consumer system, so the consumer can work with the local copies in Node 4 and Node 5 if the network is down.
On a normal business day it's acceptable to limit all writes to Node 1 and to allow Node 4 or Node 5 to accept writes only when Node 1 is down.
I am not sure which database can support this without extensive management through code.
[Image: Data Model Replication]
So far I have found this:
Cassandra can do keyspace-based replication, but it might be tricky as I have 2000+ remote locations needing partial data. I could imagine using, say, 200 keyspaces with 10 locations sharing the same keyspace to reduce the overhead, even though the data copied to local nodes will not always be useful to them.
MongoDB has an open feature request for this (https://jira.mongodb.org/browse/SERVER-1559).
Couchbase has XDCR-based filtering, which looks like a potential solution.
Can you please suggest if my understanding is correct?
Yes, Couchbase XDCR is a viable solution. You could:
1. set up Node 1, Node 4, and Node 5 as three separate data clusters
2. set up a uni-directional XDCR from Node 1 to Node 4 with a filtering expression that matches only Row 1
3. set up a uni-directional XDCR from Node 1 to Node 5 with a filtering expression that matches only Row 2.
For more information, please refer to https://docs.couchbase.com/server/6.0/learn/clusters-and-availability/xdcr-overview.html.
XDCR filtering is at: https://docs.couchbase.com/server/6.0/learn/clusters-and-availability/xdcr-filtering.html

MongoDB two-node deployment

I know MongoDB suggests a minimum of 3 members for a replica set.
Can I use two servers to run MongoDB with 4 replica set members in order to prevent write failure when one node is down?
My idea is to run two instances on each server to work around the requirement:
Mongodb 1
  Member 1 (Master)
  Member 2 (Secondary)
Mongodb 2
  Member 3 (Secondary)
  Member 4 (Arbiter)
If one server goes down, the other still has two replica set members running.
Can anyone comment on my idea?
I would like to know whether there are any issues or risks to consider.
Thanks!
This is a bad idea. Your suggested configuration would be counter-productive for two main reasons:
Running two instances of MongoDB on the same hosts will lead to them both running slowly and awkwardly, competing for memory, CPU, disk I/O, etc.
If one of the hosts goes down, the two nodes on the other will not be able to continue the replica set because with only two nodes running out of four, you don't have the majority necessary to become primary.
MongoDB is intended to be run effectively on multiple hosts, so it is designed for that; even if you could find a way to shoe-horn a functioning replica set onto only two hosts, you would be working against MongoDB's design and struggling to make it work effectively.
Really, your best option is to run a single mongod instance on each of your two hosts, and to run an arbiter on a third (separate) host; that kind of normal 3-host configuration is an effective and straightforward way of achieving both data replication and high availability.
Please read https://docs.mongodb.com/manual/replication/
An even number of members in a set is not recommended. The only reason to add an arbiter is to resolve this issue and increase the number of members by 1 to make it odd. It makes no sense to add 2 arbiters.
Similarly, there is no point in putting an arbiter on the same server as any other member. It is quite lightweight, so you can co-locate it with an application server, for example.
With 2 DB servers, the best replica set configuration would be (a configuration sketch follows the list):
Mongodb 1
  Master
Mongodb 2
  Secondary
Any other server
  Arbiter
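A minimal sketch of this layout, with placeholder hostnames, assuming the arbiter runs on a third machine such as an application server:

```javascript
// Two data-bearing members (one per DB server) plus an arbiter on a
// separate machine. Hostnames are placeholders.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongodb1.example.net:27017", priority: 2 },       // preferred primary
    { _id: 1, host: "mongodb2.example.net:27017", priority: 1 },       // secondary
    { _id: 2, host: "appserver.example.net:27017", arbiterOnly: true } // arbiter, stores no data
  ]
})
```

If one of the two DB servers fails, the surviving server plus the arbiter still hold 2 of the 3 votes, so a primary can be elected.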
The simple answer is: no, you cannot do it.
The reason is "majority". If you lose one of those two servers, you don't have a majority of votes online.
This is the reason why 3 is the minimum. It can be 3 data-bearing nodes, or 2 data-bearing nodes and one arbiter. Each of them has one vote, and if you lose one of them, the replica set still has 2/3 of the votes, which is a majority. The same applies to 3/5 or 4/7.
1/2 (or 2/4) is not a majority.

Replica set architecture - Arbiter requirement

What should the number of replica set members be to handle a Disaster Recovery (DR) situation effectively? Currently we are using a 3-node replica set (1 primary and 1 secondary in the same region, and 1 secondary in the DR region).
We are planning to add 2 arbiters to it to increase its fault tolerance.
Is it a good practice to use more than one arbiter instance?
Would it be better to create the arbiter instance in the DR zone?
As JJussi points out, adding more than one arbiter will not help at all, but it might be useful to add further nodes (data-bearing and/or arbiter) to achieve maximum resilience and availability.
Your current arrangement is two data-bearing nodes in region 1 and one in the DR region.
If your datacentre in region 1 goes down, the node in the DR region won't be able to step up to primary, because it cannot command a majority.
Even if you added a further data-bearing node and an arbiter, you would run into the same problem if they were in the same two regions.
Instead, what I recommend is keeping your existing two nodes in region 1, adding a fourth data-bearing node in the DR region, and also adding an arbiter, making sure the arbiter is in a different region again.
That way, even if the datacentre in region 1 or in the DR region goes down, the nodes in the other region, with the help of the arbiter, can still command a majority and continue working.
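As a rough sketch with placeholder hostnames, and assuming the existing three-member set is healthy, the two additional members could be added from the primary like this:

```javascript
// Add a fourth data-bearing member in the DR region and an arbiter in a
// third region, giving 5 voting members in total. Hostnames are placeholders.
rs.add("dr-node2.example.net:27017")            // second data-bearing node in the DR region
rs.addArb("arbiter.region3.example.net:27017")  // tie-breaking vote in a third region

// Losing either region 1 or the DR region now leaves 3 of 5 votes online,
// so a primary can still be elected.
rs.status().members.forEach(m => print(m.name, m.stateStr))
```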
Arbiters don't increase fault tolerance, because they don't hold data. You don't have any need to add arbiters to the current setup, because you already have an odd number of votes. Your current node count (three) is perfect for DR, especially if all three nodes are in different data centers, even if two of those are in the same geographical region.
Of course you can always add one more node (and then you need an arbiter) in some other region, but normally three separate nodes is a perfectly good DR setup. If all your current nodes are in the USA, you could locate "half" of the nodes (enough for a majority) in Europe...

What are recommended configurations of a MongoDB replica set (2 DCs) for automatic primary election upon DC failure?

I need to distribute mongo nodes over 2 data centers.
I am a bit confused by the fault-tolerance table:
Number of Members = 4
Majority Required to Elect a New Primary = 3
Does the number 4 mean I need 4 voting members in total, or can I have 3 voting members + 1 priority-0 hidden member?
For example:
DC1: P, H (priority=0)
DC2: S, S
If DC1 is down, will DC2 elect a primary?
If DC2 is down, do I need to convert H to an arbiter, or will P remain primary?
Essentially, it would be great if someone could suggest a few recommended replica set configurations for 2 DCs that ensure automatic primary election (with minimal manual effort) upon DC failure.
Thanks in advance,
Kaniska
Does the number 4 mean I need total 4 voting members or can I have 3 voting members + 1 priority 0 hidden member
A priority 0 member is a voting member. Priority 0 means the node cannot become primary (and it can't trigger elections).
Your sample setup shows only 4 total nodes split evenly between two data centers. The majority is still 3 and so, if either data center goes down, the replica set will be unhealthy and become read-only (no primary).
To create a 2 DC setup where failure of one DC will cause an automated failover to a primary in the other DC, there must be a node outside of either data center. With all nodes split between two data centers (and an odd number of nodes), at least one data center has a majority of the nodes. If that data center goes down, the replica set cannot automatically recover a healthy state while the majority-holding data center is down. Given this situation, a common pattern is to split nodes evenly between two data centers and have an arbiter outside of either.
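As an illustrative sketch of that common pattern (hostnames are placeholders), the five-member set could be initiated like this:

```javascript
// Two data-bearing members in each data center plus an arbiter at a third
// location (e.g. a small cloud instance), for 5 voting members in total.
// Hostnames are placeholders.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "dc1-a.example.net:27017" },
    { _id: 1, host: "dc1-b.example.net:27017" },
    { _id: 2, host: "dc2-a.example.net:27017" },
    { _id: 3, host: "dc2-b.example.net:27017" },
    { _id: 4, host: "arbiter.site3.example.net:27017", arbiterOnly: true }
  ]
})
// If either data center fails, the surviving data center plus the arbiter
// hold 3 of the 5 votes, so a new primary is elected automatically.
```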