MongoDB architecture and failover with two data centres

I'm trying to figure out whether there is a way to seamlessly fail over a MongoDB replica set where most of the nodes live in the primary data centre. My current limitation is 2 data centres, and a third data centre is out of the question. The issue I have is that if data centre 1 goes down, the secondary node in data centre 2 will not be promoted to primary without manual intervention.
Data centre 1 (Primary):
Mongo Node (Primary)
Mongo Node (Arbiter)
Data centre 2 (Secondary):
Mongo Node (Secondary)
I've looked at MongoDB whitepapers, but they state that manual intervention is required to make the MongoDB instance in data centre 2 primary if data centre 1 is lost.
My question is whether there is an architecture or configuration out there that makes it possible to lose data centre 1 and still have data centre 2 take over with writes enabled, without manual intervention or reconfiguration. Is this possible without going down the three-data-centre architecture path? Is it possible to keep two 3-member replica sets, one at each site, synchronised and potentially do the failover at the network level for the connecting applications?
Thanks.

If you go with 2 data centers, to me the easiest solution is to cover only a failure of the primary. The good news is that if the secondary dies, you only need to wait.
If access to the primary fails, you need a callback procedure that forces the secondary to become primary. This switch will cause downtime in your application unless you spend more time building a gateway that buffers queries and waits for the callback from the switch; that way you will only see slowness from increased timeouts.
After the primary is live again you need to connect back to it (because your secondary node alone is not reliable), and this will cause downtime again. You need another process that checks whether the primary is alive (from data center 2) and, if it is, triggers the event and proceeds with the callback.
The manual intervention to force the secondary to become primary can be wrapped in a script.
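For reference, a minimal sketch of such a script's core, run from a mongo shell against the surviving node in data centre 2 (the member index and set layout are assumptions); it force-reconfigures the set so the survivor is the only member:

    // on the surviving DC2 secondary, with no primary reachable
    cfg = rs.conf()
    cfg.members = [cfg.members[2]]      // keep only the surviving member (index is an assumption)
    rs.reconfig(cfg, { force: true })   // force is required because there is no primary

Bear in mind that a forced reconfig like this can lose writes that never replicated to data centre 2, which is exactly why MongoDB insists on manual intervention here.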
To me the best solution here is to go with a third data center where the arbiter will live. The effort to skip that and put the logic in the application instead is not worth it. Automatic failover in Mongo works very well and is reliable. You may run into lots of problems trying to achieve it with application logic across 2 data centers; I would rather go with their recommendation.

First, as you have noticed, you cannot do automatic failover with only two nodes. Second, money is not a real issue when you think about that "third" data center. You may ask why, or "how so"?
You need an arbiter, as you know. An arbiter really doesn't need resources; any small Linux machine will do fine, and small VPS machines don't cost much. At the time of writing you could find a 1 x 2.40 GHz, 512 MB, 20 GB machine for only €1.24/month, or a beefier machine for €1.99/month.
Actually, either of those providers could run quite a big MongoDB deployment on those "tiny" machines.
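As a sketch, setting up the arbiter on such a machine takes two steps (set name, paths, and hostname are placeholders):

    # on the cheap VPS
    mongod --replSet rs0 --port 27017 --dbpath /data/arb

    // then, from a mongo shell connected to the current primary
    rs.addArb("arbiter.example.net:27017")

The arbiter stores no data and only votes in elections, which is why such a tiny machine is enough.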

Related

Sharding with replication

I have a multi-tenant database with 3 tables (store, products, purchases) across 5 server nodes. Suppose I have 3 stores in my store table and I am going to shard it by storeId.
I need all the data for all shards (1, 2, 3) available on nodes 1 and 2, but node 3 would contain only the shard for store #1, node 4 only the shard for store #2, and node 5 only the shard for store #3. It is like sharding with 3 replicas.
Is this possible at all? What database engines can be used for this purpose (preferably SQL DBs)? Do you have any experience with this?
Regards
I have a feeling you have not adequately explained why you are trying this strange topology.
Anyway, I will point out several things relating to MySQL/MariaDB.
A Galera cluster already embodies multiple nodes (minimum of 3), but does not directly support "sharding". You can have multiple Galera clusters, one per "shard".
As with my comment about Galera, other forms of MySQL/MariaDB can have replication between nodes of each shard.
If you are thinking of having a server with all the data but replicating only parts of it to read-only replicas, there are settings for replicate-do-db / replicate-ignore-db. I emphasize "read-only" because changes to these pseudo-shards cannot easily be sent back to the Primary server. (However, see "multi-source replication".)
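As a sketch, the relevant my.cnf settings on a replica that should keep only one tenant's schema (the database name is an assumption):

    [mysqld]
    replicate-do-db = store1   # apply only replication events for this schema
    read_only       = ON       # keep the pseudo-shard read-only

Note that with statement-based replication, replicate-do-db filters on the default database, so cross-database statements need care.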
Sharding is used primarily when there is simply too much traffic to handle on a single server. Are you saying that the 3 tenants cannot coexist because of excessive writes? (Excessive reads can be handled by replication.)
A tentative solution:
Have all data on all servers. Use the same Galera cluster for all nodes.
Advantage: When "most" or all of the network is working, all data is quickly replicated bidirectionally.
Potential disadvantage: If half or more of the nodes go down, you have to manually step in to get the cluster going again.
Likely solution for the 'disadvantage': "Weight" the nodes differently. Give a higher weight to the 3 in HQ; give a much smaller (but non-zero) weight to each branch node. That way, most of the branches could go offline without losing the system as a whole (see the sketch below).
But... I fear that an offline branch node will automatically become readonly.
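For the weighting idea, a sketch of the Galera provider option involved (the values are assumptions; quorum requires more than half of the total weight to be reachable):

    # my.cnf on each of the 3 HQ nodes
    wsrep_provider_options = "pc.weight=3"

    # my.cnf on each branch node
    wsrep_provider_options = "pc.weight=1"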
Another plan:
Switch to NDB. The network is allowed to be fragile. Consistency is maintained by "eventual consistency" instead of the "[virtually] synchronous replication" of Galera+InnoDB.
NDB allows you to write immediately on any node. The write is then sent to the other nodes. If there is a conflict, one of the values is declared the "winner"; you choose the algorithm for determining the winner. An easy-to-understand one is "whichever write was first".

MongoDB failover when 2 nodes of a 3 node replicaset go down

I need to set up a mongo replica set across two data centers.
For the sake of testing, I set up a replica set of 3 nodes, thinking of putting 2 nodes on the local site (the primary and a secondary) and another standby on the other site.
However, if I take down the primary and one of the standbys, the remaining standby stays a secondary and is not promoted to primary as I expected.
Reading about it in other questions here, it looks like the only solution is to use an arbiter on a third site, which is quite problematic.
As a temporary solution - is there a way to force this standalone secondary to become a primary?
In order to elect a PRIMARY the majority of all members must be up.
Taking down the primary and one secondary leaves only 1 of 3 nodes up, and that is not a majority. Typically the data center itself does not crash; usually you "only" lose the connection to a data center.
You can do the following.
Put 2 nodes in the first data center and 1 node in the second data center. In this setup the first data center holds the primary and must not fail; the second data center may fail.
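A minimal sketch of initiating such a set (hostnames are placeholders); giving the first data center's nodes a higher priority keeps the primary there while that data center is healthy:

    rs.initiate({
      _id: "rs0",
      members: [
        { _id: 0, host: "dc1-node1:27017", priority: 2 },
        { _id: 1, host: "dc1-node2:27017", priority: 2 },
        { _id: 2, host: "dc2-node1:27017", priority: 1 }
      ]
    })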
Another setup is to put one node in each data center and an ARBITER on a different site. This "different site" does not need to be a full-blown data center; the MongoDB ARBITER is a very lightweight process and does not store any data, so it could be a small host somewhere in your IT network. Of course, it must have a connection to both data centers.

MongoDB load balancing in multiple AWS instances

We're using Amazon Web Services for a business application with a node.js server and MongoDB as the database. Currently the node.js server is running on an EC2 medium instance, and we're keeping our MongoDB database in a separate micro instance. Now we want to deploy a replica set for our MongoDB database, so that if MongoDB gets locked or becomes unavailable, we can still run the database and get data from it.
So we're trying to keep each member of the replica set in a separate instance, so that we can get data from the database even if the instance of the primary member shuts down.
Now I want to add load balancing to the database, so that it works fine even under a huge traffic load. I can balance reads by adding the slaveOK config to the replica set, but that won't load-balance the database if there is a huge write load.
To solve this problem I have found two options so far.
Option 1: I shard the database and keep each shard in a separate instance, and under each shard there is a replica set in the same instance. But there is a problem: as sharding divides the database into multiple parts, each shard will not keep the same data. So if one instance shuts down, we won't be able to access the data from the shard within that instance.
To solve that, I'm trying to divide the database into shards where each shard has its replica set spread across separate instances. Then even if one instance shuts down we won't face any problem. But if we have 2 shards and each shard has 3 members in its replica set, I need 6 AWS instances. So I don't think it's the optimal solution.
Option 2: We can create a master-master configuration in MongoDB, meaning every database node is a primary with read/write access, but I would also like them to auto-sync with each other every so often, so they all end up being clones of each other. All these primary databases would be in separate instances. But I don't know whether MongoDB supports this structure.
I haven't found any MongoDB docs or blogs for this situation, so please suggest what the best solution for this problem would be.
This won't be a complete answer by far; there are too many details, and I could write an entire essay about this question, as could many others. However, since I don't have that kind of time to spare, I will add some commentary about what I see.
Now, I want to add load balancer in the database, so that the database works fine even in huge traffic load at a time.
Replica sets are not designed to work like that. If you wish to load balance, you might in fact be looking for sharding, which will allow you to do this.
Replication is for automatic failover.
In that case I can read balance the database by adding slaveOK config in the replicaSet.
Since, to stay up to date, your members will be getting just as many write ops as the primary, it seems this might not help much.
In reality, instead of having one server with many connections queued, you have many connections on many servers queueing for stale data, since member consistency is eventual, not immediate like in ACID technologies. That said, secondaries are typically only 32-odd ms behind, which means they are not lagging far enough behind to give you extra throughput when the primary is loaded.
Since reads ARE concurrent, you will get the same speed whether you read from the primary or a secondary. I suppose you could delay a slave to create a pause in ops, but that would bring back massively stale data in return.
Not to mention that MongoDB is not multi-master: you can only write to one node at a time, which makes slaveOK not the most useful setting in the world any more, and I have seen numerous times where 10gen themselves recommend sharding over this setting.
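For reference, a minimal sketch of what secondary reads look like, in the shell and via a driver connection string (hostnames, database, and collection names are placeholders):

    // in a shell connected to a secondary
    rs.slaveOk()                        // legacy helper: permit reads on this secondary
    db.orders.find({ status: "open" })

    // driver-side equivalent, via the connection string:
    // mongodb://host1:27017,host2:27017/mydb?replicaSet=rs0&readPreference=secondaryPreferred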
Option 2: We can create a master-master configuration in the mongodb,
This would require your own coding. At that point you may want to consider actually using a database that supports multi-master replication: http://en.wikipedia.org/wiki/Multi-master_replication
That is because the speed you are looking for is most likely in writes, not reads, as I discussed above.
Option 1: I've to shard the database and keep each shard in separate instance.
This is the recommended way, but you have found the caveat with it. This is unfortunately something that remains unsolved and that multi-master replication is supposed to solve; however, multi-master replication adds its own ship of plague rats to Europe, and I would strongly recommend you do some serious research before deciding whether MongoDB cannot currently serve your needs.
You might be worrying about nothing, really, since the fsync queue is designed to deal with the IO bottleneck slowing down your writes (as it would in SQL), and reads are concurrent, so if you plan your schema and working set right you should be able to get a massive number of ops.
There is in fact a linked question around here from a 10gen employee that is very good to read: https://stackoverflow.com/a/17459488/383478 and it shows just how much throughput MongoDB can achieve under load.
That throughput will soon grow with the new document-level locking that is already in the dev branch.
Option 1 is the recommended way, as pointed out by @Sammaye, but you would not need 6 instances; you can manage it with 4.
Assuming you need the configuration below:
2 shards (S1, S2)
1 copy for each shard (Replica set secondary) (RS1, RS2)
1 Arbiter for each shard (RA1, RA2)
You could then divide your server configuration like below.
Instance 1 : Runs : S1 (Primary Node)
Instance 2 : Runs : S2 (Primary Node)
Instance 3 : Runs : RS1 (Secondary Node S1) and RA2 (Arbiter Node S2)
Instance 4 : Runs : RS2 (Secondary Node S2) and RA1 (Arbiter Node S1)
You can run arbiter nodes alongside your secondary nodes, which helps elections complete during failovers.
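A sketch of wiring this up from a mongos (shard and host names are assumptions; the arbiters would listen on a second port on instances 3 and 4):

    // each shard is a replica set whose seed list names its data-bearing nodes
    sh.addShard("s1/instance1:27017,instance3:27017")
    sh.addShard("s2/instance2:27017,instance4:27017")
    sh.enableSharding("mydb")
    sh.shardCollection("mydb.mycoll", { _id: "hashed" })  // example shard key choice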

Two Datacenters, connectivity breaks, both continue writing, connectivity returns, sync?

We have two datacenters and are writing data to Mongo from both. The collection is sharded, and we have the primary for one shard in datacenter A and the primary for the other in datacenter B. Occasionally, connectivity between the datacenters fails.
We'd like to be able to continue writing IN BOTH DATACENTERS. The data we're writing won't conflict - they're both just adding documents, or updating documents that won't be updated in two places.
Then, when connectivity returns (sometimes in seconds, or even minutes), we'd like the database to cope with this situation nicely and have all the data updated automatically.
Can someone please advise whether this is possible? The docs don't say much about what happens when you divide a replica set into two independent DBs, have both become master, and then reconnect them. What happens? How do I set this up?
I don't see why this wouldn't work the way you already have it set up, presuming that your secondaries are in the same data center as your primary.
In other words, if primary and secondaries for shard A are in data center A and primary and secondaries for shard B are in data center B, then you are already writing in both data centers.
If you now lose connectivity between the two data centers, then clients in data center A won't be able to read from or write to shard B, and clients in data center B won't be able to read from or write to shard A, but clients in both data centers will continue writing to the shard that's in the same data center as they are.
So that's simple then: keep the majority of each shard's replica set in the same data center as its writers, and you will continue writing to that shard as long as that data center is up.
I have a feeling, though, that you expect disconnected clients to somehow magically stash away their writes for the other data center's shards somewhere. That cannot happen - they cannot see the other data center. So when connectivity returns, there is nothing for the DB to cope with (other than the fact that a bunch of writes failed during the disconnected phase).
It's not possible to "divide a replica set" to have two primaries in the same set.
So you have two replica sets, sharded on one key, with mongos as the router.
One solution is to make the first part of the sharding key start with 'A' or 'B': if a new record's key starts with A it gets routed to the first set, and if it starts with B it gets routed to the second set.
This is the only way you can control where mongos will try to put the data.
Connectivity problems between mongos and the different replica sets don't matter as long as your new entries don't have a sharding key that matches the broken replica set.
You could, for example, have the mongo client in datacenter A always start the sharding key with A, which means that if datacenter B is down, only records starting with A are created.
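Tag-aware sharding is the declarative way to express that routing through the shard key; a sketch assuming the key leads with the data-center letter (all names are assumptions):

    sh.shardCollection("mydb.mycoll", { dc: 1, orderId: 1 })
    sh.addShardTag("shardA", "dcA")
    sh.addShardTag("shardB", "dcB")
    // pin each key range to its data center's shard
    sh.addTagRange("mydb.mycoll", { dc: "A", orderId: MinKey }, { dc: "B", orderId: MinKey }, "dcA")
    sh.addTagRange("mydb.mycoll", { dc: "B", orderId: MinKey }, { dc: "C", orderId: MinKey }, "dcB")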
Clients in both centers will be able to read from both shards as long as the shards are up.
mongos should run close to each client, so you will have one at each location, each with access to the sharding configuration.
A replica set node cannot become master unless it can see more than half of the nodes. This means that if there's no link to the datacenter holding the majority, you can't write to that shard.
I can see how one can implement a home-grown solution for this. But it requires some effort.
You might also want to look at CouchDB. It's also schemaless and JSON-based and it can handle your situation well (it was basically built for situations like this).

MongoDB replication + sharding on 2 servers: reasonable?

Consider the following setup:
There are 2 physical servers which are set up as a regular MongoDB replica set (including an arbiter process, so automatic failover will work correctly).
Now, as far as I understand, most actual work will be done on the primary server, while the secondary mostly just does the work needed to keep its dataset in sync.
Would it be reasonable to introduce sharding into this setup by configuring a second replica set on the same 2 servers, so that each of them has one mongod process running as a primary and one running as a secondary?
The expected result would be that both servers share the workload of actual queries/inserts while both are up. If one server fails, the whole setup should fail over elegantly and continue running until the other server is restored.
Are there any downsides to this setup, apart from the overall overhead in setup and the number of processes (mongos/config servers/arbiters)?
That would definitely work. I asked in the #mongodb IRC channel a while ago whether it was a bad idea to run multiple mongod processes on a single machine; the answer was "as long as you have the RAM/CPU/bandwidth, go nuts".
It's worth noting that if you're looking for high-performance reads, and don't mind writes being a bit slower, you could:
Do your writes in "safe mode", where the write doesn't return until it's been propagated to N servers (in this case, where N is the number of servers in the replica set, so all of them)
Set the driver-appropriate flag in your connection code to allow reading from slaves.
This would get you a clustered setup similar to MySQL: write once on the master, but any of the slaves is eligible for a read. In a circumstance where you have many more reads than writes (say, an order of magnitude more), this may perform better, but I don't know how it would behave when a node goes down (writes may stall trying to reach 3 nodes when only 2 are up, etc.; that would need testing).
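A minimal sketch of the "safe mode" write; modern shells express it as a write concern (collection name and values are assumptions):

    // don't acknowledge until the write reaches all 3 members;
    // wtimeout bounds the stall if a member is down
    db.orders.insert(
      { sku: "a1", qty: 2 },
      { writeConcern: { w: 3, wtimeout: 5000 } }
    )

With wtimeout set, a write that cannot reach all members errors out instead of blocking indefinitely, which is one answer to the node-down stall mentioned above.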
One thing to note is that while both machines are up, your queries are being split between them. When one goes down, all queries will go to the remaining machine thus doubling the demands placed on it. You'd have to make sure your machines could withstand a sudden doubling of queries.
In that situation, I'd reconsider sharding in the first place, and just make it an un-sharded replica set of 2 machines (+1 arbiter).
You are missing one crucial detail: if you have a sharded setup with only two physical nodes and one dies, all your data is gone. This is because you don't have any redundancy below the sharding layer (the recommended way is for each shard to be composed of a replica set).
What you said about the replica set, however, is true: you can run it on two shared-nothing nodes and have an additional arbiter. However, the recommended setup would be 3 nodes: one primary and two secondaries.
http://www.markus-gattol.name/ws/mongodb.html#do_i_need_an_arbiter