Distributed nodes syncing real-time data without a single point of failure

Recently I've been reading about methods to horizontally scale a server.
The majority of methods, if not all, require a central server to sync the data (Redis is widely used here). That means that multiple nodes rely on one central server, a single point of failure, to keep the data synced between them.
I know that I can run multiple Redis instances so the nodes can connect to a backup Redis server and keep syncing data in case the main Redis server fails, but I don't like the idea of having "2 layers" (a server layer and a Redis layer) to achieve this. Is there any piece of software, or method, that lets multiple nodes stay in sync while being connected only to each other?
I mean, something like having the 3 Redis servers installed on the same 3 server machines. If one server goes down, it takes its Redis server with it, and the other 2 server machines connect to the fallback Redis servers installed on their own machines. But instead of this hackish arrangement, I'd like a piece of software that simply connects the nodes, syncs the data, and so on.
I've been thinking about such a system, but one problem arises, and it's a problem that also applies to the "multiple Redis instances" approach.
Imagine that I have X Redis servers in different datacenters, and X application servers in different datacenters too; I keep them spread out to avoid a single point of failure. Now, let's say that half of the servers lose their connection to the other half, not because machines went down, but because of a connection failure (an ISP problem and such). The slaves are going to think the master Redis server is dead, because the connection dropped and they cannot reconnect, and the application servers that lost the connection will conclude the same and start connecting and syncing data through the slaves.
Now we have a scenario with one master Redis server and X servers syncing data between them, and some slave Redis servers with the other X servers syncing data between them: a classic split-brain.
With this, the whole distributed system built to avoid a single point of failure is a failure by itself, because it still has a single point of failure, one that, instead of making people lose data, fills the system with unsynced garbage data.
And since this is real-time data, "save timestamps and resync when the connection recovers in a few minutes" is not a solution.
What do you think? Any solution?

Related

Hosting MongoDB replica sets and config servers on other servers?

Should I keep the replica sets and config servers on separate servers? Or put one replica set and one config server on the same server? Can I have all replica sets on one server and all config servers on another? (Does this defeat the purpose of sharding?)
The purpose of sharding is to distribute load over multiple servers. The purpose of replication is (mostly) redundancy, allowing one server to take the place of another when that server goes offline for some reason. Obviously, in either case it does not make much sense to run multiple instances on the same server. So yes, it would defeat the purpose of sharding.
However, when you only have two servers and have to choose between replication and sharding, you can get the best of both worlds by creating two shards, where each shard has a secondary running on the server that hosts the primary of the other shard. That way you get the performance improvement while everything is OK, but don't lose access to half your data when one server goes down.
Regarding the config servers: MongoDB recommends making them a separate replica set running on separate servers. But when you are on a budget, it should technically be possible to put that replica set on the same hardware that runs the actual database. The config servers are only required when a mongos process (re)starts or when a chunk migration happens, and they are relatively idle the rest of the time. Unfortunately, a chunk migration is also a phase where the involved shards are very busy, so running the config servers on the same hardware will make the performance drop during chunk migrations even worse.
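As a sketch of the cross-replicated two-shard layout described above, here is how it might be wired up with pymongo. The hostnames (serverA, serverB), ports, and replica-set names are made up for illustration:

    # Hypothetical wiring for the two-shard layout above, using pymongo.
    from pymongo import MongoClient

    # Connect to a mongos router (the config-server replica set must
    # already be running for mongos to start).
    mongos = MongoClient("mongodb://serverA:27017")

    # Shard 1: replica set with its primary on serverA, secondary on serverB.
    mongos.admin.command("addShard", "shard1/serverA:27018,serverB:27019")

    # Shard 2: replica set with its primary on serverB, secondary on serverA.
    mongos.admin.command("addShard", "shard2/serverB:27018,serverA:27019")

Note that which member ends up primary is decided by the member priorities in each replica set's configuration, not by the order of hosts in the seed string.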

In a containerized cluster, should MongoDB servers run as a worker or a core service?

I'm trying to implement an architecture similar to CoreOS's production architecture (shown below).
Should I run the database as a central service, or on one or more of the workers?
I figured the database needs some kind of replication, which makes me think that putting it in the worker cluster makes more sense, but I'm just not sure.
This should run as a worker. The central services are the basic things that come with CoreOS (mainly etcd); the workers host your applications, the database being one of them. You do have a persistence issue, because your database will have state to remember between restarts, so the bigger question is how you provide that persistence. One way to do it is to use a directory on a host, give the database an affinity to that host, and mount that host directory into the container (see the sketch below). Another thing you might consider is running more than one database (if your db technology supports that) and replicating it, so you have two (or more) copies in different workers (non-affinity). If your database creates transaction logs that can be applied to a backup, you can manage those transaction logs in a worker.
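As a rough sketch of the host-affinity idea, a fleet unit file along these lines (the unit name, image, paths, and metadata value are all hypothetical) would pin the database container to a tagged machine and mount a host directory so the data survives restarts:

    # mongo.service -- hypothetical fleet unit; names and paths are illustrative
    [Unit]
    Description=MongoDB worker with a host-mounted data directory

    [Service]
    ExecStart=/usr/bin/docker run --rm --name mongo -v /var/lib/mongo:/data/db mongo
    ExecStop=/usr/bin/docker stop mongo

    [X-Fleet]
    # Only schedule on the machine tagged as the database host, so the
    # container always lands where its data directory lives.
    MachineMetadata=role=database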
Another thing to consider is not using a container for your database at all. The database is a weird animal; its care and feeding is not like that of the rest of your applications. So it is reasonable (in my opinion) to have your database managed and maintained outside the scope of your cluster, but still reachable by the cluster.

Can I keep two mongo databases synced?

I have an app that can run in offline mode. If offline, it uses a local mongo database; if it has a data connection, it uses a remote mongo database.
Is there an easy way to sync these two databases and make sure they both have the union of their collections and documents?
EDIT: Effectively there are two databases that could both have insertions and deletions happening on them that aren't happening on the other. At fixed points in time I would like both databases to show the union of the two.
For example, over a period of time:
DB1.insert(A)
DB1.insert(B)
DB2.insert(C)
DB1.remove(A)
RUN SYNC
DB1 = DB2 = {B, C}
EDIT2: I've been doing some reading. It's not their intended purpose, but could the local databases be set up as secondaries in a replica set with the remote and used that way? The problem is that I think replica-set members must be accessible by way of resolvable DNS, and I'm not sure how the remote could reach a local host.
You could use a replica set, but MongoDB doesn't support master-master replication. Let's assume you have a setup like this:
two nodes with priority 1, which will be used as the remote servers
a single arbiter to ensure a majority if one of the remotes dies
five local dbs with priority set to 0
When your application goes offline, its local member will stay secondary, so you won't be able to perform writes on it. When you go back online it will sync changes from the remote dbs, but you still need some way of syncing the local changes. One way of dealing with this could be a local fallback db which is used for writes while you are offline; when you go back online, you push all the new records to the master (see the sketch below). Dealing with updates is a little trickier, but it is doable.
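A minimal sketch of that fallback idea, assuming pymongo; the URIs, database, and collection names are made up for illustration:

    # Offline writes go to a local queue collection; once the primary is
    # reachable again they are replayed. All names here are hypothetical.
    from pymongo import MongoClient
    from pymongo.errors import PyMongoError

    pending = MongoClient("mongodb://localhost:27017").app.pending_writes
    remote = MongoClient("mongodb://remote-primary:27017").app.documents

    def write(doc):
        try:
            remote.insert_one(doc)       # online: write straight to the primary
        except PyMongoError:
            pending.insert_one(doc)      # offline: queue the write locally

    def flush_pending():
        # Replay queued offline writes; replace_one with upsert keeps the
        # replay idempotent if it gets interrupted and retried.
        for doc in list(pending.find()):
            remote.replace_one({"_id": doc["_id"]}, doc, upsert=True)
            pending.delete_one({"_id": doc["_id"]})

Updates would need the operation itself queued rather than the final document, which is the trickier part mentioned above.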
Another problem is that this won't scale up if you need to add more applications: if I remember correctly, there is a limit of 12 nodes per replica set. For a small cluster, the DNS resolution issue could be solved with SSH tunnels.
Another way of dealing with the problem could be a small RESTful service plus document timestamps: whenever the app is online, it periodically pushes local inserts to the remote and pulls data from the remote db.
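A sketch of that timestamp approach, assuming every document carries an updated_at field maintained by the application (the field and function names are illustrative):

    from datetime import datetime, timezone

    def sync(local_coll, remote_coll, last_sync):
        # Two-way sync of everything modified since the last checkpoint.
        now = datetime.now(timezone.utc)
        # Push local changes to the remote.
        for doc in local_coll.find({"updated_at": {"$gt": last_sync}}):
            remote_coll.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        # Pull remote changes into the local db.
        for doc in remote_coll.find({"updated_at": {"$gt": last_sync}}):
            local_coll.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        return now  # persist this value as the next checkpoint

Note that deletions will not propagate this way; they would need tombstone documents (a deleted flag plus updated_at) rather than physical removes, which is what the DB1.remove(A) case in the question would require.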

Going from a single zookeeper server to a clustered configuration

I have a single server that I now want to replicate to get higher availability. One of the elements in my software stack is ZooKeeper, so it seems natural to move it to a clustered configuration.
However, I have data on my single server, and I couldn't find any guide on moving to a clustered setup. I tried setting up two independent instances and then switching to a clustered configuration, but only the data present on the elected leader was preserved.
So, how can I safely go from a single server setup to a clustered setup without losing data?
If you go from 1 server straight to 3 servers, you may lose data: the 2 new servers are sufficient to form a quorum and elect one of themselves as leader, ignoring the old server and losing all the data on that machine.
If you instead grow your cluster from 1 to 2, then when the two servers start up, a quorum can't form without the old server being involved, and no data will be lost. When the cluster finishes starting, all data will have been synced to both servers.
Then you can grow your cluster from 2 to 3; again a quorum can't form without at least one server that has a copy of the database, and again, when the cluster finishes starting, all data will be synced to all three servers.
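For illustration, the intermediate two-node stage could look like this in each server's zoo.cfg, where zk1 is the original single server and the hostnames are placeholders; growing to three is then a matter of adding a server.3 line everywhere and restarting:

    # zoo.cfg at the two-node stage (hostnames are placeholders)
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    server.1=zk1.example.com:2888:3888
    server.2=zk2.example.com:2888:3888

Each server also needs a myid file in its dataDir whose content matches its server.N number.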

How will the primary server going down be handled automatically in MongoDB replication?

I don't have much hands-on coding experience, and I have a doubt regarding MongoDB replica sets.
Below is the situation:
I have an alert-monitoring application.
It uses MongoDB with a replica set of 3 nodes.
The application's Java code base keeps connecting to the primary and performing some transactions.
Now my question is:
if the primary server goes down, how will it affect the application server?
I mean, will the app server log errors like "connection failed"?
OR
will the replica set automatically pick one of the slaves as the new master and let the application server carry on with its work? How will that happen?
Thanks & Regards,
UDAY
The replica set will try to pick another server as the new primary. If you have three nodes and one goes down, the other two will negotiate which one becomes the new primary. If two go down, or communication between the remaining nodes somehow breaks down, there will be no new primary until the situation is recovered.
The official drivers support this automatic failover, as does the mongos routing server if you use it, so your application code does not need to do anything here.
I am not sure whether there will be connection errors during the brief period this failover negotiation takes (you will probably get errors for a few seconds).
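To be safe, the application can retry briefly around the election. A minimal sketch with pymongo (the hostnames, replica-set name, and collection names are illustrative):

    # Retry a write for the few seconds a failover election may take.
    import time
    from pymongo import MongoClient
    from pymongo.errors import AutoReconnect

    client = MongoClient("mongodb://node1:27017,node2:27017,node3:27017/?replicaSet=rs0")
    alerts = client.monitoring.alerts

    def insert_with_retry(doc, attempts=5, delay=1.0):
        for _ in range(attempts):
            try:
                return alerts.insert_one(doc)
            except AutoReconnect:
                time.sleep(delay)   # no primary yet; wait out the election
        raise RuntimeError("no primary elected after %d attempts" % attempts)

In a real system the retried operation should be idempotent (for example by fixing the document's _id up front), and newer official drivers can also handle this automatically via retryable writes.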