After reading the official documentation for the MongoDB sharding architecture, I have not found out why you need one or three config servers and not some other number.
The MongoDB documentation on Config Servers says:
"If one or two config servers become unavailable, the cluster’s metadata becomes read only. You can still read and write data from the shards, but no chunk migrations or splits will occur until all three servers are available."
Hence my question: one server is a single point of failure, but with two servers we would get the same behavior as with three, right?
So why exactly three servers, and not two or four, for example?
Especially since the docs also say:
"Config servers do not run as replica sets."
Config Server Protocols
MongoDB 3.0 and earlier only support a single type of config server deployment protocol which is referred to as the legacy SCCC (Sync Cluster Connection Configuration) as of MongoDB 3.2. An SCCC deployment has either 1 config server (development only) or 3 config servers (production).
MongoDB 3.2 deprecates the SCCC protocol and supports a new deployment type: Config Servers as Replica Sets (CSRS). A CSRS deployment has the same limits as a standard replica set, which can have 1 config server (development only) or up to 50 servers (production) as at MongoDB 3.2. A minimum of 3 CSRS servers is recommended for high availability in a production deployment, but additional servers may be useful for geographically distributed deployments.
SCCC (Sync Cluster Connection Configuration)
With SCCC, the config servers are updated using a two-phase commit protocol which requires consensus from multiple servers for a transaction. You can use a single config server for testing/development purposes, but in production usage you should always have 3. A practical answer for why you cannot use only 2 (or more than 3) servers in MongoDB is that the MongoDB code base only supports 1 or 3 config servers for an SCCC configuration.
Three servers provide a stronger guarantee of consistency than two servers, and allow for maintenance activity (for example, backups) on one config server while still having two servers available for your mongos to query. More than three servers would increase the time required to commit data across all servers.
The metadata for your sharded cluster needs to be identical across all config servers, and is maintained by the MongoDB sharding implementation. The metadata includes the essential details of which shards currently hold ranges of documents (aka chunks). In an SCCC configuration, config servers are not a replica set, so if one or more config servers are offline then the config data will be read only -- otherwise there is no means for the data to propagate to the offline config servers when they are back online.
Clearly 1 config server provides no redundancy or backup. With 2 config servers, a potential failure scenario is where the servers are available but the data on the servers does not agree (for example, one of the servers had some data corruption). With 3 config servers you can improve on the previous scenario: 2/3 servers might be consistent and you could identify the odd server out.
CSRS (Config Servers as Replica Sets)
MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers, and starting in 3.2 config servers are (by default) deployed as a replica set. Replica set config servers must use the WiredTiger 3.2+ storage engine (or another storage engine that supports the new readConcern read isolation semantics). CSRS also disallows some non-default replica set configuration options (e.g. arbiterOnly, buildIndexes, and slaveDelay) that are unsuitable for the sharded cluster metadata use case.
The CSRS deployment improves consistency and availability for config servers, since MongoDB can take advantage of the standard replica set read and write protocols for sharding config data. In addition, this allows a sharded cluster to have more than 3 config servers since a replica set can have up to 50 members (as at MongoDB 3.2).
With a CSRS deployment, write availability depends on maintaining a quorum of members that can see the current primary for a replica set. For example, a 3-node replica set would require 2 available members to maintain a primary. Additional members can be added for improved fault tolerance, subject to the same election rules as a normal replica set. A readConcern of majority is used by mongos to ensure that sharded cluster metadata can only be read once it is committed to a majority of replica set members and a readPreference of nearest is used to route requests to the nearest config server.
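To make the CSRS deployment concrete, here is a minimal sketch of how a 3-member config server replica set might be started (host names, ports and data paths are placeholders, not taken from the documentation above):

    # start each config server with the replica set options (MongoDB 3.2+)
    mongod --configsvr --replSet configReplSet --port 27019 --dbpath /data/configdb

    # from a mongo shell connected to one of them, initiate the set
    rs.initiate({
      _id: "configReplSet",
      configsvr: true,
      members: [
        { _id: 0, host: "cfg1.example.net:27019" },
        { _id: 1, host: "cfg2.example.net:27019" },
        { _id: 2, host: "cfg3.example.net:27019" }
      ]
    })

    # mongos is then pointed at the replica set name rather than a fixed list of mirrored servers
    mongos --configdb configReplSet/cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019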
Related
Currently we are working with a standalone MongoDB deployment without any replication or sharding. Now we are considering moving to a replica set for production purposes.
Will an application written for standalone MongoDB work with a replica set or a sharded cluster of replica sets without any changes, or are there some standalone/replica-set specific features in MongoDB?
Provided MongoDB uses the default ports (27017 for a standalone mongod and for mongos), you don't need to touch your client application at all; it will work in either case.
Of course, a sharded cluster gives you more connection options, but the defaults are fine.
"Will an application written for standalone MongoDB work with a replica set or a sharded cluster of replica sets without any changes, or are there some standalone/replica-set specific features in MongoDB?"
Here are some things to think about when an application is to run on a replica-set or a sharded cluster. In addition, replica-sets and sharded clusters have some features not available in a standalone deployment (see the Transactions and Change Streams topic at the bottom).
Replica Sets
A replica-set is a cluster of multiple database servers, with the data replicated on each server. The topology of the replica-set has one primary node (or member), and the remaining members are secondaries (there can also be other special-purpose nodes like arbiters).
The data redundancy and failover features of replica-sets give your applications additional capabilities - for example, an application keeps running even if a server is down.
By default, data is always written to and read from the primary. You can configure your application so that data can also be read from the secondary nodes - this is the Read Preference. This configuration can be used by applications accessing a replica-set in some scenarios (see Read Preference Use Cases). This is for replica-sets and has no use in a standalone deployment.
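As a small, hedged illustration, a read preference can be set in the connection string or per connection from the mongo shell (host names, set name and collection are hypothetical):

    # in the connection string used by the application
    mongodb://db1.example.net:27017,db2.example.net:27017/mydb?replicaSet=rs0&readPreference=secondaryPreferred

    // or per connection in the mongo shell
    db.getMongo().setReadPref("secondaryPreferred")
    db.orders.find({ status: "open" })   // may now be served by a secondary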
Also, see Replica Set Read and Write Semantics:
"From the perspective of a client application, whether a MongoDB instance is running as a single server (i.e. “standalone”) or a replica set is transparent. However, MongoDB provides additional read and write configurations for replica sets."
Then there are details like the Connection String URI, which uses a different format for replica-sets and sharded clusters - this is what applications use to connect.
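For example, the URI shapes differ roughly as follows (host names and the database name are placeholders):

    # standalone
    mongodb://db1.example.net:27017/mydb

    # replica-set: list some seed members and name the set
    mongodb://db1.example.net:27017,db2.example.net:27017,db3.example.net:27017/mydb?replicaSet=rs0

    # sharded cluster: connect to one or more mongos routers
    mongodb://mongos1.example.net:27017,mongos2.example.net:27017/mydb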
Sharded Cluster
The application cannot simply be run against a sharded cluster deployment as it is. It will require design-level changes - and these will affect the queries. Sharding is about distributing the data among shards. Note that in a sharded cluster each shard is a replica-set. A sharded database can have sharded and un-sharded collections. Sharded collections are the distributed data.
To create a sharded collection, you must choose a shard key - this is the most important aspect of your application accessing a sharded collection. The shard key determines how queries are routed to a particular shard to get the data. So your application must take the shard key into consideration - queries need to be written with the shard key in mind, as it primarily affects the performance of your application's queries.
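As a rough sketch in the mongo shell (database, collection and shard key names are hypothetical):

    // enable sharding on the database, then shard the collection on a chosen key
    sh.enableSharding("mydb")
    sh.shardCollection("mydb.orders", { customerId: 1 })

    // queries that include the shard key can be routed to a single shard
    db.orders.find({ customerId: 12345 })

    // queries without it are broadcast to every shard (scatter-gather)
    db.orders.find({ status: "open" })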
Also, in the sharded cluster environment the application accesses the database via a mongos router - not the servers directly.
There are many other finer aspects to working with sharded databases and accessing them from applications - the topic is too broad to discuss here. Changing from standalone to a sharded cluster is an architectural change. Some aspects that can affect the application when migrating from standalone to a replica-set also apply here (as each shard is a replica-set).
Also, see Operational Restrictions in Sharded Clusters - these are specific to sharded clusters and not applicable to standalone deployments.
Transactions and Change Streams
Features like transactions and change streams are available only with replica-sets and sharded clusters (and not on single standalone servers). This gives your applications additional capabilities and can help implement complex business logic and scenarios.
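For instance, a change stream can be opened from the mongo shell against a replica set or sharded cluster (MongoDB 3.6+); the collection name here is hypothetical:

    // open a change stream and print each insert/update/delete event as it arrives
    var cursor = db.orders.watch()
    while (cursor.hasNext()) {
      printjson(cursor.next())
    }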
We want to set up a MongoDB sharded cluster (v2.4). The official documentation recommends having 3 config servers.
However, our company policies won't allow us to get 3 extra servers for this purpose. Since we already have 3 application servers (1 web node, 2 process nodes), we are considering putting the config servers on those same application servers, alongside the mongos. Availability is not critical for us.
What do you think about this configuration? Can we face some problem or is it discouraged for some reason?
Given that availability is not critical for your use case, I would say it should be fine to place the config servers on the same servers as the applications and the mongos.
If one of the process nodes goes down, you will lose 1 mongos, 1 application server and 1 config server. During this downtime the other two config servers will be read-only, which means there won't be any balancing of chunks, modification of the cluster config, etc., although your other two mongos should still be operational (CRUD-wise). If your web node is down, then you have a bigger problem to deal with.
If two of the nodes are down (2 process nodes, or the web node and a process node), again you would have a bigger problem to deal with, i.e. your applications are probably not going to work anyway.
Having said that, please consider whether these nodes have the capacity to handle a mongos, an application server and a config server, i.e. CPU, RAM, network connections, etc.
I would recommend testing the deployment architecture in a development/staging cluster first, under your typical workload and use case.
Also see Sharded Cluster High Availability for more info.
Lastly, I would recommend checking out MongoDB v3.2, which is the current stable release. The config servers in v3.2 are modelled as a replica set; see Sharded Cluster Config Servers for more info.
Technically, is it supported to begin a sharded cluster with just one shard? That way we would be ready to add additional shards at any time, while saving the cost of the extra shards before we really need them.
To go further, is it possible to have a shard running on a single instance, instead of it having to be based on a 3-instance replica set?
From here, sharding is:
"A database architecture that partitions data by key ranges and distributes the data among two or more database instances."
A shard can be either a replica set or a standalone mongod instance. You can use a single machine by assigning different ports to establish distinct communication endpoints for the config, mongod and mongos processes on that machine. And yes, you may add a shard at a later time when you need to expand.
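As a hedged sketch, on an older release that still allows a single SCCC config server and a standalone shard, a single-machine test cluster could look roughly like this (ports and data paths are placeholders):

    # one config server, one shard mongod and one mongos, separated only by port
    mongod --configsvr --dbpath /data/config --port 27019
    mongod --dbpath /data/shard0 --port 27018
    mongos --configdb localhost:27019 --port 27017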
However, the point of providing sharding is to support horizontal scaling. Additionally, the point of sharded clustering is to provide failover and redundancy support. By using a single shard on a single server, you are losing the benefits of scaling and certainly failover.
The recommended production architecture includes:
Three config servers on separate machines for each sharded cluster.
Two or more replica sets as shards.
One or more query routers (mongos); typically, one mongos instance per application server.
Peruse the Sharded Cluster Requirements section in the documentation to get a feel for whether or not your environment needs sharding and sharded clusters since there is complexity in establishing such an architecture.
I am aware that MongoDB has a master-slave architecture.
Therefore, I was thinking that the master would be the single point of failure in MongoDB, since it takes care of all the requests and sends them to the slave nodes. However, when the master fails, a new master is elected from the slaves. Therefore I need some clarification on where the single point of failure lies.
Does MongoDB have a single point of failure? Is it in the master node?
Thanks,
MongoDB can be set up in a way that there is no single point of failure (at least none specific to MongoDB).
When you set up replication as suggested (which includes a primary, a secondary and an arbiter on a 3rd server), the secondary will take over the role of the primary when the primary goes down. Keep in mind that this only works when the applications know about both the primary and the secondary (how to make them aware depends on the driver).
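A minimal sketch of initiating such a primary/secondary/arbiter set from the mongo shell (host names are hypothetical):

    rs.initiate({
      _id: "rs0",
      members: [
        { _id: 0, host: "db1.example.net:27017" },                      // initial primary candidate
        { _id: 1, host: "db2.example.net:27017" },                      // secondary
        { _id: 2, host: "arb1.example.net:27017", arbiterOnly: true }   // arbiter, holds no data
      ]
    })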
When you have a sharded cluster, the mongos router processes and the config servers become additional possible points of failure, but you can also set up redundant routers and config servers. To send clients to another mongos server when theirs goes down, you need a third-party load-balancing solution.
For a proper production MongoDB setup with clustering, MongoDB Inc. suggests:
At least 2 mongos routers
Exactly 3 config servers
3 servers per shard (primary, secondary and arbiter), where the arbiters do not necessarily need dedicated servers and can share hardware with the routers, config servers, members of a different replica-set or app servers.
3 instances for config servers
1 instance for webserver & mongos
1 instance for shard 1
Then when I need to start more shards, can I just add more instances?
Also, what is a replica set? If I had, say, 3 servers for shard 1, would that be a replica set?
A Replica Set is a set of servers that are clones of each other (i.e. replicas). Within a given set there is an elected master. By default reads and writes go to this elected master, and the replicas just "tail" the changes to stay up-to-date copies. If the master fails, a new one is elected and the system just keeps going. The documentation is here.
So you ask about scaling with MongoDB. There are two types of scaling:
Read Scaling: use Replica Sets (see here)
Write Scaling: use Sharding
The minimum config for Replica Sets is
- 2 full replicas
- 1 arbiter (lightweight process, breaks ties when voting)
The minimum config for Sharding is
- 1 config server
- 1 mongod process (only one shard)
- 1 or more mongos (generally on an app server)
However, you probably don't want to run like this in production. Running only a single DB means that you only have one copy of the data, which can result in long downtime or total data loss. This is generally solved by using replica sets.
Additionally, the config server is quite important. MongoDB supports 1 or 3 config servers. Most production deployments use 3. Note that config servers and arbiters are very lightweight and can live on other boxes or on Amazon micro instances.
Most production deployments with sharding also involve replica sets. In fact, they usually start as replica sets.
"Then when I need to start more shards, can I just add more instances?"
From a sharding perspective it should be as easy as:
- start the new shard server
- run the addShard command from a mongos
Note that when you add a shard, you will need to allow for time and resources as data migrates between shards and everything re-balances.
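As a rough illustration from a mongo shell connected to a mongos (shard and host names are hypothetical):

    // add a replica-set shard (older versions also accept a plain "host:port" for a standalone shard)
    sh.addShard("rs1/shard1a.example.net:27018,shard1b.example.net:27018")

    // verify the shard list and watch the balancer redistribute chunks
    sh.status()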