I created a MongoDB cluster with 3 shards; each shard contains 3 mongod processes. My cluster also contains 3 mongos and 3 config servers.
In the connection string I put the 3 mongos:
mongodb://user:pass@mongos1:27017,mongos2:27017,mongos3:27017/mydatabase
In the picture you can see that dbShard_2 has 1.19GB of data while the others are almost empty with only 4KB each. The charts, however, show read/write operations on all shards. Is everything fine, or did I misconfigure something? Should I worry?
I let Cloud Manager do the whole configuration for me; I didn't set any of this manually.
Here you can check my sharding status:
mongos> db.printShardingStatus();
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("XXXXX")
}
shards:
{ "_id" : "dbShard_0", "host" : "dbShard_0/dbnode-0.x-app.com:27000,dbnode-1.x-app.com:27000,dbnode-2.x-app.com:27000" }
{ "_id" : "dbShard_1", "host" : "dbShard_1/dbnode-0.x-app.com:27001,dbnode-2.x-app.com:27001,dbnode-2.x-app.com:27002" }
{ "_id" : "dbShard_2", "host" : "dbShard_2/dbnode-0.x-app.com:27002,dbnode-1.x-app.com:27001,dbnode-2.x-app.com:27003" }
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "mydatabase-staging", "partitioned" : false, "primary" : "dbShard_2" }
{ "_id" : "mydatabase", "partitioned" : false, "primary" : "dbShard_2" }
{ "_id" : "test", "partitioned" : false, "primary" : "dbShard_0" }
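One way to read the databases section above: every database is listed with "partitioned" : false, so its collections all live on its primary shard, which would be consistent with dbShard_2 holding almost all of the data. Below is a minimal sketch that groups those entries by primary shard. The helper name groupByPrimary is hypothetical, and the input is just the four entries copied from the output above; on a live cluster, comparable documents could presumably be read from the config.databases collection.

```javascript
// Hypothetical helper: group database entries (shaped like the
// "databases" section of the sharding status) by their primary shard.
function groupByPrimary(databases) {
  const byShard = {};
  for (const entry of databases) {
    if (!byShard[entry.primary]) byShard[entry.primary] = [];
    byShard[entry.primary].push({
      name: entry._id,
      partitioned: entry.partitioned,
    });
  }
  return byShard;
}

// Entries copied from the status output above.
const databases = [
  { _id: "admin", partitioned: false, primary: "config" },
  { _id: "mydatabase-staging", partitioned: false, primary: "dbShard_2" },
  { _id: "mydatabase", partitioned: false, primary: "dbShard_2" },
  { _id: "test", partitioned: false, primary: "dbShard_0" },
];

const grouped = groupByPrimary(databases);
console.log(JSON.stringify(grouped, null, 2));
```

In this output no database has dbShard_1 as its primary, and both application databases point at dbShard_2.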
So I have a sharded cluster with 2 config servers, 2 shards (each with 2 replicas) and 2 mongos instances, everything running on different VMs.
However, after configuring all of it, I finally tried to interact with the (still empty) database by running a simple show dbs from the mongos instance, but it threw the following error (after hanging for about a minute):
uncaught exception: Error: listDatabases failed:{
"ok" : 0,
"errmsg" : "Could not find host matching read preference { mode: \"primary\" } for set rep",
"code" : 133,
"codeName" : "FailedToSatisfyReadPreference",
"operationTime" : Timestamp(1648722327, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1648722327, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
Everything seems to be well configured and when I do sh.status() from the mongos instance it identifies the shards and replicas as such:
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("62421dd6b5f9640f309faca0")
}
shards:
{ "_id" : "rep", "host" : "rep/192.168.86.136:26000,192.168.86.141:26001", "state" : 1 }
{ "_id" : "repb", "host" : "repb/192.168.86.142:26002,192.168.86.143:26003", "state" : 1 }
active mongoses:
"4.4.8" : 2
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 5
Last reported error: Empty host component parsing HostAndPort from ""
Time of Reported error: Thu Mar 31 2022 11:06:39 GMT+0100 (WEST)
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
rep 919
repb 105
too many chunks to print, use verbose if you want to force print
{ "_id" : "testdb", "primary" : "rep", "partitioned" : false, "version" : { "uuid" : UUID("2e584dcd-25ea-4ba4-805c-b40928e26511"), "lastMod" : 1 } }
Maybe a firewall issue.
Every node in your cluster must be able to reach every other node on the corresponding port. See
Simple HTTP/TCP health check for MongoDB
Try this script to check each member of each replica set:
const MONGO_PASSWORD = '*******'
const AUTH_SOURCE = 'admin'
// Reuse the name of the user that is currently authenticated on the mongos.
const user = db.runCommand({ connectionStatus: 1 }).authInfo.authenticatedUsers.shift().user;
const map = db.adminCommand("getShardMap").map;
for (let rs of Object.keys(map)) {
  // Each entry is split into the replica set name and its host list.
  let uri = map[rs].split("/");
  let connectionString = `mongodb://${user}:${MONGO_PASSWORD}@${uri[1]}/admin?replicaSet=${uri[0]}&authSource=${AUTH_SOURCE}`;
  let replicaSet = new Mongo(connectionString).getDB("admin");
  for (let member of replicaSet.adminCommand({ replSetGetStatus: 1 }).members) {
    // Skip members that are not listed in hello().hosts.
    if (!replicaSet.hello().hosts.includes(member.name)) continue;
    printjsononeline({ replicaSet: rs, host: member.name, stateStr: member.stateStr, health: member.health });
    if (member.health != 1 || !["PRIMARY", "SECONDARY"].includes(member.stateStr))
      print(`Member state of ${member.name} is '${member.stateStr}'`);
  }
}
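The error in the question says that no host matching read preference "primary" could be found for set rep, so it may also help to check explicitly whether a set has a primary at all. The following is a minimal sketch of that check, written as a plain function over the members array of a replSetGetStatus reply so it can be tried without a cluster. The helper name summarizeReplicaSet and the sample member states are made up for illustration; they are not the poster's actual output.

```javascript
// Hypothetical helper: inspect the "members" array of a replSetGetStatus
// reply and report whether a healthy PRIMARY exists, plus any member
// that is unhealthy or in a state other than PRIMARY/SECONDARY.
function summarizeReplicaSet(members) {
  const healthyStates = ["PRIMARY", "SECONDARY"];
  const primaries = members.filter(
    (m) => m.health === 1 && m.stateStr === "PRIMARY"
  );
  const problems = members.filter(
    (m) => m.health !== 1 || !healthyStates.includes(m.stateStr)
  );
  return {
    hasPrimary: primaries.length > 0,
    problems: problems.map((m) => `${m.name} (${m.stateStr})`),
  };
}

// Invented sample data (assumption): one secondary, one unreachable member.
const sampleMembers = [
  { name: "192.168.86.136:26000", stateStr: "SECONDARY", health: 1 },
  { name: "192.168.86.141:26001", stateStr: "(not reachable/healthy)", health: 0 },
];

const summary = summarizeReplicaSet(sampleMembers);
console.log(summary);
```

In a shell session the input would be the members field of the replSetGetStatus reply, the same field the script above iterates over.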
Turns out I configured the replica set wrongly, so all I had to do was recreate the volumes of all VMs and configure it all again from scratch. Now it works as it should.
A MongoDB sharded cluster uses a "primary shard" to hold collection data in DBs in which sharding has been enabled (with sh.enableSharding()) but where the collection itself has not yet been sharded (with sh.shardCollection()). The mongos process chooses the primary shard automatically, unless the user states it explicitly as a parameter of sh.enableSharding().
However, what happens in DBs where sh.enableSharding() has not been executed yet? Is there some "global primary" for these cases? How can I know which one it is? sh.status() doesn't show information about it...
I'm using MongoDB 4.2 version.
Thanks!
The documentation says:
The mongos selects the primary shard when creating a new database by picking the shard in the cluster that has the least amount of data.
If enableSharding is called on a database which already exists, the above quote would define the location of the database prior to sharding being enabled on it.
sh.status() shows where the database is stored:
MongoDB Enterprise mongos> use foo
switched to db foo
MongoDB Enterprise mongos> db.foo.insert({a:1})
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5eade78756d7ba8d40fc4317")
}
shards:
{ "_id" : "shard01", "host" : "shard01/localhost:14442,localhost:14443", "state" : 1 }
{ "_id" : "shard02", "host" : "shard02/localhost:14444,localhost:14445", "state" : 1 }
active mongoses:
"4.3.6" : 2
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
{ "_id" : "foo", "primary" : "shard02", "partitioned" : false, "version" : { "uuid" : UUID("ff618243-f4b9-4607-8f79-3075d14d737d"), "lastMod" : 1 } }
{ "_id" : "test", "primary" : "shard01", "partitioned" : false, "version" : { "uuid" : UUID("4d76cf84-4697-4e8c-82f8-a0cfad87be80"), "lastMod" : 1 } }
foo is not partitioned and stored in shard02.
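To look up the primary shard of one database without reading the whole status output, the same information can be picked out of the database entries. This is a minimal sketch under stated assumptions: the helper name primaryShardOf is hypothetical, and the input array is copied from the sh.status() output above. In a live shell, db.getSiblingDB("config").databases.find().toArray() should return comparable documents.

```javascript
// Hypothetical helper: return the primary shard recorded for a database,
// or null when the database is not present in the metadata.
function primaryShardOf(databases, dbName) {
  const entry = databases.find((d) => d._id === dbName);
  return entry ? entry.primary : null;
}

// Entries copied from the sh.status() output above.
const configDatabases = [
  { _id: "config", primary: "config", partitioned: true },
  { _id: "foo", primary: "shard02", partitioned: false },
  { _id: "test", primary: "shard01", partitioned: false },
];

const fooPrimary = primaryShardOf(configDatabases, "foo");
console.log(`foo -> ${fooPrimary}`);
```

A database that has never been written to has no entry yet, so the lookup returns null for it rather than some "global primary".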
If enableSharding is called on a database which doesn't yet exist, the database is created and, if the primary shard is specified, the specified shard is used as the primary shard. Test code here.
I'd like to make a mongo cluster with multiple mongo routers(mongos) with just one shard(mongod) like this figure.
So I made two mongo routers named 'mongorouter-1', 'mongorouter-2', and also made one shard named 'mongod'.
In mongorouter-1 I added 'mongod' successfully with this command:
sh.addShard("mongod:27017")
It works well there, but in mongorouter-2 the same command returns an error:
mongos> sh.addShard("mongod:27017")
{
"ok" : 0,
"errmsg" : "E11000 duplicate key error collection: admin.system.version index: _id_ dup key: { : \"shardIdentity\" }",
"code" : 11000,
"codeName" : "DuplicateKey",
"operationTime" : Timestamp(1558591937, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1558591937, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
In mongorouter-1, sh.status is this
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5ce6683cc490bfc9325389cb")
}
shards:
{ "_id" : "shard0000", "host" : "mongod:27017", "state" : 1 }
active mongoses:
"4.0.6" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
and in mongorouter-2, sh.status is this
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5ce668176a4dcc52fd230ac9")
}
shards:
active mongoses:
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
I don't know how to make multiple mongo routers connected to just one shard.
If you know the solution, help me. Thanks in advance.
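One detail the two outputs above allow checking: they report different clusterId values. Routers that belong to the same cluster read their metadata from the same config servers and so should report the same clusterId, which suggests these two may be pointed at different config servers. Below is a minimal sketch of that comparison; the helper name sameCluster is hypothetical and the ids are copied from the two sh.status() outputs above.

```javascript
// Hypothetical helper: true when every router reports the same clusterId.
function sameCluster(clusterIdByRouter) {
  const distinct = new Set(Object.values(clusterIdByRouter));
  return distinct.size === 1;
}

// clusterId values copied from the two sh.status() outputs above.
const reported = {
  "mongorouter-1": "5ce6683cc490bfc9325389cb",
  "mongorouter-2": "5ce668176a4dcc52fd230ac9",
};

const routersAgree = sameCluster(reported);
console.log(
  routersAgree
    ? "both routers report the same cluster"
    : "routers report different clusterIds"
);
```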
I have a MongoDB replica set: one primary, one secondary and an arbiter to vote. I'm planning to implement sharding as the data is expected to grow exponentially. I find it difficult to follow the MongoDB documentation for sharding. Could someone explain clearly how to set it up? Thanks in advance.
If you managed to set up a replica set, sharding is pretty simple. I'm pretty much repeating the mongo documentation in fast forward here:
Below is for a sample setup: 3 configDB and 3 shards
For the example below you can run all of it on one machine to see it all working.
If you need three shards, set up three replica sets. (Assuming the 3 primaries are 127.0.0.1:27000, 127.0.0.1:37000, 127.0.0.1:47000)
Run 3 mongod instances as the three config servers. (Assuming: 127.0.0.1:27020, 127.0.0.1:27021, 127.0.0.1:27022)
Start mongos (note the s in mongos) letting it know where your config servers are. (ex: 127.0.0.1:27023)
Connect to mongos from mongo shell and add the three primary mongod's of your 3 replica sets as the shards.
Enable sharding for your DB.
If required enable sharding for a collection.
Select a shard key if required. (Very Important you do it right the first time!!!)
Check the shard status
Pump data; connect to the individual mongod primaries and see the data distributed across the three shards.
#start mongos with three configs:
mongos --port 27023 --configdb 127.0.0.1:27020,127.0.0.1:27021,127.0.0.1:27022
mongos> sh.addShard("127.0.0.1:27000");
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard("127.0.0.1:37000");
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.addShard("127.0.0.1:47000");
{ "shardAdded" : "shard0002", "ok" : 1 }
mongos> sh.enableSharding("db_to_shard");
{ "ok" : 1 }
mongos> use db_to_shard;
switched to db db_to_shard
mongos>
mongos> sh.shardCollection("db_to_shard.coll_to_shard", {collId: 1, createdDate: 1} );
{ "collectionsharded" : "db_to_shard.coll_to_shard", "ok" : 1 }
mongos> show databases;
admin (empty)
config 0.063GB
db_to_shard 0.078GB
mongos> sh.status();
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("557003eb4a4e61bb2ea0555b")
}
shards:
{ "_id" : "shard0000", "host" : "127.0.0.1:27000" }
{ "_id" : "shard0001", "host" : "127.0.0.1:37000" }
{ "_id" : "shard0002", "host" : "127.0.0.1:47000" }
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
{ "_id" : "db_to_shard", "partitioned" : true, "primary" : "shard0000" }
db_to_shard.coll_to_shard
shard key: { "collId" : 1, "createdDate" : 1 }
chunks:
shard0000 1
{ "collId" : { "$minKey" : 1 }, "createdDate" : { "$minKey" : 1 } } -->> { "collId" : { "$maxKey" : 1 }, "createdDate" : { "$maxKey" : 1 } } on : shard0000 Timestamp(1, 0)
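To see how the chunks are spread without reading the status output by eye, the chunk metadata can be counted per shard. This is a minimal sketch, assuming chunk documents that carry a shard field like the ones in the config.chunks collection; the helper name chunksPerShard is hypothetical, and the sample input mirrors the single chunk shown in the output above.

```javascript
// Hypothetical helper: count chunks per shard, listing shards that own
// no chunks with an explicit 0 so gaps are visible.
function chunksPerShard(shardIds, chunks) {
  const counts = {};
  for (const id of shardIds) counts[id] = 0;
  for (const chunk of chunks) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  }
  return counts;
}

// Shards and the single chunk taken from the sh.status() output above.
const shardIds = ["shard0000", "shard0001", "shard0002"];
const chunks = [{ ns: "db_to_shard.coll_to_shard", shard: "shard0000" }];

const counts = chunksPerShard(shardIds, chunks);
console.log(counts);
```

With only one chunk, everything still sits on shard0000; the other shards should start to show counts only after the collection has split into more chunks and the balancer has moved some of them.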
I am trying to shard MongoDB. I am done with the sharding configuration, but I am not sure how to verify that sharding is functional.
How do I check whether my data is getting sharded? Is there a query to verify/validate the shards?
You can also execute a simple command on your mongos router :
> use admin
> db.printShardingStatus();
which should output information about your shards, your sharded dbs and your sharded collections, as mentioned in the mongodb documentation
sharding version: { "_id" : 1, "version" : 2 }
shards:
{ "_id" : ObjectId("4bd9ae3e0a2e26420e556876"), "host" : "localhost:30001" }
{ "_id" : ObjectId("4bd9ae420a2e26420e556877"), "host" : "localhost:30002" }
{ "_id" : ObjectId("4bd9ae460a2e26420e556878"), "host" : "localhost:30003" }
databases:
{ "name" : "admin", "partitioned" : false,
"primary" : "localhost:20001",
"_id" : ObjectId("4bd9add2c0302e394c6844b6") }
my chunks
{ "name" : "foo", "partitioned" : true,
"primary" : "localhost:30002",
"sharded" : { "foo.foo" : { "key" : { "_id" : 1 }, "unique" : false } },
"_id" : ObjectId("4bd9ae60c0302e394c6844b7") }
my chunks
foo.foo { "_id" : { $minKey : 1 } } -->> { "_id" : { $maxKey : 1 } }
on : localhost:30002 { "t" : 1272557259000, "i" : 1 }
MongoDB has detailed documentation on Sharding here ...
http://www.mongodb.org/display/DOCS/Sharding+Introduction
To answer your question (I think), see the portion on the config servers ...
Each config server has a complete copy
of all chunk information. A two-phase
commit is used to ensure the
consistency of the configuration data
among the config servers.
Basically, it is the config servers' job to make sure everything gets sharded ... correctly.
Also, there are commands and system collections you can query ...
db.adminCommand( { listShards : 1 } );
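The reply to the shard list command above contains one entry per shard with a host string. For a standalone shard that is a plain host:port, while for a replica set shard it has the form setName/host1,host2. Here is a minimal sketch for splitting such strings; the helper name parseShardHost is hypothetical, and the replica set sample value is invented for illustration.

```javascript
// Hypothetical helper: split a shard "host" string into the replica set
// name (null for a standalone shard) and the list of member addresses.
function parseShardHost(host) {
  const slash = host.indexOf("/");
  if (slash === -1) {
    return { replicaSet: null, members: host.split(",") };
  }
  return {
    replicaSet: host.slice(0, slash),
    members: host.slice(slash + 1).split(","),
  };
}

// "localhost:30001" appears in the output above; the second value is made up.
const standalone = parseShardHost("localhost:30001");
const replicaSetShard = parseShardHost("rs0/hostA:27018,hostB:27018");
console.log(standalone, replicaSetShard);
```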
There is lots of help in the presentations below too ...
http://www.slideshare.net/mongodb/mongodb-sharding-internals
http://www.10gen.com/video/mongosv2010/sharding
If you just want to check whether you are connected to a sharded cluster or not:
db.isMaster() can be used to detect that you are connected to a sharding router (mongos).
If db.isMaster().msg is "isdbgrid", you are connected to a sharded instance.
db.isMaster() can be run without authentication.
For checking the details of the shards, sh.status() also works; it produces the same output as db.printShardingStatus().
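The isdbgrid check can be wrapped in a small function so application code can branch on it. This is a minimal sketch that takes the reply document of db.isMaster() as a plain object; the helper name isMongos and both sample replies are assumptions made up for illustration, reduced to the fields that matter here.

```javascript
// Hypothetical helper: true when an isMaster reply comes from a mongos,
// which identifies itself with msg === "isdbgrid".
function isMongos(reply) {
  return reply != null && reply.msg === "isdbgrid";
}

// Invented sample replies (assumption): a router and a replica set member.
const fromRouter = { ismaster: true, msg: "isdbgrid", ok: 1 };
const fromMongod = { ismaster: true, setName: "rs0", ok: 1 };

console.log(isMongos(fromRouter), isMongos(fromMongod));
```

In a shell session the call would be isMongos(db.isMaster()).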