I have 3 nodes acting as shard servers and config servers (the shard servers run on the standard port 27017 and the config servers run on port 27019):
stage-mongo1-tmp, stage-mongo2-tmp, and stage-mongo3-tmp
and a query router
stage-query0-mongo
in my current setup.
Sharding is working perfectly, as expected:
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 3,
"minCompatibleVersion" : 3,
"currentVersion" : 4,
"clusterId" : ObjectId("5321a5cc8a18e5280f7c9d5a")
}
shards:
{ "_id" : "shard0000", "host" : "stage-mongo1-tmp:27017" }
{ "_id" : "shard0001", "host" : "stage-mongo2-tmp:27017" }
{ "_id" : "shard0002", "host" : "stage-mongo3-tmp:27017" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "testdb", "partitioned" : true, "primary" : "shard0000" }
testdb.testcollection
shard key: { "_id" : "hashed" }
chunks:
shard0001 28
shard0002 31
shard0000 28
too many chunks to print, use verbose if you want to force print
Then I enabled a replica set on these nodes. I logged in to stage-mongo1-tmp and ran
rs.initiate()
and added stage-mongo2-tmp and stage-mongo3-tmp as replica set members with
rs.add("stage-mongo2-tmp")
The log files say replication is enabled and a primary has been elected.
rs.conf()
was showing good output:
[rsBackgroundSync] replSet syncing to: stage-mongo1-tmp:27017
[rsSync] oplog sync 2 of 3
[rsSyncNotifier] replset setting oplog notifier to stage-mongo1-tmp:27017
[rsSync] replSet initial sync building indexes
[rsSync] replSet initial sync cloning indexes for : ekl_omniscient
[rsSync] build index ekl_omniscient.serviceability { _id: "hashed" }
[rsSync] build index done. scanned 571242 total records. 3.894 secs
replSet RECOVERING
replSet initial sync done
replSet SECONDARY
However, when I test high availability by taking one node down, mongos on the query node returns an error:
mongos> show dbs;
Thu Mar 13 20:17:04.394 listDatabases failed:{
"code" : 11002,
"ok" : 0,
"errmsg" : "exception: socket exception [CONNECT_ERROR] for stage-mongo1-tmp:27017"} at src/mongo/shell/mongo.js:46
When I connect to one of the other nodes, one of them has automatically been elected as primary. Still, my queries return errors.
What am I doing wrong with the replica set? Why is it not highly available? Do I need to add more servers to make it highly available? I am looking for the minimum set of servers needed to implement this.
Figured it out. The shard has to be added as the replica set, not as a single host:
sh.addShard("rs0/host:port,...")
Once this is done, we need to enable sharding at the database level and the collection level. This gives us both sharding and replication.
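For reference, with the host names from this setup the commands would look roughly like this (a sketch only; the replica set name rs0 is an assumption):
// add the shard as a replica set (member list taken from the setup described above)
sh.addShard("rs0/stage-mongo1-tmp:27017,stage-mongo2-tmp:27017,stage-mongo3-tmp:27017")
// then enable sharding at the database and collection level
sh.enableSharding("testdb")
sh.shardCollection("testdb.testcollection", { "_id": "hashed" })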
A MongoDB sharded cluster uses a "primary shard" to hold collection data in DBs in which sharding has been enabled (with sh.enableSharding()) but the collection itself has not yet been sharded (with sh.shardCollection()). The mongos process chooses the primary shard automatically, unless the user states it explicitly as a parameter of sh.enableSharding().
However, what happens in DBs where sh.enableSharding() has not been executed yet? Is there some "global primary" for these cases? How can I know which one it is? sh.status() doesn't show information about it...
I'm using MongoDB 4.2 version.
Thanks!
The documentation says:
The mongos selects the primary shard when creating a new database by picking the shard in the cluster that has the least amount of data.
If enableSharding is called on a database which already exists, the quote above defines where the database was placed before sharding was enabled on it.
sh.status() shows where the database is stored:
MongoDB Enterprise mongos> use foo
switched to db foo
MongoDB Enterprise mongos> db.foo.insert({a:1})
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5eade78756d7ba8d40fc4317")
}
shards:
{ "_id" : "shard01", "host" : "shard01/localhost:14442,localhost:14443", "state" : 1 }
{ "_id" : "shard02", "host" : "shard02/localhost:14444,localhost:14445", "state" : 1 }
active mongoses:
"4.3.6" : 2
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
{ "_id" : "foo", "primary" : "shard02", "partitioned" : false, "version" : { "uuid" : UUID("ff618243-f4b9-4607-8f79-3075d14d737d"), "lastMod" : 1 } }
{ "_id" : "test", "primary" : "shard01", "partitioned" : false, "version" : { "uuid" : UUID("4d76cf84-4697-4e8c-82f8-a0cfad87be80"), "lastMod" : 1 } }
foo is not partitioned and is stored on shard02.
If enableSharding is called on a database which doesn't yet exist, the database is created and, if a primary shard is specified, that shard is used as the primary shard. Test code here.
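For example, something along these lines pins or changes the primary shard (a sketch; the database name "bar" is made up, and the optional second argument to sh.enableSharding() is only available on sufficiently recent server releases):
// create database "bar" and pin its primary shard explicitly
sh.enableSharding("bar", "shard01")
// move the primary shard of an existing, still unsharded database such as "foo"
db.adminCommand({ movePrimary: "foo", to: "shard01" })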
sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5dfa6c3cb121a735f9ad8f6e")
}
shards:
{ "_id" : "s0", "host" : "s0/localhost:37017,localhost:37018,localhost:37019", "state" : 1 }
{ "_id" : "s1", "host" : "s1/localhost:47017,localhost:47018,localhost:47019", "state" : 1 }
{ "_id" : "s2", "host" : "s2/localhost:57017,localhost:57018,localhost:57019", "state" : 1 }
active mongoses:
"4.2.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Balancer active window is set between 00:00 and 23:59 server local time
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "apple", "primary" : "s2", "partitioned" : true, "version" : { "uuid" : UUID("20431f1a-ddb1-4fac-887e-c4c5db01b211"), "lastMod" : 1 } }
apple.user
shard key: { "userId" : 1 }
unique: false
balancing: true
chunks:
undefined undefined
too many chunks to print, use verbose if you want to force print
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
undefined undefined
too many chunks to print, use verbose if you want to force print
After starting the balancer using sh.startBalancer(), when I check the status, the balancer running status is still false.
Is there anything that needs to be configured while creating the shards?
After starting the balancer using sh.startBalancer(), when I check the status, the balancer running status is still false.
The balancer is a background process responsible for evenly distributing chunks across a sharded cluster. It is automatic and enabled by default. It runs on the primary of the config server replica set (on mongos in version 3.2 and earlier).
The balancer runs only when needed. It checks the chunk distribution across the cluster against certain migration thresholds and identifies which shards hold too many chunks. If it detects an imbalance, it starts a balancer round and moves chunks between shards in an attempt to achieve an even data distribution.
From the sh.status() output in the post, the balancer is enabled and not running:
balancer:
Currently enabled: yes
Currently running: no
NOTE: The balancer will run automatically when an even chunk distribution is needed.
You can manually enable and disable the balancer at any time: sh.startBalancer() enables the balancer and sh.stopBalancer() disables it temporarily when needed.
sh.getBalancerState() tells you whether the balancer is enabled or not. sh.enableBalancing() does not start balancing; rather, it allows balancing of a collection the next time the balancer runs.
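For example, a quick way to check both states from mongos (the exact return format can differ between shell versions):
sh.getBalancerState()     // true when the balancer is enabled
sh.isBalancerRunning()    // true only while a balancing round is actually in progress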
Reference: See Sharded Cluster Balancer.
I am having a problem with my sharded cluster.
I set up a new cluster with 1 router, 2 replica set shards (2 nodes each), and a single 3-member config server replica set.
I believe I set everything up correctly, created collections, and added indexes, but when I go to insert or query data in the collections, I get the error:
Error: error: {
"ok" : 0,
"errmsg" : "None of the hosts for replica set configReplSet could be contacted.",
"code" : 71
}
configReplSet is my config replica set. It is accessible from the box; I was able to use a mongo shell to log in to the primary of the replica set.
Any help with what could cause this error would be greatly appreciated.
Here is my sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("58a761728dfc0e1090b7c592")
}
shards:
{ "_id" : "rs0", "host" : "rs0/mdbshard-b1:27017,mdbshard-b1rep:27017" }
{ "_id" : "rs1", "host" : "rs1/mdbshard-b2rep:27018,mdbshard-b2:27017" }
active mongoses:
"3.2.12" : 1
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
and my shard router config file:
replication:
  localPingThresholdMs: 15
sharding:
  autoSplit: true
  configDB: "configReplSet/mdbcfg-b1:27019,mdbcfg-b2:27019,mdbcfg-b3:27019"
  chunkSize: 64
processManagement:
  fork: true
systemLog:
  destination: file
  path: "/var/log/mongodb/mongodb.log"
  logAppend: true
Please let me know if you need any other information, I would be happy to provide it.
Check if this helps: Relevant Error
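One quick check worth running from the host where mongos runs is whether every config server listed in configDB is reachable by exactly that name and port, for example (host names taken from the config file above):
# all three of these should succeed from the mongos box
mongo --host mdbcfg-b1 --port 27019 --eval 'db.adminCommand({ ping: 1 })'
mongo --host mdbcfg-b2 --port 27019 --eval 'db.adminCommand({ ping: 1 })'
mongo --host mdbcfg-b3 --port 27019 --eval 'db.adminCommand({ ping: 1 })'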
I have a MongoDB replica set: one primary, one secondary, and an arbiter to vote. I'm planning to implement sharding as the data is expected to grow exponentially. I find the MongoDB documentation on sharding difficult to follow. Could someone explain clearly how to set it up? Thanks in advance.
If you can get a replica set working, sharding is pretty simple. This is pretty much the MongoDB documentation in fast forward:
Below is a sample setup: 3 config servers and 3 shards.
For the example below, you can run all of it on one machine to see it working.
If you need three shards, set up three replica sets. (Assuming the 3 primaries are 127.0.0.1:27000, 127.0.0.1:37000, 127.0.0.1:47000.)
Run 3 mongod instances as three config servers; a sketch follows this list. (Assuming: 127.0.0.1:27017, 127.0.0.1:27018, 127.0.0.1:27019.)
Start mongos (note the s in mongos), letting it know where your config servers are. (Example: 127.0.0.1:27023.)
Connect to mongos from the mongo shell and add the three primary mongods of your 3 replica sets as the shards.
Enable sharding for your DB.
If required, enable sharding for a collection.
Select a shard key if required. (It is very important that you get this right the first time!)
Check the shard status.
Pump in data, connect to the individual mongod primaries, and see the data distributed across the three shards.
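For instance, the config server step above could look like this (a sketch only; the dbpath locations are made up, and the ports match the mongos command shown next):
# three config servers (dbpath locations are just examples)
mongod --configsvr --port 27017 --dbpath /data/config1 --fork --logpath /data/config1.log
mongod --configsvr --port 27018 --dbpath /data/config2 --fork --logpath /data/config2.log
mongod --configsvr --port 27019 --dbpath /data/config3 --fork --logpath /data/config3.log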
# start mongos, pointing it at the three config servers:
mongos --port 27023 --configdb localhost:27017,localhost:27018,localhost:27019
mongos> sh.addShard("127.0.0.1:27000");
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard("127.0.0.1:37000");
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.addShard("127.0.0.1:47000");
{ "shardAdded" : "shard0002", "ok" : 1 }
mongos> sh.enableSharding("db_to_shard");
{ "ok" : 1 }
mongos> use db_to_shard;
switched to db db_to_shard
mongos>
mongos> sh.shardCollection("db_to_shard.coll_to_shard", {collId: 1, createdDate: 1} );
{ "collectionsharded" : "db_to_shard.coll_to_shard", "ok" : 1 }
mongos> show databases;
admin (empty)
config 0.063GB
db_to_shard 0.078GB
mongos> sh.status();
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("557003eb4a4e61bb2ea0555b")
}
shards:
{ "_id" : "shard0000", "host" : "127.0.0.1:27000" }
{ "_id" : "shard0001", "host" : "127.0.0.1:37000" }
{ "_id" : "shard0002", "host" : "127.0.0.1:47000" }
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
{ "_id" : "db_to_shard", "partitioned" : true, "primary" : "shard0000" }
db_to_shard.coll_to_shard
shard key: { "collId" : 1, "createdDate" : 1 }
chunks:
shard0000 1
{ "collId" : { "$minKey" : 1 }, "createdDate" : { "$minKey" : 1 } } -->> { "collId" : { "$maxKey" : 1 }, "createdDate" : { "$maxKey" : 1 } } on : shard0000 Timestamp(1, 0)
I am trying to test sharding in MongoDB. In the examples I use host1.com and host2.com instead of the real server names.
So I created config server at host1.com:
mongod --dbpath /path/to/configdb/ --configsvr
Started mongos on the same machine:
mongos --configdb host1.com --port 27020
And started mongod on two machines (host1.com and host2.com):
mongod --dbpath /path/to/test_shard_db/ --shardsvr
I added the shards and enabled sharding for the database test and the collection test with the shard key {'name': 1} (the collection has only this field and _id, for testing), as explained in the tutorial. But after all these operations, all my data is written to only one shard, which is the primary for the database.
Here is the config:
Sharding status:
mongos> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "host1.com:27018", "maxSize" : NumberLong(1) }
{ "_id" : "shard0001", "host" : "host2.com:27018", "maxSize" : NumberLong(10) }
databases:
...
{ "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
test.test chunks:
shard0001 1
{ "name" : { $minKey : 1 } } -->> { "name" : { $maxKey : 1 } } on : shard0001 Timestamp(1000, 0)
Collection stats:
mongos> db.printCollectionStats()
test
{
"sharded" : false,
"primary" : "shard0000",
"size" : 203535788,
...
}
Balancer status:
mongos> sh.isBalancerRunning()
true
So why does all the data in the collection reside on only one shard, even though I added more than 1 megabyte of data? And why does db.printCollectionStats() show "sharded" : false for test? What did I do wrong?
The default chunk size is 64 MB, so you have room to grow before a split will occur. You can split the shard key range yourself beforehand, which can allow writes to go to multiple shards from the start. See the MongoDB Split Chunks documentation for more info.
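For example, a manual pre-split on the namespace from the question might look like this (the split point "m" is just an arbitrary illustration, not a recommendation):
// split the single initial chunk of test.test at name = "m"
sh.splitAt("test.test", { "name": "m" })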
On the difference between chunk size and maxSize:
maxSize limits the volume of data on a given shard. When it is reached, the balancer will look to move chunks to a shard where maxSize has not been reached. A chunk is a set of documents that all fall within a section of the shard key range, and the MongoDB balancer moves data between shards at the chunk level to keep the cluster balanced. When a chunk approaches the configured chunk size, it is split into two, which may result in a move.
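For reference, the maxSize values shown in the sh.status() output above can be supplied when a shard is added, via the addShard command's maxSize field (in megabytes; this field exists in the older server line the question appears to use and is deprecated in newer releases):
// illustrative only: add host2.com as a shard with a 10 MB data cap
db.adminCommand({ addShard: "host2.com:27018", maxSize: 10 })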