MongoDB sharding example

I'm new to MongoDB. Using mongo 2.0.1 32-bit on Windows, I tried testing out sharding as follows.
Four processes: 2 shards + config server + mongos with a tiny chunk size:
mongod.exe --shardsvr --port 10001 --dbpath <folder1> > shard1.log
mongod.exe --shardsvr --port 10002 --dbpath <folder2> > shard2.log
mongod.exe --configsvr --port 20000 --dbpath <configfolder> > config.log
mongos.exe --configdb localhost:20000 --chunkSize 1 > mongos.log
I ran the shell and set up 2 shards:
mongos> use admin
switched to db admin
mongos> db.runCommand( { addshard : "localhost:10001" } );
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> db.runCommand( { addshard : "localhost:10002" } );
{ "shardAdded" : "shard0001", "ok" : 1 }
Then I enabled sharding for a test database (dbTest) and collection (cTest):
mongos> db.runCommand( { enablesharding : "dbTest" } );
{ "ok" : 1 }
mongos> db.runCommand( { shardcollection : "dbTest.cTest", key : { Name : 1 } } );
{ "collectionssharded" : "dbTest.cTest", "ok" : 1 }
Finally I populated the cTest collection (indexed by Name) with 1,000,005 sample records:
mongos> use dbTest
switched to db dbTest
db.cTest.drop();
db.cTest.ensureIndex({ Name : 1 });
db.cTest.save({Name: "Frank", Age:56, Job: "Accountant", State: "NY"});
db.cTest.save({Name: "Bill" , Age:23, State: "CA"});
db.cTest.save({Name: "Janet", Age:34, Job: "Dancer" });
db.cTest.save({Name: "Andy", Age:44 });
db.cTest.save({Name: "Zach", Age:23, Job: "Fireman", State: "CA"});
i = 1;
while (i <= 1000)
{
    j = 1;
    while (j <= 1000)
    {
        db.cTest.save({Name: "Person(" + i + "," + j + ")", Age: i + j});
        j = j + 1;
    };
    i = i + 1;
};
However, it appears that nothing actually got sharded. In the config database, db.chunks.count() is zero, and I can see from the file sizes in Windows Explorer that all the data went into the physical files set up for the first shard, and none into the second.
Can anyone spot what I've done wrong, and also give some tips on how to administer and debug this kind of thing and see what's going on?
Thanks

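Once you run shardcollection, don't drop the collection. Dropping it removes the metadata about the sharded collection, so the collection that gets recreated by the inserts is an ordinary unsharded one. If you need to drop it, do so before running shardcollection, and only then insert the data.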
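To make the ordering concrete, here is a minimal sketch in plain JavaScript. It only builds descriptions of the commands already used in the question and sends nothing to a server; `shardingSetupSteps` is a hypothetical helper name, not part of MongoDB.

```javascript
// Hypothetical helper: lists the steps in an order that keeps the sharding
// metadata intact. Nothing here talks to a cluster; it only builds the
// command documents seen in the question.
function shardingSetupSteps(dbName, collName, shardKey) {
  const ns = dbName + "." + collName;
  return [
    { step: "drop old data first", run: "db." + collName + ".drop()" },
    { step: "enable sharding", adminCommand: { enablesharding: dbName } },
    { step: "shard the collection", adminCommand: { shardcollection: ns, key: shardKey } },
    { step: "insert documents", run: "db." + collName + ".save(...)" },
  ];
}

const steps = shardingSetupSteps("dbTest", "cTest", { Name: 1 });
steps.forEach((s, i) =>
  console.log(i + 1, s.step, JSON.stringify(s.adminCommand || s.run))
);
```

The point is simply that drop() comes before shardcollection, never after it. Once the data is loaded, db.printShardingStatus() (used in a later question on this page) shows which chunks live on which shard.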

Related

MongoDB error when sharding a collection: ns not found

I've set up a simple server configuration to test sharding functionality, and I get the error above.
My configuration is pretty simple: one config server, one shard server and one mongos (on 127.0.0.1:27019, 127.0.0.1:27018 and 127.0.0.1:27017 respectively).
Everything seems to work well until I try to shard a collection; the command gives me the following:
sh.shardCollection("test.test", { "test" : 1 } )
{
"ok" : 0,
"errmsg" : "ns not found",
"code" : 26,
"codeName" : "NamespaceNotFound",
"operationTime" : Timestamp(1590244259, 5),
"$clusterTime" : {
"clusterTime" : Timestamp(1590244259, 5),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
The config server and shard server outputs show no errors:
2020-05-23T10:39:46.629-0400 I SHARDING [conn11] about to log metadata event into changelog: { _id: "florent-Nitro-AN515-53:27018-2020-05-23T10:39:46.629-0400-5ec935b2bec982e313743b1a", server: "florent-Nitro-AN515-53:27018", shard: "rs0", clientAddr: "127.0.0.1:58242", time: new Date(1590244786629), what: "shardCollection.start", ns: "test.test", details: { shardKey: { test: 1.0 }, collection: "test.test", uuid: UUID("152add6f-e56b-40c4-954c-378920eceede"), empty: false, fromMapReduce: false, primary: "rs0:rs0/127.0.0.1:27018", numChunks: 1 } }
2020-05-23T10:39:46.620-0400 I SHARDING [conn25] distributed lock 'test' acquired for 'shardCollection', ts : 5ec935b235505bcc59eb60c5
2020-05-23T10:39:46.622-0400 I SHARDING [conn25] distributed lock 'test.test' acquired for 'shardCollection', ts : 5ec935b235505bcc59eb60c7
2020-05-23T10:39:46.637-0400 I SHARDING [conn25] distributed lock with ts: 5ec935b235505bcc59eb60c7' unlocked.
2020-05-23T10:39:46.640-0400 I SHARDING [conn25] distributed lock with ts: 5ec935b235505bcc59eb60c5' unlocked.
Of course the collection exists on primary shard:
rs0:PRIMARY> db.test.stats()
{
"ns" : "test.test",
"size" : 216,
"count" : 6,
"avgObjSize" : 36,
"storageSize" : 36864,
"capped" : false,
...
}
I have no idea what could be wrong here; I'd much appreciate any help :)
EDIT:
Here are the steps I followed to run the servers; I probably misunderstood something:
Config server:
sudo mongod --configsvr --replSet rs0 --port 27019 --dbpath /srv/mongodb/cfg
mongo --port 27019
Then in mongo shell
rs.initiate(
{
_id: "rs0",
configsvr: true,
members: [
{ _id : 0, host : "127.0.0.1:27019" }
]
}
)
Shard server:
sudo mongod --shardsvr --replSet rs0 --dbpath /srv/mongodb/shrd1/ --port 27018
mongo --port 27018
Then in shell:
rs.initiate(
{
_id: "rs0",
members: [
{ _id : 0, host : "127.0.0.1:27018" }
]
}
)
db.test.createIndex({test:1})
Router:
sudo mongos --configdb rs0/127.0.0.1:27019
mongo
Then in shell:
sh.addShard('127.0.0.1:27018')
sh.enableSharding('test')
sh.shardCollection('test.test', {test:1})
That error sometimes happens when some routers have an out-of-date view of which databases and collections exist in the sharded cluster.
Try running flushRouterConfig (https://docs.mongodb.com/manual/reference/command/flushRouterConfig/) on each mongos (i.e. connect to each mongos in turn and run the command on it).
I just misunderstood one basic concept: config servers and shard servers are distinct and independent MongoDB instances, so they must belong to distinct replica sets.
So replacing
sudo mongod --configsvr --replSet rs0 --port 27019 --dbpath /srv/mongodb/cfg
with
sudo mongod --configsvr --replSet rs0Config --port 27019 --dbpath /srv/mongodb/cfg
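makes the configuration work, provided the new name is also used as the _id in the config server's rs.initiate() call and in the mongos --configdb string (rs0Config/127.0.0.1:27019).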
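A quick way to catch this kind of clash before starting anything is to compare the names up front. The following is only a sketch in plain JavaScript; `findReplSetNameClashes` is a hypothetical helper, and the names passed in are the ones from this question.

```javascript
// Hypothetical pre-flight check: the config server replica set must not
// share its name with any shard replica set.
function findReplSetNameClashes(configReplSet, shardReplSets) {
  return shardReplSets.filter((name) => name === configReplSet);
}

// Original setup: both the config server and the shard used "rs0".
console.log(findReplSetNameClashes("rs0", ["rs0"]));

// Fixed setup: the config server replica set is renamed to "rs0Config".
console.log(findReplSetNameClashes("rs0Config", ["rs0"]));
```

An empty result means the config server name is not reused by any shard.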

Error while creating Replica set - MongoDb

I am trying to create a replica set but am unable to proceed.
Script to create 3 mongod instances:
sudo mkdir -p /data/rs1 /data/rs2 /data/rs3
sudo mongod --replSet rs1 --logpath "1.log" --dbpath /data/rs1 --port 27017 --fork
sudo mongod --replSet rs2 --logpath "2.log" --dbpath /data/rs2 --port 27018 --fork
sudo mongod --replSet rs3 --logpath "3.log" --dbpath /data/rs3 --port 27019 --fork
This executes successfully, but after this I try to give rs1 information about rs2 and rs3 via the script below:
init_replica.js:
config = {
_id:"rs1",members:[
{_id:0,host:"grit-lenevo-pc:27017",priority:0,slaveDelay:5},
{_id:1,host:"grit-lenevo-pc:27018"},
{_id:2,host:"grit-lenevo-pc:27019"}]
}
rs.initiate(config)
rs.status()
Now when I try to run:
mongo --port 27018 < init_replica.js
I get:
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:27018/test
{
"_id" : "rs1",
"members" : [
{
"_id" : 0,
"host" : "grit-lenevo-pc:27017",
"priority" : 0,
"slaveDelay" : 5
},
{
"_id" : 1,
"host" : "grit-lenevo-pc:27018"
},
{
"_id" : 2,
"host" : "grit-lenevo-pc:27019"
}
]
}
{
"ok" : 0,
"errmsg" : "Attempting to initiate a replica set with name rs1, but command line reports rs2; rejecting",
"code" : 93
}
{
"info" : "run rs.initiate(...) if not yet done for the set",
"ok" : 0,
"errmsg" : "no replset config has been received",
"code" : 94
}
bye
Note: the same script works fine if I run it against port 27017 instead:
mongo --port 27017 < init_replica.js
I am following the tutorial M101 MongoDB for Java Developers.
The answer is right there in the error message:
"Attempting to initiate a replica set with name rs1, but command line reports rs2; rejecting"
You should start all members with the same replica set name as the one in the config (rs1). For the second member:
sudo mongod --replSet rs1 ...
and not
sudo mongod --replSet rs2 ...
The same principle applies to the third member.
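The rule can be sketched as a small check in plain JavaScript. `mismatchedMembers` is a hypothetical helper; the hosts and names are the ones from the question, and nothing is sent to a server.

```javascript
// Hypothetical check: every mongod must be started with the same --replSet
// name that is used as _id in the config passed to rs.initiate().
function mismatchedMembers(configId, startedWith) {
  // startedWith maps "host:port" to the --replSet value of that process
  return Object.entries(startedWith)
    .filter(([, name]) => name !== configId)
    .map(([host, name]) => host + " reports " + name);
}

const started = {
  "grit-lenevo-pc:27017": "rs1",
  "grit-lenevo-pc:27018": "rs2",
  "grit-lenevo-pc:27019": "rs3",
};

// Lists the members whose command line does not match the config _id "rs1".
console.log(mismatchedMembers("rs1", started));
```

With the original startup script, the members on ports 27018 and 27019 are flagged, which matches the error seen when the script is run against port 27018.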
I had a similar name mismatch issue, but it was a bit more subtle.
In my mongo.conf I used "rs0" (quoted) for the RS name, and then ran rs.initiate({_id : "rs0"...}), which failed with
"Attempting to initiate a replica set with name rs0, but command line reports \"rs0\"; rejecting"
It took a while to notice the extra quotes - don't use them in the RS name in mongo.conf.

Replica Set Error Code 76

With reference to the MongoDB DBA course: I am trying to create a replica set as shown by the instructor, on OS X El Capitan (single machine only), and I get the following error. I have three members:
(MongoDB was installed using Homebrew)
Step I: Setting up the config
cfg ={ _id :"abc", members:[{_id:0, host:"localhost:27001"}, {_id:1, host:"localhost:27002"}, {_id:2, host:"localhost:27003"}] }
{
"_id" : "abc",
"members" : [
{
"_id" : 0,
"host" : "localhost:27001"
},
{
"_id" : 1,
"host" : "localhost:27002"
},
{
"_id" : 2,
"host" : "localhost:27003"
}
]
}
Step II: Initialize the config
rs.reconfig(cfg)
2015-10-05T11:34:27.082-0400 E QUERY Error: Could not retrieve replica set config: { "ok" : 0, "errmsg" : "not running with --replSet", "code" : 76 }
at Function.rs.conf (src/mongo/shell/utils.js:1017:11)
at Function.rs.reconfig (src/mongo/shell/utils.js:969:22)
at (shell):1:4 at src/mongo/shell/utils.js:1017
Make sure you have replSetName configured in /etc/mongod.conf:
replication:
replSetName: "somename"
Then restart your mongod, either by stopping and starting it:
sudo service mongod stop
sudo service mongod start
or in a single step:
sudo service mongod restart
Your mongod is not running as a replica set member because no replica set name was given. The solution is to set a replica set name, either in the mongod config file (replication.replSetName) or on the command line using the --replSet parameter,
e.g. --replSet=test_replica
If you change the config file, restart the server afterwards.
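For the three-member, single-machine setup in the question, the command lines could look roughly like the ones this sketch prints. It is plain JavaScript that only builds strings; `mongodCommands` is a hypothetical helper and the data directory `/data/abc` is an assumption, since the question does not say where the data lives.

```javascript
// Hypothetical helper: builds one mongod command line per port, all with the
// same --replSet name. The dbpath root is an assumed placeholder.
function mongodCommands(replSetName, ports, dataRoot) {
  return ports.map((port, i) =>
    [
      "mongod",
      "--replSet", replSetName,
      "--port", String(port),
      "--dbpath", dataRoot + "/" + (i + 1),
    ].join(" ")
  );
}

console.log(mongodCommands("abc", [27001, 27002, 27003], "/data/abc").join("\n"));
```

Every process gets the same name, "abc", which is also the _id in cfg. For a set that has never been configured, the shell hint quoted in the previous question ("run rs.initiate(...) if not yet done for the set") suggests rs.initiate(cfg) is the call to use at that point.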

Mongodb Sharding not working - what is causing Collection not sharded

I am trying to set up MongoDB sharding with two nodes. I have started 3 config server processes and a router process. I am extracting data (50 columns, 650 MB, with _id as the key) from SQL Server and loading it into MongoDB. In the Pentaho configuration I have enabled "Use all replica sets" and entered the primary node's host name and port. When I run the transformation, all the data goes to the primary node and the other node gets none. When I run db.table.getShardDistribution(), I get the message "Collection not sharded".
Also, sh.isBalancerRunning() returns false, so I am fairly sure that the background balancer process is not working here.
Meanwhile, when I inserted 1,000,000 sample test records with name as the key, the sharding setup worked fine and the data was distributed across both shards.
So I am missing something or doing something wrong when I run the Pentaho transformation to populate data in MongoDB. Any help is appreciated.
My setup:
C:\Mongodb\bin\mongod.exe --shardsvr --port 10001 --dbpath C:\Mongodb\shard1 > C:\Mongodb\Log\shard1.log
C:\Mongodb\bin\mongod.exe --shardsvr --port 10002 --dbpath C:\Mongodb\shard2 > C:\Mongodb\Log\shard2.log
C:\Mongodb\bin\mongod.exe --configsvr --port 20000 --dbpath C:\Mongodb\configdb > C:\Mongodb\Log\config.log
C:\Mongodb\bin\mongos.exe --configdb 10.231.34.105: --chunkSize 1 > C:\Mongodb\Log\mongos.log
mongos> use admin
switched to db admin
mongos> db.runCommand( { addshard : "10.231.34.105:40001" } );
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> db.runCommand( { addshard : "10.231.34.106:40002" } );
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> db.runCommand( { enablesharding : "dbTest" } );
{ "ok" : 1 }
mongos> db.runCommand( { shardcollection : "dbTest.cTest", key : { Date_ID: 1 } } );
{ "collectionssharded" : "dbTest.cTest", "ok" : 1 }
mongos> use dbTest;
db.cTest.ensureIndex({ Date_D : 1 });
I am not sure from your post whether you sharded both the database and the collection.
Once you set up the shards, did you enable sharding for the database, like below?
db.runCommand({
    enablesharding : "dbname"
});
Run db.stats() to confirm.
After that, enable sharding for the collection, like below. Note that shardcollection takes the full namespace (database name plus collection name):
db.runCommand({
    shardcollection : "dbname.collection_name",
    key : {
        shardKey : "hashed"
    }
});
Then run db.collection_name.stats() to confirm.
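As an illustration of the command document for a hashed key, here is a small sketch in plain JavaScript. `shardCollectionCommand` is a hypothetical helper, the names are taken from the question, and whether a hashed key on Date_ID suits this data is an assumption, not a recommendation.

```javascript
// Hypothetical helper: builds a shardcollection command document with the
// full "db.collection" namespace and either a hashed or a ranged key.
function shardCollectionCommand(dbName, collName, field, hashed) {
  const key = {};
  key[field] = hashed ? "hashed" : 1;
  return { shardcollection: dbName + "." + collName, key: key };
}

console.log(JSON.stringify(shardCollectionCommand("dbTest", "cTest", "Date_ID", true)));
// prints {"shardcollection":"dbTest.cTest","key":{"Date_ID":"hashed"}}
```

The resulting document is what would be passed to db.runCommand() on the admin database through mongos.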

Sharding in MongoDB

I am trying to test sharding in MongoDB. In this example I use host1.com and host2.com instead of the real server names.
So I created a config server on host1.com:
mongod --dbpath /path/to/configdb/ --configsvr
Started mongos on the same machine:
mongos --configdb host1.com --port 27020
And started mongod on both machines (host1.com and host2.com):
mongod --dbpath /path/to/test_shard_db/ --shardsvr
I added the shards, then enabled sharding for database test and collection test with shard key {'name': 1} (for this test the collection has only this field and _id), as explained in the tutorial. But after all these operations, all my data is written to only one shard, the one that is primary for the database.
Here is config:
Sharding status:
mongos> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "host1.com:27018", "maxSize" : NumberLong(1) }
{ "_id" : "shard0001", "host" : "host2.com:27018", "maxSize" : NumberLong(10) }
databases:
...
{ "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
test.test chunks:
shard0001 1
{ "name" : { $minKey : 1 } } -->> { "name" : { $maxKey : 1 } } on : shard0001 Timestamp(1000, 0)
Collection stats:
mongos> db.printCollectionStats()
test
{
"sharded" : false,
"primary" : "shard0000",
"size" : 203535788,
...
}
Balancer status:
mongos> sh.isBalancerRunning()
true
So why does all the data in the collection reside on only one shard, even though I added more than 1 megabyte of data? And why does db.printCollectionStats() show "sharded" : false for test? What did I do wrong?
The default chunk size is 64MB so you have room to grow before a split will occur. You can split the shard key range yourself beforehand which can allow writes to go to multiple shards from the start. See the MongoDB Split Chunks documentation for more info.
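A pre-split along those lines can be sketched as follows. This is plain JavaScript that only builds `split` admin command documents; `presplitCommands` is a hypothetical helper, and the split points "g", "n" and "t" are made-up values for the name key, to be replaced with points that suit the real data.

```javascript
// Hypothetical helper: builds one `split` admin command per split point.
// Each command asks for a chunk boundary at the given shard key value.
function presplitCommands(ns, field, splitPoints) {
  return splitPoints.map((point) => {
    const middle = {};
    middle[field] = point;
    return { split: ns, middle: middle };
  });
}

const cmds = presplitCommands("test.test", "name", ["g", "n", "t"]);
cmds.forEach((c) => console.log(JSON.stringify(c)));
```

Each document would be run through mongos with db.adminCommand(...) before loading data, so that the key range is already divided into several chunks the balancer can spread across shards.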
On the difference between chunk size and maxSize:
maxSize limits the volume of data on a given shard. Once it is reached, the balancer will look to place chunks on a shard where maxSize has not been reached. A chunk is a collection of documents that all fall within a section of the shard key range. The MongoDB balancer moves data between shards at the chunk level to keep them balanced. When a chunk approaches the chunk size (not maxSize), it is split in two, which may result in a move.