MongoDB sharding not working - what is causing "Collection not sharded"?

I am trying to set up MongoDB sharding with two nodes. I have started three config server processes and a router process. I am extracting data (50 columns, 650 MB, _id as the key) from SQL Server and loading it into MongoDB. In the Pentaho configuration I have enabled "Use all Replica sets" and entered the primary node's host name and port. When I run the transformation, all the data ends up on the primary node and the other node gets no data. When I run db.table.getShardDistribution(), I get the message "Collection not sharded".
Also, sh.isBalancerRunning() returns false, so I am quite sure the background balancer process is not running here.
Meanwhile, I tried inserting 1,000,000 sample test records with Name as the shard key, and the sharding setup worked fine: the data was distributed across both shards.
So I am missing something or doing something wrong when running the Pentaho transformation to populate data in MongoDB. Any help is appreciated.
My setup:
C:\Mongodb\bin\mongod.exe --shardsvr --port 10001 --dbpath C:\Mongodb\shard1 > C:\Mongodb\Log\shard1.log
C:\Mongodb\bin\mongod.exe --shardsvr --port 10002 --dbpath C:\Mongodb\shard2 > C:\Mongodb\Log\shard2.log
C:\Mongodb\bin\mongod.exe --configsvr --port 20000 --dbpath C:\Mongodb\configdb > C:\Mongodb\Log\config.log
C:\Mongodb\bin\mongos.exe --configdb 10.231.34.105:20000 --chunkSize 1 > C:\Mongodb\Log\mongos.log
mongos> use admin
switched to db admin
mongos> db.runCommand( { addshard : "10.231.34.105:40001" } );
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> db.runCommand( { addshard : "10.231.34.106:40002" } );
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> db.runCommand( { enablesharding : "dbTest" } );
{ "ok" : 1 }
mongos> db.runCommand( { shardcollection : "dbTest.cTest", key : { Date_ID: 1 } } );
{ "collectionssharded" : "dbTest.cTest", "ok" : 1 }
mongos> use dbTest;
db.cTest.ensureIndex({ Date_D : 1 });

I am not sure from your post whether you sharded both the database and the collection.
Once you set up the shards, did you enable sharding for the database, like below?
db.runCommand({
    enablesharding : "dbname"
});
Run db.stats() to confirm.
After that, enable sharding for the collection, like below:
db.runCommand({
    shardcollection : "dbname.collection_name",
    key : {
        shardKey : "hashed"
    }
});
Then confirm with db.collection_name.stats().
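Putting the two steps together for the names used in the question (dbTest.cTest with Date_ID as the shard key), a minimal sketch in the mongos shell might look like this; getShardDistribution() is the same check the question already uses, and it should stop reporting "Collection not sharded" once the collection is actually sharded:
// run against mongos, not against an individual shard
sh.enableSharding("dbTest")
db.getSiblingDB("dbTest").cTest.ensureIndex({ Date_ID : 1 })   // required if the collection already holds data
sh.shardCollection("dbTest.cTest", { Date_ID : 1 })
sh.status()                                                    // chunks should appear on both shards over time
db.getSiblingDB("dbTest").cTest.getShardDistribution()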

Related

MongoDB error when sharding a collection: ns not found

I've set up a simple server configuration for the purpose of testing sharding functionality and I get the error above.
My configuration is pretty simple: one config server, one shard server and one mongos (on 127.0.0.1:27019, 127.0.0.1:27018 and 127.0.0.1:27017 respectively).
Everything seems to work well until I try to shard a collection; the command gives me the following:
sh.shardCollection("test.test", { "test" : 1 } )
{
    "ok" : 0,
    "errmsg" : "ns not found",
    "code" : 26,
    "codeName" : "NamespaceNotFound",
    "operationTime" : Timestamp(1590244259, 5),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1590244259, 5),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
The config server and shard server outputs show no errors:
2020-05-23T10:39:46.629-0400 I SHARDING [conn11] about to log metadata event into changelog: { _id: "florent-Nitro-AN515-53:27018-2020-05-23T10:39:46.629-0400-5ec935b2bec982e313743b1a", server: "florent-Nitro-AN515-53:27018", shard: "rs0", clientAddr: "127.0.0.1:58242", time: new Date(1590244786629), what: "shardCollection.start", ns: "test.test", details: { shardKey: { test: 1.0 }, collection: "test.test", uuid: UUID("152add6f-e56b-40c4-954c-378920eceede"), empty: false, fromMapReduce: false, primary: "rs0:rs0/127.0.0.1:27018", numChunks: 1 } }
2020-05-23T10:39:46.620-0400 I SHARDING [conn25] distributed lock 'test' acquired for 'shardCollection', ts : 5ec935b235505bcc59eb60c5
2020-05-23T10:39:46.622-0400 I SHARDING [conn25] distributed lock 'test.test' acquired for 'shardCollection', ts : 5ec935b235505bcc59eb60c7
2020-05-23T10:39:46.637-0400 I SHARDING [conn25] distributed lock with ts: 5ec935b235505bcc59eb60c7' unlocked.
2020-05-23T10:39:46.640-0400 I SHARDING [conn25] distributed lock with ts: 5ec935b235505bcc59eb60c5' unlocked.
Of course the collection exists on primary shard:
rs0:PRIMARY> db.test.stats()
{
    "ns" : "test.test",
    "size" : 216,
    "count" : 6,
    "avgObjSize" : 36,
    "storageSize" : 36864,
    "capped" : false,
    ...
}
I have no idea what could be wrong here, I'd much appreciate any help :)
EDIT:
Here are the details of the steps I follow to run the servers; I probably misunderstand something:
Config server:
sudo mongod --configsvr --replSet rs0 --port 27019 --dbpath /srv/mongodb/cfg
mongo --port 27019
Then in mongo shell
rs.initiate(
    {
        _id: "rs0",
        configsvr: true,
        members: [
            { _id : 0, host : "127.0.0.1:27019" }
        ]
    }
)
Sharded server:
sudo mongod --shardsvr --replSet rs0 --dbpath /srv/mongodb/shrd1/ --port 27018
mongo --port 27018
Then in shell:
rs.initiate(
    {
        _id: "rs0",
        members: [
            { _id : 0, host : "127.0.0.1:27018" }
        ]
    }
)
db.test.createIndex({test:1})
Router:
sudo mongos --configdb rs0/127.0.0.1:27019
mongo
Then in shell:
sh.addShard('127.0.0.1:27018')
sh.enableSharding('test')
sh.shardCollection('test.test', {test:1})
That error sometimes happens when some routers have an out-of-date idea of which databases/collections exist in the sharded cluster.
Try running flushRouterConfig (https://docs.mongodb.com/manual/reference/command/flushRouterConfig/) on each mongos (i.e. connect to each mongos sequentially by itself and run the command on it).
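For reference, that looks like this once connected to a mongos; it flushes all cached routing metadata for that router:
// connect to each mongos in turn and run:
db.adminCommand({ flushRouterConfig: 1 })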
I just misunderstood one basic concept: config servers and shard servers are distinct and independent mongodb instances, so each must be part of a distinct replica set.
So replacing
sudo mongod --configsvr --replSet rs0 --port 27019 --dbpath /srv/mongodb/cfg
with
sudo mongod --configsvr --replSet rs0Config --port 27019 --dbpath /srv/mongodb/cfg
makes the configuration work.
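Put together, the corrected startup would look roughly like this (a sketch based on the commands above: the shard keeps the rs0 name, the config server gets its own rs0Config name, and mongos must then reference the config replica set by that new name):
# config server in its own replica set
sudo mongod --configsvr --replSet rs0Config --port 27019 --dbpath /srv/mongodb/cfg
# shard server in a separate replica set
sudo mongod --shardsvr --replSet rs0 --port 27018 --dbpath /srv/mongodb/shrd1/
# router pointing at the config replica set
sudo mongos --configdb rs0Config/127.0.0.1:27019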

How to replace a node in a sharded replica set?

I've got a sharded mongodb setup with two replica sets:
mongos> db.runCommand( { listShards : 1 } )
{
    "shards" : [
        {
            "_id" : "rs01",
            "host" : "rs01/10.133.250.140:27017,10.133.250.154:27017"
        },
        {
            "_id" : "rs02",
            "host" : "rs02/10.133.242.7:27017,10.133.242.8:27017"
        }
    ],
    "ok" : 1
}
Node 10.133.250.140 just went down, and I replaced it with another one (the IP address changed). Replica set reconfiguration was pretty easy, just rs.remove() and rs.add().
Now I have to update the host config for shard rs01. What is the proper way to do it?
You may occasionally need to modify the host string for a shard. The simplest way to change it is to run an update operation.
Connect to mongos and do this:
> use config
> db.shards.update({ "_id" : "rs01"},{$set : { "host" : "rs01/newip:27017,anothernewip:27017"} })
You might need to restart all mongos instances.
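After the update, re-running listShards from mongos (the same command used in the question) is a quick way to confirm the new host string took effect:
> use config
> db.shards.find({ "_id" : "rs01" })
> db.adminCommand({ listShards : 1 })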
Hope this helps :-)
Well, removing the problem shard and adding it again seems to be the only option.

Can't find some data on Mongos while it exists on the shard

I have 3 mongos instances and 6 mongod instances behind them. There are 2 auto-sharded shards, and each one is a 3-member replica set.
Today I found that I can't find some data through my system, but I can find it in RockMongo. I tried to find it via mongos but nothing was returned, yet the result of count() told me the data is still there.
mongos> db.video.find({ _id: ObjectId('51a0e7625c8e87cc6a000027') })
mongos> db.video.count({ _id: ObjectId('51a0e7625c8e87cc6a000027') })
1
mongos> db.runCommand({ count: "video", query: { _id: ObjectId('51a0e7625c8e87cc6a000027') } })
{ "shards" : { "s1" : 0, "s2" : 1 }, "n" : 1, "ok" : 1 }
I connected to shard2 and found the record, but many fields were missing. Meanwhile, the record shown in RockMongo had all fields.
shard2:PRIMARY> db.video.find({ _id: ObjectId('51a0e7625c8e87cc6a000027') })
{ "_id" : ObjectId("51a0e7625c8e87cc6a000027"), "comment" : 78, "like" : 142, "scores" : { "total" : 37042292210.73388, "popular" : 72980.66026813157, "total_play" : 8737, "week_play" : 71 }, "views" : 8739 }
Then I found that the document count shown in RockMongo was 240,000+, but the result returned by running db.xx.count() on mongos was 230,000+. Some data is missing through mongos!
I have tried dumping the collection and restoring it to another server, and everything is OK there. There must be something wrong between mongos and mongod; what should I do now?
In the end, I found that the primary of shard2 had lost some data, while the other replica set members of shard2 kept the full data. I shut down the primary and created a new replica set. Everything is fine now.
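As a side note, the mismatch described above can be checked directly by comparing counts on the members of shard2 (a sketch using the video collection from the question; rs.slaveOk() is needed on older shells to read from a secondary):
// on shard2's primary:
db.video.count()
// on a shard2 secondary:
rs.slaveOk()
db.video.count()    // a higher count here points at missing data on the primary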

Sharding in MongoDB

I am trying to test sharding in MongoDB. In the examples below I use host1.com and host2.com instead of the real server names.
So I created config server at host1.com:
mongod --dbpath /path/to/configdb/ --configsvr
Started mongos at the same machine:
mongos --configdb host1.com --port 27020
And started mongod at two machines (host1.com and host2.com):
mongod --dbpath /path/to/test_shard_db/ --shardsvr
I added the shards, enabled sharding for the database test and the collection test with shard key {'name': 1} (the collection has only this field and _id, for testing) as explained in the tutorial. But after all these operations, all my data is written to only one shard, which is the primary for the database.
Here is the config:
Sharding status:
mongos> db.printShardingStatus()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
    { "_id" : "shard0000", "host" : "host1.com:27018", "maxSize" : NumberLong(1) }
    { "_id" : "shard0001", "host" : "host2.com:27018", "maxSize" : NumberLong(10) }
  databases:
    ...
    { "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
      test.test chunks:
        shard0001  1
        { "name" : { $minKey : 1 } } -->> { "name" : { $maxKey : 1 } } on : shard0001 Timestamp(1000, 0)
Collection stats:
mongos> db.printCollectionStats()
test
{
    "sharded" : false,
    "primary" : "shard0000",
    "size" : 203535788,
    ...
}
Balancer status:
mongos> sh.isBalancerRunning()
true
So why does all the data in the collection reside on only one shard even though I added more than 1 megabyte of data? And why does db.printCollectionStats() show "sharded" : false for test? What did I do wrong?
The default chunk size is 64MB so you have room to grow before a split will occur. You can split the shard key range yourself beforehand which can allow writes to go to multiple shards from the start. See the MongoDB Split Chunks documentation for more info.
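As a rough illustration of pre-splitting, assuming the test.test collection and { name: 1 } shard key from the question, chunk ranges could be split manually from mongos before loading data (the split points below are arbitrary examples):
// create a few chunks up front so writes can be spread across shards
sh.splitAt("test.test", { name : "H" })
sh.splitAt("test.test", { name : "P" })
sh.status()    // shows the resulting chunk ranges and which shard owns each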
On the difference between chunk size and maxSize:
maxSize limits the volume of data on a given shard. When it is reached, the balancer will look to move chunks to a shard where maxSize has not been reached. A chunk is a collection of documents that all fall within a section of the shard key range. The MongoDB balancer moves data between shards at the chunk level to keep them balanced. When a chunk approaches the chunk size limit, it is split in two, which may result in a move.
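For context, the maxSize values visible in the question's sharding status are set when a shard is added, and the chunk size can be lowered for testing via the config database (a sketch; the 1 MB values are only for experimentation):
// maxSize (in MB) is an option of the addshard command
db.runCommand({ addshard : "host1.com:27018", maxSize : 1 })
// the global chunk size (in MB) lives in config.settings
use config
db.settings.save({ _id : "chunksize", value : 1 })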

mongoDB sharding example

Newbie using mongo 2.0.1 32-bit on Windows, trying out shards as follows:
(4) processes: 2 shards + config server + mongos with a tiny chunk size
mongod.exe --shardsvr --port 10001 --dbpath <folder1> > shard1.log
mongod.exe --shardsvr --port 10002 --dbpath <folder2> > shard2.log
mongod.exe --configsvr --port 20000 --dbpath <configfolder> > config.log
mongos.exe --configdb localhost:20000 --chunkSize 1 > mongos.log
I ran the shell and set up 2 shards:
mongos> use admin
switched to db admin
mongos> db.runCommand( { addshard : "localhost:10001" } );
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> db.runCommand( { addshard : "localhost:10002" } );
{ "shardAdded" : "shard0001", "ok" : 1 }
Then I enabled sharding for a test database (dbTest) and collection (cTest):
mongos> db.runCommand( { enablesharding : "dbTest" } );
{ "ok" : 1 }
mongos> db.runCommand( { shardcollection : "dbTest.cTest", key : { Name : 1 } } );
{ "collectionssharded" : "dbTest.cTest", "ok" : 1 }
Finally I populated the cTest collection (indexed by Name) with 1,000,005 sample records:
mongos> use dbTest
switched to db dbTest
db.cTest.drop();
db.cTest.ensureIndex({ Name : 1 });
db.cTest.save({Name: "Frank", Age:56, Job: "Accountant", State: "NY"});
db.cTest.save({Name: "Bill" , Age:23, State: "CA"});
db.cTest.save({Name: "Janet", Age:34, Job: "Dancer" });
db.cTest.save({Name: "Andy", Age:44 });
db.cTest.save({Name: "Zach", Age:23, Job: "Fireman", State: "CA"});
i = 1;
while (i <= 1000) {
    j = 1;
    while (j <= 1000) {
        db.cTest.save({ Name : "Person(" + i + "," + j + ")", Age : i + j });
        j = j + 1;
    }
    i = i + 1;
}
HOWEVER ...
It appears that nothing actually got sharded. In the config database, db.chunks.count() is zero, and I can see from the Windows Explorer file sizes that all the data went into the physical files set up for the first shard, and none into the second.
Can anyone spot what I've done wrong, and also provide some tips on how to administer and debug this type of thing and see what's going on?
Thanks
Once you "shardcollection", don't drop it. It will remove metadata about sharded collection.