How to replace a node in a sharded replica set? - mongodb

I have a sharded MongoDB setup with two replica sets:
mongos> db.runCommand( { listShards : 1 } )
{
"shards" : [
{
"_id" : "rs01",
"host" : "rs01/10.133.250.140:27017,10.133.250.154:27017"
},
{
"_id" : "rs02",
"host" : "rs02/10.133.242.7:27017,10.133.242.8:27017"
}
],
"ok" : 1
}
Node 10.133.250.140 just went down, and I replaced it with another one (the IP address changed). Reconfiguring the replica set was pretty easy: just rs.remove() and rs.add().
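For reference, a minimal sketch of that reconfiguration on the rs01 primary (the replacement node's address below is just a placeholder):
rs01:PRIMARY> rs.remove("10.133.250.140:27017")
rs01:PRIMARY> rs.add("10.133.250.155:27017")   // placeholder address of the replacement node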
Now I have to update the host config for shard rs01. What is the proper way to do it?

You may occasionally need to modify the host string for a shard. The simplest way to change it is to run an update operation on the config database.
Connect to mongos and do this:
> use config
> db.shards.update({ "_id" : "rs01"},{$set : { "host" : "rs01/newip:27017,anothernewip:27017"} })
You might need to restart all mongos instances afterwards so they pick up the new host string.
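Depending on the MongoDB version, flushing the router configuration may be an alternative to a full restart; a minimal sketch, run against each mongos:
> db.adminCommand({ flushRouterConfig: 1 })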
Hope this helps :-)

Well, removing the problem shard and adding it back again seems to be the only option.

Related

No common protocol found when adding a shard in a local network

I'm trying to build a small cluster using MongoDB sharding. I tried everything on localhost first and it works perfectly. But when I try it on my local network, where there are two nodes, node1 and node2, it does not work. On both nodes, mongod is started to serve as a shard. On node1, the config server and mongos are started. Everything listens on 0.0.0.0, each process with its own dedicated port.
I can connect to and work with both nodes. When I use the mongo shell to log in to mongos on node1, I can add node1's mongod as a shard, but when I try to add node2, an error occurs:
mongos> sh.addShard("<ip of node2 in local network>")
{ "ok" : 0, "errmsg" : "No common protocol found.", "code" : 126 }
I did some searching, but there is little documentation about this error.
mongo addShard "No common protocol found" errmsg 126 shows the same error, but it does not seem helpful.
A couple of things to check (a quick way to check both is sketched below):
a) Are you using the same version of mongod on all machines?
b) Are you using the same storage engine on all machines?
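A minimal check from a mongo shell connected to each mongod (serverStatus() reports storageEngine on MongoDB 3.0 and newer):
> db.version()
> db.serverStatus().storageEngine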
Our problem should have been obvious, but wasn't.
We simply forgot to configure ports here, so the :27000 was missing:
db.shards.updateOne({ "_id" : "shard0000" }, { $set : { "host" : "oururl.foo:27000" } })
db.shards.updateOne({ "_id" : "shard0001" }, { $set : { "host" : "oururl.foo:27000" } })
db.shards.updateOne({ "_id" : "shard0002" }, { $set : { "host" : "oururl.foo:27000" } })
db.shards.updateOne({ "_id" : "shard0003" }, { $set : { "host" : "oururl.foo:27000" } })

How to update _id in MongoDB Replica Set configuration?

I had 5 members in a replica set, and then I deleted 3 of them.
How can I change the "_id" of the remaining members to the values "0", "1", and "2"?
rs.conf()
{
"_id" : "rs0",
"version" : 151261,
"members" : [
{
"_id" : 3,
"host" : "mongodb3:27017"
},
{
"_id" : 4,
"host" : "mongodb4:27017"
},
{
"_id" : 5,
"host" : "ok:27017",
"arbiterOnly" : true
}
]
}
Directly editing the replica set configuration may not be an elegant way. Instead, use the rs.remove(hostname) command to remove a member from the replica set; this way you do not have to bring down the primary during reconfiguration, and the "_id" values are assigned automatically in ascending order.
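For illustration, using one of the hostnames from the config above (run on the primary):
rs0:PRIMARY> rs.remove("mongodb4:27017")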
Try dropping the slaves collection as described here: http://docs.mongodb.org/manual/tutorial/troubleshoot-replica-sets/#duplicate-key-error-on-local-slaves
The master will recreate the collection the next time it is required.
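A minimal sketch of that, run on the member that reports the duplicate key error:
> use local
> db.slaves.drop()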
You could try this in the Mongo console:
conf = rs.conf()
conf.members[0]._id = 0
conf.members[1]._id = 1
conf.members[2]._id = 2
rs.reconfig(conf)

mongo replica set member is fatal

I'm trying to configure a replica set of 3 members on 3 different Linux machines. I'm running mongod with replSet in the config file. I mistakenly ran rs.initiate() on 2 of the machines, and now I have found the primary and tried to add the other instances, but it says that the config is not from the same version.
How can I remove the replica set and start everything back from scratch?
If the following is true:
- This is a brand-new deployment.
- There is no data you need to keep.
You can do the following:
1. Shut down all 3 mongods.
2. Remove all files and directories from the "dbpath" partition on all 3 machines.
3. Restart all 3 mongods.
4. Connect to one of the mongods and submit the following command:
config = { "_id": "rs0", "members" : [
{ "_id" : 0, "host" : "##Your DNS NAME:PORTNUMBER#" },
{ "_id" : 1, "host" : "##Your DNS NAME:PORTNUMBER#" },
{ "_id" : 2, "host" : "##Your DNS NAME:PORTNUMBER#" } ]
}
rs.initiate(config)
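Once the set has initialized, a quick sanity check (nothing here is specific to this deployment):
rs.status()
rs.conf()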

exception: hosts cannot switch between localhost and hostname

I created a replica set.
I added localhost to the set in the beginning, but when I try to change the member to the actual hostname, I get the error "exception: hosts cannot switch between localhost and hostname".
I need to get rid of localhost:27017 because otherwise it doesn't let me add any other member by hostname (i.e. a non-localhost address).
my-rs0:PRIMARY> cfg=rs.conf();
{
"_id" : "my-rs0",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "localhost:27017"
}
]
}
my-rs0:PRIMARY> cfg.members[0].host="my-server04:27017"
my-rs0:PRIMARY> cfg
{
"_id" : "my-rs0",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "my-server04:27017"
}
]
}
using rs.reconfig(cfg);
my-rs0:PRIMARY> rs.reconfig(cfg);
{
"errmsg" : "exception: hosts cannot switch between localhost and hostname",
"code" : 13645,
"ok" : 0
}
No luck with rs.add("my-server04:27017") or rs.remove("localhost:27017") either.
my-rs0:PRIMARY> rs.add("my-server04:27017");
{
"errmsg" : "exception: can't use localhost in repl set member names except when using it for all members",
"code" : 13393,
"ok" : 0
}
I have tried all the reconfiguration methods mentioned here: Replica Set Reconfig steps.
But none of them fixes the issue above. I've already spent hours on this and I'm really frustrated.
I had the same problem and I fixed it without dropping any database. I just edited the host field of the member in the local.system.replset collection to match the local IP and then restarted mongod. Everything worked perfectly.
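A minimal sketch of that edit, assuming the replica set name from this question (my-rs0) and a placeholder IP; restart mongod afterwards:
> use local
> db.system.replset.update({ "_id" : "my-rs0" }, { $set : { "members.0.host" : "10.0.0.5:27017" } })   // placeholder IP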
It looks like you'll need to scrap your replica set and start over.
I believe that when you initiated your Replica Set, you explicitly passed it a config document that references your MongoDB instance using localhost.
As I was investigating this, I brought up a replica set. When I initiated it using rs.initiate() (without passing a config document), it used the hostname by default.
rs.initiate()
rs.conf()
{
"_id" : "stack1",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "MY-HOSTNAME:28001"
}
]
}
This post describes the need to completely clear out your database files to create a fresh replica set.
Once I did this, I initiated a new replica set by passing a configuration document:
cfg = {
"_id" : "stack1",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "localhost:28001"
}
]
}
rs.initiate(cfg)
rs.conf()
{
"_id" : "stack1",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "localhost:28001"
}
]
}
Long story short, you'll need to delete all of the files in your --dbpath directory and re-create the replica set, without explicitly specifying "localhost" as your hostname.
I did it according to the docs:
Restarted MongoDB on another port (e.g. 37017) to prevent user connections to it.
Then started a shell on it:
$ mongo --port 37017
Then updated the configuration:
use local
cfg = db.system.replset.findOne( { "_id": "my-rs0" } )
cfg.members[0].host = "my-server04:27017"
db.system.replset.update( { "_id": "my-rs0" } , cfg )
Then restarted MongoDB on the original port.

Can't find some data through mongos while it exists on the shard

I have 3 mongos instances and 6 mongod instances behind them. There are 2 shards (auto-sharding is enabled), and each shard is a replica set with 3 members.
Today I found that I can't find some data through my system, but I can see it in RockMongo. I tried to find it through mongos, but nothing was returned; however, the result of count() told me the data is still there.
mongos> db.video.find({ _id: ObjectId('51a0e7625c8e87cc6a000027') })
mongos> db.video.count({ _id: ObjectId('51a0e7625c8e87cc6a000027') })
1
mongos> db.runCommand({ count: "video", query: { _id: ObjectId('51a0e7625c8e87cc6a000027') } })
{ "shards" : { "s1" : 0, "s2" : 1 }, "n" : 1, "ok" : 1 }
I connected to shard2 and found the record, but many fields were missing. Meanwhile, the record shown in RockMongo had all the fields.
shard2:PRIMARY> db.video.find({ _id: ObjectId('51a0e7625c8e87cc6a000027') })
{ "_id" : ObjectId("51a0e7625c8e87cc6a000027"), "comment" : 78, "like" : 142, "scores" : { "total" : 37042292210.73388, "popular" : 72980.66026813157, "total_play" : 8737, "week_play" : 71 }, "views" : 8739 }
Then I found that the document count shown in RockMongo was 240k+, but the result returned by running db.xx.count() on mongos was only 230k+. Some data is missing when going through mongos!
I have tried dumping the collection and restoring it to another server, and everything is OK there. There must be something wrong between mongos and mongod. What should I do now?
In the end, I found that the primary of shard2 had lost some data, while the other members of shard2's replica set still had the full data. I shut down the primary and created a new replica set. Everything is fine now.
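For anyone hitting something similar, a hedged way to spot an inconsistent member is to run the same count directly on every member of the shard and compare the numbers (rs.slaveOk() is needed to read from secondaries in shells of that era; database and collection names are as above):
shard2:SECONDARY> rs.slaveOk()
shard2:SECONDARY> db.video.count()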