MongoDB shard balancer stuck in state 0

I've enabled sharding on one collection with 2 shards.
MongoDB version is 2.6.4.
Everything looks OK, but 100% of the data is on one shard.
When I do:
use config
db.locks.find( { _id : "balancer" } ).pretty()
I get:
{
    "_id" : "balancer",
    "state" : 0,
    "who" : "ip-10-0-11-128:27018:1424099612:1804289383:Balancer:846930886",
    "ts" : ObjectId("553a1223e4d292075ec2a8a6"),
    "process" : "ip-10-0-11-128:27018:1424099612:1804289383",
    "when" : ISODate("2015-04-24T09:51:31.498Z"),
    "why" : "doing balance round"
}
So the balancer is stuck in state 0. I have tried restarting it, but it is still in state 0.
Also:
sh.isBalancerRunning()
> false
But:
sh.getBalancerState()
> true
Errors in my log file:
2015-04-24T10:15:47.921+0000 [Balancer] scoped connection to 10.0.11.128:20000,10.0.11.159:20000,10.0.11.240:20000 not being returned to the pool
2015-04-24T10:15:47.921+0000 [Balancer] caught exception while doing balance: error checking clock skew of cluster 10.0.11.128:20000,10.0.11.159:20000,10.0.11.240:20000 :: caused by :: 13650 clock skew of the cluster 10.0.11.128:20000,10.0.11.159:20000,10.0.11.240:20000 is too far out of bounds to allow distributed locking.
Does anyone know how to fix this?
Thanks,
Ivan

From a quick look at the code, it looks like we can tolerate about 30 seconds of skew between servers in the cluster. On Linux, we recommend people use NTP to keep skew to a minimum (NTP usually keeps it to within a second or two). NTP is usually already installed on most Linux distros.
MongoDB user group
Use the following command to do a one-time NTP sync:
ntpdate pool.ntp.org
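To see how far apart the clocks actually are, a minimal mongo shell sketch like the one below prints each config server's local time so you can compare them before and after syncing. The addresses are the config servers from the question; adjust them to your environment.
// Minimal sketch: print each config server's clock to eyeball the skew.
var hosts = ["10.0.11.128:20000", "10.0.11.159:20000", "10.0.11.240:20000"];
hosts.forEach(function (h) {
    var conn = new Mongo(h);
    print(h + " -> " + conn.getDB("admin").serverStatus().localTime);
});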

Weird operation in db.currentOp() output in MongoDB

Just after starting my MongoDB server (standalone instance, version 4.2.2), if I run db.currentOp() I see this operation:
{
    "type" : "op",
    "host" : "menzo:27017",
    "desc" : "waitForMajority",
    "active" : true,
    "currentOpTime" : "2020-05-06T16:16:33.077+0200",
    "opid" : 2,
    "op" : "none",
    "ns" : "",
    "command" : { },
    "numYields" : 0,
    "waitingForLatch" : {
        "timestamp" : ISODate("2020-05-06T14:02:55.895Z"),
        "captureName" : "WaitForMajorityService::_mutex"
    },
    "locks" : { },
    "waitingForLock" : false,
    "lockStats" : { },
    "waitingForFlowControl" : false,
    "flowControlStats" : { }
}
It seems that this operation is always there, no matter how much time passes. In addition, it is a weird operation in some respects:
It has a very low opid number (2)
Its op is "none"
It doesn't have the usual secs_running or microsecs_running fields
It mentions "majority" in some literals, but I'm running a standalone instance, not a replica set
I guess it is some kind of internal operation (maybe a kind of "waiting thread"?), but I haven't found anything about it in the currentOp command documentation.
Does anybody know about this operation and/or could point me to documentation where it is described? Thanks in advance!
Wait for majority service is defined here. Looking at the history of this file, it appears to have been added as part of work on improving startup performance.
Reading the ticket description, it seems that during startup, multiple operations may need to wait for a majority commit. Previously each may have created a separate thread for waiting; with the majority wait service, there is only one thread which is waiting for the most recent required commit.
Therefore:
Its op is "none"
The thread isn't performing an operation as defined in the docs.
It has a very low opid number (2)
This is because this thread is started when the server starts.
It mentions "majority" in some literals, but I'm running a standalone instance, not a replica set
It is possible to instruct mongod to run in replica set mode and point it at a data directory created by a standalone node, and vice versa. In these cases you'd probably expect the process to preserve the information already in the database, such as transactions that are pending (or need to be aborted). Hence the startup procedure may perform operations not intuitively falling under the operation mode requested.
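If you want to confirm that this entry is just the single waiter thread, a minimal shell sketch like the following picks it out of the full currentOp output (the "desc" value is taken from the output quoted in the question):
// Minimal sketch: list only the waitForMajority waiter among all current operations.
var ops = db.currentOp(true).inprog;
ops.filter(function (op) { return op.desc === "waitForMajority"; })
   .forEach(function (op) { printjson(op); });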

Replica configuration - MongoDB

On mongo 3.2.17 I have the following output when running rs.initiate(). I need "ok" equal to 1. I don't know how to modify the configuration. Any suggestion?
{
    "info2" : "no configuration specified. Using a default configuration for the set",
    "me" : "vpsxxxxxx:27017",
    "info" : "try querying local.system.replset to see current configuration",
    "ok" : 0,
    "errmsg" : "already initialized",
    "code" : 23
}
You are getting this error because you have already initialized replication on your machine. This would work on a fresh instance. In your case, try using reconfig instead of initiate:
rs.reconfig(config, {force: true})
You can use the force option when reconfiguring a replica set. Make sure you have at least 3 nodes: 2 full nodes and 1 arbiter (minimum supported configuration) or 3 full nodes (minimum recommended configuration) so that a primary node can be elected.
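A minimal sketch of that flow, assuming you are connected to the already-initialized member and only want to adjust the existing configuration rather than start over:
// Minimal sketch: reuse the existing configuration instead of calling rs.initiate() again.
cfg = rs.conf()                      // the configuration that was "already initialized"
printjson(cfg.members)               // inspect the current members array
// ...adjust cfg here if needed, e.g. cfg.members[0].host = "vpsxxxxxx:27017"
rs.reconfig(cfg, { force: true })    // force applies the change even without a reachable primary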

How to know the existence of a replica set in a sharded environment from a Java client

I want to set
mongoClient.setWriteConcern(WriteConcern.REPLICAS_SAFE);
only if a replica set is present.
But in a sharded environment, when I do:
mongoClient.getReplicaSetStatus();
it returns null even though I have a replica set.
I am passing the mongos IP to the MongoClient.
Most MongoDB drivers, in particular the Java driver which you are using, will throw an exception if you try to set the REPLICA_ACKNOWLEDGED write concern when it's not possible to get an acknowledgement from two or more nodes.
From the docs:
WriteConcern.REPLICA_ACKNOWLEDGED Tries to write to two separate nodes. [...] will
throw an exception if two writes are not possible.
See the following for more details:
http://docs.mongodb.org/manual/reference/write-concern/
http://docs.mongodb.org/ecosystem/drivers/java-replica-set-semantics/
In my testing with the mongo shell, if you provide the REPLICA_ACKNOWLEDGED (formerly called REPLICA_SAFE) concern to the 'getlasterror' command, you will get an error when you are not communicating with a replica set. When talking to a mongos process, the error will be:
{
    "singleShard" : "localhost:30001",
    "n" : 0,
    "connectionId" : 3,
    "wnote" : "no replication has been enabled, so w=2.0 won't work",
    "err" : "norepl",
    "ok" : 1
}
It is not the case that the client will hang forever without wtimeout being specified; that would only happen if there is a replica set but two nodes are not available for writes indefinitely.
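For completeness, a minimal shell sketch of bounding that wait with wtimeout (the collection name is just illustrative); the command then returns an error after the timeout instead of blocking:
// Minimal sketch: cap the replication wait at 5 seconds with wtimeout.
db.coll.insert({})
db.runCommand({ getlasterror: 1, w: 2, wtimeout: 5000 })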
Note that using "majority" as the w value for the write concern will work correctly through mongos - note the difference in the getlasterror responses:
mongos> db.coll.insert({}); db.runCommand({getlasterror:1,w:"majority"})
{
    "singleShard" : "localhost:30001",
    "n" : 0,
    "connectionId" : 3,
    "err" : null,
    "ok" : 1
}
First verify that your replica set has a PRIMARY using the mongo shell command rs.status()
Then if that worked, verify that you are connecting to the database correctly:
MongoClient mongoClient = new MongoClient( "hostname" , 27017 );
If both of those are true, then there should be no reason mongoClient.getReplicaSetStatus() should return null. It should return a ReplicaSetStatus object.
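As a quick way to see what kind of process you are actually connected to, a minimal mongo shell sketch using the isMaster command (run it against the mongos or a replica set member):
// Minimal sketch: a mongos reports msg "isdbgrid", a replica set member reports setName.
var hello = db.isMaster();
if (hello.msg === "isdbgrid") {
    print("connected to mongos - the driver cannot see a replica set here");
} else if (hello.setName) {
    print("connected to replica set: " + hello.setName);
} else {
    print("connected to a standalone mongod");
}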

MongoRestore taking more time and cpu than expected

I am restoring a database which I dumped a few minutes back in order to make some changes. Mongorestore is taking around 100% CPU and much more time than expected. I thought it might be due to the indexes I created, but the problem is the same even when restoring a single collection. The collection is about 314MB in size and has about 185000 documents.
Usually this does not happen. It might be due to low disk space on my system, but I still have 11GB free.
Can anyone help me figure out what the problem could be?
Note: I'm doing everything from the mongo client. No driver involved.
As you indicated in the comments that this was related to logging, I would suggest taking a couple of steps:
First, use log rotation. This can be done via a command to the database or by sending the SIGUSR1 signal to the process, so it is very easy to script or enable as a cron job on a regular basis. More information here:
http://www.mongodb.org/display/DOCS/Logging#Logging-Rotatingthelogfiles
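For reference, a minimal shell sketch of the database-command route to log rotation (sending SIGUSR1 to the process achieves the same thing from the OS side):
// Minimal sketch: rotate the server log from the mongo shell.
db.adminCommand({ logRotate: 1 })   // the current log file is renamed with a timestamp suffix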
Second, verify your logging level. Starting with -v = log level 1; -vv = log level 2, etc. You can adjust it both at startup and during runtime. For the runtime adjustment, you use the setParameter command:
// connect to the database or mongos
use admin;
// check the log level
db.runCommand({getParameter : 1, logLevel: 1})
{ "logLevel" : 0, "ok" : 1 }
// set it higher
db.runCommand({setParameter : 1, logLevel: 1})
{ "was" : 0, "ok" : 1 }
// back to the default level
db.runCommand({setParameter : 1, logLevel: 0})
{ "was" : 1, "ok" : 1 }
Finally, you can also run with --quiet to cut down on some of the messaging.

MongoDB sharding IP Changes

Recently we went live with MongoDB sharding and it is working fine on the production server. But we configured the public IP addresses instead of the internal ones, so we now have to change the sharding setup over to the internal IPs.
Please clarify whether this is possible. If it is, please share your input.
Public IP example:
conf = {_id : "data1",members : [{_id : 0, host : "10.17.18.01:10001", votes : 2},{_id : 1, host : "10.17.19.02:10002", votes : 1},{_id:2, host: "10.17.19.03:10003", votes : 3, arbiterOnly: true}]}
Internal IP example:
conf = {_id : "data1",members : [{_id : 0, host : "20.17.18.01:10001", votes : 2},{_id : 1, host : "20.17.19.02:10002", votes : 1},{_id:2, host: "20.17.19.03:10003", votes : 3, arbiterOnly: true}]}
Will this work? Please suggest.
Regards,
Kumaran
You said you're trying to update the IPs in the sharding system, but the config documents you provided as an example look like a replica set configuration. If it's actually your replica set configuration you want to update, you should just be able to remove the entry for the old IP address from the replica set configuration, then add the node back in with the new IP. See http://www.mongodb.org/display/DOCS/Replica+Set+Configuration and http://www.mongodb.org/display/DOCS/Reconfiguring+when+Members+are+Up for more details.
If it's actually the sharding configuration you want to update, it will be a bit more complicated.
Throwing in an answer, even though this is a dated question, for anyone else who might view this.
I would recommend using host names / hosts-file entries on your servers to handle local and external IPs. However, to update the hosts in your case, you would have to change the replica set config.
Log into the primary of the replica set, then do the following:
> cfg = rs.conf()
> cfg.members[0].host = "[new host1]:[port]"
> cfg.members[1].host = "[new host2]:[port]"
> cfg.members[2].host = "[new host3]:[port]"
cfg.members is a zero-indexed array; repeat that pattern for however many replicas you have.
> rs.reconfig( cfg )
From there, you would want to re-add your shards with the newly specified hosts.
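A minimal sketch of re-adding a shard through mongos, assuming the replica set name "data1" and the internal hosts from the question (substitute your own set name and hosts):
// Minimal sketch: add the shard back under its new hosts (run against mongos).
sh.addShard("data1/20.17.18.01:10001,20.17.19.02:10002")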
From inside mongos, simply use the following command to update the IPs of the shard servers (the shards collection lives in the config database):
db.shards.update({_id: <<"shard name">>} , {$set: {"host" : "newIP:27018"}})
Example:
db.shards.update({_id: "shard000"} , {$set: {"host" : "172.31.1.1:27018"}})
172.31.1.1 is the private IP address of your shard server on the private network.
Avoid using a dynamic IP address.
If you want to make any modification to the shard configuration, then you should:
use config
db.shards.update( { _id : } , { $set : { ... } } )
Please make sure that you restart your config server and mongos after making this change.
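To confirm the change took effect, a minimal check from mongos is to list the shards and inspect each host string (listShards is a standard admin command):
// Minimal sketch: verify the updated shard host strings, run from mongos.
db.adminCommand({ listShards: 1 })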