MongoRestore taking more time and cpu than expected - mongodb

I am just restoring a database which I dumped a few minutes back to make some changes. Mongorestore is taking around 100% CPU and much more time than expected. I thought it might be due to the indexes I created, but the problem is the same even when restoring a single collection. The collection is about 314MB in size and has about 185000 documents.
Usually this does not happen. It might be due to low disk space on my system, but I still have 11GB free.
Can anyone help me figure out what the problem could be?
Note: I'm doing everything from the mongo client; no driver is involved.

As you indicated in the comments that this was related to logging, I would suggest taking a couple of steps:
First, use log rotation. This can be done via a command to the database or by sending the SIGUSR1 signal to the process, so it is very easy to script or enable as a cron job on a regular basis. More information here:
http://www.mongodb.org/display/DOCS/Logging#Logging-Rotatingthelogfiles
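For reference, triggering a rotation from the mongo shell looks roughly like this (the logRotate command and the SIGUSR1 signal are the two mechanisms the linked page describes):
// run against the admin database; the server renames the current log file and starts a new one
use admin
db.runCommand({ logRotate: 1 })
// alternatively, from the OS shell: kill -SIGUSR1 <mongod pid>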
Second, verify your logging level. Starting the server with -v gives log level 1, -vv gives log level 2, and so on. You can adjust it both at startup and at runtime. For the runtime adjustment, use the setParameter command:
// connect to the database or mongos
use admin;
// check the log level
db.runCommand({getParameter : 1, logLevel: 1})
{ "logLevel" : 0, "ok" : 1 }
// set it higher
db.runCommand({setParameter : 1, logLevel: 1})
{ "was" : 0, "ok" : 1 }
// back to the default level
db.runCommand({setParameter : 1, logLevel: 0})
{ "was" : 1, "ok" : 1 }
Finally, you can also run with --quiet to cut down on some of the messaging.

Related

Weird operation in db.currentOp() output in MongoDB

Just after starting my MongoDB server (standalone instance, version 4.2.2) if I run db.currentOp() I see this operation:
{
    "type" : "op",
    "host" : "menzo:27017",
    "desc" : "waitForMajority",
    "active" : true,
    "currentOpTime" : "2020-05-06T16:16:33.077+0200",
    "opid" : 2,
    "op" : "none",
    "ns" : "",
    "command" : {
    },
    "numYields" : 0,
    "waitingForLatch" : {
        "timestamp" : ISODate("2020-05-06T14:02:55.895Z"),
        "captureName" : "WaitForMajorityService::_mutex"
    },
    "locks" : {
    },
    "waitingForLock" : false,
    "lockStats" : {
    },
    "waitingForFlowControl" : false,
    "flowControlStats" : {
    }
}
It seems that this operation is always there, no matter how much time passes. In addition, it is a weird operation in some respects:
It has a very low opid number (2)
Its op is "none"
It doesn't have the usual secs_running or microsecs_running fields
It mentions "majority" in some literals, but I'm running a standalone instance, not a replica set
I guess it is some kind of internal operation (maybe a kind of "waiting thread"?), but I haven't found documentation about it in the currentOp command documentation.
Does anybody know about this operation and/or could point me to documentation where it is described? Thanks in advance!
Wait for majority service is defined here. Looking at the history of this file, it appears to have been added as part of work on improving startup performance.
Reading the ticket description, it seems that during startup, multiple operations may need to wait for a majority commit. Previously each may have created a separate thread for waiting; with the majority wait service, there is only one thread which is waiting for the most recent required commit.
Therefore:
Its op is "none"
The thread isn't performing an operation as defined in the docs.
It has a very low opid number (2)
This is because this thread is started when the server starts.
It mentions "majority" in some literals, but I'm running a standalone instance, not a replica set
It is possible to instruct mongod to run in replica set mode and point it at a data directory created by a standalone node, and vice versa. In these cases you'd probably expect the process to preserve the information already in the database, such as transactions that are pending (or need to be aborted). Hence the startup procedure may perform operations not intuitively falling under the operation mode requested.
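If you want to confirm that this is just a single idle internal waiter on your own instance, a quick check from the mongo shell could look like the sketch below; the desc value is taken from the output above:
// currentOp(true) includes idle and system operations; keep only the majority waiter
db.currentOp(true).inprog.filter(function (op) {
    return op.desc === "waitForMajority";
});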

Mongodb shard balancer stuck in state 0

I've set up sharding on one collection with 2 shards.
MongoDB version is 2.6.4.
Everything looks OK, but 100% of the data is on one shard.
When I do:
use config
db.locks.find( { _id : "balancer" } ).pretty()
I get:
{
    "_id" : "balancer",
    "state" : 0,
    "who" : "ip-10-0-11-128:27018:1424099612:1804289383:Balancer:846930886",
    "ts" : ObjectId("553a1223e4d292075ec2a8a6"),
    "process" : "ip-10-0-11-128:27018:1424099612:1804289383",
    "when" : ISODate("2015-04-24T09:51:31.498Z"),
    "why" : "doing balance round"
}
So the balancer is stuck in state 0. I have tried to restart it, but it is still in state 0.
Also:
sh.isBalancerRunning()
> false
But:
sh.getBalancerState()
> true
Errors in my log file:
2015-04-24T10:15:47.921+0000 [Balancer] scoped connection to 10.0.11.128:20000,10.0.11.159:20000,10.0.11.240:20000 not being returned to the pool
2015-04-24T10:15:47.921+0000 [Balancer] caught exception while doing balance: error checking clock skew of cluster 10.0.11.128:20000,10.0.11.159:20000,10.0.11.240:20000 :: caused by :: 13650 clock skew of the cluster 10.0.11.128:20000,10.0.11.159:20000,10.0.11.240:20000 is too far out of bounds to allow distributed locking.
Anyone knows how to fix this?
Thanks,
Ivan
From a quick look at the code, it looks like we can tolerate about 30 seconds of skew between servers in the cluster. On Linux, we recommend people use ntp to keep skew to a minimum (ntp usually keeps it to within a second or two). ntp is usually already installed on most Linux distros.
MongoDB user group
Use the following command to do a one-time ntp sync:
ntpdate pool.ntp.org
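To eyeball the skew yourself before or after syncing, a rough sketch from the mongo shell is to ask each config server for its local time; the hostnames below are the ones from the error message and should be replaced with your own:
// print each config server's clock as reported by serverStatus
["10.0.11.128:20000", "10.0.11.159:20000", "10.0.11.240:20000"].forEach(function (host) {
    var conn = new Mongo(host);
    var localTime = conn.getDB("admin").runCommand({ serverStatus: 1 }).localTime;
    print(host + " -> " + localTime);
});
If the printed times differ by more than roughly 30 seconds, the balancer will keep failing with the clock-skew error until ntp pulls the clocks back together.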

How to know the existence of replica set in sharded environment from JAVA client

I want to set
mongoClient.setWriteConcern(WriteConcern.REPLICAS_SAFE);
only if a replica set is present.
But in a sharded environment, when I do:
mongoClient.getReplicaSetStatus();
It returns null even though I have a replica set.
I am passing the mongos IP to the mongo client.
Most MongoDB drivers, in particular the Java driver which you are using, will throw an exception if you try to set the REPLICA_ACKNOWLEDGED write concern when it's not possible to get an acknowledgement from two or more nodes.
From the docs:
WriteConcern.REPLICA_ACKNOWLEDGED Tries to write to two separate nodes. [...] will
throw an exception if two writes are not possible.
See the following for more details:
http://docs.mongodb.org/manual/reference/write-concern/
http://docs.mongodb.org/ecosystem/drivers/java-replica-set-semantics/
In my testing with the mongo shell, if you provide the REPLICA_ACKNOWLEDGED (formerly called REPLICA_SAFE) concern to the 'getlasterror' command, you will get an error when you are not communicating with a replica set. When talking to a mongos process, the error will be:
{
    "singleShard" : "localhost:30001",
    "n" : 0,
    "connectionId" : 3,
    "wnote" : "no replication has been enabled, so w=2.0 won't work",
    "err" : "norepl",
    "ok" : 1
}
It is not the case that the client will hang forever without wtimeout being specified, that would only be the case if there is a replica set but two nodes are not available for writes indefinitely.
Note that using "majority" as the w value for the write concern will work correctly through mongos - note the difference in the getlasterror responses:
mongos> db.coll.insert({}); db.runCommand({getlasterror:1,w:"majority"})
{
    "singleShard" : "localhost:30001",
    "n" : 0,
    "connectionId" : 3,
    "err" : null,
    "ok" : 1
}
First verify that your replica set has a PRIMARY using the mongo shell command rs.status()
Then if that worked, verify that you are connecting to the database correctly:
MongoClient mongoClient = new MongoClient( "hostname" , 27017 );
If both of those are true, then there should be no reason for mongoClient.getReplicaSetStatus() to return null; it should return a ReplicaSetStatus object.
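As a side check, the mongo shell can show you what kind of process the driver is actually talking to; the isMaster response from a mongos identifies it as a router rather than a replica set member, which is why getReplicaSetStatus() has nothing to report. The output below is abbreviated and illustrative:
// against a mongos router
db.runCommand({ isMaster: 1 })
// { ..., "msg" : "isdbgrid", "ok" : 1 }   <- mongos; no replica set status is exposed
// against a replica set member you would instead see something like
// { ..., "setName" : "rs0", "hosts" : [ ... ], "ismaster" : true, "ok" : 1 }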

MongoDB logging all queries

The question is as basic as it is simple... How do you log all queries in a "tail"able log file in mongodb?
I have tried:
setting the profiling level
setting the slow ms parameter
starting mongod with the -vv option
The /var/log/mongodb/mongodb.log keeps showing just the current number of active connections...
You can log all queries:
$ mongo
MongoDB shell version: 2.4.9
connecting to: test
> use myDb
switched to db myDb
> db.getProfilingLevel()
0
> db.setProfilingLevel(2)
{ "was" : 0, "slowms" : 1, "ok" : 1 }
> db.getProfilingLevel()
2
> db.system.profile.find().pretty()
Source: http://docs.mongodb.org/manual/reference/method/db.setProfilingLevel/
db.setProfilingLevel(2) means "log all operations".
I ended up solving this by starting mongod like this (hacky and ugly, yeah... but it works for a development environment):
mongod --profile=1 --slowms=1 &
This enables profiling and sets the threshold for "slow queries" as 1ms, causing all queries to be logged as "slow queries" to the file:
/var/log/mongodb/mongodb.log
Now I get continuous log outputs using the command:
tail -f /var/log/mongodb/mongodb.log
An example log:
Mon Mar 4 15:02:55 [conn1] query dendro.quads query: { graph: "u:http://example.org/people" } ntoreturn:0 ntoskip:0 nscanned:6 keyUpdates:0 locks(micros) r:73163 nreturned:6 reslen:9884 88ms
Because it's the first Google answer ...
For version 3
$ mongo
MongoDB shell version: 3.0.2
connecting to: test
> use myDb
switched to db
> db.setLogLevel(1)
http://docs.mongodb.org/manual/reference/method/db.setLogLevel/
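db.setLogLevel also takes an optional component argument, so you can raise verbosity for just the query-related messages instead of everything; a small sketch (verbosity values range from 0 to 5):
// raise verbosity for the query component only
db.setLogLevel(1, "query")
// inspect the current per-component verbosity settings
db.getLogComponents()
// reset the component to inherit from the global verbosity
db.setLogLevel(-1, "query")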
MongoDB has a sophisticated profiling feature. The logging happens in the system.profile collection. The logs can be seen with:
db.system.profile.find()
There are 3 logging levels (source):
Level 0 - the profiler is off, does not collect any data. mongod always writes operations longer than the slowOpThresholdMs threshold to its log. This is the default profiler level.
Level 1 - collects profiling data for slow operations only. By default slow operations are those slower than 100 milliseconds.
You can modify the threshold for “slow” operations with the slowOpThresholdMs runtime option or the setParameter command. See the Specify the Threshold for Slow Operations section for more information.
Level 2 - collects profiling data for all database operations.
To see what profiling level the database is running in, use
db.getProfilingLevel()
and to see the status
db.getProfilingStatus()
To change the profiling status, use the command
db.setProfilingLevel(level, milliseconds)
Where level refers to the profiling level and milliseconds is the threshold in ms above which operations are logged. To turn off the logging, use
db.setProfilingLevel(0)
The query to look in the system.profile collection for all queries that took longer than one second, ordered by timestamp descending, is
db.system.profile.find( { millis : { $gt:1000 } } ).sort( { ts : -1 } )
I made a command line tool to activate the profiler and see the logs in a "tail"able way --> "mongotail":
$ mongotail MYDATABASE
2020-02-24 19:17:01.194 QUERY [Company] : {"_id": ObjectId("548b164144ae122dc430376b")}. 1 returned.
2020-02-24 19:17:01.195 QUERY [User] : {"_id": ObjectId("549048806b5d3db78cf6f654")}. 1 returned.
2020-02-24 19:17:01.196 UPDATE [Activation] : {"_id": "AB524"}, {"_id": "AB524", "code": "f2cbad0c"}. 1 updated.
2020-02-24 19:17:10.729 COUNT [User] : {"active": {"$exists": true}, "firstName": {"$regex": "mac"}}
...
But the more interesting feature (also like tail) is to see the changes in "real time" with the -f option, and occasionally filter the result with grep to find a particular operation.
See documentation and installation instructions in: https://github.com/mrsarm/mongotail
(also runnable from Docker, especially if you want to execute it from Windows https://hub.docker.com/r/mrsarm/mongotail)
If you want the queries to be logged to the MongoDB log file, you have to set both the log level and the profiling, for example:
db.setLogLevel(1)
db.setProfilingLevel(2)
(see https://docs.mongodb.com/manual/reference/method/db.setLogLevel)
Setting only the profiling would not have the queries logged to the file, so you can only get them from
db.system.profile.find().pretty()
Once the profiling level is set using db.setProfilingLevel(2), the command below will print the most recently executed queries.
You may change the limit(5) to see fewer/more queries.
$nin - filters out profile and index queries
Also, use the query projection {'query':1} to view only the query field
db.system.profile.find(
    {
        ns: {
            $nin: ['meteor.system.profile', 'meteor.system.indexes']
        }
    }
).limit(5).sort({ ts: -1 }).pretty()
Logs with only query projection
db.system.profile.find(
    {
        ns: {
            $nin: ['meteor.system.profile', 'meteor.system.indexes']
        }
    },
    { 'query': 1 }
).limit(5).sort({ ts: -1 }).pretty()
The profiler data is written to a collection in your DB, not to file. See http://docs.mongodb.org/manual/tutorial/manage-the-database-profiler/
I would recommend using 10gen's MMS service, and feeding development profiler data there, where you can filter and sort it in the UI.
I think that, while not elegant, the oplog could be partially used for this purpose: it logs all the writes - but not the reads...
You have to enable replication, if I'm right. The information is from this answer to this question: How to listen for changes to a MongoDB collection?
Setting the profiling level to 2 is another option to log all queries.
db.setProfilingLevel(2,-1)
This worked! It logged all query info in the mongod log file.
I recommend checking out mongosniff. This tool can do everything you want and more. In particular, it can help diagnose issues with larger-scale mongo systems, showing how queries are being routed and where they are coming from, since it works by listening to your network interface for all mongo-related communications.
http://docs.mongodb.org/v2.2/reference/mongosniff/
I wrote a script that will print out the system.profile log in real time as queries come in. You need to enable logging first as stated in other answers. I needed this because I'm using Windows Subsystem for Linux, for which tail still doesn't work.
https://github.com/dtruel/mongo-live-logger
db.adminCommand( { getLog: "*" } )
Then
db.adminCommand( { getLog : "global" } )
This was asked a long time ago but this may still help someone:
MongoDB profiler logs all the queries in the capped collection system.profile. See this: database profiler
Start the mongod instance with the --profile=2 option, which enables logging all queries,
OR if the mongod instance is already running, from the mongo shell, run db.setProfilingLevel(2) after selecting the database. (It can be verified with db.getProfilingLevel(), which should return 2.)
After this, I created a script which uses MongoDB's tailable cursor to tail this system.profile collection and write the entries to a file.
To view the logs I just need to tail it: tail -f ../logs/mongologs.txt.
This script can be started in the background and it will log all the operations on the db to the file.
My code for the tailable cursor on the system.profile collection is in Node.js; it logs all the operations, along with the queries, happening in every collection of MyDb:
const MongoClient = require('mongodb').MongoClient;
const assert = require('assert');
const fs = require('fs');

const file = '../logs/mongologs';
// Connection URL
const url = 'mongodb://localhost:27017';
// Database Name
const dbName = 'MyDb';

// MongoDB connection
MongoClient.connect(url, function (err, client) {
    assert.equal(null, err);
    const db = client.db(dbName);
    listen(db, {});
});

function listen(db, conditions) {
    var filter = { ns: { $ne: 'MyDb.system.profile' } }; // filter for the query
    // e.g. if we need to log only insert queries, use {op: 'insert'}
    // e.g. if we need to log operations on only the 'MyCollection' collection, use {ns: 'MyDb.MyCollection'}
    // we can give a lot of filters; print and check the 'document' variable below

    // set MongoDB cursor options
    var cursorOptions = {
        tailable: true,
        awaitdata: true,
        numberOfRetries: -1
    };

    // create the stream and listen
    var stream = db.collection('system.profile').find(filter, cursorOptions).stream();

    // call the callback
    stream.on('data', function (document) {
        // this will run on every operation/query done on our database
        // print 'document' to check the keys on which we can filter
        // delete data which we don't need in our log file
        delete document.execStats;
        delete document.keysExamined;
        // -----
        // -----
        // append the log generated to our log file, which can be tailed from the command line
        fs.appendFile(file, JSON.stringify(document) + '\n', function (err) {
            if (err) console.log(err);
        });
    });
}
For a tailable cursor in Python using PyMongo, refer to the following code, which filters on MyCollection and only insert operations:
import pymongo
import time

client = pymongo.MongoClient()
oplog = client.MyDb.system.profile
first = oplog.find().sort('$natural', pymongo.ASCENDING).limit(-1).next()
ts = first['ts']

while True:
    cursor = oplog.find({'ts': {'$gt': ts}, 'ns': 'MyDb.MyCollection', 'op': 'insert'},
                        cursor_type=pymongo.CursorType.TAILABLE_AWAIT)
    while cursor.alive:
        for doc in cursor:
            ts = doc['ts']
            print(doc)
            print('\n')
        time.sleep(1)
Note: A tailable cursor only works with capped collections, so it cannot be used to watch an arbitrary collection directly; instead, filter system.profile by namespace, e.g. 'ns': 'MyDb.MyCollection' (see the sizing sketch after these notes).
Note: I understand that the above Node.js and Python code may not be of much help to some. I have just provided the code for reference.
Use this link to find documentation for tailable cursors in your language/driver of choice: Mongodb Drivers
Another feature that I have added after this is log rotation.
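One more practical point: system.profile is itself a small capped collection (1 MB by default), so a busy server can overwrite entries faster than a tailing script reads them. A sketch of the documented resize procedure, with an arbitrary 4 MB size, run against the database being profiled:
// profiling must be off before the profile collection can be recreated
db.setProfilingLevel(0)
db.system.profile.drop()
db.createCollection("system.profile", { capped: true, size: 4 * 1024 * 1024 })
db.setProfilingLevel(2)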
Try out this package to tail all the queries (without oplog operations): https://www.npmjs.com/package/mongo-tail-queries
(Disclaimer: I wrote this package exactly for this need)

Need list of config servers MongoDB

I need to grab (within the C# driver for MongoDB) a list of all the config servers connected to my instance of mongos. Or, failing that, I would settle for a way to grab ALL the servers and then go through them one by one, telling which are config servers and which are something else. I was thinking of the getShardMap command, but I still have no idea how to look at a server (programmatically) and decide whether it's a config server or not.
Thanks.
mongos> db.runCommand("getShardMap")
{
    "map" : {
        "node2:27021" : "node2:27021",
        "node3:27021" : "node3:27021",
        "node4:27021" : "node4:27021",
        "config" : "node2:27019,node3:27019,node4:27019",
        "shard0000" : "node2:27021",
        "shard0001" : "node3:27021",
        "shard0002" : "node4:27021"
    },
    "ok" : 1
}
The getShardMap command gives the config string that is passed to the mongos server. You can parse the string to get the list of config servers.
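For example, from the mongo shell the parsing amounts to splitting the "config" entry on commas; the same idea applies with whatever command helper the C# driver provides:
// "config" holds the configdb string handed to mongos
var shardMap = db.adminCommand("getShardMap").map;
shardMap["config"].split(",")
// [ "node2:27019", "node3:27019", "node4:27019" ]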
The only way I can think of to get this info is to run the getCmdLineOpts command on a mongos and look at the --configdb argument it was passed. I'm not sure how you run admin commands in the C# driver, but I'd imagine it's something like:
db.RunCommand("getCmdLineOpts");
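For comparison, the shell version of that check is shown below; note that the exact layout of the parsed options varies between server versions (older releases expose configdb near the top of parsed, newer ones under parsed.sharding.configDB):
// returns both the raw argv and the parsed option tree of the mongos
db.adminCommand({ getCmdLineOpts: 1 }).parsed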