mongodb 2.4.1 failing with 'process out of memory' - mongodb

I have the following mongo versions
db version v2.4.1
MongoDB shell version: 2.4.1,
and
db version v2.2.1-rc1, pdfile version 4.5,
MongoDB shell version: 2.2.1-rc1
installed on 64-bit windows 7 machine.
I have a collection having 10001000 (10 million+) records, when I use V 2.4.1 to aggregate, it fails with the following
error:
Fatal error in CALL_AND_RETRY_2
Allocation failed - process out of memory
However when I use V 2.2.1-rc1, to aggregate the same collection, it works fine and gives result in around 1 minute.
Sample document of the collection that is being aggregated:
{
"_id" : ObjectId("516bdd1c39b10c722792e007"),
"f1" : 10000010,
"f2" : 10000000,
"key" : 0
}
Aggregation Command:
{$group: {"_id": "$key", total: {$sum: "$f1"}}}
Command used to populate records:
for(var i = 10011000; i < 10041000; ++i)
{
db.testp.insert({"f1": i+10, "f2": i, "key": i%1000})
}

How much memory do you have? Could it be the $group is taking up more than 10% of available memory and causing the error? See the aggregation documentation on memory for cumulative operators.
edit 1:
Out of interest - does the aggregation work outside the shell? eg calling it from a driver.
I have seen similar v8 errors and as the shell was updated to v8 in 2.4 Theres a chance it could be that.
edit 2:
If the resulting array is too big in the shell then that can also trigger the error: see SERVER-8859. To work around you might need to run multiple aggregations, either by doing a $match early on to limit the working set or even a $skip and $limit to paginate through the result set.
I tried your aggregation with 10,070,999 docs on 2.4.1 on a mac and didn't get the error

Related

Getting error for query in mongo 3.6.22 which works fine in 4.4.3 enterprise

I'm getting an error for mongo query in mongo version 3.6.22 error as follows
MongoError: arguments to $lookup must be strings, let: { profileIds:
"$ProfileIDs" } is type object
But this same query works fine in mongo version 4.4.3 enterprise, and the ProfileIDs are String
3.6 does not support let in $lookup. You need to write your query differently.

Aggregation not working with monger for mongodb version 3.6

Mongo aggregation framework has some changes in version 3.6 Earlier aggregation queries with monger are not working even when we pass :cursor {} as an option. Is there any workaround or do we have to wait for the next monger release?. The error we get is specified below
MongoCommandException Command failed with error 9: 'The 'cursor' option is required, except for aggregate with the explain argument' on server localhost:27017. The full response is { "ok" : 0.0, "errmsg" : "The 'cursor' option is required, except for aggregate with the explain argument", "code" : 9, "codeName" : "FailedToParse" } com.mongodb.connection.ProtocolHelper.getCommandFailureException (ProtocolHelper.java:115)
by OSt advice, I could run monger aggregation sample with codes below.
(mc/aggregate db coll
[{"$project" {:subtotal {"$multiply" ["$quantity", "$price"]}
:_id "$state"}}]
:cursor {:batch-size 0})
thanks!
According to mongo db spec, cursor became a required field in some cases. So you should provide it through monger API. It is not a problem in monger, it is a breakable change in mongo db API.

Unrecognized pipeline stage name: '$sample'

when I run this aggregation pipeline in Robomongo
db.getCollection('xyz').aggregate([{$match: {tyu: "asd", ghj: "qwe"}},
{$sample: {size: 5}}])
I receive this error:
assert: command failed: {
"errmsg" : "exception: Unrecognized pipeline stage name: '$sample'",
"code" : 16436,
"ok" : 0
I'm using mongodb ver 3.2.6 and since $sample is supported from 3.2 onward.
(https://docs.mongodb.com/manual/reference/operator/aggregation/sample/#pipe._S_sample)
Im a little confused as to why I receive this error message.
Maybe I'm just missing something small.
Thanks
As stated in the comments of the question. Mongo client had a version of 3.2.6 but Mongo db had a version of 3.0.6.
I used version() in the shell to get the client's version and
db.version() to get the DB's version.
ver 3.0.6 is too low to support $sample as stated in the mongo documentation
https://docs.mongodb.com/manual/reference/operator/aggregation/sample/#pipe._S_sample

MongoDB logging all queries

The question is as basic as it is simple... How do you log all queries in a "tail"able log file in mongodb?
I have tried:
setting the profiling level
setting the slow ms parameter starting
mongod with the -vv option
The /var/log/mongodb/mongodb.log keeps showing just the current number of active connections...
You can log all queries:
$ mongo
MongoDB shell version: 2.4.9
connecting to: test
> use myDb
switched to db myDb
> db.getProfilingLevel()
0
> db.setProfilingLevel(2)
{ "was" : 0, "slowms" : 1, "ok" : 1 }
> db.getProfilingLevel()
2
> db.system.profile.find().pretty()
Source: http://docs.mongodb.org/manual/reference/method/db.setProfilingLevel/
db.setProfilingLevel(2) means "log all operations".
I ended up solving this by starting mongod like this (hammered and ugly, yeah... but works for development environment):
mongod --profile=1 --slowms=1 &
This enables profiling and sets the threshold for "slow queries" as 1ms, causing all queries to be logged as "slow queries" to the file:
/var/log/mongodb/mongodb.log
Now I get continuous log outputs using the command:
tail -f /var/log/mongodb/mongodb.log
An example log:
Mon Mar 4 15:02:55 [conn1] query dendro.quads query: { graph: "u:http://example.org/people" } ntoreturn:0 ntoskip:0 nscanned:6 keyUpdates:0 locks(micros) r:73163 nreturned:6 reslen:9884 88ms
Because its google first answer ...
For version 3
$ mongo
MongoDB shell version: 3.0.2
connecting to: test
> use myDb
switched to db
> db.setLogLevel(1)
http://docs.mongodb.org/manual/reference/method/db.setLogLevel/
MongoDB has a sophisticated feature of profiling. The logging happens in system.profile collection. The logs can be seen from:
db.system.profile.find()
There are 3 logging levels (source):
Level 0 - the profiler is off, does not collect any data. mongod always writes operations longer than the slowOpThresholdMs threshold to its log. This is the default profiler level.
Level 1 - collects profiling data for slow operations only. By default slow operations are those slower than 100 milliseconds.
You can modify the threshold for “slow” operations with the slowOpThresholdMs runtime option or the setParameter command. See the Specify the Threshold for Slow Operations section for more information.
Level 2 - collects profiling data for all database operations.
To see what profiling level the database is running in, use
db.getProfilingLevel()
and to see the status
db.getProfilingStatus()
To change the profiling status, use the command
db.setProfilingLevel(level, milliseconds)
Where level refers to the profiling level and milliseconds is the ms of which duration the queries needs to be logged. To turn off the logging, use
db.setProfilingLevel(0)
The query to look in the system profile collection for all queries that took longer than one second, ordered by timestamp descending will be
db.system.profile.find( { millis : { $gt:1000 } } ).sort( { ts : -1 } )
I made a command line tool to activate the profiler activity and see the logs in a "tail"able way --> "mongotail":
$ mongotail MYDATABASE
2020-02-24 19:17:01.194 QUERY [Company] : {"_id": ObjectId("548b164144ae122dc430376b")}. 1 returned.
2020-02-24 19:17:01.195 QUERY [User] : {"_id": ObjectId("549048806b5d3db78cf6f654")}. 1 returned.
2020-02-24 19:17:01.196 UPDATE [Activation] : {"_id": "AB524"}, {"_id": "AB524", "code": "f2cbad0c"}. 1 updated.
2020-02-24 19:17:10.729 COUNT [User] : {"active": {"$exists": true}, "firstName": {"$regex": "mac"}}
...
But the more interesting feature (also like tail) is to see the changes in "real time" with the -f option, and occasionally filter the result with grep to find a particular operation.
See documentation and installation instructions in: https://github.com/mrsarm/mongotail
(also runnable from Docker, specially if you want to execute it from Windows https://hub.docker.com/r/mrsarm/mongotail)
if you want the queries to be logged to mongodb log file, you have to set both
the log level and the profiling, like for example:
db.setLogLevel(1)
db.setProfilingLevel(2)
(see https://docs.mongodb.com/manual/reference/method/db.setLogLevel)
Setting only the profiling would not have the queries logged to file, so you can only get it from
db.system.profile.find().pretty()
Once profiling level is set using db.setProfilingLevel(2).
The below command will print the last executed query.
You may change the limit(5) as well to see less/more queries.
$nin - will filter out profile and indexes queries
Also, use the query projection {'query':1} for only viewing query field
db.system.profile.find(
{
ns: {
$nin : ['meteor.system.profile','meteor.system.indexes']
}
}
).limit(5).sort( { ts : -1 } ).pretty()
Logs with only query projection
db.system.profile.find(
{
ns: {
$nin : ['meteor.system.profile','meteor.system.indexes']
}
},
{'query':1}
).limit(5).sort( { ts : -1 } ).pretty()
The profiler data is written to a collection in your DB, not to file. See http://docs.mongodb.org/manual/tutorial/manage-the-database-profiler/
I would recommend using 10gen's MMS service, and feed development profiler data there, where you can filter and sort it in the UI.
I think that while not elegant, the oplog could be partially used for this purpose: it logs all the writes - but not the reads...
You have to enable replicatoon, if I'm right. The information is from this answer from this question: How to listen for changes to a MongoDB collection?
Setting profilinglevel to 2 is another option to log all queries.
db.setProfilingLevel(2,-1)
This worked! it logged all query info in mongod log file
I recommend checking out mongosniff. This can tool can do everything you want and more. Especially it can help diagnose issues with larger scale mongo systems and how queries are being routed and where they are coming from since it works by listening to your network interface for all mongo related communications.
http://docs.mongodb.org/v2.2/reference/mongosniff/
I wrote a script that will print out the system.profile log in real time as queries come in. You need to enable logging first as stated in other answers. I needed this because I'm using Windows Subsystem for Linux, for which tail still doesn't work.
https://github.com/dtruel/mongo-live-logger
db.adminCommand( { getLog: "*" } )
Then
db.adminCommand( { getLog : "global" } )
This was asked a long time ago but this may still help someone:
MongoDB profiler logs all the queries in the capped collection system.profile. See this: database profiler
Start mongod instance with --profile=2 option that enables logging all queries
OR if mongod instances is already running, from mongoshell, run db.setProfilingLevel(2) after selecting database. (it can be verified by db.getProfilingLevel(), which should return 2)
After this, I have created a script which utilises mongodb's tailable cursor to tail this system.profile collection and write the entries in a file.
To view the logs I just need to tail it:tail -f ../logs/mongologs.txt.
This script can be started in background and it will log all the operation on the db in the file.
My code for tailable cursor for the system.profile collection is in nodejs; it logs all the operations along with queries happening in every collection of MyDb:
const MongoClient = require('mongodb').MongoClient;
const assert = require('assert');
const fs = require('fs');
const file = '../logs/mongologs'
// Connection URL
const url = 'mongodb://localhost:27017';
// Database Name
const dbName = 'MyDb';
//Mongodb connection
MongoClient.connect(url, function (err, client) {
assert.equal(null, err);
const db = client.db(dbName);
listen(db, {})
});
function listen(db, conditions) {
var filter = { ns: { $ne: 'MyDb.system.profile' } }; //filter for query
//e.g. if we need to log only insert queries, use {op:'insert'}
//e.g. if we need to log operation on only 'MyCollection' collection, use {ns: 'MyDb.MyCollection'}
//we can give a lot of filters, print and check the 'document' variable below
// set MongoDB cursor options
var cursorOptions = {
tailable: true,
awaitdata: true,
numberOfRetries: -1
};
// create stream and listen
var stream = db.collection('system.profile').find(filter, cursorOptions).stream();
// call the callback
stream.on('data', function (document) {
//this will run on every operation/query done on our database
//print 'document' to check the keys based on which we can filter
//delete data which we dont need in our log file
delete document.execStats;
delete document.keysExamined;
//-----
//-----
//append the log generated in our log file which can be tailed from command line
fs.appendFile(file, JSON.stringify(document) + '\n', function (err) {
if (err) (console.log('err'))
})
});
}
For tailable cursor in python using pymongo, refer the following code which filters for MyCollection and only insert operation:
import pymongo
import time
client = pymongo.MongoClient()
oplog = client.MyDb.system.profile
first = oplog.find().sort('$natural', pymongo.ASCENDING).limit(-1).next()
ts = first['ts']
while True:
cursor = oplog.find({'ts': {'$gt': ts}, 'ns': 'MyDb.MyCollection', 'op': 'insert'},
cursor_type=pymongo.CursorType.TAILABLE_AWAIT)
while cursor.alive:
for doc in cursor:
ts = doc['ts']
print(doc)
print('\n')
time.sleep(1)
Note: Tailable cursor only works with capped collections. It cannot be used to log operations on a collection directly, instead use filter: 'ns': 'MyDb.MyCollection'
Note: I understand that the above nodejs and python code may not be of much help for some. I have just provided the codes for reference.
Use this link to find documentation for tailable cursor in your languarge/driver choice Mongodb Drivers
Another feature that i have added after this logrotate.
Try out this package to tail all the queries (without oplog operations): https://www.npmjs.com/package/mongo-tail-queries
(Disclaimer: I wrote this package exactly for this need)

Updating mongodb references errors out due to field names

I have a MongoDB collection and I'm trying to update all the entries in it to change the name of a field used to store a reference. The Query I'm using is
db.products.find().forEeach(function(p) {
p.newField = p.oldField;
db.products.save(p);
});
The problem is that p.oldField is a DBRef following the standard format of { "$ref": "collection", "$id": ObjectId("...")}. When I try to run the db.products.save(p); Mongo returns the following error:
Sat Oct 1 13:00:57 uncaught exception: field names cannot start with $ [$db]
I'm using version 1.8.2 of the MongoDB shell. I have seen this work on an older version of the shell (1.6.5), which is where I originally came up with this query. But I can't seem to make this work on newer versions.