Need more disk space for Mongo - mongodb

I am getting the following exception from MongoDB:
13,08 07:06:06:356 [|||] [ajp-bio-9207-exec-3] ERROR errors.GrailsExceptionResolver - CommandFailureException occurred when processing request: [POST] /MerchantBridge_stable/saks/importdata
{ "serverUsed" : "10.157.192.16:27017" , "ok" : 0.0 , "errmsg" : "Can't take a write lock while out of disk space" , "code" : 14031}. Stacktrace follows:
com.mongodb.CommandFailureException: { "serverUsed" : "10.157.192.16:27017" , "ok" : 0.0 , "errmsg" : "Can't take a write lock while out of disk space" , "code" : 14031}
at com.mongodb.CommandResult.getException(CommandResult.java:76)
at com.mongodb.CommandResult.throwOnError(CommandResult.java:131)
at com.mongodb.DBCollection.drop(DBCollection.java:1015)
I am guessing I need to set something in the mongod.conf file.
Any tips?

Following this guide by Josh Barber, you simply need to increase the machine's disk space. Try running df on the command line to view the machine's current disk space utilization; you're probably close to 100%.
Another option is to nuke it from orbit: if you're using a VM, you could always scrap it and rebuild. That means starting from scratch, but it might be easier than trying to repartition your VM's hard drive.
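If you'd rather check free space from code instead of df, here is a minimal Python sketch using only the standard library (the data directory path below is an assumption; use whatever dbPath your mongod.conf points at):

import shutil

db_path = "/var/lib/mongodb"  # assumed data directory; check dbPath in mongod.conf

total, used, free = shutil.disk_usage(db_path)
print("Used: %.1f%%, free: %.1f GB" % (used / total * 100, free / 1e9))

If usage is near 100%, mongod will refuse writes (as in the error above) until space is freed or the disk is grown.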

Related

Rename MongoDB 4.0.4 Database Name

I am getting the error below:
db.copyDatabase("old_db","new_db","localhost:27017");
WARNING: db.copyDatabase is deprecated. See http://dochub.mongodb.org/core/copydb-clone-deprecation
{
"note" : "Support for the copydb command has been deprecated. See http://dochub.mongodb.org/core/copydb-clone-deprecation",
"ok" : 1
}
https://docs.mongodb.com/manual/release-notes/4.0-compatibility/#copydb-and-clone-commands
I went to this link, but it doesn't offer a solution for this.
Any leads would be appreciated.
Use mongodump and mongorestore or write a script using the drivers.
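For the "script using the drivers" route, here is a minimal PyMongo sketch (the connection string and database names are placeholders; it copies documents only, so recreate indexes separately):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
old_db, new_db = client["old_db"], client["new_db"]

for name in old_db.list_collection_names():
    docs = list(old_db[name].find())          # load the collection's documents
    if docs:
        new_db[name].insert_many(docs)        # write them into the new database

Once the copy is verified, drop the old database (client.drop_database("old_db")) to complete the rename.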

How to find mongodb data and log files location through command?

How to find mongodb data and log files location through command?
like the SQL Server command below:
SELECT * FROM sys.database_files
The easiest way is probably with the getCmdLineOpts command:
db.getSiblingDB("admin").runCommand({getCmdLineOpts:1})
This Mongo Shell command first switches to the admin database and then executes the getCmdLineOpts command. An alternative is the shell wrapper:
db.serverCmdLineOpts()
These will return the parsed command line options, which should contain both the data directory being used and the log path.
{
"argv" : [
"C:\\****\\3.4.10\\bin\\mongod.exe",
"--dbpath",
"C:\\****\\data",
"--port",
"27017",
"--logpath",
"C:\\****\\data\\mongod.log",
"--bind_ip",
"0.0.0.0"
],
"parsed" : {
"net" : {
"bindIp" : "0.0.0.0",
"port" : 27017
},
"storage" : {
"dbPath" : "C:\\****\\data"
},
"systemLog" : {
"destination" : "file",
"path" : "C:\\****\\data\\mongod.log"
}
},
"ok" : 1
}
Note: I obfuscated my paths; they do not normally contain ****.
You can see it provides both the raw values and the parsed values. Whether options were specified on the command line or in a config file, this shows the effective values being used by the process. Keep in mind there are several extra options that can affect where data is stored, but this should get you on your way pretty quickly.
If you would like to know this information without using the Mongo Shell, you can grep the config file, look at the command-line options of the running process, or both.
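You can also ask the server from a driver. A minimal PyMongo sketch (the connection string is a placeholder, and the parsed keys only appear when the corresponding options were actually set):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
opts = client.admin.command("getCmdLineOpts")
print(opts["parsed"]["storage"]["dbPath"])         # present only if dbPath was set
print(opts["parsed"]["systemLog"]["path"])         # present only if a logpath was set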
You can also view logs in the mongo CLI.
To list all logs:
> show logs
global
startupWarnings
To show the log content:
> show log global
2018-01-30T09:14:10.305+0530 I CONTROL [initandlisten] MongoDB starting : pid=778 port=27017 dbpath=/var/lib/mongodb 64-bit host=ubuntu
2018-01-30T09:14:10.305+0530 I CONTROL [initandlisten] db version v3.6.1
2018-01-30T09:14:10.305+0530 I CONTROL [initandlisten] git version: 025d4f4fe61efd1fb6f0005be20cb45a004093d1
The data path is printed on the first line of the global log; on my machine it is dbpath=/var/lib/mongodb.
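The show log helpers are backed by the getLog command, so the same startup lines are available from a driver as well; a PyMongo sketch (the connection string is a placeholder):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
log = client.admin.command("getLog", "global")     # same data as "show log global"
for line in log["log"][:5]:                        # the startup lines include dbpath
    print(line)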

MongoError: OperationFailed: Sort operation used more than the maximum 33554432 bytes of RAM [duplicate]

Using the code:
all_reviews = db_handle.find().sort('reviewDate', pymongo.ASCENDING)
print all_reviews.count()
print all_reviews[0]
print all_reviews[2000000]
The count prints 2043484, and it prints all_reviews[0].
However when printing all_reviews[2000000], I get the error:
pymongo.errors.OperationFailure: database error: Runner error: Overflow sort stage buffered data usage of 33554495 bytes exceeds internal limit of 33554432 bytes
How do I handle this?
You're running into the 32MB limit on an in-memory sort:
https://docs.mongodb.com/manual/reference/limits/#Sort-Operations
Add an index to the sort field. That allows MongoDB to stream documents to you in sorted order, rather than attempting to load them all into memory on the server and sort them in memory before sending them to the client.
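For example, with PyMongo (assuming db_handle is the collection from the question):

import pymongo

# An index on the sort field lets the server return documents in index order
db_handle.create_index([("reviewDate", pymongo.ASCENDING)])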
As kumar_harsh said in the comments section, I would like to add another point.
You can view the current limit using the command below against the admin database:
> use admin
switched to db admin
> db.runCommand( { getParameter : 1, "internalQueryExecMaxBlockingSortBytes" : 1 } )
{ "internalQueryExecMaxBlockingSortBytes" : 33554432, "ok" : 1 }
It has a default value of 32 MB (33554432 bytes). In this case you're exceeding that buffer, so you can raise the limit to a value of your own choosing, for example roughly 50 MB as below:
> db.adminCommand({setParameter: 1, internalQueryExecMaxBlockingSortBytes:50151432})
{ "was" : 33554432, "ok" : 1 }
You can also set this limit permanently with the following parameter in the MongoDB config file:
setParameter=internalQueryExecMaxBlockingSortBytes=309715200
Hope this helps!
Note: this parameter is only supported in MongoDB 3.0 and later.
Solved with indexing:
db_handle.ensure_index([("reviewDate", pymongo.ASCENDING)])
If you want to avoid creating an index (e.g. you just want a quick-and-dirty check to explore the data), you can use aggregation with disk use allowed. In the mongo shell that looks like:
db.collection.aggregate([{ $sort: { reviewDate: 1 } }], { allowDiskUse: true })
(Not sure how to do this in pymongo, though.)
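PyMongo accepts the same option as a keyword argument; a sketch, assuming db_handle is the collection from the question:

all_reviews = db_handle.aggregate(
    [{"$sort": {"reviewDate": 1}}],  # sort stage, same as in the shell
    allowDiskUse=True,               # let the server spill the sort to disk
)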
JavaScript API syntax for the index:
db_handle.ensureIndex({executedDate: 1})
In my case, it was necessary to define the needed indexes in code and recreate them:
rake db:mongoid:create_indexes RAILS_ENV=production
The memory overflow does not occur when the field being sorted on has an index.
P.S. Before this I had to disable the error for index keys that are too long:
# mongo
MongoDB shell version: 2.6.12
connecting to: test
> db.getSiblingDB('admin').runCommand( { setParameter: 1, failIndexKeyTooLong: false } )
A reIndex may also be needed:
# mongo
MongoDB shell version: 2.6.12
connecting to: test
> use your_db
switched to db your_db
> db.getCollectionNames().forEach( function(collection){ db[collection].reIndex() } )

How to match sed pattern with :

I have following log:
File Process Start : 2014-03-12 - 10:43:40
UnZipping File : fetch_export?id=t63184&file=4
Extracted File : exp_4.mbx
Expected Messages : 6236
File Process Start : 2014-03-12 - 10:57:38
UnZipping File : fetch_export?id=t63185&file=39
Extracted File :
Expected Messages : 0
File Process Start : 2014-03-12 - 10:57:38
UnZipping File : fetch_export?id=t63185&file=33
Extracted File :
Expected Messages : 0
File Process Start : 2014-03-12 - 10:57:38
UnZipping File : fetch_export?id=sept2012&file=61
Extracted File : exp_61.mbx
Expected Messages : 7935
What do I need to change in this command?
sed -n '/File Process Start/,/Expected Messages/p'
I need output only between
File Process Start
...
Expected Messages : 0
Like this:
File Process Start : 2014-03-12 - 10:57:38
UnZipping File : fetch_export?id=t63185&file=39
Extracted File :
Expected Messages : 0
File Process Start : 2014-03-12 - 10:57:38
UnZipping File : fetch_export?id=t63185&file=33
Extracted File :
Expected Messages : 0
The idea is to print the lines between "File Process Start..." and "Expected Messages : 0". The issue is that you don't know beforehand if you will find the "Expected Messages : 0" or "Expected Messages : X". So, starting at the "File Process Start", keep saving the lines in the hold space until you are able to decide if you will print them or discard them.
sed -n '/Expected Messages : 0/{H;g;p};/.*/H;/File Process Start/h' log.txt
The pseudocode is:
If the line contains "Expected Messages : 0", (H) append this line to the hold space (at this point, the hold space holds the lines "File Process Start ...", "UnZipping File..." and "Extracted File..."), (g) copy the content of the hold space into the pattern space (i.e. the lines from "File Process Start..." down to the one just appended), and (p) print the content of the pattern space.
If the line contains anything else, (H) append it to the hold space.
If the line contains "File Process Start", (h) wipe out what's in the hold space and put this new line there.
This might work for you (GNU sed):
sed -n '/File Process Start/{h;d};H;/Expected Messages : 0/{g;p}' file
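For comparison, the same buffer-and-decide logic written out in Python (the file name is a placeholder):

# Buffer each block starting at "File Process Start"; print it only when its
# "Expected Messages" line reports 0.
block = []
with open("log.txt") as f:  # placeholder file name
    for line in f:
        if line.startswith("File Process Start"):
            block = [line]  # start a new block, discarding the previous one
        else:
            block.append(line)
        if line.startswith("Expected Messages : 0"):
            print("".join(block), end="")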