MongoDB optimization

I need to optimize my MongoDB performance but can't figure out how. Maybe there are some tips, or maybe I should use another storage engine. Any ideas are welcome.
I have the following log output, which describes the query:
2015-08-04T15:09:56.226+0300 [conn129682] command mongodb_db1.$cmd command: aggregate { aggregate: "collection", pipeline: [ { $match: { _id.index_id_1: 4931359 } } ] } keyUpdates:0 numYields:39 locks(micros) r:83489 reslen:177280 286ms
I have a collection named collection with the following document structure:
{
"_id" : {
"x" : "x",
"index_id_1" : NumberLong(5617088)
},
"value" : {
"value_1" : 1.0000000000000000,
"value_2" : 0.0000000000000000,
"value_3" : 1.0000000000000000
}
}
Querying the collection stats gives the following details:
{
"ns" : "mongodb_db1.collection",
"count" : 2.07e+007,
"size" : 4968000000.0000000000000000,
"avgObjSize" : 240,
"storageSize" : 5524459408.0000000000000000,
"numExtents" : 25,
"nindexes" : 3,
"lastExtentSize" : 5.36601e+008,
"paddingFactor" : 1.0000000000000000,
"systemFlags" : 0,
"userFlags" : 1,
"totalIndexSize" : 4475975728.0000000000000000,
"indexSizes" : {
"_id_" : 2884043120.0000000000000000,
"_id.x.index_id_1" : 1.07118e+009,
"_id.index_id_1" : 5.20754e+008
},
"ok" : 1.0000000000000000
}
Running on a single node (no shards).
MongoDB version: 2.4.
Installed RAM (MB): 24017 (index size ~120 GB)
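Since the aggregation above is a single $match on "_id.index_id_1", and the stats show an index of that name, a quick sanity check (a hypothetical shell session, with the collection name taken from the question) is to run the same predicate through find().explain() and confirm the index is actually used:

```javascript
// Hypothetical mongo 2.4 shell sketch: verify the $match predicate can use
// the existing {"_id.index_id_1": 1} index. Dotted field paths must be quoted.
db.collection.find({ "_id.index_id_1": 4931359 }).explain()
// In the explain output, look for:
//   "cursor" : "BtreeCursor ..." naming the _id.index_id_1 index (index used)
//   "nscanned" close to "n" (little over-scanning)
// If the cursor is "BasicCursor", the index is not being used and the
// aggregation scans the collection for every $match.
```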

10gen / MongoDB are running a series of FREE online courses that cover all you need to know (the latest iteration starts today). Simply head over and sign up for the DBA course and, if you're feeling brave, a couple of the others, but there is a lot of common/duplicated material between all the variants at the beginning.

Related

How to measure the query run time in MongoDB

I am trying to measure the query run time in MongoDB.
Steps:
I enabled profiling in MongoDB and ran my query.
When I ran show profile I got the output below.
db.blogpost.find({post:/.* NATO .*/i})
blogpost is the collection name; I searched for the "NATO" keyword in the query.
Output: the query pulled out 20 records, and after running it I got the following execution details:
In the output I can see 3 time values. Which one is equivalent to the query duration in MySQL?
query blogtrackernosql.blogpost 472ms Wed Apr 11 2018 20:37:54
command:{
"find" : "blogpost",
"filter" : {
"post" : /.* NATO .*/i
},
"$db" : "blogtrackernosql"
} cursorid:99983342073 keysExamined:0 docsExamined:1122 numYield:19 locks:{
"Global" : {
"acquireCount" : {
"r" : NumberLong(40)
}
},
"Database" : {
"acquireCount" : {
"r" : NumberLong(20)
}
},
"Collection" : {
"acquireCount" : {
"r" : NumberLong(20)
}
}
} nreturned:101 responseLength:723471 protocol:op_msg planSummary:COLLSCAN
execStats:{
"stage" : "COLLSCAN",
"filter" : {
"post" : {
"$regex" : ".* NATO .*",
"$options" : "i"
}
},
"nReturned" : 101,
"executionTimeMillisEstimate" : 422,
"works" : 1123,
"advanced" : 101,
"needTime" : 1022,
"needYield" : 0,
"saveState" : 20,
"restoreState" : 19,
"isEOF" : 0,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 1122
} client:127.0.0.1 appName:MongoDB Shell allUsers:[ ] user:
This ...
"executionTimeMillisEstimate" : 422
... is MongoDB's estimate of how long the query took to execute on the MongoDB server.
This ...
query blogtrackernosql.blogpost 472ms
... must be the end-to-end time, including some client-side work (e.g. forming the query and sending it to the MongoDB server) plus the time to transfer the data from the MongoDB server back to your client.
So:
472ms is the total start-to-finish time
422ms is the time spent inside the MongoDB server
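If you only want the server-side execution time and don't need the profiler, a sketch of an alternative (the explain verbosity modes exist from MongoDB 3.0 onward; collection and field names are taken from the question) is:

```javascript
// Sketch: get server-side timing directly from the query planner
// instead of the profiler (requires MongoDB 3.0+).
db.blogpost.find({ post: /.* NATO .*/i }).explain("executionStats")
// executionStats.executionTimeMillis is the time spent executing the
// winning plan on the server, comparable to the profiler's estimate above.
```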
Note: the output also tells you that MongoDB had to scan the entire collection ("stage" : "COLLSCAN") to perform this query. FWIW, the reason it has to scan the collection is that you are using a case-insensitive $regex. According to the docs:
Case insensitive regular expression queries generally cannot use indexes effectively.
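One common workaround (a sketch, not part of the original answer; post_lower is a hypothetical field name) is to store a lowercased copy of the field and query it case-sensitively, since only case-sensitive, prefix-anchored regexes can use an index efficiently. A text index is another option for whole-word searches:

```javascript
// Hypothetical sketch: make the search index-friendly by maintaining a
// lowercased shadow field on write (doc.post_lower = doc.post.toLowerCase()).
db.blogpost.createIndex({ post_lower: 1 })
db.blogpost.find({ post_lower: /^nato/ })  // anchored prefix regex: can use the index
// Note the semantics differ from the original /.* NATO .*/i (contains vs. starts-with).
// Alternatively, a text index supports word searches without a shadow field:
db.blogpost.createIndex({ post: "text" })
db.blogpost.find({ $text: { $search: "NATO" } })
```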

How to improve/optimize speed of MongoDB query?

I have a small Mongo database with ~30k records.
A simple query that uses 5-6 parameters takes almost a second, even though the entire DB fits in RAM.
Can anyone suggest what I'm doing wrong?
2015-11-26T18:41:29.540+0200 [conn3] command vvpilotdb2.$cmd command:
count { count: "TestResults", query: { Test: 5.0, IsAC: true,
InputMode: 0.0, IsOfficialTest: true, IsSanity: false, IsStress:
false, IsUnderNoise: false, MetalRodSize: 9.0 }, fields: {} }
planSummary: COLLSCAN keyUpdates:0 numYields:1 locks(micros) r:1397227
reslen:48 944ms
Here is db.stats(). I haven't created any indexes myself; all settings are at their defaults:
> db.stats()
{
"db" : "vvpilotdb2",
"collections" : 5,
"objects" : 28997,
"avgObjSize" : 7549.571610856296,
"dataSize" : 218914928,
"storageSize" : 243347456,
"numExtents" : 17,
"indexes" : 3,
"indexSize" : 964768,
"fileSize" : 469762048,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"extentFreeList" : {
"num" : 0,
"totalSize" : 0
},
"ok" : 1
}
In MongoDB, the _id field is indexed by default.
You should index the fields you use in your queries.
Compound indexes can also be created on multiple fields, specifying the sort order (ascending/descending) for each.
Here's the documentation:
https://docs.mongodb.org/manual/indexes/
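The planSummary: COLLSCAN in the log confirms the count is scanning all ~29k documents. A sketch of an index that would serve it (field names taken from the logged query; with all-equality predicates the field order is flexible):

```javascript
// Sketch: a compound index covering the equality predicates from the
// logged count command. Putting the most selective fields first keeps
// the scanned key range small. (ensureIndex is the 2.x shell name;
// newer shells use createIndex.)
db.TestResults.ensureIndex({
  Test: 1, IsAC: 1, InputMode: 1, IsOfficialTest: 1,
  IsSanity: 1, IsStress: 1, IsUnderNoise: 1, MetalRodSize: 1
})
// Re-run the count and check the log: planSummary should now show
// an index scan (IXSCAN / BtreeCursor) instead of COLLSCAN.
```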

mongodb query should be covered by index but is not

the query:
db.myColl.find({"M.ST": "mostrepresentedvalueinthecollection", "M.TS": new Date(2014,2,1)}).explain()
explain output :
"cursor" : "BtreeCursor M.ST_1_M.TS_1",
"isMultiKey" : false,
"n" : 587606,
"nscannedObjects" : 587606,
"nscanned" : 587606,
"nscannedObjectsAllPlans" : 587606,
"nscannedAllPlans" : 587606,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 9992,
"nChunkSkips" : 0,
"millis" : 174820,
"indexBounds" : {
"M.ST" : [
[
"mostrepresentedvalueinthecollection",
"mostrepresentedvalueinthecollection"
]
],
"M.TS" : [
[
ISODate("2014-03-01T00:00:00Z"),
ISODate("2014-03-01T00:00:00Z")
]
]
},
"server" : "myServer"
additional details: myColl contains about 40m documents; the average object size is 300b.
I don't get why indexOnly is not set to true, since I have a compound index on {"M.ST":1, "M.TS":1}.
The mongo host is a unix box with 16gb RAM and 500gb disk space (spinning disk).
The total index size of the database is 10gb, and we get around 1k upserts/sec; of those 1k, 20 are inserts and the rest are increments.
We have another query that adds a third field to the find (called "M.X"), with a compound index on "M.ST", "M.X", "M.TS". That one is lightning fast and scans only 330 documents.
Any idea what could be wrong?
Thanks.
EDIT : here's the structure of a sample document:
{
"_id" : "somestring",
"D" : {
"20140301" : {
"IM" : {
"CT" : 143
}
},
"20140302" : {
"IM" : {
"CT" : 44
}
},
"20140303" : {
"IM" : {
"CT" : 206
}
},
"20140314" : {
"IM" : {
"CT" : 5
}
}
},
"Y" : "someotherstring",
"IM" : {
"CT" : 1
},
"M" : {
"X" : 99999,
"ST" : "mostrepresentedvalueinthecollection",
"TS" : ISODate("2014-03-01T00:00:00.000Z")
},
}
The idea is to store some analytics metrics by month; the "D" field is an embedded document holding data for each day of the month.
Covered queries on fields of embedded documents (accessed with dot notation) are not currently supported. The corresponding JIRA ticket is SERVER-2104. You can upvote it, but for now, to use covered index queries you need to avoid dot notation / embedded documents.
I think you need to set a projection on that query, to tell Mongo which fields the index covers.
Try this:
db.myColl.find({"M.ST": "mostrepresentedvalueinthecollection", "M.TS": new Date(2014,2,1)}, {"M.ST": 1, "M.TS": 1, "_id": 0}).explain()
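Note that because of SERVER-2104, a projection alone won't make explain report indexOnly: true here, since the indexed fields live inside the embedded document M. A sketch of a schema change that makes the query coverable (M_ST/M_TS are hypothetical top-level field names):

```javascript
// Sketch: promote the embedded fields to top level so the query can be
// covered entirely by the index, avoiding document fetches.
// Assumed document shape: { _id: ..., M_ST: "...", M_TS: ISODate(...), ... }
db.myColl.ensureIndex({ M_ST: 1, M_TS: 1 })
db.myColl.find(
  { M_ST: "mostrepresentedvalueinthecollection", M_TS: new Date(2014, 2, 1) },
  { M_ST: 1, M_TS: 1, _id: 0 }
).explain()
// With top-level fields and this projection, explain should report
// indexOnly: true.
```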

exception: BSONObj size: -286331154 (0xEEEEEEEE) is invalid. Size must be between 0 and 16793600(16MB)

I'm trying to use full text search: http://docs.mongodb.org/manual/tutorial/search-for-text/
db['Item'].runCommand('text', {search: 'deep voice', language: 'english'})
It works well,
but when I add conditions:
db['Item'].runCommand( 'text', { search: 'deep voice' , language: 'english' , filter: {"$and":[{"_extendedBy":{"$in":["Voiceover"]}},{"$and":[{"$or":[{"removed":null},{"removed":{"$exists":false}}]},{"category":ObjectId("51bc464ab012269e23278d55")},{"active":true},{"visible":true}]}]} } )
I receive this error:
{
"queryDebugString" : "deep|voic||||||",
"language" : "english",
"errmsg" : "exception: BSONObj size: -286331154 (0xEEEEEEEE) is invalid. Size must be between 0 and 16793600(16MB) First element: _extendedBy: \"Voiceover\"",
"code" : 10334,
"ok" : 0
}
If I delete the word "voice":
db['Item'].runCommand( 'text', { search: 'deep' , language: 'english' , filter: {"$and":[{"_extendedBy":{"$in":["Voiceover"]}},{"$and":[{"$or":[{"removed":null},{"removed":{"$exists":false}}]},{"category":ObjectId("51bc464ab012269e23278d55")},{"active":true},{"visible":true}]}]} } );
I receive a normal response (output truncated):
],
"stats" : {
"nscanned" : 87,
"nscannedObjects" : 87,
"n" : 18,
"nfound" : 18,
"timeMicros" : 1013
},
"ok" : 1
}
I can't understand why the error occurs.
The database is not large ("storageSize" : 2793472):
db.Item.stats()
{
"ns" : "internetjock.Item",
"count" : 616,
"size" : 2035840,
"avgObjSize" : 3304.935064935065,
"storageSize" : 2793472,
"numExtents" : 5,
"nindexes" : 12,
"lastExtentSize" : 2097152,
"paddingFactor" : 1.0000000000001221,
"systemFlags" : 0,
"userFlags" : 1,
"totalIndexSize" : 7440160,
"indexSizes" : {
"_id_" : 24528,
"modlrHff22a60ae822e1e68ba919bbedcb8957d5c5d10f" : 40880,
"modlrH6f786b134a46c37db715aa2c831cfbe1fadb9d1d" : 40880,
"modlrI467f6180af484be29ee9258920fc4837992c825e" : 24528,
"modlrI5cb302f507b9d0409921ac0c51f7d9fc4fd5d2ee" : 40880,
"modlrI6393f31b5b6b4b2cd9517391dabf5db6d6dd3c28" : 8176,
"modlrI1c5cbf0ce48258a5a39c1ac54a1c1a038ebe1027" : 32704,
"modlrH6e623929cc3867746630bae4572b9dbe5bd3b9f7" : 40880,
"modlrH72ea9b8456321008fd832ef9459d868800ce87cb" : 40880,
"modlrU821e16c04f9069f8d0b705d78d8f666a007c274d" : 24528,
"modlrT88fc09e54b17679b0028556344b50c9fe169bdb5" : 7080416,
"modlrIefa804b72cc346d66957110e286839a3f42793ef" : 40880
},
"ok" : 1
}
I had the same problem with mongo 3.0.0 and 3.1.9 on a relatively small database (12GB).
After wasting roughly 4 hours on this I found a workaround using a hidden parameter:
mongorestore --batchSize=10
where the number varies depending on the nature of your data. Start with 1000.
The result document returned by the first query is apparently greater than 16MB. MongoDB has a maximum document size of 16MB. The second query returns a document smaller than 16MB, hence no error.
There's no way around this. Here's the link to the documentation:
http://docs.mongodb.org/manual/reference/limits/
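To check whether any stored document is actually approaching the limit (a diagnostic sketch using the mongo shell helper Object.bsonsize(), which returns a document's BSON size in bytes), you can scan for the largest document:

```javascript
// Sketch: find the largest document in the collection; if nothing is
// near 16MB, the oversized BSONObj comes from somewhere else (e.g. a
// corrupt index, as the accepted fix below suggests).
var max = 0;
db.Item.find().forEach(function (doc) {
  var s = Object.bsonsize(doc);
  if (s > max) max = s;
});
print("largest document: " + max + " bytes (limit is 16MB)");
```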
Recreate the Text Index and everything works :-)
db.Item.dropIndex('modlrT88fc09e54b17679b0028556344b50c9fe169bdb5');
db.Item.ensureIndex({'keywords':'text'},{'name':'modlrT88fc09e54b17679b0028556344b50c9fe169bdb5'})
db.Item.stats()
...
"modlrT88fc09e54b17679b0028556344b50c9fe169bdb5" : 7080416, //before
...
"modlrT88fc09e54b17679b0028556344b50c9fe169bdb5" : 2518208 //after Recreated the Text Index

Why are my mongodb indexes so large

I have 57M documents in my mongodb collection, which is 19G of data.
My indexes are taking up 10G. Does this sound normal, or could I be doing something very wrong? My primary key index alone is 2G.
{
"ns" : "myDatabase.logs",
"count" : 56795183,
"size" : 19995518140,
"avgObjSize" : 352.0636272974065,
"storageSize" : 21217578928,
"numExtents" : 39,
"nindexes" : 4,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 10753999088,
"indexSizes" : {
"_id_" : 2330814080,
"type_1_playerId_1" : 2999537296,
"type_1_time_-1" : 2344582464,
"type_1_tableId_1" : 3079065248
},
"ok" : 1
}
The index size is determined by the number of documents being indexed, as well as the size of the key (compound keys store more information and will be larger). In this case, the _id index size divided by the number of documents works out to roughly 41 bytes per entry, which seems relatively reasonable.
If you run db.collection.getIndexes(), you can find the index version. If {v : 0}, the index was created prior to mongo 2.0, in which case you should upgrade to {v:1}. This process is documented here: http://www.mongodb.org/display/DOCS/Index+Versions
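As a sanity check, the per-entry overhead can be computed directly from the collStats numbers above (plain JavaScript, runnable outside the shell):

```javascript
// Average bytes per index entry, from the collStats output above.
const count = 56795183;               // documents in the collection
const indexSizes = {
  "_id_": 2330814080,
  "type_1_playerId_1": 2999537296,
  "type_1_time_-1": 2344582464,
  "type_1_tableId_1": 3079065248,
};
const perEntry = {};
for (const [name, bytes] of Object.entries(indexSizes)) {
  perEntry[name] = Math.round(bytes / count);
}
console.log(perEntry);
// _id_ works out to ~41 bytes per entry; the compound indexes are
// ~41-54 bytes each, which is unremarkable for 57M keys.
```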