Assume a collection named test has the following data:
{ a : 1}
{ a : 2}
Also, it is indexed on {a : 1}
> db.test.stats()
{
"ns" : "mydb.test",
"count" : 2,
"size" : 96,
"avgObjSize" : 48,
"storageSize" : 8192,
"numExtents" : 1,
"nindexes" : 2,
"lastExtentSize" : 8192,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 16352,
"indexSizes" : {
"_id_" : 8176,
"a_1" : 8176
},
"ok" : 1
}
> db.test.totalSize()
24544
According to the documentation, it returns "the total size of the data in the collection plus the size of every index on the collection." How? From the data above:
total size of the data -> "size" : 96
size of every index -> 8176 * 2 -> 16352
Total -> 16352 + 96 = 16448
Why is there a difference? What am I missing?
The totalSize is equal to storageSize + totalIndexSize, not size + totalIndexSize. Here that is 8192 + 16352, which adds up to exactly 24544.
To avoid constant reallocation of new hard-drive space and the resulting filesystem fragmentation when new documents are added to a collection, MongoDB overallocates storage space for each collection in advance. As a result, the total space used by a collection is always larger than the sum of its data.
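The arithmetic can be checked against the stats() output from the question. This is a minimal sketch in plain JavaScript, intended to run in Node; the helper names expectedTotalSize and sumIndexSizes are hypothetical, and the stats object is copied by hand from the output above.

```javascript
// Values copied from the db.test.stats() output in the question.
const testStats = {
  size: 96,
  storageSize: 8192,
  totalIndexSize: 16352,
  indexSizes: { _id_: 8176, a_1: 8176 },
};

// Hypothetical helper: allocated storage plus all index storage.
function expectedTotalSize(stats) {
  return stats.storageSize + stats.totalIndexSize;
}

// Hypothetical helper: sum of the per-index sizes.
function sumIndexSizes(stats) {
  return Object.values(stats.indexSizes).reduce((sum, n) => sum + n, 0);
}

console.log(sumIndexSizes(testStats));                  // 16352
console.log(expectedTotalSize(testStats));              // 24544
console.log(testStats.size + testStats.totalIndexSize); // 16448 (size + totalIndexSize)
```

The gap between the last two numbers is storageSize minus size, that is, the space allocated for the collection but not yet occupied by documents.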
I have a capped collection. When I call stats() on it I get the following output:
/* 0 */
{
"ns" : "log_db.access_logs",
"count" : 42088,
"size" : 13602632,
"avgObjSize" : 323,
"storageSize" : 100003840,
"numExtents" : 1,
"nindexes" : 2,
"lastExtentSize" : 100003840,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 3932656,
"indexSizes" : {
"_id_" : 1389920,
"apikey_1_ts_1" : 2542736
},
"capped" : true,
"max" : NumberLong(9223372036854775807),
"ok" : 1
}
I don't know what the 'max' parameter is showing. I created this collection using the db.runCommand('convertToCapped' ... ) construct and set the 'size' parameter to 100 million, but there is no mention of it (size) here.
Can someone explain the meaning of the 'max' field here, and also how to find the correct size of a capped collection?
I had this exact question once. Since it was not in the documentation, this is the answer I got from a 10gen (MongoDB Inc) employee in the thread "mongodb capped collection is not using all available space":
The max property is an (optional) maximum number of documents to allow in the capped collection (see db.createCollection()). If you don't specify a max value, it will be set to MAXINT (in your example, the maximum positive value for a signed Int64). Since space for capped collections is always preallocated, the size limit takes precedence over the max number of documents.
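One way to read the max field in practice is to treat the sentinel value as "no document limit was set". The sketch below is plain JavaScript for Node (it uses BigInt, because 9223372036854775807 does not fit in a regular JavaScript number). The helper name explicitMaxDocs is hypothetical, and treating 2147483647 as a second sentinel is an assumption based on stats() output from older servers, not documented behavior.

```javascript
// Sentinel values that appear as "max" when no document limit was given.
// INT64_MAX matches the NumberLong shown in the stats() output above.
// INT32_MAX is an assumption based on output from older servers.
const INT64_MAX = 9223372036854775807n;
const INT32_MAX = 2147483647n;

// Hypothetical helper: returns null when "max" is only a sentinel,
// otherwise the explicit document limit as a BigInt.
function explicitMaxDocs(max) {
  const value = BigInt(max);
  return value === INT64_MAX || value === INT32_MAX ? null : value;
}

console.log(explicitMaxDocs(9223372036854775807n)); // null
console.log(explicitMaxDocs(5000));                 // 5000n
```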
When I googled for MongoDB best practices, I found that the size of a collection in MongoDB must be smaller than the RAM size of the machine.
I have 6 collections in my MongoDB database.
Please tell me how I can find the size of the collections present in MongoDB.
The stats for one of my collections are:
db.chains.stats()
{
"ns" : "at.chains",
"count" : 2967,
"size" : 89191980,
"avgObjSize" : 30061.33468149646,
"storageSize" : 335118080,
"numExtents" : 18,
"nindexes" : 3,
"lastExtentSize" : 67136000,
"paddingFactor" : 1.0099999999999996,
"flags" : 1,
"totalIndexSize" : 34742272,
"indexSizes" : {
"_id_" : 155648,
"symbol_1" : 172032,
"unique_symbol_1" : 34414592
},
"ok" : 1
}
Do I need to sum up the sizes of all 6 collections and compare that with the RAM size?
Or is there another way?
Thanks in advance .
You just need to call db.stats() in the MongoDB console. The MongoDB website covers this question.
> db.stats();
{
"db" : "test",
"collections" : 5,
"objects" : 24,
"avgObjSize" : 67.33333333333333,
"dataSize" : 1616,
"storageSize" : 28672,
"numExtents" : 5,
"indexes" : 4,
"indexSize" : 32704,
"fileSize" : 201326592,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"ok" : 1
}
You can calculate the size of your collection (chains) by looking at the "size" value. The 89191980 is in bytes, so it's roughly 85 MB. You can find the documentation here: http://docs.mongodb.org/manual/reference/command/collStats/
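The byte-to-megabyte conversion, and the summing across collections that the question asks about, can be sketched as follows in plain JavaScript for Node. The helper names are hypothetical, only the chains figures come from the question, and adding totalIndexSize to the total is an assumption (indexes also compete for RAM), not something the answer above states.

```javascript
// Stats for the one collection shown in the question; the other five
// collections would be appended to this array in the same shape.
const collectionStats = [
  { ns: "at.chains", size: 89191980, totalIndexSize: 34742272 },
];

// Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
function bytesToMiB(bytes) {
  return bytes / (1024 * 1024);
}

// Hypothetical helper: total data plus index bytes across collections.
function dataPlusIndexBytes(statsList) {
  return statsList.reduce((sum, s) => sum + s.size + s.totalIndexSize, 0);
}

console.log(bytesToMiB(89191980).toFixed(1));                            // "85.1"
console.log(bytesToMiB(dataPlusIndexBytes(collectionStats)).toFixed(1)); // "118.2"
```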
It'd be a good idea to take a look at this SO thread, which does a good job of covering RAM size and "working set":
What does it mean to fit "working set" into RAM for MongoDB?
I am looking at the output of db.system.profile.stats() and I'm curious about what the max field means in the returned document (running mongodb 2.2.2).
Here's an example:
> db.system.profile.stats()
{
"ns" : "mydb.system.profile",
"count" : 2476,
"size" : 1012284,
"avgObjSize" : 408.83844911147014,
"storageSize" : 1052672,
"numExtents" : 2,
"nindexes" : 0,
"lastExtentSize" : 4096,
"paddingFactor" : 1,
"systemFlags" : 0,
"userFlags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {
},
"capped" : true,
"max" : 2147483647,
"ok" : 1
}
There is no mention of max in the official MongoDB documentation of db.collection.stats().
Perhaps it has something to do with the fact that system.profile is a capped collection. However, max is definitely not the maximum size of the capped collection, because (1) the max shown is a huge number and (2) my collection doesn't get larger than 2500 or so documents and its total size is much less than this.
Any thoughts?
Thanks,
Kevin
max is an optional setting for a capped collection to also limit the number of documents in the collection, instead of just limiting by number of bytes (size).
See docs here.
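How the two limits interact can be illustrated with a toy in-memory model, written in plain JavaScript for Node. The class ToyCappedCollection and its fields are hypothetical; it only mimics the "drop the oldest documents once a limit is exceeded" behavior and is not how the server stores data.

```javascript
// Toy model of a capped collection: the oldest documents are dropped once
// either the byte budget (size) or the optional document limit (max) is
// exceeded. Illustration only, not the server's storage logic.
class ToyCappedCollection {
  constructor(sizeBytes, maxDocs = Infinity) {
    this.sizeBytes = sizeBytes;
    this.maxDocs = maxDocs;
    this.docs = [];     // oldest first
    this.usedBytes = 0;
  }

  insert(doc, docBytes) {
    this.docs.push({ doc, docBytes });
    this.usedBytes += docBytes;
    // Evict from the front until both limits hold again.
    while (this.usedBytes > this.sizeBytes || this.docs.length > this.maxDocs) {
      const evicted = this.docs.shift();
      this.usedBytes -= evicted.docBytes;
    }
  }
}

const bySize = new ToyCappedCollection(1000);     // size limit only
const byCount = new ToyCappedCollection(1000, 2); // size limit plus max = 2
for (let i = 0; i < 5; i++) {
  bySize.insert({ i }, 300);
  byCount.insert({ i }, 300);
}
console.log(bySize.docs.length);  // 3 (limited by bytes: 3 * 300 <= 1000)
console.log(byCount.docs.length); // 2 (limited by max)
```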
I have 57M documents in my mongodb collection, which is 19G of data.
My indexes are taking up 10G. Does this sound normal, or could I be doing something very wrong? My primary key index is 2G.
{
"ns" : "myDatabase.logs",
"count" : 56795183,
"size" : 19995518140,
"avgObjSize" : 352.0636272974065,
"storageSize" : 21217578928,
"numExtents" : 39,
"nindexes" : 4,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 10753999088,
"indexSizes" : {
"_id_" : 2330814080,
"type_1_playerId_1" : 2999537296,
"type_1_time_-1" : 2344582464,
"type_1_tableId_1" : 3079065248
},
"ok" : 1
}
The index size is determined by the number of documents being indexed, as well as the size of the key (compound keys store more information and will be larger). In this case, the size of the _id index divided by the number of documents is about 41 bytes per document, which seems relatively reasonable.
If you run db.collection.getIndexes(), you can find the index version. If it is {v : 0}, the index was created prior to MongoDB 2.0, in which case you should upgrade it to {v : 1}. This process is documented here: http://www.mongodb.org/display/DOCS/Index+Versions
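The per-document figure can be computed for every index from the stats() output. This is a minimal sketch in plain JavaScript for Node; the helper name bytesPerEntry is hypothetical and the numbers are copied from the question.

```javascript
// Values copied from the stats() output in the question.
const logsStats = {
  count: 56795183,
  indexSizes: {
    "_id_": 2330814080,
    "type_1_playerId_1": 2999537296,
    "type_1_time_-1": 2344582464,
    "type_1_tableId_1": 3079065248,
  },
};

// Hypothetical helper: average index bytes per document, for each index.
function bytesPerEntry(stats) {
  const result = {};
  for (const [name, bytes] of Object.entries(stats.indexSizes)) {
    result[name] = bytes / stats.count;
  }
  return result;
}

const perEntry = bytesPerEntry(logsStats);
for (const [name, bytes] of Object.entries(perEntry)) {
  console.log(name, bytes.toFixed(1));
}
// The first line printed is: _id_ 41.0
```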
https://gist.github.com/1173528#comments shows the structure of the data file. The short version is:
{ "img_ref" : {
"$ref" : "mapimage",
"$id" : ObjectId("4e454599f404e8d51c000002")
},
"scale" : 128, "image" : "4e454599f404e8d51c000002", "tile_i" : 0, "tile_j" : 9, "w" : 9, "e" : 10, "n" : 0, "s" : 0,
"heights" : [
[
0,
2,
0,
1,
515,
0,
256,
...], [...]
, _id: ObjectId("...") }
The stats() on this collection is:
{
"ns" : "ac2.mapimage_tile",
"count" : 18443,
"size" : 99513670744,
"avgObjSize" : 5395742.056281516,
"storageSize" : 100336473712,
"numExtents" : 74,
"nindexes" : 4,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 5832704,
"indexSizes" : {
"_id_" : 786432,
"img_ref_1_tile_i_1_tile_j_1" : 2236416,
"image_1" : 1212416,
"image_1_tile_i_1_tile_j_1_scale_1" : 1597440
},
"ok" : 1
}
Note the average object size, 5,395,742 bytes - or 5 MB! 5 MB for storing 16,384 ints seems pretty extreme!
See http://bsonspec.org/#/specification for how things get serialized in MongoDB. Arrays are actually very space-inefficient, especially because we store the index number as a string key for each element. This is less of a problem for small arrays of large elements like strings or objects, but for large arrays of 32-bit ints it is very expensive.
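The per-element overhead can be estimated from the BSON layout. This is a rough sketch in plain JavaScript for Node, assuming a flat array whose values are stored as int32; the function name bsonInt32ArraySize is hypothetical, and values stored as 8-byte doubles would be larger still.

```javascript
// Size in bytes of a BSON array holding n int32 values.
// Per the BSON spec, an array is encoded as a document whose keys are the
// decimal indexes "0", "1", "2", ... stored as null-terminated strings.
function bsonInt32ArraySize(n) {
  let bytes = 4 + 1; // int32 length prefix + trailing 0x00 terminator
  for (let i = 0; i < n; i++) {
    const keyBytes = String(i).length + 1; // index digits + null terminator
    bytes += 1 + keyBytes + 4;             // type byte + key + int32 value
  }
  return bytes;
}

const elementCount = 16384;
console.log(elementCount * 4);                 // 65536 bytes of raw int32 data
console.log(bsonInt32ArraySize(elementCount)); // 169119 bytes as a BSON array
```

Under these assumptions the encoded array is about 2.6 times the size of the raw integers, before counting any nesting of arrays inside arrays.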
MongoDB pre-allocates space for its databases: http://www.mongodb.org/display/DOCS/Developer+FAQ#DeveloperFAQ-Whyaremydatafilessolarge%3F
What you are likely seeing is that pre-allocation. If you add further items, you probably will not see a further increase in space usage for a long while.
Also: http://www.mongodb.org/display/DOCS/Excessive+Disk+Space