How can I find the unused space in MongoDB data storage? - mongodb

In MongoDB, storage is pre-allocated. For example, in the db.stats() output below, the storageSize of 65536 is not fully used by the documents. How can I find out how much of the pre-allocated storageSize is still free?
"127.0.0.1:27018" : {
"db" : "test",
"collections" : 1,
"objects" : 10,
"avgObjSize" : 53.08618233618234,
"dataSize" : 530,
"storageSize" : 65536,
"numExtents" : 0,
"indexes" : 1,
"indexSize" : 532,
"ok" : 1

If I understand the question correctly, the answer should be storageSize - dataSize.
This article can be really helpful: http://blog.mongolab.com/2014/01/how-big-is-your-mongodb/
See also the MongoDB documentation for dbStats, which explains the meaning of every value it returns: https://docs.mongodb.org/manual/reference/command/dbStats/
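As a rough illustration of the subtraction suggested above, the sketch below applies it to the figures from the question. The helper name estimateUnusedBytes is made up for this example. The result is only an approximation: on storage engines that compress data on disk, dataSize (uncompressed) and storageSize (on disk) are not directly comparable, so the difference is clamped at zero.

```javascript
// Hypothetical helper (not a MongoDB API): estimates unused pre-allocated
// space from the fields of a db.stats() result.
function estimateUnusedBytes(stats) {
  // Clamp at zero: with on-disk compression, storageSize can be
  // smaller than dataSize, which would give a negative difference.
  return Math.max(0, stats.storageSize - stats.dataSize);
}

// Values copied from the db.stats() output in the question.
const questionStats = { dataSize: 530, storageSize: 65536 };
console.log(estimateUnusedBytes(questionStats)); // 65006
```

In a live shell the same function could be called as estimateUnusedBytes(db.stats()).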

Related

Different relation between dataSize and storageSize in different databases

I am using MongoDB 4.0 and I have 2 different databases as below:
The first one has these stats:
"collections" : 527,
"views" : 0,
"objects" : 20512406,
"avgObjSize" : 145.463036271805,
"dataSize" : 2983796858.0,
"storageSize" : 10980642816.0,
"numExtents" : 0,
"indexes" : 2335,
"indexSize" : 7409999872.0
and the other one has these stats:
"collections" : 483,
"views" : 0,
"objects" : 11765584,
"avgObjSize" : 6324.48132315404,
"dataSize" : 74411216264.0,
"storageSize" : 30270824448.0,
"numExtents" : 0,
"indexes" : 1632,
"indexSize" : 939061248.0,
I am using the WiredTiger storage engine, and I know that it compresses data and keeps it on disk. My question is: why is storageSize larger than dataSize in the first database, while dataSize is larger than storageSize in the second one?
One more question: why is numExtents 0? I know it counts data extents and index extents, but why does it show 0?
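To make the comparison concrete, the sketch below computes the ratio of storageSize to dataSize for both databases, using only the numbers quoted above. The function name storageRatio is invented for illustration. A ratio above 1 means more space is allocated on disk than the uncompressed documents occupy; a ratio below 1 means the on-disk footprint is smaller than the uncompressed data.

```javascript
// Hypothetical helper: ratio of on-disk allocation to uncompressed data size.
function storageRatio(stats) {
  return stats.storageSize / stats.dataSize;
}

// Figures copied from the two db.stats() outputs in the question.
const firstDb = { dataSize: 2983796858, storageSize: 10980642816 };
const secondDb = { dataSize: 74411216264, storageSize: 30270824448 };

console.log(storageRatio(firstDb).toFixed(2));  // "3.68"
console.log(storageRatio(secondDb).toFixed(2)); // "0.41"
```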

What are the meanings of "storageSize" and "size" for a collection's sizes in MongoDB 3.2?

About the statistics of a collection
MongoDB: The Definitive Guide, 2nd ed., by Kristina Chodorow (2013),
which I guess uses MongoDB 2.4.0, says that "storageSize" is
greater than "size":
For seeing information about a whole collection, there is a stats function:
> db.boards.stats()
{
"ns" : "brains.boards",
"count" : 12,
"size" : 32292,
"avgObjSize" : 2691,
"storageSize" : 270336,
"numExtents" : 3,
"nindexes" : 2,
"lastExtentSize" : 212992,
"paddingFactor" : 1.0099999999999825,
"flags" : 1,
"totalIndexSize" : 16352,
"indexSizes" : {
"_id_" : 8176,
"username_1_slug_1" : 8176
},
"ok" : 1
}
"size" is what you’d get if you called Object.bsonsize() on each
element in the collection and added up all the sizes: it’s the actual
number of bytes the documents in the collection are taking up.
Equivalently, if you take the "avgObjSize" and multiply it by "count",
you’ll get "size".
As mentioned above, a total count of the documents’ bytes leaves out
some important space a collection uses: the padding around each
document and the indexes. "storageSize" not only includes those, but
also empty space that has been set aside for the collection but not
yet used. Collections always have empty space at the “end” so that new
documents can be added quickly.
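The relation quoted above ("avgObjSize" times "count" gives "size") can be checked against the sample output from the book. This is a minimal sketch; the helper name sizeFromAverage is made up for the example, and on real collections the product may differ slightly when avgObjSize is rounded.

```javascript
// Hypothetical helper: reconstructs "size" from "count" and "avgObjSize".
function sizeFromAverage(stats) {
  return stats.count * stats.avgObjSize;
}

// Values copied from the db.boards.stats() output quoted above.
const boards = { count: 12, size: 32292, avgObjSize: 2691 };

console.log(sizeFromAverage(boards));                // 32292
console.log(sizeFromAverage(boards) === boards.size); // true
```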
On my local computer, I experimented with MongoDB 3.2 to get the
statistics of a collection, and found that "storageSize" is smaller
than "size":
> db.c20160712.stats(1024)
{
"ns" : "TVFS.c20160712",
"count" : 2305,
"size" : 231,
"avgObjSize" : 102,
"storageSize" : 80,
...
Did the meanings of "storageSize" and "size" change between MongoDB 2.x and 3.2?
If yes, what do they mean in 3.2?
Thanks.

Is there a "ghost collection" in MongoDB

I created a database named test in MongoDB, with a collection named col. The command show dbs gives:
admin (empty)
local 0.078GB
test 1.953GB
(I really don't know why the size is greater than 1.9 GB, since there are only 21 small documents in the collection col.)
The command db.stats() tells me that there are 3 collections:
{
"db" : "test",
"collections" : 3,
"objects" : 25,
"avgObjSize" : 96.64,
"dataSize" : 2416,
"storageSize" : 857456640,
"numExtents" : 19,
"indexes" : 1,
"indexSize" : 8176,
"fileSize" : 2080374784,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"extentFreeList" : {
"num" : 2,
"totalSize" : 139264
},
"ok" : 1
}
But when I type show collections, only two collections are listed:
col
system.indexes
So where does this third collection come from?
And does it explain why the test database is 1.9 GB? db.col.stats() tells me that a lot of storage is allocated, but the 21 documents are really small:
{
"ns" : "test.col",
"count" : 21,
"size" : 2160,
"avgObjSize" : 102,
"storageSize" : 857440256,
"numExtents" : 17,
"nindexes" : 1,
"lastExtentSize" : 227803136,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 8176,
"indexSizes" : {
"_id_" : 8176
},
"ok" : 1
}
The size
Ok, first, as for the size: show dbs shows the size of the actual data files, not their content.
MongoDB preallocates data files of fixed size, usually beginning with 64MB (plus a namespace file) and doubling with each step until 2GB is reached. Each subsequent data file is 2GB in size. A new data file is allocated as soon as the last allocated data file receives its first entry, so that requests never have to wait for a data file to be allocated.
It may well be the case that your test database (being the default database) was much bigger, but documents or even whole collections were deleted. However, as of the time of this writing, mongod never reclaims disk space automatically, but leaves allocated datafiles for future use.
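The doubling scheme described above can be sketched in a few lines. The function name dataFileSizesMB is invented for this illustration. With the numbers from the question, the reported fileSize equals the sum of the first five file sizes in that sequence, and adding the 16 MB namespace file (nsSizeMB) gives the 1.953GB printed by show dbs. That the database consists of exactly five data files is an inference from this arithmetic, not something stated in the question.

```javascript
const MB = 1024 * 1024;

// Hypothetical helper: sizes (in MB) of the first n preallocated data files,
// starting at 64 MB and doubling until the 2048 MB (2 GB) cap is reached.
function dataFileSizesMB(n) {
  const sizes = [];
  let size = 64;
  for (let i = 0; i < n; i++) {
    sizes.push(size);
    size = Math.min(size * 2, 2048);
  }
  return sizes;
}

const firstFive = dataFileSizesMB(5); // [64, 128, 256, 512, 1024]
const totalMB = firstFive.reduce((sum, s) => sum + s, 0); // 1984

// fileSize from the db.stats() output in the question, in bytes.
console.log(totalMB * MB === 2080374784); // true

// Add the 16 MB namespace file (nsSizeMB) and convert to GB.
console.log(((totalMB + 16) / 1024).toFixed(3)); // "1.953"
```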
The "Ghost" collection
As for the "Ghost" collection: Yes, there is one. It is the namespace collection of the database (not surprisingly called system.namespaces), which is implicit. Have a look at the output of
db.system.namespaces.find()
That being said: Don't fiddle with any of the system.* collections. They don't have their name for fun.

How do I find the size of my MongoDB collections so that I can compare it with the RAM size?

When I googled MongoDB best practices, I found that the size of a collection in MongoDB should be smaller than the machine's RAM.
I have 6 collections in my MongoDB database.
Please tell me how I can find the size of the collections in MongoDB.
The stats for one of my collections are:
db.chains.stats()
{
"ns" : "at.chains",
"count" : 2967,
"size" : 89191980,
"avgObjSize" : 30061.33468149646,
"storageSize" : 335118080,
"numExtents" : 18,
"nindexes" : 3,
"lastExtentSize" : 67136000,
"paddingFactor" : 1.0099999999999996,
"flags" : 1,
"totalIndexSize" : 34742272,
"indexSizes" : {
"_id_" : 155648,
"symbol_1" : 172032,
"unique_symbol_1" : 34414592
},
"ok" : 1
}
Do I need to sum up the sizes of all 6 collections and compare the total with the RAM size?
Or is there another way?
Thanks in advance.
You just need to call db.stats() in the MongoDB console; the MongoDB website has documentation on this command.
> db.stats();
{
"db" : "test",
"collections" : 5,
"objects" : 24,
"avgObjSize" : 67.33333333333333,
"dataSize" : 1616,
"storageSize" : 28672,
"numExtents" : 5,
"indexes" : 4,
"indexSize" : 32704,
"fileSize" : 201326592,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"ok" : 1
You can calculate the size of your collection (chains) by looking at the "size" value. The 89191980 is in bytes, so it's roughly 85 MB. You can find the documentation here: http://docs.mongodb.org/manual/reference/command/collStats/
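The byte-to-megabyte conversion above can be written out as a small sketch. The helper name toMB is made up for this example. Adding totalIndexSize to the data size is an assumption of this sketch rather than part of the answer; it is included because indexes also take up memory when they are in use.

```javascript
// Hypothetical helper: converts a byte count to megabytes (1 MB = 1024 * 1024 bytes).
function toMB(bytes) {
  return bytes / (1024 * 1024);
}

// Values copied from the db.chains.stats() output in the question.
const chains = { size: 89191980, totalIndexSize: 34742272 };

console.log(toMB(chains.size).toFixed(1)); // "85.1"

// Assumption: data plus indexes is the figure to compare against RAM.
console.log(toMB(chains.size + chains.totalIndexSize).toFixed(1)); // "118.2"
```

To cover all six collections, the same sum could be repeated per collection, or the database-wide dataSize and indexSize from db.stats() could be used instead.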
It would also be a good idea to look at this SO thread, which does a good job of covering RAM size and the "working set":
What does it mean to fit "working set" into RAM for MongoDB?

Why are my MongoDB indexes so large?

I have 57M documents in my MongoDB collection, which is 19G of data.
My indexes are taking up 10G. Does this sound normal, or could I be doing something very wrong? My primary key index is 2G.
{
"ns" : "myDatabase.logs",
"count" : 56795183,
"size" : 19995518140,
"avgObjSize" : 352.0636272974065,
"storageSize" : 21217578928,
"numExtents" : 39,
"nindexes" : 4,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 10753999088,
"indexSizes" : {
"_id_" : 2330814080,
"type_1_playerId_1" : 2999537296,
"type_1_time_-1" : 2344582464,
"type_1_tableId_1" : 3079065248
},
"ok" : 1
}
The index size is determined by the number of documents being indexed, as well as the size of the key (compound keys store more information and will be larger). In this case, the size of the _id index divided by the number of documents is about 41 bytes, which seems relatively reasonable.
If you run db.collection.getIndexes(), you can find the index version. If it is {v : 0}, the index was created prior to MongoDB 2.0, in which case you should upgrade to {v : 1}. This process is documented here: http://www.mongodb.org/display/DOCS/Index+Versions
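The per-document figure mentioned above can be computed for every index in the stats output. This is a minimal sketch using the numbers from the question; the helper name bytesPerDocument is invented for the example. For the _id_ index it gives about 41 bytes per document, in line with the rough figure quoted in the answer, and the compound indexes come out somewhat larger.

```javascript
// Hypothetical helper: average index bytes per document, for each index
// listed in the "indexSizes" field of a collection stats result.
function bytesPerDocument(stats) {
  const result = {};
  for (const [name, bytes] of Object.entries(stats.indexSizes)) {
    result[name] = bytes / stats.count;
  }
  return result;
}

// Values copied from the collection stats in the question.
const logs = {
  count: 56795183,
  indexSizes: {
    "_id_": 2330814080,
    "type_1_playerId_1": 2999537296,
    "type_1_time_-1": 2344582464,
    "type_1_tableId_1": 3079065248
  }
};

for (const [name, perDoc] of Object.entries(bytesPerDocument(logs))) {
  console.log(name + ": " + perDoc.toFixed(1) + " bytes per document");
}
```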