Does MongoDB reuse deleted space?

First off, I know about this question:
Auto compact the deleted space in mongodb?
My question is not about shrinking DB file sizes, though, but about the reuse of deleted space. Say I have 100K documents in a collection and I then delete 50K of those. Will Mongo reuse the space within its data files that the deleted documents have freed? Or are they simply "marked" as deleted?
I don't care so much about the actual size of the file on disk; it's more about "does it just grow and grow".

Update (Mar 2015): As of the 3.0 release, there are multiple storage engines available in MongoDB. This answer applies to the MMAP storage engine (still the default in MongoDB 3.0); the answer for other engines (WiredTiger, for example) is quite different and may well be tunable and adjustable. Hence if you are using another engine, please read the relevant docs for that storage engine to determine what your space re-use defaults and options are.
With the MMAP storage engine, when documents are deleted the space left behind is put onto a free list. However, for that space to be re-used, a similarly sized document has to be inserted later, and MongoDB has to find a suitable slot on that free list within a certain time frame (once it times out searching the list, it will simply append), so re-use will not necessarily happen very often. This all happens within the data files, so there is no disk space reclamation here; all of it is done internally within the existing data files.
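If you want to see how much freed space is currently sitting on that free list, a full validate of the collection reports the deleted-record figures. A minimal shell sketch, assuming a collection named foo (the exact output fields can vary a little between versions):

// Full validation scans the collection and, under MMAP, reports free-list details.
var v = db.foo.validate(true)

// deletedCount / deletedSize describe records on the free list that are
// available for re-use by later inserts of a suitable size.
printjson({ deletedCount: v.deletedCount, deletedSize: v.deletedSize })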
If you subsequently do a repair, or resync a secondary from scratch, the data files are rewritten and the space on disk will be reclaimed (any padding on docs is also removed). This is where you will see actual space reclamation on-disk. For any other actions (compact included) the on disk usage will not change and may even increase.
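In shell terms the distinction looks roughly like this (the collection name is a placeholder, and repairDatabase needs enough free disk space to rewrite the files):

// compact defragments a collection inside the existing data files (MMAP);
// freed space goes back to MongoDB's free list, not to the operating system.
db.runCommand({ compact: "foo" })

// repairDatabase rewrites all data files for the current database,
// which is what actually shrinks the on-disk footprint.
db.repairDatabase()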
With 2.2+ you can now use the collMod command and the usePowerOf2Sizes option to make the re-use of deleted space more likely (note that this is the default in 2.6+). This means that the initial space allocation for a document is a bit less efficient (512 bytes for a 400 byte doc, for example), but it means that when a new doc is inserted it is more likely to be able to re-use that space. If you are deleting (or growing and hence moving) documents a lot, then this will be more efficient in the long term.
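For example, to turn this on for an existing collection on 2.2 or 2.4 (from 2.6 onwards it is already on by default), something along these lines:

// Switch an existing collection to power-of-2 record allocation so that
// freed records are more likely to be re-used by later inserts.
db.runCommand({ collMod: "foo", usePowerOf2Sizes: true })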
For anyone that is interested, one of the people that wrote a lot of the storage code (Mathias Stearn) has a great presentation about the storage internals, which can be found here

Related

Confused about the advantage of MongoDB gridfs

The MongoDB GridFS docs say the big advantage is that a big file is split into chunks, so you don't have to load the entire file into memory if you just want to see part of it. But my confusion is this: even if I open a big file from the local disk, I can just use a skip() API to load only the part of the file I want. I don't have to load the entire file at all. So why does MongoDB say that is the advantage?
Even though the cursor.skip() method does not return the skipped documents, the server still has to walk from the beginning of the collection or index to reach the offset (skip position) before it can begin returning results (this doesn't matter much when the collection is small).
As the offset increases, cursor.skip() will become slower and more CPU intensive. With larger collections, cursor.skip() may become IO bound.
GridFS, however, instead of storing a file in a single document, divides the file into parts, or chunks, and stores each chunk as a separate document.
This allows the user to access arbitrary sections of a file, i.e. to "skip" to the middle of a file (using the id or filename), without that CPU cost.
Official documentation: Skip and GridFS.
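To make the "skip to the middle" point concrete, here is a rough shell sketch that pulls a single chunk out of the middle of a stored file by its chunk index, using the default fs.files / fs.chunks collections (the file name and offset are just placeholders; real drivers wrap this up for you):

// Look up the file's metadata to get its _id and chunk size.
var f = db.fs.files.findOne({ filename: "big-video.mp4" })

// Work out which chunk holds a given byte offset, e.g. 500 MB into the file.
var offset = 500 * 1024 * 1024
var chunkIndex = Math.floor(offset / f.chunkSize)

// Fetch just that one chunk document; the { files_id: 1, n: 1 } index on
// fs.chunks makes this a point lookup rather than a walk from the start of the file.
var chunk = db.fs.chunks.findOne({ files_id: f._id, n: chunkIndex })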
Update:
About what Peter Brittain is suggesting:
There are many things to consider (infrastructure, expected usage patterns, file sizes, etc.) when choosing between the filesystem and GridFS.
For example: if you have millions of files, GridFS tends to handle that better; you also need to consider filesystem limitations such as the maximum number of files per directory.
You might want to consider going through this article:
Why use GridFS over ordinary Filesystem Storage?

MongoDB collection size before/after dump

I have a question regarding MongoDB's collection size.
I did a small stress test in which my MongoDB server was constantly inserting, deleting and updating data for about 48 hours. The documents were only of small size, simply a numerical value and a timestamp as well as an ID.
Now, after those 48 hours, the collection used for inserting, deleting and updating data was 98,000 bytes and the preallocated storage size was 696,320 bytes. It became that much larger than the actual collection size because of one input spike during an insertion phase. The deletions that followed shrank the actual collection size again, but the preallocated storage size didn't shrink (AFAIK a common database management problem, since it's the same with e.g. MySQL).
After the stress test was completed I created a dump of my MongoDB database and dropped the database completely, so I could import the dump afterwards and see how the stats would look then. And as I suspected, the collection size was still the same (98,000 bytes) but the preallocated storage size went down to 40,960 bytes (from 696,320 bytes before).
Since we want to try out MongoDB for an application that produces hundreds of MB of data and therefore I/O traffic every day, we need to keep the database and its occupied space to a minimum. And preferably without having to create a dump, drop the whole database and import the dump again every now and then.
Now my question is: is there a way to trigger MongoDB's "garbage collection" programmatically from code? The software behind it is written in Java, and my idea was to trigger that cleanup after a certain amount of time/operations or once the preallocated storage size reaches a certain threshold.
Or maybe there's an even better (more elegant) way to minimize the occupied space?
Any help would be appreciated and I'll try to provide any further information if needed. Thanks in advance.

MMAP storage engine in MongoDB

A storage engine is the interface between the MongoDB server and the physical disk; it decides how storage and memory are used and also supports collection-level locking. My question is: what happened before version 3.0? Who managed memory before storage engines were a separate concept? And how did the locking mechanism work before MMAP?
We call it MMAPv1 - the original storage engine of MongoDB - because internally it uses the mmap system call under the covers to implement storage management. Let's look at what the mmap system call does. On Linux, the man page talks about mapping files or devices into memory: it causes the pages starting at some address, and continuing for at most length bytes, to be mapped from the object described by a file descriptor, starting at a given offset. So, what does that practically mean?
Well, MongoDB needs a place to put documents, and it puts the documents inside files. To do that it initially allocates a large file; let's say it allocates a 100GB file on disk. The file may or may not be physically contiguous on the actual disk, because there are algorithms beneath that layer that control the actual allocation of space on the disk, but from our point of view it's a 100GB contiguous file.
If MongoDB calls the mmap system call, it can map this 100GB file into 100GB of virtual memory. To get that much virtual memory we need to be on a 64-bit machine. The mapping is all page-sized: pages on an OS are either 4KB or 16KB, so there are a lot of them inside that 100GB of virtual memory.
The operating system decides what can fit in physical memory. If the actual physical memory of the box is, say, 32GB, then when we go to access one of the pages in this memory space it may not be resident at that moment. The operating system decides which of these pages are going to be in memory (in the presentation the resident pages are shown in green). So, when we go to read a document, if it hits a page that's in memory, we get it; if it hits a page that's not in memory (a white one), the OS has to bring it in from disk.
source
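You can see this mapping reflected in the server's own statistics: with MMAPv1, the mem section of serverStatus reports a mapped size that tracks the total size of the data files rather than physical RAM. A quick shell sketch (exact fields vary a bit by version):

// Under MMAPv1, "mapped" is roughly the size of all data files mapped into
// virtual memory, while "resident" is what the OS currently keeps in RAM.
var mem = db.serverStatus().mem
printjson({ mapped: mem.mapped, virtual: mem.virtual, resident: mem.resident })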
MMAPv1 storage engine provides
Collection-level concurrency (locking). Each database's data is kept in its own set of files on disk (visible in the data directory, e.g. /data/db). If multiple writes are fired at the same collection, one has to wait for the other to finish: it is a multiple-reader, single-writer lock, so only one write can happen at a time to a particular collection.
Allows in-place updates. So, if a document is sitting in one of the resident (green) pages and we update it, we'll try to update it right in place. If we can't update it in place, we mark its old location as a hole, move the document somewhere else where there is enough space, and update it there. To make it more likely that we can update a document in place without having to move it, we use
power-of-2 sizes when we allocate the initial storage for a document. So, if we try to create a 3-byte document, we'll get 4 bytes; 8 bytes if we create a 7-byte one; 32 bytes when creating a 19-byte document. This way the document can grow a little bit, and the space that opens up when a document does move can be re-used more easily.
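As a toy illustration of that rounding (not MongoDB's actual internal code), the allocated size is simply the next power of two at or above the document size:

// Illustrative only: round a document size (in bytes) up to the next power of 2,
// mirroring the examples above (3 -> 4, 7 -> 8, 19 -> 32).
function powerOf2Allocation(docSize) {
    var alloc = 1
    while (alloc < docSize) {
        alloc *= 2
    }
    return alloc
}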
Also, notice that since the OS decides what is in memory and what is on disk, we cannot do much about it; the OS is smart enough at memory management.
There was only one storage engine before 3.0 - MMAP, which has been the storage engine for MongoDB since the beginning (it is now usually referred to as MMAPv1, though that versioning is not as formal as the versioning of the database itself).
You couldn't plug in new engines prior to 3.0, nor were there any built-in alternatives, so you didn't see a lot of discussion about storage engines as a result. Any presentation (here's a good one if you are interested) from before 3.0 that discusses storage is implicitly talking about the MMAP storage engine; it just didn't have that name yet.
MMAP was improved to include collection-level locking in 3.0. Before that release (from 2.2 up to and including 2.6) the locking granularity was database level, and before that (prior to 2.2) it was a global lock.

GridFS disk management

In my environments I can have DB of 5-10 GB or DB of 10 TB (video recordings).
Focusing on the 5-10 GB case: if I keep the default settings for prealloc and small files, I can actually lose 20-40% of the disk space because of allocations.
In my production environments the disk size can be 512 GB, but the user can limit the DB allocation to only 10 GB.
To implement this, I have a scheduled task that deletes the old documents from the DB when the DB dataSize reaches a certain threshold.
I can't use capped collections (GridFS and sharding limitations, can't delete arbitrary documents...), and I can't use the --noprealloc/--smallfiles flags, because I need file inserts to be efficient.
So what happens is this: if dataSize gets to 10 GB, the fileSize will be at least 12 GB, so I need to take that into consideration and lower the threshold by 2 GB (and lose a lot of disk space).
What I do want is to tell Mongo to pre-allocate all 10 GB the user requested, and disable further pre-allocation.
For example, running mongod with --noprealloc and --smallfiles, but pre-allocating all 10 GB in advance.
Another protection I gain here is protecting the user against sudden disk-full errors: if they regularly download Game of Thrones episodes to the same drive, they can't eat into the DB's 10 GB, since it's already pre-allocated.
(using C# driver)
I think I found a solution: you might want to look at the --quota and --quotaFiles command line options. In your case, you also might want to add the --smallfiles option. So
mongod --smallfiles --quota --quotaFiles 11
should give you a size of exactly 10224 MB for your data, which, adding the default namespace file size of 16 MB, equals your target size of 10 GB, excluding indices.
The following applies to regular collections as per documentation. But since metadata can be attached to files, it might very well apply to GridFS as well.
MongoDB uses what is called a record to store data. A record consists of two parts: the actual data and something called "padding". The padding is basically unused space reserved in case the document grows in size. The reason for it is that a document (or a file chunk, in GridFS) is never fragmented, in order to keep query performance up; without padding, a document or file chunk that grows would have to be moved to a different location in the data file(s) every time it is modified, which can be a very costly operation in terms of IO and time. With the default settings, if the document or file chunk grows in size, the padding is used up instead of moving the record, thus reducing the need to move data around in the data file and thereby improving performance. Only if the growth of the data exceeds the preallocated padding is the document or file chunk moved within the data file(s).
The default strategy for preallocating padding space is "usePowerOf2Sizes", which determines the allocation by taking the document size and using the next power of two as the space preallocated for the document. Say we have a 47-byte document: the usePowerOf2Sizes strategy would preallocate 64 bytes for it, resulting in 17 bytes of padding.
There is another preallocation strategy, however: "exactFit". It determines the padding space by multiplying the document size by a dynamically computed "paddingFactor". As far as I understand, the padding factor is determined by the average document growth in the respective collection. Since we are talking about static files in your case, the padding factor should stay at 1 (meaning no extra padding), and because of this there should not be any "lost" space any more.
So I think a possible solution would be to change the allocation strategy for both the files and the chunks collection to exactFit. Could you try that and share your findings with us?
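If I understand the allocation options correctly, switching away from the power-of-2 default is done through collMod by turning usePowerOf2Sizes off, which should put a collection back on the exact-fit (paddingFactor-based) allocation; a hedged shell sketch for the two GridFS collections:

// Assumption: disabling usePowerOf2Sizes switches these collections to the
// exact-fit allocation strategy on 2.6-era MMAP. Verify on your own version first.
db.runCommand({ collMod: "fs.files", usePowerOf2Sizes: false })
db.runCommand({ collMod: "fs.chunks", usePowerOf2Sizes: false })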

Why does MongoDB take up so much space?

I am trying to store records with a set of doubles and ints (around 15-20) in mongoDB. The records mostly (99.99%) have the same structure.
When I store the data in ROOT, which is a very structured data storage format, the file is around 2.5 GB for 22.5 million records. With Mongo, however, the database size (from the command show dbs) is around 21 GB, whereas the data size (from db.collection.stats()) is around 13 GB.
This is a huge overhead (to clarify: 13 GB vs 2.5 GB; I'm not even talking about the 21 GB), and I guess it is because Mongo stores both keys and values. So the question is, why and how doesn't Mongo do a better job of making it smaller?
But the main question is: what is the performance impact of this? I have 4 indexes and they come out to 3 GB, so running the server on a single 8 GB machine can become a problem if I double the amount of data and try to keep a large working set in memory.
Any guesses as to whether I should be using SQL or some other DB? Or maybe just keep working with ROOT files, if anyone has tried them?
Basically, this is Mongo preparing for the insertion of data. Mongo preallocates storage for data to prevent (or minimize) fragmentation on the disk. This preallocation can be observed in the data files that the mongod instance creates.
First it creates a 64 MB file, then 128 MB, then 256 MB, and so on, doubling each time until it reaches 2 GB files (the maximum size of preallocated data files).
There are some more things that Mongo does that tend to use extra disk space, such as journaling...
For much, much more info on how MongoDB uses storage space, you can take a look at this page, and in particular the section titled Why are the files in my data directory larger than the data in my database?
There are some things that you can do to minimize the space that is used, but these techniques (such as using the --smallfiles option) are usually only recommended for development and testing use - never for production.
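To see where the overhead sits in your own deployment, comparing the different size counters is a reasonable first step; a quick shell sketch ("mycollection" is a placeholder name):

// dataSize    = BSON size of the documents (field names and padding included)
// storageSize = space allocated to the collections' extents
// fileSize    = total size of the preallocated data files on disk
var s = db.stats()
printjson({ dataSize: s.dataSize, storageSize: s.storageSize, fileSize: s.fileSize })

// Per-collection breakdown, including index sizes.
printjson(db.mycollection.stats())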
Question: Should you use SQL or MongoDB?
Answer: It depends.
Better way to ask the question: Should you use a relational database or a document database?
Answer:
If your data is highly structured (every row has the same fields), or you rely heavily on foreign keys and you need strong transactional integrity on operations that use those related records... use a relational database.
If your records are heterogeneous (different fields per document) or have variable length fields (arrays) or have embedded documents (hierarchical)... use a document database.
My current software project uses both. Use the right tool for the job!