mongodb memory usage is going high even if only insertions are made - mongodb

I am using mongodb for only inserting documents. There are no indexes created for the collection I use. But I see that memory used by Mongodb is going high. Machine is having 20GB of RAM which is completely used. I would like to know the reason for this and is this normal?

There is an excellent discussion of how MongoDB uses storage and memory here:
http://docs.mongodb.org/manual/faq/storage/
Specifically to your case, Mongo memory maps the files that data and indexes live in (the ones inside of /data/db directory by default) which means that the files are mapped to OS's virtual address space and then whenever MongoDB accesses a page (any part of the file) that page will get pulled into RAM and it will stay there until all of RAM that's made available to mongod process by the OS is used (at that point least-recently-used pages will be swapped out of RAM).
You are inserting data and therefore you are writing to data files - those pages you are writing to need to be in RAM (mongo writes to files but since they are memory mapped it gets to interact with memory as if it's disk storage). If mongod is using 20GB+ of RAM that means your data plus your indexes (plus some overhead for other things) are 20GB or more.

Related

MMAP storage Engine in Mongodb

A storage engine acts as a interface which acts between the mongo db server and physical Disc which decides how much memory is required also supports Collection level locking. My question is what happened before version 3.0 ? Who allocated memory before the storage engine ? And how did the locking mechanism work before M MAP
We call it MMAPv1 - the original storage engine of MongoDB because it internally uses the mmap call under the covers in order to implement storage management. Let's look at what the MMAP system call looks like. On Linux, it talks about memory allocation, or mapping files or devices into memory. Causes the pages starting address and continuing for at most length bytes to be mapped from the object described by the file descriptor at an offset. So, what does that really practically mean?
Well, MongoDB practically needs a place to put documents. And it puts the documents inside files. And to do that it initially allocates, let's say a large file. Let's say it allocates a 100GB file on disk. So, we wind up with 100GB file on disk. The disk may or may not be physically contiguous on the actual disk, because there are some algorithms that occur beneath that layer that control the actual allocation of space on a disk. But from our point, it's a 100GB contiguous file. If MongoDB calls mmap system call, it can map this 100GB file into 100GB of virtual memory. To get this big virtual memory, we need to be on a x64 machine. And these are all page-sized. So pages on an OS or either 4k or 16k large. So, there is lot of them inside the 100GB virtual memory. And the operating system is going to decide what can fit in the memory. So, the actual physical memory of the box is let's say 32GB, then if we go to access one of the pages in this memory space, it may not be in memory at any given time. The operating system decides which of these pages are going to be in memory. We're showing the ones available in memory as green ones. So, when we go to read the document, if it hits a page that's in memory, then we get it. If it hits a page that's not in memory (the white ones), the OS has to bring it from the disk.
source
MMAPv1 storage engine provides
Collection level concurrency (locking). Each collection inside MongoDB is it's own file (can be seen in ~\data\db). If multiple writes are fired on the same collection, one has to wait for another to finish. It's a multiple reader. Only one write can happen at a time to a particular collection.
Allows in place updates. So, if a document is sitting here in one of the available (green) page and we do an update to it, then we'll try to update it right in place. And if we can't update it, then what we'll do is we'll mark it as a whole, and then we'll move it somewhere else where there is some space. And finally we'll update it there. In order to make it possible that we update the document in place without having to move it, we uses
Power of 2 sizes when we allocate the initial storage for a document. So, if we try to create a 3bytes document, we'll get 4bytes. 8bytes, if we create 7bytes. 32bytes when creating 19bytes document. In this way, it's possible to grow the document a little bit. And that space that opens up, that we can re-use it more easily.
Also, notice that since, OS decides what is in memory and what is on disk - we cannot do much about it. The OS is smart enough for memory management.
There was only one storage engine before 3.0 - MMAP, which has been the storage engine for MongoDB since the beginning (now usually referred to as MMAPv0, with the version in 3.0 being MMAPv1, though the versioning is not really official like the DB itself).
You couldn't plug in new engines prior to 3.0 nor were there any alternatives built-in so you didn't see a lot of discussion about storage engines as a result. Any presentations (here's a good one if you are interested) prior to 3.0 that discuss storage are implicitly talking about the MMAP storage engine, it just didn't have that name yet.
MMAP has been improved to include collection level locking in 3.0, before that release (in 2.6) the locking granularity was database level and before that (prior to 2.2) it was a global lock.

What is memory map in mongodb?

I read about this topic at
http://docs.mongodb.org/manual/faq/storage/#faq-storage-memory-mapped-files
But didn't understand point .Does it is used to keep query data in physical memory ? How it is related with virtual memory ? Why it is important and how it effect at performance ?
I'll try to explain in a simple way.
MongoDB (and other storage systems) stores data in files. Each database has its own files, created as they are needed. The first file weights 64 MB, the next 128 and so up to 2 GB. Then, new files created weigh 2 GB. Each of these files are logically divided into different blocks, that correspond with one virtual memory block.
When MongoDB needs to access a file or a part of it, loads all virtual blocks corresponding to that file or parts of the files into memory using mmap.On the other hand, mmap is a way for applications to leverage the system cache (linux).
So what really happens when you are doing a query is that MongoDB "tells" the OS to load the part it needs with the data requested, so the next time is requested will be faster. As you can imagine this is a very important feature to boost performance in databases like MongoDB, because accessing RAM is way faster than hard drive.
Another benefit of using mmap is that MongoDB memory will grow as it needs and the system memory is free.

How to force the MongoDB to save the text index in SSD, Not in RAM

My server just has 2G RAM, and I have 100G text and many other key fields which must be indexed to RAM, full text search is not my key function, I hope the mongoDB don't load the text index to RAM. How to config the server.
Thanks.
MongoDB implements caching by storing everything in memory-mapped files. This is a feature which is managed by your operating system, not by MongoDB.
Most operating systems are quite smart at detecting which memory-mapped files are accessed how often. When your text index is used much less frequently than other data, it will usually be the first to be suspended to hard drive.

mongod clean memory used in ram

I have a huge amount of data in my mongodb. It's filled with tweets (50 GB) and my Ram is 8 GB. When querying it retrieves all tweets and mongodb starts filling the ram, when it reaches 8 GB it starts moving files to disk. This is the part where it gets really slowwwww. So i changed the query from skipping and starting using indexes. Now i have indexes and i query only 8GB to my program, save the id of the last tweet used in a file and the program stops. Then restart the program and it goes get the id of the tweet from the file. But mogod server still is ocupping the ram with the first 8GB, that no longer will be used, because i have a index to the last. How can i clean the memory of the mongo db server without restarting it?
(running in a win)
I am a bit confused by your logic here.
So i changed the query from skipping and starting using indexes. Now i have indexes and i query only 8GB to my program, save the id of the last tweet used in a file and the program stops.
Using ranged queries will not help the amount of data you have to page in (in fact it might worsen it because of the index), it merely makes the query faster server side by using an index for huge skips (like 42K+ row skip). If you are dong the same as that skip() but in index then (without a covered index) then you are still paging in exactly the same.
It is slow due to memory mapping and your working set. You have more data than RAM and not only that but you are using more of that data than you have RAM as such you are page faulting probably all the time.
Restarting the program will not solve this, nor will clearing its data OS side (with restart or specific command) because of your queries. You probably need to either:
Think about your queries so that your working set is more in line to your memory
Or shard your data across many servers so that you don't have to build up your primary server
Or get a bigger primary server (moar RAM!!!!!)
Edit
The LRU of your OS should be swapping out old data already since MongoDB is using its fully allocated lot, which means that if that 8GB isn't swapped it is because your working set is taking that full 8GB (most likely with some swap on the end).

MongoDB consumes a lot of memory

For more than a month is my war with mongoDB. Until I lose =] ...
Battle 1. Battle 2.
And now a new problem. Again, not enough memory.
Initially, this was solved by simply increasing the memory at a rate of VPS. Then journal = false. But now I got to the top of your plan and continue to increase the memory is not possible.
For my base are lacking 4 GB of memory.
How should I choose a database for the project, was nowhere written that there are so many mongoDB memory. With about 10 million records in the mongoDB missing 4 GB of memory, when my MySQL database with 10 million easily copes with 1.4 GB of memory.
The problem as I understand it, a large number of index fields. But since I can not log into the database, respectively, can not remove them. They needed me in the early stages of development, now they are not important to me.
Tell me please, can I remove them somehow?
There is a dump of the database is completely whole folder database / data / db
On my PC with 4 GB of memory database does not start on a VPS with 4GB same.
As an alternative, I think to take a test period at some VPS / VDS to run mongo and delete keys.
Do you know a web hosting with a test period and 6 GB of memory?
Or if there is an alternative, could you say what?
The issues has very little to do with the size of your data set. MongoDB uses memory mapped files for its storage engine. As such it'll start swapping in pages of hot data into memory when it can and it does so fairly aggressively (or more accurately, the OS memory management does).
Basically it uses as much memory as is available to it and there's very little you can do to avoid it. All data pages (be it actual data or indexes) that are accessed during operation will be swapped into memory if there is space available.
There are plenty of references to this on the internet and on mongodb.org by the way. Saying it isn't mentioned anywhere isn't really true.