I use MongoDB 3.2, WiredTiger engine. I am trying to use batch inserts on 10K records, the size of one record about kb. All great, but 60-70 million such records memory ends. Mongo is limited cache in 3GB, but the memory is consumed by memory map files collections and indexes. After some time Mongo CPU load at 100% and stops receiving data. OS: Windows 7. What am I doing wrong? :)
Related
I have the following configuration:
a host machine that runs three docker containers:
Mongodb
Redis
A program using the previous two containers to store data
Both Redis and Mongodb are used to store huge amounts of data. I know Redis needs to keep all its data in RAM and I am fine with this. Unfortunately, what happens is that Mongo starts taking up a lot of RAM and as soon as the host RAM is full (we're talking about 32GB here), either Mongo or Redis crashes.
I have read the following previous questions about this:
Limit MongoDB RAM Usage: apparently most RAM is used up by the WiredTiger cache
MongoDB limit memory: here apparently the problem was log data
Limit the RAM memory usage in MongoDB: here they suggest to limit mongo's memory so that it uses a smaller amount of memory for its cache/logs/data
MongoDB using too much memory: here they say it's WiredTiger caching system which tends to use as much RAM as possible to provide faster access. They also state it's completely okay to limit the WiredTiger cache size, since it handles I/O operations pretty efficiently
Is there any option to limit mongodb memory usage?: caching again, they also add MongoDB uses the LRU (Least Recently Used) cache algorithm to determine which "pages" to release, you will find some more information in these two questions
MongoDB index/RAM relationship: quote: MongoDB keeps what it can of the indexes in RAM. They'll be swaped out on an LRU basis. You'll often see documentation that suggests you should keep your "working set" in memory: if the portions of index you're actually accessing fit in memory, you'll be fine.
how to release the caching which is used by Mongodb?: same answer as in 5.
Now what I appear to understand from all these answers is that:
For faster access it would be better for Mongo to fit all indices in RAM. However, in my case, I am fine with indices partially residing on disk as I have a quite fast SSD.
RAM is mostly used for caching by Mongo.
Considering this, I was expecting Mongo to try and use as much RAM space as possible but being able to function also with few RAM space and fetching most things from disk. However, I limited Mongo Docker container's memory (to 8GB for instance), by using --memory and --memory-swap, but instead of fetching stuff from disk, Mongo just crashed as soon as it ran out of memory.
How can I force Mongo to use only the available memory and to fetch from disk everything that does not fit into memory?
Thanks to #AlexBlex's comment I solved my issue. Apparently the problem was that Docker limited the container's RAM to 8GB but the wiredTiger storage engine was still trying to use up 50% - 1GB of the total system RAM for it's cache (which in my case would have been 15 GB).
Capping wiredTiger's cache size by using this configuration option to a value less than what Docker was allocating solved the problem.
I'm using pymongo to insert a big amount of jsons to MongoDB gridFS + some data to collection.
What I noticed some time ago is that MongoDB consumes just crazy amount of RAM within using single connection. As soon as I close this connection it releases it.
RAM consumption is like 10-12GB in total within connection and 200MB without. The actual size of collection is actually ~300MB with 10-18GB gridFS storage.
Why does it happen? How can opening new connection for any bulky operation can be lot less resource-dependent than using one single connection for everything?
Is it somehow related to Journaling?
I will have to break down this problem into multiple smaller problems for ease of understanding:
It is well known that MongoDB is RAM hungry, it will try to use as much RAM as possible.
GridFS tends to store files in collection fs.chunks and corresponding meta-data in fs.files. The files stored in GridFS are split into chunks of 256KB each.
When you read GridFS data by opening a connection, the chunks belonging to file(s) have to be loaded into the RAM from the disk(if it is not already present in RAM). So , RAM usage is directly proportional to the amount of data stored and importantly frequency of GridFS data access. Just to re-iterate GridFS data gets pulled into RAM if the query references it.
If you have a active connection for large amounts of GridFS data then you should expect heavy RAM usage. But if your query frequency is low(just write, but read rarely) then RAM usage will be relatively lower.If you are mostly writing data, then ensure the connection is closed after the operation in done.
The more the number of open connections, your RAM usage will increase.
This is no-way related to journaling.
Note: GridFS also supports sharding which will tend to solve your problem of excessive RAM usage.
Hope this clarifies.
Since MongoDB 2.0. each connection consumes about 1MB of RAM.
You can read more here.
While doing indexing on MongoDB. Now we have nearly 350 GBs of data in the database and its deployed as a windows service in AWS EC2.
And we are doing indexing for some experimentation. But every time I run the indexing command the memory usage goes to 99% and even after the indexing is done the memory usage keeps like that until I restart the service.
The instance has 30 GB of RAM and SSD drive. And right now we have the DB setup as stand alone (not sharded till now). And we are using the latest version of MongoDB.
Any feedback related to this will be helpful.
Thanks,
Arpan
That's normal behavior for MongoDB.
MongoDB grabs all the RAM it can get to cache each accessed document as long as possible. When you add an index to a collection, each document needs to be read once to build the index, which causes MongoDB to load each document into RAM. It then keeps them in RAM in case you want to access them later. But MongoDB will not squat the RAM. When another process needs memory, MongoDB will willingly release it.
This is explained in the FAQ:
Does MongoDB require a lot of RAM?
Not necessarily. It’s certainly
possible to run MongoDB on a machine with a small amount of free RAM.
MongoDB automatically uses all free memory on the machine as its
cache. System resource monitors show that MongoDB uses a lot of
memory, but its usage is dynamic. If another process suddenly needs
half the server’s RAM, MongoDB will yield cached memory to the other
process.
Technically, the operating system’s virtual memory subsystem manages
MongoDB’s memory. This means that MongoDB will use as much free memory
as it can, swapping to disk as needed. Deployments with enough memory
to fit the application’s working data set in RAM will achieve the best
performance.
See also: FAQ: MongoDB Diagnostics for answers to additional questions
about MongoDB and Memory use.
I am currently using MongoDB to store a single collection of data. This data is 50 GB in size and has about 95 GB of indexes. I am running a machine with 256 GB RAM. I basically want to have all the index and working set in the RAM since the machine is exclusively allocated to mongo.
Currently I see that, though mongo is running with the collection size of 50 GB + Index size of 95 GB, total RAM being used in the machine is less than 20 GB.
Is there a way to force mongo to leverage the RAM available so that it can store all its indexes and working set in memory ?
When your mongod process starts it has none of your data in resident memory. Data is then paged in as it is accessed. Given your collection fits in memory (which is the case here) you can run the touch command on it. On Linux this will call the readahead system call to pull your data and indexes into the filesystem cache, making them available in memory to mongod. On Windows mongod will read the first byte of each page, pulling it in to memory.
If your collection+indexes do not fit into memory, only the tail end of data accessed during the touch will be available.
I am undergoing testing with 64-bit MongoDB. If I configure large size backups then the MongoDB memory utilization seems to be high.
Is there any possibility to reduce the memory utilization by MongoDB.
Is it ideal that MongoDB is using 150 MB memory?
Update /etc/security/limits.conf
Add a line as below:
#mongodb - rss 33554432
and then save the file.
It will take effects right away.
In this example, we assign 32GB memory usage for mongodb group users
For Windows it seems possible to control the amount of memory MongoDB uses, see this tutorial at Captain Codeman:
Limit MongoDB memory use on Windows without Virtualization
Do you mean memory usage or disk space? If you are really talking about memory, then Mongo stores its data in memory-mapped files, giving the illusion of massive memory usage.
You will find more information here:
http://www.mongodb.org/display/DOCS/Checking+Server+Memory+Usage