MongoDB WiredTiger Storage Engine cacheSizeGB configuration option

What is the minimum value I can set for cacheSizeGB, the configuration option introduced in MongoDB 3.0.0 for the WiredTiger storage engine?
Must the cacheSizeGB value be an integer, or can I configure it with a floating-point number like 15.5?
I cannot find these details in the official MongoDB documentation.

I know this is quite old, but as I came across this question while looking for an answer, here it is:
Changed in version 3.4: Values can range from 256MB to 10TB and can be a float. In addition, the default value has also changed.
default: the larger of
50% of (RAM - 1 GB), or
256 MB.
Avoid increasing the WiredTiger internal cache size above its default value.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container.
source: https://docs.mongodb.com/manual/reference/configuration-options/#storage.wiredTiger.engineConfig.cacheSizeGB
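For example, a fractional value can go straight into mongod.conf. A minimal sketch (15.5 is just the value from the question; pick a size that fits your hardware):

    # mongod.conf: WiredTiger cache capped at 15.5 GB (floats allowed since 3.4)
    storage:
      wiredTiger:
        engineConfig:
          cacheSizeGB: 15.5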

Related

How do I calculate the WiredTiger cache size in a Docker container?

We run MongoDB mongod processes inside Docker containers in Kubernetes with clear memory limits.
I am trying to configure the mongod processes correctly for the imposed memory limits.
This is the information I could collect from the docs:
The memory usage of MongoDB is correlated with the WiredTiger cache size, which is calculated as 50% of (RAM - 1 GB), with a minimum of 256 MB. https://docs.mongodb.com/manual/core/wiredtiger/#memory-use
RAM here is the total amount of RAM available on the system. In the case of containerized nodes, it is the memory available to the container (since MongoDB 4.0.9). https://docs.mongodb.com/manual/faq/diagnostics/#must-my-working-set-size-fit-ram
“If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container.” https://docs.mongodb.com/manual/faq/diagnostics/#must-my-working-set-size-fit-ram
The docs state that increasing the WiredTiger cache size above the default should be avoided. https://docs.mongodb.com/manual/faq/diagnostics/#must-my-working-set-size-fit-ram
This information is a little unclear.
Do I leave the WiredTiger cache size at its default, or do I set it to "a value less than the amount of RAM available in the container"? How much lower should that value be? (A value higher than the default would also contradict the advice not to increase it above the default.)
The default is to allow the WiredTiger cache to use slightly less than half of the total RAM on the system.
The process normally determines the total RAM automatically by querying the underlying operating system.
In the case of a Docker container that has been allocated 16 GB of RAM but is running on a host machine with 128 GB of RAM, the system call will report 128 GB. The default in this case would be 63.5 GB (50% of (128 GB - 1 GB)), which obviously would cause a problem.
In general:
Use the default in situations where the system call reports the true memory available in the environment. This includes bare metal, most VMs, cloud providers, etc.
In containers where the amount of memory reported by the system call does not reflect the total amount available to the container, manually calculate what the default would have been if the reported value were correct, and use that value instead.
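For example, for a container limited to 16 GB of RAM, the default formula applied by hand gives 50% of (16 GB - 1 GB) = 7.5 GB. A sketch of the resulting configuration (leave extra headroom if other processes share the container):

    # mongod.conf: manually computed default for a 16 GB container
    # 50% of (16 GB - 1 GB) = 7.5 GB
    storage:
      wiredTiger:
        engineConfig:
          cacheSizeGB: 7.5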

MongoDB set maximum size limit on data files

I downloaded MongoDB on my local desktop. Since I have other important stuff on this machine, I need to make sure the MongoDB data files do not occupy more than 10 GB of disk storage at any time. If they exceed 10 GB, I expect an error message while inserting new documents.
Is there a way to set this max size on disk via the config file?
As of MongoDB 4.0, there isn't a general configuration option to limit the maximum size on disk for a deployment.
However, there are several ways you could limit storage usage on your desktop:
Use a separate partition or storage volume for your MongoDB dbPath
Run MongoDB inside a virtual machine or container with a maximum storage allocation
Connect to a hosted MongoDB deployment
In general it is a bad idea to let your database server run out of space, as this may result in unexpected errors or a shutdown, depending on which operations are trying to complete at the point when the server runs out of space. Typically you would want a process monitoring storage so you can proactively free some space before the issue becomes critical.
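One way to keep an eye on this from the mongo shell is to poll database statistics. A minimal sketch (passing a scale so sizes come back in GB; the field names are from the dbStats output):

    // mongosh: report data, index, and on-disk storage sizes in GB
    const stats = db.stats(1024 * 1024 * 1024);
    print(`data: ${stats.dataSize} GB, indexes: ${stats.indexSize} GB, ` +
          `storage: ${stats.storageSize} GB`);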
A few space saving tips that might help:
Rotate your MongoDB log files regularly to limit storage usage. If you are using a Unix/Linux system you can configure the logrotate utility to rotate and compress logs when they reach a target filesize and subsequently remove archived logs when they reach a certain age.
Consider using TTL indexes to automatically remove old data from collections (see the sketch after this list). This can be useful if you have collections with ephemeral data, like user sessions, that will become stale after an expiry date.
Drop unused indexes & collections. The WiredTiger storage engine (default in MongoDB 3.2+) allocates a file per collection and index, so dropping either of those should immediately free up storage space.
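For example, a TTL index that removes session documents an hour after their createdAt timestamp could look like the following (the collection and field names are illustrative):

    // mongosh: documents expire 3600 seconds after their createdAt value
    db.sessions.createIndex(
      { createdAt: 1 },
      { expireAfterSeconds: 3600 }
    );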

High level of fragmentation with MongoDB 2.2.1

On a legacy system that is running MongoDB 2.2.1 we are running out of disk space due to excessively large database files. Our actual data size is just under 3 GB, with about 1.7 GB index size, but the storage size is over 70 GB. So, the storage to data+index ratio is close to a factor of 15. There are about 40 data files, most of which are at the 2 GB maximum file size.
We are contemplating running a compact or a repair to regain some of the unused space, but we are worried about the problem recurring soon after. It seems that the current configuration (pretty close to the default) is not suitable for the database usage pattern of our application.
What other tools, diagnostics, remedies or configuration changes are available that could help MongoDB make better use of the disk space?
WiredTiger, used in MongoDB 3.0 and later, is much more efficient in terms of disk usage.
However, migrating from MongoDB 2.2 to 3.0 is going to be a huge leap.
Another option, assuming this is configured as a replica set, is to re-sync the secondary nodes individually and then perform a failover. This will have the same effect as performing a repair, without the downtime that would occur as a result of using the repairDatabase command.
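A rough outline of that procedure from the shell, assuming a healthy replica set (a sketch only; verify each member is fully back in sync before moving on to the next):

    // For each secondary in turn: stop mongod, clear its dbPath, and
    // restart it so it performs a fresh initial sync (compact files).
    // Once all secondaries are resynced, demote the primary:
    rs.stepDown(60);  // primary steps down for 60s; a resynced secondary takes over
    // Finally, resync the old primary the same way.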

MongoDB terminates when it runs out of memory

I have the following configuration:
a host machine that runs three docker containers:
MongoDB
Redis
A program using the previous two containers to store data
Both Redis and MongoDB are used to store huge amounts of data. I know Redis needs to keep all its data in RAM and I am fine with this. Unfortunately, what happens is that Mongo starts taking up a lot of RAM, and as soon as the host RAM is full (we're talking about 32 GB here), either Mongo or Redis crashes.
I have read the following previous questions about this:
Limit MongoDB RAM Usage: apparently most RAM is used up by the WiredTiger cache
MongoDB limit memory: here apparently the problem was log data
Limit the RAM memory usage in MongoDB: here they suggest limiting Mongo's memory so that it uses a smaller amount for its cache/logs/data
MongoDB using too much memory: here they say it's the WiredTiger caching system, which tends to use as much RAM as possible to provide faster access. They also state it's completely okay to limit the WiredTiger cache size, since it handles I/O operations pretty efficiently
Is there any option to limit mongodb memory usage?: caching again; they also add that MongoDB uses the LRU (Least Recently Used) cache algorithm to determine which "pages" to release, and that you will find some more information in these two questions
MongoDB index/RAM relationship: quote: MongoDB keeps what it can of the indexes in RAM. They'll be swapped out on an LRU basis. You'll often see documentation that suggests you should keep your "working set" in memory: if the portions of index you're actually accessing fit in memory, you'll be fine.
how to release the caching which is used by Mongodb?: same answer as in 5.
Now what I appear to understand from all these answers is that:
For faster access it would be better for Mongo to fit all indices in RAM. However, in my case, I am fine with indices partially residing on disk, as I have quite a fast SSD.
RAM is mostly used for caching by Mongo.
Considering this, I was expecting Mongo to try to use as much RAM as possible, while still being able to function with less and fetching most things from disk. However, when I limited the Mongo Docker container's memory (to 8 GB, for instance) using --memory and --memory-swap, instead of fetching things from disk, Mongo just crashed as soon as it ran out of memory.
How can I force Mongo to use only the available memory and to fetch from disk everything that does not fit into memory?
Thanks to @AlexBlex's comment I solved my issue. Apparently the problem was that Docker limited the container's RAM to 8 GB, but the WiredTiger storage engine was still trying to use 50% of (total system RAM - 1 GB) for its cache (which in my case would have been 15.5 GB).
Capping WiredTiger's cache size with the cacheSizeGB configuration option to a value less than what Docker was allocating solved the problem.
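As a sketch, for a container started with a docker --memory=8g limit, applying the default formula to the container's actual allocation gives 50% of (8 GB - 1 GB) = 3.5 GB, which can be set explicitly (leave headroom for connections, index builds, and the rest of the process):

    # mongod.conf: cap the cache well below the 8 GB Docker limit
    # 50% of (8 GB - 1 GB) = 3.5 GB
    storage:
      wiredTiger:
        engineConfig:
          cacheSizeGB: 3.5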

MongoDB scaling

How much can MongoDB scale? I heard a talk about 32-bit systems having 2-4 GB of space available, or something like that. Can it store 32 GB of data in a single Mongo database on one computer and support querying that 32 GB of data from that database using a regular query?
How powerful is MongoDB in terms of size, and when/if does sharding come into play? I'm looking for a gigantic database, as large as the disk permits, using MongoDB. It would be funny if MongoDB supported only 4 GB per database. I'm looking at 200 GB of storage in 5 collections in 1 Mongo database on 1 computer running Mongo.
It's true that a single instance of MongoDB on a 32-bit system supports up to 2 GB of data. This is due to the storage engine being built directly on top of memory-mapped files, which have a maximum addressable space of 2 GB.
That said, I'd say very few, if any, companies will actually run a production database on 32-bit hardware, so it's hardly ever an issue. On 64-bit builds the theoretical maximum storage is 2^63, but that's obviously well beyond the size of any real-world dataset.
So, on a single 64-bit system you can very easily run 200 GB of data. Whether or not you want to in a production environment is another question. If you only run a single instance there's no real fail-over available. With journaling enabled and safe writes (w >= 1) you should be relatively fine, though.
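In current shell syntax, a safe write of that sort (acknowledged with w: 1 and journaled) looks roughly like this (the collection name is illustrative; 2.x-era shells expressed the same thing through getLastError):

    // mongosh: acknowledge the write only once it has been journaled
    db.events.insertOne(
      { type: "signup", at: new Date() },
      { writeConcern: { w: 1, j: true } }
    );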
You can have a look at this document about sharding and scaling limits:
http://www.mongodb.org/display/DOCS/Sharding+Limits
Scale Limits
Goal is support of systems of up to 1,000 shards. Testing so far has been limited to clusters with a modest number of shards (e.g., 100).