How do I calculate the WiredTiger cache size in a Docker container? - mongodb

We run MongoDB mongod processes inside Docker containers in Kubernetes with clear memory limits.
I am trying to configure the mongod processes correctly for the imposed memory limits.
This is the information I could collect from the docs:
The memory usage of MongoDB is correlated to the WiredTiger cache size. It is calculated as 50% of (RAM - 1 GB), with a minimum of 256 MB https://docs.mongodb.com/manual/core/wiredtiger/#memory-use
RAM is the total amount of RAM available on the system. In the case of containerized nodes, it is the memory available to the container (since MongoDB 4.0.9) https://docs.mongodb.com/manual/faq/diagnostics/#must-my-working-set-size-fit-ram
“If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container.” https://docs.mongodb.com/manual/faq/diagnostics/#must-my-working-set-size-fit-ram
The docs state that increasing the WiredTiger cache size above the default should be avoided. https://docs.mongodb.com/manual/faq/diagnostics/#must-my-working-set-size-fit-ram
This information is a little unclear.
Do I leave the WiredTiger cache size at its default, or do I set it to "a value less than the amount of RAM available in the container"? And how much lower should that value be? (A value higher than the default would also contradict the advice not to increase it above the default.)

The default is to allow the WiredTiger cache to use slightly less than half of the total RAM on the system.
The process normally determines the total RAM automatically by querying the underlying operating system.
In the case of a Docker container which has been allocated 16GB of RAM but is running on a host machine that has 128GB RAM, the system call will report 128GB. The default in this case would be 63GB, which obviously would cause a problem.
In general:
Use the default in situations where the system call reports the true memory available in the environment. This includes bare metal, most VMs, cloud providers, etc.
In containers where the amount of memory reported by the system call does not reflect the total amount available to the container, manually make the calculation for what the default would have been if it did, and use that value instead (see the sketch below).
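For illustration, here is a minimal shell sketch (not from the original answer) of that manual calculation. It assumes a cgroup v1 memory limit is visible inside the container; under cgroup v2 the limit lives in /sys/fs/cgroup/memory.max instead.
# Read the memory granted to the container (cgroup v1 path; assumption).
limit_bytes=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)
limit_mb=$((limit_bytes / 1024 / 1024))
# Default formula: 50% of (RAM - 1 GB), with a 256 MB floor.
cache_mb=$(( (limit_mb - 1024) / 2 ))
[ "$cache_mb" -lt 256 ] && cache_mb=256
# cacheSizeGB accepts fractional values, so report the result in GB.
awk -v mb="$cache_mb" 'BEGIN { printf "suggested --wiredTigerCacheSizeGB %.2f\n", mb / 1024 }'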

Related

How to limit memory size in mongo v3.4

In v4.0.12 or later, the wiredTigerMaxCacheOverflowSizeGB option is available to specify the maximum size of the "lookaside table" file.
Is there any parameter which can limit the memory usage in mongo v3.4?
wiredTigerMaxCacheOverflowSizeGB limits disk usage, not memory usage.
For memory usage on 3.4 I use:
--wiredTigerCacheSizeGB 0.25
WiredTiger memory usage: see here
Tell MongoDB how much memory exists in the system: see here
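For completeness, a minimal sketch of how that flag is passed on the command line (the dbpath is a placeholder, not part of the answer above):
# Start mongod 3.4 with a 256 MB WiredTiger cache; /data/db is a placeholder path.
mongod --dbpath /data/db --wiredTigerCacheSizeGB 0.25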
Note that, generally, limiting the memory available to the database (such as through system-level configuration) is not useful, because if such a limit is reached the database process will typically terminate immediately. Instead, one would generally either:
Understand how much memory is required for the workloads being executed, and provision that much memory for the database, or
Limit workloads to use the memory which is available (for example through adding indexes or sharding the data).
Limiting database memory is a database anti-pattern. The primary goal of a database is to keep your working set in memory. If you really want to limit memory consumption, run MongoDB inside a container with a memory cap.
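As a hedged sketch of that last suggestion (the image tag and limit are illustrative only):
# Memory-capped container; on MongoDB 4.0.9+ mongod derives its default
# WiredTiger cache size from the container limit rather than from host RAM.
docker run -d --name mongo --memory=4g --memory-swap=4g mongo:4.4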

Postgresql Aurora DB freeable_memory

I have a question regarding the freeable memory for AWS Aurora Postgres.
We recently wanted to create an index on one of our DBs; the DB died and failed over to the slave, which worked fine. It looks like the freeable memory dropped by the configured 500 MB of maintenance_work_mem, down to around 800 MB of freeable memory - right after that the 32 GB instance died.
1) I am wondering whether freeable memory is the overall system memory and whether low memory here could invoke the system OOM killer on the AWS Aurora instance. Should we plan in more headroom for operational tasks and autovacuum jobs so we don't run into this issue again?
2) The actual work of the index creation should then have used the free local storage as far as I understood, so the size of the index shouldn't have mattered, right?
Thanks in advance,
Chris
Regarding 1)
Freeable memory, quoting https://forums.aws.amazon.com/thread.jspa?threadID=209720:
The freeable memory includes the amount of physical memory left unused by the system plus the total amount of buffer or page cache memory that is free and available.
So it's freeable memory across the entire system. While MySQL is the main consumer of memory on the host, we do have internal processes in addition to the OS that use up a small amount of additional memory.
If you see your freeable memory near 0, or also start seeing swap usage, then you may need to scale up to a larger instance class or adjust MySQL memory settings. For example, decreasing the innodb_buffer_pool_size (by default set to 75% of physical memory) is one example of adjusting MySQL memory settings.
That also means that if the memory gets low, it's likely to impact the process in some form. In this thread (https://forums.aws.amazon.com/thread.jspa?messageID=881320&#881320), for example, it was mentioned that it caused the MySQL server to restart.
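If you want to keep an eye on this, one option (a sketch only; the instance identifier is a placeholder and the date invocation assumes GNU date) is to pull the FreeableMemory metric from CloudWatch:
# Average FreeableMemory (in bytes) over the last hour for one instance.
aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS \
  --metric-name FreeableMemory \
  --dimensions Name=DBInstanceIdentifier,Value=my-aurora-instance \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 300 \
  --statistics Average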
Regarding 2)
This is described in the documentation (https://aws.amazon.com/premiumsupport/knowledge-center/postgresql-aurora-storage-issue/), so I guess it's right and the size shouldn't have mattered.
Storage used for temporary data and logs (local storage). All DB temporary files (for example, logs and temporary tables) are stored in the instance local storage. This includes sorting operations, hash tables, and grouping operations that are required by queries.
Each Aurora instance contains a limited amount of local storage that is determined by the instance class. Typically, the amount of local storage is twice the amount of memory on the instance. If you perform a sort or index creation operation that requires more memory than is available on your instance, Aurora uses the local storage to fulfill the operation.

How to calculate the number of CPU, memory and storage that my Google Cloud SQL needs

My DB is reaching 100% CPU utilization and increasing the number of CPUs no longer helps.
What kind of information should I consider when sizing my Google Cloud SQL instance? How do you choose the DB configuration?
Info I have:
For 10-50 minutes a day I have 120 requests/second and CPU utilization reaches 100%
Memory usage peaks at 2.5 GB during this critical period
Storage usage is currently around 1.3GB
Current configuration:
vCPUs: 10
Memory: 10 GB
SSD storage: 50 GB
Unfortunately, there is no magic formula for determining the correct database size. This is because queries have variable load - some are small and simple and take no time at all, others are complex or huge and take lots of resources to complete.
There are generally two strategies to dealing with high load: Reduce your load (use connection pooling, optimize your queries, cache results), or increase the size of your database (add additional CPUs, Storage, or Read replicas).
Usually, when we have high CPU utilization, it is because the CPU is overloaded or we have too many database tables in the same instance. Here are some common issues and fixes provided by Google’s documentation:
If CPU utilization is over 98% for 6 hours, your instance is not properly sized for your workload, and it is not covered by the SLA.
If you have 10,000 or more database tables on a single instance, it could result in the instance becoming unresponsive or unable to perform maintenance operations, and the instance is not covered by the SLA.
When the CPU is overloaded, it is recommended to use this documentation to view the percentage of available CPU your instance is using on the Instance details page in the Google Cloud Console.
It is also recommended to monitor your CPU usage and receive alerts at a specified threshold by setting up a Stackdriver alert.
Increasing the number of CPUs for your instance should reduce the strain of your instance. Note that changing CPUs requires an instance restart. If your instance is already at the maximum number of CPUs, shard your database to multiple instances.
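If you do scale up, the change can be applied along these lines (a sketch only; the instance name and sizes are placeholders, and the operation restarts the instance):
# Resize a Cloud SQL instance to a custom machine type; triggers a restart.
gcloud sql instances patch my-instance --cpu=16 --memory=16GiB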
Google has this very interesting documentation about investigating high utilization and determining whether a system or user task is causing high CPU utilization. You could use it to troubleshoot your instance and find what's causing the high CPU utilization.

Mongodb terminates when it runs out of memory

I have the following configuration:
a host machine that runs three docker containers:
Mongodb
Redis
A program using the previous two containers to store data
Both Redis and Mongodb are used to store huge amounts of data. I know Redis needs to keep all its data in RAM and I am fine with this. Unfortunately, what happens is that Mongo starts taking up a lot of RAM and as soon as the host RAM is full (we're talking about 32GB here), either Mongo or Redis crashes.
I have read the following previous questions about this:
Limit MongoDB RAM Usage: apparently most RAM is used up by the WiredTiger cache
MongoDB limit memory: here apparently the problem was log data
Limit the RAM memory usage in MongoDB: here they suggest limiting mongo's memory so that it uses a smaller amount of memory for its cache/logs/data
MongoDB using too much memory: here they say it's the WiredTiger caching system, which tends to use as much RAM as possible to provide faster access. They also state it's completely okay to limit the WiredTiger cache size, since it handles I/O operations pretty efficiently
Is there any option to limit mongodb memory usage?: caching again; they also add that MongoDB uses the LRU (Least Recently Used) cache algorithm to determine which "pages" to release - you will find some more information in these two questions
MongoDB index/RAM relationship: quote: MongoDB keeps what it can of the indexes in RAM. They'll be swapped out on an LRU basis. You'll often see documentation that suggests you should keep your "working set" in memory: if the portions of index you're actually accessing fit in memory, you'll be fine.
how to release the caching which is used by Mongodb?: same answer as in 5.
Now what I appear to understand from all these answers is that:
For faster access it would be better for Mongo to fit all indices in RAM. However, in my case, I am fine with indices partially residing on disk as I have a quite fast SSD.
RAM is mostly used for caching by Mongo.
Considering this, I was expecting Mongo to try and use as much RAM as possible, while still being able to function with little RAM and fetch most things from disk. However, I limited the Mongo Docker container's memory (to 8GB, for instance) using --memory and --memory-swap, but instead of fetching stuff from disk, Mongo just crashed as soon as it ran out of memory.
How can I force Mongo to use only the available memory and to fetch from disk everything that does not fit into memory?
Thanks to #AlexBlex's comment I solved my issue. Apparently the problem was that Docker limited the container's RAM to 8GB, but the WiredTiger storage engine was still trying to use 50% of (total system RAM - 1 GB) for its cache (which in my case would have been about 15 GB).
Capping WiredTiger's cache size (using this configuration option) to a value less than what Docker was allocating solved the problem.
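Put together, the fix looks roughly like this (a sketch only; the image tag and sizes are illustrative): the container is capped at 8 GB and the cache is set to what the default formula gives for 8 GB rather than for the host's 32 GB.
# 0.5 * (8 GB - 1 GB) = 3.5 GB, instead of the ~15 GB derived from host RAM.
docker run -d --memory=8g --memory-swap=8g mongo:3.6 mongod --wiredTigerCacheSizeGB 3.5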

MongoDB WiredTiger Storage Engine cacheSizeGB configure option

What is the minimum value of cacheSizeGB that I can configure? (This configuration option was introduced for the WiredTiger storage engine in MongoDB 3.0.0.)
Must the cacheSizeGB value be an integer, or can I configure it with a floating-point number like 15.5?
I cannot find the details in the official MongoDB documentation.
I know this is quite old, but as I landed on this question looking for an answer, here it is:
Changed in version 3.4: Values can range from 256MB to 10TB and can be a float. In addition, the default value has also changed.
The default is the larger of:
50% of (RAM - 1 GB), or
256 MB.
Avoid increasing the WiredTiger internal cache size above its default value.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container.
source: https://docs.mongodb.com/manual/reference/configuration-options/#storage.wiredTiger.engineConfig.cacheSizeGB
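To illustrate, a sketch of the equivalent config-file setting, written from a shell (the paths are placeholders; the fractional value is the 15.5 from the question):
# Write a minimal mongod.conf with a fractional cache size, then start mongod.
cat > /etc/mongod.conf <<'EOF'
storage:
  dbPath: /data/db
  wiredTiger:
    engineConfig:
      cacheSizeGB: 15.5
EOF
mongod --config /etc/mongod.conf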