Processor, RAM, and disk usage - mongodb

How can I tell how much RAM, CPU, and disk MongoDB consumes while I'm running find queries, insert queries, update queries, bulk operations, etc.?
I thought about mongoperf, but it only shows disk usage. It is still great because it can create threads, pick an amount of GB, and read or write, but I also need to know how much RAM and CPU it takes.
Something like running htop, but just for MongoDB.

You could use the ps(1) command (I guess you are on Linux).
Programmatically, you could (on Linux) use the /proc/ file system (which is used by ps, top, htop). For details, read proc(5).
To get the pid of your MongoDB process, you could use pidof(1) or pgrep(1). If the pid of the mongod server is 1234, you would be interested in /proc/1234/status.
Notice that (on Linux) a process does not directly consume RAM. The (mongod server) process has a virtual address space, and the kernel manages the RAM (and dispatches it among processes). You are probably interested in the resident set size (and you can query it with ps or via /proc/).
The virtual address space of the process with pid 1234 can be queried via /proc/1234/status and /proc/1234/maps (see also pmap(1)).
If you are not familiar with /proc/, play with it first on the command line for your own shell, by running cat /proc/$$/status and cat /proc/$$/maps and exploring /proc/$$/.
On my machine, sudo cat /proc/$(pidof mongod)/status gives some interesting output.
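Here is a minimal sketch of putting those pieces together for mongod (assuming mongod is running locally; pidstat is optional and comes from the sysstat package):
pid=$(pidof mongod)
# Resident set size (actual RAM) and virtual size of the mongod process
grep -E 'VmRSS|VmSize' /proc/$pid/status
# CPU and memory percentages as reported by ps
ps -p $pid -o pid,%cpu,%mem,rss,vsz,cmd
# Refresh every second, roughly an "htop for mongod only"
watch -n 1 "ps -p $pid -o pid,%cpu,%mem,rss,vsz,cmd"
# Per-process disk I/O, if pidstat is installed
pidstat -d -p $pid 1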

Related

How to dump specific process to RAM file

I use the DumpIt tool to dump RAM to a raw file for forensic work. But it has a problem: sometimes I just want to dump a specific process, like cmd or notepad, rather than dumping all currently running processes.
Use Sysinternals ProcDump. It can take dumps manually or based on a condition (like high CPU usage or an unhandled exception).
You can also take a dump using Process Manager (but be careful to use a 32-bit Process Manager if your target process is 32-bit as well).
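A few hedged examples of ProcDump invocations (flags as I recall them from the Sysinternals documentation; the target name notepad is only an illustration):
rem Full memory dump of a running notepad process, taken immediately
procdump -ma notepad
rem Full dump written automatically when the process raises an unhandled exception
procdump -ma -e notepad
rem Full dump when CPU usage stays above 80% for 10 seconds
procdump -ma -c 80 -s 10 notepad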

MongoDB can't access documents above a specific skip

I have a MongoDB instance in the cloud on an AWS EC2 t2.micro (30GB storage, 1GB RAM) running in Docker, and in that database I have a single collection which stores 411 thousand documents, and this takes ~700MB of disk space.
On my local computer, if I run this in mongo shell:
db.my_collection.find().skip(200000).limit(1)
then I get the correct results, but if I run this
db.my_collection.find().skip(220000).limit(1)
then MongoDB shuts down. Why? What should I do to access this data?
It appears that your system doesn't have enough RAM to satisfy MongoDB's demand. When a Linux system is critically low on memory, the kernel starts killing processes to keep the system itself from crashing.
I believe this is what is happening in your case too. MongoDB is not even getting a chance to write a log. I'd recommend increasing the RAM or, if that's not feasible, adding more swap space. This will prevent the system crash, but MongoDB will keep working, though very, very slowly.
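If you want to confirm that the OOM killer is indeed what terminated mongod, and to add swap space as a stop-gap, here is a minimal sketch (the 2G size and the /swapfile path are just examples):
# Look for evidence that the kernel killed mongod
sudo dmesg | grep -iE 'out of memory|killed process'
# Create and enable a 2 GB swap file
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# Keep it across reboots
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab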
Please visit these excellent resources on Linux and its behavior:
https://unix.stackexchange.com/questions/136291/will-linux-start-killing-my-processes-without-asking-me-if-memory-gets-short
https://serverfault.com/questions/480266/how-to-know-if-the-server-runs-out-of-ram-before-crashing-down

Memory issues with a new version of MongoDB

I've been using MongoDB on my Windows Server 2012 machine for more than two years. Since the last update, some weird issues have started to happen which in the end lead to the entire RAM being used.
The service I've configured for MongoDB is as follows:
logpath=d:\data\log\mongod.log
dbpath=d:\data\db
storageEngine=wiredTiger
rest=true
#override port
port=27017
#configsvr = true
shardsvr = true
And in order to limit the cache memory usage I've added the following line:
wiredTigerCacheSizeGB=10
And this is where the weird stuff started happening. When I check Task Manager, it says that MongoDB is now really limited to 10GB as I defined in the service, but it is actually using a lot more than 10GB.
In the first screenshot you can see the memory consumption sorted by RAM usage, while in fact the machine I'm using has 28GB in total.
This excessive consumption leads to failures in the scripts I'm running, even the most basic ones, such as simple 'count' or 'distinct' queries; I believe this is a direct result of the memory consumption.
When I checked the log files, I saw that there are many open connections, and even when a session ends, the log still shows the same number of open connections.
So in the end I have two major questions:
1. Is there a way of solving this issue without downgrading the MongoDB version?
2. Does the config file look right? Is everything in it necessary?
With WiredTiger, MongoDB uses a two-level cache:
First is the WiredTiger cache, controlled by --wiredTigerCacheSizeGB.
Second is the operating system filesystem cache. MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes.
See also WiredTiger memory usage
For OS filesystem cache, MongoDB doesn't manage the memory it uses directly - it lets the OS manage it. Windows will try to use every last scrap of physical memory if it can - but lots of it should and will be thrown out if other processes request memory.
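To see how much of the memory is really the WiredTiger cache versus mongod's overall footprint, you can query serverStatus from the shell; a rough sketch (field names as I recall them from serverStatus output):
# Configured cache limit vs. what is currently in the WiredTiger cache (bytes)
mongo --quiet --eval "var c = db.serverStatus().wiredTiger.cache; print(c['maximum bytes configured'], c['bytes currently in the cache'])"
# Resident and virtual memory as seen by mongod (MB)
mongo --quiet --eval "printjson(db.serverStatus().mem)"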
An alternative is to run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system.
Having said the above:
You are also running another database on the server, i.e. mysqld. MongoDB, like most databases, will perform better on a dedicated server, which reduces memory contention.
Task Manager shows mongod using 10GB, although the machine's usage is up to ~28GB. The difference may or may not be mongod, since you have other processes running as well.
Useful resources:
FAQ: Memory diagnostics for WiredTiger
FAQ: MongoDB Cache Handling
MongoDB Production Notes

Mongodb hotfix KB2731284

I installed MongoDB on a Windows Server 2008 R2 machine where the hotfix KB2731284 is not installed, and I cannot restart the server easily.
In the hotfix description, I found this message: "You run an application that uses the FlushViewOfFile() function to clean up memory-mapped files from the paged memory pool." (see https://support.microsoft.com/en-us/kb/2731284)
My question is: when is the function FlushViewOfFile() called? My application just writes to a collection and reads data from it. Do I risk running into incorrect behavior?
I think you can run MongoDB without applying the hotfix, but I would not recommend it; in the long run you may run into problems. They have included some fixes in MongoDB to work around the problem.
A detailed description of the problem can be found here and here.
See also this.
On Windows, Memory Mapped File flushes are synchronous operations. When the OS Virtual Memory Manager is asked to flush a memory mapped file, it makes a synchronous write request to the file cache manager in the OS. This causes large I/O stalls on Windows systems with high Disk IO latency, while on Linux the same writes are asynchronous.
The problem becomes critical on high-latency disk drives like Azure persistent storage (10ms). This behavior results in very long background flush times, capping disk IOPS at 100. On low-latency storage (local storage and AWS) the problem is not as visible.
On Windows 7 and Windows Server 2008 R2, applying the hotfix gives you better file allocation performance, which is relevant for MongoDB.
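If you just want to check whether the hotfix is already present before planning a restart, something like this should work (assuming wmic is available; Get-HotFix -Id KB2731284 is the PowerShell equivalent):
rem List installed hotfixes and filter for KB2731284
wmic qfe get HotFixID | findstr 2731284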

Running MongoDB and Redis on two different containers in the same host machine

I have read somewhere that MongoDB and a Redis server shouldn't run on the same host, because the way Redis manages memory hurts MongoDB. That was before Docker.io. But now things seem pretty different, or do they? Is it sensible to run a Redis server and MongoDB in two different containers on the same host machine?
Docker does not change your hardware, and it is still the OS that deals with resources, which is not virtualized, so the same rules as on normal hardware apply here.
RAM
MongoDB and Redis don't share any memory. The problem with using the same host is that you can run out of RAM with these two processes. You can set a maximum memory size for Redis, and you can probably do the same for MongoDB; it is mandatory (see the sketch below).
If your sizing is good (MongoDB RAM + Redis RAM < hardware RAM), you won't get any swapping to disk for Redis (which is exactly what you want to prevent), but maybe the MongoDB cache won't be as effective (not enough room for optimization). Less memory for Redis is always a challenge if your data grows: beware of out-of-memory situations if the data size is unpredictable!
If you use backups with Redis, it uses more RAM than its dataset size to produce the dump, so beware of that. It also implies extra IO.
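As a rough sketch of that sizing in practice, both engines let you cap their main memory consumers from the command line (the 2gb/4GB values are only examples; mongod still uses extra RAM outside the WiredTiger cache):
# Cap Redis at 2 GB and evict least-recently-used keys when the limit is hit
redis-server --maxmemory 2gb --maxmemory-policy allkeys-lru
# Cap the WiredTiger cache at 4 GB; connections and the OS filesystem cache come on top of this
mongod --wiredTigerCacheSizeGB 4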
IO
In this case (less RAM) Mongo will do a lot more IO to access data. Redis, depending on your backup policy, may or may not use IO (your choice). Worst case: if you use AOF on Redis, that is a lot of IO, so IO may become a bottleneck in this architecture. If you don't use backups with Redis, you won't have problems. Also, an SSD is a good choice for Mongo.
CPU
I don't know if MongoDB uses a lot of CPU, but Redis most of the time does not, except during backups. If you use backups with Redis, try to have two CPU cores available for it (one for Redis, one for the backup task).
Network
It depends on your number of clients. But you should check the throughput / input load of your machine to see whether you are saturating it (using monit, for instance, with alerts). Sometimes the bottleneck is simply not enough network throughput on one machine!
Many of today's services, in particular databases, are very aggressive in consuming resources and are designed on the assumption that they will (or should) run on a machine dedicated to them. MongoDB and Redis try to keep a lot of data in memory and will take as much memory as they can for themselves. To prevent these services from taking all the memory of your host machine, you can limit the maximum memory used by a container with -m="<number><optional unit>" in docker run. E.g.: docker run -d -m="2g" -p 27017:27017 --name mongodb dockerfile/mongodb
This way you can control the resource limits of your services easily and run them on the same host with fine-grained control over resources. Keep in mind, though, that the performance of these services is designed assuming that the resources of the host machine are fully available to them. For example, there are other databases, such as Cassandra, that consume a lot of memory and, furthermore, are designed around sequential writes to disk. In these cases Docker will let you run them with limited resources, but if you run multiple services on the same host, their performance will degrade severely.
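Putting it together, a hedged sketch of running both databases with per-container memory caps (image names, container names, and sizes are only examples):
# Redis in its own container, hard-capped at 2 GB by Docker, 1.5 GB inside Redis itself
docker run -d --name redis -m 2g redis redis-server --maxmemory 1500mb
# MongoDB in another container, capped at 4 GB, with a 2 GB WiredTiger cache
docker run -d --name mongodb -m 4g -p 27017:27017 mongo --wiredTigerCacheSizeGB 2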