I have two instances running on AWS (EC2). One instance is running only mongodb server while the other one is running a multi process python program that acquires info from the remote mongo server.
On the python instance I am using pymongo, and each process establishes connection (MongoClient) independently.
While monitoring the CPU utilization of the mongo's instance, I get very low CPU usage (about 2%).
In the free monitoring tool (https://cloud.mongodb.com/freemonitoring/cluster), I get about 40% CPU utilization.
Why there is such a big difference between the two values?
Does the mongodb needs to be special configured in order to utilize multiple CPU's cores?
Does the mongodb needs to be special configured in order to utilize multiple CPU's cores?
No.
Why there is such a big difference between the two values?
You have not described where the 2% value came from or what it is measuring, hence this question is impossible to answer.
Related
I'm running a single instance mongodb node using docker-compose
I'm trying to update 40k documents.
I tried running this on 3 setups of aws instances - 2x4 (cpu/mem), 4x8 and 16x32.
It seems that even after adding more cpu, the task doesn't run faster.
Also in top it shows that cpu of mongodb is always 100% during this task.
Any way to improve this?
Use iostat to determine what the cpu is spent on.
If it is spent on iowait, your system is i/o bound and adding more cpu won't help - you need faster disks.
MacOS with mongodb-community#4.2 (installed using brew)
TLDR: MongoDB is only running as one process, seemingly not taking advantage of the 7 other available CPU cores.
I'm running a simple NodeJS application with PM2, making use of all 8 of my CPU cores.
Using Apache Benchmark, I try to stress-test the application for retrieving data. The endpoint I am hitting retrieves data from my MongoDB database. (Only reading, no write operations are performed).
During the stress-test I get these results:
There are 8 active NodeJS processes
There is only 1 active MongoDB process
CPU usage indicates that MongoDB is the bottleneck. How can I ensure that MongoDB takes advantage of more cores?
Screenshot from TOP:
Why is MongoDB only making use of 1 process/core?
Can I increase performance by configuring it to use more than one process/core?
Some additional information, serverStatus() run during the stress-test:
MongoDB (as any database) works with single process to ensure consistency, it uses locking and other concurrency control measures to prevent multiple clients from modifying the same piece of data simultaneously.
MongoDB Performance
In some cases, the number of connections between the applications and the database can overwhelm the ability of the server to handle requests. The following fields in the serverStatus document can provide insight:
connections is a container for the following two fields:
connections.current the total number of current clients connected to the database instance.
connections.available the total number of unused connections available for new clients.
If there are numerous concurrent application requests, the database may have trouble keeping up with demand. If this is the case, then you will need to increase the capacity of your deployment.
For read-heavy applications, increase the size of your replica set and distribute read operations to secondary members.
For write-heavy applications, deploy sharding and add one or more shards to a sharded cluster to distribute load among mongod instances.
https://docs.mongodb.com/manual/administration/analyzing-mongodb-performance/#number-of-connections
RAM- 4GB,
PROCESSOR-i3 5010ucpu #2.10 GHz
64 bit OS
can Cassandra and MongoDB be installed in such a laptop? Will it run successfully?
The hardware configuration proposed does not meet the minimum requirements. For Cassandra, the documentation requests a minimum of 8GB of RAM and at least 2 cores.
MongoDB's documentation also states that it will need at least 2 real cores or one multi-core physical CPU. With 4GB in RAM, the WiredTiger will allocate 1.5GB for the cache. Please also note that MongoDB will require changes in BIOS to allow memory interleaving to enable Non-Uniform Access Memory, a.k.a. NUMA, such changes will impact the performance of the laptop for other processes.
Will it run successfully?
This will depend on the workload expected to be executed; there are documented examples where Cassandra was installed on a Raspberry Pi array, which since the design it was expected to have slow performance and have a limited amount of data that can be held in the cluster.
If you are looking to have a small sandbox to start using these databases there are other options, MongoDB has a service named Atlas, with a model of a database as a service, it offers a free tier for a 3-node replica and up to 512Mb of storage. For Cassandra there are similar options, AWS offers in the free tier a small cluster of their Managed Cassandra Service (MCS), Datastax is also planning to offer similar services with Constellation
Attempted to migrate my production environment from Native Postgres environment (hosted on AWS EC2) to RDS Postgres (9.4.4) but it failed miserably. The CPU utilisation of RDS Postgres instances shooted up drastically when compared to that of Native Postgres instances.
My environment details goes here
Master: db.m3.2xlarge instance
Slave1: db.m3.2xlarge instance
Slave2: db.m3.2xlarge instance
Slave3: db.m3.xlarge instance
Slave4: db.m3.xlarge instance
[Note: All the slaves were at Level 1 replication]
I had configured Master to receive only write request and this instance was all fine. The write count was 50 to 80 per second and they CPU utilisation was around 20 to 30%
But apart from this instance, all my slaves performed very bad. The Slaves were configured only to receive Read requests and I assume all writes that were happening was due to replication.
Provisioned IOPS on these boxes were 1000
And on an average there were 5 to 7 Read request hitting each slave and the CPU utilisation was 60%.
Where as in Native Postgres, we stay well with in 30% for this traffic.
Couldn't figure whats going wrong on RDS setup and AWS support is not able to provide good leads.
Did anyone face similar things with RDS Postgres?
There are lots of factors, that maximize the CPU utilization on PostgreSQL like:
Free disk space
CPU Usage
I/O usage etc.
I came across with the same issue few days ago. For me the reason was that some transactions was getting stuck and running since long time. Hence forth CPU utilization got inceased. I came to know about this, by running some postgreSql monitoring command:
SELECT max(now() - xact_start) FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'active');
This command shows the time from which a transaction is running. This time should not be greater than one hour. So killing the transaction which was running from long time or that was stuck at any point, worked for me. I followed this post for monitoring and solving my issue. Post includes lots of useful commands to monitor this situation.
I would suggest increasing your work_mem value, as it might be too low, and doing normal query optimization research to see if you're using queries without proper indexes.
I have read somewhere that MongoDB and Redis server shouldn't be executed in the same host because the way that Redis manages the memory damages MongoDb. This is before Docker.io. But now thing seems are pretty different or not? Is is convenient running Redis server and MongoDB on two different containers on the same host machine?
Docker does not change your hardware, also it is the OS that deals with resources which is not virtualized so the same rules as a normal hardware should apply here.
RAM
MongoDB and Redis don't share any memory. The problem of using the same host will be that you can run out of RAM with these two processes, you can put a max size for redis, you can probably do the same for MongoDB, it is mandatory.
If your sizing is good (MongoDB RAM + Redis RAM < Hardware RAM), you won't get any swap on disk for redis (which is absolutely what you want to prevent) but maybe mongodb cache won't be as good (not enough place for optimization). Less memory for redis is always a challenge if your data grows: beware of out of memory if the data size is unpredictable!
If you use backups with redis, it uses more RAM than its dataset to produce the dump, so beware of that. It implies also using IO.
IO
In this case (less RAM) mongo will do a lot more of IO to access data. Redis, depending on your backup policy, can use IO or not (your choice). Worst case: if you use AOF on redis, it is a lot of IO so maybe IO can become a bottleneck in this architecture. If you don't use backups with redis: you won't have problems. Also a SSD is a good choice for Mongo.
CPU
I don't know if MongoDB uses a lot of CPU, but redis most of the time does not except during backups. If you use backups with redis: try to have two CPU cores available for it (one for redis, one for backup task).
Network
It depends on your number of clients. But you should check the throughput / input load of your machine to see if you are not saturating (using monit for instance with alerts). Sometimes it is the bottleneck, not enought throughput in one machine!
Many of today's services, in particular Databases, are very aggressive consuming resources and are designed thinking they will (or should) be executed in a dedicated machine for them. MongoDB and Redis try to keep a lot of data in memory and will try to take the more memory they can for themselves. To avoid this services take all the memory of your host machine you can limit the maximum memory used by a container using -m="<number><optional unit>" in docker run. E.g.: docker run -d -m="2g" -p 27017:27017 --name mongodb dockerfile/mongodb
So you can control in an easy way the resource limits of your services, and run them in the same host with a fine grained control of the resources. Anyway it's important to consider that the performance of these services is designed thought that the resources of the host machine will be fully available for them. For example there are other databases as Cassandra that will consume a lot of memory, and furthermore, are designed to have sequential access writing to disk. In these cases Docker will let you to run limiting the resources used, but if you run multiple services in the same host the performance of them will decrease severely.