I have an AWS EC2 server running an application connected to a MongoDB Atlas sharded cluster. Periodically the application slows down and I receive alerts from MongoDB about high CPU steal %. I am looking at upgrading my MongoDB cluster tier, and the only difference I see between the options is more storage space and more RAM; the number of vCPUs is the same. Does anyone have insight into whether the increased RAM will help with the CPU steal % alerts I am receiving, and whether it will help speed up the app? Or am I better off upgrading my AWS server tier to get more CPU that way?
Any help is appreciated! Thanks :)
I don't think more RAM will necessarily help if you're mostly CPU-bound. However, if you're on MongoDB Atlas, the higher tiers definitely do provide more vCPUs as you go up the scaling options.
You can also enable auto scaling and set your minimum and maximum tiers to allow the database to scale as necessary: https://docs.atlas.mongodb.com/cluster-autoscaling/
However, be warned that Atlas has a pretty aggressive scale-up and a pretty sluggish scale-down. I think the scale-down only happens after 24 hours, so it can get costly.
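For reference, here is a rough sketch of setting those autoscaling bounds programmatically through the Atlas Admin API; the project ID, cluster name, tier names, and API keys are placeholders, and the payload follows the v1.0 "Modify One Cluster" endpoint, which may differ in newer API versions:

```python
# Sketch: enable compute auto-scaling on an Atlas cluster via the Admin API.
# GROUP_ID, CLUSTER_NAME, tiers, and the API keys are all placeholders.
import requests
from requests.auth import HTTPDigestAuth

GROUP_ID = "<project-id>"
CLUSTER_NAME = "<cluster-name>"
url = f"https://cloud.mongodb.com/api/atlas/v1.0/groups/{GROUP_ID}/clusters/{CLUSTER_NAME}"

payload = {
    "autoScaling": {"compute": {"enabled": True, "scaleDownEnabled": True}},
    "providerSettings": {
        "providerName": "AWS",
        # Minimum and maximum tiers the cluster is allowed to move between.
        "autoScaling": {"compute": {"minInstanceSize": "M30", "maxInstanceSize": "M50"}},
    },
}

resp = requests.patch(url, json=payload, auth=HTTPDigestAuth("<public-key>", "<private-key>"))
resp.raise_for_status()
print(resp.json().get("autoScaling"))
```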
I am a little confused about my message server's network bottleneck. I can see that the problem is caused by the large number of network operations, but I am not sure why, or how to identify it.
Currently we are running our message server on a GCP VM with 4 cores/8 GB RAM. Redis and Cassandra live on other servers in the same location. The problem occurs during the network operations to the Redis server and the Cassandra server.
I need to handle 3000+ requests at once saving data to Redis, and 12000+ requests to the Cassandra server.
The task was consuming all my CPU, and CPU usage dropped right after I merged the Redis and Cassandra requests into a kind of batch request. The penalty is that I have to delay saving my data.
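Roughly, the merged version looks like this (simplified; it uses redis-py's pipeline, and the host and flush threshold are placeholders):

```python
# Simplified sketch of the batching: instead of one round trip per request,
# buffer writes and flush them to Redis in a single pipeline call.
import redis

r = redis.Redis(host="redis-server", port=6379)  # placeholder host

buffer = []  # (key, value) pairs accumulated from incoming requests

def save(key, value):
    buffer.append((key, value))
    if len(buffer) >= 500:  # flush threshold chosen arbitrarily
        flush()

def flush():
    pipe = r.pipeline(transaction=False)
    for key, value in buffer:
        pipe.set(key, value)
    pipe.execute()  # one network round trip for the whole batch
    buffer.clear()
```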
What I want to know is how I can determine the network capacity of my system. How many requests per second is a reasonable load? From my testing it seems clear that the bottleneck is the network operations, but I can't prove it, and I don't even know how to estimate a reasonable network usage for my system. Are there tools or anything else that can help me confirm the network problem? Or is this just a misconfiguration of my GCP system?
Thanks,
Eric
There is a "monitoring" label in each instance where you can check through graphs values like instance CPU, Network and RAM usage.
But to further check the performance of your instance you should use StackDriver Logging1 and Monitoring2. It stores a lot of information from the internal servers and the system performance. for that you will need to install the agent in the instance. It also stores information about your Load Balancer3, in case you are using one with your web application, which is very advisable since it scale your resources up or down with intelligent Autoscaling.
But in order to test out your network you will need to use some third party tool to overload the network. There are multiple tools to achieve this, like JMeter.
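If you'd rather script it yourself, a minimal sketch like this gives a rough requests-per-second figure for the path to Redis (the host, payload size, thread count, and request count are placeholders to tune):

```python
# Minimal sketch: measure how many Redis SETs per second the network path sustains.
# The redis-py client is thread-safe, so one client can be shared by the pool.
import time
from concurrent.futures import ThreadPoolExecutor

import redis

r = redis.Redis(host="redis-server", port=6379)  # placeholder host

def one_write(i):
    r.set(f"loadtest:{i}", "x" * 100)  # 100-byte payload, arbitrary

start = time.time()
with ThreadPoolExecutor(max_workers=50) as pool:
    list(pool.map(one_write, range(10_000)))
elapsed = time.time() - start
print(f"{10_000 / elapsed:.0f} requests/second")
```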
Currently, I can dynamically increase or decrease the number of app servers behind AWS ELB (just by monitoring CPU load).
However, all of the data is stored in MongoDB on a single machine with 2 GB of RAM, and the data is constantly being updated as well,
so it cannot easily be scaled under bursts of incoming traffic.
Vertical scaling won't work because the server would be out of service for a few minutes.
Creating a new DB machine doesn't sound workable either, because the newly created machine wouldn't have the up-to-date data.
How should I design the DB infrastructure to handle this dynamic load?
Most of the time there are only about 20 members on my site. Nevertheless, at certain moments there will be about 1500 members.
Thanks
You should look into replica sets, which let you scale vertically without the downtime you're worried about (upgrade the secondaries one at a time, then step down the primary), and sharded deployments to enable horizontal scaling.
These topics are introduced nicely on page 10 of the following document -
https://d0.awsstatic.com/whitepapers/AWS_NoSQL_MongoDB.pdf
Both these features are slightly complex and will take some intimate knowledge of Mongo to work well. If you want an out-of-the-box solution, you can run your DB on a separate service outside AWS. We are using compose.io for this; it satisfies our needs during peak hours and isn't that expensive.
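For a flavor of what the sharded route involves, here is a minimal pymongo sketch run against a mongos router (the router address, database, collection, and shard key are hypothetical):

```python
# Sketch: enable sharding for a database and shard a collection on a hashed key.
# Must be run against a mongos router; "mydb", "events", and the key are made up.
from pymongo import MongoClient

client = MongoClient("mongodb://mongos-host:27017")  # placeholder router address

client.admin.command("enableSharding", "mydb")
client.admin.command(
    "shardCollection", "mydb.events",
    key={"user_id": "hashed"},  # a hashed key spreads bursty writes across shards
)
```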
I have a RHEL Linux server (VM) with a 4-core processor and 8 GB of RAM running the applications below:
- an Apache Karaf container
- an Apache tomcat server
- an ActiveMQ server
- and the mongod server (either primary or secondary).
Often I see that mongod consumes nearly 80% of the CPU. CPU and memory usage overshoot most of the time, and this has made me doubt whether my hardware configuration is too low to run this many components.
Please let me know if it is OK to run Mongo like this on a shared server.
The question is too broad and the answer depends on too many variables, but I'll try to give you an overall sense of it.
Can you use all these services together on the same machine at minimal load? For sure. It's not clear where the other replica set members reside, but it will work either way. You didn't provide your HDD specs, which are quite important for a DB server, but again, it will work at minimal load.
Can you use this setup under heavy load? Not the best idea. It's probably better to have separate servers handling these services.
Monitor overall server load: CPU, memory, IO. Check the Mongo logs for slow queries. If your queries are supposed to run fast and they don't, you'll need more hardware.
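If reading logs is tedious, you can also turn on the database profiler; a minimal sketch with pymongo (the 100 ms threshold and database name are arbitrary choices):

```python
# Sketch: profile operations slower than 100 ms and list the worst offenders.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["mydb"]  # placeholder DB name

db.command("profile", 1, slowms=100)  # level 1 = record only slow operations

for op in db["system.profile"].find().sort("millis", -1).limit(5):
    print(op["millis"], "ms:", op.get("ns"), op.get("op"))
```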
Nobody can really tell you how much load a specific server configuration can handle. You need at least 512 MB of RAM and 1 CPU to get going these days, but you will hit the limits very soon. It all depends on how many users you have, what kinds of queries they run, and how much data those queries cover.
Can you run MongoDB alongside other applications on a single server? Well, it would appear that if you are having memory or CPU issues in your current configuration, then you will likely need to address something. But "can you?": if it is not going to affect you, then of course you can.
Should you do this? Most people would firmly agree that you should not, and that also applies to most of the other applications you are running on the one machine.
There are various reasons why you should not have this kind of configuration: process isolation, resource allocation, security, and far too many others for a short response to cover. And certainly, where it becomes a problem, you should address it by seeking a new configuration.
As for Mongo itself: most people would not think twice about running their SQL database on dedicated hardware, and the choice for Mongo should likely be no different.
I have also suggested this be moved to Server Fault, as it is not a programming question suited to Stack Overflow.
I'm new to AWS, and I have configured two EC2 instances: one for my MongoDB database and another for my application.
I'm using pymongo to make the connection, but sending data between the instances each time takes too long. I would like to know if it's possible to make the MongoDB instance behave like localhost for the application instance, using groups or something similar, to get better performance.
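Here is roughly what my connection code looks like now (simplified; the hostname, database, and collection names are placeholders):

```python
# Current setup (simplified): the app connects to the DB instance's address
# over the network on MongoDB's default port.
from pymongo import MongoClient

client = MongoClient("mongodb://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:27017")
db = client["mydb"]                     # placeholder database name
db["items"].insert_one({"example": True})  # each call crosses the network
```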
Or is it better to put the database on the same instance as my application and add more EBS storage?
Be sure you know where your performance bottleneck is.
If both instances are in the same Availability Zone, network latency should not be the largest performance issue. In fact, if your instances are at least "large" sized, network latency should be a non-issue due to the better NIC.
To know for sure, measure your network utilization with a monitoring tool.
If any of your working set (the MongoDB documents that are used with any frequency) cannot fit in the instance's RAM, you are touching EBS. EBS is very, very slow compared to what MongoDB needs: I measured a single EBS volume using iozone recently and found it to be half as fast as my laptop's rotational hard drive.
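One rough way to check is to ask mongod how much memory it is actually using and watch the page fault counter; a sketch (the address is a placeholder, and field availability varies by platform and storage engine):

```python
# Sketch: compare mongod's resident memory against the machine's RAM to get a
# rough sense of whether the working set fits. Numbers are indicative only.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder address
status = client.admin.command("serverStatus")

print("resident MB:", status["mem"]["resident"])
print("virtual  MB:", status["mem"]["virtual"])
# A steadily climbing fault count suggests reads are going to disk (EBS).
print("page faults:", status.get("extra_info", {}).get("page_faults"))
```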
You can improve EBS performance substantially by striping multiple EBS volumes into a software RAID configuration.
The bottom line when running MongoDB on AWS is that you need enough RAM to hold the MongoDB documents that you will touch with any frequency.
I have an application in production that uses a MongoDB instance on the same machine as the web server. It works fine for me, but then I don't need scalability right now; one instance is enough.
So to answer your question, sure you can run it as localhost.
But if your app takes off and you need multiple instances, or sharding, or the like, then you'd have to deploy instances on other machines as well.
So, I have what I'd call a huge Mongo database: about 30 GB (roughly 30 million documents). I tried to run mongod on a server shared with another application and everything slowed to a crawl. So I have to look for a dedicated server, but I have no idea how much RAM I need.
I understand that I probably need enough RAM to hold all the indexes. But if I'm correct, that would be about 13 GB of RAM, which makes the server very, very expensive (my app isn't making any money yet).
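For what it's worth, this is roughly how I estimated that figure with pymongo (the address and database name are placeholders):

```python
# Sketch: read total index and data sizes from dbStats to estimate RAM needs.
# dbStats reports sizes in bytes by default.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["mydb"]  # placeholder DB name
stats = db.command("dbstats")

print("index size: %.1f GB" % (stats["indexSize"] / 1024**3))
print("data size:  %.1f GB" % (stats["dataSize"] / 1024**3))
```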
I looked into MongoHQ, but their cheapest dedicated plan is $600/month.
Any ideas? Is it really that expensive to host heavy mongo databases like that?
Build your own server and colocate it instead of renting someone else's. You get full control over the hardware, with higher startup costs but lower long-term costs. You are also liable for hardware malfunctions, so watch out for that.