Is 25GB / 250 IOPS necessary for journaling in MongoDB?

I followed an official document to deploy MongoDB to AWS. After the template completed, I found that two huge volumes had been allocated:
After some investigation, I found it hard-coded in the template file (provided by the guide):
And used for journaling:
I am not a dedicated database administrator, so I want to know the reason for allocating such huge storage.
I also want to know whether decreasing the IOPS rate for the journaling volume will decrease overall database performance.

Related

Dealing with a full MongoDB Atlas database

I have set up a free tier MongoDB Atlas database and have a script that is storing tweets on it. According to db.collection.stats(), the storage size is 32768, which will fill up quite fast. Firstly, what happens when you exceed this limit? Are new entries rejected, or does something else happen? Secondly, is there a way to deal with this without upgrading? For example, is it possible to clear entries before exceeding capacity?
When you exceed the limit, the Atlas cluster node that has exceeded it will become unavailable. It is possible that the entire cluster will go down, in which case you will need to contact MongoDB support to bring it back up.
The best option is to upgrade to the next tier for more storage capacity. But if you don't want that, you can write a script that deletes old data from your cluster; after deleting the data, make sure to run the compact command to reclaim the storage.
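For example, here is a minimal mongo shell sketch of that approach. The tweets collection and created_at field are hypothetical names for illustration, and note that shared Atlas tiers restrict some administrative commands:

    // Delete documents older than 30 days (collection/field names are hypothetical)
    var cutoff = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000);
    db.tweets.deleteMany({ created_at: { $lt: cutoff } });

    // Then ask the storage engine to release the freed space back to the OS
    db.runCommand({ compact: "tweets" });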

Completely deleting a database in a sharded MongoDB cluster

I am planning to test a MongoDB cluster with some random data to check its performance. Then I plan to delete the data and use the cluster for production.
My concern is that just running db.dropDatabase() may not reclaim all the disk space on all the shards and config servers. This answer from Stack Overflow says that "MongoDB does not recover disk space when actually data size drops due to data deletion along with other causes."
This documentation kind of says that "You do not need to reclaim disk space for MongoDB to reuse freed space. See Empty records for information on reuse of freed space", but I want to know the proper steps to delete a sharded MongoDB database.
I am currently on MongoDB 3.6.2.
Note: to people who may say I need a different MongoDB database for testing and production, I want to make it clear that the production run is itself a test to replace another old database, so right now I am not looking for another big cluster just to test performance.
I think the article below has the best solution; I could explain it here myself, but I would only be repeating it and wasting both our time:
https://dzone.com/articles/reclaiming-disk-space-from-mongodb
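For completeness, here is a rough mongo shell sketch of the usual drop-and-reclaim sequence on a WiredTiger-backed 3.6 cluster; testdb and mycollection are hypothetical names:

    // Connected through mongos: drop the test database
    use testdb
    db.dropDatabase()

    // On WiredTiger, dropping a database deletes its underlying files, which
    // frees the disk space. If you only deleted documents rather than dropping,
    // run compact per collection on each shard's mongod directly (compact
    // cannot be issued through mongos):
    db.runCommand({ compact: "mycollection" })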

What is the recommended EC2 instance size for MongoDB

We plan to use MongoDB in production for our website, which will be hosted (at least initially) on a single EC2 instance.
What is the recommendation for a MongoDB deployment that will have around 25k documents at the start, with low traffic? I am not yet familiar with AWS, so I have no comparison to other dedicated hosters.
The "storageSize" of the collection in question will be around 400MB, "totalIndexSize" maybe around 20MB.
If it's not a production instance, you can start with a t2.medium instance. If it is a production instance, start with an m5.large.
Attach a new EBS volume of size 10GB and configure MongoDB to use this new volume as the data directory. This makes it easy to scale up your storage at a later point in time.
Make sure you format the EBS volume with the XFS file system before installing MongoDB; MongoDB recommends XFS for best performance.
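For the data-directory step, a minimal /etc/mongod.conf sketch would look like this; the mount point /data/mongodb is an assumption for illustration:

    # /etc/mongod.conf -- point MongoDB at the new EBS volume
    storage:
      dbPath: /data/mongodb   # hypothetical mount point of the 10GB XFS volume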
Also, if you later want to increase the instance size when your traffic grows, just use the "instance modify" option to get it done.
I cannot give you a specific answer, but the cloud provides you with the option to test your setup very quickly. Just start with some instance (e.g. m3.medium) and you can create an Amazon Machine Image of your running MongoDB instance anytime and just start it on a larger or smaller instance type.
You can find the instance types here: https://aws.amazon.com/ec2/instance-types/
Deeper thought about the instance type choice can be found here: https://www.mongodb.com/blog/post/maximizing-mongodb-performance-on-aws
If you have any doubt about sizing, err on the side of going larger and then scaling down as needed. Undersizing your instances when deploying MongoDB can, and likely will, create performance problems. In order to achieve the best performance, your MongoDB working set should fit in memory.
The working set is the portion of data and related indexes that your clients access most frequently.
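As a rough way to gauge this from the mongo shell, you can compare resident memory and (on WiredTiger) cache usage against what the server has available; the field names below come from serverStatus and can vary across versions:

    // Rough memory picture; exact fields vary by version and storage engine
    var s = db.serverStatus();
    printjson(s.mem);   // resident and virtual memory, in MB

    // On WiredTiger, compare current cache usage to the configured maximum
    print(s.wiredTiger.cache["bytes currently in the cache"]);
    print(s.wiredTiger.cache["maximum bytes configured"]);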

Setting up MongoDB to work with two data directories

I have a MongoDB machine with a 1 TB drive (on AWS, that is the limit).
However, I need to store more than 1 TB of data in this MongoDB setup; it is not heavy on reads/writes.
Is there a way to split the data directory across two mounts - two different directories? (Instead of using LVM.)
You can use the directoryperdb configuration option so that each database's files are stored in a separate subdirectory, and then use different mount points depending on the volume you want to use.
This option can be helpful in provisioning additional storage or different storage configurations depending on database usage (e.g. SSD or PIOPS volumes for a database which is more I/O intensive, while using normal EBS storage for archival data).
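As a sketch (the database names and mount points here are hypothetical), the configuration and resulting layout would look like:

    # /etc/mongod.conf
    storage:
      dbPath: /data/db
      directoryPerDB: true

    # With directoryPerDB enabled, each database gets its own subdirectory,
    # e.g. /data/db/hotdata and /data/db/archive, so /data/db/hotdata can be
    # a mount point for a PIOPS volume while archive sits on cheaper storage.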
Important caveats:
There is a highlighted note in the documentation for directoryperdb which shows how to migrate the files for an existing deployment.
If you separate your data files onto multiple volumes, this may change your Backup Strategy. In particular, if you are using filesystem/EC2 snapshots to get a consistent backup of a running mongod you can no longer do so if the data and/or journal files are on different volumes.
I haven't solved a problem quite like yours. However, I've read about sharding, which is defined as:
Sharding is a method for storing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.
It sounds promising for your problem. Wanna try? See Sharding MongoDB (a minimal sketch follows below).
(sorry, not enough rep for commenting)
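If you do give it a try, here is a minimal mongo shell sketch run against a mongos; the database, collection, and shard key here are hypothetical:

    // Enable sharding for the database, then shard the large collection
    sh.enableSharding("bigdata")
    sh.shardCollection("bigdata.events", { _id: "hashed" })
    sh.status()   // check how chunks are spread across the shards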

MongoDB single node configuration

I am going to configure MongoDB on a small number of cloud servers.
I am coming from MySQL, and I remember that if I needed to change settings like RAM usage, I would modify the "my.cnf" file. This came in handy while resizing each cloud server.
Now, how can I check or modify how much RAM or disk space the database is going to take on each node?
Thank you in advance.
I don't think there are any built-in broad-stroke limitation tools or flags in MongoDB per se, most likely because this is something you should be doing at the operating-system level.
Most modern multi-user operating systems have built-in ways to set quotas on disk space per user, so you could set up a dedicated mongodb user and place the limits on that account if you really wanted to. MongoDB works best when it has enough memory to hold the working set of data and indexes in memory, and it does a good job of managing that on its own.
However, if you want to get granular, you can take a look at the output of mongod --help.
I see the following options that you could tweak:
--nssize arg (=16) .ns file size (in MB) for new databases
--quota limits each database to a certain number of files (8 default)
--quotaFiles arg number of files allowed per db, requires --quota
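For example, a hypothetical invocation that caps each database at four data files (note these are MMAPv1-era options that have been removed in recent MongoDB versions):

    mongod --dbpath /data/db --quota --quotaFiles 4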