Dealing with a full MongoDB Atlas database - mongodb

I have set up a free tier MongoDB Atlas database and have a script that is storing tweets on it. Using db.collection.stats() it says the storage size is 32768, which will fill up quite fast. Firstly, what happens when you exceed this limit? Are new entries rejected, or does something else happen? Secondly, is there a way to deal with this without upgrading? For example, is it possible to clear entries before exceeding capacity?

When you exceed the limit, the Atlas cluster node that has exceeded it becomes unavailable. It is possible that the entire cluster will go down, in which case you will need to contact MongoDB support to bring it back up.
The best option is to upgrade to the next tier to get more storage capacity. But if you don't want that, you can write a script that deletes old data from your cluster, and after deleting the data make sure to run the compact command to reclaim the storage.
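A minimal sketch of such a cleanup in the mongo shell, assuming the tweets are stored in a collection named tweets with a created_at date field (both names are placeholders for your actual schema):

    // placeholder names: collection "tweets", date field "created_at"
    var cutoff = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000);  // e.g. keep the last 30 days
    db.tweets.deleteMany({ created_at: { $lt: cutoff } });         // remove older tweets
    db.runCommand({ compact: "tweets" });                          // reclaim the freed storage

Note that shared tiers restrict some administrative commands, so check whether compact is allowed on your cluster before relying on it.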

Related

Cloud SQL disk size is much larger than actual database

Cloud SQL reports that I've used ~4TB of SSD storage, but my database is only ~225 GB. What explains this discrepancy? Is there something I can delete to free up space? If I moved it to a different instance, would the required storage go down?
There are a couple of possible reasons why your Cloud SQL storage has increased:
- Did you enable point-in-time recovery? PITR uses write-ahead logs, and if you enabled this feature, that could be the reason for the increase.
- Have you used temporary tables that you have not deleted?
If none of the above applies to you, I highly recommend you open a case with the GCP support team so that they can take a look at your Cloud SQL instance.
On the other hand, you can open a case to have the disk size decreased, so it won't be necessary to create a new instance and copy all the data to it; the shrinking is done on Google's end, which keeps the effort required from you as low as possible.
A maintenance window can be scheduled for Google to carry out this task, which also helps minimize the impact of the downtime. For this, it is necessary to specify the new disk size and when you would like the operation performed.
Finally, if you prefer the migration method, you would export the DB, create the new instance, import the DB, and synchronize the old instance with the new one so that both hold all the data; those four steps can take several hours to complete.
You do not specify what kind of database. In my case, for a MySQL database, there were several hundred GB of binary logs (controlled by a MySQL flag).
You could check with:
SHOW BINARY LOGS;

Completely deleting a database in a sharded MongoDB cluster

I am planning to test a MongoDB cluster with some random data to test for performance. Then, I am planning to delete the data and use it for production.
My concern is that doing just db.dropDatabase() may not reclaim all the disk space on all the shards and config servers. This answer from Stack Overflow says that "MongoDB does not recover disk space when actually data size drops due to data deletion along with other causes."
This documentation kind of says that "You do not need to reclaim disk space for MongoDB to reuse freed space. See Empty records for information on reuse of freed space" but I want to know the proper steps to delete a sharded MongoDB database.
I am currently on MongoDB 3.6.2.
Note: to people who may say I need a different MongoDB database for testing and production, I want to make it clear that the production deployment is itself a test to replace another old database. So right now, I am not looking for another big cluster just to test for performance.
I think the article below is the best solution; I could explain it here, but I would be wasting my time and you would be losing yours:
https://dzone.com/articles/reclaiming-disk-space-from-mongodb
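For the basic drop-and-verify flow, a minimal sketch in the mongo shell, assuming you connect through mongos and the test database is named perftest (a placeholder):

    // run against mongos; "perftest" is a placeholder database name
    db.getSiblingDB("perftest").dropDatabase();    // drops the database across all shards
    // then, connected to each shard's primary, check what is still on disk:
    db.adminCommand({ listDatabases: 1 });         // reports sizeOnDisk per database

The linked article covers the actual space-reclaiming steps if the drop alone does not free enough disk.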

What is the recommended EC2 instance size for MongoDB

We plan to use MongoDB in production for our website which will be hosted (for the start) on a single ec2 instance.
What is the recommendation for MongoDB, which will have around 25k documents for the start with low traffic? So far I am not used to AWS, therefore I have no comparison to other dedicated hosts.
The "storageSize" of the collection in question will be around 400MB, the "totalIndexSize" maybe around 20MB.
If it's not a production instance, you can start with a t2.medium instance. If it's a production instance, start with an m5.large.
Attach a new EBS volume of size 10GB and configure MongoDB to use this new volume as the data directory. This makes it easy to scale up your storage at a later point in time.
Make sure you format the EBS volume with the XFS file system before installing MongoDB, which is what MongoDB recommends for best performance.
Also, if you later want to increase the instance size when your traffic grows, just use the "instance modify" option to get it done.
I cannot give you a specific answer, but the cloud gives you the option to test your setup very quickly. Just start with some instance type (e.g. m3.medium); you can create an Amazon Machine Image of your running MongoDB instance at any time and start it on a larger or smaller instance type.
You can find the instance types here: https://aws.amazon.com/ec2/instance-types/
Deeper thought about the instance type choice can be found here: https://www.mongodb.com/blog/post/maximizing-mongodb-performance-on-aws
If you have any doubt about sizing, err on the side of going larger and then scaling down as needed. Undersizing your instances when deploying MongoDB can, and likely will, create performance problems. In order to achieve the best performance, your MongoDB working set should fit in memory.
The working set is the portion of data and related indexes that your clients access most frequently.
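As a rough sanity check against the sizes mentioned in the question, you can compare the on-disk data and index sizes with the instance's RAM from the mongo shell; a minimal sketch (the collection name is a placeholder):

    // "mycollection" is a placeholder for your actual collection
    var s = db.mycollection.stats();
    print("storageSize bytes:    " + s.storageSize);
    print("totalIndexSize bytes: " + s.totalIndexSize);
    // if data + indexes fit comfortably in RAM, the working set certainly does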

Is a 25G, 250 IOPS volume necessary for journaling in MongoDB

I followed an official document to deploy MongoDB to AWS. After the template completed, I found that two huge volumes are allocated:
After some investigation, I found it hard-coded in the template file (provided by the guide):
And used for journaling:
I am not a dedicated database administrator, so I want to know the reason for allocating such huge storage.
I also want to know whether decreasing the IOPS rate for the journaling volume will decrease the overall database performance.

MongoDB single node configuration

I am going to configure mongodb on a small number of cloud servers.
I am coming from MySQL, and I remember that if I needed to change settings like RAM, etc., I would have to modify the "my.cnf" file. This came in useful while resizing each cloud server.
Now, how can I check or modify how much RAM or disk space the database is going to take for each node?
Thank you in advance.
I don't think there are any built-in broad-stroke limitation tools or flags in MongoDB per se, most likely because this is something you should be doing at the operating system level.
Most modern multi-user operating systems have built-in ways to set quotas on disk space, etc. per user, so you could probably set up a mongo user and place the limits on it if you really wanted to. MongoDB works best when it has enough memory to hold the working set of data and indexes in memory, and it does a good job of managing that on its own.
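To check (rather than cap) how much memory and disk a node is currently using, the shell's status commands are enough; a minimal sketch in the mongo shell:

    db.serverStatus().mem   // resident and virtual memory used by mongod, in MB
    db.stats()              // dataSize, storageSize and indexSize for the current database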
However, if you want to get granular, you can take a look at the output of mongod --help.
I see the following options that you could tweak:
--nssize arg (=16)   .ns file size (in MB) for new databases
--quota              limits each database to a certain number of files (8 default)
--quotaFiles arg     number of files allowed per db, requires --quota