I installed MongoDB on my local desktop. Since I have other important stuff on this machine, I need to make sure MongoDB data files do not occupy more than 10GB of disk storage at any time. If it exceeds 10GB, I expect an error message while inserting new documents.
Is there a way to set this max size on disk via the config file?
As of MongoDB 4.0, there isn't a general configuration option to limit the maximum size on disk for a deployment.
However, there are several ways you could limit storage usage on your desktop:
Use a separate partition or storage volume for your MongoDB dbPath
Run MongoDB inside a virtual machine or container with a maximum storage allocation
Connect to a hosted MongoDB deployment
In general it is a bad idea to let your database server run out of space as this may result in unexpected errors or shutdown depending on the operations that are trying to complete at the point when the server runs out of space. Typically you would want to have a process monitoring storage so you can proactively free some space before the issue becomes critical.
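If you want a quick way to watch usage from the mongo shell rather than at the OS level, a sketch along the following lines could be run periodically. The listDatabases command is standard; the 10GB budget is just the figure from your question, not a server setting:

```
// Minimal sketch: print per-database size on disk and warn when the total
// approaches a self-imposed budget. The 10GB budget is from the question,
// not a mongod configuration option.
var budgetBytes = 10 * 1024 * 1024 * 1024;
var result = db.adminCommand({ listDatabases: 1 });

result.databases.forEach(function (d) {
    print(d.name + ": " + (d.sizeOnDisk / 1024 / 1024).toFixed(1) + " MB on disk");
});

if (result.totalSize > 0.8 * budgetBytes) {
    print("WARNING: total size on disk is above 80% of the 10GB budget");
}
```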
A few space saving tips that might help:
Rotate your MongoDB log files regularly to limit storage usage. If you are using a Unix/Linux system, you can configure the logrotate utility to rotate and compress logs when they reach a target file size, and subsequently remove archived logs when they reach a certain age.
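mongod can also be asked to rotate its own log from the shell, which pairs well with an external cleanup job; a minimal example:

```
// Ask mongod to rotate its current log file. With the default
// systemLog.logRotate: rename behaviour, the old log is renamed with a
// timestamp suffix; compressing or deleting those renamed files is still
// up to an external job such as cron or logrotate.
db.adminCommand({ logRotate: 1 });
```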
Consider using TTL indexes to automatically remove old data from collections. This can be useful if you have collections with ephemeral data like user sessions that will become stale after an expiry date.
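For example, assuming a sessions collection with a createdAt date field (both names are illustrative), a TTL index that expires documents an hour after creation would look like this:

```
// TTL index sketch: the background TTL monitor deletes documents whose
// createdAt value is more than 3600 seconds in the past.
db.sessions.createIndex(
    { createdAt: 1 },
    { expireAfterSeconds: 3600 }
);
```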
Drop unused indexes & collections. The WiredTiger storage engine (default in MongoDB 3.2+) allocates a file per collection and index, so dropping either of those should immediately free up storage space.
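A rough sketch of both operations from the mongo shell (collection and index names are placeholders):

```
// List the indexes on a collection to decide which ones are unused ...
db.mycollection.getIndexes();

// ... drop a single index by name ...
db.mycollection.dropIndex("fieldA_1");

// ... or drop an entire collection, which also removes its indexes and
// (under WiredTiger) deletes the underlying files on disk.
db.old_collection.drop();
```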
I have a question regarding the freeable memory for AWS Aurora Postgres.
We recently wanted to create an index on one of our databases; the instance died and failed over to the slave, which all worked fine. It looks like the freeable memory dropped by the configured 500MB of maintenance_work_mem, down to around 800MB, and right after that the 32GB instance died.
1) I am wondering whether freeable memory is the overall system memory, and whether low freeable memory could invoke the system OOM killer on the AWS Aurora instance. If so, we may want to plan in more headroom for operational tasks and the running of autovacuum jobs so we don't encounter this issue again.
2) As far as I understood, the actual work of the index creation should have used the free local storage, so the size of the index shouldn't have mattered, right?
Thanks in advance,
Chris
Regarding 1)
Freeable memory, as described in this AWS forum thread (https://forums.aws.amazon.com/thread.jspa?threadID=209720):
The freeable memory includes the amount of physical memory left unused
by the system plus the total amount of buffer or page cache memory
that are free and available.
So it's freeable memory across the entire system. While MySQL is the
main consumer of memory on the host we do have internal processes in
addition to the OS that use up a small amount of additional memory.
If you see your freeable memory near 0 or also start seeing swap usage
then you may need to scale up to a larger instance class or adjust
MySQL memory settings. For example decreasing the
innodb_buffer_pool_size (by default set to 75% of physical memory) is
one example of adjusting MySQL memory settings.
That also means that if memory gets low, it's likely to impact the process in some form. In this thread (https://forums.aws.amazon.com/thread.jspa?messageID=881320), for example, it was mentioned that it caused the MySQL server to restart.
Regarding 2)
This matches what is described in the documentation (https://aws.amazon.com/premiumsupport/knowledge-center/postgresql-aurora-storage-issue/), so I guess that's right and the size shouldn't have mattered.
Storage used for temporary data and logs (local storage). All DB
temporary files (for example, logs and temporary tables) are stored in
the instance local storage. This includes sorting operations, hash
tables, and grouping operations that are required by queries.
Each Aurora instance contains a limited amount of local storage that
is determined by the instance class. Typically, the amount of local
storage is twice the amount of memory on the instance. If you perform
a sort or index creation operation that requires more memory than is
available on your instance, Aurora uses the local storage to fulfill
the operation.
On a legacy system that is running MongoDB 2.2.1 we are running out of disk space due to excessively large database files. Our actual data size is just under 3 GB, with about 1.7 GB index size, but the storage size is over 70 GB. So the storage to data+index ratio is close to a factor of 15. There are about 40 data files, most of which are at the 2 GB maximum file size.
We are contemplating running a compact() or repair() to regain some of the unused space, but we are worried about the problem recurring soon after. It seems that the current configuration (pretty close to the default configuration) is not suitable for the database usage pattern of our application.
What other tools, diagnostics, remedies or configuration changes are available that could help MongoDB make better use of the disk space?
WiredTiger, used in MongoDB 3.0 and later, is much more efficient in terms of disk usage.
However, migrating from MongoDB 2.2 to 3.0 is going to be a huge leap.
Another option, assuming this is configured as a replica set, is to re-sync the Secondary nodes individually and then perform a failover. This will have the same effect as performing a repair without the downtime that would occur as a result of using the repairDatabase command.
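For reference, here is roughly what the operations under discussion look like from the mongo shell (the collection name is a placeholder); the re-sync route only needs the last command, run on the current primary once the secondaries have been re-synced:

```
// compact defragments a single collection in place; on MMAPv1 it blocks
// operations on the database while it runs, so plan a maintenance window.
db.runCommand({ compact: "mycollection" });

// repairDatabase rewrites all data files for the current database and needs
// enough free disk space to do so; the node is unavailable for the duration.
db.repairDatabase();

// If you re-sync the secondaries instead, step down the current primary
// once they have caught up, so a freshly synced node takes over.
rs.stepDown();
```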
I have an AWS instance running on one machine. It has all the data files, server setup, MongoDB database, etc. I created a new AMI image, and then tried to launch an instance from this image.
On the new machine, as soon as it was created, the size of the MongoDB journal started to increase, from just 2.6MB on the original machine to 3.1GB on the new machine. (When the machine starts and I ssh to it, I can see the size of the files increasing gradually; in about 10 minutes it reaches around 3.1GB and stops.)
I see that, based on other answers, 3.1GB is some magic number for journal files. My question is: why was it small on my original machine, and why does it increase only after starting the new instance?
I don't see the 'smallFiles' setting enabled on either the old or the new machine. There are no other changes. I have retried creating new images and new instances from these images multiple times.
Please let me know how to fix this issue. My total data file size is only around 195MB and the original journal file size is around 2.6MB.
From the sounds of things, you are using MMAPv1 and so are just seeing the natural usage of the journal files, where each journal file should be up to 1GB in size. As stated in the docs, in normal conditions you should have up to 3 files and so up to 3GB of journal.
The fix for really small DBs that need to run on small VMs (like yours) is to enable the smallFiles setting. As already noted on SO, this should not be a problem.
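To double-check whether smallFiles is actually in effect on either machine, you can inspect the parsed startup options from the shell; note that the option itself can only be supplied at mongod startup:

```
// Show how mongod was started; smallFiles should appear under the parsed
// options if it is in effect. It is set at startup only, e.g. with
// --smallfiles on the command line or smallfiles = true in the config file.
db.adminCommand({ getCmdLineOpts: 1 }).parsed;
```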
While you're at it, you might also want to check out this other answer when you switch: Setting smallfiles option for controlling journaling doesn't control the size
There is a spike in the memory utilization of MongoDB on our CentOS 7 server. It has 64 gigabytes of RAM. This is a standalone MongoDB instance which doesn't have any application running on it, and there are housekeeping scripts enabled to keep only the relevant data. We haven't indexed the data. The total size of data on disk is 81 gigabytes. This issue was not seen before we tried enabling replication, after which the node set-up began using high memory, so replication was disabled and we brought up a fresh standalone instance of Mongo. The memory usage hasn't come down since; we tried restarting the mongo server, but that hasn't worked. Is there any reason for MongoDB to use so much memory? Below is a link to a snapshot of the memory usage taken from the site server.
The mongo version is 2.6.5
Image link
This is not surprising. See the Memory Use section in the docs for the MMAPv1 storage engine (which is what MongoDB 2.6 uses):
With MMAPv1, MongoDB automatically uses all free memory on the machine as its cache. System resource monitors show that MongoDB uses a lot of memory, but its usage is dynamic. If another process suddenly needs half the server’s RAM, MongoDB will yield cached memory to the other process.
It is also not surprising that the usage spiked after enabling replication, as it sounds like you had a fully populated database and then added a replica member. The new member would need to perform an initial sync of the data from this node, which requires reading every document and so "primes" MongoDB's cache as a result.
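If you want to see how the 81 GB breaks down (and therefore how much of it the MMAPv1 cache can plausibly hold in 64 GB of RAM), db.stats() with a scale factor reports the per-database figures:

```
// Report dataSize, storageSize and indexSize for the current database,
// scaled to gigabytes (1024^3 bytes). Run it against each database in turn.
db.stats(1024 * 1024 * 1024);
```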
I am concerned about my server machine's performance. The application deals with a huge amount of data from a RETS server feed. Whenever the server starts the mongod service, performance drops and the PF usage shoots up to ~3.59GB, although the machine has a good configuration (Server 2008, 4GB RAM) and is using the latest 64-bit MongoDB release (2.0.6). Please enlighten me on this.
Thanks
I'm not sure how much you know about MongoDB but Mongo uses memory mapped files to access data, which results in large numbers being displayed for the mongod process. This is normal when using memory-mapped files. The amount of mapped datafile is shown in the virtual size parameter and resident bytes shows how much data is being cached in RAM. The larger your data files, the higher the vmsize of the mongod process.
If other processes need more RAM, the operating system's virtual memory manager will relinquish some memory from the cache and the resident bytes of the mongod process will drop.
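You can read the same figures (mapped data files, virtual size, resident bytes) straight from the shell:

```
// mapped:   total size of the memory-mapped data files (MMAPv1)
// virtual:  virtual address space of the mongod process
// resident: how much of that is actually held in physical RAM right now
db.serverStatus().mem;
```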
It is recommended to use a fixed page file size. If you use a dynamic page file, the OS doesn't increase it fast enough to keep up with the (private) mapped memory calls. There's actually an open ticket to add a special warning if the page file is dynamic or its minimum size is set too small.
This document explains how memory usage works on MongoDB.
Here are some tools that show how you can diagnose system issues with MongoDB -
mongostat
Monitoring and Diagnostics
To be honest, I'd recommend moving this issue to the MongoDB User Google Group and posting it there along with the mongostat output from during the issue, as well as information from perfmon, as this will likely be a longer discussion.
Something else to consider is setting up MMS on your mongod instances.
https://mms.10gen.com