We have a MongoDB cluster with 4 shards.
Our primary shard's disk has 700GB of space, and according to db.stats() that shard is using ~530GB.
When checking df -h, disk usage is at 99% (9.5 GB free); I'm guessing this means the rest is data files pre-allocated by Mongo.
I've run compact on a couple of collections, and the free disk space was reduced to 3.5GB(?)
We're going to run a process that will generate ~140GB of extra data (35GB per shard).
Should we be concerned about running out of disk space?
Thanks in advance.
compact doesn't decrease disk usage at all; in fact, it can even lead to additional file preallocation. To reduce disk usage you can use the repairDatabase command or start mongod with the --repair option. However, that requires additional free space on disk.
The described situation can arise if you did a lot of document deletions, or operations that forced documents to move; in that case your database would be highly fragmented. The compact command reduces fragmentation so you have more space for new records, but again, it doesn't reclaim any space back to the OS.
Your best option is to try to work out why you have such a high level of fragmentation.
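For reference, here is a rough sketch of what those operations look like from a driver, using pymongo; the host, database, and collection names are placeholders, and this assumes an MMAPv1-era deployment where these commands apply. Run it against the shard's primary during a maintenance window, since both commands block the database.

```python
# Rough sketch (pymongo); host, database, and collection names are placeholders.
from pymongo import MongoClient

client = MongoClient("shard0-primary.example.net", 27017)
db = client["mydb"]

# Compare logical data size with on-disk storage to gauge fragmentation.
stats = db.command("dbstats")
print("dataSize:    %.1f GB" % (stats["dataSize"] / 1e9))
print("storageSize: %.1f GB" % (stats["storageSize"] / 1e9))
print("fileSize:    %.1f GB" % (stats.get("fileSize", 0) / 1e9))  # MMAPv1 only

# compact defragments a single collection in place; it does not shrink the data files.
db.command("compact", "mycollection")

# repairDatabase rewrites the data files and can return space to the OS, but it
# blocks the database and needs extra free disk space while it runs
# (alternatively, stop mongod and start it with --repair).
db.command("repairDatabase")
```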
I have been considering moving to a RAM disk for a while. I know its risks, but I just wanted to do a little benchmark. I have two questions: (a) when reading the query plan, will it still differentiate between disk and buffer hits? If so, should I assume that both are equally expensive, or that there is a difference between them?
(b) a RAM disk is not persistent, but if I want to export some results to persistent storage, are there precautions I would need to take? Is it the same as usual, e.g. the COPY command?
I do not recommend using RAM disks in PostgreSQL for persistent storage. With careful tuning, you can get PostgreSQL not to use more disk I/O than what is required to make your data persistent.
I recommend doing this:
Have more RAM in your machine than the size of the database.
Define shared_buffers big enough to contain the database (on Linux, define memory hugepages to contain them).
Increase checkpoint_timeout and max_wal_size to get fewer checkpoints.
Set synchronous_commit = off to keep PostgreSQL from syncing WAL to disk on every commit.
If you are happy to lose all your data in the case of a crash, define your tables UNLOGGED. The data will survive a normal shutdown.
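If it helps, here is a rough sketch of applying those settings with psycopg2; the connection string and the sizes are placeholders you would adjust to your own machine. Note that ALTER SYSTEM only writes to postgresql.auto.conf: shared_buffers takes effect after a restart, the rest after a reload.

```python
# Rough sketch (psycopg2); connection parameters and sizes are placeholders.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres")
conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
cur = conn.cursor()

cur.execute("ALTER SYSTEM SET shared_buffers = '24GB'")      # big enough to hold the database
cur.execute("ALTER SYSTEM SET checkpoint_timeout = '1h'")    # fewer checkpoints
cur.execute("ALTER SYSTEM SET max_wal_size = '100GB'")       # fewer checkpoints
cur.execute("ALTER SYSTEM SET synchronous_commit = off")     # no WAL flush on every commit
cur.execute("SELECT pg_reload_conf()")                       # shared_buffers still needs a restart
conn.close()
```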
Anyway, to answer your questions:
(a) You should set seq_page_cost and random_page_cost way lower to tell PostgreSQL how fast your storage is.
(b) You could run backups with either pg_dump or pg_basebackup; they don't care what kind of storage you have.
when reading the query plan, will it still differentiate between disk and buffers hits?
It never distinguished between them in the first place. It distinguishes between "hit" and "read", but a "read" doesn't tell you whether the page truly came from disk or from the OS/filesystem cache.
PostgreSQL has no idea you are running on a RAM disk, so it will continue to report these as it always has.
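To see that for yourself, a small sketch (psycopg2; the table name is a placeholder) that prints the plan with buffer statistics; the Buffers: lines show "shared hit" for pages found in shared_buffers and "read" for pages fetched from outside it, wherever they physically live.

```python
# Rough sketch: inspect buffer hits vs. reads for a query; table name is a placeholder.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres")
cur = conn.cursor()
cur.execute("EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM my_table")
for (line,) in cur.fetchall():
    print(line)
# Look for lines such as:
#   Buffers: shared hit=1234 read=567
# "hit" = found in shared_buffers; "read" = fetched from outside shared_buffers
# (OS cache, RAM disk, or physical disk -- PostgreSQL cannot tell which).
conn.close()
```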
If so, should I assume that both are equally expensive or should I assume that there is a difference between them?
This is a question that should be answered through your benchmarking. On some systems, data can be prefetched from main memory into the faster CPU caches, making sequential reads still faster than random reads. If you care, you will have to benchmark it on your own system.
Reading data from RAM into shared_buffers is still surprisingly expensive due to things like lock management. So as a rough starting point, maybe seq_page_cost=0.1 and random_page_cost=0.15.
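A minimal sketch of trying those starting values, set per session here so you can benchmark before persisting them with ALTER SYSTEM (the connection string is a placeholder):

```python
# Rough sketch: try lower planner cost constants for very fast (RAM-disk) storage.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres")
cur = conn.cursor()
cur.execute("SET seq_page_cost = 0.1")      # session-level only
cur.execute("SET random_page_cost = 0.15")  # benchmark in this session, then persist if it helps
```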
a RAM disk is not persistent, but if I want to export some results to persistent storage, are there some precautions I would need to take?
The risk would be that your system crashes before the export has finished. But what precaution can you take against that?
On a legacy system running MongoDB 2.2.1 we are running out of disk space due to excessively large database files. Our actual data size is just under 3 GB, with about 1.7 GB of indexes, but the storage size is over 70 GB, so the storage to data+index ratio is close to a factor of 15. There are about 40 data files, most of which are at the 2 GB maximum file size.
We are contemplating running compact or repair to regain some of the unused space, but we are worried about the problem recurring soon after. It seems that the current configuration (pretty close to the default) is not suitable for our application's database usage pattern.
What other tools, diagnostics, remedies or configuration changes are available that could help MongoDB make better use of the disk space?
WiredTiger, available in MongoDB 3.0 and later (and the default storage engine from 3.2), is much more efficient in terms of disk usage.
However, migrating from MongoDB 2.2 to 3.0 is going to be a huge leap.
Another option, assuming this is configured as a replica set, is to re-sync the secondary nodes individually and then perform a failover. This has the same effect as performing a repair, without the downtime that would result from using the repairDatabase command.
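Very roughly, the per-node re-sync is done at the file level: stop the secondary's mongod, empty its dbpath, restart it, and let it perform an initial sync from the primary. Once every secondary has been rebuilt, the failover itself is just a step-down; a hedged pymongo sketch (host names are placeholders):

```python
# Rough sketch: after each secondary has been re-synced, step the primary down
# so that a freshly re-synced node takes over. Host names are placeholders.
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure

client = MongoClient("mongodb://node1.example.net,node2.example.net,node3.example.net/?replicaSet=rs0")

try:
    # The old primary will not stand for re-election for 60 seconds.
    client.admin.command("replSetStepDown", 60)
except ConnectionFailure:
    # Older servers drop all connections when stepping down, so this is expected.
    pass

print("new primary:", client.primary)
```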
I have the following configuration:
a host machine that runs three docker containers:
MongoDB
Redis
A program using the previous two containers to store data
Both Redis and MongoDB are used to store huge amounts of data. I know Redis needs to keep all its data in RAM, and I am fine with this. Unfortunately, what happens is that Mongo starts taking up a lot of RAM, and as soon as the host RAM is full (we're talking about 32GB here), either Mongo or Redis crashes.
I have read the following previous questions about this:
Limit MongoDB RAM Usage: apparently most RAM is used up by the WiredTiger cache
MongoDB limit memory: here apparently the problem was log data
Limit the RAM memory usage in MongoDB: here they suggest limiting Mongo's memory so that it uses a smaller amount for its cache/logs/data
MongoDB using too much memory: here they say it's the WiredTiger caching system, which tends to use as much RAM as possible to provide faster access. They also state it's completely okay to limit the WiredTiger cache size, since it handles I/O operations quite efficiently
Is there any option to limit mongodb memory usage?: caching again; they also add that MongoDB uses an LRU (Least Recently Used) algorithm to determine which "pages" to release, and point to a couple of further questions for more information
MongoDB index/RAM relationship: quote: MongoDB keeps what it can of the indexes in RAM. They'll be swapped out on an LRU basis. You'll often see documentation that suggests you should keep your "working set" in memory: if the portions of index you're actually accessing fit in memory, you'll be fine.
how to release the caching which is used by Mongodb?: same answer as for the caching questions above.
Now what I appear to understand from all these answers is that:
For faster access it would be better for Mongo to fit all indices in RAM. However, in my case, I am fine with indices partially residing on disk as I have a quite fast SSD.
RAM is mostly used for caching by Mongo.
Considering this, I was expecting Mongo to try to use as much RAM as possible, but to be able to function with little RAM too, fetching most things from disk. However, when I limited the Mongo Docker container's memory (to 8GB, for instance) using --memory and --memory-swap, instead of fetching data from disk, Mongo just crashed as soon as it ran out of memory.
How can I force Mongo to use only the available memory and to fetch from disk everything that does not fit into memory?
Thanks to #AlexBlex's comment I solved my issue. Apparently the problem was that Docker limited the container's RAM to 8GB, but the WiredTiger storage engine was still trying to use its default cache size of 50% of (total system RAM minus 1 GB), which in my case would have been about 15 GB.
Capping WiredTiger's cache size (the storage.wiredTiger.engineConfig.cacheSizeGB setting, or --wiredTigerCacheSizeGB on the command line) to a value less than what Docker was allocating solved the problem.
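A small pymongo sketch (the address is a placeholder) to confirm what cache ceiling the running mongod actually ended up with, and how much of it is currently in use:

```python
# Rough sketch: verify the configured WiredTiger cache ceiling and current usage.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder address
cache = client.admin.command("serverStatus")["wiredTiger"]["cache"]

print("configured max: %.1f GB" % (cache["maximum bytes configured"] / 1e9))
print("currently used: %.1f GB" % (cache["bytes currently in the cache"] / 1e9))
```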
I'm using pymongo to insert a large number of JSON documents into MongoDB GridFS, plus some data into a regular collection.
What I noticed some time ago is that MongoDB consumes a crazy amount of RAM while a single connection is in use. As soon as I close the connection, it releases the memory.
RAM consumption is around 10-12GB in total with the connection open and 200MB without it. The actual size of the collection is only ~300MB, with 10-18GB of GridFS storage.
Why does this happen? How can opening a new connection for each bulky operation be a lot less resource-hungry than using one single connection for everything?
Is it somehow related to Journaling?
I will have to break down this problem into multiple smaller problems for ease of understanding:
It is well known that MongoDB is RAM hungry; it will try to use as much RAM as possible.
GridFS stores file contents in the fs.chunks collection and the corresponding metadata in fs.files. Files stored in GridFS are split into chunks of 255KB (historically 256KB) each by default.
When you read GridFS data over a connection, the chunks belonging to the file(s) have to be loaded from disk into RAM (if they are not already there). So RAM usage is directly proportional to the amount of data accessed and, importantly, to the frequency of GridFS access. To reiterate: GridFS data gets pulled into RAM when a query references it.
If you have an active connection reading large amounts of GridFS data, you should expect heavy RAM usage. But if your query frequency is low (mostly writes, rare reads), RAM usage will be relatively lower. If you are mostly writing data, ensure the connection is closed after the operation is done (see the sketch after this answer).
The more open connections you have, the more your RAM usage will increase.
This is in no way related to journaling.
Note: GridFS also supports sharding, which will tend to mitigate your problem of excessive RAM usage.
Hope this clarifies.
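For the "close the connection when you are done" advice above, a minimal pymongo sketch; the database name and file path are placeholders. MongoClient works as a context manager, so the connection pool is closed as soon as the bulk write finishes:

```python
# Rough sketch: do the bulk GridFS write inside a short-lived connection so the
# resources tied to it are released when the block exits. Names are placeholders.
import gridfs
from pymongo import MongoClient

with MongoClient("mongodb://localhost:27017") as client:
    db = client["mydb"]
    fs = gridfs.GridFS(db)
    with open("/path/to/big_file.json", "rb") as f:
        file_id = fs.put(f, filename="big_file.json")
    db["metadata"].insert_one({"gridfs_id": file_id, "source": "bulk load"})
# The client is closed here; the RAM associated with the session can be reclaimed.
```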
Since MongoDB 2.0, each connection consumes about 1MB of RAM.
I have a sharded cluster in 3 systems.
While inserting I get the error message:
cant map file memory-mongo requires 64 bit build for larger datasets
I know that a 32-bit machine has a data size limit of about 2 GB.
I have two questions to ask.
The 2 GB limit is per system, so with my sharding spread across 3 systems, would the total limit be 6 GB, or still only 2 GB?
Even though sharding is set up properly, is all the data being stored on a single system instead of being distributed across the three shards?
Does sharding play any role in increasing the data size limit?
Does chunk size play any vital role in performance?
I would not recommend doing anything with 32-bit MongoDB beyond running it on a development machine where you perhaps cannot run 64-bit. Once you hit the limit, the database files become unusable.
The documentation states "Use 64 bit for production. This is important as if you hit the mmap size limit (exact limit varies but less than 2GB) you will be unable to write to the database (analogous to a disk full condition)."
Sharding is all about scaling your data set out across multiple nodes, so in answer to your question: yes, you have increased the possible size of your data set. Remember, though, that namespaces and indexes also take up space.
You haven't specified where your mongos resides. Where are you seeing the error from: a mongod or the mongos? I suspect it's the mongod, which would seem to indicate that all your data is going to that one mongod. I believe you need to look at pre-splitting the chunks: http://docs.mongodb.org/manual/administration/sharding/#splitting-chunks.
If you have a mongos, what does sh.status() return? Are chunks spread across all of the mongods?
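If it is easier than eyeballing sh.status(), here is a rough sketch (pymongo, connected to the mongos; the namespace is a placeholder) that counts chunks per shard straight from the config database:

```python
# Rough sketch: count chunks per shard for one sharded collection.
# Connect to the mongos; the namespace below is a placeholder.
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.net:27017")
pipeline = [
    {"$match": {"ns": "mydb.mycollection"}},  # older versions key chunks by namespace
    {"$group": {"_id": "$shard", "chunks": {"$sum": 1}}},
    {"$sort": {"chunks": -1}},
]
for doc in client["config"]["chunks"].aggregate(pipeline):
    print(doc["_id"], doc["chunks"])
# If one shard owns (nearly) all the chunks, the data is not being distributed.
```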
For testing, I'd recommend a chunk size of 1MB. In production, it's best to stick with the default of 64MB unless you have some really important reason not to and you really know what you are doing. If the chunk size is too small, you will be performing splits far too often.
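For completeness, the cluster-wide chunk size is stored in config.settings; a hedged sketch of reading it and (for testing only) lowering it to 1MB, again with pymongo against the mongos:

```python
# Rough sketch: inspect and (for testing only) change the cluster-wide chunk size.
# Run against a mongos; this affects future splits, not existing chunks.
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.net:27017")
settings = client["config"]["settings"]

print(settings.find_one({"_id": "chunksize"}))  # None means the 64MB default
settings.update_one({"_id": "chunksize"},
                    {"$set": {"value": 1}},     # size in MB; 1MB for testing only
                    upsert=True)
```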