I have a single-host database which grew to 95% of disk space while I was not watching. To remedy the situation I created a process that automatically removes old records from the biggest collection, and data usage fell to about 40% of disk space. I figured I was safe as long as the data size didn't grow near the size of the preallocated files, but after a week I was proven wrong:
Wed Jan 23 18:19:22 [FileAllocator] allocating new datafile /var/lib/mongodb/xxx.101, filling with zeroes...
Wed Jan 23 18:25:11 [FileAllocator] done allocating datafile /var/lib/mongodb/xxx.101, size: 2047MB, took 347.8 secs
Wed Jan 23 18:25:14 [conn4243] serverStatus was very slow: { after basic: 0, middle of mem: 590, after mem: 590, after connections: 590, after extra info: 970, after counters: 970, after repl: 970, after asserts: 970, after dur: 1800, at end: 1800 }
This is the output of db.stats() (note that the numbers are in MB because of the scale argument):
> db.stats(1024*1024)
{
"db" : "xxx",
"collections" : 47,
"objects" : 189307130,
"avgObjSize" : 509.94713418348266,
"dataSize" : 92064,
"storageSize" : 131763,
"numExtents" : 257,
"indexes" : 78,
"indexSize" : 29078,
"fileSize" : 200543,
"nsSizeMB" : 16,
"ok" : 1
}
Question: What can I do to stop MongoDB from allocating new datafiles?
Running repair is difficult because I would have to install a new disk. Would running compact help? If so, should I be running it regularly, and how can I tell when I should run it?
UPDATE: I guess I am missing something fundamental here... Could someone please elaborate on the connection between data files, extents, collections and databases, and how space is allocated when needed?
Upgrade to 2.2.2 - 2.2.0 has an idempotency bug in replication and is no longer recommended for production.
See here for general info http://docs.mongodb.org/manual/faq/storage/#faq-disk-size
The only way to recover space from MongoDB is either to sync a new node over the network - in which case the documents are copied over to the new node's file system and stored anew, without fragmentation - or to use the repair command, but for that you need double the disk space you are currently using, because the data files are copied, defragmented, compacted and then copied back over the originals. The compact command is badly named and only defragments - it doesn't give disk space back to the OS.
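For reference, these are the shell invocations involved (a sketch only - xxx is the database from the question, and bigcollection is a placeholder collection name):
use xxx
// defragments extents in place, but does not shrink the data files on disk
db.runCommand({ compact: "bigcollection" })
// rewrites every data file; needs roughly as much free space as the data currently occupies
db.repairDatabase()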
Going forward, use the usePowerOf2Sizes option (new in 2.2.x), set per collection via the collMod command: http://docs.mongodb.org/manual/reference/command/collMod/
If you use that option and insert, say, an 800 byte document, 1024 bytes will be allocated on disk. If you then delete that doc and insert a new one of, say, 900 bytes, it can reuse the 1024 byte slot. Without this option enabled, the 800 byte doc might get only 850 bytes on disk, so when it's deleted and the 900 byte doc inserted, new space has to be allocated. If that is then deleted too, you end up with two free slots - 850 bytes and 950 bytes - which are never joined (unless compact or repair is used), so inserting a 1000 byte doc means allocating yet another chunk of disk. usePowerOf2Sizes helps this situation a lot by using standard bucket sizes.
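Enabling it is a one-off command per collection (a sketch - bigcollection is again a placeholder, and it only affects records allocated after the flag is set):
db.runCommand({ collMod: "bigcollection", usePowerOf2Sizes: true })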
We chose to deploy the mongos router in the same VM as our applications, but we're running into some issues where the application gets OOM killed because the mongos eats up a lot more RAM than we'd expect or want it to.
After a reboot, the mongos footprint is a bit under 2 GB, but from there it constantly requires more memory, about 500 MB per week. It has gone up to 4.5+ GB.
These are the stats for one of our mongos instances for the past 2 weeks, and it clearly looks like it's leaking memory...
So my question is: how do we investigate such behaviour? We've not really been able to find explanations for why the router might require more RAM, how to diagnose the behaviour, or even how to set a memory usage limit on the mongos.
With db.serverStatus() on the mongos we can see the allocations:
"tcmalloc" : {
"generic" : {
"current_allocated_bytes" : 536925728,
"heap_size" : NumberLong("2530185216")
},
"tcmalloc" : {
"pageheap_free_bytes" : 848211968,
"pageheap_unmapped_bytes" : 213700608,
"max_total_thread_cache_bytes" : NumberLong(1073741824),
"current_total_thread_cache_bytes" : 819058352,
"total_free_bytes" : 931346912,
"central_cache_free_bytes" : 108358128,
"transfer_cache_free_bytes" : 3930432,
"thread_cache_free_bytes" : 819058352,
"aggressive_memory_decommit" : 0,
"pageheap_committed_bytes" : NumberLong("2316484608"),
"pageheap_scavenge_count" : 35286,
"pageheap_commit_count" : 64660,
"pageheap_total_commit_bytes" : NumberLong("28015460352"),
"pageheap_decommit_count" : 35286,
"pageheap_total_decommit_bytes" : NumberLong("25698975744"),
"pageheap_reserve_count" : 513,
"pageheap_total_reserve_bytes" : NumberLong("2530185216"),
"spinlock_total_delay_ns" : NumberLong("38522661660"),
"release_rate" : 1
}
},
------------------------------------------------
MALLOC: 536926304 ( 512.1 MiB) Bytes in use by application
MALLOC: + 848211968 ( 808.9 MiB) Bytes in page heap freelist
MALLOC: + 108358128 ( 103.3 MiB) Bytes in central cache freelist
MALLOC: + 3930432 ( 3.7 MiB) Bytes in transfer cache freelist
MALLOC: + 819057776 ( 781.1 MiB) Bytes in thread cache freelists
MALLOC: + 12411136 ( 11.8 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 2328895744 ( 2221.0 MiB) Actual memory used (physical + swap)
MALLOC: + 213700608 ( 203.8 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 2542596352 ( 2424.8 MiB) Virtual address space used
MALLOC:
MALLOC: 127967 Spans in use
MALLOC: 73 Thread heaps in use
MALLOC: 4096 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
But I can't say it's really helpful, at least to me.
In the server stats we can also see that the number of calls to killCursors is quite high (2909015), but I'm not sure how that would explain the steady increase in memory usage, since cursors are automatically killed after 30-ish seconds and the number of calls made to the mongos is pretty much steady throughout the period.
So yeah, any idea how to diagnose this, where to look, or what to look for?
Mongos version: 4.0.19
Edit: it seems our monitoring is based on virtual rather than resident memory, so the graph might not be very pertinent. However, we still ended up with 4+ GB of resident memory at some point.
Why would the router require more memory?
If any query in the sharded cluster requires a scatter-gather, the merging of the per-shard results is done by the mongos itself.
For example, say I am running the query db.collectionname.find({something : 1}).
If the something field is not the shard key, then by default this will scatter-gather; use explain() to check the query plan. It scatter-gathers because the mongos, after consulting the config servers, has no routing information for that field. (This applies to collections that are sharded.)
To make things worse, if you have sorting operations where an index cannot be used, the sort also has to be done on the mongos itself. A sort has to hold the relevant data in memory until it completes (think of the best possible Big O for a sorting operation on that volume of data); until it finishes, that memory stays tied up.
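A quick way to check whether a particular query is doing this, straight from the mongos (a sketch - collectionname and the something field come from the example above):
db.collectionname.find({ something: 1 }).explain()
// In the sharded explain output, a winningPlan stage of "SINGLE_SHARD" means the
// query was routed to a single shard; "SHARD_MERGE" means the mongos had to gather
// and merge results from several shards. A plain "SORT" stage (rather than an
// index-backed sort) means the sort is happening in memory.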
What should you do?
Based on your settings (your slowms setting; the default is 100 ms), check the logs and take a look at the slow queries in the system. If you see a lot of SHARD_MERGE and in-memory sorts taking place, you have your culprit right there.
As a quick fix, increase the available swap and make sure your settings are appropriate.
All the best.
Without access to the machine one can only speculate; with that said, I'd say this is somewhat expected behaviour from Mongo.
Mongo likes memory; it keeps many things in RAM in order to increase performance.
Many different things can cause this; I'll list a few:
MongoDB uses RAM to handle open connections, aggregations, server-side code, open cursors and more.
WiredTiger keeps multiple versions of records in its cache (Multi-Version Concurrency Control: read operations access the last committed version before their operation).
WiredTiger keeps checksums of the data in cache.
There are many more things that are cached or stored in memory by Mongo; one example is the index tree for a collection.
If Mongo has the memory to store something, it will; that's why RAM usage increases the more you use it. However, I personally do not think this is a memory leak. As I said, Mongo just likes RAM. A lot.
"We chose to deploy the mongos router in the same VM as our applications."
This is a big no-no. From personal experience, because of how memory-hungry Mongo is, I would try to avoid this if at all possible.
To summarize: I don't think you have a memory leak (although it is possible); I think that as time passes Mongo simply keeps more things in RAM as they are used.
You should be on the lookout for long running queries as those are the most likely culprit IMO.
I am a new user of MongoDB, and I am hoping to get pointed in the right direction. I will provide any further needed information I have missed as this question develops.
I am using a Perl program to upload documents into, and annotate/modify documents in, a MongoDB database via the MongoDB CPAN module. Indexes are (I believe) being used by this program, but the problem I have is that reading from MongoDB takes increasingly long. Based on mongotop, it takes ~500 ms to read and only 10-15 ms to write, and after allowing the program to run for a considerable amount of time, the read time increases significantly, reaching 3000+ ms after many hours of running.
Monitoring the program with top while it's running, Perl starts out at around 10-20% CPU usage and MongoDB starts at 70-90%. Within a few minutes Perl drops below 5% and MongoDB sits at 90-95%. After running for a much longer period of time (12+ hours), MongoDB is at ~98% CPU usage while Perl is around 0.3% and only pops up in top every 5-10 seconds.
Based on this trend, an indexing issue seems very likely, but I am not sure how to check this; all I know is that the appropriate indexes have at least been created, not necessarily that they are being used.
Additional information:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 19209
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 19209
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
As the program runs, I see that indexSize and dataSize change (via db.stats() in the Mongo shell), which makes me think the indexes are at least being used to some degree.
Could this be affected by the power of my computer? I am under the impression that indexing should make a lot of this process very manageable for the machine.
That sounds a lot like it could be doing a collection scan rather than using the index. I.e. as your collection grows, the reads are getting slower.
If you're using the find method, you can run explain on the resulting cursor to get information on how the query would execute.
Here's a trivial example:
use strict;
use warnings;
use MongoDB;
use JSON::MaybeXS;
my $coll = MongoDB->connect->ns("test.foo");
$coll->drop();
$coll->insert_one({x => $_}) for 1 .. 1000;
my $cursor = $coll->find({x => 42});
my $diag = $cursor->explain;
my $json = JSON::MaybeXS->new(
allow_blessed => 1, convert_blessed => 1, pretty => 1, canonical => 1
);
print $json->encode($diag->{queryPlanner}{winningPlan});
Looking at just the 'winningPlan' part of the output you can see 'COLLSCAN':
{
"direction" : "forward",
"filter" : {
"x" : {
"$eq" : 42
}
},
"stage" : "COLLSCAN"
}
Now I'll do it again, but first creating an index on 'x' before the insertions with $coll->indexes->create_one([x => 1]). You can see in the output that the query plan is now using the index (IXSCAN).
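For reference, the only change to the script above is creating the index right after the drop, using the call quoted above:
# create the index before inserting, so the later find({x => 42}) can use it
$coll->drop();
$coll->indexes->create_one([x => 1]);
$coll->insert_one({x => $_}) for 1 .. 1000;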
{
"inputStage" : {
"direction" : "forward",
"indexBounds" : {
"x" : [
"[42, 42]"
]
},
"indexName" : "x_1",
"indexVersion" : 2,
"isMultiKey" : false,
"isPartial" : false,
"isSparse" : false,
"isUnique" : false,
"keyPattern" : {
"x" : 1
},
"multiKeyPaths" : {
"x" : []
},
"stage" : "IXSCAN"
},
"stage" : "FETCH"
}
There's a lot more you can discover from the full 'explain' output. You can watch a great video from MongoDB World 2016 to learn more about it: Deciphering Explain Output.
I have been working with OrientDB and stored about 120 million records in it; the size on disk was 24 GB. I then deleted all the records by running the following commands against the console:
Delete from E unsafe
Delete from V unsafe
When I checked the DB size on disk it was still 24 GB. Is there anything extra I need to do to free the disk space?
In OrientDB, when you delete a record the disk space remains allocated. The only way to free it is to export and then re-import the DB.
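Roughly, in the OrientDB console that looks like this (a sketch only - the paths and database names are placeholders, and you import into a freshly created database):
CONNECT plocal:/path/to/olddb admin admin
EXPORT DATABASE /tmp/mydb-export.json.gz
CREATE DATABASE plocal:/path/to/newdb admin admin plocal
IMPORT DATABASE /tmp/mydb-export.json.gz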
I'm getting the "not enough storage" error when trying to insert data into my MongoDB. But I'm nowhere near the size limit, as seen in the stats.
> db.stats()
{
"db" : "{my db name}",
"collections" : 20,
"objects" : 281092,
"avgObjSize" : 806.4220539894412,
"dataSize" : 226678788,
"storageSize" : 470056960,
"numExtents" : 95,
"indexes" : 18,
"indexSize" : 13891024,
"fileSize" : 1056702464,
"nsSizeMB" : 16,
"ok" : 1
}
Journaling is on, but the journal file size is only 262232 KB.
A data file of 524032 KB has already been created, even though dataSize is still below the size of the smaller 262144 KB file.
The NS file is 16384 KB.
I've read in several places that this error is caused by the ~2 GB limit of 32-bit builds, but then why am I getting it when my dataSize is below that?
First of all, the constraint applies to fileSize, since the storage engine uses memory-mapped files; yours is currently at about 1 GB. What is likely happening is that MongoDB is about to preallocate a new data file, which is probably going to be 1 GB in size (the default sizes are 64 MB, 128 MB, 256 MB, 512 MB, 1 GB, and from there on 2 GB per additional data file). You're probably at the point where you have the 512 MB file but not the 1 GB one.
Frankly, I think using MongoDB in a 32-bit environment is an absolute no-go, but if you're stuck with it you can try the --smallfiles option, which allocates smaller files of at most 512 MB.
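For reference, it is just a startup flag (a sketch - the dbpath is a placeholder, and on old ini-style config files the equivalent setting is smallfiles = true):
mongod --dbpath /data/db --smallfiles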
I have been using MongoDB for about a month and I am getting the following error: "MapViewOfFile failed /data/db/sgserver.4 errno:0 The operation completed successfully. 0".
If I check the DB path c:/data/db, the size has not exceeded 2 GB. I am using Windows Server 2003 R2. Has anyone faced the same issue? Please share your experience.
Thanks in advance.
Default file sizes for MongoDB
.ns => 16MB
.0 => 64 MB
.1 => 128 MB
.2 => 256 MB
.3 => 512 MB
.4 => 1024 MB
Add that up and you're just under 2 GB. So if you've filled the .4 file, you won't be able to allocate any more space (the .5 file would be 2 GB).
If you log into Mongo and do a db.stats(), how much space are you using? That should tell you how close you are to the limit.
What size is /data/db? This error most likely comes from Mongo trying to allocate a new database file that would push the size of the DB past 2 GB. MongoDB allocates database files in fairly large chunks, so if you are anywhere near 2 GB this could be the problem.