I'm getting the "not enough storage" error when trying to insert data into my MongoDB. But I'm nowhere near the size limit, as seen in the stats.
> db.stats()
{
"db" : "{my db name}",
"collections" : 20,
"objects" : 281092,
"avgObjSize" : 806.4220539894412,
"dataSize" : 226678788,
"storageSize" : 470056960,
"numExtents" : 95,
"indexes" : 18,
"indexSize" : 13891024,
"fileSize" : 1056702464,
"nsSizeMB" : 16,
"ok" : 1
}
Journaling is on, but the journal file size is only 262232 KB.
A data file of size 524032 KB has been created, even though dataSize is below the size of the smaller 262144 KB file.
The NS file is 16384 KB.
I've read in several places that this error is caused by the ~2 GB size limit on 32-bit builds, but then why am I getting it when my dataSize is below that?
First of all, the constraint applies to fileSize, since the storage engine uses memory-mapped files. Yours is currently at about 1 GB. What is likely happening is that MongoDB is about to preallocate a new data file, which will probably be 1 GB in size (the default sizes are 64 MB, 128 MB, 256 MB, 512 MB, 1 GB, and from there on 2 GB per additional data file). You're probably at the point where you have the 512 MB file but not yet the 1 GB one.
Frankly, I think using MongoDB in a 32-bit environment is an absolute no-go, but if you're stuck with it you can try the --smallfiles option, which allocates smaller files of at most 512 MB.
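For reference, here's roughly what that looks like when starting the server (the dbpath and port are placeholders for your own setup):
# Start mongod with smaller preallocated data files (at most 512 MB each).
# Adjust --dbpath and --port to your environment.
mongod --smallfiles --dbpath /data/db --port 27017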
We chose to deploy the mongos router in the same VM as our applications, but we're running into some issues where the application gets OOM Killed because the mongos eats up a lot more RAM than we'd expect / want to.
After a reboot, the mongos footprint is a bit under 2 GB, but from there it constantly requires more memory, about 500 MB per week. It went up to 4.5+ GB.
These are the stats for one of our mongos instances for the past 2 weeks, and it clearly looks like it's leaking memory...
So my question is: how do we investigate such behavior? We've not really been able to find explanations as to why the router might require more RAM, or how to diagnose the behavior, or even how to set a memory usage limit on the mongos.
With a db.serverStatus on the mongos we can see the allocations:
"tcmalloc" : {
"generic" : {
"current_allocated_bytes" : 536925728,
"heap_size" : NumberLong("2530185216")
},
"tcmalloc" : {
"pageheap_free_bytes" : 848211968,
"pageheap_unmapped_bytes" : 213700608,
"max_total_thread_cache_bytes" : NumberLong(1073741824),
"current_total_thread_cache_bytes" : 819058352,
"total_free_bytes" : 931346912,
"central_cache_free_bytes" : 108358128,
"transfer_cache_free_bytes" : 3930432,
"thread_cache_free_bytes" : 819058352,
"aggressive_memory_decommit" : 0,
"pageheap_committed_bytes" : NumberLong("2316484608"),
"pageheap_scavenge_count" : 35286,
"pageheap_commit_count" : 64660,
"pageheap_total_commit_bytes" : NumberLong("28015460352"),
"pageheap_decommit_count" : 35286,
"pageheap_total_decommit_bytes" : NumberLong("25698975744"),
"pageheap_reserve_count" : 513,
"pageheap_total_reserve_bytes" : NumberLong("2530185216"),
"spinlock_total_delay_ns" : NumberLong("38522661660"),
"release_rate" : 1
}
},
------------------------------------------------
MALLOC: 536926304 ( 512.1 MiB) Bytes in use by application
MALLOC: + 848211968 ( 808.9 MiB) Bytes in page heap freelist
MALLOC: + 108358128 ( 103.3 MiB) Bytes in central cache freelist
MALLOC: + 3930432 ( 3.7 MiB) Bytes in transfer cache freelist
MALLOC: + 819057776 ( 781.1 MiB) Bytes in thread cache freelists
MALLOC: + 12411136 ( 11.8 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 2328895744 ( 2221.0 MiB) Actual memory used (physical + swap)
MALLOC: + 213700608 ( 203.8 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 2542596352 ( 2424.8 MiB) Virtual address space used
MALLOC:
MALLOC: 127967 Spans in use
MALLOC: 73 Thread heaps in use
MALLOC: 4096 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
But I can't say it's really helpful. At least to me.
In the server stats we can also see that the number of calls to killCursors is quite high (2909015), but I'm not sure how that would explain the steady increase in memory usage, since the cursors are automatically killed after 30-ish seconds and the number of calls made to the mongos is pretty much steady throughout the period.
So yeah, any idea on how to diagnose this / where to look / what to look for?
Mongos version: 4.0.19
Edit: it seems our monitoring is based on virtual rather than resident memory, so the graph might not be very pertinent. However, we still ended up with 4+ GB of RES memory at some point.
Why would the router require more memory?
If any query in the sharded cluster requires a scatter-gather, the merging activity is taken care of by the mongos itself.
For example, say I am running the query db.collectionname.find({something : 1})
If this something field is not the shard key itself, then by default the query will do a scatter-gather; use explain() to check the query plan. It does a scatter-gather because the mongos interacts with the config server and realises it has no routing information for that field. (This applies to sharded collections.)
To make things worse, if you have sort operations where an index cannot be used, even the sort now has to be done on the mongos itself. A sort has to hold the pages it is working on in memory until it completes, and the work grows with the volume of data, so that memory stays blocked for the duration of the operation.
What should you do?
Based on your slowms setting (the default is 100 ms), check the logs and take a look at the slow queries in the system. If you see a lot of SHARD_MERGE stages and in-memory sorts taking place, you have your culprit right there.
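As a rough sketch of that check (the collection, filter field and sort field below are placeholders taken from the example above, not your actual schema), you can run explain through the mongos and look at the winning plan:
// Run this against the mongos; "collectionname", "something" and "otherField" are placeholders.
var plan = db.collectionname.find({ something: 1 }).sort({ otherField: -1 }).explain("executionStats");
// For a sharded collection, a scatter-gather merged on the mongos shows up as a
// SHARD_MERGE stage at the top of the winning plan, and a SORT stage in the
// per-shard plans means the sort could not use an index (an in-memory sort).
printjson(plan.queryPlanner.winningPlan);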
As a quick fix, increase the available swap memory and make sure your settings are appropriate.
All the best.
Without access to the machine one can only speculate; with that said, I'd say this is somewhat expected behaviour from Mongo.
Mongo likes memory, it saves many things in RAM in order to increase performance.
Many different things can cause this; I'll list a few:
MongoDB uses RAM to handle open connections, aggregations, serverside code, open cursors and more.
WiredTiger keeps multiple versions of records in its cache (Multi-Version Concurrency Control: read operations access the last committed version before their operation).
WiredTiger Keeps checksums of the data in cache.
There are many more things that are cached / stored in memory by mongo, one such example could be an index tree for a collection.
If Mongo has the memory to store something, it will. That's why the RAM usage increases as you use it more and more. However, I personally do not think it's a memory leak. As I said, Mongo just likes RAM, A LOT.
We chose to deploy the mongos router in the same VM as our applications.
This is a big no-no. From personal experience, because of Mongo's memory hungriness, I would try to avoid this if possible.
To summarize: I don't think you have a memory leak (although it's possible); I just think that the more time passes, the more things Mongo stores in RAM as they are being used.
You should be on the lookout for long-running queries, as those are the most likely culprit IMO; see the sketch below.
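As a small, hedged sketch of that lookout (the 30-second threshold is arbitrary), db.currentOp() on the mongos will show operations that have been running for a while:
// List active operations running for more than 30 seconds; tune the threshold as needed.
db.currentOp({ active: true, secs_running: { $gt: 30 } }).inprog.forEach(function (op) {
    print(op.opid + "  " + op.secs_running + "s  " + op.op + "  " + tojson(op.command));
});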
I have a cluster with three shards using MongoDB 4.2. I have a collection (users) that, as can be checked before sharding, has 600000 documents:
mongos> db.users.count()
600000
Next, I shard it with the usual commands (first DB, next collection):
mongos> sh.enableSharding("app")
mongos> sh.shardCollection("app.users", {"name.first": 1})
getting, after a couple of minutes or so, an even distribution of chunks among the shards:
chunks:
shard0000 3
shard0001 2
shard0002 3
So far so good.
However, if I get a count just after this, I get a weird value, higher than the number of documents in the collection:
mongos> db.users.count()
994243
mongos> db.users.find({}).count()
994243
Moreover, the getShardDistribution() result on the collection is also weird, showing the total number of documents, all of them, in one of the shards (which makes no sense, as part of them have been distributed to the other two shards):
mongos> db.users.getShardDistribution()
Shard shard0000 at localhost:27018
data : 95.85MiB docs : 236611 chunks : 3
estimated data per chunk : 31.95MiB
estimated docs per chunk : 78870
Shard shard0001 at localhost:27019
data : 64.06MiB docs : 157632 chunks : 2
estimated data per chunk : 32.03MiB
estimated docs per chunk : 78816
Shard shard0002 at localhost:27020
data : 243.69MiB docs : 600000 chunks : 3
estimated data per chunk : 81.23MiB
estimated docs per chunk : 200000
Totals
data : 403.62MiB docs : 994243 chunks : 8
Shard shard0000 contains 23.74% data, 23.79% docs in cluster, avg obj size on shard : 424B
Shard shard0001 contains 15.87% data, 15.85% docs in cluster, avg obj size on shard : 426B
Shard shard0002 contains 60.37% data, 60.34% docs in cluster, avg obj size on shard : 425B
Interestingly, if I wait a while (not sure how long, but no more than 30 minutes), count and getShardDistribution() are back to normal:
mongos> db.users.count()
600000
mongos> db.users.getShardDistribution()
Shard shard0001 at localhost:27019
data : 64.06MiB docs : 157632 chunks : 2
estimated data per chunk : 32.03MiB
estimated docs per chunk : 78816
Shard shard0002 at localhost:27020
data : 83.77MiB docs : 205757 chunks : 3
estimated data per chunk : 27.92MiB
estimated docs per chunk : 68585
Shard shard0000 at localhost:27018
data : 95.85MiB docs : 236611 chunks : 3
estimated data per chunk : 31.95MiB
estimated docs per chunk : 78870
Totals
data : 243.69MiB docs : 600000 chunks : 8
Shard shard0001 contains 26.28% data, 26.27% docs in cluster, avg obj size on shard : 426B
Shard shard0002 contains 34.37% data, 34.29% docs in cluster, avg obj size on shard : 426B
Shard shard0000 contains 39.33% data, 39.43% docs in cluster, avg obj size on shard : 424B
Why is this happening? How can I avoid this effect? (Maybe by forcing some kind of sync with a command?)
Thanks!
PS: In case it is relevant, I'm using a testing environment setup, which uses a standalone mongod process to implement each shard. The config server uses a single-node replica set configuration.
count provides an estimated count, and may not be accurate. Use countDocuments to get an accurate count.
You can read the source of getShardDistribution by typing db.users.getShardDistribution in the shell. It seems to use information stored in the config database.
It is quite reasonable to expect that the statistics stored by the database aren't exactly accurate. This is because there is a cost to have them be up-to-date whenever any operation is performed anywhere in the cluster.
You seem to be looking at statistics at a point in time after some chunks have been copied from one shard to another and before these chunks are removed from the original shard. In this situation the data is stored twice in the cluster. The statistics aren't accurate in this case. To obtain an accurate count, use countDocuments.
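A quick way to see the difference in the shell, using the users collection from the question (the exact numbers depend on whether a migration is in flight):
// count() is metadata-based and can double-count documents in chunks that are
// still being migrated; countDocuments() aggregates the actual documents.
db.users.count()             // may temporarily report 994243 during a migration
db.users.countDocuments({})  // 600000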
I am a new user of MongoDB, and I am hoping to get pointed in the right direction. I will provide any further needed information I have missed as this question develops.
I am using a Perl program to upload documents to, and annotate/modify documents in, a MongoDB database via the MongoDB CPAN module. Indexes are being used (I believe) for this program, but the problem I have is that reading from MongoDB takes increasingly long. Based on mongotop, it takes ~500 ms to read and only 10-15 ms to write. After allowing the program to run for a considerable amount of time, the read time increases significantly, taking 3000+ ms after many hours of running.
Monitoring the program while it's running using top, Perl starts out at around 10-20% CPU usage and MongoDB starts at 70-90%. Within a few minutes Perl drops below 5% and MongoDB is at 90-95%. After running for a much longer period of time (12+ hours), MongoDB is at ~98% CPU usage while Perl is around 0.3%, and only pops up every 5-10 seconds in top.
Based on this trend, an indexing issue seems very likely but I am not sure how to check this, and all I know is that the appropriate indexes are at least made, but not necessarily being used.
Additional information:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 19209
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 19209
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
As the program runs, I see that the indexSize and dataSize change (via db.stats() in the Mongo shell), which makes me think the indexes are at least being used to some degree.
Is this something that could be affected by the power of my computer? I am under the impression that indexing should make a lot of this process very manageable for the computer.
That sounds a lot like it could be doing a collection scan rather than using the index. I.e. as your collection grows, the reads are getting slower.
If you're using the find method, you can run explain on the resulting cursor to get information on how the query would execute.
Here's a trivial example:
use MongoDB;
use JSON::MaybeXS;

# Connect to the test.foo collection and start from a clean slate.
my $coll = MongoDB->connect->ns("test.foo");
$coll->drop();
$coll->insert_one({x => $_}) for 1 .. 1000;

# Build a query cursor and ask the server how it would execute the query.
my $cursor = $coll->find({x => 42});
my $diag   = $cursor->explain;

# Pretty-print just the winning plan from the explain output.
my $json = JSON::MaybeXS->new(
    allow_blessed => 1, convert_blessed => 1, pretty => 1, canonical => 1
);
print $json->encode($diag->{queryPlanner}{winningPlan});
Looking at just the 'winningPlan' part of the output you can see 'COLLSCAN':
{
"direction" : "forward",
"filter" : {
"x" : {
"$eq" : 42
}
},
"stage" : "COLLSCAN"
}
Now I'll do it again, but first creating an index on 'x' before the insertions with $coll->indexes->create_one([x => 1]). You can see in the output that the query plan is now using the index (IXSCAN).
{
"inputStage" : {
"direction" : "forward",
"indexBounds" : {
"x" : [
"[42, 42]"
]
},
"indexName" : "x_1",
"indexVersion" : 2,
"isMultiKey" : false,
"isPartial" : false,
"isSparse" : false,
"isUnique" : false,
"keyPattern" : {
"x" : 1
},
"multiKeyPaths" : {
"x" : []
},
"stage" : "IXSCAN"
},
"stage" : "FETCH"
}
There's a lot more you can discover from the full 'explain' output. You can watch a great video from MongoDB World 2016 to learn more about it: Deciphering Explain Output.
I have a sharded cluster set up for my app, but unfortunately one of the shards is taking 17 GB of data while the others take 3 GB on average. What could be the issue?
sh.status() gives me huge output. Shared here: https://www.dropbox.com/s/qqsucbm6q9egbhf/shard.txt?dl=0
The shard distribution details for my problematic collection are below.
mongos> db.MyCollection_1_100000.getShardDistribution()
Shard shard_0 at shard_0/mongo-11.2816.mongodbdns.com:27000,mongo-12.2816.mongodbdns.com:27000,mongo-13.2816.mongodbdns.com:27000,mongo-3.2816.mongodbdns.com:27003
data : 143.86MiB docs : 281828 chunks : 4
estimated data per chunk : 35.96MiB
estimated docs per chunk : 70457
Shard shard_1 at shard_1/mongo-10.2816.mongodbdns.com:27000,mongo-11.2816.mongodbdns.com:27002,mongo-19.2816.mongodbdns.com:27001,mongo-9.2816.mongodbdns.com:27005
data : 107.66MiB docs : 211180 chunks : 3
estimated data per chunk : 35.88MiB
estimated docs per chunk : 70393
Shard shard_2 at shard_2/mongo-14.2816.mongodbdns.com:27000,mongo-3.2816.mongodbdns.com:27000,mongo-4.2816.mongodbdns.com:27000,mongo-6.2816.mongodbdns.com:27002
data : 107.55MiB docs : 210916 chunks : 3
estimated data per chunk : 35.85MiB
estimated docs per chunk : 70305
Shard shard_3 at shard_3/mongo-14.2816.mongodbdns.com:27004,mongo-18.2816.mongodbdns.com:27002,mongo-6.2816.mongodbdns.com:27000,mongo-8.2816.mongodbdns.com:27000
data : 107.99MiB docs : 211506 chunks : 3
estimated data per chunk : 35.99MiB
estimated docs per chunk : 70502
Shard shard_4 at shard_4/mongo-12.2816.mongodbdns.com:27001,mongo-13.2816.mongodbdns.com:27001,mongo-17.2816.mongodbdns.com:27002,mongo-6.2816.mongodbdns.com:27003
data : 107.92MiB docs : 211440 chunks : 3
estimated data per chunk : 35.97MiB
estimated docs per chunk : 70480
Shard shard_5 at shard_5/mongo-17.2816.mongodbdns.com:27001,mongo-18.2816.mongodbdns.com:27001,mongo-19.2816.mongodbdns.com:27000
data : 728.64MiB docs : 1423913 chunks : 4
estimated data per chunk : 182.16MiB
estimated docs per chunk : 355978
Shard shard_6 at shard_6/mongo-10.2816.mongodbdns.com:27001,mongo-14.2816.mongodbdns.com:27005,mongo-3.2816.mongodbdns.com:27001,mongo-8.2816.mongodbdns.com:27003
data : 107.52MiB docs : 211169 chunks : 3
estimated data per chunk : 35.84MiB
estimated docs per chunk : 70389
Shard shard_7 at shard_7/mongo-17.2816.mongodbdns.com:27000,mongo-18.2816.mongodbdns.com:27000,mongo-19.2816.mongodbdns.com:27003,mongo-9.2816.mongodbdns.com:27003
data : 107.87MiB docs : 211499 chunks : 3
estimated data per chunk : 35.95MiB
estimated docs per chunk : 70499
Shard shard_8 at shard_8/mongo-19.2816.mongodbdns.com:27002,mongo-4.2816.mongodbdns.com:27002,mongo-8.2816.mongodbdns.com:27001,mongo-9.2816.mongodbdns.com:27001
data : 107.83MiB docs : 211154 chunks : 3
estimated data per chunk : 35.94MiB
estimated docs per chunk : 70384
Shard shard_9 at shard_9/mongo-10.2816.mongodbdns.com:27002,mongo-11.2816.mongodbdns.com:27003,mongo-12.2816.mongodbdns.com:27002,mongo-13.2816.mongodbdns.com:27002
data : 107.84MiB docs : 211483 chunks : 3
estimated data per chunk : 35.94MiB
estimated docs per chunk : 70494
Totals
data : 1.69GiB docs : 3396088 chunks : 32
Shard shard_0 contains 8.29% data, 8.29% docs in cluster, avg obj size on shard : 535B
Shard shard_1 contains 6.2% data, 6.21% docs in cluster, avg obj size on shard : 534B
Shard shard_2 contains 6.2% data, 6.21% docs in cluster, avg obj size on shard : 534B
Shard shard_3 contains 6.22% data, 6.22% docs in cluster, avg obj size on shard : 535B
Shard shard_4 contains 6.22% data, 6.22% docs in cluster, avg obj size on shard : 535B
Shard shard_5 contains 42% data, 41.92% docs in cluster, avg obj size on shard : 536B
Shard shard_6 contains 6.19% data, 6.21% docs in cluster, avg obj size on shard : 533B
Shard shard_7 contains 6.21% data, 6.22% docs in cluster, avg obj size on shard : 534B
Shard shard_8 contains 6.21% data, 6.21% docs in cluster, avg obj size on shard : 535B
Shard shard_9 contains 6.21% data, 6.22% docs in cluster, avg obj size on shard : 534B
I have 150+ similar collections where I have divided data by user_id's
e.g. MyCollection_1_100000
MyCollection_100001_200000
MyCollection_200001_300000
Here I have divided the data for user ids ranging from 1 to 100000 into MyCollection_1_100000, and likewise for the other collections.
The shard key for all 150+ collections is a sequential number, but it is hashed. It was applied the following way:
db.MyCollection_1_100000.ensureIndex({"column": "hashed"})
sh.shardCollection("dbName.MyCollection_1_100000", { "column": "hashed" })
Please suggest corrective steps to get rid of the unbalanced shard problem.
Unsharded Collections
Shard 5 is the primary shard in your cluster, which means it will take all unsharded collections and therefore grow bigger in size. You should check for that; see here and the sketch below.
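A minimal way to check that, assuming your database is called dbName as in your shardCollection call (adjust the name): look up the primary shard and the sharded collections in the config database.
// Which shard is the primary for this database? (look at the "primary" field)
db.getSiblingDB("config").databases.find({ _id: "dbName" })
// Which of its collections are sharded? Anything not listed here (and not dropped)
// lives entirely on the primary shard.
db.getSiblingDB("config").collections.find({ _id: /^dbName\./ }, { _id: 1, dropped: 1 })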
Chunk Split
As Markus pointed out, distribution is done by chunk and not by document. Chunks may grow up to the defined chunk size; when they exceed it, they are split and redistributed. In your case there seems to be at least one shard holding one more chunk than the others. The reason could be either that the chunk has not yet reached the chunk size limit (check db.settings.find( { _id:"chunksize" } ) in the config database; the default size is 64 MB, see also here), or that the chunk cannot be split because the range it represents cannot be split further automatically. You should check the ranges using the sh.status(true) command (the ranges are omitted for some collections in the large output you posted).
However, you may split the chunk manually.
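For reference, a hedged sketch of both checks (run against the mongos; the namespace and the query value are placeholders):
// Check the configured chunk size (stored in the config database; default 64 MB).
db.getSiblingDB("config").settings.find({ _id: "chunksize" })
// Split the chunk containing a matching document at its median point.
sh.splitFind("dbName.MyCollection_1_100000", { column: 12345 })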
There is also a quite good answer on the dba forum.
Shard Key
If you have no unsharded collections, the problem may be the shard key itself. Mongo suggests using a shard key with high cardinality and a high degree of randomness. Without knowing the value range of your column field, I assume its cardinality is rather low (i.e. around 1000 distinct values) compared to, let's say, a timestamp (a distinct value for every single entry, making up a LOT of different values).
Further, the data should be evenly distributed. So let's say you have 10 possible column values, but there are a lot more entries with one particular value; all of those entries would be written to the same shard. For example:
entries.count({column: "A"}) = 10 -> shard 0
entries.count({column: "B"}) = 10 -> shard 1
...
entries.count({column: "F"}) = 100 -> shard 5
The sh.status() command should give you some more information about the chunks.
Using the object id or a timestamp - values that are monotonically increasing - will also lead to data being written to the same chunk.
So Mongo suggests using a compound key, which leads to a higher cardinality (value range of field1 x value range of field2). In your case you could combine the column with a timestamp, for example as sketched below.
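A hedged example of what that could look like when sharding a new collection (the collection and timestamp field names are illustrative, not from your schema):
// Compound shard key: column plus a timestamp gives a much larger value range
// than the column alone, so chunks can be split more evenly.
sh.shardCollection("dbName.MyNewCollection", { column: 1, createdAt: 1 })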
But either way, you're out of luck for your current installation, as you can not change the shard key afterwards.
DB Design
The verbose output you posted also indicates that you have several dbs/collections with the same schema or purpose, which looks to me like a kind of manual partitioning. Is there a particular reason for this? This could also affect the distribution of data in the cluster, as every collection starts to be filled on the primary node. There is at least one collection with just a single chunk on the primary, and some with 3 or 4 chunks in total, all having at least one chunk on the primary (i.e. the z_best_times_*).
Preferably you should have only a single collection for one purpose, and probably use a compound shard key (i.e. a hashed timestamp in addition).
I have a single-host database which grew to 95% of disk space while I was not watching. To remedy the situation I created a process that automatically removes old records from the biggest collection, so data usage fell to about 40% of disk space. I figured I was safe as long as the data size didn't grow near the size of the preallocated files, but after a week I was proven wrong:
Wed Jan 23 18:19:22 [FileAllocator] allocating new datafile /var/lib/mongodb/xxx.101, filling with zeroes...
Wed Jan 23 18:25:11 [FileAllocator] done allocating datafile /var/lib/mongodb/xxx.101, size: 2047MB, took 347.8 secs
Wed Jan 23 18:25:14 [conn4243] serverStatus was very slow: { after basic: 0, middle of mem: 590, after mem: 590, after connections: 590, after extra info: 970, after counters: 970, after repl: 970, after asserts: 970, after dur: 1800, at end: 1800 }
This is the output of db.stats() (note that the numbers are in MB because of the scale argument):
> db.stats(1024*1024)
{
"db" : "xxx",
"collections" : 47,
"objects" : 189307130,
"avgObjSize" : 509.94713418348266,
"dataSize" : 92064,
"storageSize" : 131763,
"numExtents" : 257,
"indexes" : 78,
"indexSize" : 29078,
"fileSize" : 200543,
"nsSizeMB" : 16,
"ok" : 1
}
Question: What can I do to stop MongoDB from allocating new datafiles?
Running repair is difficult because I would have to install a new disk. Would running compact help? If yes, should I be running it regularly, and how can I tell when I should run it?
UPDATE: I guess I am missing something fundamental here... Could someone please elaborate on the connection between data files, extents, collections and databases, and how space is allocated when needed?
Upgrade to 2.2.2 - 2.2.0 has an idempotency bug in replication and is no longer recommended for production.
See here for general info http://docs.mongodb.org/manual/faq/storage/#faq-disk-size
The only way to recover space from MongoDB is either to sync a new node over the network - in which case the documents are copied over to the new file system and stored anew without fragmentation - or to use the repair command, but for that you need double the disk space that you are currently using. The data files are copied, defragmented, compacted and copied back over the originals. The compact command is badly named and only defragments - it doesn't recover disk space back from Mongo.
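For reference, the corresponding shell invocations (the collection name is a placeholder, and both operations block the database, so plan a maintenance window):
// Rewrites and compacts all data files for the current database; needs the extra
// free disk space mentioned above.
db.repairDatabase()
// Defragments a single collection in place; does not return space to the OS.
db.runCommand({ compact: "mycollection" })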
Going forward, use the usePowerOf2Sizes option (new in 2.2.x): http://docs.mongodb.org/manual/reference/command/collMod/
If you use that option and allocate, say, an 800-byte document, 1024 bytes will be allocated on disk. If you then delete that doc and insert a new one of, say, 900 bytes, the new doc can fit in the 1024-byte space. Without this option enabled, the 800-byte doc might only have 850 bytes on disk - so when it's deleted and the 900-byte doc is inserted, new space has to be allocated. And if that is then deleted you end up with two free spaces - 850 bytes and 950 bytes - which are never joined (unless compact or repair is used); so then insert a 1000-byte doc and you need to allocate yet another chunk of disk. usePowerOf2Sizes helps this situation a lot by using standard bucket sizes.
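Enabling it is a per-collection collMod command (the collection name is a placeholder):
// New record allocations for this collection will use power-of-two sizes,
// so freed space is more likely to be reusable by later inserts.
db.runCommand({ collMod: "mycollection", usePowerOf2Sizes: true })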