Machine hangs when MongoDB db.copyDatabase(...) takes all available RAM - mongodb

When I try to copy a database from one MongoDB server to another (about 100 GB), the mongod process takes 99% of the available RAM (Windows 64-bit, 16 GB). As a result the system becomes very slow and sometimes unstable.
Is there any way to avoid it?
MongoDB 2.0.6

Albert.
MongoDB is very much an "in RAM" application. Mongo memory-maps all of your database files, but normally only the most recently used data (called your working set) is resident in RAM, and mongo pages in any data it needs that isn't already there. So normally mongo keeps only as much in RAM as it needs; however, an operation like a database copy touches all of the data, hence mongod consuming all your RAM.
There is no ideal solution to this, but if you desperately need one you could use WSRM (http://technet.microsoft.com/en-us/library/cc732553.aspx) to try to limit the amount of RAM consumed by the process. This will make the copy take longer and may cause other issues.
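For context, a copy like the one described is started from the mongo shell on the destination server; this is a minimal sketch in which the database name and source host are placeholders:

```javascript
// Run from the mongo shell on the DESTINATION server (MongoDB 2.x syntax).
// "analytics" and "source.example.com:27017" are placeholder names.
db.copyDatabase("analytics", "analytics", "source.example.com:27017");
```

Because this pulls every document across, it is exactly the kind of operation that forces the whole data set through memory on both ends.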

Related

How do I have Mongo 3.0 / WiredTiger load my whole database into RAM?

I have a static database (that will never even receive a write) of around 5 GB, while my server RAM is 30 GB. I'm focusing on returning complicated aggregations to the user as fast as possible, so I don't see a reason why I shouldn't have (a) the indexes and (b) the entire dataset stored entirely in RAM, and (c) automatically stored there whenever the Mongo server boots up. Currently my main bottleneck is running group commands to find unique elements out of millions of rows.
My question is, how can I do either (a), (b), or (c) while running on the new Mongo/WiredTiger? I know the "touch" command doesn't work with WiredTiger, so most information on the Internet seems out of date. Are (a), (b), or (c) already done automatically? Should I not be doing each of these steps with this use case?
Normally you shouldn't have to do anything. Disk pages are loaded into RAM on request and stay there. If there is no more free memory, the oldest unused pages are evicted so that other programs that need the memory can use it.
If you must have your whole db in RAM, you could use a ramdisk and tell mongo to use it as its storage device.
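One way to set that up, sketched for Linux with assumed paths and sizes (note that data on a tmpfs vanishes on reboot, so keep a durable copy elsewhere):

```shell
# Sketch: mount an 8 GB tmpfs ramdisk and point mongod at it.
# The mount point and size are assumptions for illustration.
sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=8g tmpfs /mnt/ramdisk
mongod --storageEngine wiredTiger --dbpath /mnt/ramdisk
```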
I would recommend that you revise your indexes and/or data structures. Having the correct ones can make a huge difference in performance; we are talking seconds vs. hours.
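For the "unique elements" bottleneck mentioned in the question, a single-field index often helps, since distinct() can then be answered largely from the index rather than by scanning documents. A sketch with a placeholder collection and field name:

```javascript
// Sketch: index the field you deduplicate on
// ("events" and "userId" are placeholder names).
db.events.createIndex({ userId: 1 });
// distinct() can now walk the index instead of scanning every document.
db.events.distinct("userId");
```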

mongod clean memory used in ram

I have a huge amount of data in my MongoDB: it's filled with tweets (50 GB) and my RAM is 8 GB. The query used to retrieve all tweets, and MongoDB would start filling RAM; once it reached 8 GB it started paging to disk, and that's where it gets really slow. So I changed the query from skipping to using indexes. Now I have indexes, I query only 8 GB into my program, save the id of the last tweet used in a file, and the program stops. On restart, the program reads that tweet id back from the file. But the mongod server still occupies RAM with the first 8 GB, which will no longer be used, because I have an index to the last tweet. How can I clear the MongoDB server's memory without restarting it?
(running on Windows)
I am a bit confused by your logic here.
So I changed the query from skipping to using indexes. Now I have indexes, I query only 8 GB into my program, save the id of the last tweet used in a file, and the program stops.
Using ranged queries will not reduce the amount of data you have to page in (in fact the index might make it slightly worse); it merely makes the query faster server-side by using an index instead of a huge skip (like a 42K+ row skip). If you are doing the same thing as that skip() via an index (without a covered index), you are still paging in exactly the same data.
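The ranged-query pattern referred to here looks roughly like this (the collection name and checkpoint variable are placeholders):

```javascript
// Sketch: resume a scan from the last processed _id instead of skip().
// "lastId" would be read back from the checkpoint file the question describes.
db.tweets.find({ _id: { $gt: lastId } })
         .sort({ _id: 1 })
         .limit(1000);
```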
It is slow because of memory mapping and your working set: you have more data than RAM, and you are actively using more of that data than fits in RAM, so you are probably page faulting all the time.
Restarting the program will not solve this, nor will clearing its data on the OS side (with a restart or a specific command), because of your queries. You probably need to either:
Think about your queries so that your working set is more in line to your memory
Or shard your data across many servers so that you don't have to build up your primary server
Or get a bigger primary server (moar RAM!!!!!)
Edit
The LRU of your OS should already be swapping out old data, since MongoDB is using its full allocation; if that 8 GB isn't being swapped out, it's because your working set occupies the full 8 GB (most likely with some swap on top).

mongodb memory usage is going high even if only insertions are made

I am using MongoDB only for inserting documents. There are no indexes on the collection I use, but I see that the memory used by MongoDB keeps growing. The machine has 20 GB of RAM, which is completely used. I would like to know the reason for this, and whether it is normal.
There is an excellent discussion of how MongoDB uses storage and memory here:
http://docs.mongodb.org/manual/faq/storage/
Specific to your case: Mongo memory-maps the files that data and indexes live in (the ones inside the /data/db directory by default). That means the files are mapped into the OS's virtual address space, and whenever MongoDB accesses a page (any part of a file), that page is pulled into RAM. It stays there until all of the RAM the OS makes available to the mongod process is used, at which point least-recently-used pages are swapped out.
You are inserting data, and therefore writing to the data files; the pages you write to need to be in RAM (mongo writes to files, but since they are memory mapped it interacts with memory as if it were disk storage). If mongod is using 20 GB+ of RAM, that means your data plus your indexes (plus some overhead) come to 20 GB or more.
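If you want to confirm what mongod itself reports, the shell exposes its memory counters; a minimal check:

```javascript
// Sketch: resident, virtual, and mapped sizes (in MB) as reported by mongod.
db.serverStatus().mem;
```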

MongoDB consumes a lot of memory

I have been at war with MongoDB for more than a month, and so far I'm losing =] ...
Battle 1. Battle 2.
And now a new problem: again, not enough memory.
Initially I solved this by simply upgrading my VPS plan to get more memory, and then by setting journal = false. But I have now reached the top plan, and increasing the memory further is not possible.
My database is short about 4 GB of memory.
When I was choosing a database for the project, nothing I read said that MongoDB needs this much memory. With about 10 million records MongoDB is 4 GB short, while my MySQL database easily copes with 10 million records in 1.4 GB of memory.
As I understand it, the problem is the large number of indexed fields. But since I cannot start the database, I cannot remove them. I needed them in the early stages of development; now they are not important to me.
Please tell me, can I remove them somehow?
I have a dump of the database: the complete /data/db folder.
The database does not start on my PC with 4 GB of memory, nor on a VPS with the same 4 GB.
As an alternative, I am thinking of taking a trial period on some VPS/VDS to run mongo and delete the indexes.
Do you know a hosting provider with a trial period and 6 GB of memory?
Or, if there is another alternative, what would it be?
The issue has very little to do with the size of your data set. MongoDB uses memory-mapped files for its storage engine, so it swaps pages of hot data into memory when it can, and it does so fairly aggressively (or more accurately, the OS memory management does).
Basically it uses as much memory as is available to it and there's very little you can do to avoid it. All data pages (be it actual data or indexes) that are accessed during operation will be swapped into memory if there is space available.
There are plenty of references to this on the internet and on mongodb.org by the way. Saying it isn't mentioned anywhere isn't really true.
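As for removing the indexes: once you do manage to start mongod somewhere with enough RAM, dropping them is a one-liner per collection. A sketch with a placeholder collection name:

```javascript
// Sketch: inspect, then drop all indexes except the mandatory _id index.
// "records" is a placeholder collection name.
db.records.getIndexes();
db.records.dropIndexes();
```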

Fastest way to update large amount of data

I have millions of rows in a mongo collection and need to update all of them.
I've written a mongo shell (JS) script like this:
db.Test.find().forEach(function(row) {
    // change data, then db.Test.save(row)
});
which (I guess) should be faster than e.g. updating via a language driver, due to the possible latency between the web server and the mongo server, and simply because the driver is "something on top" while mongo is "something in the basement".
Even so, it updates approximately 2,100 records/sec on a quad-core 2.27 GHz processor with 4 GB RAM.
Since mongoimport can handle around 40k records/sec (on the same machine), I don't think the speed above counts as "fast".
Is there any faster way?
There are two possible limiting factors here:
Single write lock: MongoDB has only one global write lock; this may be the determining factor.
Disk access: if the data being updated is not already in memory, it must be loaded from disk, which causes a slow-down.
Is there any faster way?
The answer here depends on the bottleneck. Try running iostat and mongostat to see where the bottleneck lies. If iostat shows high disk IO, then you're being held back by the disk. If mongostat shows a high "lock%" then you've maxed out access to the global write lock.
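A minimal way to run the two tools side by side (assuming default ports and standard Linux iostat flags):

```shell
# Sample extended disk statistics and mongod statistics once per second.
iostat -x 1
mongostat 1
```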
If you've maxed out IO, there is no simple code fix. If you've maxed out the write lock, there is no simple code fix. If neither of these is an issue, it may be worth trying another driver.
Since mongoimport can handle around 40k records/sec (on the same machine)
This may not be a fair comparison: many people run mongoimport against a fresh database, and the data is generally just loaded straight into RAM.
I would start by checking iostat / mongostat.
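If the loop itself turns out not to be IO- or lock-bound, one variation worth trying is sending a targeted $set per document instead of rewriting the whole document with save(), so less data crosses the wire and less of each document is dirtied. A sketch, with the updated field as a placeholder:

```javascript
// Sketch: update only the changed field rather than saving the full row.
// "status" and its value stand in for whatever "change data" actually does.
db.Test.find().forEach(function(row) {
    db.Test.update({ _id: row._id }, { $set: { status: "processed" } });
});
```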