How to reset MongoDB's collection statistics? - mongodb

Today I've been working on a performance test with MongoDB. Once I managed to use all the left space of my hard disk so the test was halted at the middle. So I removed some of the files and restarted the test after a db.dropDatabase();. But I noticed that the results of db.collection.stats(); seems to be wrong now.
My question is, how can I make MongoDB reset / recalculate statistics of a collection?

Sounds like mongodb is keeping space for the data and indexes it "knows" you will need when you run the test again, even though there is no data there at the moment.
What files did you delete? If you really don't need the data, you could stop mongod, and delete the other files corresponding to the database - but this is only safe if you are running in a test environment, and not sharing your database.

I think you're looking for db.collectionName.drop() function, then to reimport your collection using mongoimport --db dbName --collection collectionName --file fileName to view whether or not those values are correct, quick guess though is that they are correct.

Related

Moving existing MongoDb deployment to directoryperdb by moving files instead of dump/restore

I'm currently trying to move my existing MongoDb deployment to the --directoryperdb option to make use of different, mounted volumes. In the docs this is achieved via mongodump and restore, however on a large database with over 50GB compressed data this takes a really long time and indices need to be completely rebuilt.
Is there a way to move the WiredTiger files inside the current /data/db path, just how you would when just changing the db path? I've tried copying the corresponding collection- and index- files into their subdirectory, which doesn't work. Creating dummy collections and replacing them with the old files and then running --repair works but I don't know how long it takes since I only tested it with a few docs large collection, this also seems very hacky with a lot of things that could go wrong (for example data loss).
Any advice on how to do this, or is this something that simply should not be done?

mongoimport without dropping the data first

I reset my database every night with a mongoimport command. Unfortunately, I understand that it drops the database first then fills it again.
This means that my database is being queried while half-filled. Is there a way to make the mongoimport atomic ? This would be achieved by first filling another collection, dropping the first then renaming the second.
Is that a builtin feature of mongoimport ?
Thanks,
It's unclear what behaviour you want from your nightly process.
If your nightly process is responsible for creating a new dataset then dropping everything first makes sense. But if your nightly process is responsible for adding to an existing dataset then that might suggest using mongorestore (without --drop) since mongorestore's behaviour is:
mongorestore can create a new database or add data to an existing database. However, mongorestore performs inserts only and does not perform updates. That is, if restoring documents to an existing database and collection and existing documents have the same value _id field as the to-be-restored documents, mongorestore will not overwrite those documents.
However, those concerns seem to be secondary to your need to import / restore into your database while it is still in use. I don't think either mongoimport or mongorestore are viable 'write mechanisms' for use when your database is online and available for reads. From your question, you are clearly aware that issues can arise from this but there is no Mongo feature to resolve this for you. You can either:
Take your system offline during the mongoimport or mongorestore and then bring it backonline once that process is complete and verified
Use mongoimport or mongorestore to create a side-by-side database and then once this database is ready switch your application to read from that database. This is a variant of a Blue/Green or A/B deployment model.

Mongodb back up old data then remove it from database periodically

I have a project that currently stores GPS tracking data in MongoDB, and it grows really fast. In order to slow it down, I want to automatically backup old data then remove it from the database monthly. The old data must older than 3 months.
Is there any solution to accomplish that?
This question was partially answered earlier.
After using this approach, you can backup the collection with your old data using mongodump for example:
mongodump -host yourhost [-u user] -d yourdb -c collOldDocs

Copying a MongoDB record for record

We have a MongoDB sitting at 600GB. We've deleted a lot of documents, and in the hopes of shrinking it, we repaired it onto a 2TB drive.
It ran for hours, eventually running out of the 2TB space. When I looked at the repair directory, it had created way more files than the original database??
Anyway, I'm trying to look for alternative options. My first thought was to create a new MongoDB, and copy each row from the old to the new. Is it possible to do this, and what's the fastest way?
I have a lot of success in copying database using the db.copyDatabase command:
link to mongodb copyDatabase
I have also used MongoVUE, which is a software that makes it easy to copy databases from one location to another - MonogoVUE (which is jus ta graphical interface on top of monogo).
If you have no luck with copyDatabase, I can suggest you try to dump and restore the database to an external file, something like mongodump or lvcreate
Here is a full read on backup and restore which should allow you to copy the database easily: http://docs.mongodb.org/manual/core/backups/

Export from mongo database file to bson

I have a mongo database db.ns, db.0, db.1, ... db.7
Accidentally I remove all the data from a collection, but in the database files (explorer with vim) it's all (or part of) the data.
After trying to recover the data moving to another mongodb instance, or mongod --restore, also, I try with the mongodump, but the collection appears empty.
I try to recover from scratch, directly from the files. I try with bsondump for each one, and for a single file (cat db.ns db.1 ... > bigDB) but nothing.
I don't know what other ways are from recover the data from a mongo database file.
Any suggestion?? Thx!!!
[SOLVED]
I will try to explain what I do to "solved" the problem.
First. Theory.
In this SlideShare, can see a little of how files MongoDB database work.
http://www.slideshare.net/mdirolf/inside-mongodb-the-internals-of-an-opensource-database
Options:
When you remove accidentally a collection:
the first thing that you have to do is quickly copy all the database (normally in /data/db or /var/lib/mongodb) and stop the service.
Remove the journal directory try to recover from this copy and pray ;D
You can see more about that, here:
mongodb recovery removed records
In my case, this did not work for me.
In Journaly case, mongo no update its database files directly only their indexes.
So that, you can access to the files (appointed as database.ns, database.0, database.1 ...) and try to recover this.
These files are as cleaved BSONs and binary. So, you can open, and see all the information
In my case, I create a easy function in PHP that first read the file, and explode the file in smallers files.
Before, takes one to one and apply some regular expresions to remove Hexadecimal values, explode the info into the registers (you can see the "_id" key to do that) and do some others task to clean the info.
And finally, I have to process manually all the preprocessed info to obtain all the information.
I think, I have lost, at least, the 15-25% of the information. But I prefer to think that I have recovered the 75% of the lost info.
Caution:
This is not a easy and secure way to solve this problem. In my case, the db only recive information, and not modify or update this.
With this method, a lot of information will be lost, Mongo IDs, integers, dates, can't be recovered.
The proccess is 100% manually, you can spend your time on automating certain tasks, but will depend on your database structure.