Export from mongo database file to bson - mongodb

I have a MongoDB database made up of the files db.ns, db.0, db.1, ... db.7
I accidentally removed all the data from a collection, but the database files (explored with vim) still contain all, or at least part of, the data.
I tried to recover the data by moving the files to another mongod instance and by running a repair, and I also tried mongodump, but the collection appears empty.
So I tried to recover from scratch, directly from the files: bsondump on each file, and on a single concatenated file (cat db.ns db.1 ... > bigDB), but nothing.
I don't know what other ways there are to recover the data from MongoDB database files.
Any suggestions? Thanks!

[SOLVED]
I will try to explain what I did to "solve" the problem.
First, some theory.
This SlideShare deck gives an idea of how MongoDB's database files work:
http://www.slideshare.net/mdirolf/inside-mongodb-the-internals-of-an-opensource-database
Options:
When you accidentally remove a collection:
The first thing to do is quickly copy the whole database directory (normally /data/db or /var/lib/mongodb) and stop the service (see the sketch below).
Remove the journal directory, try to recover from this copy, and pray ;D
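A minimal sketch of that first step (the data path and service name are the common Linux defaults; adjust them to your installation):
sudo cp -a /var/lib/mongodb /var/backups/mongodb-copy   # grab a copy of the raw files first
sudo service mongodb stop                               # then stop the service so nothing else gets overwritten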
You can read more about that here:
mongodb recovery removed records
In my case, this did not work.
In the journaling case, Mongo does not update its database files directly, only their indexes.
Because of that, you can access the files (named database.ns, database.0, database.1, ...) and try to recover the data from them.
These files are fragmented BSON mixed with binary data, so you can open them and see the information.
In my case, I wrote a simple PHP function that first reads a file and splits it into smaller files.
Then it takes them one by one, applies some regular expressions to strip hexadecimal values, splits the text into records (you can use the "_id" key as a separator for that), and does some other cleanup of the data.
Finally, I had to process all the preprocessed text manually to extract the information.
I think I lost at least 15-25% of the information, but I prefer to think that I recovered 75% of the lost data.
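I did it in PHP, but a very rough shell approximation of the splitting step looks like this (file names are examples; the output still needs heavy manual cleanup):
# Pull printable runs out of the raw extents, then cut the text into one chunk per
# probable document, using the "_id" field name as a separator.
strings -n 4 database.0 database.1 > raw_text.txt
csplit -z -f record_ raw_text.txt '/_id/' '{*}'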
Caution:
This is neither an easy nor a safe way to solve this problem. In my case, the database only received inserts; documents were never modified or updated.
With this method a lot of information will be lost: Mongo ObjectIds, integers, and dates cannot be recovered.
The process is 100% manual; you can spend time automating certain tasks, but that will depend on your database structure.

Related

Moving existing MongoDb deployment to directoryperdb by moving files instead of dump/restore

I'm currently trying to move my existing MongoDB deployment to the --directoryperdb option to make use of different mounted volumes. In the docs this is achieved via mongodump and mongorestore; however, on a large database with over 50GB of compressed data this takes a really long time, and the indexes need to be completely rebuilt.
Is there a way to move the WiredTiger files inside the current /data/db path, just as you would when simply changing the db path? I've tried copying the corresponding collection- and index- files into their subdirectory, which doesn't work. Creating dummy collections and replacing them with the old files and then running --repair works, but I don't know how long it takes, since I only tested it with a collection of a few documents; it also seems very hacky, with a lot of things that could go wrong (for example, data loss).
Any advice on how to do this, or is this something that simply should not be done?
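For reference, the documented dump/restore route mentioned above looks roughly like this (paths are placeholders; this is the slow path the question is trying to avoid):
mongodump --out /backup/full-dump
# edit mongod.conf: set storage.directoryPerDB: true and point storage.dbPath at an empty directory,
# then restart mongod and restore everything:
mongorestore /backup/full-dump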

Copying a MongoDB record by record

We have a MongoDB sitting at 600GB. We've deleted a lot of documents, and in the hopes of shrinking it, we repaired it onto a 2TB drive.
It ran for hours, eventually running out of the 2TB of space. When I looked at the repair directory, it had created far more files than the original database had?
Anyway, I'm looking for alternative options. My first thought was to create a new MongoDB and copy each document from the old one to the new. Is it possible to do this, and what's the fastest way?
I have had a lot of success copying databases using the db.copyDatabase command:
link to mongodb copyDatabase
I have also used MongoVUE, a piece of software that makes it easy to copy databases from one location to another (it is just a graphical interface on top of mongo).
If you have no luck with copyDatabase, I suggest you try dumping and restoring the database to external files, with something like mongodump or an LVM snapshot (lvcreate).
Here is a full read on backup and restore which should allow you to copy the database easily: http://docs.mongodb.org/manual/core/backups/
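A minimal sketch of the dump/restore route (host names, the database name, and paths are placeholders):
mongodump --host old-server --db mydb --out /backup/dump
mongorestore --host new-server --db mydb /backup/dump/mydb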

mongodb create database file automatically after certain period

The MongoDB database automatically generates files after a certain period, as follows:
Doc.0
Doc.1
Doc.2
Doc.3
Doc.4
but the Doc.ns file is never regenerated like the files above.
I'm not sure exactly what, if anything, you are specifying as a problem. This is expected behavior. MongoDB allocates new data files as the data grows. The .ns file, which stores namespace information, does not grow like data files, and shouldn't need to.

MongoDB, modifydate of data files not updated (Windows)

I just noticed something strange with MongoDb physical files on Windows.
When I update/insert data in my MongoDB database, the modification date of the related *.0 files in the 'data' folder is not updated.
I understand that the size is pre-allocated, so it stays the same, but why is the modification date not updated as well?
This is quite an issue for me, because I store the Mongo files in Google Drive, so nothing gets synced due to this odd behavior.
Does anyone have an explanation ?
Thank you
As an answer to my own question, I came up with the solution of using **mongodump** to keep my data up to date on Google Drive.
I scheduled a task to dump my Mongo database frequently into the same synced folder, so that folder is backed up; it just needs some extra work with **mongorestore** on the other machines, but those are quite easy steps.
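On Windows, that scheduled task boils down to running something like this (database name and paths are placeholders):
mongodump --db mydb --out "C:\Users\me\Google Drive\mongo-dump"
and on the other machines:
mongorestore --db mydb "C:\Users\me\Google Drive\mongo-dump\mydb"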

Repair Corrupt database postgresql

I have multiple errors with my postgresql db, which resulted after a power surge:
I cannot access most tables in my database. When I try, for example, select * from ac_cash_collection, I get the following error:
ERROR: missing chunk number 0 for toast value 118486855 in pg_toast_2619
When I try pg_dump, I get the following error:
Error message from server: ERROR: relation "public.st_stock_item_newlist" does not exist
pg_dump: The command was: LOCK TABLE public.st_stock_item_newlist IN ACCESS SHARE MODE
I went ahead and tried to reindex the whole database; I left it running overnight, and in the morning it had made no progress, so I had to cancel it.
I need to fix this as soon as possible. Please help.
Before you do anything else, read http://wiki.postgresql.org/wiki/Corruption and act on its instructions. Failure to do so risks making the problem worse.
There are two configuration parameters listed in the Fine Manual that might be of use: ignore_system_indexes and zero_damaged_pages. I have never used them, but I would if I were desperate ...
I don't know if they help with TOAST tables. In any case, if setting them makes your database(s) usable again, I would {backup + drop + restore} to get all tables and catalogs into newborn shape again. Success!
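If you do experiment with them, a hedged example of enabling one of them just for a dump session (superuser required; the database name is a placeholder, the table name is taken from the question):
# zero_damaged_pages silently discards damaged pages, so only do this on a copy of the cluster
PGOPTIONS='-c zero_damaged_pages=on' pg_dump -t ac_cash_collection mydb > ac_cash_collection.sql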
If you have backups, just restore from them.
If not - you've just learned why you need regular backups. There's nothing PostgreSQL can do if hardware misbehaves.
In addition, if you ever find yourself in this situation again, first stop PostgreSQL and take a complete file-level backup of everything - all tablespaces, WAL etc. That way you have a known starting point.
So, if you still want to recover some data:
Try dumping individual tables. Get what you can this way.
Drop indexes if they cause problems
Dump sections of tables (id=0..9999, 10000..19999, etc.) - that way you can identify where some rows may be corrupted and dump ever-smaller sections to recover what's still good (see the sketch after this list).
Try dumping just certain columns - large text values are stored out-of-line (in toast tables) so avoiding them might get the rest of your data out.
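A sketch of those dump strategies (the database name and the id/column names are placeholders; the table name is from the question):
# dump a single table
pg_dump -t ac_cash_collection mydb > ac_cash_collection.sql
# dump a table in id ranges, narrowing down around the corrupted rows
psql mydb -c "\copy (SELECT * FROM ac_cash_collection WHERE id BETWEEN 0 AND 9999) TO 'chunk_0.csv' CSV"
# dump only selected columns to skip the out-of-line (TOASTed) values
psql mydb -c "\copy (SELECT id, created_at FROM ac_cash_collection) TO 'partial.csv' CSV"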
If you've got corrupted system tables then you're getting into a lot of work.
That's a lot of work, and then you'll need to go through and audit what you've recovered and try to figure out what's missing/incorrect.
There are more things you can do (creating empty blocks in some cases can let you dump partial data) but they're all more complicated and fiddly and unless the data is particularly valuable not worth the effort.
Key message to take away from this - make sure you take regular backups, and make sure they work.
Before you do ANYTHING ELSE, take a complete file-system-level copy of the damaged database.
http://wiki.postgresql.org/wiki/Corruption
Failure to do so destroys evidence about what caused the corruption, and means that if your repair efforts go badly and make things worse you can't undo them.
Copy it now!
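A minimal sketch of that file-level copy (the data directory path follows the placeholder style used below; adjust it to your installation):
sudo service postgresql stop
sudo cp -a /var/lib/postgresql/xx/main /safe-location/main-before-repair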
If only a few specific files are corrupted, the following tricks might help.
Restore an older dump on a different node or a second installation.
Copy the required files from the RESTORED/second installation to the FAILED node.
Stop and start PostgreSQL.
From Today's experience!
Error message from server: ERROR: could not read block 226448 in file "base/12345/12345.1": Input/output error
First, try to copy the damaged file aside (this will probably fail because of the I/O error):
cp base/12345/12345.1 /root/backup/12345.1-orig
Then try mv base/12345/12345.1 /root/backup/12345.1-orig, expecting it to finish. Otherwise, do rm -rf base/12345/12345.1 /root/backup/12345.1-orig.
Finally, the magic of tar (if the tar below completes, you are in luck!):
tar -zcvf my_backup.tar.gz /var/lib/postgresql/xx/main/xx
Extract the corrupted file from the tar archive.
Replace it in its original location, base/12345/12345.1.
Stop and start PostgreSQL.
IMPORTANT: Please try googling, and do try VACUUM, REINDEX, and disk checks like fsck, etc., before getting to this stage.
Also, always take a filesystem backup before doing any trial-and-error method :)