Create mongo backup/restore without indexes

Create mongo backup/restore without indexes - mongodb

Im wondering how can I create a mongodump/mongorestore without backing up, restoring the indexes?
and how to incrementally restore a mongo db without restoring the indexes?

The mongodump utility creates a binary export of data from MongoDB and saves index definitions and collection options in a metadata.json associated with each database dumped. The index details do not take any significant space in your backup, and will normally be used by mongorestore to re-ensure indexes after each data for each collection is imported from a dump.
If you want to avoid creating any new secondary indexes after the restore completes, mongorestore has a --noIndexRestore option.
Note: The default _id index is required, and always created.
incrementally restore a mongo db without restoring the indexes?
The option for --noIndexRestore applies whether or not you are restoring into an existing database. If you mongorestore into an existing database with indexes using the --noIndexRestore option, no new index definitions will be added but existing indexes will still be updated as data is inserted.
Incremental backup & restore is really a separate question unless you have a simplistic use case: inserting new documents from successive dumps.
As at MongoDB 2.6, the mongorestore utility only inserts documents (i.e. there is no option for updates/upserts). You can use mongorestore to insert multiple dumps into an existing collection, but any documents causing duplicate key exceptions (eg. _id) will be skipped.
I would normally expect that an incremental backup & restore implies taking a delta of changes (all inserts/updates/deletions since a prior backup) and being able to re-apply those to an older copy of the same data. To achieve an incremental backup, you need a history of changes to data, which in MongoDB's case would be provided by the replica set operation log (oplog).

sudo mongorestore --db dbName ./dumpPath/ --noIndexRestore

Related

How to avoid auto-updating of MongoDB ObjectIds after restoring the dump using mongorestore command?

I am trying to restore my database dump using mongorestore command. And got to know that some ObjectIds are updating automatically after restoring and am facing data dependency issues with that. Does ObjectIds updates itself while restoring? If yes, is there any alternative to stop that issue?

MongoDB: restore a collection in the mongo shell

I am trying to import/restore a single collection from within MongoDB (i.e. mongorestore cannot be accessed, I think ...?).
Is it possible? What is the command? Ideally, I'd like to include indexes as well. The backup has been produced by mongodump.
Specifically, I am using the IntelliShell from the excellent MongoChef. I perform other commands in this as well, such as renaming existing collections first.

MongoDB 2.2: why didn't replication catch up a collection following a dump/restore?

We have a three-server replicaset running MongoDB 2.2 on Ubuntu 10.04, and recently had to upgrade the hard drive for each server where one particular database resides. This database contains log information for web service requests, where they write to collections in hourly buckets using the current timestamp to determine the name, e.g. log_yyyymmddhh.
I performed this process:
backup the database on the primary server with mongodump --db log_db
take a secondary server offline, replace the disk
bring the secondary server up in standalone mode (i.e. comment out the replSet entry
in /etc/mongodb.conf before starting the service)
restore the database on the secondary server with mongorestore --drop --db log_db
add the secondary server back into the replicaset and bring it online,
letting replication catch up the hourly buckets that were updated/created
while it had been offline
Everything seemed to go as expected, except that the collection which was the current bucket at the time of the backup was not brought up to date by replication. I had to manually copy that collection over by hand to get it up to date. Note that collections which were created after the backup were synched just fine.
What did I miss in this process that caused MongoDB not to get things back in synch for that one collection? I assume something got out of whack with regard to the oplog?
Edit 1:
The oplog on the primary showed that its earliest timestamp went back a couple of days, so there should have been plenty of space to maintain transactions for a few hours (which was the time the secondary was offline).
Edit 2:
Our MongoDB installation uses two disk partitions: /dev/sda1 and /dev/sdb1. The primary MongoDB directory /var/lib/mongodb/ is on /dev/sda1, and holds several databases, while the log database resides by itself on /dev/sdb1. There's a sym link /var/lib/mongodb/log_db which points to a directory on /dev/sdb1. Since the log db was getting full, we needed to upgrade the disk for /dev/sdb1.

You should be using mongodump with the --oplog option. Running a full database backup with mongodump on a replicaset that is updating collections at the same time may not leave you with a consistent backup. This becomes worse with larger databases, more collections and more frequent updates/inserts/deletes.
From the documentation for your version (2.2) of MongoDB (it's the same for 2.6 but just to be as accurate as possible):
--oplog
Use this option to ensure that mongodump creates a dump of the
database that includes an oplog, to create a point-in-time snapshot of
the state of a mongod instance. To restore to a specific point-in-time
backup, use the output created with this option in conjunction with
mongorestore --oplogReplay.
Without --oplog, if there are write operations during the dump
operation, the dump will not reflect a single moment in time. Changes
made to the database during the update process can affect the output
of the backup.
http://docs.mongodb.org/v2.2/reference/mongodump/
This is not covered well in most MongoDB tutorials around backups and restores. Generally you are better off if you can perform a live snapshot of the storage volume your database resides on (assuming your storage solution has a live snapshot ability compatible with MongoDB). Failing that, your next best bet is taking a secondary offline and then performing a snapshot or backup of the database files. Mongodump on a live database is increasingly a less optimal solution for larger databases due to performance issues.
I'd definitely take a look at the MongoDB overview of backup options: http://docs.mongodb.org/manual/core/backups/

I would guess this has to do with the oplog not being long enough, although it seems like you checked that and it looked reasonably big.
Still, when adding new members to a replica set you shouldn't be snapshotting and restoring them. It's better to simply add a new member and let replication happen by itself. This is described in the Mongo docs and is the process I've always followed.

How does restoring a db backup affect the oplog?

I have a standalone MongoDb instance. It has many databases in it. I am though only concerned with backingup/restoring one of those databases, lets call it DbOne.
Using the instructions in (http://www.mongodb.com/blog/post/dont-let-your-standalone-mongodb-server-stand-alone), I can create an oplog on this standalone server.
Using the tool Tayra, I can record/store the oplog entries. Being able to create incremental backups is the main reason I enabled the oplog on my standalone instance.
I intend to take full backups once a day, using the command
mongodump --db DbOne --oplog
From my understanding, this backup will contain a point-in-time snapshot of my db.
Assuming I want to discard all updates since this backup, I delete all backedup oplog and I restore only this full backup, using the command
mongorestore --drop --db DbOne --oplogReplay
At this point, do I need to do something to the oplog collection in the local db? Will mongodb automatically drop the entries pertaining to this db from the oplog? Because if not, then wouldn't Tayra end up finding those oplog entries and backup them all over again?
Tbh, I haven't tried this yet on my machine. I am hoping someone can point to a document that lists supported/expected behaviour in this scenario.

I had experimented with a MongoDb server, setup as a replica set with only 1 member, shortly after asking the question. I however forgot to answer this question.
I took a backup using mongodump --db DbOne --oplog. I executed some additional updates. Keeping the mongodb server as is, ie still running under replication, if I run the mongorestore command, then it would create thousands of oplog entries, one for each document of each collection in the db. It was a big mess.
The alternative was to shutdown MongoDb and start it as a standalone instance (ie not running as a replica set). Now if I were to restore using mongorestore the oplog wouldnt be touched. This was bad, because the oplog now contained entries that were not present in the actual collections.
I wanted a mechanism that would restore both my database as well as oplog info in the local database to the time the backup took place. mongodump doesnt backup the local database.
Eventually I had stop using mongodump and instead switched to backing up the entire data directory (after stopping mongodb). Once we switch to AWS, I could use the EBS Snapshot feature to perform the same.

I understand you want a link to the docs about mongorestore:
http://docs.mongodb.org/manual/reference/program/mongorestore/
From what I understand you want to make a point in time backup and then restore that backup. The commands you listed above will do that:
1)mongodump --db DbOne --oplog
2)mongorestore --drop --db DbOne --oplogReplay
However, please note that the "point in time" that the backup was effectively taken at it is when the dump ends, not the moment the command started. This is a fine detail that might not matter to you, but is included for completeness.
Let me know if there is anything else.
Best,
Charlie

Mongodump with --oplog for hot backup

I'm looking for the right way to do a Mongodb backup on a replica set (non-sharded).
By reading the Mongodb documentation, I understand that a "mongodump --oplog" should be enough, even on a replica (slave) server.
From the mongodb / mongodump documentation :
--oplog
Use this option to ensure that mongodump creates a dump of the database that includes an oplog, to create a point-in-time snapshot of the state of a mongod instance. To restore to a specific point-in-time backup, use the output created with this option in conjunction with mongorestore --oplogReplay.
Without --oplog, if there are write operations during the dump operation, the dump will not reflect a single moment in time. Changes made to the database during the update process can affect the output of the backup
I'm still having a very hard time to understand how Mongodb can backup and keep writing on the database and make a consistent backup, even with --oplog.
Should I lock my collections first or is it safe to run "mongodump --oplog ?
Is there anything else I should know about?
Thanks.

The following document explains how mongodump with –oplog option works to create a point in time backup.
http://docs.mongodb.org/manual/tutorial/backup-databases-with-binary-database-dumps/
However, using mongodump and mongorestore to back up and restore MongodDB can be slow. If file system snapshot is an option, you may want to consider using snapshot for MongoDB backup. Information from the following link details two snapshot options for performing hot backup of MongoDB.
http://docs.mongodb.org/manual/tutorial/backup-databases-with-filesystem-snapshots/
You can also look into MongoDB backup service.
http://www.10gen.com/products/mongodb-backup-service

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse