How to back up MongoDB without locking tables - mongodb

There is a replica set (primary, secondary, arbiter) with 300 GB of data. I want to make a daily backup without locking. The replica set runs on Windows 2008 R2, so it seems we cannot use LVM tools.
If I want to copy the data folder on a secondary, I need to shut down mongod first (because it's not possible to copy mongod.lock while mongod is running).
What is the best solution for the fastest daily backup?

I don't know if it is feasible for you, but you can add another member to the replica set. This member would be hidden, so it would not be used for queries or write operations. You can stop this server every day to make your database backups.
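A minimal sketch of configuring such a hidden member from the mongo shell, assuming the new member sits at index 3 of the replica set configuration:

    cfg = rs.conf()
    cfg.members[3].priority = 0   // hidden members must have priority 0
    cfg.members[3].hidden = true  // excluded from client reads
    rs.reconfig(cfg)

Because the member is hidden and has priority 0, it never becomes primary and receives no client traffic, so stopping it daily does not affect the rest of the set.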

Because it is a replica set, I use mongodump with the --oplog option. It runs pretty quickly on Linux, and I think it may have some advantages on a multi-tenant server over tar or snapshots. The disadvantage is that the indexes are rebuilt when you do the mongorestore.
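For reference, a minimal invocation of this approach might look like the following (the host name and output directory are placeholders):

    # point-in-time dump from a replica set member
    mongodump --host secondary.example.com --oplog --out /backups/daily
    # restore, replaying the oplog entries captured during the dump
    mongorestore --oplogReplay /backups/daily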

Related

Mongod backup without replica sets or db.fsyncLock()

I need to make daily backups of MongoDB database(s). I am running Mongo 3.4 on a 64-bit system, so journaling is enabled by default.
I don't have replica sets for this setup, so I want to manually (actually, via a scheduled cron job) copy or make an image of the Mongo databases from disk. I don't want to use the mongodump command.
Questions
Because of journaling, am I able to simply do a cp -R /data/db/* /backup? Or do I need to do a db.fsyncLock()? I'm not concerned about losing a small amount of data that may still be in memory when I do the cp; I care more that the copy will work if I ever need to use it, i.e. that it won't contain errors etc.
If I do need to do a db.fsyncLock() before copying the databases, what happens if reads and writes continue to occur? If they block, will they proceed as normal once db.fsyncUnlock() is called?
If I do need to use mongodump, is it true that using _ids other than normal ObjectIds will cause issues? One of my databases doesn't use normal ObjectIds.
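For context, the copy-with-lock sequence being asked about would look roughly like this (paths are illustrative):

    mongo --eval "db.fsyncLock()"    # flush pending writes and block new ones
    cp -R /data/db/* /backup/        # copy the data files while writes are blocked
    mongo --eval "db.fsyncUnlock()"  # resume normal write operations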

How is sync between replica set members in MongoDB achieved? Automatic, or is manual triggering needed?

Going through the MongoDB documentation, I couldn't find a clear answer to the above question. Please provide the commands, if any, that are used to manually trigger a sync among replica set members.
Replica sets are always synced automatically, but if you need to do a manual re-sync you have a couple of options, as explained here: https://docs.mongodb.org/manual/tutorial/resync-replica-set-member/
So basically you can stop the member you want to re-sync and empty its data directory. When you restart it, Mongo will automatically start the sync process.
Stop the member's mongod instance. To ensure a clean shutdown, use the db.shutdownServer() method from the mongo shell or, on Linux systems, the mongod --shutdown option.
Delete all data and sub-directories from the member's data directory. By removing the data in dbPath, MongoDB will perform a complete resync. Consider making a backup first.
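A rough sketch of that procedure, run on the member being resynced (dbPath and config file path are illustrative):

    mongo admin --eval "db.shutdownServer()"  # clean shutdown
    rm -rf /var/lib/mongodb/*                 # empty dbPath; consider a backup first
    mongod --config /etc/mongodb.conf         # restart; initial sync starts automatically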
Another way MongoDB suggests is to copy the data files from another member; once that is done, MongoDB will sync the rest of the data from the primary. This is similar to the first solution, but faster because you already have some of the data and don't need to start from scratch.
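A hedged sketch of that copy, assuming rsync and illustrative hosts/paths (the source member should be shut down cleanly, or fsync-locked, while its files are copied):

    rsync -av healthy-member.example.com:/var/lib/mongodb/ /var/lib/mongodb/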

MongoDB 2.2: why didn't replication catch up a collection following a dump/restore?

We have a three-server replica set running MongoDB 2.2 on Ubuntu 10.04, and recently had to upgrade the hard drive for each server where one particular database resides. This database contains log information for web service requests; the services write to collections in hourly buckets, using the current timestamp to determine the name, e.g. log_yyyymmddhh.
I performed this process:
backup the database on the primary server with mongodump --db log_db
take a secondary server offline, replace the disk
bring the secondary server up in standalone mode (i.e. comment out the replSet entry in /etc/mongodb.conf before starting the service)
restore the database on the secondary server with mongorestore --drop --db log_db
add the secondary server back into the replica set and bring it online, letting replication catch up the hourly buckets that were updated/created while it had been offline
Everything seemed to go as expected, except that the collection which was the current bucket at the time of the backup was not brought up to date by replication. I had to copy that collection over by hand to get it up to date. Note that collections created after the backup were synced just fine.
What did I miss in this process that caused MongoDB not to bring that one collection back in sync? I assume something got out of whack with regard to the oplog?
Edit 1:
The oplog on the primary showed that its earliest timestamp went back a couple of days, so there should have been plenty of headroom to retain operations for the few hours the secondary was offline.
Edit 2:
Our MongoDB installation uses two disk partitions: /dev/sda1 and /dev/sdb1. The primary MongoDB directory /var/lib/mongodb/ is on /dev/sda1 and holds several databases, while the log database resides by itself on /dev/sdb1. There's a symlink /var/lib/mongodb/log_db which points to a directory on /dev/sdb1. Since the log db was getting full, we needed to upgrade the disk for /dev/sdb1.
You should be using mongodump with the --oplog option. Running a full database backup with mongodump on a replica set that is updating collections at the same time may not leave you with a consistent backup. This gets worse with larger databases, more collections, and more frequent updates/inserts/deletes.
From the documentation for your version (2.2) of MongoDB (it's the same for 2.6 but just to be as accurate as possible):
--oplog
Use this option to ensure that mongodump creates a dump of the database that includes an oplog, to create a point-in-time snapshot of the state of a mongod instance. To restore to a specific point-in-time backup, use the output created with this option in conjunction with mongorestore --oplogReplay.
Without --oplog, if there are write operations during the dump operation, the dump will not reflect a single moment in time. Changes made to the database during the update process can affect the output of the backup.
http://docs.mongodb.org/v2.2/reference/mongodump/
This is not covered well in most MongoDB tutorials on backups and restores. Generally you are better off if you can perform a live snapshot of the storage volume your database resides on (assuming your storage solution has a live snapshot capability compatible with MongoDB). Failing that, your next best bet is taking a secondary offline and then performing a snapshot or backup of the database files. Mongodump on a live database becomes an increasingly poor option for larger databases due to performance issues.
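As an illustration of the volume-snapshot route, an LVM-based backup might look roughly like this (volume group and snapshot names are placeholders; the journal must live on the snapshotted volume for the snapshot to be consistent):

    # create a point-in-time snapshot of the volume holding dbPath
    lvcreate --size 100G --snapshot --name mdb-snap /dev/vg0/mongodb
    # archive the snapshot, then remove it
    dd if=/dev/vg0/mdb-snap | gzip > /backups/mdb-snap.gz
    lvremove /dev/vg0/mdb-snap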
I'd definitely take a look at the MongoDB overview of backup options: http://docs.mongodb.org/manual/core/backups/
I would guess this has to do with the oplog not being long enough, although it seems like you checked that and it looked reasonably big.
Still, when adding new members to a replica set you shouldn't snapshot and restore them. It's better to simply add the new member and let replication happen by itself. This is described in the Mongo docs and is the process I've always followed.
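A minimal sketch of that approach, with a placeholder replica set name and hostname:

    # start the new member with an empty dbPath and the set's name
    mongod --replSet rs0 --dbpath /var/lib/mongodb
    # then, from a mongo shell connected to the primary:
    rs.add("newmember.example.com:27017")

MongoDB performs the initial sync automatically once the member is added.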

MongoDB replica set to standalone backup and restore

For development reasons, I need to back up a production replica set MongoDB and restore it on a standalone test instance on a different machine.
Some docs talk about the opposite direction (standalone to replica set), but I cannot find the corresponding downgrade/rollback procedure.
What's the way to go in this case?
No matter how many nodes you have in a replica set, each of them holds the same data.
So getting the data is easy: just use mongodump (preferably against a secondary, for performance reasons) and then mongorestore into a new mongod for your standalone development system.
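A hedged sketch of that dump-and-restore, with placeholder hosts and paths:

    # dump from a secondary of the production replica set
    mongodump --host secondary.example.com --port 27017 --out /tmp/dump
    # restore into the standalone development instance
    mongorestore --host dev.example.com --port 27017 /tmp/dump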
mongodump does not pick up any replication-related collections (they live in a database called local). If you end up taking a file-system snapshot of a replica set node rather than using mongodump, be sure to drop the local database when you restore the snapshot into your standalone server, and then restart mongod so that it properly detects that it is not part of a replica set.
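Dropping that replication state might look like this, run against the restored standalone (started without the replSet option):

    # remove the oplog and replica set configuration carried over in the snapshot
    mongo local --eval "db.dropDatabase()"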

Does mongodump lock the database?

I'm in the middle of setting up a backup strategy for Mongo and was curious whether mongodump locks the database before performing the dump.
I found this on Mongo's Google group:
Mongodump does a simple query on the live system and does not require a shutdown. Like all queries it requires a read lock while running, but doesn't block any more than normal queries.
If you have a replica set you will probably want to use the --oplog flag to do your backups.
See the docs for more information: http://docs.mongodb.org/manual/administration/backups/
Additionally, I found this previously asked question:
MongoDB: mongodump/restore vs. backup up files directly
Excerpt from the above question:
Locking and copying files is only an option when you don't have a heavy write load.
mongodump can be run against a live server. It will create some additional load, so don't do it during peak hours. Also, it is advised to do it on a secondary node (if you don't use replica sets, you should).
There are some complications when you have a DB so large that no single machine can hold it. See this document.
Also, if you have a replica set, you can take down one of the secondaries and copy its files directly. See http://www.mongodb.org/display/DOCS/Backups:
mongodump does not lock the db, meaning other read and write operations will continue normally.
Actually, both mongodump and mongorestore are non-blocking. So if you want to mongodump/mongorestore a db, it's your responsibility to make sure it is really a consistent snapshot backup/restore. To do this, you must stop all other write operations while taking/restoring backups with mongodump/mongorestore. If you are running a sharded environment, it is also recommended that you stop the balancer.
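A sketch of pausing the balancer around a cluster-wide dump, run from a mongo shell connected to a mongos:

    sh.stopBalancer()   // pause chunk migrations before the backup
    // ... run mongodump against the cluster ...
    sh.startBalancer()  // resume once the backup completes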