How to carry over indexes while doing file backup

We have a cluster of MongoDB servers. We stop one of the servers, copy all the data files except admin and journal, and tar them into one file.
We untar the file in our test environments. Though we get all the data carried over properly, we don't get the indexes carried over.
Any suggestions?


What files should I source control with MongoDB

I've been using a MongoDB instance in my docker compose script. I want to set it up so I can keep my database from PC to PC but have all the same data.
There seem to be quite a few files in a MongoDB docker installation: .lock, .turtle, .wt, .bson, journal etc.
Is there a rule of thumb for what I should store and what I should ignore in my repo? It's been unclear to me, and I don't want to store any files that could affect booting on another docker container.
It is best to preserve everything under the mongod dbPath folder. Some files and folders can of course be removed, such as the folder that contains metrics collected during operation. Those metrics are useful for performance analysis at a later stage, but the stats already collected are not needed for the mongod process to run.

MongoDB restore from file backup

I have a backup of /data/db that contains all the .wt files along with the journal directory etc. I stopped the db, replaced the current db directory with the backed-up one, and started the db. This works and Mongo starts up, but when I run "show databases" there are no results. The local machine (the one that was backed up) did not have authentication enabled. The machine I am using to attempt the restore does have it enabled, but I am able to start the mongo client without any authentication.
Is there another step to this process?
Is the authentication difference an issue?

Link mongo-data to /data/db folder to a volume Mongodb Docker

I accidentally deleted the docker volume mongo-data:/data/db. I have a copy of that folder, but now when I run docker-compose up the mongodb container doesn't start and fails with the error "mongo_1 exited with code 14". Below are more details of the error and the mongo-data folder. Can someone help me, please?
in docker-compose.yml
- ./mongo-data:/data/db
Restore from backup files
A step-by-step process to repair the corrupted files from a failed MongoDB in a docker container:
Before you start, make a copy of the files!
Make sure you know which version of the image was running in the container.
Spawn a new container to run the repair process as follows:
docker run -it -v <data folder>:/data/db <image-name>:<image-version> mongod --repair
Once the files are repaired, you can start the containers with docker-compose.
If the repair fails, it usually means that the files are corrupted beyond repair. There is still a chance to recover by exporting the data, as described here.
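The repair steps above can be sketched as a small script. This is only a sketch under assumptions: the folder name mongo-data, the .bak suffix for the safety copy, and the image tag mongo:4.4 are hypothetical placeholders, so substitute the folder and the exact image version you were running. By default it only prints the commands (DRY_RUN=1) so you can review them first.

```shell
#!/bin/sh
# Sketch of the repair steps. Paths and image tag are placeholders, not real values.
DATA_DIR="${DATA_DIR:-$PWD/mongo-data}"   # assumption: folder that was mounted at /data/db
IMAGE="${IMAGE:-mongo:4.4}"               # assumption: must match the version that wrote the files
DRY_RUN="${DRY_RUN:-1}"                   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

repair_data_dir() {
  # 1. Keep an untouched copy of the files before anything else.
  run cp -a "$DATA_DIR" "$DATA_DIR.bak" || return 1
  # 2. Run the repair in a throwaway container that uses the same image version.
  run docker run --rm -it -v "$DATA_DIR:/data/db" "$IMAGE" mongod --repair
}

repair_data_dir
```

Run it once as is to see the commands, then again with DRY_RUN=0 to execute them. The --rm flag is my addition so the one-off repair container is removed afterwards.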
How to secure proper backup files
The database is constantly working with the files, so the files on disk are constantly changing. In addition, the database keeps some changes in internal memory buffers before they are flushed to the filesystem. Database engines do a very good job of ensuring that the database can recover from an abrupt failure by writing changes in two stages (first to the transaction log, then to the data file), but a copy taken while the files are in use can still be corrupted in a way that prevents the database from recovering.
The reason for such corruption is that the copy process is not aware of how far the database's writes have progressed, which creates a race condition. In very simple words: if the database is in the middle of writing, the copy process captures a half-updated, and therefore corrupted, copy of the file(s).
When the database writer is in the middle of writing to the files, we call them hot files. "Hot files" is a term from the OS perspective; MongoDB also uses the term hot backup, which is a term from the MongoDB perspective. Hot backup means that the backup was taken while the database was running.
To take a proper snapshot (ensuring the files are cold) you need to follow the procedure explained here. In short, the command db.fsyncLock() issued during this process tells the database engine to flush all buffers and stop writing to the files. This makes the files cold while the database remains hot, hence the difference between the terms hot files and hot backup. Once the copy is done, the database is told to resume writing to the filesystem by issuing db.fsyncUnlock().
Note that the process is more complex and can change between versions of the database. I give a simplification here in order to illustrate the problems with file snapshots. To get a proper and consistent backup, always follow the documented procedure for the database version that you use.
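To make the lock, copy, unlock sequence concrete, here is a rough sketch. The paths /data/db and /backup/db and the shell name mongosh are assumptions (older installations ship the shell as mongo). It uses a separate shell invocation for each step to keep things short; a real procedure may need to keep the locking connection open, so check the documentation for your version. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: flush and lock writes, copy the data files, then unlock.
DB_PATH="${DB_PATH:-/data/db}"            # assumption: the mongod dbPath
BACKUP_DIR="${BACKUP_DIR:-/backup/db}"    # assumption: where the copy goes
MONGO_SHELL="${MONGO_SHELL:-mongosh}"     # assumption: use "mongo" on older installs
DRY_RUN="${DRY_RUN:-1}"                   # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

cold_copy() {
  # Flush buffers and block writes so the files stop changing.
  run "$MONGO_SHELL" --quiet --eval 'db.fsyncLock()' || return 1
  # Copy the now-cold files.
  run cp -a "$DB_PATH" "$BACKUP_DIR"
  status=$?
  # Always unlock, even if the copy failed.
  run "$MONGO_SHELL" --quiet --eval 'db.fsyncUnlock()'
  return "$status"
}

cold_copy
```

The unlock runs even when the copy fails, because a database left locked will block all writes until someone issues db.fsyncUnlock().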
Suggested backup method
The preferred backup should always be the data dump method, since it lets you restore even after the database engine has been upgraded or downgraded. MongoDB provides a very useful tool called mongodump that creates database backups by dumping the data instead of copying the files.
For more details on how to use the backup tools, as well as the other backup methods, read the MongoDB Backup Methods chapter of the MongoDB documentation.
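As a minimal sketch of the dump method, assuming the source and target are both reachable at the hypothetical URI mongodb://localhost:27017 and that /backup/dump is a writable folder (both are placeholders of mine). It only prints the commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: logical backup with mongodump and restore with mongorestore.
SRC_URI="${SRC_URI:-mongodb://localhost:27017}"   # assumption: server to back up
DST_URI="${DST_URI:-mongodb://localhost:27017}"   # assumption: server to restore into
DUMP_DIR="${DUMP_DIR:-/backup/dump}"              # assumption: dump folder
DRY_RUN="${DRY_RUN:-1}"                           # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

dump_and_restore() {
  # Dump all databases as BSON files (plus metadata, which includes index definitions).
  run mongodump --uri="$SRC_URI" --out="$DUMP_DIR" || return 1
  # Load the dump into the target server.
  run mongorestore --uri="$DST_URI" "$DUMP_DIR"
}

dump_and_restore
```

Because the dump holds documents rather than storage-engine files, it does not depend on the on-disk format of a particular engine version.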

MongoDB 2.2: why didn't replication catch up a collection following a dump/restore?

We have a three-server replicaset running MongoDB 2.2 on Ubuntu 10.04, and recently had to upgrade the hard drive for each server where one particular database resides. This database contains log information for web service requests, where they write to collections in hourly buckets using the current timestamp to determine the name, e.g. log_yyyymmddhh.
I performed this process:
backup the database on the primary server with mongodump --db log_db
take a secondary server offline, replace the disk
bring the secondary server up in standalone mode (i.e. comment out the replSet entry in /etc/mongodb.conf before starting the service)
restore the database on the secondary server with mongorestore --drop --db log_db
add the secondary server back into the replicaset and bring it online, letting replication catch up the hourly buckets that were updated/created while it had been offline
Everything seemed to go as expected, except that the collection which was the current bucket at the time of the backup was not brought up to date by replication. I had to copy that collection over by hand to get it up to date. Note that collections which were created after the backup were synched just fine.
What did I miss in this process that caused MongoDB not to get things back in synch for that one collection? I assume something got out of whack with regard to the oplog?
Edit 1:
The oplog on the primary showed that its earliest timestamp went back a couple of days, so there should have been plenty of space to maintain transactions for a few hours (which was the time the secondary was offline).
Edit 2:
Our MongoDB installation uses two disk partitions: /dev/sda1 and /dev/sdb1. The primary MongoDB directory /var/lib/mongodb/ is on /dev/sda1, and holds several databases, while the log database resides by itself on /dev/sdb1. There's a sym link /var/lib/mongodb/log_db which points to a directory on /dev/sdb1. Since the log db was getting full, we needed to upgrade the disk for /dev/sdb1.
You should be using mongodump with the --oplog option. Running a full database backup with mongodump on a replicaset that is updating collections at the same time may not leave you with a consistent backup. This becomes worse with larger databases, more collections and more frequent updates/inserts/deletes.
From the documentation for your version (2.2) of MongoDB (it's the same for 2.6 but just to be as accurate as possible):
Use this option to ensure that mongodump creates a dump of the database that includes an oplog, to create a point-in-time snapshot of the state of a mongod instance. To restore to a specific point-in-time backup, use the output created with this option in conjunction with mongorestore --oplogReplay.
Without --oplog, if there are write operations during the dump operation, the dump will not reflect a single moment in time. Changes made to the database during the update process can affect the output of the backup.
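A rough sketch of how the two options pair up, assuming hypothetical host names and a hypothetical dump folder. Note that --oplog applies to a dump of the whole instance and cannot be combined with --db, so this differs from the mongodump --db log_db call in the question. The script only prints the commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: point-in-time dump with --oplog, restore with --oplogReplay.
SRC_HOST="${SRC_HOST:-primary.example.net:27017}"     # assumption: member to dump from
DST_HOST="${DST_HOST:-secondary.example.net:27017}"   # assumption: member to restore into
DUMP_DIR="${DUMP_DIR:-/backup/dump}"                  # assumption: dump folder
DRY_RUN="${DRY_RUN:-1}"                               # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

oplog_dump_and_restore() {
  # Full-instance dump that also records the oplog entries written during the dump.
  run mongodump --host "$SRC_HOST" --oplog --out "$DUMP_DIR" || return 1
  # Restore the data, then replay the captured oplog to reach one consistent moment.
  run mongorestore --host "$DST_HOST" --oplogReplay "$DUMP_DIR"
}

oplog_dump_and_restore
```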
This is not covered well in most MongoDB tutorials on backups and restores. Generally you are better off performing a live snapshot of the storage volume your database resides on (assuming your storage solution has a live snapshot ability compatible with MongoDB). Failing that, your next best bet is taking a secondary offline and then performing a snapshot or backup of the database files. Running mongodump on a live database becomes less and less suitable as the database grows, because of its performance impact.
I'd definitely take a look at the MongoDB overview of backup options:
I would guess this has to do with the oplog not being long enough, although it seems like you checked that and it looked reasonably big.
Still, when adding new members to a replica set you shouldn't be snapshotting and restoring them. It's better to simply add a new member and let replication happen by itself. This is described in the Mongo docs and is the process I've always followed.
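A minimal sketch of that approach, assuming hypothetical host names and that the new member has already been started with an empty data directory and the same replica set name. The shell name is also an assumption: on an installation as old as 2.2 it is mongo rather than mongosh. The script only prints the command unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: add an empty member and let replication copy the data (initial sync).
PRIMARY="${PRIMARY:-primary.example.net:27017}"          # assumption: current primary
NEW_MEMBER="${NEW_MEMBER:-secondary.example.net:27017}"  # assumption: member to add
MONGO_SHELL="${MONGO_SHELL:-mongosh}"                    # assumption: "mongo" on old installs
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = only print the command

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

add_member() {
  # rs.add() must be run against the primary; the new member then syncs on its own.
  run "$MONGO_SHELL" --host "$PRIMARY" --quiet --eval "rs.add(\"$NEW_MEMBER\")"
}

add_member
```

You can then watch the member move from its startup state to SECONDARY with rs.status() on the primary.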

Copying DB's over in MongoDB

I am going to be doing some major DB restructuring in MongoDB that converts a bunch of records. I've run this script against a copied DB locally and it works fine, so it should also work on the production database.
Is there an easy way of copying a DB instance into a new DB? I figured I could shut down MongoDB and copy the files into a new directory, rename them to the new DB, and then move them into the MongoDB data directory.
It's usually pretty slow to copy large DB files like that, so I wondered if there was a Mongo-specific way of copying DB's.
In case it is sufficient to copy only some collections, you can copy them as follows without shutting down the db server:
Otherwise use mongodump and mongorestore.
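For the mongodump and mongorestore route, one way to copy a whole database under a new name is to pipe an archive from one tool to the other and rename the namespace on the way in. This is a sketch, assuming reasonably recent versions of the tools (the --archive, --nsFrom and --nsTo options are not available in very old releases), a server on the default local port, and the hypothetical database names app and app_copy. It only prints the pipeline unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: copy database SRC_DB to DST_DB on the same server without stopping it.
SRC_DB="${SRC_DB:-app}"            # assumption: existing database name
DST_DB="${DST_DB:-app_copy}"       # assumption: name for the copy
DRY_RUN="${DRY_RUN:-1}"            # 1 = only print the pipeline

copy_db() {
  src="$1"
  dst="$2"
  if [ "$DRY_RUN" = "1" ]; then
    # Show the pipeline that would be executed.
    echo "mongodump --archive --db=$src | mongorestore --archive --nsFrom='$src.*' --nsTo='$dst.*'"
  else
    # Stream the dump straight into the restore, renaming every namespace.
    mongodump --archive --db="$src" | mongorestore --archive --nsFrom="$src.*" --nsTo="$dst.*"
  fi
}

copy_db "$SRC_DB" "$DST_DB"
```

Streaming through a pipe avoids writing an intermediate dump to disk, although the data is still read and rewritten in full, so it will not be instant for a large database.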