What files should I source control with MongoDB - mongodb

I've been using a MongoDB instance in my docker compose script. I want to set it up so I can move my database from PC to PC and have all the same data.
There seem to be quite a few files in a MongoDB docker installation: .lock, .turtle, .wt, .bson, diagnostic.data, journal etc.
Is there a rule of thumb for what I should store and what I should ignore in my repo? It's been unclear to me, and I don't want to store any files that could affect booting in another docker container.

It is best to preserve everything under the mongod dbPath folder, but some files/folders can of course be removed, like diagnostic.data. That folder contains metrics collected during operation, which are useful for performance analysis at a later stage, but in general the already collected stats are not needed for the mongod process to run.
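As a minimal sketch of that advice, the function below archives a data folder while leaving out diagnostic.data. It assumes mongod is stopped while the archive is taken; the function name and the paths in the usage line are hypothetical.

```sh
# Sketch: archive a (stopped) mongod data folder without diagnostic.data.
# $1 = data folder (the dbPath), $2 = archive file to create.
archive_dbpath() {
  # -C switches into the data folder so the archive holds relative paths;
  # --exclude skips the diagnostic.data folder and everything inside it.
  tar -czf "$2" -C "$1" --exclude='diagnostic.data' .
}
```

Usage would look like `archive_dbpath ./mongo-data ./mongo-data.tar.gz`, followed by unpacking the archive into the data folder on the other PC.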

Related

Link mongo-data to /data/db folder to a volume Mongodb Docker

I accidentally deleted the docker volume mongo-data:/data/db. I have a copy of that folder, but now when I run docker-compose up the mongodb container doesn't start and gives the error "mongo_1 exited with code 14". Below are more details of the error and the mongo-data folder. Can someone help me please?
in docker-compose.yml
volumes:
- ./mongo-data:/data/db
Restore from backup files
A step-by-step process to repair the corrupted files from a failed MongoDB in a Docker container:
! Before you start, make a copy of the files. !
Make sure you know which version of the image was running in the container.
Spawn a new container to run the repair process as follows:
docker run -it -v <data folder>:/data/db <image-name>:<image-version> mongod --repair
Once the files are repaired, you can start the containers from docker-compose.
If the repair fails, it usually means that the files are corrupted beyond repair. There is still a chance to recover the data by exporting it.
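The repair steps can be sketched as a small shell function. This is only an illustration: the function name and the ".bak" suffix are my own choices, and the data folder should be given as an absolute path because Docker expects that for bind mounts.

```sh
# Sketch: copy the data folder, then run mongod --repair in a throwaway container.
# $1 = absolute path of the data folder, $2 = image with tag, e.g. <image-name>:<image-version>
repair_mongo_data() {
  # Refuse to overwrite an earlier safety copy.
  if [ -e "$1.bak" ]; then
    echo "backup $1.bak already exists, aborting" >&2
    return 1
  fi
  # Keep an untouched copy of the files before repairing anything.
  cp -Rp "$1" "$1.bak" || return 1
  # Same command as in the steps above; --rm removes the container afterwards.
  docker run --rm -it -v "$1":/data/db "$2" mongod --repair
}
```

If the function returns successfully, the containers can be started again with docker-compose; the ".bak" copy stays in place in case the repair made things worse.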
How to secure proper backup files
The database is constantly working with the files, so the files are constantly changing on disk. In addition, the database keeps some of the changes in internal memory buffers before they are flushed to the filesystem. Although database engines do a very good job of ensuring that the database can recover from an abrupt failure by using write-ahead logging (the journal is updated first, then the data files), copying the files while they are in use can still produce a corrupted copy that prevents the database from recovering.
The reason for such corruption is that the copy process is not aware of how far the database's writes have progressed, and this creates a race condition. In very simple words, while the database is in the middle of writing, the copy process will create a copy of the file(s) that is half-updated, hence it will be corrupted.
When the database writer is in the middle of writing to the files, we call them hot files. "Hot files" is a term from the OS perspective, while MongoDB uses the term "hot backup", which means that the backup was taken while the database was running.
To take a proper snapshot (ensuring the files are cold) you need to follow the documented procedure. In short, the db.fsyncLock() command issued during this process tells the database engine to flush all buffers and stop writing to the files. This makes the files cold while the database remains hot, hence the difference between the terms hot files and hot backup. Once the copy is done, the database is told to resume writing to the filesystem by issuing db.fsyncUnlock().
Note that the process is more complex and can change between versions of the database. Here I give a simplification of it in order to illustrate the problems with a file snapshot. To secure a proper and consistent backup, always follow the documented procedure for the database version that you use.
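To illustrate the lock, copy, unlock order only, here is a rough sketch. It assumes the mongosh shell is installed and can reach the server on its default address without credentials, and that the data files are readable from where the script runs. The function name and the MONGO_SHELL variable are hypothetical. The MongoDB documentation advises keeping the locking connection open, which this simplified version does not do, so treat it as a sketch and not as a backup procedure.

```sh
# Sketch: lock writes, copy the data folder, then always unlock again.
# $1 = data folder, $2 = destination for the copy.
snapshot_copy() {
  shell=${MONGO_SHELL:-mongosh}   # older installations ship "mongo" instead
  # Flush pending writes to disk and block further writes.
  "$shell" --quiet --eval 'db.fsyncLock()' || return 1
  # Copy the now "cold" files; remember the result so the unlock still runs.
  cp -Rp "$1" "$2"
  status=$?
  # Let the database write to the filesystem again.
  "$shell" --quiet --eval 'db.fsyncUnlock()'
  return "$status"
}
```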
Suggested backup method
The preferred backup should always be the data dump method, since this ensures that you can restore even after upgrading or downgrading the database engine. MongoDB provides a very useful tool called mongodump that creates database backups by dumping the data instead of copying the files.
For more details on how to use the backup tools, as well as the other backup methods, read the MongoDB Backup Methods chapter of the MongoDB documentation.
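For a MongoDB running in a container, a dump and restore could look roughly like the sketch below. It assumes the container has mongodump and mongorestore on its PATH (the official image normally does) and that no authentication is required; the function names and the container name in the usage line are placeholders.

```sh
# Sketch: dump all databases from a running container into one compressed file.
# $1 = container name, $2 = archive file on the host.
dump_from_container() {
  # --archive without a file name writes the dump to stdout, --gzip compresses it.
  docker exec "$1" mongodump --archive --gzip > "$2"
}

# Sketch: load such an archive into a (possibly different) running container.
# $1 = container name, $2 = archive file on the host.
restore_into_container() {
  # -i keeps stdin open so mongorestore can read the archive from it.
  docker exec -i "$1" mongorestore --archive --gzip < "$2"
}
```

Usage would be something like `dump_from_container mongo_1 backup.gz` on the old PC and `restore_into_container mongo_1 backup.gz` on the new one. The archive is a single file, which is much easier to keep alongside a repo than the raw data folder.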

mongodb - How to carry over indexes while doing file backup

We have a cluster of mongo servers. We stop one of the mongo servers, copy all the data files except admin and journal, and tar them into one file.
We untar the file in our test environments. Though we get all the data carried over properly, we don't get the indexes carried over.
Any suggestions?

MongoDB does not see database or collections after migrating from localhost to EBS volume

full disclosure: I am a complete n00b to mongodb and am just getting my feet wet with using mongo on AWS (but have 2 decades working in IT so not a total n00b :P)
I set up an EBS volume and installed mongo on an EC2 instance.
My problem is that I provisioned too small an EBS volume initially.
When I realized this I:
- created a new larger EBS volume
- mounted it on the server
- stopped mongo ( $ sudo service mongod stop)
- copied all my /data/db files into the new volume
- updated conf files and fstab (dbpath, logpath, pidfilepath and mount point for new volume respectively)
- restarted mongod
When I execute: $ sudo service mongod start
- everything runs fine.
- I can futz about in the admin and local databases.
However, when I run this command in the mongo shell: > show databases
- I only see the admin and local.
- the database I copied into the new volume (named encompass) is not listed.
I still have a working local copy of the database so my data is not lost, just not sure how best to move mongo data around other than:
A) start all over importing the data to the db on the AWS server (not what I would like since it is already loaded in my local db)
B) copy the local db to the new EBS volume again (also not preferred but better that importing all the data from scratch again!).
NOTE: originally I secure copied the data into the EBS volume with this command:
$ scp -r -i / / ec2-user@:/
then when I copied between volumes I used a vanilla cp command.
Did I miss something here?
The best I could find on SO and the web was this process (How to scale MongoDB?), but perhaps I missed a switch in a command or a nuance to the process that rendered my database files inert/useless?
Any idea how I can get mongo to see my other database files and collections?
Or did I make an irreversible error somewhere along the way?
Thanks for any help!!
Are you sure your conf file is being loaded? As a test, you can start mongod.exe and specify the path to your db directly, i.e.:
mongod --dbpath c:\mongo\data\db (unix syntax may vary a bit, this is windows)
Run this from the command line and see what, if anything, mongo complains about.
Database files are finicky and easy to damage. Before copying from one database to another you should probably seed the target database; a few dummy entries will tell you the database is working.

should I let mongodb make use of the new hard disk in this way?

I have mongodb v2.4.6 running on ubuntu 13.04. As far as I know, mongodb stores all its data in /var/lib/mongodb. Now mongodb is running out of hard disk space. Fortunately, I got a new hard disk which is installed, fdisked, formatted and named /dev/sda3. Unfortunately I don't know how to let mongodb make use of the new hard disk because my knowledge of ubuntu and mongodb is very limited. After some research on the internet, it seems that I should execute the following command
sudo mount /dev/sda3 /var/lib/mongodb
Is this what I need to do to let mongodb use the new disk? If so, will mongodb automatically and intelligently extend its data onto this disk? Are there any other things I should do? Thank you.
Unfortunately this will not be that straightforward. Even if the mount succeeds, it will not move the files at all; mounting over /var/lib/mongodb just hides the data that is already there. What you can do is:
mount the disk elsewhere (mkdir /var/lib/mongodb1, mount /dev/sda3 /var/lib/mongodb1)
stop mongo
copy the files from /var/lib/mongodb to /var/lib/mongodb1 (only helps if the new disk is bigger)
reconfigure mongo to use the new directory as its db dir, or swap the names with mv commands
start mongo
if everything went fine and mongo started (check it first!!!), you can delete the old data.
If the new disk is the same size, you will run into the same problem after moving the data. If you need more space than a single disk provides, you should look into RAID and/or LVM with more disks.
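The steps above can be sketched as a script that only prints the commands unless DRY_RUN=0 is set. The service name "mongodb", the owner "mongodb:mongodb" and the chown step itself are assumptions about a packaged Ubuntu install and are not part of the steps above; the helper names are hypothetical. Real runs need root.

```sh
# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# $1 = new partition, $2 = current dbpath, $3 = mount point for the new disk.
move_dbpath() {
  run mkdir -p "$3" &&
  run mount "$1" "$3" &&
  run service mongodb stop &&
  # "old/." copies the contents of the old directory, hidden files included.
  run cp -a "$2/." "$3/" &&
  # Assumption: the daemon runs as user and group "mongodb".
  run chown -R mongodb:mongodb "$3" &&
  echo "Next: set dbpath to $3 in the mongod config, start mongo, check the data, then delete $2."
}
```

Calling `move_dbpath /dev/sda3 /var/lib/mongodb /var/lib/mongodb1` prints the plan first, which gives you a chance to check the paths before running anything for real.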

Where are the db files? at /var/lib/mongodb I cant find any increase in size. I ran very big loop to create lakhs of objects

I am using Ubuntu. From /etc/mongod.conf, I found that /var/lib/mongodb is the path for data.
I have found some files like collectionname.0, .1, .ns in that directory. But when I run a very big loop (lakhs of inserts), I am able to get the objects back using the mongo shell, yet the size of that mongodb directory is not increasing, so there must be some other place where this data is being stored.
What is that place?
There is no other place. As indicated by @itsbruce, on Ubuntu it's /var/lib/mongodb.
On a non-packaged installation (on Linux), i.e. without an /etc/mongodb.conf file, the default is /data/db/ (unless otherwise specified). Windows systems use the \data\db directory.
You can use the "--dbpath" switch on the command line to designate a different directory for the mongod instance to store its data. Typical locations include /srv/mongodb, /var/lib/mongodb or /opt/mongodb.
If you installed using a package management system, the data path is the one set by dbpath in the configuration file.
I recommend using the db.collection.stats() method to monitor the size of your collection as you insert data. Its documentation also explains what each field in the output means.
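A quick way to call that from the command line, sketched under the assumption that the mongosh shell is installed (set MONGO_SHELL=mongo for older versions such as 2.4). The function name and the database and collection names in the usage line are placeholders.

```sh
# Sketch: print the stats document for one collection.
# $1 = database name, $2 = collection name.
collection_stats() {
  "${MONGO_SHELL:-mongosh}" "$1" --quiet --eval "printjson(db.getCollection('$2').stats())"
}
```

Running `collection_stats test mycollection` before and after the insert loop and comparing fields such as count, size and storageSize shows whether the data is really growing, regardless of what the directory size says.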
That is the correct data location for MongoDB on Ubuntu. MongoDB pre-allocates filespace. Are you sure you have generated more data than would fit into the initial pre-allocated files? Try blowing away any existing data files and restarting Mongo with the --noprealloc flag. Then add data.