Migrate MongoDB collections and indexes excluding data

I have two environments DEV and STAGE and I need to take all of my collections from a DEV database to a database in STAGE.
Currently using mongodump to get all of my collections and indexes, which appears to work very well.
>mongodump --host 192.168.#.# --port 27019 --db MyDB
Then I am using mongorestore to populate STAGE with the appropriate collections and indexes.
>mongorestore --host 192.168.#.# --port 27019 --db test "C:\Program Files\MongoDB 2.6 Standard\bin\dump\MyDB"
My collections and indexes come across perfectly; however, my collection content comes across as well. I have not found a way to exclude the actual data inside my collections. Is this doable? Can I remove files from the output of mongodump so that only the collections and their indexes are restored?

MongoDB is schemaless, so there is basically no point in restoring empty collections: collections are created on the fly when you do a write.
I understand you just want to restore the collection metadata such as indexes. I don't know which driver you are using, but I would suggest dealing with this at the application level by writing a routine that creates the indexes etc.
Also, removing the .bson files and keeping only the metadata files from mongodump, as you suggest, will work (or at least it works in my case with Mongo v3.0.5 with the WiredTiger engine), but as far as I know this is not documented.
Another alternative could be to use the --query option of mongodump to specify which documents to include, e.g. {"_id": "a_non_existing_id"}, but this option is applicable only to collection-level dumps.
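Combining those two ideas, a metadata-only dump can be sketched as a per-collection loop with a query that matches nothing. This is only a sketch: the host, database, and collection names below are placeholders, not values from the question.

```shell
# Hypothetical host/db/collection names; adjust for your environment.
# Each dump produces an empty .bson file plus a .metadata.json file
# containing the collection's indexes and options.
for coll in users orders products; do
  mongodump --host 192.168.0.1 --port 27019 --db MyDB \
    --collection "$coll" --query '{"_id": {"$exists": false}}' --out ./dump-empty
done
# Restoring ./dump-empty then recreates the collections and indexes only:
#   mongorestore --host STAGE_HOST --port 27019 --db MyDB ./dump-empty/MyDB
```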

Related

Faster way to remove all entries from mongodb collection by dropping collection and recreating schema

When I want to remove all objects from my mongoDB collection comments I do this with this command:
mongo $MONGODB_URI --eval 'db.comments.deleteMany({});'
However, this is super slow when there are millions of records inside the collection.
In a relational db like Postgres I'd simply copy the structure of the collection, create a comments2 collection, drop the comments collection, and rename comments2 to comments.
Is this possible to do in MongoDB as well?
Or are there any other tricks to speed up the process?
Thanks, the answers inspired my own solution. I forgot that MongoDB doesn't have a schema like a relational DB.
So what I did is this:
1. dump an empty collection + the indexes of the collection
mongodump --host=127.0.0.1 --port=7001 --db=coral --collection=comments --query='{"id": "doesntexist"}' --out=./dump
This will create a folder ./dump with the contents comments.bson (empty) and comments.metadata.json
2. Drop the comments collection
mongo mongodb://127.0.0.1:7001/coral --eval 'db.comments.drop();'
3. Import new data new_comments.json (different from comments.bson)
mongoimport --uri=mongodb://127.0.0.1:7001/coral --file=new_comments.json --collection comments --numInsertionWorkers 12
This is way faster than first adding the indexes, and then importing.
4. Add indexes back
mongorestore --uri=mongodb://127.0.0.1:7001/coral --dir dump/coral --nsInclude coral.comments --numInsertionWorkersPerCollection 12
Note that --numInsertionWorkers speeds up the process by dividing the work across 12 CPUs.
The number of CPUs can be found on macOS with:
sysctl -n hw.ncpu
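On Linux the equivalent is nproc; a small portable sketch that tries both and falls back to 1:

```shell
# CPU count: nproc works on Linux, sysctl on macOS/BSD; default to 1.
cpus=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1)
echo "$cpus"
```

The resulting value can be passed straight to --numInsertionWorkers.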
db.cities.aggregate([{ $match: {} }, { $out: "collection2" }]) works in case you can log in to the mongo shell and simply drop the previous collection.
Otherwise, the approach you have posted is the one.
mongoexport.exe --host <host> --port <port> --db test --collection collection1 --out collection1.json
mongoimport.exe --host <host> --port <port> --db test --collection collection2 --file collection1.json
Thanks,
Neha
For MongoDB version >= 4.0 you can do this via db.comments.renameCollection("comments2"), but it is a resource-intensive operation, and for bigger collections you'd better do a mongodump/mongorestore. So the best action steps are:
mongodump -d x -c comments --out dump
>use x
>db.comments.drop()
mongorestore -d x -c comments2 dump/x/comments.bson
Please note that deleteMany({}) is an even more resource-intensive operation, since it creates a single oplog entry for every document you delete and propagates each one to all replica set members.

How to clone a collection from one MongoDB to another on same server

I'm using Mongo 3.2. I have two databases on my localhost named client1 and client2.
Now client1 contains a collection named users.
I want to clone this collection to client2.
I have tried:
use client2
db.cloneCollection('localhost:27017', 'client1.users',
{ 'active' : true } )
This outputs
{
"ok" : 0.0,
"errmsg" : "can't cloneCollection from self"
}
Is cloning a collection from one db to another on the same server prohibited?
A few things:
In general, cloneCollection is used between different mongod instances, not to copy within the same instance.
Also, if you're using v4.2 you should stop using copyDB & cloneCollection, because they're deprecated (see compatibility-with-v4.2), and start using mongodump & mongorestore or mongoexport & mongoimport.
I would suggest using mongodump & mongorestore:
mongodump preserves MongoDB's data types, i.e. BSON types.
mongodump creates a binary dump, whereas mongoexport converts BSON to JSON & mongoimport converts the JSON back to BSON while writing, which is why they're slow. You can use mongoexport & mongoimport when you want to analyze your collection's data visually or use the JSON data for any other purpose.
You can run below script in shell
declare -a collections=("collectionName1" "collectionName2")
for i in "${collections[@]}"
do
echo "$i"
mongodump --host "All-shards" --username=uname --password=password --ssl --authenticationDatabase=admin --db=dbname --collection="$i"
mongorestore --host=host-shard-name --port=27017 --username=uname --password=psswrd --ssl --authenticationDatabase=admin --db=dbname --collection="$i" ./dump/dbname/"$i".bson
done
To use mongodump, you must run it against a running mongod or mongos instance. These commands assume mongo is properly installed and on your PATH; if not, you can navigate to the mongo bin folder and run them as ./mongodump & ./mongorestore. The script above is useful if you want to back up multiple collections. You need to specify a few things in it:
mongodump --host "All-shards" -> here you need to specify all shards if your MongoDB is a sharded cluster; if not, you can specify localhost:27017.
mongorestore --host=host-shard-name -> you have to specify one member of the replica set, or else your localhost. A few flags here are optional: --ssl, --username, --password.
mongodump will create a folder named dump the first time, with a sub-folder per database name, and each sub-folder has .bson files named after the collections dumped. So you need to reference the database name in the restore command, and the collection name is taken from the variable i -> ./dump/dbname/"$i".bson
Note: MongoDB v3.2 is quite old, and in the cloud-based MongoDB service Atlas it has already reached its end of life, so please upgrade ASAP. If you're looking for a free Mongo instance or are just starting with MongoDB, you can try Atlas.
db.cloneCollection() copies data directly between MongoDB instances.
https://docs.mongodb.com/v3.2/reference/method/db.cloneCollection/
That means you cannot clone inside the same mongod instance. Use mongoexport and mongoimport to clone your collection.
Since 4.2, MongoDB provides the $merge stage, which allows copying from db1.collection to db2.collection.
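A minimal sketch of that, run through mongosh; the database and collection names here are hypothetical, and whenMatched/whenNotMatched are shown with their common "full copy" settings:

```shell
# Copy db1.collection into db2.collection using the $merge stage (4.2+).
# whenMatched/whenNotMatched control how existing target documents are handled.
mongosh "mongodb://127.0.0.1:27017" --eval '
  db.getSiblingDB("db1").collection.aggregate([
    { $merge: { into: { db: "db2", coll: "collection" },
                whenMatched: "replace", whenNotMatched: "insert" } }
  ])
'
```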

MongoDB migration

Hello, I have an Ubuntu 14.04 server running MongoDB 2.4.14. I need to move the Mongo instance to a new server. I have installed Mongo 3.4.2 on the new server and need to move the databases over. I am pretty new to Mongo. I have two databases that are pretty big, but when I do a mongodump the file is nowhere near the size of the databases that Mongo is showing. I cannot figure out how to get mongoexport to work. What would be the best way to move those databases? If possible, can we just export the data from Mongo and then import it?
You'll need to give more information on your issue with mongodump and what mongodump parameters you were using.
Since you are doing a migration, you'll want to use mongodump and not mongoexport. mongoexport only outputs a JSON/CSV representation of a collection. Because of this, mongoexport cannot retain certain data types that exist in BSON, and thus MongoDB does not suggest using mongoexport for full backups; this consideration is listed on MongoDB's site.
mongodump will accurately create a backup of your database/collections, and mongorestore will be able to restore that dump to your new server.
If you haven't already, check out Back Up and Restore with MongoDB Tools
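A typical dump-and-restore migration between the two servers might be sketched as follows; the host names and paths are placeholders, and since 2.4 to 3.4 skips several major versions you should treat this as a sketch and verify version-compatibility notes first:

```shell
# On the old server (2.4.14): dump all databases to ./dump
mongodump --host old-server --port 27017 --out ./dump

# Copy the dump directory to the new server, e.g. with scp:
scp -r ./dump new-server:/tmp/dump

# On the new server (3.4.2): restore everything from the dump
mongorestore --host new-server --port 27017 /tmp/dump
```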

MongoDB: restore a collection in the mongo shell

I am trying to import/restore a single collection from within MongoDB (i.e. mongorestore cannot be accessed, I think ...?).
Is it possible? What is the command? Ideally, I'd like to include indexes as well. The backup has been produced by mongodump.
Specifically, I am using the IntelliShell from the excellent MongoChef. I perform other commands in this as well, such as renaming existing collections first.

How can I backup a MongoDB GridFS database the easiest way?

Like the title says, I have a MongoDB GridFS database with a whole range of file types (e.g., text, pdf, xls), and I want to backup this database the easiest way.
Replication is not an option. Preferably I'd like to do it the usual database way of dumping the database to file and then backup that file (which could be used to restore the entire database 100% later on if needed). Can that be done with mongodump? I also want the backup to be incremental. Will that be a problem with GridFS and mongodump?
Most importantly, is that the best way of doing it? I am not that familiar with MongoDB. Will mongodump work as well as mysqldump does with MySQL? What's the best practice for MongoDB GridFS and incremental backups?
I am running Linux if that makes any difference.
GridFS stores files in two collections: fs.files and fs.chunks.
More information on this may be found in the GridFS Specification document:
http://www.mongodb.org/display/DOCS/GridFS+Specification
Both collections may be backed up using mongodump, the same as any other collection. The documentation on mongodump may be found here:
http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongodump
From a terminal, this would look something like the following:
For this demonstration, my db name is "gridFS":
First, mongodump is used to back up the fs.files and fs.chunks collections to a folder on my desktop:
$ bin/mongodump --db gridFS --collection fs.chunks --out /Desktop
connected to: 127.0.0.1
DATABASE: gridFS to /Desktop/gridFS
gridFS.fs.chunks to /Desktop/gridFS/fs.chunks.bson
3 objects
$ bin/mongodump --db gridFS --collection fs.files --out /Desktop
connected to: 127.0.0.1
DATABASE: gridFS to /Desktop/gridFS
gridFS.fs.files to /Users/mbastien/Desktop/gridfs/gridFS/fs.files.bson
3 objects
Now, mongorestore is used to pull the backed-up collections into a new (for the purpose of demonstration) database called "gridFScopy"
$ bin/mongorestore --db gridFScopy --collection fs.chunks /Desktop/gridFS/fs.chunks.bson
connected to: 127.0.0.1
Thu Jan 19 12:38:43 /Desktop/gridFS/fs.chunks.bson
Thu Jan 19 12:38:43 going into namespace [gridFScopy.fs.chunks]
3 objects found
$ bin/mongorestore --db gridFScopy --collection fs.files /Desktop/gridFS/fs.files.bson
connected to: 127.0.0.1
Thu Jan 19 12:39:37 /Desktop/gridFS/fs.files.bson
Thu Jan 19 12:39:37 going into namespace [gridFScopy.fs.files]
3 objects found
Now the Mongo shell is started, so that the restore can be verified:
$ bin/mongo
MongoDB shell version: 2.0.2
connecting to: test
> use gridFScopy
switched to db gridFScopy
> show collections
fs.chunks
fs.files
system.indexes
>
The collections fs.chunks and fs.files have been successfully restored to the new DB.
You can write a script to perform mongodump on your fs.files and fs.chunks collections periodically.
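Such a script might look like the following sketch, suitable for running from cron; the database name and backup path are assumptions, not values from the question:

```shell
#!/bin/sh
# Dump the two GridFS collections into a timestamped directory.
# DB and OUT are hypothetical; adjust for your setup.
DB=gridFS
OUT=/backups/gridfs-$(date +%Y%m%d-%H%M%S)
mongodump --db "$DB" --collection fs.files  --out "$OUT"
mongodump --db "$DB" --collection fs.chunks --out "$OUT"
```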
As for incremental backups, they are not really supported by MongoDB. A Google search for "mongodb incremental backup" reveals a good mongodb-user Google Groups discussion on the subject:
http://groups.google.com/group/mongodb-user/browse_thread/thread/6b886794a9bf170f
For continuous back-ups, many users use a replica set. (Realizing that in your original question, you stated that this is not an option. This is included for other members of the Community who may be reading this response.) A member of a replica set can be hidden to ensure that it will never become Primary and will never be read from. More information on this may be found in the "Member Options" section of the Replica Set Configuration documentation.
http://www.mongodb.org/display/DOCS/Replica+Set+Configuration#ReplicaSetConfiguration-Memberoptions