Using mongomirror to sync collections within Atlas - mongodb

I want to migrate a collection from one Mongo Atlas Cluster to another. How do I go about in doing that?

There are two possible approaches here:
Migration with downtime: stop the service, export the data from the collection to some third location, import the data into the new collection on the new cluster, and resume the service.
But there's a better way: using the mongomirror utility. With it, you can sync collections across clusters without any downtime: the utility first syncs the database (or selected collections from it) and then ensures that subsequent writes to the source are synced to the destination.
Following is the syntax I used to get it to run:
./mongomirror --host atlas-something-shard-0/prod-mysourcedb--shard-00-02-pri.abcd.gcp.mongodb.net:27017 \
--username myUserName \
--password PASSWORD \
--authenticationDatabase admin \
--destination prod-somethingelse-shard-0/prod-mydestdb-shard-00-02-pri.abcd.gcp.mongodb.net:27017 \
--destinationUsername myUserName \
--destinationPassword PASSWORD \
--includeNamespace dbname.collection1 \
--includeNamespace dbname.collection2 \
--ssl \
--forceDump
Unfortunately, there are MANY pitfalls here:
Ensure your user has the correct role. This is actually covered in the docs, so read the relevant section closely.
To correctly specify the --host and --destination fields, you'll need to obtain both the replica set name and the primary instance's hostname. One way to get these is to use the mongosh tool and run rs.conf() on both the source and destination clusters. The replica set name is given as "_id" in the command's output, and the instances are listed under "members"; take the primary instance's "host" field. The end result should look like RS-name/primary-instance-host:port (see the sketch after this list).
If you specify a replica set, you MUST specify the PRIMARY instance. Failing to do so results in an obscure error (something about EOF).
I recommend adding the --forceDump flag (at least until you manage to get it to run for the first time).
If you specify non-existent collections, the utility gives only a single indication that they don't exist and then goes on to "sync" them, rather than failing.
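To illustrate the host/destination pitfall, here is a rough sketch of pulling the replica set name and the primary's host:port out of mongosh; the connection string below is a placeholder, and the fields read are simply what rs.conf() and rs.status() report:
mongosh "mongodb+srv://prod-mysourcedb.abcd.gcp.mongodb.net/admin" --username myUserName --eval '
  const cfg = rs.conf();                                                    // "_id" is the replica set name
  const primary = rs.status().members.find(m => m.stateStr === "PRIMARY");  // "name" is host:port of the primary
  print(`--host ${cfg._id}/${primary.name}`);
'
Run the same command against the destination cluster to build the --destination value.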

Related

mongorestore - AtlasError - getting non-bucket system collections is unsupported

Failed: (AtlasError) getting non-bucket system collections is unsupported
I'm trying to use mongorestore to migrate data from one database to another. Both are on Atlas. The dump works fine, but mongorestore outputs the messages below. I have no clue what this means, and a Google search turns up nothing remotely close. I've added hyphens to the beginning of each line of the output to make it more readable.
-- using write concern: &{majority false 0}
-- will listen for SIGTERM, SIGINT, and SIGKILL
-- connected to node type: replset
-- The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
-- got error from options parsing: (AtlasError) getting non-bucket system collections is unsupported
-- Failed: (AtlasError) getting non-bucket system collections is unsupported
-- 0 document(s) restored successfully. 0 document(s) failed to restore.
The command I'm running
mongorestore --uri="mongodb+srv://username:password@hostname.mongodb.net/$DEV_DATABASE" \
--preserveUUID \
--drop \
--nsFrom="$PROD_DATABASE.*" \
--nsTo="$DEV_DATABASE.*" \
--verbose \
"dump/$PROD_DATABASE"
I've also tried creating an archive file with mongodump and using that with --archive="filename", as well as piping stdout to mongorestore. I've also checked that the user I'm using has the correct privileges. They have the role of Atlas admin, which I'm assuming is correct. The dev cluster I'm trying to restore to is an M0 if that makes any difference.
I should also point out that I have minimal Mongo management experience, so I'm sure there's something I've overlooked. Thanks for your help.
MongoDB records the collection UUIDs in a separate system collection.
The --preserveUUID option instructs mongorestore to create the collection and force it to use the UUID from the source system.
The error message indicates that Atlas is refusing to allow you to access or modify that system collection.
Run without the --preserveUUID option when restoring to Atlas.
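For reference, dropping that one flag should be enough; a sketch assuming the URI and shell variables from the question are otherwise correct:
mongorestore --uri="mongodb+srv://username:password@hostname.mongodb.net/$DEV_DATABASE" \
--drop \
--nsFrom="$PROD_DATABASE.*" \
--nsTo="$DEV_DATABASE.*" \
--verbose \
"dump/$PROD_DATABASE"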

MongoDB- backing up and restoring users and roles

What are best practices for syncing users and roles between Mongo instances?
On the same Windows machine, I am trying to copy MongoDB users and roles in the admin database from one Mongo instance to another. Authentication is 'on' for each instance. No combination of mongodump/mongorestore or mongoexport/mongoimport I have tried works. With mongodump/mongorestore, the restore step displays:
assuming users in the dump directory are from <= 2.4 (auth version 1)
Failed: the users and roles collections in the dump have an incompatible auth version with target server: cannot restore users of auth version 1 to a server of auth version 5
I found no command-line option to tell it not to do this silly thing. I have Mongo version 4 and that is the only version installed.
You would think --dumpDbUsersAndRoles and --restoreDbUsersAndRoles would be symmetrical, but they are not.
I was able to run this,
mongoexport --port 27017 -u admin --password please -d admin --collection system.roles --out myRoles.json
However, when trying mongoimport
mongoimport --port 26017 -u admin --password please -d admin --collection "system.roles" --file myRoles.json
the output displays
error validating settings: invalid collection name: collection name 'system.roles' is not allowed to begin with 'system.'
Primer
Users are attached to databases. Ideally, you have your database specific users stored in the respective database. All “global” users should go into admin. The good part: replica sets take care of syncing those users to each member of the replica set.
Solution
That being said, the way to deal with this becomes fairly obvious.
For a worst-case scenario, it is much easier to have a .js file ready which simply recreates the 3-4 global roles instead of fiddling with the system.* collections in the admin database. This has the advantage that you can also do other setup automatically, like sharding setup, if TSHTF and you need to rebuild your cluster from scratch.
db = db.getSiblingDB("admin");
db.createRole({ role: "...", privileges: [ /* ... */ ], roles: [ /* ... */ ] });
db.createRole({ role: "...", privileges: [ /* ... */ ], roles: [ /* ... */ ] });
// do other stuff, like sharding setup
Run it against the primary of your replica set or a mongos instance (if you have a sharded cluster) using
mongo daHost:27017/admin myjsfile.js
after you set up your machines but before you enable authentication.
Another option would be to use Ansible for user creation.
As for dump and restore, you might want to leave out the collection name.
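If "leave out the collection name" is read as switching to mongodump/mongorestore of the whole admin database, a minimal sketch could look like the following (ports and credentials reused from the question; whether the auth-version mismatch mentioned above still bites is a separate issue):
mongodump --port 27017 -u admin --password please --authenticationDatabase admin -d admin -o ./adminDump
mongorestore --port 26017 -u admin --password please --authenticationDatabase admin -d admin ./adminDump/admin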

Can I restore data from mongo oplog?

My MongoDB was hacked today: all data was deleted, and the hacker demands a payment to get it back. I will not pay him, because I know he would not send my database back anyway.
However, I had the oplog turned on, and I can see it contains over 300,000 documents recording all operations.
Is there any tool that can restore my data from these logs?
Depending on how far back your oplog is, you may be able to restore the deployment. I would recommend taking a backup of the current state of your dbpath just in case.
Note that there are many variables in play for doing a restore like this, so success is never a guarantee. It can be done using mongodump and mongorestore, but only if your oplog goes back to the beginning of time (i.e. when the deployment was first created). If it does, you may be able to restore your data. If it does not, you'll see errors during the process.
Secure your deployment before doing anything else. This situation arises due to a lack of security. There are extensive security features available in MongoDB. Check out the Security Checklist page for details.
Dump the oplog collection using mongodump --host <old_host> --username <user> --password <pwd> -d local -c oplog.rs -o oplogDump.
Check the content of the oplog to determine the timestamp at which the offending drop operation occurred by using bsondump oplogDump/local/oplog.rs.bson. You're looking for a line that looks approximately like this:
{"ts":{"$timestamp":{"t":1502172266,"i":1}},"t":{"$numberLong":"1"},"h":{"$numberLong":"7041819298365940282"},"v":2,"op":"c","ns":"test.$cmd","o":{"dropDatabase":1}}
This line means that a dropDatabase() command was executed on the test database.
Keep note of the t value in {"$timestamp":{"t":1502172266,"i":1}}.
Restore to a secure new deployment using mongorestore --host <new_host> --username <user> --password <pwd> --oplogReplay --oplogLimit=1502172266 --oplogFile=oplogDump/local/oplog.rs.bson oplogDump
Note the parameter to oplogLimit, which basically tells mongorestore to stop replaying the oplog once it hits that timestamp (which is the timestamp of the dropDatabase command from Step 3).
The oplogFile parameter is new to MongoDB 3.4. For older versions, you would need to copy oplogDump/local/oplog.rs.bson to the root of the dump directory as a file named oplog.bson (e.g. oplogDump/oplog.bson) and remove the oplogFile parameter from the example command above.
After Step 4, if your oplog goes back to the beginning of time and you stop the oplog replay at the right time, hopefully you should see your data at the point just before the dropDatabase command was executed.
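For the pre-3.4 case just described, the workaround amounts to a copy plus the same restore command without --oplogFile; a minimal sketch using the placeholders and example timestamp from the steps above:
cp oplogDump/local/oplog.rs.bson oplogDump/oplog.bson
mongorestore --host <new_host> --username <user> --password <pwd> \
--oplogReplay --oplogLimit=1502172266 oplogDump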

How to get a consistent MongoDB backup for a single node setup

I'm using MongoDB in a pretty simple setup and need a consistent backup strategy. I found out the hard way that wrapping a mongodump in a lock/unlock is a bad idea. Then I read that the --oplog option should be able to provide consistency without lock/unlock. However, when I tried that, it said that I could only use the --oplog option on a "full dump." I've poked around the docs and lots of articles but it still seems unclear on how to dump a mongo database from a single point in time.
For now I'm just going with a normal dump, but I'm assuming that if there are writes during the dump, the backup would not be from a single point in time, correct?
mongodump -h $MONGO_HOST:$MONGO_PORT -d $MONGO_DATABASE -o ./${EXPORT_FILE} -u backup -p password --authenticationDatabase admin
In a production environment, MongoDB is typically deployed as a replica set to ensure redundancy and high availability. There are a few options available for a point-in-time backup if you are running a standalone mongod instance.
One option, as you have mentioned, is to do a mongodump with the --oplog option. However, this option is only available if you are running a replica set. You can easily convert a standalone mongod instance to a single-node replica set without adding any new members. Please check the following document for details.
http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/
This way, if there are writes while mongodump is running, they will be part of your backup. Please see the Point in Time Operation Using Oplogs section at the following link.
http://docs.mongodb.org/manual/tutorial/backup-databases-with-binary-database-dumps/#point-in-time-operation-using-oplogs
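Once the instance is running as a (single-node) replica set, the point-in-time dump could look roughly like the following, reusing the variables from the question; note that --oplog requires a full dump, so the -d flag is dropped. $RESTORE_HOST and $RESTORE_PORT are placeholders for wherever you restore to:
mongodump -h $MONGO_HOST:$MONGO_PORT -u backup -p password --authenticationDatabase admin --oplog -o ./${EXPORT_FILE}
mongorestore -h $RESTORE_HOST:$RESTORE_PORT -u backup -p password --authenticationDatabase admin --oplogReplay ./${EXPORT_FILE}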
Be aware that using mongodump and mongorestore to back up and restore MongoDB can be slow.
A filesystem snapshot is another option. The following link details two snapshot options for performing a hot backup of MongoDB.
http://docs.mongodb.org/manual/tutorial/backup-databases-with-filesystem-snapshots/
You can also look into MongoDB backup service.
http://www.10gen.com/products/mongodb-backup-service
In addition, mongodump with the --oplog option does not work with a single db/collection at the moment. There are plans to implement this feature; you can follow the ticket and vote for it under the More Actions button.
https://jira.mongodb.org/browse/SERVER-4273

Heroku: Storing local MongoDB to MongoLab

It might be a dead simple question, yet I still wanted to ask. I've created a Node.js application and deployed it on Heroku. I've also set up the database connection without any trouble.
However, I cannot load the local data from my MongoDB into the MongoLab database I use on Heroku. I've searched Google and could not find a useful solution, so I ended up trying these commands:
mongodump
And:
mongorestore -h mydburl:mydbport -d mydbname -u myusername -p mypassword --db Collect.1
Now when I run the mongorestore command, I receive this error:
ERROR: multiple occurrences
Import BSON files into MongoDB.
When I take a look at the DB files for the MongoDB instance I used during local development, I see the files Collect.0, Collect.1 and Collect.ns. Now I know that my db name is 'Collect', since when I use the shell I always type `use Collect`. So I specified the db as Collect.1 on the command line, but I still receive the same errors. Should I remove all the other Collect files, or is there another way around this?
You can't use 'mongorestore' against the raw database files. 'mongorestore' is meant to work off of a dump file generated by 'mongodump'. First use 'mongodump' to dump your local database, and then use 'mongorestore' to restore that dump.
If you go to the Tools tab in the MongoLab UI for your database, and click 'Import / Export' you can see an example of each command with the correct params for your database.
Email us at support@mongolab.com if you continue to have trouble.
-will
This can be done in two steps.
1. Dump the database:
mongodump -d mylocal_db_name -o dump/
2. Restore the database:
mongorestore -h xyz.mongolab.com:12345 -d remote_db_name -u username -p password dump/mylocal_db_name/