This query is related to building a small MongoDB test database from a large existing database.
My plan to execute this is as follows:
a) Use mongodump with a query that specifies my conditions for the records to be copied over to the test database.
Will this idea work? From what I have read on forums, a MongoDB query cannot be used as-is in a mongodump command.
Any guidance on this is most appreciated.
You can use mongodump with the --query option to get a subset of the DB. Note that --query takes a find filter (written as extended JSON in recent versions), not an aggregation pipeline, and it must be used together with --collection. For example:
mongodump --db=yourDatabase --collection=yourCollection --query='{"status": "active"}'
For more information, read the mongodump documentation here.
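If the goal is a separate test database, here is a minimal sketch of the full round trip (the host names, the createdAt field, and the paths are placeholders):
mongodump --host prodHost --db=yourDatabase --collection=yourCollection --query='{"createdAt": {"$gte": {"$date": "2023-01-01T00:00:00Z"}}}' --out=/tmp/subset
mongorestore --host testHost --nsFrom='yourDatabase.*' --nsTo='testDatabase.*' /tmp/subset
The --nsFrom/--nsTo options of mongorestore rewrite the namespace on restore, so the dumped subset lands in testDatabase rather than back in yourDatabase.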
I need to clean up a MongoDB collection of 200 TB by deleting documents with older timestamps. Rather than running a delete query on the collection that is currently in use, which would slow down other requests to it, I am trying to build a new collection containing only the recent data. I have thought of cloning into a new collection either by taking a dump of the existing collection, or by writing a read-and-write script that reads from the present collection and writes to the cloned collection. My question is: is a batched read/write operation (e.g., reading and writing 1000 documents at a time) faster than a dump?
EDIT:
I found this, this and this article, and want to know if writing a script in the way mentioned above is the same as creating an ssh pipe of reads and writes. For example: is a Node/Python script that fetches 1000 documents from a collection and inserts them into a clone collection the same as ssh *** ". /etc/profile; mongodump -h sourceHost -d yourDatabase … | mongorestore -h targetHost -d yourDatabase"?
I would suggest this approach:
Rename the collection. Your application will immediately create a new empty collection with the old name when it next tries to insert data; at that point you may also create the indexes you need (a sketch of the rename follows this list).
Run mongoexport/mongoimport to import only the valid data, i.e. skip the outdated documents.
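A minimal sketch of the rename step in the mongo shell (database and collection names are placeholders; note that renameCollection fails if the target name already exists):
use yourDatabase
db.yourCollection.renameCollection("yourCollection_old")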
Yes, in general mongodump/mongorestore might be faster; however, with mongoexport you can define a query and limit the data which is exported. It could look like this:
mongoexport --uri "..." --db=yourDatabase --collection=collection --query='{"timestamp": {"$gt": {"$date": "2022-01-01T00:00:00Z"}}}' | mongoimport --uri "..." --db=yourDatabase --collection=collection --numInsertionWorkers=10
Use the --numInsertionWorkers parameter to run multiple insertion workers in parallel; this will speed up your inserts.
Do you run a sharded cluster? If yes, you should pre-split the chunks of the new collection with sh.splitAt(); see How to copy a collection from one database to another in MongoDB
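A minimal sketch of such a pre-split, assuming for illustration that the shard key is timestamp (your shard key and split points will differ):
sh.splitAt("yourDatabase.yourCollection", { timestamp: ISODate("2022-06-01") })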
I have a MongoDB database at version 3.6.3. I have another MongoDB database (on another machine) at version 4.4.5 with no documents in it. I want to put the data from the v3.6.3 database into the v4.4.5 database. Can I safely do this using mongoexport and then mongoimport, or do I need to perform more steps?
Yes, mongoexport writes the documents out to a JSON file, and mongoimport can read that file and insert the documents into the new database.
These will transfer only the documents, but not index information. You may want to consider mongodump/mongorestore if you also need to move indexes.
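A minimal sketch of that alternative (host names and the dump directory are placeholders):
mongodump --host oldHost --port 27017 --out /backup/dump
mongorestore --host newHost --port 27017 /backup/dump
mongorestore recreates the indexes recorded in the dump's metadata files, which is why this route preserves index information while mongoexport/mongoimport does not.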
My application is using RethinkDB. Everything is running fine, but a new requirement means we need to migrate the DB to MongoDB.
Is this possible? How do I migrate the tables/collections, data, indexes, etc?
What about blob types and auto-incremented ids?
Thanks!
Is this possible? How do I migrate the tables/collections, data, indexes, etc?
One way to migrate data from RethinkDB to MongoDB is to export the data from RethinkDB using the rethinkdb dump command, and then use mongoimport to import it into MongoDB. For example:
rethinkdb dump -e dbname.tableName
This would generate an archive file:
rethinkdb_dump_<datetime>.tar.gz
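You can uncompress it with tar, for example (the file name will contain the actual timestamp):
tar -xzf rethinkdb_dump_<datetime>.tar.gz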
After uncompressing the archive file, you can then use mongoimport as below:
mongoimport --jsonArray --db dbName --collection tableName ./rethinkdb_dump_<datetime>/dbName/tableName.json
Unfortunately, for the indexes the formats of RethinkDB and MongoDB are quite different. The index definitions are stored within the same archive file:
./rethinkdb_dump_<datetime>/dbName/tableName.info
However, you can still write a Python script to read the info file and use the MongoDB Python driver (PyMongo) to create the indexes in MongoDB. See also the create_indexes() method for more information.
One of the reasons for suggesting Python is that RethinkDB also has an official Python client driver. So technically, you could even skip the export stage and write a script that connects your RethinkDB instance to MongoDB directly.
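A minimal Python sketch of the index step, assuming the .info file is JSON whose "indexes" entries each carry the index name under an "index" key, and that each index is a simple single-field index that maps 1:1 to a MongoDB field (RethinkDB indexes backed by arbitrary ReQL functions have no direct MongoDB equivalent):
import json
from pymongo import ASCENDING, IndexModel, MongoClient

# Path mirrors the dump layout shown above; replace <datetime> with the
# real timestamp from your archive.
with open("./rethinkdb_dump_<datetime>/dbName/tableName.info") as f:
    info = json.load(f)

collection = MongoClient("mongodb://localhost:27017")["dbName"]["tableName"]

# One ascending single-field index per index name found in the metadata.
# (Assumed structure: info["indexes"] is a list of {"index": <name>, ...}.)
models = [IndexModel([(idx["index"], ASCENDING)]) for idx in info.get("indexes", [])]
if models:
    collection.create_indexes(models)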
Hello, I have an Ubuntu 14.04 server that is running MongoDB 2.4.14. I need to move the Mongo instance to a new server. I have installed MongoDB 3.4.2 on the new server and need to move the databases over. I am pretty new to Mongo. I have two databases that are pretty big, but when I do a mongodump the file is nowhere near the size of the databases that Mongo is showing. I also cannot figure out how to get mongoexport to work. What would be the best way to move those databases? If possible, can we just export the data from Mongo and then import it?
You'll need to give more information on your issue with mongodump and what mongodump parameters you were using.
Since you are doing a migration, you'll want to use mongodump and not mongoexport. mongoexport only outputs a JSON/CSV representation of a collection; because of this, it cannot retain certain datatypes that exist in BSON, and MongoDB therefore advises against using mongoexport for full backups. This consideration is listed on MongoDB's site.
mongodump will be able to accurately create a backup of your database/collection, and mongorestore will then be able to restore that dump to your new server.
If you haven't already, check out Back Up and Restore with MongoDB Tools.
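A minimal sketch of the move (host names and the dump directory are placeholders; dump on the old server, copy the directory across, then restore on the new one):
mongodump --host localhost --port 27017 --out /backup/fulldump
scp -r /backup/fulldump user@newServer:/backup/
mongorestore /backup/fulldump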
I use MongoChef as a UI client for my Mongo database. I have a collection which consists of 12,000 records, and I want to export them using MongoChef.
I have tried the available export option, which works fine for up to 3,000 documents, but as the number of records increases the application hangs.
Can you please let me know the best way to export all the documents using MongoChef?
Thanks.
In the end, I came to the conclusion that using the Mongo tools from the terminal is the best (most efficient) way.
I read about primary and secondary nodes and executed the following command:
mongoexport --username user --password pass --host host --db database --collection coll --out file_name.json
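If you are running a replica set and want to keep the export load off the primary, mongoexport also accepts a read preference; a sketch reusing the placeholders above:
mongoexport --username user --password pass --host host --db database --collection coll --readPreference=secondary --out file_name.json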