How do I dump a collection into an insert statement? - MongoDB

I want to export a collection into an insert statement that contains all of the collection's documents.
MongoDB's dump/export functionality does not give me the result I want.
My motivation is to create a verbose dump file that the entire team working on the project can easily read and execute.
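One possible approach, as a minimal sketch rather than a built-in MongoDB feature: export the documents as Extended JSON first, then render them as a single insertMany call that teammates can read and paste into the shell. The helper names toShellLiteral and toInsertStatement are hypothetical, and the sketch only special-cases the {"$oid": ...} form of ObjectId; dates and other Extended JSON types would need similar handling.

```javascript
// Hypothetical helper: render one JSON value as mongo shell source text.
function toShellLiteral(value, indent = "") {
  const inner = indent + "  ";
  if (Array.isArray(value)) {
    if (value.length === 0) return "[]";
    const items = value.map((v) => inner + toShellLiteral(v, inner));
    return "[\n" + items.join(",\n") + "\n" + indent + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value);
    // Extended JSON form of an ObjectId: {"$oid": "<hex string>"}
    if (keys.length === 1 && keys[0] === "$oid" && typeof value.$oid === "string") {
      return "ObjectId(" + JSON.stringify(value.$oid) + ")";
    }
    if (keys.length === 0) return "{}";
    const fields = keys.map(
      (k) => inner + JSON.stringify(k) + ": " + toShellLiteral(value[k], inner)
    );
    return "{\n" + fields.join(",\n") + "\n" + indent + "}";
  }
  // Strings, numbers, booleans and null are emitted as plain JSON.
  return JSON.stringify(value);
}

// Hypothetical helper: build one insertMany statement for a whole collection.
function toInsertStatement(collectionName, docs) {
  return (
    "db.getCollection(" + JSON.stringify(collectionName) + ").insertMany(" +
    toShellLiteral(docs) + ");"
  );
}

// Example input, assumed to come from an Extended JSON export of the collection.
console.log(
  toInsertStatement("texts", [
    { _id: { $oid: "61548955c1bc78934cc3ef10" }, contextName: "lokhead" },
  ])
);
```

The printed statement can be saved to a .js file and run against the target database; for very large collections a single statement may become unwieldy, so splitting it into several statements is probably wiser.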

Related

How to save data in MongoDB with one-to-many relations?

I need to save texts which are split into statements (can be 5 or 5000 statements) and I'm thinking about a way to do it in MongoDB for maximum performance.
Right now I simply write every text (with all the statements) into the texts collection of MongoDB. I create a property statements in the text document where I write all the statements as an array (with an id assigned in my Node.js app via uuid.v1).
So the data looks like this:
{"_id":{"$oid":"61548955c1bc78934cc3ef10"},
"statements":
[
{"content":"stop thinking ideas and bring stop thinking and decide ", "id":"ID1"},
{"content":" stop thinking ", "id":"ID2"},
{"content":" orange apple", "id":"ID3"}
],
"user":{"$oid":"61536842a8097fbad18fe633"},
"contextName":"lokhead","__v":0}
There is also another option: to create a separate collection for statements and to write the statements with the right context id to that collection.
So there are 3 options and I wanted to ask your advice on the best one in terms of performance:
1. Write all the statements into the text document directly
2. Create a new collection statements and write the statements there with the related context.id
3. Create a new collection statements and write the statements there, while creating a one-to-many relation in the texts collection document, which lists all the IDs of the statements added into the DB.
What would be the most efficient way, with the best data structure?

Replace ObjectId('11111111') with ObjectId('2222222') at any occurrence

I have a mongo collection in which at any depth or within array elements there may occur ObjectId('11111111') as a value.
I need to replace it everywhere with ObjectId('2222222').
Is there an easy way of doing this in mongo?
Dump the collection to extended json, open the dump in a text editor, replace the values, then load the dump back into the database.
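If a text editor is impractical, the same replacement can be sketched in code over the parsed Extended JSON, where an ObjectId appears as {"$oid": "..."}. This is only an illustration under that assumption: replaceOid is a hypothetical helper, and the ids below are the placeholder values from the question, not valid ObjectIds.

```javascript
// Hypothetical helper: return a copy of `value` in which every Extended JSON
// ObjectId equal to `fromId` is replaced by `toId`, at any depth.
function replaceOid(value, fromId, toId) {
  if (Array.isArray(value)) {
    return value.map((v) => replaceOid(v, fromId, toId));
  }
  if (value !== null && typeof value === "object") {
    // Match only the exact {"$oid": fromId} shape, not plain strings.
    if (Object.keys(value).length === 1 && value.$oid === fromId) {
      return { $oid: toId };
    }
    const copy = {};
    for (const [key, child] of Object.entries(value)) {
      copy[key] = replaceOid(child, fromId, toId);
    }
    return copy;
  }
  return value; // strings, numbers, booleans and null are left untouched
}

// Example document with the id nested in a subdocument and in an array.
const doc = {
  owner: { $oid: "11111111" },
  history: [{ editor: { $oid: "11111111" } }, { editor: { $oid: "33333333" } }],
  note: "11111111", // a plain string with the same text is not an ObjectId
};
console.log(JSON.stringify(replaceOid(doc, "11111111", "2222222")));
```

Unlike a plain text search-and-replace, this only touches values that have the ObjectId shape, so a string field that happens to contain the same characters is left alone.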

Insert all documents from one collection into another collection in MongoDB database

I have a Python script that collects data every day and inserts it into a MongoDB collection (~10M documents). Sometimes the job fails and I am left with partial data, which is not useful to me. I would like to insert the data into a staging collection first and then copy or move all documents from the staging collection into the final collection only when the job finishes and the data is complete. I cannot seem to find a straightforward solution for doing this as a "bulk" type operation, but it seems there should be one.
In SQL it would be something like this:
INSERT INTO final_table
SELECT *
FROM staging_table
I thought that db.collection.copyTo() would work for this but it seems it makes the destination collection a clone of the source collection.
Additionally, I know from the question "mongodb move documents from one collection to another collection" that I can do something like the following:
var documentsToMove = db.collectionA.find({});
documentsToMove.forEach(function(doc) {
    db.collectionB.insert(doc);
});
But it seems like there should be a more efficient way.
So, how can I take all documents from one collection and insert them into another collection in the most efficient manner?
NOTE: the final collection has data in it already. The new documents that I want to move over would be added to this data, e.g., if my staging collection has 2 documents and my final collection has 10 documents, I would have 12 documents in my final collection after I move the staging data over.
You can use db.cloneCollection(); see mongodb cloneCollection.
If you no longer need the staging collection, you can simply use the renaming option.
Switch to the admin db and run:
db.runCommand({renameCollection:"staging.CollectionA",to:"targetdb.CollectionB"})
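Since the final collection already holds data, another option is to append the staging documents in batches instead of one insert per document. The sketch below shows only the batching step; inBatches is a hypothetical helper, the batch size is an arbitrary choice, and how the documents are read from the staging cursor depends on the shell or driver in use and is not shown.

```javascript
// Hypothetical helper: group any iterable of documents into arrays of at most
// `batchSize` items, so each array can be passed to one bulk insert call.
function* inBatches(docs, batchSize) {
  let batch = [];
  for (const doc of docs) {
    batch.push(doc);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  // Emit the final, possibly smaller, batch.
  if (batch.length > 0) {
    yield batch;
  }
}

// Stand-in data; in practice these would be the staging documents.
const stagingDocs = [{ n: 1 }, { n: 2 }, { n: 3 }, { n: 4 }, { n: 5 }];
for (const batch of inBatches(stagingDocs, 2)) {
  // Assumption: each batch would be sent with something like
  // db.collectionB.insertMany(batch) instead of being logged.
  console.log(batch.length);
}
```

Each batch becomes one round trip to the server, which is usually much cheaper than inserting ~10M documents one at a time.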

Is there any way to restore a predefined schema to MongoDB?

I'm a beginner with MongoDB. I want to know whether there is any way to load a predefined schema into MongoDB (for example, like Cassandra, which uses a .cql file for this purpose).
If there is, please point me to some documentation about the structure of that file and how to restore it.
If there is not, how can I create an index only once, when I create a collection? I think it is wrong to create the index every time I call the insert method or run my program.
P.S.: I have a multi-threaded program in which every thread inserts into and updates my Mongo collection. I want to create the index only once.
Thanks.
To create an index on a collection, use the ensureIndex command (deprecated since MongoDB 3.0 in favor of createIndex, which takes the same key specification). You only need to call it once to create an index on a collection.
If you call ensureIndex repeatedly with the same arguments, only the first call will create an index, all subsequent calls will have no effect.
So if you know what indexes you're going to use for your database, you can create a script that will call that command.
An example insert_index.js file that creates 2 indexes for collA and collB collections:
db.collA.ensureIndex({ a : 1});
db.collB.ensureIndex({ b : -1});
You can call it from a shell like this:
mongo --quiet localhost/dbName insert_index.js
This will create those indexes on a database named dbName on your localhost. It's worth noting that if your database and/or collections are not yet created, this will create both the database and the collections for which you're adding the indexes.
Edit
To clarify a little bit: MongoDB is schemaless, so you can't restore its schema.
You can only create indexes and collections (by using createCollection helper).
MongoDB is basically schemaless so there is no definition of a schema or namespaces to be restored.
In the case of indexes, these can be created at any time. There does not need to be a collection present or even the required fields for the index, as this will all be sorted out as the collections are created and when documents are inserted that match the defined fields.
Commands to create an index are generally the same with each implementation language, for example:
db.collection.ensureIndex({ a: 1, b: -1 })
This will define the index on the target collection in the target database, referencing field "a" and field "b", the latter in descending order. This will happen even if the collection or even the database does not exist yet; in that case it will establish an empty namespace.
Subsequent calls to the same index creation method do not actually re-create the index. Where the specified index is the same as one that already exists, the call is effectively skipped as a "no-operation".
As such, you can simply feed all your required index creation statements at application startup and anything that is not already present will be created. Anything that already exists will be left alone.

Is there a better way to export a mongodb query to a new collection?

What I want:
I have a master collection of products. I want to filter them and put the results in a separate collection.
db.masterproducts.find({category:"scuba gear"}).copyTo(db.newcollection)
Of course, I realise the 'copyTo' does not exist.
I thought I could do it with MapReduce as results are created in a new collection using the new 'out' parameter in v1.8; however this new collection is not a subset of my original collection. Or can it be if I use MapReduce correctly?
To get around it I am currently doing this:
Step 1:
/usr/local/mongodb/bin/mongodump --db database --collection masterproducts -q '{category:"scuba gear"}'
Step 2:
/usr/local/mongodb/bin/mongorestore -d database -c newcollection --drop packages.bson
My 2 step method just seems rather inefficient!
Any help greatly appreciated.
Thanks
Bob
You can iterate through your query result and save each item like this:
db.oldCollection.find(query).forEach(function(x){db.newCollection.save(x);})
You can create a small server-side JavaScript function (like this one, just add the filtering you want) and execute it using eval
You can use dump/restore in the way you described above
A copy-collection command should be in MongoDB soon (features are done in order of votes)! See the JIRA feature request.
You should be able to create a subset with mapreduce (using 'out'). The problem is that mapreduce has a special output format, so your documents are going to be transformed (there is a JIRA ticket to add support for another format, but I cannot find it at the moment). It is also going to be very inefficient :/
Copying a cursor to a collection makes a lot of sense, I suggest creating a ticket for this.
There is also the toArray() method, which can be used:
// create the new collection
db.createCollection("resultCollection");
// now query for type="foo" and insert the results into the new collection
db.resultCollection.insert(db.originalCollection.find({type:'foo'}).toArray());
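On servers that support the aggregation framework's $out stage (which postdates the v1.8 mentioned in the question), the filter-and-copy can likely be done server-side in one call. The sketch below only builds the pipeline as a plain array; filterIntoCollectionPipeline is a hypothetical helper name, and the collection names are the ones from the question.

```javascript
// Hypothetical helper: build an aggregation pipeline that keeps the documents
// matching `filter` and writes them to the collection named `target`.
function filterIntoCollectionPipeline(filter, target) {
  return [
    { $match: filter }, // same syntax as the query passed to find()
    { $out: target },   // note: $out replaces the target collection if it exists
  ];
}

const pipeline = filterIntoCollectionPipeline(
  { category: "scuba gear" },
  "newcollection"
);
console.log(JSON.stringify(pipeline));

// Assumed usage in the shell:
// db.masterproducts.aggregate(pipeline);
```

Because the documents never leave the server, this avoids both the dump/restore round trip and the per-document save loop.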