Inline/combine other collection into one collection - mongodb

I want to combine two mongodb collections.
Basically I have a collection containing documents that reference one document from another collection. Now I want to have this as an inline / nested field instead of a separate document.
So just to provide an example:
Collection A:
[{
"_id":"90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"someValue": "foo"
},
{
"_id":"5F0BB248-E628-4B8F-A2F6-FECD79B78354",
"someValue": "bar"
}]
Collection B:
[{
"_id":"169099A4-5EB9-4D55-8118-53D30B8A2E1A",
"collectionAID":"90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"some":"foo",
"andOther":"stuff"
},
{
"_id":"83B14A8B-86A8-49FF-8394-0A7F9E709C13",
"collectionAID":"90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"some":"bar",
"andOther":"random"
}]
This should result in Collection A looking like this:
[{
"_id":"90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"someValue": "foo",
"collectionB":[{
"some":"foo",
"andOther":"stuff"
},{
"some":"bar",
"andOther":"random"
}]
},
{
"_id":"5F0BB248-E628-4B8F-A2F6-FECD79B78354",
"someValue": "bar"
}]

I'd suggest something simple like this from the console:
db.collB.find().forEach(function(doc) {
var aid = doc.collectionAID;
if (typeof aid === 'undefined') { return; } // nothing
delete doc["_id"]; // remove property
delete doc["collectionAID"]; // remove property
db.collA.update({_id: aid}, /* match the ID from B */
{ $push : { collectionB : doc }});
});
It loops through each document in collection B and, if the field collectionAID is defined, removes the unnecessary properties (_id and collectionAID). It then updates the matching document in collection A, using the $push operator to add the document from B to the field collectionB. If the field doesn't exist, it is automatically created as an array containing the newly inserted document. If it already exists as an array, the document is appended. (If it exists but isn't an array, the update fails.) Because the update call isn't using upsert, nothing happens if the collectionAID of the document from B doesn't match any _id in collection A.
You can extend it to delete other fields as necessary, or add more robust error handling for cases such as a document from B that doesn't match anything in A.
Running the code above on your data produces this:
{ "_id" : "5F0BB248-E628-4B8F-A2F6-FECD79B78354", "someValue" : "bar" }
{ "_id" : "90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"collectionB" : [
{
"some" : "foo",
"andOther" : "stuff"
},
{
"some" : "bar",
"andOther" : "random"
}
],
"someValue" : "foo"
}

Sadly mapreduce can't produce full documents.
https://jira.mongodb.org/browse/SERVER-2517
No idea why despite all the attention, whining and upvotes they haven't changed it. So you'll have to do this manually in the language of your choice.
Hopefully you've indexed 'collectionAID' which should improve the speed of your queries. Just write something that goes through your A collection one document at a time, loading the _id and then adding the array from Collection B.

There is a much faster way than https://stackoverflow.com/a/22676205/1578508
You can do it the other way round and run through the collection you want to insert your documents into. (Far fewer executions!)
db.collA.find().forEach(function (x) {
// exclude _id and the back-reference collectionAID from the embedded documents
var collBs = db.collB.find({"collectionAID":x._id},{"_id":0,"collectionAID":0});
x.collectionB = collBs.toArray();
db.collA.save(x);
})
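To make the embedding logic itself easier to follow, here is a minimal sketch in plain JavaScript that works on in-memory arrays instead of a live database. The function name embedChildren and the local arrays are assumptions for illustration only; the field names come from the question. Unlike the shell snippet above, this sketch leaves parents without matching children untouched, which matches the result shown in the question.

```javascript
// Hypothetical in-memory copies of the two collections from the question.
const collA = [
  { _id: "90A26C2A-4976-4EDD-850D-2ED8BEA46F9E", someValue: "foo" },
  { _id: "5F0BB248-E628-4B8F-A2F6-FECD79B78354", someValue: "bar" },
];
const collB = [
  { _id: "169099A4-5EB9-4D55-8118-53D30B8A2E1A",
    collectionAID: "90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
    some: "foo", andOther: "stuff" },
  { _id: "83B14A8B-86A8-49FF-8394-0A7F9E709C13",
    collectionAID: "90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
    some: "bar", andOther: "random" },
];

// Hypothetical helper: returns copies of the parents with their children embedded.
function embedChildren(parents, children) {
  // Group the children by the parent _id they reference,
  // dropping _id and collectionAID from each embedded copy.
  const byParent = new Map();
  for (const { _id, collectionAID, ...rest } of children) {
    if (collectionAID === undefined) continue; // no reference, nothing to embed
    if (!byParent.has(collectionAID)) byParent.set(collectionAID, []);
    byParent.get(collectionAID).push(rest);
  }
  // Attach each group to its parent; parents without children stay unchanged.
  return parents.map((parent) =>
    byParent.has(parent._id)
      ? { ...parent, collectionB: byParent.get(parent._id) }
      : { ...parent }
  );
}

const merged = embedChildren(collA, collB);
console.log(JSON.stringify(merged, null, 2));
```

Grouping B once and then attaching the groups touches each document a single time, which is the same idea as driving the loop from collection A.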


Projecting nested documents in Mongo

I am trying to find and project from a nested structure. For example, I have the following document where each unit might have an embedded sub-unit:
{
"_id" : 1,
"unit" : {
"_id" : 2,
"unit" : {
"_id" : 3,
"unit" : {
"_id" : 4
}
}
}
}
And I want to get the id's of all the subunits under unit 1:
[{_id:2}, {_id:3}, {_id:4}]
$graphLookup does not seem to handle this kind of nested structure. As far as I understand, it works when the units are saved at a single level without nesting and each keeps a reference to its parent unit.
What is the correct way to retrieve the desired result?
Firstly, $graphLookup isn't the right operator for your problem, because it searches recursively across a collection, not recursively within a document:
$graphLookup Performs a recursive search on a collection, with options
for restricting the search by recursion depth and query filter.
So it does not search recursively inside your document; it only searches recursively across a collection (multiple documents), and it cannot handle your problem.
For your problem, I don't think this is MongoDB's responsibility, because you have already retrieved the document you want. You want to turn the retrieved document into an array of sub-documents, which you can do in your own language.
For example, if you use JavaScript (Node.js on the backend), you can parse this document into an array:
const a = {
"_id": 1,
"unit": {
"_id": 2,
"unit": {
"_id": 3,
"unit": {
"_id": 4
}
}
}
}
const parse = o => {
const { _id } = o;
if (!o.unit) return [{ _id }];
return [{ _id }, ...parse(o.unit) ];
}
console.log(parse(a.unit));
You cannot do that from a MongoDB query. MongoDB will return the document with _id: 1, but it will not recursively search inside the document.
What you can do is retrieve the document from MongoDB, parse it into a Map object, and then retrieve the information from that map recursively.
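As an alternative to the recursive parse shown earlier, the same walk can be written as a loop, which avoids deep call stacks on very long chains. This is only a sketch in plain JavaScript; the helper name collectSubUnitIds is an assumption, and it expects each level to hold at most one embedded "unit" field, as in the question's document.

```javascript
// Hypothetical helper: collect the _id of every nested "unit" below the root.
function collectSubUnitIds(root) {
  const ids = [];
  // Follow the chain of embedded "unit" fields until it ends.
  for (let node = root.unit; node; node = node.unit) {
    ids.push({ _id: node._id });
  }
  return ids;
}

// The document from the question, as it would look after being fetched.
const doc = { _id: 1, unit: { _id: 2, unit: { _id: 3, unit: { _id: 4 } } } };
console.log(collectSubUnitIds(doc));
```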

Can I update the existing record of mongodb by its id? [duplicate]

I want to update the _id field of one document. I know it's not really good practice, but for some technical reason I need to update it.
If I try to update it I get:
db.clients.update({ _id: ObjectId("123")}, { $set: { _id: ObjectId("456")}})
Performing an update on the path '_id' would modify the immutable field '_id'
And the update is rejected. How can I update it?
You cannot update it. You'll have to save the document using a new _id, and then remove the old document.
// store the document in a variable
doc = db.clients.findOne({_id: ObjectId("4cc45467c55f4d2d2a000002")})
// set a new _id on the document
doc._id = ObjectId("4c8a331bda76c559ef000004")
// insert the document, using the new _id
db.clients.insert(doc)
// remove the document with the old _id
db.clients.remove({_id: ObjectId("4cc45467c55f4d2d2a000002")})
To do it for your whole collection you can also use a loop (based on Niels' example):
db.status.find().forEach(function(doc){
doc._id=doc.UserId; db.status_new.insert(doc);
});
db.status_new.renameCollection("status", true);
In this case UserId was the new ID I wanted to use.
If you want to rename _id within the same collection (for instance, to prefix some _ids):
db.someCollection.find().snapshot().forEach(function(doc) {
if (doc._id.indexOf("2019:") != 0) {
print("Processing: " + doc._id);
var oldDocId = doc._id;
doc._id = "2019:" + doc._id;
db.someCollection.insert(doc);
db.someCollection.remove({_id: oldDocId});
}
});
The check if (doc._id.indexOf("2019:") != 0) {... is needed to prevent an infinite loop, since forEach picks up the inserted docs, even though the .snapshot() method is used.
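The effect of that guard can be illustrated without a database. The sketch below is only an analogy in plain JavaScript: a Map also visits entries that are added while it is being iterated, so it mimics the behaviour described above, but it says nothing about how a MongoDB cursor works internally. The sample keys and values are made up for illustration.

```javascript
// In-memory stand-in for the collection, keyed by _id (hypothetical data).
const docs = new Map([
  ["a", { name: "first" }],
  ["b", { name: "second" }],
]);

let visits = 0;
for (const [id, doc] of docs) {
  visits++;
  // Guard: entries that already carry the prefix were created by this loop.
  if (id.indexOf("2019:") !== 0) {
    docs.set("2019:" + id, doc); // "insert" under the new key
    docs.delete(id);             // "remove" the old key
  }
}

console.log([...docs.keys()], visits);
```

The loop visits the re-keyed entries as well, and only the prefix check stops them from being prefixed again and again.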
Here I have a solution that avoids multiple requests, for loops and old document removal.
You can easily create a new _id manually using something like: _id: ObjectId()
But knowing that Mongo will automatically assign an _id if one is missing, you can use aggregate with a $project stage containing all the fields of your document except _id. You can then save it with $out.
So if your document is:
{
"_id":ObjectId("5b5ed345cfbce6787588e480"),
"title": "foo",
"description": "bar"
}
Then your query will be:
db.getCollection('myCollection').aggregate([
{$match:
{_id: ObjectId("5b5ed345cfbce6787588e480")}
},
{$project:
{
// _id is included by default, so exclude it explicitly
_id: 0,
title: '$title',
description: '$description'
}
},
{$out: 'myCollection'}
])
You can also create a new document from MongoDB Compass or using a command and set the specific _id value that you want.
As a very small improvement to the above answers, I would suggest using
let doc1 = {... doc};
then
db.dyn_user_metricFormulaDefinitions.deleteOne({_id: doc._id});
This way we don't need to create an extra variable to hold the old _id.
A slightly modified version of @Florent Arlandis's example above, where we set _id from a different field in the document:
> db.coll.insertOne({ "_id": 1, "item": { "product": { "id": 11 } }, "source": "Good Store" })
{ "acknowledged" : true, "insertedId" : 1 }
> db.coll.aggregate( [ { $set: { _id : "$item.product.id" }}, { $out: "coll" } ]) // inserting _id you want for the current collection
> db.coll.find() // check that _id is changed
{ "_id" : 11, "item" : { "product" : { "id" : 11 } }, "source" : "Good Store" }
Do not use a $match filter together with $out as in @Florent Arlandis's answer: $out replaces the entire collection with the aggregate result, so you will effectively lose all data that doesn't match the $match filter.

Finding which elements in an array do not exist for a field value in MongoDB

I have a collection users as follows:
{ "_id" : ObjectId("51780f796ec4051a536015cf"), "userId" : "John" }
{ "_id" : ObjectId("51780f796ec4051a536015d0"), "userId" : "Sam" }
{ "_id" : ObjectId("51780f796ec4051a536015d1"), "userId" : "John1" }
{ "_id" : ObjectId("51780f796ec4051a536015d2"), "userId" : "john2" }
Now I am trying to write code that suggests alternative userIds when the ID provided by a user already exists in the DB. In the same routine I just append the values 1 to 5 to the chosen ID. For example, if the user has selected the userId John, the array of suggested user names that needs to be checked against the database looks like this:
[John,John1,John2,John3,John4,John5].
Now I just want to run this against the DB and find out which of the suggested values do not exist there. So instead of selecting any documents, I want to select the values within the suggested array that do not exist in the users collection.
Any pointers are highly appreciated.
Your general approach here is you want to find the "distinct" values for the "userId's" that already exist in your collection. You then compare these to your input selection to see the difference.
var test = ["John","John1","John2","John3","John4","John5"];
var res = db.collection.aggregate([
{ "$match": { "userId": { "$in": test } }},
{ "$group": { "_id": "$userId" }}
]).toArray();
// keep just the userId values that already exist
var existing = res.map(function(x){ return x._id });
// suggested values that are not taken yet
test.filter(function(x){ return existing.indexOf(x) == -1 })
The end result of this is the userId's that do not match in your initial input:
[ "John2", "John3", "John4", "John5" ]
The main operator there is $in, which takes an array as its argument and compares those values against the specified field. The .aggregate() method is the best approach. There is also the .distinct() method, which directly produces just the values as an array, removing the call to a function like .map() to strip out the values for a given key:
var res = db.collection.distinct("userId",{ "userId": { "$in": test } })
It should run notably slower though, especially on large collections.
Also do not forget to index your "userId". You likely want this index to be "unique" anyway, in which case you really just need the $match or .find() result:
var res = db.collection.find({ "userId": { "$in": test } }).toArray()
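The comparison step on its own can be sketched in plain JavaScript, independent of which query produced the existing values. The helper names buildSuggestions and unusedIds are assumptions for illustration, and the existing array stands in for the userId values a query like the ones above would return for the sample data in the question. A Set is used for the lookup so the filter does not rescan the array for every candidate.

```javascript
// Hypothetical helper: the base name followed by base1..baseN.
function buildSuggestions(base, count) {
  const out = [base];
  for (let i = 1; i <= count; i++) out.push(base + i);
  return out;
}

// Hypothetical helper: candidates that are not among the existing IDs.
function unusedIds(candidates, existing) {
  const taken = new Set(existing);
  return candidates.filter((id) => !taken.has(id));
}

const candidates = buildSuggestions("John", 5);
// Stand-in for the query result; "john2" in the sample data is a different
// value because the comparison is case-sensitive.
const existing = ["John", "John1"];
const free = unusedIds(candidates, existing);
console.log(free);
```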
