$lookup : computed foreignField workaround? - mongodb

In an existing MongoDB database, the link between two collections is made as follows:
collA : field collB_id
collB : field _id = ObjectId("a string value")
where collB_id is _id.valueOf()
i.e. the value of collB_id in collA is "a string value"
but in a $lookup, this:
localField: "collB_id",
foreignField: _id.valueOf(),
doesn't work, so what can I do?
MongoDB v3.6

If I understood you correctly, you have two collections, where documents from the first collection (collA) reference documents from the second collection (collB). The problem is that you store the reference as the string value of that ObjectId, so you can't use $lookup to join those docs.
collA:
{
"_id" : ObjectId(...),
"collB_id" : "123456...",
...
}
collB:
{
"_id" : ObjectId("123456..."),
...
}
If you are using MongoDB 4.0+, you can do it with the following aggregation:
db.getCollection('collA').aggregate([
{
$addFields: {
converted_collB_id: { $toObjectId: "$collB_id" }
}
},
{
$lookup: {
from: 'collB',
localField: 'converted_collB_id',
foreignField: '_id',
as: 'joined'
}
}
]);
MongoDB 4.0 introduces the new aggregation pipeline operators $toObjectId and $toString.
They let you add a new field holding an ObjectId created from the string value stored in collB_id, and then use that new field as localField in the $lookup.
I would strongly advise you not to store ObjectIds as strings.
You have already experienced the problem with $lookup.
Also, an ObjectId is 12 bytes, while its hex string representation is 24 bytes (twice the size). You will probably want to index that field as well, so you want it to be as small as possible.
An ObjectId also contains a timestamp, which you can get by calling getTimestamp().
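The size and timestamp points can be checked without a database. Below is a minimal sketch in plain JavaScript (Node), assuming only the ObjectId layout, where the first 4 bytes are a big-endian count of seconds since the Unix epoch. The helper name objectIdHexToDate and the sample hex string are made up for illustration; in the mongo shell you would simply call getTimestamp() on the ObjectId.

```javascript
// Hypothetical helper: reads the creation time embedded in the
// 24-character hex form of an ObjectId. The first 8 hex digits
// (4 bytes) are seconds since the Unix epoch.
function objectIdHexToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Sample value, used only for illustration.
const sampleHex = "5aff382669153dc20edf9456";
const created = objectIdHexToDate(sampleHex);
console.log(created.toISOString());

// Stored as a string, the reference is 24 bytes of character data;
// the same value as raw ObjectId bytes is 12 bytes.
const asStringBytes = Buffer.from(sampleHex, "utf8").length;
const asBinaryBytes = Buffer.from(sampleHex, "hex").length;
console.log(asStringBytes, asBinaryBytes);
```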
Make your life easier by using native types when possible!
Hope this helps!

Related

How concurrency/locking works on documents while passing mongoDB aggregation pipeline

Consider two collections, coll1 and coll2. I am applying some aggregation stages to coll1:
db.coll1.aggregate([
{ $match: { ... } },
{ $lookup:{
from: "coll2",
localField: "_id",
foreignField: "_id",
as: "coll2"
}
},
// followed by other stages
// last stage being $merge
{ $merge : { into: "coll3", on: "_id"} }
])
So, my questions are:
While the aggregation is in progress, can the underlying collection (coll1 in this case) be modified/updated? Either way, please help me understand how it works (I went through the MongoDB docs but could not understand it).
How does it write the final coll3? That is, does it write everything in one shot, or one document at a time as each finishes the pipeline?
With spring-data-mongodb, I can successfully call mongoOperation.aggregate() for the above aggregation pipeline, but it returns an aggregationResult object with zero mappedResults (when I check the database, coll3 is being created).
Does $merge not return any such details?
I am using MongoDB 4.2.

MongoDB $lookup to replace only the ID in an array of objects

I have the following example JSON object saved in my MongoDB collection named "Profile":
{
name: "Test",
relations: [
{
personid: <MongoDB-ID>,
type: "Friend",
duration: 5
},
{
personid: <MongoDB-ID>,
type: "Family",
duration: 9
},
]
}
I've used the Mongoose aggregate function because I need to add artificial fields based on calculations over the saved documents. At the end of my aggregation I use $lookup to replace the "personid" property in the objects inside the "relations" array.
{
$lookup:
{
from: PersonModel.collection.name,
localField: 'relations.personid',
foreignField: '_id',
as: "relations.personid"
}
}
I want each "personid" in the array of objects to be replaced with the populated person document it refers to.
This does not work as expected.
I also tried calling ".populate()" on the result returned by the aggregate function, which did not work either.
Setting localField to relations.personid is not supported. localField either needs to point to an array where each member is used for the join or be a plain value. The usual way of getting around this is to $unwind first, perform the lookup, and then $group back if needed.
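The $unwind, $lookup, $group sequence described above could be sketched as follows. This is only a sketch under assumptions: the helper name buildRelationsPipeline is hypothetical, the field names come from the example document in the question, and the $group stage rebuilds only the name and relations fields, so any other fields would have to be carried through the same way.

```javascript
// Hypothetical helper; personCollection is the name of the collection that
// holds the person documents (PersonModel.collection.name in the question).
function buildRelationsPipeline(personCollection) {
  return [
    // One output document per element of the relations array.
    { $unwind: "$relations" },
    // relations is now a single object, so the join key is a plain value.
    {
      $lookup: {
        from: personCollection,
        localField: "relations.personid",
        foreignField: "_id",
        as: "relations.personid"
      }
    },
    // $lookup always produces an array; unwrap the matched person.
    // Note: relations with no matching person are dropped by this stage.
    { $unwind: "$relations.personid" },
    // Reassemble one document per profile.
    {
      $group: {
        _id: "$_id",
        name: { $first: "$name" },
        relations: { $push: "$relations" }
      }
    }
  ];
}

// "people" is an assumed collection name, used here only for illustration.
const relationsPipeline = buildRelationsPipeline("people");
console.log(JSON.stringify(relationsPipeline, null, 2));
```

The resulting array would be appended to the stages already in the aggregation, for example by passing PersonModel.collection.name to the helper and spreading the result after the existing stages.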

Native MongoDB query joining 2 tables using ObjectID as foreign field

I have a Mongo database that I didn't create, but need to extract some additional data from. I have access to the database, and can look at and query the tables, but I have been unable to join two tables to provide all the necessary information in my query output.
I have 2 tables that I am interested in. First, setting, which has documents that look like this (truncated):
{
"_id" : ObjectId("5aff382669153dc20edf945a"),
"key" : "mgmt",
"site_id" : "5aff382669153dc20edf9456",
"advanced_feature_enabled" : false,
"auto_upgrade" : true
}
and site, which looks like this (truncated):
{
"_id" : ObjectId("5aff382669153dc20edf9456"),
"desc" : "Joe's Crab Shack",
}
site_id in the setting table is a foreign key that refers to the hex string of _id in the site table.
I would like the output to resemble this, but I have been unsuccessful:
{
"_id" : ObjectId("5aff382669153dc20edf9456"),
"desc" : "Joe's Crab Shack",
"setting" : [
{
"_id" : ObjectId("5aff382669153dc20edf945a"),
"key" : "mgmt",
"site_id" : "5aff382669153dc20edf9456",
"advanced_feature_enabled" : false,
"auto_upgrade" : true
}
]
}
I would like to be able to do only a database query, currently using Robo 3T, and not have to resort to scripting or programming. The closest that I have been able to get to the desired outcome is below. This query returns the site documents, with all the setting documents that don't have a site_id foreign key.
db.site.aggregate([
{
$lookup:
{
from: "setting",
localField:"_id.str",
foreignField:"site_id",
as: "setting"
}
}
])
I'm sure that I am missing something simple, but I am very new to MongoDB, and am a little lost due to the terminology differences between SQL and Mongo. For any that are wondering, the database is actually the back end for a large, multisite Ubiquiti controller, and I am looking to create "reports" that provide more insight to devices that require upgrades, what settings don't meet our "default" configuration, etc.
If site_id in the setting collection is of type ObjectId, then you can try this:
db.site.aggregate([
{
$lookup: {
from: "setting",
localField: "_id",
foreignField: "site_id",
as: "setting"
}
}
])
You can test it here: Mongo Playground
If site_id is of type string, then that lookup cannot work, because _id and site_id must also match in type (both ObjectIds). You can change the setting schema so that site_id is of type ObjectId and refers to the site schema.
Instead of this:
site_id: String
use this:
site_id: { type: ObjectId, ref: 'site' }
Update
If site_id is a string and you are not able to change the setting schema, there is a workaround.
In the aggregation pipeline we can add a new property to the site document that stores the document's _id as a string rather than an ObjectId.
Something like this:
db.site.aggregate([
{
$addFields: {
siteIdString: { // this is the stringified version of the _id
$toString: "$_id"
}
}
},
{
$lookup: {
from: "setting",
localField: "siteIdString", // then use this string id in lookup stage
foreignField: "site_id",
as: "setting"
}
}
])
You can test it here: Mongo Playground2
Hope it helps.

Why MongoDb sort is slow with lookup collections

I have two collections in my mongodb database as follows:
employee_details, with approximately 330000 documents, which has department_id as a reference to the departments collection
departments, with 2 fields: _id and dept_name
I want to join these two collections with $lookup, using department_id as the foreign key. The join works fine, but the query takes a long time to execute when I add a sort.
Note: execution is fast if I remove the $sort stage or if I remove the $lookup stage.
I have read several posts on different blogs and on SO, but none of them gives a solution that includes the sort.
My query is given below:
db.getCollection("employee_details").aggregate([
{
$lookup: {
from: "departments",
localField: "department_id",
foreignField: "_id",
as: "Department"
}
},
{ $unwind: { path: "$Department", preserveNullAndEmptyArrays: true } },
{ $sort: { employee_fname: -1 } },
{ $limit: 10 }
]);
Can someone suggest a way to make the above query run without the delay? My client cannot accept this performance. I hope there is some way to fix the performance issue, as NoSQL is intended to handle large databases.
Is there an indexing method available that I can use with my existing collection structure?
Thanks in advance.
Currently the lookup is performed for every employee_details document, which means 330000 times. If we sort and limit before the lookup, it is performed only 10 times, which greatly reduces the query time.
db.getCollection('employee_details').aggregate([
{$sort : {employee_fname: -1}},
{$limit :10},
{
$lookup : {
from : "departments",
localField : "department_id",
foreignField : "_id",
as : "Department"
}
},
{ $unwind : { path: "$Department", preserveNullAndEmptyArrays: true }},
])
After trying this, if you want to reduce the response time even further, you can define an index on the sort field.
db.employee_details.createIndex( { employee_fname: -1 } )

Is it possible to compare string with ObjectId via $lookup

table1 has a string field "value" and table2 has a field "value" of type ObjectId. Is it possible to run a query like this, and if so, how should it be written?
table1.aggregate([
{
$lookup: {
from: "table2",
localField: "value",
foreignField: "_id",
as: "test"
}
}
])
As far as I know, to join collections with the $lookup operator in MongoDB, the data types must be the same. If the types do not match, $lookup will not work. So you should join on fields of the same type, because it checks for equality.
The $lookup stage does an equality match between a field from the
input documents with a field from the documents of the “joined”
collection
If localField is an object, then foreignField should be an object
If localField is a string, then foreignField should be a string
If localField is a number, then foreignField should be a number
$lookup Documentation
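Given that rule, one way to make the types agree is to convert the string before the join. This is a minimal sketch assuming MongoDB 4.0+, where the $convert operator is available. The field name valueAsObjectId is made up for illustration, and the collection and field names are taken from the question.

```javascript
// Sketch of a pipeline for table1: convert the string field "value" to an
// ObjectId, then join on the converted field so both sides have the same type.
const stringToObjectIdLookup = [
  {
    $addFields: {
      // valueAsObjectId is a hypothetical field name.
      // onError/onNull make malformed or missing strings yield null
      // instead of failing the whole aggregation.
      valueAsObjectId: {
        $convert: { input: "$value", to: "objectId", onError: null, onNull: null }
      }
    }
  },
  {
    $lookup: {
      from: "table2",
      localField: "valueAsObjectId",
      foreignField: "_id",
      as: "test"
    }
  }
];

// In the mongo shell this would be run as:
//   db.table1.aggregate(stringToObjectIdLookup)
console.log(JSON.stringify(stringToObjectIdLookup, null, 2));
```

If every value is known to be a valid 24-character hex string, the shorter { $toObjectId: "$value" } form should behave the same way. $convert is used here only so that bad input does not abort the query.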