MongoDB aggregate query with $lookup

I have two collections, tags and persons.
tags model:
{
en: String,
sv: String
}
person model:
{
name: String,
projects: [
{
title: String,
tags: [
{
type: Schema.ObjectId,
ref: 'tag'
}
]
}
]
}
I want a query that returns all tags that are in use in the person model, across all documents.
Something like:
var query = mongoose.model('tag').find({...});
Or should I somehow use the aggregate approach to this?

For any particular person document, you can use the populate() function like
var query = mongoose.model("person").find({ "name": "foo" }).populate("projects.tags");
And if you want to search for any persons that have any tag with 'MongoDB' or 'Node JS' for example, you can include the query option in the populate() function overload as:
var query = mongoose.model("person").find({ "name": "foo" }).populate({
"path": "projects.tags",
"match": { "en": { "$in": ["MongoDB", "Node JS"] } }
});
If you want all tags existing in "projects.tags" for all persons, then the aggregation framework is the way to go. Consider running this pipeline on the person collection; it uses the $lookup operator to do a left join on the tags collection:
mongoose.model('person').aggregate([
{ "$unwind": "$projects" },
{ "$unwind": "$projects.tags" },
{
"$lookup": {
"from": "tags",
"localField": "projects.tags",
"foreignField": "_id",
"as": "resultingTagsArray"
}
},
{ "$unwind": "$resultingTagsArray" },
{
"$group": {
"_id": null,
"allTags": { "$addToSet": "$resultingTagsArray" },
"count": { "$sum": 1 }
}
}
]).exec(function(err, results){
console.log(results);
})
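To make the stages concrete, here is a minimal sketch in plain JavaScript that mimics what the pipeline above computes on a tiny in-memory data set. The sample documents and the helper name `collectUsedTags` are made up for illustration; the real work would be done by MongoDB, not by this code.

```javascript
// Hypothetical sample data; ids are plain strings here instead of ObjectIds.
const tags = [
  { _id: "t1", en: "MongoDB", sv: "MongoDB" },
  { _id: "t2", en: "Node JS", sv: "Node JS" },
  { _id: "t3", en: "Unused", sv: "Unused" },
];
const persons = [
  { name: "foo", projects: [{ title: "a", tags: ["t1", "t2"] }] },
  { name: "bar", projects: [{ title: "b", tags: ["t1"] }, { title: "c", tags: [] }] },
];

// Emulates: $unwind projects -> $unwind projects.tags -> $lookup -> $unwind -> $group.
function collectUsedTags(persons, tags) {
  const byId = new Map(tags.map((t) => [t._id, t]));
  const allTags = new Map(); // plays the role of $addToSet
  let count = 0;             // plays the role of { $sum: 1 }
  for (const person of persons) {
    for (const project of person.projects) { // first $unwind
      for (const tagId of project.tags) {    // second $unwind
        const tag = byId.get(tagId);         // $lookup on _id
        if (!tag) continue;                  // an empty lookup result is dropped by $unwind
        allTags.set(tag._id, tag);
        count += 1;
      }
    }
  }
  return { _id: null, allTags: [...allTags.values()], count };
}

const result = collectUsedTags(persons, tags);
console.log(result.allTags.map((t) => t.en), result.count);
```

Note that `count` tallies tag references that reached the $group stage, not distinct tags; the length of `allTags` gives the number of distinct tags in use.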
For a particular person, apply a $match stage as the first step to filter the documents:
mongoose.model('person').aggregate([
{ "$match": { "name": "foo" } },
{ "$unwind": "$projects" },
{ "$unwind": "$projects.tags" },
{
"$lookup": {
"from": "tags",
"localField": "projects.tags",
"foreignField": "_id",
"as": "resultingTagsArray"
}
},
{ "$unwind": "$resultingTagsArray" },
{
"$group": {
"_id": null,
"allTags": { "$addToSet": "$resultingTagsArray" },
"count": { "$sum": 1 }
}
}
]).exec(function(err, results){
console.log(results);
})
Another workaround, if you are using MongoDB versions >= 2.6 and <= 3.0, which do not support the $lookup operator, is to populate the results from the aggregation:
mongoose.model('person').aggregate([
{ "$unwind": "$projects" },
{ "$unwind": "$projects.tags" },
{
"$group": {
"_id": null,
"allTags": { "$addToSet": "$projects.tags" }
}
}
], function(err, result) {
mongoose.model('person')
.populate(result, { "path": "allTags" }, function(err, results) {
if (err) throw err;
console.log(JSON.stringify(results, undefined, 4 ));
});
});

If you are using MongoDB version 3.2 or later, you can use $lookup, which performs a left outer join.
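As a rough illustration of what "left outer join" means here, the following plain JavaScript sketch mimics a basic $lookup on in-memory arrays. The function `lookup` and the sample data are hypothetical, and the sketch only models the simple equality form of the stage.

```javascript
// Minimal emulation of { $lookup: { from, localField, foreignField, as } }
// for scalar or array local fields. Illustrative only; not a MongoDB API.
function lookup(docs, from, localField, foreignField, as) {
  return docs.map((doc) => {
    const local = doc[localField];
    const keys = Array.isArray(local) ? local : [local];
    const matches = from.filter((f) => keys.includes(f[foreignField]));
    return { ...doc, [as]: matches };
  });
}

const tags = [{ _id: 1, en: "MongoDB" }];
const projects = [
  { title: "with tag", tag: 1 },
  { title: "dangling reference", tag: 99 },
];

const joined = lookup(projects, tags, "tag", "_id", "tagDocs");
console.log(JSON.stringify(joined, null, 2));
```

Every document from the left side is kept. A document with no match simply gets an empty array in the `as` field, which is why a following $unwind without `preserveNullAndEmptyArrays` would remove it.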

Related

Get Data from another collection (string -> ObjectId)

Let's say I have these two collections:
// Members:
{
"_id":{
"$oid":"60dca71f0394f430c8ca296d"
},
"church":"60dbb265a75a610d90b45c6b",
"name":"Julio Verne Cerqueira"
},
{
"_id":{
"$oid":"60dca71f0394f430c8ca29a8"
},
"nome":"Ryan Steel Oliveira",
"church":"60dbb265a75a610d90b45c6c"
}
And
// Churches
{
"_id": {
"$oid": "60dbb265a75a610d90b45c6c"
},
"name": "Saint Antoine Hill",
"active": true
},
{
"_id": {
"$oid": "60dbb265a75a610d90b45c6b"
},
"name": "Jackeline Hill",
"active": true
}
And I want to query it and have a result like this:
// Member with Church names
{
"_id":{
"$oid":"60dca71f0394f430c8ca296d"
},
"church":"Jackeline Hill",
"name":"Julio Verne Cerqueira"
},
{
"_id":{
"$oid":"60dca71f0394f430c8ca29a8"
},
"church":"Saint Antoine Hill",
"nome":"Ryan Steel Oliveira"
}
If I try a $lookup, the result contains the entire churches collection for each member.
How would I do the query, so it gives me only the one church that member is related to?
And, if possible, how to Sort the result in alphabetical order by church then by name?
Obs.: MongoDB Version: 4.4.10
There is a matching error in the $match stage inside the $lookup pipeline.
It should be:
$match: {
$expr: {
$eq: [
"$_id",
"$$searchId"
]
}
}
From the provided documents, the members-to-churchies relationship is many-to-one: each member references exactly one church. Hence, when you join members with churchies via $lookup, the output church field will be an array containing only one churchies document.
Aggregation pipelines:
$lookup - Join members collection (by $$searchId) with churchies (by _id).
$unwind - Deconstruct church array field to multiple documents.
$project - Decorate output document.
$sort - Sort by church and name ascending.
db.members.aggregate([
{
"$lookup": {
"from": "churchies",
"let": {
searchId: {
"$toObjectId": "$church"
}
},
"pipeline": [
{
$match: {
$expr: {
$eq: [
"$_id",
"$$searchId"
]
}
}
},
{
$project: {
name: 1
}
}
],
"as": "church"
}
},
{
"$unwind": "$church"
},
{
$project: {
_id: 1,
church: "$church.name",
name: 1
}
},
{
"$sort": {
"church": 1,
"name": 1
}
}
])
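For readers who want to see the intended result without a database, here is a small plain JavaScript sketch of the same four steps. It is only an emulation under some assumptions: the `_id` values are kept as hex strings (so the $toObjectId conversion becomes a plain string comparison), both members are given a `name` field, and the variable names are invented for the example.

```javascript
// Sample data adapted from the question; ids are plain hex strings here.
const churches = [
  { _id: "60dbb265a75a610d90b45c6c", name: "Saint Antoine Hill", active: true },
  { _id: "60dbb265a75a610d90b45c6b", name: "Jackeline Hill", active: true },
];
const members = [
  { _id: "60dca71f0394f430c8ca29a8", church: "60dbb265a75a610d90b45c6c", name: "Ryan Steel Oliveira" },
  { _id: "60dca71f0394f430c8ca296d", church: "60dbb265a75a610d90b45c6b", name: "Julio Verne Cerqueira" },
];

const churchById = new Map(churches.map((c) => [c._id, c]));

const membersWithChurch = members
  // $lookup + $unwind: keep only members whose church exists
  .filter((m) => churchById.has(m.church))
  // $project: replace the reference with the church name
  .map((m) => ({ _id: m._id, church: churchById.get(m.church).name, name: m.name }))
  // $sort: by church, then by name, both ascending
  .sort((a, b) => a.church.localeCompare(b.church) || a.name.localeCompare(b.name));

console.log(membersWithChurch);
```

`localeCompare` is used for brevity. MongoDB compares strings with a simple binary comparison unless a collation is specified, so the ordering can differ for mixed-case or accented names.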

MongoDB: Populate reference in $group when aggregating?

I have a collection that I need to group by year. My aggregation pipeline is as such:
const WorkHistoryByYear = await this.workHistroyModel.aggregate([
{
$group: {
_id: '$year',
history: {
$push: '$$ROOT'
}
}
}
]);
Which works as expected, returning:
[{
"_id": 2003,
"history": [
{
"_id": "600331b3d84ac418877a0e5a",
"tasksPerformed": [
"5fffb180a477c4f78ad67331",
"5fffb18aa477c4f78ad67332"
],
"year": 2003
},
{
"_id": "600331dcd84ac418877a0e5c",
"tasksPerformed": [
"5fffb180a477c4f78ad67331"
],
"year": 2003
}
]
}]
but I'd like to populate a field if possible.
The WorkHistory schema has a field, tasksPerformed which is an array of ObjectId references. Here is the Task schema:
export const TaskSchema = new Schema({
active: {
type: Schema.Types.Boolean,
default: true,
},
title: {
type: Schema.Types.String,
required: true,
},
order: {
type: Schema.Types.Number,
index: true,
}
});
Is it possible to populate the referenced models within the aggregation? $lookup seems to be what I need, but I have yet to get that to work when following the documentation.
I don't do a lot of database work, so I'm having some difficulty finding the right operator(s) to use, and I've seen similar questions, but not a definitive answer.
Edit:
After adding the code from @varman's answer, my return is now:
{
"_id": 2003,
"history": {
"_id": "600331b3d84ac418877a0e5a",
"tasksPerformed": [
"5fffb180a477c4f78ad67331",
"5fffb18aa477c4f78ad67332"
],
"year": 2003,
"history": {
"tasksPerformed": []
}
}
}
I converted the ObjectId references to strings in an effort to help the matching, but I'm still coming up short.
You can use $lookup to join both collections:
$unwind to deconstruct the array (array to objects)
$lookup to join the collections
$group to reconstruct the deconstructed array
The script for the above result is:
db.workHistory.aggregate([
{
"$unwind": "$history"
},
{
"$lookup": {
"from": "taskSchema",
"localField": "history.tasksPerformed",
"foreignField": "_id",
"as": "history.tasksPerformed"
}
},
{
"$group": {
"_id": "$_id",
"history": {
"$push": "$history"
}
}
}
])
The pipeline above assumes the collections look like this:
db={
"workHistory": [
{
"_id": 2003,
"history": [
{
"_id": "600331b3d84ac418877a0e5a",
"tasksPerformed": [
"5fffb180a477c4f78ad67331",
"5fffb18aa477c4f78ad67332"
],
"year": 2003
},
{
"_id": "600331dcd84ac418877a0e5c",
"tasksPerformed": [
"5fffb180a477c4f78ad67331"
],
"year": 2003
}
]
}
],
"taskSchema": [
{
"_id": "5fffb180a477c4f78ad67331",
"active": true,
"title": "first"
},
{
"_id": "5fffb18aa477c4f78ad67332",
"active": true,
"title": "second"
}
]
}
Since $unwind is expensive, we could instead do the $lookup before the $group, directly on the original documents:
db.workHistory.aggregate([
{
"$lookup": {
"from": "taskSchema",
"localField": "tasksPerformed",
"foreignField": "_id",
"as": "tasksPerformed"
}
},
{
$group: {
_id: "$year",
history: {
$push: "$$ROOT"
}
}
}
])
You could also do it without an $unwind:
db.WorkHistory.aggregate(
[
{
$lookup: {
from: "Tasks",
localField: "tasksPerformed",
foreignField: "_id",
as: "tasksPerformed"
}
},
{
$group: {
_id: "$year",
history: { $push: "$$ROOT" }
}
}
])
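One thing that may explain the empty tasksPerformed array in the question's edit: $lookup compares values by type as well as by value, so a string never matches an ObjectId. The plain JavaScript sketch below imitates that behaviour with a stand-in `FakeObjectId` class, which is hypothetical and used only so the example runs without a database.

```javascript
// Stand-in for an ObjectId; illustrative only.
class FakeObjectId {
  constructor(hex) {
    this.hex = hex;
  }
}

// Type-sensitive equality, loosely modelled on how $lookup compares values.
function sameValue(a, b) {
  if (a instanceof FakeObjectId && b instanceof FakeObjectId) return a.hex === b.hex;
  if (a instanceof FakeObjectId || b instanceof FakeObjectId) return false;
  return a === b;
}

// Emulates a $lookup whose localField is an array of references.
function lookupTasks(history, tasks) {
  return history.map((h) => ({
    ...h,
    tasksPerformed: tasks.filter((t) => h.tasksPerformed.some((ref) => sameValue(ref, t._id))),
  }));
}

const tasks = [{ _id: new FakeObjectId("5fffb180a477c4f78ad67331"), title: "first" }];

// References stored as strings while _id is an id object: nothing matches.
const asStrings = lookupTasks(
  [{ year: 2003, tasksPerformed: ["5fffb180a477c4f78ad67331"] }],
  tasks
);
// References stored with the same type as _id: the task is found.
const asIds = lookupTasks(
  [{ year: 2003, tasksPerformed: [new FakeObjectId("5fffb180a477c4f78ad67331")] }],
  tasks
);

console.log(asStrings[0].tasksPerformed.length, asIds[0].tasksPerformed.length);
```

If the references and the `_id` values really do have different types, converting one side to the other's type before the $lookup (for example with $toObjectId, available from MongoDB 4.0) is one option.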

mongodb aggregation pipeline not returning proper result and slow

I have three collections: users, products and orders. The orderType field of orders has two possible values, "Cash" and "Online". One user can have one or more products, and a product can have zero, one or many orders. I want to run a text search on the name field of the users collection and return all matching users, highest text score first. It is possible that the user with the top score has no products or orders.
I have written a query, but it does not return users who have the highest text score but no products or orders; it only returns users who have records in all three collections. The query is also slow when a user has a lot of products, for example more than 3000. Any help is appreciated.
db.users.aggregate(
[
{
"$match": {
"$text": {
"$search": "john"
}
}
},
{
"$addFields": {
"score": {
"$meta": "textScore"
}
}
},
{
"$sort": {
"score": {
"$meta": "textScore"
}
}
},
{
"$skip": 0
},
{
"$limit": 6
},
{
"$lookup": {
"from": "products",
"localField": "userId",
"foreignField": "userId",
"as": "products"
}
},
{ $unwind: '$products' },
{
"$lookup": {
"from": "orders",
"let": {
"products": "$products"
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{
"$in": [
"$productId",
["$$products.productId"]
]
},
{
"$eq": [
"$orderType",
"Cash"
]
}
]
}
}
}
],
"as": "orders"
}
},
{ $unwind: '$orders' },
{
$group: {
_id: "$_id",
name: { $first: "$name" },
userId: { $first: "$userId" },
products: { $addToSet: "$products" },
orders: { $addToSet: "$orders" },
score: { $first: "$score" },
}
},
{ $sort: { "score": -1 } }
]
);
Issue:
Every $lookup produces an array that holds the matched documents. When no documents are found, the array is empty. By default, $unwind emits nothing for a document whose array is empty, so that document is dropped from the pipeline. That is why users with no products or orders are missing from the result. We need to preserve such documents so that the pipeline can continue with them.
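The effect can be reproduced outside the database with a few lines of plain JavaScript. The `unwind` helper below is a hypothetical, simplified emulation of the stage for a single top-level field, written only to show the difference the `preserveNullAndEmptyArrays` option makes.

```javascript
// Minimal emulation of $unwind on one top-level array field. Illustrative only.
function unwind(docs, field, { preserveNullAndEmptyArrays = false } = {}) {
  return docs.flatMap((doc) => {
    const value = doc[field];
    const isEmpty = value == null || (Array.isArray(value) && value.length === 0);
    if (isEmpty) {
      if (!preserveNullAndEmptyArrays) return []; // the document is dropped
      const { [field]: _omitted, ...rest } = doc; // kept, without the empty array
      return [value === null ? { ...rest, [field]: null } : rest];
    }
    const items = Array.isArray(value) ? value : [value];
    return items.map((item) => ({ ...doc, [field]: item }));
  });
}

// Hypothetical users after the products $lookup.
const users = [
  { name: "john smith", products: [{ productId: "p1" }] },
  { name: "john doe", products: [] },
];

const dropped = unwind(users, "products");
const preserved = unwind(users, "products", { preserveNullAndEmptyArrays: true });

console.log(dropped.map((u) => u.name));
console.log(preserved.map((u) => u.name));
```

With the default behaviour only the user who has products survives; with the option set, the user without products stays in the stream and simply has no `products` field.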
Improvements:
In the orders lookup, $eq can be used instead of $in, because the products array has already been unwound and each document now contains only a single productId.
Create an index on userId in the products collection to make the query more efficient.
Following is the updated query:
db.users.aggregate([
{
"$match": {
"$text": {
"$search": "john"
}
}
},
{
"$addFields": {
"score": {
"$meta": "textScore"
}
}
},
{
"$skip": 0
},
{
"$limit": 6
},
{
"$lookup": {
"from": "products",
"localField": "userId",
"foreignField": "userId",
"as": "products"
}
},
{
$unwind: {
"path":"$products",
"preserveNullAndEmptyArrays":true
}
},
{
"$lookup": {
"from": "orders",
"let": {
"products": "$products"
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{
"$eq": [
"$productId",
"$$products.productId"
]
},
{
"$eq": [
"$orderType",
"Cash"
]
}
]
}
}
}
],
"as": "orders"
}
},
{
$unwind: {
"path":"$orders",
"preserveNullAndEmptyArrays":true
}
},
{
$group: {
_id: "$_id",
name: {
$first: "$name"
},
userId: {
$first: "$userId"
},
products: {
$addToSet: "$products"
},
orders: {
$addToSet: "$orders"
},
score: {
$first: "$score"
}
}
},
{
$sort: {
"score": -1
}
}
]);
To get more information on unwind, please check https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/

Merge the original array of objects into the "as" field after a $lookup

I have a hero collection where each hero document looks like the following:
{
_id:'the-name-of-the-hero',
name: 'Name of Hero',
(...), //other properties to this hero
relations: [
{
hero: 'the-id-of-another-hero',
relationType: 'trust'
},
{
hero: 'yet-another-id-of-another-hero',
relationType: 'hate'
}
]
}
The relations.hero field points to the _id of another hero. I needed to grab some more information about the related heroes, so I used an aggregate $lookup to match each one against the "hero" collection and grab its name (and other data, but the projection is simplified for the question). Here is the currently working query, documented:
let aggregate = db.collection('hero').aggregate([
// grabbing an specific hero
{ $match: { _id } },
//populate relations
{
$lookup: {
from: 'hero',
let: { letId: '$relations.hero' }, //create a local variable for the pipeline to use
// localField: "relations.hero", //this would bring entire hero data, which is unnecessary
// foreignField: "_id", //this would bring entire hero data, which is unnecessary
pipeline: [
//match each $relations.hero (as "$$letId") in collection hero's (as "from") $_id
{ $match: { $expr: { $in: ['$_id', '$$letId'] } } },
//grab only the _id and name of the matched heroes
{ $project: { name: 1, _id: 1 } },
//sort by name
{ $sort:{ name: 1 } }
],
//replace the current relations with the new relations
as: 'relations',
},
}
]).toArray(someCallbackHere);
In short, $lookup on hero collection using a pipeline that match each of relations.hero and bring back only the _id and name (which has the real name to be printed on UI) and replace current relations with this new relations, generating the document as:
{
_id:'the-name-of-the-hero',
name: 'Name of Hero',
(...), //other properties to this hero
relations: [
{
_id: 'the-id-of-another-hero',
name: 'The Real Name of Another Hero',
},
{
_id: 'yet-another-id-of-another-hero',
name: 'Yet Another Real Name of Another Hero',
}
]
}
The question:
What can I add on the pipeline to make it merge the matched heroes with the original relations, in order to not only have the projected _id and name, but also the original relationType? That is, have the following result:
{
_id:'the-name-of-the-hero',
name: 'Name of Hero',
(...), //other properties to this hero
relations: [
{
_id: 'the-id-of-another-hero',
name: 'The Real Name of Another Hero',
relationType: 'trust' //<= kept from the original relations
},
{
_id: 'yet-another-id-of-another-hero',
name: 'Yet Another Real Name of Another Hero',
relationType: 'hate' //<= kept from the original relations
}
]
}
I tried setting as: 'relationsFull' and then using $push with $mergeObjects in a following aggregation stage, but had no luck. I tried to do the same as a pipeline step inside the $lookup (instead of a new aggregation stage), but always ended up with relations as an empty array.
How would I write a new aggregation step to merge old relations objects with the new looked-up relations?
Note: Consider MongoDB 3.6 or later (that is, $unwind array is not needed, at least for the $lookup). I'm querying using Node.js driver, if that info matters.
You can use the aggregation below:
db.collection("hero").aggregate([
{ "$match": { _id } },
{ "$unwind": "$relations" },
{ "$lookup": {
"from": "hero",
"let": { "letId": "$relations.hero" },
"pipeline": [
{ "$match": { "$expr": { "$eq": ["$_id", "$$letId"] } } },
{ "$project": { "name": 1 } }
],
"as": "relation"
}},
{ "$unwind": "$relation" },
{ "$addFields": { "relations.name": "$relation.name" }},
{ "$group": {
"_id": "$_id",
"relations": { "$push": "$relations" },
"name": { "$first": "$name" },
"rarity": { "$first": "$rarity" },
"classType": { "$first": "$classType" }
}}
])
Alternatively, you can use this:
db.collection("hero").aggregate([
{ "$match": { _id } },
{ "$lookup": {
"from": "hero",
"let": { "letId": "$relations.hero" },
"pipeline": [
{ "$match": { "$expr": { "$in": ["$_id", "$$letId"] } } },
{ "$project": { "name": 1 } }
],
"as": "lookupRelations"
}},
{ "$addFields": {
"relations": {
"$map": {
"input": "$relations",
"as": "rel",
"in": {
"$mergeObjects": [
"$$rel",
{ "name": { "$arrayElemAt": ["$lookupRelations.name", { "$indexOfArray": ["$lookupRelations._id", "$$rel.hero"] }] }}
]
}
}
}
}}
])
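The idea behind the $map variant is to walk the original relations and, for each one, find the looked-up hero whose _id equals that relation's hero value, then merge the two. Here is a plain JavaScript sketch of that merge, using made-up sample data, to show the shape of the result:

```javascript
// Hypothetical document shaped like the question's data.
const hero = {
  _id: "the-name-of-the-hero",
  relations: [
    { hero: "hero-b", relationType: "trust" },
    { hero: "hero-a", relationType: "hate" },
  ],
};

// What the $lookup might return; its order need not follow `relations`.
const lookupRelations = [
  { _id: "hero-a", name: "Hero A" },
  { _id: "hero-b", name: "Hero B" },
];

// $map over relations, $indexOfArray to find the match, $mergeObjects to combine.
const relations = hero.relations.map((rel) => {
  const idx = lookupRelations.findIndex((h) => h._id === rel.hero);
  const name = idx === -1 ? undefined : lookupRelations[idx].name;
  return { ...rel, name };
});

console.log(relations);
```

Because the match is done by id rather than by position, the merge does not depend on the order in which the $lookup returns the heroes.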
Well, I think we should use a different name for the as field. From there, we can use the following expression in the $addFields stage.
{
"$addFields": {
"relations": {
"$reduce": {
"input": {
"$reduce": {
"input": {
"$zip": {
"inputs": [
"$relations",
"$relheros"
]
}
},
"initialValue": [
],
"in": {
"$concatArrays": [
"$$value",
"$$this"
]
}
}
},
"initialValue": {
},
"in": {
"$mergeObjects": [
"$$value",
"$$this"
]
}
}
}
}
}
Note that the relheros here is the as field.
We really should not $unwind and $group here, because $unwind is cheap but $group is expensive.

$lookup for each element in array mongo aggregation

I have a users schema with a people field. Each entry in people has a viewers property that contains an array of ObjectIds referencing the users collection itself. For example:
{
"username": "user1",
"password": "password",
"email": "user1@gmail.com",
"people": [{
"id": 1,
"viewers": ["ObjectId....", "ObjectId...."]
},
{
"id": 2,
"viewers": ["ObjectId....", "ObjectId...."]
}
]
}
What I need to do is a $lookup for each element inside viewers. The problem is that the pipeline below pushes the same viewers into every document:
{
$unwind: {
path: "$people.viewers",
preserveNullAndEmptyArrays: true
}
},
{
$lookup: {
from: "users",
localField: "people.viewers",
foreignField: "_id",
as: "people"
}
},
{
$unwind: {
path: "$viewers",
preserveNullAndEmptyArrays: true
}
},
{
$project: {
username: 1,
"viewers.username": 1
}
},
{
$group: {
_id: "$_id",
username: { $first: "$username" },
people: { $first: "$people" },
viewers: { $push: "$viewers" }
}
},
{
$project: {
username: 1,
"people.id": 1,
"people.viewers": "$viewers"
}
}
What is wrong with that aggregation?
You can try the aggregation below in MongoDB 3.4 and above:
User.aggregate([
{ "$unwind": { "path": "$people", "preserveNullAndEmptyArrays": true }},
{ "$lookup": {
"from": "users",
"localField": "people.viewers",
"foreignField": "_id",
"as": "people.viewers"
}},
{ "$group": {
"_id": "$_id",
"username": { "$first": "$username" },
"email": { "$first": "$email" },
"people": { "$push": "$people" }
}}
])
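The key difference from the question's attempt is that only `people` is unwound, so each $lookup fills the viewers of one people entry, and the $group then pushes the entries back into a single array. A plain JavaScript sketch of that flow, with invented sample data and a hypothetical helper `populateViewers`, might look like this. It keeps only `_id` and `username` of each viewer for brevity (the real $lookup returns whole user documents) and does not try to reproduce edge cases such as users with an empty people array.

```javascript
// Hypothetical users; ids are plain strings instead of ObjectIds.
const users = [
  {
    _id: "u1",
    username: "user1",
    people: [
      { id: 1, viewers: ["u2"] },
      { id: 2, viewers: ["u2", "u3"] },
    ],
  },
  { _id: "u2", username: "user2", people: [] },
  { _id: "u3", username: "user3", people: [] },
];

function populateViewers(allUsers) {
  const byId = new Map(allUsers.map((u) => [u._id, u]));
  return allUsers.map((user) => ({
    _id: user._id,
    username: user.username,
    // One iteration per people entry corresponds to one unwound document;
    // collecting them again corresponds to $group with $push.
    people: user.people.map((entry) => ({
      ...entry,
      // $lookup with as: "people.viewers" replaces the ids with user documents.
      viewers: entry.viewers
        .filter((id) => byId.has(id))
        .map((id) => ({ _id: id, username: byId.get(id).username })),
    })),
  }));
}

const populated = populateViewers(users);
console.log(JSON.stringify(populated[0], null, 2));
```

Each people entry ends up with its own viewers, rather than every entry sharing one combined list.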