MongoDB: add a field from the base collection to the joined collection

I have two collections:
Games with schema:
_id: ObjectId('gameId'),
questions: [
{
position: 1,
question_id: ObjectId('baz')
},
{
position: 2,
question_id: ObjectId('ban')
},
]
Questions with schema:
_id: ObjectId('baz'),
text: 'FooBar'
Now I'd like to join questions to games and add the question's position to each joined question record.
So far I have this query:
db.games.aggregate([
{
$lookup: {
from: 'questions',
localField: 'questions.question_id',
foreignField: '_id',
as: 'question_data',
},
}])
This returns all the required info, correctly joined according to the questions array:
_id: ObjectId('gameId'),
questions: [
{
position: 1,
question_id: ObjectId('baz')
},
{
position: 2,
question_id: ObjectId('ban')
}
],
question_data: [
{
_id: ObjectId('baz'),
text: 'FooBar',
},
{
_id: ObjectId('ban'),
text: 'FooBar2',
}
]
but I can't figure out how to add to each joined question its position in the game,
so that the result looks like this:
_id: ObjectId('gameId'),
questions: [
{
position: 1,
question_id: ObjectId('baz')
},
{
position: 2,
question_id: ObjectId('ban')
}
],
question_data: [
{
_id: ObjectId('baz'),
text: 'FooBar',
position: 1,
},
{
_id: ObjectId('ban'),
text: 'FooBar2',
position: 2,
}
]
I've tried $unwind on the questions array in the games collection and played a little with $project in the aggregation, but still no result.
So my question is: how do I add a field from the base collection to the joined data from another collection?

You need to $unwind the questions array first, then apply $lookup, and finally use $group to roll the documents back up into arrays.
db.games.aggregate([
{ "$unwind": "$questions" },
{ "$lookup": {
"from": "questions",
"localField": "questions.question_id",
"foreignField": "_id",
"as": "question_data"
}},
{ "$unwind": "$question_data" },
{ "$addFields": {
"question_data.position": "$questions.position",
"question_data.question_id": "$questions.question_id"
}},
{ "$group": {
"_id": "$_id",
"questions": { "$push": "$questions" },
"question_data": { "$push": "$question_data" }
}}
])
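To make the flow of that pipeline easier to follow, here is a minimal in-memory sketch in plain JavaScript (no MongoDB driver involved). The arrays games and questions stand in for the two collections, plain strings stand in for ObjectIds, and joinWithPosition is a hypothetical helper name; it only mirrors what the $unwind / $lookup / $addFields / $group stages do for the sample data.

```javascript
// In-memory stand-ins for the collections; strings replace ObjectIds (assumption).
const games = [
  {
    _id: 'gameId',
    questions: [
      { position: 1, question_id: 'baz' },
      { position: 2, question_id: 'ban' },
    ],
  },
];
const questions = [
  { _id: 'baz', text: 'FooBar' },
  { _id: 'ban', text: 'FooBar2' },
];

// Hypothetical helper: emulates the stages of the pipeline above.
function joinWithPosition(gameDocs, questionDocs) {
  const byId = new Map(questionDocs.map((q) => [q._id, q]));
  return gameDocs
    .map((game) => {
      // $unwind + $lookup + second $unwind: one row per reference that has a match.
      const matched = game.questions.filter((ref) => byId.has(ref.question_id));
      return {
        _id: game._id,
        // $group / $push: rebuild the questions array from the surviving rows.
        questions: matched,
        // $addFields: copy position and question_id onto the joined document.
        question_data: matched.map((ref) => ({
          ...byId.get(ref.question_id),
          position: ref.position,
          question_id: ref.question_id,
        })),
      };
    })
    // A game whose references all fail to match produces no rows at all.
    .filter((game) => game.questions.length > 0);
}

const joinedGames = joinWithPosition(games, questions);
console.log(JSON.stringify(joinedGames, null, 2));
```

As in the pipeline, a reference without a matching question disappears from both arrays, because the second $unwind drops documents whose question_data is empty.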

Related

MongoDB: Populate reference in $group when aggregating?

I have a collection that I need to group by year. My aggregation pipeline is as such:
const WorkHistoryByYear = await this.workHistroyModel.aggregate([
{
$group: {
_id: '$year',
history: {
$push: '$$ROOT'
}
}
}
]);
Which works as expected, returning:
[{
"_id": 2003,
"history": [
{
"_id": "600331b3d84ac418877a0e5a",
"tasksPerformed": [
"5fffb180a477c4f78ad67331",
"5fffb18aa477c4f78ad67332"
],
"year": 2003
},
{
"_id": "600331dcd84ac418877a0e5c",
"tasksPerformed": [
"5fffb180a477c4f78ad67331"
],
"year": 2003
}
]
}]
but I'd like to populate a field if possible.
The WorkHistory schema has a field, tasksPerformed which is an array of ObjectId references. Here is the Task schema:
export const TaskSchema = new Schema({
active: {
type: Schema.Types.Boolean,
default: true,
},
title: {
type: Schema.Types.String,
required: true,
},
order: {
type: Schema.Types.Number,
index: true,
}
});
Is it possible to populate the referenced models within the aggregation? $lookup seems to be what I need, but I have yet to get that to work when following the documentation.
I don't do a lot of database work, so I'm having some difficulty finding the right operator(s) to use, and I've seen similar questions, but not a definitive answer.
Edit:
After adding the code from @varman's answer, my return is now:
{
"_id": 2003,
"history": {
"_id": "600331b3d84ac418877a0e5a",
"tasksPerformed": [
"5fffb180a477c4f78ad67331",
"5fffb18aa477c4f78ad67332"
],
"year": 2003,
"history": {
"tasksPerformed": []
}
}
}
I converted the ObjectId references to strings in an effort to help the matching, but I'm still coming up short.
You can use $lookup to join both collections:
$unwind to deconstruct the array (array to objects)
$lookup to join the collections
$group to reconstruct the deconstructed array
The script for the above result is:
db.workHistory.aggregate([
{
"$unwind": "$history"
},
{
"$lookup": {
"from": "taskSchema",
"localField": "history.tasksPerformed",
"foreignField": "_id",
"as": "history.tasksPerformed"
}
},
{
"$group": {
"_id": "$_id",
"history": {
"$push": "$history"
}
}
}
])
This assumes that your collections look like this before grouping:
db={
"workHistory": [
{
"_id": 2003,
"history": [
{
"_id": "600331b3d84ac418877a0e5a",
"tasksPerformed": [
"5fffb180a477c4f78ad67331",
"5fffb18aa477c4f78ad67332"
],
"year": 2003
},
{
"_id": "600331dcd84ac418877a0e5c",
"tasksPerformed": [
"5fffb180a477c4f78ad67331"
],
"year": 2003
}
]
}
],
"taskSchema": [
{
"_id": "5fffb180a477c4f78ad67331",
"active": true,
"title": "first"
},
{
"_id": "5fffb18aa477c4f78ad67332",
"active": true,
"title": "second"
}
]
}
Since $unwind is expensive, you could instead do the $lookup before grouping:
db.workHistory.aggregate([
{
"$lookup": {
"from": "taskSchema",
"localField": "tasksPerformed",
"foreignField": "_id",
"as": "tasksPerformed"
}
},
{
$group: {
_id: "$year",
history: {
$push: "$$ROOT"
}
}
}
])
You could also do it without an $unwind:
db.WorkHistory.aggregate(
[
{
$lookup: {
from: "Tasks",
localField: "tasksPerformed",
foreignField: "_id",
as: "tasksPerformed"
}
},
{
$group: {
_id: "$year",
history: { $push: "$$ROOT" }
}
}
])
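The effect of running $lookup with an array as localField, followed by the $group on year, can be sketched in plain JavaScript. This is only an in-memory emulation under assumptions: the sample documents, the short string ids, and the helper names lookupTasks and groupByYear are made up for illustration and are not part of any MongoDB API.

```javascript
// Stand-ins for the WorkHistory and Tasks collections; ids are plain strings (assumption).
const workHistory = [
  { _id: 'h1', year: 2003, tasksPerformed: ['t1', 't2'] },
  { _id: 'h2', year: 2003, tasksPerformed: ['t1'] },
  { _id: 'h3', year: 2004, tasksPerformed: [] },
];
const tasks = [
  { _id: 't1', title: 'first' },
  { _id: 't2', title: 'second' },
];

// Emulates $lookup with an array localField: every task whose _id occurs
// in the array is joined, and the result overwrites tasksPerformed.
function lookupTasks(historyDocs, taskDocs) {
  return historyDocs.map((doc) => ({
    ...doc,
    tasksPerformed: taskDocs.filter((task) => doc.tasksPerformed.includes(task._id)),
  }));
}

// Emulates { $group: { _id: '$year', history: { $push: '$$ROOT' } } }.
function groupByYear(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.year)) groups.set(doc.year, { _id: doc.year, history: [] });
    groups.get(doc.year).history.push(doc);
  }
  return [...groups.values()];
}

const historyByYear = groupByYear(lookupTasks(workHistory, tasks));
console.log(JSON.stringify(historyByYear, null, 2));
```

The membership test here uses strict equality, which is a reminder that the join is type-sensitive: in MongoDB a string does not match an ObjectId with the same hex digits, so mismatched id types are one possible cause of an empty tasksPerformed array.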

perform lookup on array from another collection in MongoDB

I have a collection of Orders. Each order has a list of Items, and each Item has a catalog_id, which is an ObjectId pointing to the Catalogs collection.
I need an aggregate query that retrieves certain orders, each with its Items extended to include the Catalog name and SKU, i.e.:
Original data structure:
Orders: [{
_id : ObjectId('ord1'),
items : [{
catalog_id: ObjectId('xyz1'),
qty: 5
},
{
catalog_id: ObjectId('xyz2'),
qty: 3
}]
}]
Catalogs: [{
_id : ObjectId('xyz1'),
name: 'my catalog name',
SKU: 'XxYxZx1'
},{
_id : ObjectId('xyz2'),
name: 'my other catalog name',
SKU: 'XxYxZx2'
}
]
The ideal outcome would be:
Orders: [{
_id : ObjectId('ord1'),
items : [{
catalog_id: ObjectId('xyz1'),
catalog_name: 'my catalog name',
catalog_SKU: 'XxYxZx1' ,
qty: 5
},
{
catalog_id: ObjectId('xyz2'),
catalog_name: 'my other catalog name',
catalog_SKU: 'XxYxZx2' ,
qty: 3
}
]
}]
What I did so far was:
db.orders.aggregate(
[
{
$match: {merchant_order_id: 'NIM333'}
},
{
$lookup: {
from: "catalogs",
//localField: 'items.catalog_id',
//foreignField: '_id',
let: { 'catalogId' : 'items.catalog_id' },
pipeline: [
{
$match : {$expr:{$eq:["$catalogs._id", "$$catalogId"]}}
},
{
$project: {"name": 1, "merchant_SKU": 1 }
}
],
as: "items_ex"
},
},
])
but items_ex comes out empty, for some reason I cannot understand.
You need to $unwind the items first and then rebuild the array with $group, so that each qty stays matched with its catalog_id inside the items array:
db.orders.aggregate([
{ "$match": { "merchant_order_id": "NIM333" }},
{ "$unwind": "$items" },
{ "$lookup": {
"from": "catalogs",
"let": { "catalogId": "$items.catalog_id", "qty": "$items.qty" },
"pipeline": [
{ "$match": { "$expr": { "$eq": ["$_id", "$$catalogId"] } }},
{ "$project": { "name": 1, "merchant_SKU": 1, "qty": "$$qty" }}
],
"as": "items"
}},
{ "$unwind": "$items" },
{ "$group": {
"_id": "$_id",
"items": { "$push": "$items" },
"data": { "$first": "$$ROOT" }
}},
{ "$replaceRoot": {
"newRoot": {
"$mergeObjects": ["$data", { "items": "$items" }]
}
}}
])
You're missing a dollar sign when you define your pipeline variable. There should be:
let: { 'catalogId' : '$items.catalog_id' },
Also, this expression returns an array, so you need $in instead of $eq:
{
$lookup: {
from: "catalogs",
let: { 'catalogId' : '$items.catalog_id' },
pipeline: [
{
$match : {$expr:{$in:["$_id", "$$catalogId"]}}
},
{
$project: {"name": 1, "merchant_SKU": 1 }
}
],
as: "items_ex"
}
}
Mongo Playground
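Why $eq finds nothing while $in works can be illustrated with a small in-memory sketch in plain JavaScript. The sample order and catalogs come from the question, with strings standing in for ObjectIds; the final merge into extendedItems is an assumed application-side step added for illustration and is not something the $lookup above produces by itself.

```javascript
// Sample data from the question; strings replace ObjectIds (assumption).
const order = {
  _id: 'ord1',
  items: [
    { catalog_id: 'xyz1', qty: 5 },
    { catalog_id: 'xyz2', qty: 3 },
  ],
};
const catalogs = [
  { _id: 'xyz1', name: 'my catalog name', SKU: 'XxYxZx1' },
  { _id: 'xyz2', name: 'my other catalog name', SKU: 'XxYxZx2' },
];

// "$items.catalog_id" on an array of subdocuments resolves to an array of ids.
const catalogIds = order.items.map((item) => item.catalog_id);

// $eq-style comparison: a single _id is never equal to the whole array.
const matchedWithEq = catalogs.filter((catalog) => catalog._id === catalogIds);

// $in-style comparison: membership of the _id in the array.
const matchedWithIn = catalogs.filter((catalog) => catalogIds.includes(catalog._id));

// Assumed application-side merge that yields the shape asked for in the question.
const catalogById = new Map(matchedWithIn.map((catalog) => [catalog._id, catalog]));
const extendedItems = order.items.map((item) => {
  const catalog = catalogById.get(item.catalog_id);
  return {
    ...item,
    catalog_name: catalog ? catalog.name : null,
    catalog_SKU: catalog ? catalog.SKU : null,
  };
});

console.log(matchedWithEq.length, matchedWithIn.length);
console.log(JSON.stringify(extendedItems, null, 2));
```

The $in variant returns the matching catalogs as a separate array, so pairing each catalog with its qty still needs a merge step like the one sketched here, or the $unwind / $group approach from the other answer.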

MongoDb: Getting $avg in aggregate for complex data

I'm trying to get an average rating in my Mongo aggregate and am having trouble accessing the nested array. I've gotten my aggregation to give the following array. I'm trying to have city_reviews return an array of averages.
[
{
"_id": "Dallas",
"city_reviews": [
//arrays of restaurant objects that include the rating
//I would like to get an average of the rating in each review, so these arrays will be numbers (averages)
[ {
"_id": "5b7ead6d106f0553d8807276",
"created": "2018-08-23T12:41:29.791Z",
"text": "Crackin good place. ",
"rating": 4,
"store": "5b7d67d5356114089909e58d",
"author": "5b7d675e356114089909e58b",
"__v": 0
}, {review2}, {review3}],
[{review1}, {review2}, {review3}],
[{review1}, {review2}],
[{review1}, {review2}, {review3}, {review4}],
[]
]
},
{
"_id": "Houston",
"city_reviews": [
// arrays of restaurants
[{review1}, {review2}, {review3}],
[{review1}, {review2}, {review3}],
[{review1}, {review2}, {review3}, {review4}],
[],
[]
]
}
]
I would like to do an aggregation on this that returns an array of averages within the city_reviews, like this:
{
"_id": "Dallas",
"city_reviews": [
// arrays of rating averages
[4.7],
[4.3],
[3.4],
[],
[]
]
}
Here's what I've tried. It's giving me back an averageRating of null, because $city_reviews is an array of arrays of objects and I'm not telling it to go deep enough to reach the rating key.
return this.aggregate([
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as:
'reviews' }},
{$group: {_id: '$city', city_reviews: { $push : '$reviews'}}},
{ $project: {
averageRating: { $avg: '$city_reviews'}
}}
])
Is there a way to change this line so that I get arrays of averages instead of the full review objects?
averageRating: { $avg: '$city_reviews'}
EDIT: Was asked for entire pipeline.
return this.aggregate([
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as: 'reviews' }},
{$group: {
_id: '$city',
city_reviews: { $push : '$reviews'}}
},
{ $project: {
photo: '$$ROOT.photo',
name: '$$ROOT.name',
reviews: '$$ROOT.reviews',
slug: '$$ROOT.slug',
city: '$$ROOT.city',
"averageRatingIndex":{
"$map":{
"input":"$city_reviews",
"in":[{"$avg":"$$this.rating"}]
}
},
}
},
{ $sort: { averageRating: -1 }},
{ $limit: 5 }
])
My first query was to connect two models together:
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as: 'reviews' }},
Which resulted in this:
[ {
"_id": "5b7d67d5356114089909e58d",
"location": {},
"tags": [],
"created": "2018-08-22T13:23:23.224Z",
"name": "Lucia",
"description": "Great name",
"city": "Dallas",
"photo": "ab64b3e7-6207-41d8-a670-94315e4b23af.jpeg",
"author": "5b7d675e356114089909e58b",
"slug": "lucia",
"__v": 0,
"reviews": []
},
{..more object like above}
]
Then, I grouped them like this:
{$group: {
_id: '$city',
city_reviews: { $push : '$reviews'}}
}
This returned what my original question is about. Essentially, I just want to have a total average rating for each city. My accepted answer does answer my original question. I'm getting back this:
{
"_id": "Dallas",
"averageRatingIndex": [
[ 4.2 ],
[ 3.6666666666666665 ],
[ null ],
[ 3.2 ],
[ 5 ],
[ null ]
]
}
I've tried to use the $avg operator on this to return one, final average that I can display for each city, but I'm having trouble.
You can use $map with $avg to output the averages.
{"$project":{
"averageRating":{
"$map":{
"input":"$city_reviews",
"in":[{"$avg":"$$this.rating"}]
}
}
}}
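What this $map / $avg combination computes can be sketched in plain JavaScript. The sample ratings and the helper name avg are assumptions for illustration; the helper returns null for an empty array, which matches the [ null ] entries reported in the question.

```javascript
// Emulates { $avg: <array of numbers> }: null when there is nothing to average.
function avg(numbers) {
  if (numbers.length === 0) return null;
  return numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
}

// One inner array of reviews per store, as produced by the $group stage (sample values).
const cityReviews = [
  [{ rating: 4 }, { rating: 5 }],
  [{ rating: 3 }],
  [],
];

// $map over the outer array; "$$this.rating" collects the ratings of one store,
// and the surrounding [ ... ] wraps each average in a one-element array.
const averageRating = cityReviews.map((storeReviews) => [
  avg(storeReviews.map((review) => review.rating)),
]);

console.log(JSON.stringify(averageRating)); // [[4.5],[3],[null]]
```

Dropping the square brackets around the $avg expression in the stage would give a flat array of numbers instead of an array of one-element arrays.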
With respect to your optimization request, I don't think there's a lot of room for improvement beyond the version that you already have. However, the following pipeline might be faster than your current solution because of the initial $group stage which should result in way less $lookups. I am not sure how MongoDB will optimize all of that internally so you might want to profile the two versions against a real data set.
db.getCollection('something').aggregate([{
$group: {
_id: '$city', // group by city
"averageRating": { $push: "$_id" } // create array of all encountered "_id"s per "city" bucket - we use the target field name to avoid creation of superfluous fields which would need to be removed from the output later on
}
}, {
$lookup: {
from: 'reviews',
let: { "averageRating": "$averageRating" }, // create a variable called "$$averageRating" which will hold the previously created array of "_id"s
pipeline: [{
$match: { $expr: { $in: [ "$store", "$$averageRating" ] } } // do the usual "joining"
}, {
$group: {
"_id": null, // group all found items into the same single bucket
"rating": { $avg: "$rating" }, // calculate the avg over all reviews in that bucket, i.e. per "city"
}
}],
as: 'averageRating'
}
}, {
$sort: { "averageRating.rating": -1 }
}, {
$limit: 5
}, {
$addFields: { // beautification of the output only, technically not needed - we do this as the last stage in order to only do it for the max. of 5 documents that we're interested in
"averageRating": { // this is where we reuse the field we created in the first stage
$arrayElemAt: [ "$averageRating.rating", 0 ] // pull the first element inside the array outside of the array
}
}
}])
In fact, the "initial $group stage" approach could also be used in conjunction with @Veeram's solution like this:
db.collection.aggregate([{
$group: {
_id: '$city', // group by city
"averageRating": { $push: "$_id" } // create array of all encountered "_id"s per "city" bucket - we use the target field name to avoid creation of superfluous fields which would need to be removed from the output later on
}
}, {
$lookup: {
from: 'reviews',
localField: 'averageRating',
foreignField: 'store',
as: 'averageRating'
},
}, {
$project: {
"averageRating": {
$avg: {
$map: {
input: "$averageRating",
in: { $avg: "$$this.rating" }
}
}
}
}
}, {
$sort: { averageRating: -1 }
}, {
$limit: 5
}])
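The "group first, then look up" idea can be traced with an in-memory sketch in plain JavaScript. The sample stores and reviews and the helper name averageRatingByCity are assumptions for illustration; the function only mirrors the $group, $lookup, $avg, $sort and $limit steps on these arrays.

```javascript
// Stand-ins for the stores and reviews collections (sample values, assumption).
const stores = [
  { _id: 's1', city: 'Dallas' },
  { _id: 's2', city: 'Dallas' },
  { _id: 's3', city: 'Houston' },
];
const reviews = [
  { store: 's1', rating: 4 },
  { store: 's1', rating: 5 },
  { store: 's2', rating: 3 },
  { store: 's3', rating: 2 },
];

function averageRatingByCity(storeDocs, reviewDocs) {
  // $group: collect the store ids of each city.
  const idsByCity = new Map();
  for (const store of storeDocs) {
    if (!idsByCity.has(store.city)) idsByCity.set(store.city, []);
    idsByCity.get(store.city).push(store._id);
  }

  const result = [];
  for (const [city, ids] of idsByCity) {
    // $lookup: all reviews whose store is one of the collected ids (a flat array).
    const ratings = reviewDocs
      .filter((review) => ids.includes(review.store))
      .map((review) => review.rating);
    // $avg: one number per city, or null when the city has no reviews.
    const averageRating =
      ratings.length > 0 ? ratings.reduce((sum, r) => sum + r, 0) / ratings.length : null;
    result.push({ _id: city, averageRating });
  }

  // $sort descending and $limit 5; null is coerced to 0 here, which is enough for this sample.
  return result.sort((a, b) => b.averageRating - a.averageRating).slice(0, 5);
}

const topCities = averageRatingByCity(stores, reviews);
console.log(topCities);
```

Because the joined reviews form one flat array per city, this is the average over all reviews of a city. It is not the average of the per-store averages, which would be 3.75 rather than 4 for Dallas in this sample.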

$lookup when foreignField is in nested array

I have two collections :
Student
{
_id: ObjectId("657..."),
name:'abc'
},
{
_id: ObjectId("593..."),
name:'xyz'
}
Library
{
_id: ObjectId("987..."),
book_name:'book1',
issued_to: [
{
student: ObjectId("657...")
},
{
student: ObjectId("658...")
}
]
},
{
_id: ObjectId("898..."),
book_name:'book2',
issued_to: [
{
student: ObjectId("593...")
},
{
student: ObjectId("594...")
}
]
}
I want to join the Student collection against the issued_to array of objects in the Library collection.
I would like to query the student collection to get the student data together with the matching library documents: check whether the student exists in the issued_to array and, if so, return the library document, otherwise not.
I have tried the $lookup of MongoDB 3.6, but I didn't succeed.
db.student.aggregate([{$match:{_id: ObjectId("593...")}}, $lookup: {from: 'library', let: {stu_id:'$_id'}, pipeline:[$match:{$expr: {$and:[{"$hotlist.clientEngagement": "$$stu_id"]}}]}])
But it throws an error. I also looked at other questions on Stack Overflow (question on stackoverflow,
question2 on stackoverflow), but these compare simple fields, not arrays of objects. Please help me.
I am not sure I understand your question entirely but this should help you:
db.student.aggregate([{
$match: { _id: ObjectId("657...") }
}, {
$lookup: {
from: 'library',
localField: '_id' ,
foreignField: 'issued_to.student',
as: 'result'
}
}])
If you only want to get all the book_names for each student, you can do this:
db.student.aggregate([{
$match: { _id: ObjectId("657657657657657657657657") }
}, {
$lookup: {
from: 'library',
let: { 'stu_id': '$_id' },
pipeline: [{
$unwind: '$issued_to' // $expr cannot digest arrays so we need to unwind which hurts performance...
}, {
$match: { $expr: { $eq: [ '$issued_to.student', '$$stu_id' ] } }
}, {
$project: { _id: 0, "book_name": 1 } // only include the book_name field
}],
as: 'result'
}
}])
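Both variants can be emulated in memory with plain JavaScript to see what ends up in result. The data is the sample from the question with shortened string ids instead of ObjectIds, and lookupBooks and lookupBookNames are hypothetical helper names, not MongoDB functions.

```javascript
// Sample data from the question; short strings replace the ObjectIds (assumption).
const students = [
  { _id: '657', name: 'abc' },
  { _id: '593', name: 'xyz' },
];
const library = [
  { _id: '987', book_name: 'book1', issued_to: [{ student: '657' }, { student: '658' }] },
  { _id: '898', book_name: 'book2', issued_to: [{ student: '593' }, { student: '594' }] },
];

// Emulates localField '_id' with foreignField 'issued_to.student':
// a book matches when any element of issued_to refers to the student.
function lookupBooks(student, books) {
  return {
    ...student,
    result: books.filter((book) => book.issued_to.some((entry) => entry.student === student._id)),
  };
}

// Emulates the sub-pipeline variant: $unwind, $match on the unwound element,
// then $project down to book_name only.
function lookupBookNames(student, books) {
  const result = [];
  for (const book of books) {
    for (const entry of book.issued_to) {
      if (entry.student === student._id) result.push({ book_name: book.book_name });
    }
  }
  return { ...student, result };
}

const withBooks = lookupBooks(students[0], library);
const withBookNames = lookupBookNames(students[1], library);
console.log(JSON.stringify(withBooks, null, 2));
console.log(JSON.stringify(withBookNames, null, 2));
```

The first variant returns whole library documents, including the full issued_to array; the second returns one small document per matching issued_to entry.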
This might not be a very good answer, but if you can change your schema of Library to:
{
_id: ObjectId("987..."),
book_name:'book1',
issued_to: [
ObjectId("657..."),
ObjectId("658...")
]
},
{
_id: ObjectId("898..."),
book_name:'book2',
issued_to: [
ObjectId("593..."),
ObjectId("594...")
]
}
Then when you do:
{
$lookup: {
from: 'student',
localField: 'issued_to',
foreignField: '_id',
as: 'issued_to_students', // this creates a new field without overwriting your original 'issued_to'
}
},
You should get, based on your example above:
{
_id: ObjectId("987..."),
book_name:'book1',
issued_to_students: [
{ _id: ObjectId("657..."), name: 'abc', ... },
{ _id: ObjectId("658..."), name: <name of this _id>, ... }
]
},
{
_id: ObjectId("898..."),
book_name:'book2',
issued_to_students: [
{ _id: ObjectId("593..."), name: 'xyz', ... },
{ _id: ObjectId("594..."), name: <name of this _id>, ... }
]
}
You need to $unwind issued_to from the library collection in order to match issued_to.student with _id:
db.student.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id) } },
{ "$lookup": {
"from": Library.collection.name,
"let": { "studentId": "$_id" },
"pipeline": [
{ "$unwind": "$issued_to" },
{ "$match": { "$expr": { "$eq": [ "$issued_to.student", "$$studentId" ] } } }
],
"as": "issued_to"
}}
])

How to resolve the many-to-many relation keeping the order of ID array in mongoDB

I have two collections posts and tags on mongoDB.
There is a many-to-many relationship between these collections.
A post can belong to some tags, and a tag can contain some posts.
I am looking for an efficient query method to join posts to tags keeping the order of postIds.
If the data schema is inappropriate, I can change it.
The mongoDB version is 3.6.5
Sample data
db.posts.insertMany([
{ _id: 'post001', title: 'this is post001' },
{ _id: 'post002', title: 'this is post002' },
{ _id: 'post003', title: 'this is post003' }
])
db.tags.insertMany([
{ _id: 'tag001', postIds: ['post003', 'post001', 'post002'] }
])
Desired result
{
"_id": "tag001",
"postIds": [ "post003", "post001", "post002" ],
"posts": [
{ "_id": "post003", "title": "this is post003" },
{ "_id": "post001", "title": "this is post001" },
{ "_id": "post002", "title": "this is post002" }
]
}
What I tried
I tried a query which use $lookup.
db.tags.aggregate([
{ $lookup: {
from: 'posts',
localField: 'postIds',
foreignField: '_id',
as: 'posts'
}}
])
However, I got a result that differs from what I want.
{
"_id": "tag001",
"postIds": [ "post003", "post001", "post002" ],
"posts": [
{ "_id": "post001", "title": "this is post001" },
{ "_id": "post002", "title": "this is post002" },
{ "_id": "post003", "title": "this is post003" }
]
}
In MongoDB you would attempt to model your data such that you avoid joins (as in $lookups) altogether, e.g. by storing the tags alongside the posts.
db.posts.insertMany([
{ _id: 'post001', title: 'this is post001', tags: [ "tag001", "tag002" ] },
{ _id: 'post002', title: 'this is post002', tags: [ "tag001" ] },
{ _id: 'post003', title: 'this is post003', tags: [ "tag002" ] }
])
With this structure in place you could get the desired result like this:
db.posts.aggregate([{
$unwind: "$tags"
}, {
$group: {
_id: "$tags",
postIds: {
$push: "$_id"
},
posts: {
$push: "$$ROOT"
}
}
}])
In this case, I would doubt that you even need the postIds field in the result as it would be contained in the posts array anyway.
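What the $unwind / $group pair does with this embedded-tags model can be sketched in plain JavaScript. The array taggedPosts repeats the sample data, and groupPostsByTag is a hypothetical helper name used only for this in-memory emulation.

```javascript
// Sample data with the tags stored alongside each post.
const taggedPosts = [
  { _id: 'post001', title: 'this is post001', tags: ['tag001', 'tag002'] },
  { _id: 'post002', title: 'this is post002', tags: ['tag001'] },
  { _id: 'post003', title: 'this is post003', tags: ['tag002'] },
];

function groupPostsByTag(postDocs) {
  const groups = new Map();
  for (const post of postDocs) {
    // $unwind: "$tags" emits one copy of the post per tag.
    for (const tag of post.tags) {
      if (!groups.has(tag)) groups.set(tag, { _id: tag, postIds: [], posts: [] });
      const group = groups.get(tag);
      // $group with $push: collect the ids and the unwound documents.
      group.postIds.push(post._id);
      group.posts.push({ ...post, tags: tag }); // after $unwind, tags holds a single value
    }
  }
  return [...groups.values()];
}

const postsByTag = groupPostsByTag(taggedPosts);
console.log(JSON.stringify(postsByTag, null, 2));
```

In this emulation the posts inside each tag appear in the order in which they are read from the collection, so a custom per-tag order such as post003, post001, post002 is not represented by this model.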
You can use a combination of $map and $filter to re-order elements in the posts array in a projection stage:
db.tags.aggregate([
{ $lookup: {
from: 'posts',
localField: 'postIds',
foreignField: '_id',
as: 'posts'
} },
{ $project: {
_id: 1,
postIds: 1,
posts: { $map: {
input: "$postIds",
as: "postId",
in: {
$arrayElemAt: [ { $filter: {
input: "$posts",
as: "post",
cond: { $eq: ["$$post._id", "$$postId"] }
} }, 0 ]
}
} }
} }
])
Missing posts will be filled with null to keep the indexes consistent with postIds.
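The re-ordering step can be checked with a small in-memory sketch in plain JavaScript. The id post999 is an assumed extra entry, added only to show what happens for a reference without a matching post; the variable names are illustrative.

```javascript
// Sample data from the question, plus one dangling id (post999 is an assumption).
const tag = { _id: 'tag001', postIds: ['post003', 'post001', 'post002', 'post999'] };
const posts = [
  { _id: 'post001', title: 'this is post001' },
  { _id: 'post002', title: 'this is post002' },
  { _id: 'post003', title: 'this is post003' },
];

// $lookup: the joined posts arrive in collection order, not in postIds order.
const lookedUp = posts.filter((post) => tag.postIds.includes(post._id));

// $map over postIds; for each id, $filter the joined posts and take element 0.
const orderedPosts = tag.postIds.map((postId) => {
  const matches = lookedUp.filter((post) => post._id === postId);
  return matches.length > 0 ? matches[0] : null; // no match becomes null
});

console.log(orderedPosts.map((post) => (post ? post._id : null)));
```

Each id triggers a scan of the joined array, so the work grows with the square of the array length; for long postIds arrays an id-to-post map built once in application code may be the cheaper option.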