Can we multiple two different fields from different collections in mongoDB?
any help will be highly appreciated...
Yes, you can using the Aggregation Pipeline $multiply operator. https://docs.mongodb.com/manual/reference/operator/aggregation/multiply/
What you want to do is join two collections together using $lookup https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/. In this case, I'll join the accounts and transactions collections on the account_id field.
Then we can project the fields we want to multiply. In this case, I'm getting the first element in the account array, which represents the account document I'm joining from the accounts collection.
Finally, I can multiply the two fields together.
[{
$lookup: {
from: 'accounts',
localField: 'account_id',
foreignField: 'account_id',
as: 'account'
}
}, {
$project: {
account: {
$arrayElemAt: ["$account", 0]
},
transaction_count: "$transaction_count",
}
}, {
$project: {
product: {
$multiply: ["$transaction_count", "$account.limit"]
}
}
}]
To reproduce my solution above, create a free cluster in Atlas (https://www.mongodb.com/cloud/atlas) and then load the sample data. Navigate to the Cluster's Collections. Then navigate to the sample_analytics database and the transactions collection. Then navigate to the Aggregation tab. Here you can create an Aggregation Pipeline stage by stage. It's incredibly helpful so you can see the output of each stage as you build the next. Below is a screenshot of the Aggregation Pipeline I described in my solution above.
If you don't have experience with the Aggregation Pipeline, I highly recommend MongoDB University's free course: https://university.mongodb.com/courses/M121/about
MongoDB aggregation operations allows us join two collections with $lookup method and compute field operation (i.e $multiply)
Given
"collection": [
{
id: 1,
"total": 5
},
{
id: 2,
"total": 2
}
],
"collection2": [
{
collId: 1,
"total": 3
},
{
collId: 2,
"total": 4
}
]
db.collection.aggregate([
{
$lookup: {
from: "collection2",
let: {
col_id: "$id",
col_total: "$total",
},
pipeline: [
{
$match: {
$expr: {
$eq: [
"$collId",
"$$col_id"
]
}
}
},
{
$project: {
summary: {
$multiply: [
"$total",
"$$col_total"
]
}
}
}
],
as: "result"
}
},
{
$addFields: {
result: {
$let: {
vars: {
tmp: {
$arrayElemAt: [
"$result",
0
]
}
},
in: "$$tmp.summary"
}
}
}
}
])
MongoPlayground
Result
[
{
"_id": ObjectId("5a934e000102030405000000"),
"id": 1,
"result": 15,
"total": 5
},
{
"_id": ObjectId("5a934e000102030405000001"),
"id": 2,
"result": 8,
"total": 2
}
]
Related
How can I count the number of completed houses designed by a specific architect in MongoDB?
I have the next two collections, "plans" and "houses".
Where the only relationship between houses and plans is that houses have the id of a given plan.
Is there a way to do this in MongoDB with just one query?
plans
{
_id: ObjectId("6388024d0dfd27246fb47a5f")
"hight": 10,
"arquitec": "Aneesa Wade",
},
{
_id: ObjectId("1188024d0dfd27246fb4711f")
"hight": 50,
"arquitec": "Smith Stone",
}
houses
{
_id: ObjectId
"plansId": "6388024d0dfd27246fb47a5f" -> string,
"status": "under construction",
},
{
_id: ObjectId
"plansId": "6388024d0dfd27246fb47a5f" -> string,
"status": "completed",
}
What I tried was to use mongo aggregations while using $match and $lookup.
The "idea" with clear errors would be something like this.
db.houses.aggregate([
{"$match": {"status": "completed"}},
{
"$lookup": {
"from": "plans",
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{ "$eq": [ "houses.plansId", { "$toString": "$plans._id" }]},
{ "plans.arquitec" : "Smith Stone" },
]
}
}
},
],
}
}
If it's a single join condition, simply do a project to object ID to avoid any complicated lookup pipelines.
Example playground - https://mongoplayground.net/p/gaqxZ7SzDTg
db.houses.aggregate([
{
$match: {
status: "completed"
}
},
{
$project: {
_id: 1,
plansId: 1,
status: 1,
plans_id: {
$toObjectId: "$plansId"
}
}
},
{
$lookup: {
from: "plans",
localField: "plans_id",
foreignField: "_id",
as: "plan"
}
},
{
$project: {
_id: 1,
plansId: 1,
status: 1,
plan: {
$first: "$plan"
}
}
},
{
$match: {
"plan.arquitec": "Some One"
}
}
])
Update: As per OP comment, added additional match stage for filtering the final result based on the lookup response.
I have an e-commerce server where I have a products and an orders collection.
Any product document contains a unique productId e.g. prod_123.
Each order document contains a lineItems (array) field which returns the productIds of the purchased products as well as the respective quantity purchased e.g.
[{ productId: 'prod_123', quantity: 2 }, { productId: 'prod_234', quantity: 7 }, ...]
When my client fetches their orders, I want to populate the each of the lineItems elements' productId with the matching product document in the products collection.
I have written a mongoDB aggregation pipeline to achieve this, and this is it so far:
const orderPipeline = [
{
$match: { customerId: 'the customer's ID' },
},
{
$lookup: {
from: 'products',
let: { productIds: '$lineItems.productId' },
pipeline: [
{ $match: { $expr: { $in: ['$productId', '$$productIds'] } } },
//*** somehow, need to add in corresponding `lineItem.quantity` here
],
as: 'products',
},
},
{ $unset: ['lineItems'] },
];
However, as you can see, though the join is taking place, I cannot work out how to add the matched product's quantity to the joined product before I remove lineItems.
How can I add the corresponding quantity to the corresponding matched product?
One approach, that I'm pretty sure will work given the additional constraints mentioned in the comments, would be to leverage the $zip operator. Overall it would work like this:
Perform the $lookup generating an array (products) with the information retrieved from the other collection.
Use an $addFields stage as the place where most of the combination logic happens. It will $zip the two arrays together and then $map over it to $mergeObjects each of the pairs into a single object.
Finish with an $unset stage to remove the original lineItems field (which has already been merged into the recreated products array.
The full pipeline would look something like this:
db.orders.aggregate([
{
$match: {
customerId: 123
},
},
{
$lookup: {
from: "products",
let: {
productIds: "$lineItems.productId"
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$productId",
"$$productIds"
]
}
}
}
],
as: "products",
}
},
{
"$addFields": {
"products": {
"$map": {
"input": {
"$zip": {
"inputs": [
"$lineItems",
"$products"
]
}
},
"in": {
"$mergeObjects": "$$this"
}
}
}
}
},
{
$unset: "lineItems"
}
])
Playground example here
The $map and the $mergeObjects: "$$this" probably look odd at first glance. This is needed because the $zip is going to generate an array of arrays (with 2 entries each), such as this:
"zipped": [
[
{
"productId": "a",
"quantity": 1
},
{
"_id": ObjectId("5a934e000102030405000002"),
"productId": "a"
}
],
[
{
"productId": "b",
"quantity": 2
},
{
"_id": ObjectId("5a934e000102030405000003"),
"productId": "b"
}
],
[
{
"productId": "c",
"quantity": 3
},
{
"_id": ObjectId("5a934e000102030405000004"),
"productId": "c"
}
]
]
(Here is a playground link that shows the output after zipping but before further processing.)
Because of this we need to collapse each of those into a single object, hence the $mergeObjects. And the fact that each object in the outer array is an array (with the two objects we want to merge) is why we can simply use "$$this" as the input expression for the operator.
Let's say I have these two collections:
// Members:
{
"_id":{
"$oid":"60dca71f0394f430c8ca296d"
},
"church":"60dbb265a75a610d90b45c6b",
"name":"Julio Verne Cerqueira"
},
{
"_id":{
"$oid":"60dca71f0394f430c8ca29a8"
},
"nome":"Ryan Steel Oliveira",
"church":"60dbb265a75a610d90b45c6c"
}
And
// Churches
{
"_id": {
"$oid": "60dbb265a75a610d90b45c6c"
},
"name": "Saint Antoine Hill",
"active": true
},
{
"_id": {
"$oid": "60dbb265a75a610d90b45c6b"
},
"name": "Jackeline Hill",
"active": true
}
And I want to query it and have a result like this:
// Member with Church names
{
"_id":{
"$oid":"60dca71f0394f430c8ca296d"
},
"church":"Jackeline Hill",
"name":"Julio Verne Cerqueira"
},
{
"_id":{
"$oid":"60dca71f0394f430c8ca29a8"
},
"church":"Saint Antoine Hill",
"nome":"Ryan Steel Oliveira"
}
If I try a Lookup, I have the following Result: (It is getting the entire churches collection).
How would I do the query, so it gives me only the one church that member is related to?
And, if possible, how to Sort the result in alphabetical order by church then by name?
Obs.: MongoDB Version: 4.4.10
There is matching error in the $lookup --> $pipeline --> $match.
It should be:
$match: {
$expr: {
$eq: [
"$_id",
"$$searchId"
]
}
}
From the provided documents, members to churchies relationship will be 1 to many. Hence, when you join members with churchies via $lookup, the output church will be an array with only one churchies document.
Aggregation pipelines:
$lookup - Join members collection (by $$searchId) with churchies (by _id).
$unwind - Deconstruct church array field to multiple documents.
$project - Decorate output document.
$sort - Sort by church and name ascending.
db.members.aggregate([
{
"$lookup": {
"from": "churchies",
"let": {
searchId: {
"$toObjectId": "$church"
}
},
"pipeline": [
{
$match: {
$expr: {
$eq: [
"$_id",
"$$searchId"
]
}
}
},
{
$project: {
name: 1
}
}
],
"as": "church"
}
},
{
"$unwind": "$church"
},
{
$project: {
_id: 1,
church: "$church.name",
name: 1
}
},
{
"$sort": {
"church": 1,
"name": 1
}
}
])
Sample Mongo Playground
I have two collection:
Competition
{
"_id": "326",
signed_up": [
{"_id": "00001","category": ["First"], "status": true}]
}
and Playing
{
"_id": "6076e504db319b11c077d473",
"competition_id": "326",
"player": {"player_id": "00001","handicap": 6},
"totalScore": 6
}
I want to add playing --> totalScore on competition.signed_up array, based on player_id field:
{
"_id": "326",
signed_up": [
{"_id": "00001","category": ["First"], "status": true, "totalScore": 6]
}
I do not know how to do...
I'm not telling you this is the optimal way, but it seems to work...
Let's start out with the data. I've added one player to the competition, just to make it a little easier to see that things works as expected:
db.competition.insertOne({
"_id": "326",
"signed_up": [{
"_id": "00001",
"category": ["First"],
"status": true
}, {
"_id": "00002",
"category": ["First"],
"status": true
}]
})
db.playing.insertMany([
{
"competition_id": "326",
"player": {
"playing_id": "00001"
},
"totalScore": 6
},
{
"competition_id": "326",
"player": {
"playing_id": "00002"
},
"totalScore": 2
}
]);
Now for the aggregation...
db.competition.aggregate([
// Even though the [documentation](https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/#use--lookup-with-an-array) states that unwinding is no longer necessary,
// I'm not sure if that includes arrays of subdocuments or only arrays of primitives. So I've chosen to unwind anyway...
{
$unwind: "$signed_up"
},
// => { "_id": "326", "signed_up": { "_id": "00001", ....} }
// now we have each player in it's own document and can easily lookup the score from playing collection
{
$lookup: {
from: 'playing',
localField: 'signed_up._id',
foreignField: 'player.playing_id',
as: 'player'
}
},
// => { "_id": "326", "signed_up": {...}, "player": [{ competition_id": "326"...}, ..]}
// now we have the matching competition documents as an array on each document.
// But we know there will only be one match and don't really care for the array,
// so we have to do some gymnastics to get the data we want where we want it
{
$project: {
"signed_up": {
$let: {
vars: {
player: { $arrayElemAt: [ "$player", 0 ] }
},
in: {
$mergeObjects: [
"$signed_up",
{ "totalScore": "$$player.totalScore" }
]
}
}
}
}
},
// => { "_id": "326", "signed_up": { "_id": "00001", .... , "totalScore": 6 } }
// Now we're pretty much done, except that we need to group the documents back
// into the original competition documents
{
$group: {
_id: "$_id",
signed_up: {
$push: "$signed_up"
}
}
}
// => { "_id": "326", "signed_up": [ { "_id": "00001", ....}, {"_id": "00002", ...} ] }
// And that completes the pipeline.
]);
I see that you have the id from the competition document also on the playing document, so I suspect that you need an additional check on the lookup to make sure you get the correct match. The way the code I have works, is that if you have more than one competition, you will get all the competitions for a player added to the playing array after the lookup.
If you take a look at the example Specify Multiple Join Conditions with $lookup in the documentation, you see how you can change the $lookup stage to do a more precise match on the target documents by using a pipeline on the target collection. It also shows how you can include a projection in that pipeline to only return the data that you really want.
Edit
Take a look at the following alternative lookup step:
{
$lookup: {
from: 'playing',
let: { playerid: "$signed_up._id", compid: "$_id" },
pipeline: [
{ $match: {
$expr: {
$and: [
{ $eq: ["$player.playing_id","$$playerid" ] },
{ $eq: ["$competition_id", "$$compid" ] }
]
}
}
},
{ $project: {
_id: 0,
"totalScore": 1
}
}
],
as: 'player'
}
}
This stores the players id and competition id from the current document into two variables. Then it uses those two variables in a pipeline run against the other collection. In addition to the $match to select the right player/competition document, it also includes a $project to get rid of the other fields on the playing documents. It will still return an array of one object, but it might save some bytes of memory usage...
I'm trying to get an average rating in my Mongo aggregate and am having trouble accessing the nested array. I've gotten my aggregation to give the following array. I'm trying to have city_reviews return an array of averages.
[
{
"_id": "Dallas",
"city_reviews": [
//arrays of restaurant objects that include the rating
//I would like to get an average of the rating in each review, so these arrays will be numbers (averages)
[ {
"_id": "5b7ead6d106f0553d8807276",
"created": "2018-08-23T12:41:29.791Z",
"text": "Crackin good place. ",
"rating": 4,
"store": "5b7d67d5356114089909e58d",
"author": "5b7d675e356114089909e58b",
"__v": 0
}, {review2}, {review3}]
[{review1}, {review2}, {review3}],
[{review1}. {review2}],
[{review1}, {review2}, {review3}, {review4}],
[]
]
},
{
"_id": "Houston",
"city_reviews": [
// arrays of restaurants
[{review1}, {review2}, {review3}],
[{review1}, {review2}, {review3}],
[{review1}, {review2}, {review3}, {review4}],
[],
[]
]
}
]
I would like to do an aggregation on this that returns an array of averages within the city_reviews, like this:
{
"_id": "Dallas",
"city_reviews": [
// arrays of rating averages
[4.7],
[4.3],
[3.4],
[],
[]
]
}
Here's what I've tried. It's giving me back averageRating of null, because $city_reviews is an array of object and I'm not telling it to go deep enough to capture the rating key.
return this.aggregate([
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as:
'reviews' }},
{$group: {_id: '$city', city_reviews: { $push : '$reviews'}}},
{ $project: {
averageRating: { $avg: '$city_reviews'}
}}
])
Is there a way to work with this line so I can return arrays of averages instead of the full review objects.
averageRating: { $avg: '$city_reviews'}
EDIT: Was asked for entire pipeline.
return this.aggregate([
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as: 'reviews' }},
{$group: {
_id: '$city',
city_reviews: { $push : '$reviews'}}
},
{ $project: {
photo: '$$ROOT.photo',
name: '$$ROOT.name',
reviews: '$$ROOT.reviews',
slug: '$$ROOT.slug',
city: '$$ROOT.city',
"averageRatingIndex":{
"$map":{
"input":"$city_reviews",
"in":[{"$avg":"$$this.rating"}]
}
},
}
},
{ $sort: { averageRating: -1 }},
{ $limit: 5 }
])
My first query was to connect two models together:
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as: 'reviews' }},
Which resulted in this:
[ {
"_id": "5b7d67d5356114089909e58d",
"location": {},
"tags": [],
"created": "2018-08-22T13:23:23.224Z",
"name": "Lucia",
"description": "Great name",
"city": "Dallas",
"photo": "ab64b3e7-6207-41d8-a670-94315e4b23af.jpeg",
"author": "5b7d675e356114089909e58b",
"slug": "lucia",
"__v": 0,
"reviews": []
},
{..more object like above}
]
Then, I grouped them like this:
{$group: {
_id: '$city',
city_reviews: { $push : '$reviews'}}
}
This returned what my original question is about. Essentially, I just want to have a total average rating for each city. My accepted answer does answer my original question. I'm getting back this:
{
"_id": "Dallas",
"averageRatingIndex": [
[ 4.2 ],
[ 3.6666666666666665 ],
[ null ],
[ 3.2 ],
[ 5 ],
[ null ]
]
}
I've tried to use the $avg operator on this to return one, final average that I can display for each city, but I'm having trouble.
You can use $map to with $avg to output avg.
{"$project":{
"averageRating":{
"$map":{
"input":"$city_reviews",
"in":[{"$avg":"$$this.rating"}]
}
}
}}
With respect to your optimization request, I don't think there's a lot of room for improvement beyond the version that you already have. However, the following pipeline might be faster than your current solution because of the initial $group stage which should result in way less $lookups. I am not sure how MongoDB will optimize all of that internally so you might want to profile the two versions against a real data set.
db.getCollection('something').aggregate([{
$group: {
_id: '$city', // group by city
"averageRating": { $push: "$_id" } // create array of all encountered "_id"s per "city" bucket - we use the target field name to avoid creation of superfluous fields which would need to be removed from the output later on
}
}, {
$lookup: {
from: 'reviews',
let: { "averageRating": "$averageRating" }, // create a variable called "$$ids" which will hold the previously created array of "_id"s
pipeline: [{
$match: { $expr: { $in: [ "$store", "$$averageRating" ] } } // do the usual "joining"
}, {
$group: {
"_id": null, // group all found items into the same single bucket
"rating": { $avg: "$rating" }, // calculate the avg on a per "store" basis
}
}],
as: 'averageRating'
}
}, {
$sort: { "averageRating.rating": -1 }
}, {
$limit: 5
}, {
$addFields: { // beautification of the output only, technically not needed - we do this as the last stage in order to only do it for the max. of 5 documents that we're interested in
"averageRating": { // this is where we reuse the field we created in the first stage
$arrayElemAt: [ "$averageRating.rating", 0 ] // pull the first element inside the array outside of the array
}
}
}])
In fact, the "initial $group stage" approach could also be used in conjunction with #Veerams solution like this:
db.collection.aggregate([{
$group: {
_id: '$city', // group by city
"averageRating": { $push: "$_id" } // create array of all encountered "_id"s per "city" bucket - we use the target field name to avoid creation of superfluous fields which would need to be removed from the output later on
}
}, {
$lookup: {
from: 'reviews',
localField: 'averageRating',
foreignField: 'store',
as: 'averageRating'
},
}, {
$project: {
"averageRating": {
$avg: {
$map: {
input: "$averageRating",
in: { $avg: "$$this.rating" }
}
}
}
}
}, {
$sort: { averageRating: -1 }
}, {
$limit: 5
}])