How to add entity field to joined documents? - mongodb

I have an e-commerce server where I have a products and an orders collection.
Any product document contains a unique productId e.g. prod_123.
Each order document contains a lineItems (array) field which returns the productIds of the purchased products as well as the respective quantity purchased e.g.
[{ productId: 'prod_123', quantity: 2 }, { productId: 'prod_234', quantity: 7 }, ...]
When my client fetches their orders, I want to populate each lineItems element's productId with the matching product document from the products collection.
I have written a MongoDB aggregation pipeline to achieve this, and this is it so far:
const orderPipeline = [
{
$match: { customerId: "the customer's ID" },
},
{
$lookup: {
from: 'products',
let: { productIds: '$lineItems.productId' },
pipeline: [
{ $match: { $expr: { $in: ['$productId', '$$productIds'] } } },
//*** somehow, need to add in corresponding `lineItem.quantity` here
],
as: 'products',
},
},
{ $unset: ['lineItems'] },
];
However, as you can see, though the join is taking place, I cannot work out how to add the matched product's quantity to the joined product before I remove lineItems.
How can I add the corresponding quantity to the corresponding matched product?

One approach, that I'm pretty sure will work given the additional constraints mentioned in the comments, would be to leverage the $zip operator. Overall it would work like this:
Perform the $lookup generating an array (products) with the information retrieved from the other collection.
Use an $addFields stage as the place where most of the combination logic happens. It will $zip the two arrays together and then $map over it to $mergeObjects each of the pairs into a single object.
Finish with an $unset stage to remove the original lineItems field (which has already been merged into the recreated products array).
The full pipeline would look something like this:
db.orders.aggregate([
{
$match: {
customerId: 123
},
},
{
$lookup: {
from: "products",
let: {
productIds: "$lineItems.productId"
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$productId",
"$$productIds"
]
}
}
}
],
as: "products",
}
},
{
"$addFields": {
"products": {
"$map": {
"input": {
"$zip": {
"inputs": [
"$lineItems",
"$products"
]
}
},
"in": {
"$mergeObjects": "$$this"
}
}
}
}
},
{
$unset: "lineItems"
}
])
Playground example here
The $map and the $mergeObjects: "$$this" probably look odd at first glance. This is needed because the $zip is going to generate an array of arrays (with 2 entries each), such as this:
"zipped": [
[
{
"productId": "a",
"quantity": 1
},
{
"_id": ObjectId("5a934e000102030405000002"),
"productId": "a"
}
],
[
{
"productId": "b",
"quantity": 2
},
{
"_id": ObjectId("5a934e000102030405000003"),
"productId": "b"
}
],
[
{
"productId": "c",
"quantity": 3
},
{
"_id": ObjectId("5a934e000102030405000004"),
"productId": "c"
}
]
]
(Here is a playground link that shows the output after zipping but before further processing.)
Because of this we need to collapse each of those into a single object, hence the $mergeObjects. And the fact that each object in the outer array is an array (with the two objects we want to merge) is why we can simply use "$$this" as the input expression for the operator.
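To see the zip-then-merge step in isolation, here is a minimal plain-JavaScript sketch that imitates what $zip and $mergeObjects do on in-memory arrays. The helper names zip and mergeLineItems are made up for illustration, and the sample data is a shortened version of the zipped output above. Like $zip, the sketch pairs elements purely by position, so it assumes lineItems and products arrive in the same order and with the same length.

```javascript
// Plain-JS imitation of $zip: pair elements by index, stopping at the shorter array.
function zip(a, b) {
  const length = Math.min(a.length, b.length);
  return Array.from({ length }, (_, i) => [a[i], b[i]]);
}

// Imitation of $map + { $mergeObjects: "$$this" }: each pair collapses into one
// object, and on a key clash the later object in the pair wins.
function mergeLineItems(lineItems, products) {
  return zip(lineItems, products).map((pair) => Object.assign({}, ...pair));
}

// Sample data (illustrative only)
const lineItems = [
  { productId: "a", quantity: 1 },
  { productId: "b", quantity: 2 },
];
const products = [
  { _id: 1, productId: "a" },
  { _id: 2, productId: "b" },
];

const merged = mergeLineItems(lineItems, products);
console.log(merged);
```

If the two arrays could ever come back in a different order, positional pairing would silently attach a quantity to the wrong product, so that assumption is worth verifying against real data.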

Related

How to query MongoDB array to see if it has multiple instances of same $elemMatch?

I have a collection where each document has an array (called elements) in it. Each element within this array represents an object with a name and age value. I have some documents where in the array, I will have duplicate objects, or at least objects with some similar values (i.e. the name is the same).
My collection looks like this (notice how the 2nd document has 2 instances of Bob):
[
{
"_id": {
"$oid": "6395471f80495752e7208c63"
},
"elements": [
{
"name": "Alice",
"age": 20
},
{
"name": "Bob",
"age": 21
},
{
"name": "Charlie",
"age": 23
}
]
},
{
"_id": {
"$oid": "6395486980495752e7208c67"
},
"elements": [
{
"name": "Alice",
"age": 20
},
{
"name": "Bob",
"age": 21
},
{
"name": "Bob",
"age": 24
}
]
}
]
I want to be able to build a query with $elemMatch so that if I want to, I can find a document which has multiple instances of the same $elemMatch element, i.e. I want to be able to find a document which has an elements array with 2 Bob's in it.
I have tried doing a query like the one below, but with no success.
db.collection.find({
$and: [
{
elements: {
$elemMatch: {
name: "Bob"
}
}
},
{
elements: {
$elemMatch: {
name: "Bob"
}
}
}
]
})
The intended result of this query would be as follows:
[
{
"_id": ObjectId("6395486980495752e7208c67"),
"elements": [
{
"age": 20,
"name": "Alice"
},
{
"age": 21,
"name": "Bob"
},
{
"age": 24,
"name": "Bob"
}
]
}
]
Here is a MongoPlayground link which may make the problem easier to view.
Your current attempt does not instruct the database to find two different array elements which match the (same) condition. The second array entry in the first sample document is allowed to satisfy both of the (duplicated) conditions that are $anded together, hence it matching and being returned.
To instruct the database to do that additional checking, we'll need to use something like the $reduce operator. Operators like this are normally only available in the aggregation framework, but we can pull them in here via $expr. That component of the query might look like this:
$expr: {
$gte: [
{
$reduce: {
input: "$elements",
initialValue: 0,
in: {
$sum: [
"$$value",
{
$cond: [
{
$eq: [
"$$this.name",
"Bob"
]
},
1,
0
]
}
]
}
}
},
2
]
}
Playground example here.
A different approach would be to use $size after processing the elements array via $filter. Something like this:
$expr: {
$gte: [
{
$size: {
$filter: {
input: "$elements",
cond: {
$eq: [
"$$this.name",
"Bob"
]
}
}
}
},
2
]
}
Example playground here.
In either case you will want to retain the original filter so that the database can more efficiently identify candidate documents prior to doing the more expensive processing to identify the final results.
As an aside, $elemMatch isn't strictly necessary here. Assuming elements is always an array and that you're only querying a single condition, the dot notation equivalent ('elements.name': 'Bob') would yield the same as the $elemMatch version.
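Putting the pieces together, a combined filter might look roughly like the sketch below. The helper name atLeastNamed is hypothetical; it simply builds a filter document that keeps the cheap dot-notation pre-filter next to the $size/$filter count check. The $literal wrapper is an extra precaution added here (not part of the answer above) so that a name starting with "$" is not read as a field path.

```javascript
// Hypothetical helper: builds a find() filter that matches documents whose
// `elements` array holds at least `min` entries with the given name.
function atLeastNamed(name, min) {
  return {
    // Pre-filter so the database can narrow down candidate documents first.
    "elements.name": name,
    // Count check, evaluated on the candidate documents.
    $expr: {
      $gte: [
        {
          $size: {
            $filter: {
              input: "$elements",
              cond: { $eq: ["$$this.name", { $literal: name }] },
            },
          },
        },
        min,
      ],
    },
  };
}

const filter = atLeastNamed("Bob", 2);
console.log(JSON.stringify(filter, null, 2));
// In the shell this would be passed along as: db.collection.find(filter)
```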

Pretty $lookup on collection - mongoDB

I have two collections:
Competition
{
"_id": "326",
"signed_up": [
{"_id": "00001","category": ["First"], "status": true}]
}
and Playing
{
"_id": "6076e504db319b11c077d473",
"competition_id": "326",
"player": {"player_id": "00001","handicap": 6},
"totalScore": 6
}
I want to add the totalScore from Playing to the matching entry in the Competition signed_up array, based on the player_id field:
{
"_id": "326",
"signed_up": [
{"_id": "00001","category": ["First"], "status": true, "totalScore": 6}]
}
I do not know how to do this.
I'm not telling you this is the optimal way, but it seems to work...
Let's start out with the data. I've added one player to the competition, just to make it a little easier to see that things work as expected:
db.competition.insertOne({
"_id": "326",
"signed_up": [{
"_id": "00001",
"category": ["First"],
"status": true
}, {
"_id": "00002",
"category": ["First"],
"status": true
}]
})
db.playing.insertMany([
{
"competition_id": "326",
"player": {
"playing_id": "00001"
},
"totalScore": 6
},
{
"competition_id": "326",
"player": {
"playing_id": "00002"
},
"totalScore": 2
}
]);
Now for the aggregation...
db.competition.aggregate([
// Even though the [documentation](https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/#use--lookup-with-an-array) states that unwinding is no longer necessary,
// I'm not sure if that includes arrays of subdocuments or only arrays of primitives. So I've chosen to unwind anyway...
{
$unwind: "$signed_up"
},
// => { "_id": "326", "signed_up": { "_id": "00001", ....} }
// now we have each player in its own document and can easily lookup the score from playing collection
{
$lookup: {
from: 'playing',
localField: 'signed_up._id',
foreignField: 'player.playing_id',
as: 'player'
}
},
// => { "_id": "326", "signed_up": {...}, "player": [{ "competition_id": "326"...}, ..]}
// now we have the matching playing documents as an array on each document.
// But we know there will only be one match and don't really care for the array,
// so we have to do some gymnastics to get the data we want where we want it
{
$project: {
"signed_up": {
$let: {
vars: {
player: { $arrayElemAt: [ "$player", 0 ] }
},
in: {
$mergeObjects: [
"$signed_up",
{ "totalScore": "$$player.totalScore" }
]
}
}
}
}
},
// => { "_id": "326", "signed_up": { "_id": "00001", .... , "totalScore": 6 } }
// Now we're pretty much done, except that we need to group the documents back
// into the original competition documents
{
$group: {
_id: "$_id",
signed_up: {
$push: "$signed_up"
}
}
}
// => { "_id": "326", "signed_up": [ { "_id": "00001", ....}, {"_id": "00002", ...} ] }
// And that completes the pipeline.
]);
I see that you have the id from the competition document also on the playing document, so I suspect that you need an additional check on the lookup to make sure you get the correct match. As the code stands, if a player takes part in more than one competition, the lookup will add that player's playing documents from all of those competitions to the player array.
If you take a look at the example Specify Multiple Join Conditions with $lookup in the documentation, you see how you can change the $lookup stage to do a more precise match on the target documents by using a pipeline on the target collection. It also shows how you can include a projection in that pipeline to only return the data that you really want.
Edit
Take a look at the following alternative lookup step:
{
$lookup: {
from: 'playing',
let: { playerid: "$signed_up._id", compid: "$_id" },
pipeline: [
{ $match: {
$expr: {
$and: [
{ $eq: ["$player.playing_id","$$playerid" ] },
{ $eq: ["$competition_id", "$$compid" ] }
]
}
}
},
{ $project: {
_id: 0,
"totalScore": 1
}
}
],
as: 'player'
}
}
This stores the player's id and the competition id from the current document in two variables. Then it uses those two variables in a pipeline run against the other collection. In addition to the $match that selects the right player/competition document, it also includes a $project to get rid of the other fields on the playing documents. It will still return an array of one object, but it might save some bytes of memory usage...
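To make the effect of the two join conditions easier to follow, here is a small plain-JavaScript sketch that mimics the unwind, lookup, merge and regroup steps on in-memory data. The function name withScores and the third playing document (competition "999") are assumptions added purely for illustration; nothing here talks to a database.

```javascript
// Sample data modelled on the documents inserted above.
const competition = {
  _id: "326",
  signed_up: [
    { _id: "00001", category: ["First"], status: true },
    { _id: "00002", category: ["First"], status: true },
  ],
};
const playing = [
  { competition_id: "326", player: { playing_id: "00001" }, totalScore: 6 },
  { competition_id: "326", player: { playing_id: "00002" }, totalScore: 2 },
  // Hypothetical extra document: same player, different competition.
  { competition_id: "999", player: { playing_id: "00001" }, totalScore: 40 },
];

// Hypothetical helper: for each signed-up player, find the playing document
// that matches on BOTH player id and competition id, then merge its score in.
function withScores(comp, playingDocs) {
  const signedUp = comp.signed_up.map((entry) => {
    const match = playingDocs.find(
      (p) => p.player.playing_id === entry._id && p.competition_id === comp._id
    );
    // In this sketch, an entry without a match is returned unchanged.
    return match ? { ...entry, totalScore: match.totalScore } : { ...entry };
  });
  return { _id: comp._id, signed_up: signedUp };
}

const scored = withScores(competition, playing);
console.log(JSON.stringify(scored, null, 2));
```

Without the competition check, player "00001" would have two candidate documents, which is the ambiguity the alternative lookup step is meant to remove.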

Getting the $Max value within a nested array

I'm attempting to extract the highest value from a child field within an object that is itself within a parent array, all in a single MongoDB document.
The child is called data and is contained within the list parent array; I'm trying to extract the highest number when compared to the rest of the same values.
I've tried using $group and $max (example below), among other things, without much success: I am getting an array returned with all the number values: [2,3]
How do I search through the list Array and data Array to return the highest number?
Expected Output for the below example: {output: 3}
Example in MongoPlayground: https://mongoplayground.net/p/qw9Kz_WVYiS
Mongo DB Setup and Documents
db={
"groups": [
{
"_id": ObjectId("602ed22af42c404096407dda"),
"groupName": "Name"
}
],
"inventory": [
{
"_id": ObjectId("602ed22af42c404096407ddc"),
"linkedGroup": ObjectId("602ed22af42c404096407dda"),
"list": [
{
"_id": ObjectId("602eeb0621a11045638b7082"),
"data": {
"number": 2
},
},
{
"_id": ObjectId("602eec75c37147459ed7b12c"),
"data": {
"number": 3
}
}
]
}
]
}
Query
db.groups.aggregate([
{
"$lookup": {
"from": "inventory",
"localField": "_id",
"foreignField": "linkedGroup",
"as": "inventory_links"
}
},
{
$group: {
_id: 1,
output: {
$max: "$inventory_links.list.data.number"
},
},
}
])
You can use $reduce to find the maximum. Append the following stages to your query:
{
$addFields: {
_id: 1,
inventory_links: {"$arrayElemAt": ["$inventory_links",0]}
}
},
{
$project: {
output: {
$reduce: {
input: "$inventory_links.list",
initialValue: 0,
in: {
$cond: [
{$gte: [ "$$this.data.number","$$value"]},
"$$this.data.number",
"$$value"
]
}
}
}
}
}
Working Mongo playground
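For intuition, the same fold can be written in plain JavaScript. This is only a sketch of the logic, with maxNumber as a made-up helper name. It also shows one thing to watch for: with an initialValue of 0, a list that contains only negative numbers would yield 0 rather than its real maximum.

```javascript
// Shape of the `list` array from the sample inventory document.
const list = [
  { data: { number: 2 } },
  { data: { number: 3 } },
];

// Same fold as the $reduce stage: keep whichever of "$$this.data.number"
// and "$$value" is larger, starting from initialValue.
function maxNumber(items, initialValue) {
  return items.reduce(
    (value, item) => (item.data.number >= value ? item.data.number : value),
    initialValue
  );
}

console.log(maxNumber(list, 0)); // 3

// Caveat: a starting value of 0 hides negative maxima.
const negatives = [{ data: { number: -5 } }];
console.log(maxNumber(negatives, 0)); // 0, not -5
console.log(maxNumber(negatives, -Infinity)); // -5
```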

Different Fields Multiplication in MongoDB

Can we multiply two different fields from different collections in MongoDB?
Any help will be highly appreciated...
Yes, you can, using the Aggregation Pipeline $multiply operator. https://docs.mongodb.com/manual/reference/operator/aggregation/multiply/
What you want to do is join two collections together using $lookup https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/. In this case, I'll join the accounts and transactions collections on the account_id field.
Then we can project the fields we want to multiply. In this case, I'm getting the first element in the account array, which represents the account document I'm joining from the accounts collection.
Finally, I can multiply the two fields together.
[{
$lookup: {
from: 'accounts',
localField: 'account_id',
foreignField: 'account_id',
as: 'account'
}
}, {
$project: {
account: {
$arrayElemAt: ["$account", 0]
},
transaction_count: "$transaction_count",
}
}, {
$project: {
product: {
$multiply: ["$transaction_count", "$account.limit"]
}
}
}]
To reproduce my solution above, create a free cluster in Atlas (https://www.mongodb.com/cloud/atlas) and then load the sample data. Navigate to the Cluster's Collections. Then navigate to the sample_analytics database and the transactions collection. Then navigate to the Aggregation tab. Here you can create an Aggregation Pipeline stage by stage. It's incredibly helpful because you can see the output of each stage as you build the next.
If you don't have experience with the Aggregation Pipeline, I highly recommend MongoDB University's free course: https://university.mongodb.com/courses/M121/about
MongoDB aggregation lets us join two collections with the $lookup stage and then compute on the joined fields (e.g. with $multiply).
Given
"collection": [
{
id: 1,
"total": 5
},
{
id: 2,
"total": 2
}
],
"collection2": [
{
collId: 1,
"total": 3
},
{
collId: 2,
"total": 4
}
]
db.collection.aggregate([
{
$lookup: {
from: "collection2",
let: {
col_id: "$id",
col_total: "$total",
},
pipeline: [
{
$match: {
$expr: {
$eq: [
"$collId",
"$$col_id"
]
}
}
},
{
$project: {
summary: {
$multiply: [
"$total",
"$$col_total"
]
}
}
}
],
as: "result"
}
},
{
$addFields: {
result: {
$let: {
vars: {
tmp: {
$arrayElemAt: [
"$result",
0
]
}
},
in: "$$tmp.summary"
}
}
}
}
])
MongoPlayground
Result
[
{
"_id": ObjectId("5a934e000102030405000000"),
"id": 1,
"result": 15,
"total": 5
},
{
"_id": ObjectId("5a934e000102030405000001"),
"id": 2,
"result": 8,
"total": 2
}
]
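As a cross-check of the arithmetic, the same join-and-multiply can be sketched in plain JavaScript over the sample data. The function name multiplyTotals is hypothetical, and returning null when no partner document exists is a choice made for this sketch only, not a statement about what the pipeline does in that case.

```javascript
// The two sample collections from the "Given" block above.
const collection = [
  { id: 1, total: 5 },
  { id: 2, total: 2 },
];
const collection2 = [
  { collId: 1, total: 3 },
  { collId: 2, total: 4 },
];

// Hypothetical helper: join on id === collId and multiply the two totals.
function multiplyTotals(left, right) {
  return left.map((doc) => {
    const match = right.find((other) => other.collId === doc.id);
    // No partner document: this sketch reports null.
    return { ...doc, result: match ? doc.total * match.total : null };
  });
}

const multiplied = multiplyTotals(collection, collection2);
console.log(multiplied);
```

For the sample data this gives 5 * 3 = 15 for id 1 and 2 * 4 = 8 for id 2, in line with the Result shown above.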

MongoDb: Getting $avg in aggregate for complex data

I'm trying to get an average rating in my Mongo aggregate and am having trouble accessing the nested array. I've gotten my aggregation to give the following array. I'm trying to have city_reviews return an array of averages.
[
{
"_id": "Dallas",
"city_reviews": [
//arrays of restaurant objects that include the rating
//I would like to get an average of the rating in each review, so these arrays will be numbers (averages)
[ {
"_id": "5b7ead6d106f0553d8807276",
"created": "2018-08-23T12:41:29.791Z",
"text": "Crackin good place. ",
"rating": 4,
"store": "5b7d67d5356114089909e58d",
"author": "5b7d675e356114089909e58b",
"__v": 0
}, {review2}, {review3}],
[{review1}, {review2}, {review3}],
[{review1}, {review2}],
[{review1}, {review2}, {review3}, {review4}],
[]
]
},
{
"_id": "Houston",
"city_reviews": [
// arrays of restaurants
[{review1}, {review2}, {review3}],
[{review1}, {review2}, {review3}],
[{review1}, {review2}, {review3}, {review4}],
[],
[]
]
}
]
I would like to do an aggregation on this that returns an array of averages within the city_reviews, like this:
{
"_id": "Dallas",
"city_reviews": [
// arrays of rating averages
[4.7],
[4.3],
[3.4],
[],
[]
]
}
Here's what I've tried. It's giving me back an averageRating of null, because $city_reviews is an array of arrays of objects and I'm not telling it to go deep enough to capture the rating key.
return this.aggregate([
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as:
'reviews' }},
{$group: {_id: '$city', city_reviews: { $push : '$reviews'}}},
{ $project: {
averageRating: { $avg: '$city_reviews'}
}}
])
Is there a way to work with this line so I can return arrays of averages instead of the full review objects?
averageRating: { $avg: '$city_reviews'}
EDIT: Was asked for entire pipeline.
return this.aggregate([
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as: 'reviews' }},
{$group: {
_id: '$city',
city_reviews: { $push : '$reviews'}}
},
{ $project: {
photo: '$$ROOT.photo',
name: '$$ROOT.name',
reviews: '$$ROOT.reviews',
slug: '$$ROOT.slug',
city: '$$ROOT.city',
"averageRatingIndex":{
"$map":{
"input":"$city_reviews",
"in":[{"$avg":"$$this.rating"}]
}
},
}
},
{ $sort: { averageRating: -1 }},
{ $limit: 5 }
])
My first query was to connect two models together:
{ $lookup: { from: 'reviews', localField: '_id', foreignField: 'store', as: 'reviews' }},
Which resulted in this:
[ {
"_id": "5b7d67d5356114089909e58d",
"location": {},
"tags": [],
"created": "2018-08-22T13:23:23.224Z",
"name": "Lucia",
"description": "Great name",
"city": "Dallas",
"photo": "ab64b3e7-6207-41d8-a670-94315e4b23af.jpeg",
"author": "5b7d675e356114089909e58b",
"slug": "lucia",
"__v": 0,
"reviews": []
},
{..more object like above}
]
Then, I grouped them like this:
{$group: {
_id: '$city',
city_reviews: { $push : '$reviews'}}
}
This returned what my original question is about. Essentially, I just want to have a total average rating for each city. My accepted answer does answer my original question. I'm getting back this:
{
"_id": "Dallas",
"averageRatingIndex": [
[ 4.2 ],
[ 3.6666666666666665 ],
[ null ],
[ 3.2 ],
[ 5 ],
[ null ]
]
}
I've tried to use the $avg operator on this to return one, final average that I can display for each city, but I'm having trouble.
You can use $map with $avg to output the average for each inner array.
{"$project":{
"averageRating":{
"$map":{
"input":"$city_reviews",
"in":[{"$avg":"$$this.rating"}]
}
}
}}
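The shape of that result can be reproduced in plain JavaScript. The sketch below uses made-up ratings and a hypothetical avg helper that returns null for an empty array, which is also why stores without reviews show up as [ null ] in the output quoted in the question.

```javascript
// Hypothetical helper: arithmetic mean, or null when there is nothing to average.
function avg(numbers) {
  if (numbers.length === 0) return null;
  return numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
}

// city_reviews: one inner array of review objects per store (sample ratings).
const cityReviews = [
  [{ rating: 4 }, { rating: 5 }],
  [{ rating: 3 }],
  [], // a store without reviews
];

// Mirrors "$map" over city_reviews with [{ "$avg": "$$this.rating" }] as "in":
// each inner array is replaced by a one-element array holding its average.
const averageRating = cityReviews.map((reviews) => [
  avg(reviews.map((review) => review.rating)),
]);

console.log(JSON.stringify(averageRating)); // [[4.5],[3],[null]]
```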
With respect to your optimization request, I don't think there's a lot of room for improvement beyond the version that you already have. However, the following pipeline might be faster than your current solution because of the initial $group stage which should result in way less $lookups. I am not sure how MongoDB will optimize all of that internally so you might want to profile the two versions against a real data set.
db.getCollection('something').aggregate([{
$group: {
_id: '$city', // group by city
"averageRating": { $push: "$_id" } // create array of all encountered "_id"s per "city" bucket - we use the target field name to avoid creation of superfluous fields which would need to be removed from the output later on
}
}, {
$lookup: {
from: 'reviews',
let: { "averageRating": "$averageRating" }, // create a variable called "$$averageRating" which will hold the previously created array of "_id"s
pipeline: [{
$match: { $expr: { $in: [ "$store", "$$averageRating" ] } } // do the usual "joining"
}, {
$group: {
"_id": null, // group all found items into the same single bucket
"rating": { $avg: "$rating" }, // calculate the avg over all reviews found for the stores in this "city" bucket
}
}],
as: 'averageRating'
}
}, {
$sort: { "averageRating.rating": -1 }
}, {
$limit: 5
}, {
$addFields: { // beautification of the output only, technically not needed - we do this as the last stage in order to only do it for the max. of 5 documents that we're interested in
"averageRating": { // this is where we reuse the field we created in the first stage
$arrayElemAt: [ "$averageRating.rating", 0 ] // pull the first element inside the array outside of the array
}
}
}])
In fact, the "initial $group stage" approach could also be used in conjunction with @Veeram's solution like this:
db.collection.aggregate([{
$group: {
_id: '$city', // group by city
"averageRating": { $push: "$_id" } // create array of all encountered "_id"s per "city" bucket - we use the target field name to avoid creation of superfluous fields which would need to be removed from the output later on
}
}, {
$lookup: {
from: 'reviews',
localField: 'averageRating',
foreignField: 'store',
as: 'averageRating'
},
}, {
$project: {
"averageRating": {
$avg: {
$map: {
input: "$averageRating",
in: { $avg: "$$this.rating" }
}
}
}
}
}, {
$sort: { averageRating: -1 }
}, {
$limit: 5
}])