MongoDB - $lookup in complex nested array

I'm building a new LMS in mongodb and I have the following collections:
Courses
{
"_id": ObjectId("5f6a6b159de1304fb885b194"),
"title": "Course test",
"sections": [
{
"_id": ObjectId("5f6a6b159de1304fb885b195"),
"title": "Section 1 - introduction",
"order": 1,
"modules": [
{
"_id": ObjectId("5f6a6b159de1304fb885b196"),
"module_FK_id": ObjectId("5f6a6b149de1304fb885b135"),
"title": "Module 1",
"order": 1
},
{
"_id": ObjectId("5f6a6b159de1304fb885b198"),
"module_FK_id": ObjectId("5f6a6b149de1304fb885b14a"),
"title": "Module 2",
"order": 2
},
]
},
{
"_id": ObjectId("5f6a6b149de1304fb885b175"),
"title": "Section 2 - How to do something",
"order": 2,
"modules": [
{
"_id": ObjectId("5f6a6b149de1304fb885b141"),
"module_FK_id": ObjectId("5f6a6b149de1304fb885b150"),
"title": "Module 1",
"order": 1
},
{
"_id": ObjectId("5f6a6b149de1304fb885b15f"),
"module_FK_id": ObjectId("5f6a6b149de1304fb885b18e"),
"title": "Module 2",
"order": 2
},
]
},
]
}
Modules (only one as example)
{
"_id": ObjectId("5f6a6b149de1304fb885b135"),
"text": "Lorem ipsum...",
"mediaUrl": "urllinkhere"
}
As shown, I chose to embed the section and module titles in the course document, but I also need a second collection, modules, because each module contains a large amount of text and the course document could quickly grow too big.
Now I need to rebuild the entire document as if it were completely embedded.
Here's an example:
{
"_id": ObjectId("5f6a6b159de1304fb885b194"),
"title": "Course test",
"sections": [
{
"_id": ObjectId("5f6a6b159de1304fb885b195"),
"title": "Section 1 - introduction",
"order": 1,
"modules": [
{
"_id": ObjectId("5f6a6b159de1304fb885b196"),
"module_FK_id": ObjectId("5f6a6b149de1304fb885b135"),
"title": "Module 1",
"order": 1,
"text": "Lorem ipsum...",
"mediaUrl": "urllinkhere"
},
// last two fields from collection "modules"
I've tried different combinations of aggregation stages and lookups, but I can't get the desired result.
Can anybody help me?

You can construct the aggregation pipeline as shown below. Just remember to $group in the reverse order of the $unwind operations: the innermost array (sections.modules) is rebuilt first, then the outer one (sections).
db.courses.aggregate([
{
$unwind: "$sections"
},
{
$unwind: "$sections.modules"
},
{
"$lookup": {
"from": "modules",
"localField": "sections.modules.module_FK_id",
"foreignField": "_id",
"as": "module_details"
}
},
{
$unwind: {
path: "$module_details",
preserveNullAndEmptyArrays: true
}
},
{
"$project": {
title: 1,
sections: {
_id: "$sections._id",
modules: {
_id: "$sections.modules._id",
module_FK_id: "$sections.modules.module_FK_id",
order: "$sections.modules.order",
title: "$sections.modules.title",
mediaUrl: "$module_details.mediaUrl",
text: "$module_details.text"
},
order: "$sections.order",
title: "$sections.title"
}
}
},
{
"$group": {
"_id": "$sections._id",
main_id: {
$first: "$_id"
},
main_title: {
$first: "$title"
},
order: {
$first: "$sections.order"
},
title: {
$first: "$sections.title"
},
modules: {
$push: "$sections.modules"
}
}
},
{
"$project": {
"_id": "$main_id",
"title": "$main_title",
section: {
_id: "$_id",
modules: "$modules",
title: "$title",
order: "$order",
}
}
},
{
$group: {
_id: "$_id",
title: {
$first: "$title"
},
sections: {
$push: "$section"
}
}
}
])
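An alternative sketch that avoids the $unwind/$group round trip: a single $lookup collects all referenced modules, and nested $map expressions merge each module's details in place, so the arrays keep their stored order. It assumes MongoDB 4.2 or later (for $set and $unset) and that localField resolves through both array levels. The variable name pipeline and the aliases section, module and detail are illustrative and not part of the original answer.

```javascript
// Sketch only: merges documents from "modules" into the nested
// sections.modules array without unwinding. Assumes MongoDB 4.2+.
const pipeline = [
  {
    // Fetch every module referenced anywhere in the course in one lookup.
    $lookup: {
      from: "modules",
      localField: "sections.modules.module_FK_id",
      foreignField: "_id",
      as: "module_details"
    }
  },
  {
    $set: {
      sections: {
        $map: {
          input: "$sections",
          as: "section",
          in: {
            $mergeObjects: [
              "$$section",
              {
                modules: {
                  $map: {
                    input: "$$section.modules",
                    as: "module",
                    in: {
                      // Details go first so the embedded module's own _id,
                      // title and order win over the looked-up document's _id.
                      $mergeObjects: [
                        {
                          $arrayElemAt: [
                            {
                              $filter: {
                                input: "$module_details",
                                as: "detail",
                                cond: { $eq: ["$$detail._id", "$$module.module_FK_id"] }
                              }
                            },
                            0
                          ]
                        },
                        "$$module"
                      ]
                    }
                  }
                }
              }
            ]
          }
        }
      }
    }
  },
  // Drop the temporary lookup result.
  { $unset: "module_details" }
];

// In the mongo shell this would be run as: db.courses.aggregate(pipeline)
```

The order of the $mergeObjects operands matters here: later operands overwrite earlier ones, so placing "$$module" last keeps the embedded _id instead of replacing it with the _id from the modules collection.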

Related

MongoDB Complex Aggregation - Combined Sum & Count

I have a DB in which each document has an array of many different objects, of which I'm interested in working with only 6 specific ones. Five of them are integers and one is categorical (text).
In order to leave only the fields I need for the aggregation, I've used $unwind on the fields array - multiplying each document by the number of fields it has. After this I filtered the specific fields I want using a basic $match.
This is where I ran into trouble: I've managed to write two queries that each give me half of the end result I need, but I'm unable to combine them into one general query. Specifically, one query gives me 5 integer fields, each being the $sum of one of the integer fields, and another query uses the categorical field to $count the number of times each category appears.
The desired output would be a single document that has 5 k:v fields (one for each sum calculation) and an additional object of k:v fields, where each key is a category and the value is the number of times it appears. This must be its own object because the categories that appear may vary.
The sample data I've added has been stripped of most of its structure and includes only the crucial parts relevant for this query. This is to protect our clients' privacy.
I've tried solving this from every angle I could think of - and would greatly appreciate any feedback!
The first query:
[{$match: {
fields: {
$elemMatch: {
field_id: 174196148,
'values.start': {
$gte: ISODate('2022-02-01T00:00:00.000Z'),
$lt: ISODate('2022-02-03T00:00:00.000Z')
}
}
}
}}, {$unwind: {
path: '$fields'
}}, {$match: {
$or: [
{
'fields.field_id': 226577699
},
{
'fields.field_id': 225330844
},
{
'fields.field_id': 158472699
},
{
'fields.field_id': 191195626
},
{
'fields.field_id': 219444876
}
]
}}, {$unwind: {
path: '$fields.values'
}}, {$addFields: {
'Specific - Field Value': {
$round: [
{
$toDecimal: '$fields.values.value'
}
]
}
}}, {$group: {
_id: '$fields.label',
SumCalculation: {
$sum: {
$toDecimal: '$Specific - Field Value'
}
}
}}, {$group: {
_id: null,
SumArray: {
$push: {
k: '$_id',
v: '$SumCalculation'
}
}
}}, {$project: {
_id: 0,
Final: {
$arrayToObject: '$SumArray'
}
}}]
The second query:
[{$match: {
fields: {
$elemMatch: {
field_id: 174196148,
'values.start': {
$gte: ISODate('2022-01-01T00:00:00.000Z'),
$lt: ISODate('2022-03-31T00:00:00.000Z')
}
}
}
}}, {$unwind: {
path: '$fields'
}}, {$match: {
'fields.field_id': 177278285
}}, {$unwind: {
path: '$fields.values'
}}, {$group: {
_id: '$fields.values.value.text',
ModelCount: {
$sum: 1
}
}}, {$group: {
_id: null,
Full: {
$push: {
k: '$_id',
v: '$ModelCount'
}
}
}}, {$project: {
_id: 0,
Final: {
$arrayToObject: '$Full'
}
}}]
The desired output:
{
"Final": {
"Business Model": [
{
"K": "Solar Lease",
"V": 3
},
{
"K": "Solar Purchase",
"V": 112
}
],
"System Size - Signed Contract": 73,
"Additional Payment for O&M": 2000,
"O&M Years Included (Paid)": 2,
"Total Price Including VAT": 396660,
"1st Milestone - Down Payment": 30280
}
}
Sample data:
{
"_id": 1946794344,
"fields": [
{
"type": "money",
"field_id": 226577699,
"label": "1st Milestone - Down Payment ",
"values": [
{
"currency": "ILS",
"value": "6120.0000"
}
],
"config": {
"settings": {
"allowed_currencies": [
"ILS"
]
},
"mapping": null,
"label": "1st Milestone - Down Payment "
},
"external_id": "1st-milestone-down-payment-2"
},
{
"type": "money",
"field_id": 225330844,
"label": "Additional Payment for O&M",
"values": [
{
"currency": "ILS",
"value": "0.0000"
}
],
"config": {
"settings": {
"allowed_currencies": [
"ILS"
]
},
"mapping": null,
"label": "Additional Payment for O&M"
},
"external_id": "additional-payment-for-om"
},
{
"type": "money",
"field_id": 158472699,
"label": "Total Price Including VAT",
"values": [
{
"currency": "ILS",
"value": "61270.0000"
}
],
"config": {
"settings": {
"allowed_currencies": [
"ILS"
]
},
"mapping": null,
"label": "Total Price Including VAT"
},
"external_id": "money"
},
{
"type": "number",
"field_id": 191195626,
"label": "System Size - Signed Contract",
"values": [
{
"value": "11.6600"
}
],
"config": {
"settings": {
"decimals": 2
},
"mapping": null,
"label": "System Size - Signed Contract"
},
"external_id": "hspq-hmrkt"
},
{
"type": "number",
"field_id": 219444876,
"label": "O&M Years Included (Paid)",
"values": [
{
"value": "0.0000"
}
],
"config": {
"settings": {
"decimals": 0
},
"mapping": null,
"label": "O&M Years Included (Paid)"
},
"external_id": "om-years-gifted-for-free"
},
{
"type": "category",
"field_id": 177278285,
"label": "Business Model",
"values": [
{
"value": {
"status": "active",
"text": "Solar Purchase",
"id": 6,
"color": "DCEBD8"
}
}
],
"external_id": "mvdl-sqy"
}
]
}
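One possible way to combine the two queries is a $facet stage, which runs several sub-pipelines over the same input and returns their results side by side in one document. The following is a minimal sketch, not a tested answer. It assumes a single date range is acceptable (the two queries above use different ranges), that the category counts should be nested as an object under a "Business Model" key, and that MongoDB 4.2 or later is available for $round. The names sumFieldIds, categoryFieldId, sums and categories are illustrative.

```javascript
// Sketch only: combines the sum query and the count query with $facet.
const sumFieldIds = [226577699, 225330844, 158472699, 191195626, 219444876];
const categoryFieldId = 177278285;

const pipeline = [
  {
    $match: {
      fields: {
        $elemMatch: {
          field_id: 174196148,
          "values.start": {
            // Assumption: one shared date range for both halves.
            $gte: new Date("2022-02-01T00:00:00.000Z"),
            $lt: new Date("2022-02-03T00:00:00.000Z")
          }
        }
      }
    }
  },
  { $unwind: "$fields" },
  // Keep only the five numeric fields and the categorical field.
  { $match: { "fields.field_id": { $in: [...sumFieldIds, categoryFieldId] } } },
  { $unwind: "$fields.values" },
  {
    $facet: {
      // Branch 1: one { k: label, v: sum } pair per numeric field.
      sums: [
        { $match: { "fields.field_id": { $in: sumFieldIds } } },
        {
          $group: {
            _id: "$fields.label",
            v: { $sum: { $round: [{ $toDecimal: "$fields.values.value" }] } }
          }
        },
        { $project: { _id: 0, k: "$_id", v: 1 } }
      ],
      // Branch 2: one { k: category, v: count } pair per category.
      categories: [
        { $match: { "fields.field_id": categoryFieldId } },
        { $group: { _id: "$fields.values.value.text", v: { $sum: 1 } } },
        { $project: { _id: 0, k: "$_id", v: 1 } }
      ]
    }
  },
  {
    // Merge the sums at the top level and nest the counts in their own object.
    $project: {
      Final: {
        $mergeObjects: [
          { $arrayToObject: "$sums" },
          { "Business Model": { $arrayToObject: "$categories" } }
        ]
      }
    }
  }
];

// In the mongo shell this would be run as: db.collection.aggregate(pipeline)
```

Because $facet always emits exactly one document, the final $project can convert both arrays with $arrayToObject without a further $group.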

How to use current field in second $match?

Let's say I have 2 collections:
// Post collection:
[
{
"_id": "somepost1",
"author": "firstuser",
"title": "First post"
},
{
"_id": "somepost2",
"author": "firstuser",
"title": "Second post"
},
{
"_id": "somepost3",
"author": "firstuser",
"title": "Third post"
}
]
// User collection:
[
{
"_id": "firstuser",
"nickname": "John",
"posts": {
"voted": []
}
},
{
"_id": "seconduser",
"nickname": "Bob",
"posts": {
"voted": [
{
"_id": "somepost1",
"vote": "1"
},
{
"_id": "somepost3",
"vote": "-1"
}
]
}
}
]
And I need to get this result:
[
{
"_id": "somepost1",
"author": {
"_id": "firstuser",
"nickname": "John"
},
"title": "First post",
"myvote": "1"
},
{
"_id": "somepost2",
"author": {
"_id": "firstuser",
"nickname": "John"
},
"title": "Second post",
"myvote": "0"
},
{
"_id": "somepost3",
"author": {
"_id": "firstuser",
"nickname": "John"
},
"title": "Third post",
"myvote": "-1"
}
]
How can I write an aggregation that produces this output using the dynamic _id of each element?
I have a problem using the current post's _id in the second $match, and with setting "myvote" to 0 when no element in "posts.voted" is associated with the current post.
Here is what I've tried: https://mongoplayground.net/p/v70ZUioVSpQ
db.post.aggregate([
{
$match: {
author: "firstuser"
}
},
{
$lookup: {
from: "user",
localField: "author",
foreignField: "_id",
as: "author"
}
},
{
$addFields: {
author: {
$arrayElemAt: [
"$author",
0
]
}
}
},
{
$lookup: {
from: "user",
localField: "_id",
foreignField: "posts.voted._id",
as: "Results"
}
},
{
$unwind: "$Results"
},
{
$unwind: "$Results.posts.voted"
},
{
$match: {
"Results.posts.voted._id": "ID OF CURRENT POST"
}
},
{
$project: {
_id: 1,
author: {
_id: 1,
nickname: 1
},
title: 1,
myvote: "$Results.posts.voted.vote"
}
}
])
From the $match docs:
The query syntax is identical to the read operation query syntax
The query syntax does not allow the use of document values, which is what you're trying to do.
What we can do is use $expr within the $match stage. This allows us to use aggregation operators, which also gives access to the document values, like so:
{
$match: {
$expr: {
$eq: ['$Results.posts.voted._id', '$_id'],
}
},
},
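The $expr comparison handles the match on the current post's _id, but the question also asks for a default vote when a post has no matching entry, and the $unwind stages in the attempted pipeline drop such posts. One possible sketch moves the comparison into a $lookup sub-pipeline and falls back to "0" with $ifNull. It assumes the votes should come from one user, represented here by the hypothetical constant currentUserId, and MongoDB 4.2 or later for $set.

```javascript
// Sketch only: returns every post by the author, with "myvote" defaulting to "0".
const currentUserId = "seconduser"; // assumption: the user whose votes are shown

const pipeline = [
  { $match: { author: "firstuser" } },
  { $lookup: { from: "user", localField: "author", foreignField: "_id", as: "author" } },
  { $set: { author: { $arrayElemAt: ["$author", 0] } } },
  {
    $lookup: {
      from: "user",
      let: { postId: "$_id" }, // expose the current post's _id to the sub-pipeline
      pipeline: [
        { $match: { _id: currentUserId } },
        { $unwind: "$posts.voted" },
        { $match: { $expr: { $eq: ["$posts.voted._id", "$$postId"] } } },
        { $project: { _id: 0, vote: "$posts.voted.vote" } }
      ],
      as: "votes"
    }
  },
  {
    $project: {
      title: 1,
      author: { _id: 1, nickname: 1 },
      // An empty "votes" array yields no first element, so $ifNull supplies "0".
      myvote: { $ifNull: [{ $arrayElemAt: ["$votes.vote", 0] }, "0"] }
    }
  }
];

// In the mongo shell this would be run as: db.post.aggregate(pipeline)
```

Since the sub-pipeline result is an array that may be empty, no post is lost the way it would be with a plain $unwind on the joined results.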

How to group and count fields in a MongoDB collection

I'm really new to MongoDB, coming from a SQL background, and I'm struggling to work out how to run a simple report that groups by a value from a nested document, with a count, sorted with the highest count first.
I've tried so many ways from what I've found online but I'm unable to target the exact field that I need for the grouping.
Here is the collection.
{
"_id": {
"$oid": "6005f95dbad14c0308f9af7e"
},
"title": "test",
"fields": {
"6001bd300b363863606a815e": {
"field": {
"_id": {
"$oid": "6001bd300b363863606a815e"
},
"title": "Title Two",
"datatype": "string"
},
"section": "Section 1",
},
"6001bd300b363863423a815e": {
"field": {
"_id": {
"$oid": "6001bd3032453453606a815e"
},
"title": "Title One",
"datatype": "string"
},
"section": "Section 1",
},
"6001bd30453534863423a815e": {
"field": {
"_id": {
"$oid": "6001bd300dfgdfgdf06a815e"
},
"title": "Title One",
"datatype": "string"
},
"section": "Section 1",
}
},
"sections": ["Section 1"]
}
The result I need to get from the above example would be:
"Title One", 2
"Title Two", 1
Can anyone please point me in the right direction? Thank you so much.
Having dynamic field names is usually a poor design.
Try this one:
db.collection.aggregate([
{ $set: { fields: { $objectToArray: "$fields" } } },
{ $unwind: "$fields" },
{ $group: { _id: "$fields.v.field.title", count: { $count: {} } } },
{ $sort: { count: -1 } }
])
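Note that $count as a $group accumulator requires MongoDB 5.0 or later. On older servers, a variant of the same pipeline can count with $sum: 1 and use $addFields in place of $set. This is a sketch under that assumption; the variable name pipeline is illustrative.

```javascript
// Sketch only: same grouping as above, written for servers older than 5.0.
const pipeline = [
  // Turn the dynamic keys of "fields" into an array of { k, v } pairs.
  { $addFields: { fields: { $objectToArray: "$fields" } } },
  { $unwind: "$fields" },
  // Count one per unwound element, grouped by the nested title.
  { $group: { _id: "$fields.v.field.title", count: { $sum: 1 } } },
  { $sort: { count: -1 } }
];

// In the mongo shell this would be run as: db.collection.aggregate(pipeline)
```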
Here's another way to do it. The $project throws away everything except for the deep-dive to "title". Then just $unwind and $sortByCount.
db.collection.aggregate([
{
"$project": {
"titles": {
"$map": {
"input": {
"$objectToArray": "$fields"
},
"in": "$$this.v.field.title"
}
}
}
},
{
"$unwind": "$titles"
},
{
"$sortByCount": "$titles"
}
])
Try it on mongoplayground.net.

Query entities with count of related attributes in separate field in mongodb

I have two collections: boxes and balls. A ball can be in a box:
boxes:
[{
"_id": { "$oid": "box-a" },
"name": "Box A"
},{
"_id": { "$oid": "box-b" },
"name": "Box B"
}]
balls:
[{
"_id": { "$oid": "ball-a" },
"color": "red",
"boxId": { "$oid": "box-a" }
},{
"_id": { "$oid": "ball-b" },
"color": "green",
"boxId": { "$oid": "box-a" }
}]
Now I want to query all boxes with an additional field ballColors that gives an overview of how many balls of each color are in each box:
[{
"_id": { "$oid": "box-a" },
"name": "Box A",
"ballColors": {
"red": 1,
"green": 1,
}
},{
"_id": { "$oid": "box-b" },
"name": "Box B",
"ballColors": {}
}]
I tried to solve it with an aggregation like the following:
db.boxes.aggregate([
{$lookup: {
from: "balls",
localField: "_id",
foreignField: "boxId",
as: "ballColors"
}},
{$addFields: {
ballColors: "$ballColors.color"
}}
])
...but this gives me something like this:
[{
"_id": { "$oid": "box-a" },
"name": "Box A",
"ballColors": [
"red",
"green"
]
},{
"_id": { "$oid": "box-b" },
"name": "Box B",
"ballColors": []
}]
I also did some experiments with $unwind combined with $group but I have no clue how to get that information back into the original objects...
Is there a way to count the colors in ballColors and put the counts in an object? Or is there a better way to do this?
$lookup with pipeline: pass _id as boxId in let and $match on the boxId condition
$group by color and get the total count
$project to show k as the color and v as the count
convert the key-value array to an object using $arrayToObject in $addFields
db.boxes.aggregate([
{
$lookup: {
from: "balls",
let: { boxId: "$_id" },
pipeline: [
{ $match: { $expr: { $eq: ["$$boxId", "$boxId"] } } },
{
$group: {
_id: "$color",
count: { $sum: 1 }
}
},
{
$project: {
_id: 0,
k: "$_id",
v: "$count"
}
}
],
as: "ballColors"
}
},
{ $addFields: { ballColors: { $arrayToObject: "$ballColors" } } }
])
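To make the last two steps concrete, here is a plain JavaScript stand-in for what $group and $arrayToObject do to the balls of one box. It is an illustration only, not MongoDB code; the helper names countColors and arrayToObject are hypothetical, and the ids are simplified to plain strings.

```javascript
// Plain JavaScript illustration of $group by color followed by $arrayToObject.
function countColors(balls, boxId) {
  const counts = new Map();
  for (const ball of balls) {
    if (ball.boxId !== boxId) continue; // same role as the $match on boxId
    counts.set(ball.color, (counts.get(ball.color) || 0) + 1);
  }
  // Shape the result as { k, v } pairs, like the $project stage does.
  return [...counts].map(([k, v]) => ({ k, v }));
}

// Converts [{ k, v }, ...] into a single object, like $arrayToObject.
function arrayToObject(pairs) {
  return Object.fromEntries(pairs.map(({ k, v }) => [k, v]));
}

const balls = [
  { _id: "ball-a", color: "red", boxId: "box-a" },
  { _id: "ball-b", color: "green", boxId: "box-a" }
];

const boxA = arrayToObject(countColors(balls, "box-a"));
const boxB = arrayToObject(countColors(balls, "box-b"));
console.log(boxA, boxB); // { red: 1, green: 1 } {}
```

A box without balls produces an empty pair list, which becomes an empty object, matching the desired output for Box B.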

How can I display the count of related documents at the parent level?

I'm trying to build a voting system where you can have X number of options to vote for on an entry.
The query I'm building now should, when retrieving an entry, return the number of votes per option on that entry.
I have a very clear understanding of how I would do this in SQL, but I'm struggling to understand the concepts of aggregation, lookup, and group in MongoDB.
The model looks like this:
Entries
{
"_id": "5fc2765938401a2308e18ac5",
"options": [
{
"name": "First Option",
"_id": "5fc2765938401a2308e18ac6"
},
{
"name": "Second Option",
"_id": "5fc2765938401a2308e18are"
},
{
"name": "Third Option",
"_id": "5fc2765938401a2308e18aef"
}
]
},
{
"_id": "5fc2766438401a2308e18ac8",
"options": [
{
"name": "Some other option",
"_id": "5fc2766438401a2308e18ac9"
},
{
"_id": "5fc2766438401a2308e18aca",
"name": "This is also an option"
}
]
}
Votes
{
"_id": "5fc2765938401a2308e18ac5",
"entryId": "5fc2765938401a2308e18ac6"
},
{
"_id": "5fc2765938401a2308e18aer",
"entryId": "5fc2765938401a2308e18are"
},
{
"_id": "5fc2765938401a2308e18ek",
"entryId": "5fc2765938401a2308e18ac6"
}
...
And I want the results of Entry to look like this.
{
"_id": "5fc2765938401a2308e18ac5",
"options": [
{
"name": "First Option",
"_id": "5fc2765938401a2308e18ac6",
"votes": 1
},
{
"name": "Second Option",
"_id": "5fc2765938401a2308e18are",
"votes": 0
},
{
"name": "Third Option",
"_id": "5fc2765938401a2308e18aef",
"votes": 5
}
]
},
{
"_id": "5fc2766438401a2308e18ac8",
"options": [
{
"name": "Some other option",
"_id": "5fc2766438401a2308e18ac9",
"votes": 3
},
{
"_id": "5fc2766438401a2308e18aca",
"name": "This is also an option",
"votes": 10
}
]
}
$lookup to join the votes collection, with local field options._id and foreign field entryId
$project to rebuild options: $map iterates over the options array, $filter keeps the votes whose entryId matches the option's _id, $size counts the elements of the filtered array, and $mergeObjects merges the votes field into the current option
db.entries.aggregate([
{
$lookup: {
from: "votes",
localField: "options._id",
foreignField: "entryId",
as: "votes"
}
},
{
$project: {
options: {
$map: {
input: "$options",
as: "a",
in: {
$mergeObjects: [
"$$a",
{
votes: {
$size: {
$filter: {
input: "$votes",
cond: { $eq: ["$$this.entryId", "$$a._id"] }
}
}
}
}
]
}
}
}
}
}
])
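The $map / $filter / $size combination can be read as the following plain JavaScript, shown here only to illustrate the logic. The helper name withVoteCounts is hypothetical, and the data is taken from the first entry and the three votes listed in the question.

```javascript
// Plain JavaScript illustration of $map + $filter + $size + $mergeObjects.
function withVoteCounts(entry, votes) {
  return {
    ...entry,
    options: entry.options.map((option) => ({
      ...option, // same role as merging "$$a" in $mergeObjects
      // Keep only the votes for this option, then count them.
      votes: votes.filter((vote) => vote.entryId === option._id).length
    }))
  };
}

const entry = {
  _id: "5fc2765938401a2308e18ac5",
  options: [
    { name: "First Option", _id: "5fc2765938401a2308e18ac6" },
    { name: "Second Option", _id: "5fc2765938401a2308e18are" },
    { name: "Third Option", _id: "5fc2765938401a2308e18aef" }
  ]
};

const votes = [
  { _id: "5fc2765938401a2308e18ac5", entryId: "5fc2765938401a2308e18ac6" },
  { _id: "5fc2765938401a2308e18aer", entryId: "5fc2765938401a2308e18are" },
  { _id: "5fc2765938401a2308e18ek", entryId: "5fc2765938401a2308e18ac6" }
];

const result = withVoteCounts(entry, votes);
console.log(result.options.map((option) => option.votes)); // [ 2, 1, 0 ]
```

An option with no votes gets a count of 0 rather than being dropped, because the filter simply returns an empty array for it.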