My goal is to get all 'student' documents that belong to 'classes'
that have at least one student of grade "blue" and at least one
of grade "red".
I am inclined to simply do a sequence of queries in Python (pymongo), tackling the task directly.
I wonder if there is some clever aggregation pipeline that I could use!
Given:
Classes collection:
{ class_id: 'a' }
{ class_id: 'b' }
Students collection:
{ class_id: 'a',
grade: 'blue' }
{class_id: 'a',
grade: 'red' }
You could use :
a $group to group by class_id and $push all grade in an array so we can deduce easily in the next step which class "contains" blue. Preserve the current document with $$ROOT because we'll need the students that match the class_id
a $match to match only classes that have grade blue in it
an $unwind to remove the array of students generated by previous $$ROOT
a $project to reorganize your document nicely
Query would be :
db.students.aggregate([{
"$group": {
"_id": "$class_id",
"grades": { "$push": "$grade" },
"students": { "$push": "$$ROOT" }
}
}, {
"$match": {
"grades": { "$all": ["blue","red"] }
}
}, {
"$unwind": "$students"
}, {
"$project": {
"_id": "$students._id",
"class_id": "$students.class_id",
"grade": "$students.grade",
}
}])
If you need to match other color than ["blue","red"] you can add more in the $match aggregation ($in: ["blue","red","yellow"])
For implementing it in PyMongo, it is very straightforward :
from pymongo import MongoClient
import pprint
db = MongoClient().testDB
pipeline = [ <the_aggregation_query_here> ]
pprint.pprint(list(db.students.aggregate(pipeline)))
Additionnaly, to match only students that belong to classes collection, perform a $lookup and match those that are not empty. Add the following at the aggregation query :
{
$lookup: {
from: "classes",
localField: "class_id",
foreignField: "class_id",
as: "class"
}
}, {
$match: {
"class": { $not: { $size: 0 } }
}
}
Related
Is there any way to give order or rankings to MongoDB aggregation results?
My result is:
{
"score":100
"name": "John"
},
{
"score":80
"name": "Jane"
},
{
"score":60
"name": "Lee"
}
My wanted result is:
{
"score":100
"name": "John",
"rank": 1
},
{
"score":80
"name": "Jane"
"rank": 2
},
{
"score":60
"name": "Lee"
"rank": 3
}
I know there is a operator called $includeArrayIndex but this only works with $unwind operator.
Is there any way to give rank without using $unwind?
Using $unwind requires grouping on my collection, and I'm afraid grouping pipeline would be too huge to process.
The other way is to use $map and add rank in document using its index, and don't use $unwind stage because it would be single field array you can directly access using its key name as mention in last line of code,
$group by null and make array of documents in root array,
$map to iterate loop of root array, get the index of current object from root array using $indexOfArray and increment that returned index number using $add because index start from 0, and that is how we are creating rank field, merge object with current element object and rank field using $mergeObjects
let result = await db.collection.aggregate([
{
$group: {
_id: null,
root: {
$push: "$$ROOT"
}
}
},
{
$project: {
_id: 0,
root: {
$map: {
input: "$root",
in: {
$mergeObjects: [
"$$this",
{
rank: { $add: [{ $indexOfArray: ["$root", "$$this"] }, 1] }
}
]
}
}
}
}
}
]);
// you can access result using root key
let finalResult = result[0]['root'];
Playground
Suppose, In MongoDB i have two collections. one is "Students" and the another is "Course".
Student have the document such as
{"id":"1","name":"Alex"},..
and Course has the document such as
{"course_id":"111","course_name":"React"},..
and there is a third collection named "students-courses" where i have kept student's id with their corresponding course id. Like this
{"student_id":"1","course_id":"111"}
i want to make a query with student's id so that it gives the output with his/her enrolled course. like this
{
"id": "1",
"name":"Alex",
"taken_courses": [
{"course_id":"111","course_name":"React"},
{"course_id":"112","course_name":"Vue"}
]
}
it will be many to many relationship in MongoDB without using ORM. How can i make this query?
Need to use $loopup with pipeline,
First $group by student_id because we are going to get courses of students, $push all course_id in course_ids for next step - lookup purpose
db.StudentCourses.aggregate([
{
$group: {
_id: "$student_id",
course_ids: {
$push: "$course_id"
}
}
},
$lookup with Student Collection and get the student details in student
$unwind student because its an array and we need only one from group of same student record
$project required fields
{
$lookup: {
from: "Student",
localField: "_id",
foreignField: "id",
as: "student"
}
},
{
$unwind: "$student"
},
{
$project: {
id: "$_id",
name: "$student.name",
course_ids: 1
}
},
$lookup Course Collection and get all courses that contains course_ids, that we have prepared in above $group
$project the required fields
course details will store in taken_courses
{
$lookup: {
from: "Course",
let: {
cId: "$course_ids"
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$course_id",
"$$cId"
]
}
}
},
{
$project: {
_id: 0
}
}
],
as: "taken_courses"
}
},
$project details, removed not required fields
{
$project: {
_id: 0,
course_ids: 0
}
}
])
Working Playground: https://mongoplayground.net/p/FMZgkyKHPEe
For more details related syntax and usage, check aggregation
Using MongoDB 4.2 and MongoDB Atlas to test aggregation pipelines.
I've got this products collection, containing documents with this schema:
{
"name": "TestProduct",
"relatedList": [
{id:ObjectId("someId")},
{id:ObjectId("anotherId")}
]
}
Then there's this cities collection, containing documents with this schema :
{
"name": "TestCity",
"instructionList": [
{ related_id: ObjectId("anotherId"), foo: bar},
{ related_id: ObjectId("someId"), foo: bar}
{ related_id: ObjectId("notUsefulId"), foo: bar}
...
]
}
My objective is to join both collections to output something like this (the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document) :
{
"name": "TestProduct",
"relatedList": [
{ related_id: ObjectId("someId"), foo: bar},
{ related_id: ObjectId("anotherId"), foo: bar},
]
}
I tried using the $lookup operator for aggregation like this :
$lookup:{
from: 'cities',
let: {rId:'$relatedList._id'},
pipeline: [
{
$match: {
$expr: {
$eq: ["$instructionList.related_id", "$$rId"]
}
}
},
]
}
But it's not working, I'm a bit lost with this complex pipeline syntax.
Edit
By using unwind on both arrays :
{
{$unwind: "$relatedList"},
{$lookup:{
from: "cities",
let: { "rId": "$relatedList.id" },
pipeline: [
{$unwind:"$instructionList"},
{$match:{$expr:{$eq:["$instructionList.related_id","$$rId"]}}},
],
as:"instructionList",
}},
{$group: {
_id: "$_id",
instructionList: {$addToSet:"$instructionList"}
}}
}
I am able to achieve what I want, however,
I'm not getting a clean result at all :
{
"name": "TestProduct",
instructionList: [
[
{
"name": "TestCity",
"instructionList": {
"related_id":ObjectId("someId")
}
}
],
[
{
"name": "TestCity",
"instructionList": {
"related_id":ObjectId("anotherId")
}
}
]
]
}
How can I group everything to be as clean as stated for my original question ?
Again, I'm completely lost with the Aggregation framework.
the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document)
Given an example document on cities collection:
{"_id": ObjectId("5e4a22a08c54c8e2380b853b"),
"name": "TestCity",
"instructionList": [
{"related_id": "a", "foo": "x"},
{"related_id": "b", "foo": "y"},
{"related_id": "c", "foo": "z"}
]}
and an example document on products collection:
{"_id": ObjectId("5e45cdd8e8d44a31a432a981"),
"name": "TestProduct",
"relatedList": [
{"id": "a"},
{"id": "b"}
]}
You can achieve try using the following aggregation pipeline:
db.products.aggregate([
{"$lookup":{
"from": "cities",
"let": { "rId": "$relatedList.id" },
"pipeline": [
{"$unwind":"$instructionList"},
{"$match":{
"$expr":{
"$in":["$instructionList.related_id", "$$rId"]
}
}
}],
"as":"relatedList",
}},
{"$project":{
"name":"$name",
"relatedList":{
"$map":{
"input":"$relatedList",
"as":"x",
"in":{
"related_id":"$$x.instructionList.related_id",
"foo":"$$x.instructionList.foo"
}
}
}
}}
]);
To get a result as the following:
{ "_id": ObjectId("5e45cdd8e8d44a31a432a981"),
"name": "TestProduct",
"relatedList": [
{"related_id": "a", "foo": "x"},
{"related_id": "b", "foo": "y"}
]}
The above is tested in MongoDB v4.2.x.
But it's not working, I'm a bit lost with this complex pipeline syntax.
The reason why it's slightly complex here is because you have an array relatedList and also an array of subdocuments instructionList. When you refer to instructionList.related_id (which could mean multiple values) with $eq operator, the pipeline doesn't know which one to match.
In the pipeline above, I've added $unwind stage to turn instructionList into multiple single documents. Afterward, using $in to express a match of single value of instructionList.related_id in array relatedList.
I believe you just need to $unwind the arrays in order to lookup the relation, then $group to recollect them. Perhaps something like:
.aggregeate([
{$unwind:"relatedList"},
{$lookup:{
from:"cities",
let:{rId:"$relatedList.id"}
pipeline:[
{$match:{$expr:{$eq:["$instructionList.related_id", "$$rId"]}}},
{$unwind:"$instructionList"},
{$match:{$expr:{$eq:["$instructionList.related_id", "$$rId"]}}},
{$project:{_id:0, instruction:"$instructionList"}}
],
as: "lookedup"
}},
{$addFields: {"relatedList.foo":"$lookedup.0.instruction.foo"}},
{$group: {
_id:"$_id",
root: {$first:"$$ROOT"},
relatedList:{$push:"$relatedList"}
}},
{$addFields:{"root.relatedList":"$relatedList"}},
{$replaceRoot:{newRoot:"$root"}}
])
A little about each stage:
$unwind duplicates the entire document for each element of the array,
replace the array with the single element
$lookup can then consider each element separately. The stages in $lookup.pipeline:
a. $match so we only unwind the document with matching ID
b. $unwind the array so we can consider individual elements
c. repeat the $match so we are only left with matching elements (hopefully just 1)
$addFields assigns the foo field retrieved from the lookup to the object from relatedList
$group collects together all of the documents with the same _id (i.e. that were unwound from a single original document), stores the first as 'root', and pushes all of the relatedList elements back into an array
$addFields moves the relatedList in to root
$replaceRoot returns the root, which should now be the original document with the matching foo added to each relatedList element
When i do a find all in a mongodb collection, i want to get maintenanceList sorted by older maintenanceDate to newest.
The sort of maintenanceDate should not affect parents order in a find all query
{
"_id":"507f191e810c19729de860ea",
"color":"black",
"brand":"brandy",
"prixVenteUnitaire":200.5,
"maintenanceList":[
{
"cost":100.40,
"maintenanceDate":"2017-02-07T00:00:00.000+0000"
},
{
"cost":4000.40,
"maintenanceDate":"2019-08-07T00:00:00.000+0000"
},
{
"cost":300.80,
"maintenanceDate":"2018-08-07T00:00:00.000+0000"
}
]
}
Any guess how to do that ?
Thank you
Whatever order the fields are in with the previous pipeline stage, as operations like $project and $group effectively "copy" same position.So, it will not change the order of your fields in your aggregated result.
And the sort of maintenanceDate through aggregation will not affect parents order in a find all query.
So, simply doing this should work.
Assuming my collection name is example.
db.example.aggregate([
{
"$unwind": "$maintenanceList"
},
{
"$sort": {
"_id": 1,
"maintenanceList.maintenanceDate": 1
}
},
{
"$group": {
"_id": "$_id",
"color": {
$first: "$color"
},
"brand": {
$first: "$brand"
},
"prixVenteUnitaire": {
$first: "$prixVenteUnitaire"
},
"maintenanceList": {
"$push": "$maintenanceList"
}
}
}
])
Output:
I'm new to mongo and I have a document that has an array with the ids of all it's related documents. I need to fetch the document with all it's relateds in a single query. For the moment I fetch the document and I query separatly each of it's related document with there ids.
all my documents are on the same collection documents_nodes and look like so:
{
"id": "document_1",
"name": "flask",
"relateds": [
"document_2",
"document_3",
"document_4"
],
"parents": [
"document1_1"
]
}
The first query is to get the initial document
db.documents_nodes.find({id: document_1})
And then I query it's relateds with a second query
db.documents_nodes.aggregate([{
$match: {
$and: [{
id: {
$in: ["document_2", "document_3", "document_2"]
}
}]
}
}])
is there a way to combine the two queries, I tried this but it doesn't work
db.documents_nodes.aggregate([
{
$match: {
uri: "https://opus.adeo.com/LMFR_prod/3206"
}
},
{
$addFields: {
newRelateds:
{
$match: {
id: {
$in: [ "$relateds" ]
}
}
}
}
}
])
"errmsg" : "Unrecognized expression '$match'",
"code" : 168,
"codeName" : "InvalidPipelineOperator"
I have found a way to do it, in case someone has the same need.
I used the $unwind to flatten the array of documents and then used the $lookup to fetch the documents by their ids and finally I group the result on a new key.
[{
$match: {
id: "document_1"
}
}, {
$unwind: {
path: '$relateds',
}
}, {
$lookup: {
from: 'documents_nodes',
localField: 'relateds',
foreignField: 'id',
as: 'newRelateds'
}
}, {
$group: {
_id: '$id',
relateds: {
'$push': '$newRelateds'
}
}
}]