How to aggregate a result by condition - mongodb

I have a collection from which I need to calculate a score for every student. The rule is as simple as this:
Calculate a score for all students in my collection.
If a student belongs to class 'A' or 'B', he gets a score of 5; otherwise, if he belongs to class 'C' or 'D', he gets 4.
Student:
{
name:"Aster",
classes:['A','B']
}
The aggregation framework doesn't allow the $in operator inside $cond, so how do I proceed?
PS: Excuse the brevity, sent on the go.

I'm not sure this fully covers your problem, but you can use $setIsSubset, available since MongoDB 2.6:
db.collection.aggregate([
{ $project: { name: 1 ,
grade: { $cond: [{$setIsSubset: ["$classes", ["A","B"]]}, 5, 4]}
}
}])
In case classes can be either a string or an array:
db.collection.aggregate([
{ $project: { name: 1 ,
grade: {$cond: [{$or: [{$eq: ["$classes", "A"]},
{$eq: ["$classes", "B"]},
{$setIsSubset: ["$classes", ["A","B"]]}]},
5, 4]}
}
}])
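One thing to keep in mind is that $setIsSubset is true only when every element of classes is in ["A","B"], so a student in ['A','C'] would get 4 and a student with an empty array would get 5. The plain-JavaScript sketch below emulates that check in memory next to an "at least one class matches" variant, just to make the difference visible. The function names are hypothetical, and no MongoDB server is involved. On the server, the "at least one" variant could probably be expressed with $setIntersection and $size.

```javascript
// Plain-JavaScript emulation of the two membership checks (no MongoDB required).
// isSubset mirrors $setIsSubset; hasAny is a hypothetical "at least one match" variant.
function isSubset(classes, allowed) {
  return classes.every((c) => allowed.includes(c));
}

function hasAny(classes, allowed) {
  return classes.some((c) => allowed.includes(c));
}

// Score rule from the question: 5 for class A or B, otherwise 4.
function score(classes, check) {
  return check(classes, ["A", "B"]) ? 5 : 4;
}

console.log(score(["A", "B"], isSubset)); // 5
console.log(score(["A", "C"], isSubset)); // 4
console.log(score(["A", "C"], hasAny));   // 5
```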

Related

Limit number of objects pushed to array in MongoDB aggregation

I've been trying to find a way to limit the number of objects I'm pushing to arrays I'm creating while using "aggregate" on a MongoDB collection.
I have a collection of students - each has these relevant keys:
class number it takes this semester (only one value),
percentile in class (exists if is enrolled in class, null if not),
current score in class (> 0 if enrolled in class, else - 0),
total average (GPA),
max grade
I need to group all students who never failed, per class, in one array that contains those with a GPA higher than 80, and another array containing those without this GPA, sorted by their score in this specific class.
This is my query:
db.getCollection("students").aggregate([
{"$match": {
"class_number":
{"$in": [49, 50, 16]},
"grades.curr_class.percentile":
{"$exists": true},
"grades.min": {"$gte": 80},
}},
{"$sort": {"grades.curr_class.score": -1}},
{"$group": {"_id": "$class_number",
"studentsWithHighGPA":
{"$push":
{"$cond": [{"$gte": ["$grades.gpa", 80]},
{"id": "$_id"},
"$$REMOVE"]
}
},
"studentsWithoutHighGPA":
{"$push":
{"$cond": [{"$lt": ["$grades.gpa", 80]},
{"id": "$_id"},
"$$REMOVE"]
},
},
},
},
])
What I'm trying to do is limit the number of students in each of these arrays. I only want the top 16 in each array, but I'm not sure how to approach this.
Thanks in advance!
I've tried using $limit in different variations, and $slice too, but none seem to work.
Since MongoDB 5.0, one option is to use $setWindowFields for this, and in particular its $rank operator. This lets you keep only the relevant students and limit their count even before the $group step:
$match only the relevant students, as in the question
$set a groupId for $setWindowFields (as it can currently partition by one key only)
$setWindowFields to compute the rank of each student within their array
$match only students whose rank is at most rankLimit (16 in the question)
$group by class_number, as in the question:
db.collection.aggregate([
{$match: {
class_number: {$in: [49, 50, 16]},
"grades.curr_class.percentile": {$exists: true},
"grades.min": {$gte: 80}
}},
{$set: {
groupId: {$concat: [
{$toString: "$class_number"},
{$toString: {$toBool: {$gte: ["$grades.gpa", 80]}}}
]}
}},
{$setWindowFields: {
partitionBy: "$groupId",
sortBy: {"grades.curr_class.score": -1},
output: {rank: {$rank: {}}}
}},
{$match: {rank: {$lte: rankLimit}}},
{$group: {
_id: "$class_number",
studentsWithHighGPA: {$push: {
$cond: [{$gte: ["$grades.gpa", 80]}, {id: "$_id"}, "$$REMOVE"]}},
studentsWithoutHighGPA: {$push: {
$cond: [{$lt: ["$grades.gpa", 80]}, {id: "$_id"}, "$$REMOVE"]}}
}}
])
* This solution limits the rank of the students, so in an edge case an array can end up with more than n students (when several students share the exact rank of n). This can be solved by adding a $slice step.
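A minimal sketch of that $slice step follows, written as an extra stage that would be appended after the $group stage above. The stage name trimStage and the value 16 for rankLimit are assumptions taken from the question, not part of the original answer, and the stage has not been run against a server here. The second half is a plain-JavaScript illustration of why tied scores can push an array past the limit.

```javascript
// Hypothetical final stage appended after the $group stage above.
// rankLimit is assumed to be 16, as in the question.
const rankLimit = 16;

const trimStage = {
  $set: {
    studentsWithHighGPA: { $slice: ["$studentsWithHighGPA", rankLimit] },
    studentsWithoutHighGPA: { $slice: ["$studentsWithoutHighGPA", rankLimit] }
  }
};

// Plain-JavaScript illustration of the tie problem:
// tied scores share a rank, as they do with $rank.
function rank(scores) {
  const sorted = [...scores].sort((a, b) => b - a);
  return sorted.map((s) => ({ score: s, rank: sorted.indexOf(s) + 1 }));
}

const kept = rank([90, 85, 85, 70]).filter((r) => r.rank <= 2);
console.log(kept.length); // 3, although the limit was 2
```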
Maybe the MongoDB $facet stage is a solution. It lets you specify different output pipelines in one aggregation call.
Something like this:
const pipeline = [
{
'$facet': {
'studentsWithHighGPA': [
{ '$match': { 'grade': { '$gte': 80 } } },
{ '$sort': { 'grade': -1 } },
{ '$limit': 16 }
],
'studentsWithoutHighGPA': [
{ '$match': { 'grade': { '$lt': 80 } } },
{ '$sort': { 'grade': -1 } },
{ '$limit': 16 }
]
}
}
];
coll.aggregate(pipeline)
This should end up with one document including two arrays.
studentsWithHighGPA (array)
0 (object)
1 (object)
...
studentsWithoutHighGPA (array)
0 (object)
1 (object)
See each facet as an aggregation pipeline on its own. So you can also include $group to group by classes or something else.
https://www.mongodb.com/docs/manual/reference/operator/aggregation/facet/
I don't think there is a mongodb-provided operator to apply a limit inside of a $group stage.
You could use $accumulator, but that requires server-side scripting to be enabled, and may have performance impact.
Limiting studentsWithHighGPA to 16 throughout the grouping might look something like:
"studentsWithHighGPA": {
  "$accumulator": {
    // Start each group with an empty list
    init: function() {
      return {combined: []};
    },
    // Keep the id and GPA of qualifying students, capped at 16 entries
    accumulate: function(state, id, score) {
      if (score >= 80) {
        state.combined.push({_id: id, score: score});
      }
      return {combined: state.combined.slice(0, 16)};
    },
    accumulateArgs: ["$_id", "$grades.gpa"],
    // Combine partial states, highest score first
    merge: function(A, B) {
      return {combined:
        A.combined.concat(B.combined).sort(
          function(SA, SB) {
            return SB.score - SA.score;
          })
      };
    },
    // Trim to 16 and drop the score from the output
    finalize: function(s) {
      return s.combined.slice(0, 16).map(function(A) {
        return {_id: A._id};
      });
    },
    lang: "js"
  }
}
Note that the score is also carried through until the very end so that partial result sets from different shards can be combined properly.

MongoDB (Mongoose) database query question

I'm trying to get all the documents from a collection that have a variable amount of failed exams.
My collection is the following:
I have to retrieve all students that have, for example, 3 scores lower than 10.
The query I am currently running is the following:
Student.aggregate([
{
$project: {
_id: 0,
name: 1,
students: {
count: {
$size: {
$filter: {
input: "$results",
as: "result",
cond: {$lt: ["$$result.score", 10]}
}
}
}
}
}
}
])
How would I check whether the count is greater than or equal to, for example, 3?
My current output:
You are almost 100% done already! Just add
{$match: {"students.count": {$gte: 3}}}
as a stage in the pipeline after $project.
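To see what the $project stage followed by that $match computes, here is a plain-JavaScript sketch that emulates both steps in memory. The sample documents and the minFailed variable are hypothetical (the question's collection was not preserved); only the field names results, score, name and students.count come from the question.

```javascript
// In-memory emulation of the $filter/$size count followed by the $match (no MongoDB needed).
// The sample documents are hypothetical; only the field names come from the question.
const students = [
  { name: "Ann", results: [{ score: 4 }, { score: 8 }, { score: 9 }, { score: 15 }] },
  { name: "Bob", results: [{ score: 12 }, { score: 3 }] }
];

const minFailed = 3; // the "$gte: 3" threshold

const matched = students
  .map((s) => ({
    name: s.name,
    // Equivalent of $size over $filter with cond {$lt: ["$$result.score", 10]}
    students: { count: s.results.filter((r) => r.score < 10).length }
  }))
  // Equivalent of {$match: {"students.count": {$gte: 3}}}
  .filter((s) => s.students.count >= minFailed);

console.log(matched); // [ { name: 'Ann', students: { count: 3 } } ]
```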

MongoDB Find values passed in that don't match

I'm currently stuck on an issue with MongoDB aggregation. I have an array of '_ids' that I need to check for existence in a specific collection.
Example:
I have 3 records in 'Collection 1' with _id 1,2,3. I can find the matching values using:
$match: {
_id: {
$in: [1, 2, 3, 4]
}
}
However, what I want to know is which of the values I passed in (1, 2, 3, 4) don't match a record. (In this case _id 4 has no matching record.)
So instead of returning the records with _id 1, 2, 3, it needs to return the _id that doesn't exist; in this example, '_id: 4'.
The query should also disregard any extra records in the collection. For example, if the collection held records with IDs 1-10 and I passed in a query to determine whether the _ids 1, 7, 15 existed, the value I'm expecting would be along the lines of '_id: 15 doesn't exist'.
My first thought was to use $project within an aggregation to hold each _id that was passed in, and then attach each matching record in the collection to it. E.g.:
Record 1:
{
_id: 1,
Collection1: [
record details: ...,
...
...
]
},
{
_id: 2,
Collection1: [] // This _id passed in, doesn't have a matching collection
}
However, I can't seem to get a working example in this instance. Any help would be appreciated!
If the input documents are:
{ _id: 1 },
{ _id: 2 },
{ _id: 5 },
{ _id: 10 }
And the array to match is:
var INPUT_ARRAY = [ 1, 7, 15 ]
The following aggregation:
db.test.aggregate( [
{
$match: {
_id: {
$in: INPUT_ARRAY
}
}
},
{
$group: {
_id: null,
matches: { $push: "$_id" }
}
},
{
$project: {
ids_not_exist: { $setDifference: [ INPUT_ARRAY, "$matches" ] },
_id: 0
}
}
] )
Returns:
{ "ids_not_exist" : [ 7, 15 ] }
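One case worth checking: if none of the input ids match, $group receives no documents and, as far as I know, emits nothing, so the pipeline returns an empty result rather than the full input array. A hedged client-side alternative is sketched below. missingIds is a hypothetical helper (not part of any MongoDB driver), and foundIds stands for the _ids returned by a plain $in query.

```javascript
// Client-side counterpart of $setDifference: which input ids were not found?
// `missingIds` is a hypothetical helper, not part of any MongoDB driver.
function missingIds(inputIds, foundIds) {
  const found = new Set(foundIds);
  return inputIds.filter((id) => !found.has(id));
}

console.log(missingIds([1, 7, 15], [1])); // [ 7, 15 ]
console.log(missingIds([7, 15], []));     // [ 7, 15 ] (no document matched at all)
```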
Are you looking for $not? See the MongoDB docs.

MongoDB aggregation but not including certain items

I'm very new to MongoDB's aggregation framework, so I do not know properly how to do this.
I have a data model that is structured like this:
{
name: String,
store: {
item1: Number,
item2: Number,
item3: Number,
item4: Number,
},
createdAt: Date
}
I want to return the average price of every item (item1 to item4). I'm trying this query:
db.commerces.aggregate([
{
$group: {
_id: "",
item1Avg: { $avg: "$store.item1"},
item2Avg: { $avg: "$store.item2"},
item3Avg: { $avg: "$store.item3"},
item4Avg: { $avg: "$store.item4"}
}
}
]);
The problem is that when an item has no price set, it's stored in the database as "-1".
I don't want these values to pollute the average. Is there any way to limit the aggregation so that it only takes a price into account when it is > 0?
A $match stage before $group is not a solution because I want to return all the average prices.
Thank you!
EDIT: Here is an example of the input and the desired output:
[{
name: 'name',
store: {
item1: 10,
item2: -1,
item3: 12,
item4: 3,
}
},
{
name: 'name2',
store: {
item1: 10,
item2: -1,
item3: -1,
item4: 2,
}
},...]
And the desired output:
{
item1Avg: 10,
item2Avg: 0,
item3Avg: 12,
item4Avg: 2.5
}
You need to $unwind the store, then $match values to meet your condition, then $group ones that passed the test. Unfortunately there is no way to $unwind an object, so you need to $project it to array first:
db.commerces.aggregate([
{$project: {store:[
{item:{$literal:"item1"}, val:"$store.item1"},
{item:{$literal:"item2"}, val:"$store.item2"},
{item:{$literal:"item3"}, val:"$store.item3"},
{item:{$literal:"item4"}, val:"$store.item4"}
]}},
{$unwind:"$store"},
{$match: {"store.val":{$gt:0}}},
{$group: {_id:"$store.item", avg:{$avg:"$store.val"}}}
])
EDIT:
As @blakes-seven pointed out, it may not work on versions < 3.2. An alternative approach with $map may work:
db.commerces.aggregate([
{$project: {
store: {
$map:{
input:[
{item:{$literal:"item1"}, val:"$store.item1"},
{item:{$literal:"item2"}, val:"$store.item2"},
{item:{$literal:"item3"}, val:"$store.item3"},
{item:{$literal:"item4"}, val:"$store.item4"}
],
as: "i",
in: "$$i"
}
}
}},
{$unwind:"$store"},
{$match: {"store.val":{$gt:0}}},
{$group: {_id:"$store.item", avg:{$avg:"$store.val"}}}
])
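Both pipelines above return one document per item, and an item whose prices are all -1 (item2 in the sample) does not appear at all. As an alternative sketch, not taken from the answers and not run against a server here, a single $group can wrap each field in $cond so that non-positive prices become null, relying on $avg skipping non-numeric values. The helper avgOfPositive is hypothetical.

```javascript
// Builds an $avg expression that replaces non-positive prices with null,
// relying on $avg skipping non-numeric values. `avgOfPositive` is a hypothetical helper.
function avgOfPositive(field) {
  return { $avg: { $cond: [{ $gt: [field, 0] }, field, null] } };
}

const groupStage = {
  $group: {
    _id: null,
    item1Avg: avgOfPositive("$store.item1"),
    item2Avg: avgOfPositive("$store.item2"),
    item3Avg: avgOfPositive("$store.item3"),
    item4Avg: avgOfPositive("$store.item4")
  }
};

console.log(JSON.stringify(groupStage, null, 2));
// Usage (collection name taken from the question):
// db.commerces.aggregate([groupStage])
```

With this shape, an item that has no valid price should come back as null rather than 0; if 0 is required, as in the desired output, a following $project with $ifNull could map it.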

Doing a sum with mongo db aggregation framework

I have the following kind of docs in a collection in mongo db
{ _id:xx,
iddoc:yy,
type1:"sometype1",
type2:"sometype2",
date:
{
year:2015,
month:4,
day:29,
type:"day"
},
count:23
}
I would like to do a sum over the field count grouping by iddoc for all docs where:
type1 in ["type1A","type1B",...]
type2 in ["type2A","type2B",...]
date.year: 2015,
date.month: 4,
date.type: "day"
date.day between 4 and 7
I would like then to sort these sums.
I think this is probably easy to do within mongo db aggregation framework but I am new to it and would appreciate a tip to get started.
This is straightforward to do with an aggregation pipeline:
db.test.aggregate([
// Filter the docs based on your criteria
{$match: {
type1: {$in: ['type1A', 'type1B']},
type2: {$in: ['type2A', 'type2B']},
'date.year': 2015,
'date.month': 4,
'date.type': 'day',
'date.day': {$gte: 4, $lte: 7}
}},
// Group by iddoc and sum the count field
{$group: {
_id: '$iddoc',
sum: {$sum: '$count'}
}},
// Sort by sum, descending
{$sort: {sum: -1}}
])
If I understood you correctly:
db.col.aggregate([
  {
    // Keep only the docs that meet the criteria
    $match: {
      type1: {$in: ["type1A", "type1B" /* , ... */]},
      type2: {$in: ["type2A", "type2B" /* , ... */]},
      "date.year": 2015,
      "date.month": 4,
      "date.day": {$gte: 4, $lte: 7},
      "date.type": "day"
    }
  },
  {
    // Sum the count field per iddoc
    $group: {
      _id: "$iddoc",
      total_count: {$sum: "$count"}
    }
  },
  // Sort by the sum, ascending
  {$sort: {total_count: 1}}
])
This filters date.day between 4 and 7 inclusive (to exclude the endpoints, use $gt and $lt instead). It sorts the results from lowest to highest (ascending); if you want a descending sort, use:
{ $sort: {total_count: -1}}
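For a quick sanity check of what the $match, $group (with $sum over count) and descending $sort stages are meant to produce, the same logic can be emulated in plain JavaScript. The sample documents below are hypothetical; only the field names and filter values come from the question.

```javascript
// In-memory emulation of $match + $group ($sum of count) + descending $sort.
// Sample documents are hypothetical; no MongoDB server is involved.
const day = (d) => ({ year: 2015, month: 4, day: d, type: "day" });
const docs = [
  { iddoc: "a", type1: "type1A", type2: "type2A", date: day(5), count: 23 },
  { iddoc: "a", type1: "type1B", type2: "type2B", date: day(6), count: 7 },
  { iddoc: "b", type1: "type1A", type2: "type2B", date: day(4), count: 10 },
  { iddoc: "b", type1: "type1A", type2: "type2A", date: day(9), count: 100 } // outside the day range
];

const totals = {};
for (const d of docs) {
  const matches =
    ["type1A", "type1B"].includes(d.type1) &&
    ["type2A", "type2B"].includes(d.type2) &&
    d.date.year === 2015 && d.date.month === 4 && d.date.type === "day" &&
    d.date.day >= 4 && d.date.day <= 7;
  if (matches) totals[d.iddoc] = (totals[d.iddoc] || 0) + d.count;
}

const sorted = Object.entries(totals)
  .map(([id, total]) => ({ _id: id, total_count: total }))
  .sort((x, y) => y.total_count - x.total_count);

console.log(sorted); // [ { _id: 'a', total_count: 30 }, { _id: 'b', total_count: 10 } ]
```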