MongoDB - Select all documents by the count of an array field - mongodb

In my current project I have a structure like this:
"squad": {
"members": [
{
"name": "xyz",
"empty": true
},
{
"name": "xyz",
"empty": true
},
{
"name": "xyz",
"empty": true
}
]
}
Now I want to query, with MongoDB, every squad that has at least, let's say, 3 empty member slots. I've googled and only found aggregate and $size, which seem to count the whole array rather than the elements that match a condition on a field.
Any idea how to do it?

You can try this query:
db.getCollection('collectionName').aggregate([
{$unwind:"$squad.members"},
{$group:{_id:"$_id",count:{$sum:{$cond: [{$eq: ['$squad.members.empty', true]}, 1, 0]}}}},
{$match: {count: {$gte: 3}}}
])
This query applies a conditional sum ($cond adds 1 for each member whose empty field is true) and then keeps the documents whose count is greater than or equal to 3.

This returns all documents with at least 3 empty slots:
db.squad.aggregate([
{$unwind:"$squad.members"},
{$match:{"squad.members.empty": true}},
{$group:{_id:"$_id",count:{$sum:1}}},
{$match: {count: {$gte: 3}}}
])
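If unwinding and regrouping every document is too heavy, the same count can be computed per document with $filter and $size. A minimal sketch, assuming MongoDB 3.6 or later (for $expr) and the document shape from the question. The helper countEmpty is a hypothetical plain-JavaScript mirror of the expression, included only so the logic can be checked without a server:

```javascript
// Keep documents whose squad.members array holds at least `minEmpty`
// entries with empty === true. No $unwind/$group round trip is needed.
const minEmpty = 3;
const pipeline = [
  { $match: { $expr: { $gte: [
    { $size: { $filter: {
      // $ifNull guards against documents that have no members array
      input: { $ifNull: ["$squad.members", []] },
      as: "m",
      cond: { $eq: ["$$m.empty", true] }
    } } },
    minEmpty
  ] } } }
];

// Hypothetical local mirror of the expression above (not a MongoDB API).
function countEmpty(doc) {
  const members = (doc.squad && doc.squad.members) || [];
  return members.filter((m) => m.empty === true).length;
}

const sample = { squad: { members: [
  { name: "xyz", empty: true },
  { name: "xyz", empty: true },
  { name: "xyz", empty: true }
] } };
console.log(countEmpty(sample) >= minEmpty); // true
```

In the shell this would be run as db.getCollection('collectionName').aggregate(pipeline).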

Related

Limit number of objects pushed to array in MongoDB aggregation

I've been trying to find a way to limit the number of objects I'm pushing to the arrays I create while using "aggregate" on a MongoDB collection.
I have a collection of students - each has these relevant keys:
class number it takes this semester (only one value),
percentile in class (exists if is enrolled in class, null if not),
current score in class (> 0 if enrolled in class, else - 0),
total average (GPA),
max grade
I need to group all students who never failed, per class, in one array that contains those with a GPA higher than 80, and another array containing those without this GPA, sorted by their score in this specific class.
This is my query:
db.getCollection("students").aggregate([
{"$match": {
"class_number":
{"$in": [49, 50, 16]},
"grades.curr_class.percentile":
{"$exists": true},
"grades.min": {"$gte": 80},
}},
{"$sort": {"grades.curr_class.score": -1}},
{"$group": {"_id": "$class_number",
"studentsWithHighGPA":
{"$push":
{"$cond": [{"$gte": ["$grades.gpa", 80]},
{"id": "$_id"},
"$$REMOVE"]
}
},
"studentsWithoutHighGPA":
{"$push":
{"$cond": [{"$lt": ["$grades.gpa", 80]},
{"id": "$_id"},
"$$REMOVE"]
},
},
},
},
])
What I'm trying to do is limit the number of students in each of these arrays. I only want the top 16 in each array, but I'm not sure how to approach this.
Thanks in advance!
I've tried using limit in different variations, and slice too, but none seem to work.
Since MongoDB version 5.0, one option is to use $setWindowFields for this, and in particular its $rank operator. This makes it possible to keep only the relevant students and limit their count even before the $group step:
$match only relevant students as suggested by the OP
$set the groupId for the $setWindowFields stage (as it can currently partition by one key only)
$setWindowFields to define the rank of each student in their array
$match only students with the wanted rank
$group by class_number as suggested by the OP:
db.collection.aggregate([
{$match: {
class_number: {$in: [49, 50, 16]},
"grades.curr_class.percentile": {$exists: true},
"grades.min": {$gte: 80}
}},
{$set: {
groupId: {$concat: [
{$toString: "$class_number"},
{$toString: {$toBool: {$gte: ["$grades.gpa", 80]}}}
]}
}},
{$setWindowFields: {
partitionBy: "$groupId",
sortBy: {"grades.curr_class.score": -1},
output: {rank: {$rank: {}}}
}},
{$match: {rank: {$lte: 16}}}, // 16 is the per-array limit asked for in the question
{$group: {
_id: "$class_number",
studentsWithHighGPA: {$push: {
$cond: [{$gte: ["$grades.gpa", 80]}, {id: "$_id"}, "$$REMOVE"]}},
studentsWithoutHighGPA: {$push: {
$cond: [{$lt: ["$grades.gpa", 80]}, {id: "$_id"}, "$$REMOVE"]}}
}}
])
See how it works on the playground example
*This solution limits the rank of the students, so there is an edge case in which an array ends up with more than n students (when several students tie at exactly rank n). It can be solved by adding a $slice step.
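To make the tie edge case concrete, here is a small sketch. rankDescending is a hypothetical plain-JavaScript helper that imitates how $rank numbers tied scores; it is not a MongoDB API. The capArrays stage is one way the $slice step could look, assuming MongoDB 4.2+ for $set and the array names used in the pipeline:

```javascript
// Plain-JavaScript illustration of $rank semantics: tied scores share a rank,
// and the next distinct score skips ahead by the number of ties.
function rankDescending(scores) {
  const sorted = [...scores].sort((a, b) => b - a);
  return sorted.map((score) => ({ score, rank: sorted.indexOf(score) + 1 }));
}

const rankLimit = 2;
const ranked = rankDescending([95, 90, 90, 85]); // ranks 1, 2, 2, 4
const kept = ranked.filter((s) => s.rank <= rankLimit);
console.log(kept.length); // 3: both students scoring 90 share rank 2

// Stage to append after the $group so each array is capped at 16 entries.
const capArrays = { $set: {
  studentsWithHighGPA: { $slice: ["$studentsWithHighGPA", 16] },
  studentsWithoutHighGPA: { $slice: ["$studentsWithoutHighGPA", 16] }
} };
```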
Maybe MongoDB's $facet stage is a solution. You can specify different output pipelines in one aggregation call.
Something like this:
const pipeline = [
{
'$facet': {
'studentsWithHighGPA': [
{ '$match': { 'grade': { '$gte': 80 } } },
{ '$sort': { 'grade': -1 } },
{ '$limit': 16 }
],
'studentsWithoutHighGPA': [
{ '$match': { 'grade': { '$lt': 80 } } },
{ '$sort': { 'grade': -1 } },
{ '$limit': 16 }
]
}
}
];
coll.aggregate(pipeline)
This should end up with one document including two arrays.
studentsWithHighGPA (array)
0 (object)
1 (object)
...
studentsWithoutHighGPA (array)
0 (object)
1 (object)
See each facet as an aggregation pipeline on its own. So you can also include $group to group by classes or something else.
https://www.mongodb.com/docs/manual/reference/operator/aggregation/facet/
I don't think there is a mongodb-provided operator to apply a limit inside of a $group stage.
You could use $accumulator, but that requires server-side scripting to be enabled, and may have performance impact.
Limiting studentsWithHighGPA to 16 throughout the grouping might look something like:
"studentsWithHighGPA": {
"$accumulator": {
init: function() {
return {combined: []};
},
accumulate: function(state, id, score) {
// keep only students with a GPA of at least 80
if (score >= 80) {
state.combined.push({_id: id, score: score});
}
return {combined: state.combined.slice(0, 16)};
},
accumulateArgs: ["$_id", "$grades.gpa"],
merge: function(A, B) {
return {combined:
A.combined.concat(B.combined).sort(
function(SA, SB) {
return (SB.score - SA.score);
})
};
},
finalize: function(s) {
return s.combined.slice(0, 16).map(function(A) {
return {_id: A._id};
});
},
lang: "js"
}
}
Note that the score is also carried through until the very end so that partial result sets from different shards can be combined properly.
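Newer servers do ship accumulators that cap an array inside $group: MongoDB 5.2 added $topN (along with $firstN and related operators). A sketch, assuming 5.2+ and the field names from the question. Since, as far as I know, $topN keeps null outputs rather than dropping them, the $cond/$$REMOVE trick is replaced here by grouping on a compound key, so each class yields two result documents (high GPA and not) instead of one:

```javascript
// Top 16 students per (class, GPA bucket), ordered by score in the class.
// Requires MongoDB 5.2+ for the $topN accumulator.
const pipeline = [
  { $match: {
    class_number: { $in: [49, 50, 16] },
    "grades.curr_class.percentile": { $exists: true },
    "grades.min": { $gte: 80 }
  } },
  { $group: {
    // highGPA is true for a GPA of 80 or more, false otherwise
    _id: { class_number: "$class_number", highGPA: { $gte: ["$grades.gpa", 80] } },
    students: { $topN: {
      n: 16,
      sortBy: { "grades.curr_class.score": -1 },
      output: { id: "$_id" }
    } }
  } }
];
```

It would be run as db.getCollection("students").aggregate(pipeline).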

Update multiple fields based on condition in aggregation pipeline MongoDB Atlas trigger

I have the following pipeline that calculates the rank (by sorting on the score) when the update flag is set to true:
const pipeline = [
{$match: {"score": {$gt: 0}, "update": true}},
{$setWindowFields: {sortBy: {"score": -1}, output: {"rank": {$denseRank: {}}}}},
{$merge: {into: "ranking"}}
];
await ranking_col.aggregate(pipeline).toArray();
What I do next is set the rank to 0 when the update flag is set to false:
ranking_col.updateMany({"update": false}, {$set: {"rank": parseInt(0, 10)}});
One of my documents looks like this:
{
"_id": "7dqe1kcA7R1YGjdwHsAkV83",
"score": 294,
"update": false,
"rank": 0,
}
I want to avoid the extra updateMany call and do the equivalent inside the pipeline. MongoDB support told me at the time to use the $addFields stage this way:
const pipeline = [
{$match: {"score": {$gt: 0}, "update": true}},
{$setWindowFields: {sortBy: {"score": -1}, output: {"rank": {$denseRank: {}}}}},
{$addFields: {rank: {$cond: [{$eq: ['$update', false]},parseInt(0, 10),'$rank']}}},
{$merge: {into: "ranking"}}
];
This is not working in my Atlas Trigger.
Can you please correct my syntax or tell me a good way to do so?
This aggregation pipeline isn't particularly efficient (a fair amount of work in "$setWindowFields" gets thrown away - more comments about this below), but I think it does what you want. Please check to make sure it's correct as I don't have complete understanding of the collection, its use, etc.
N.B.: This aggregation pipeline is not very efficient because:
1. It processes every document. There's no leading "$match" to filter documents.
2. Because of 1., "$setWindowFields" has to "partitionBy": "$update" and sort/rank both the "update": false partition and the documents with "update": true and "score" <= 0, even though those ranks are irrelevant.
3. All the irrelevant work is thrown away by setting the "update": false partition's "rank" to 0 and then excluding the documents with "update": true and "score" <= 0 from the "$merge".
In a large collection, your original two-step update may well be more efficient.
db.ranking.aggregate([
{
"$setWindowFields": {
"partitionBy": "$update",
"sortBy": {"score": -1},
"output": {
"rank": {"$denseRank": {}}
}
}
},
{
"$set": {
"rank": {
"$cond": [
"$update",
"$rank",
0
]
}
}
},
{
"$match": {
"$expr": {
"$not": [{"$and": ["$update", {"$lte": ["$score", 0]}]}]
}
}
},
{"$merge": "ranking"}
])
Try it on mongoplayground.net.
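One way to keep the leading $match, so that only the relevant documents are ranked, might be to append the "update": false documents with $unionWith and set their rank to 0 there. A sketch, assuming MongoDB 5.0+ ($setWindowFields; $unionWith itself needs 4.4) and that everything lives in the ranking collection. It is untested inside an Atlas trigger:

```javascript
// Rank only the documents that need it, then union in the update:false
// documents with rank forced to 0, and merge everything back in one pass.
const pipeline = [
  { $match: { score: { $gt: 0 }, update: true } },
  { $setWindowFields: {
    sortBy: { score: -1 },
    output: { rank: { $denseRank: {} } }
  } },
  { $unionWith: { coll: "ranking", pipeline: [
    { $match: { update: false } },
    { $set: { rank: 0 } }
  ] } },
  { $merge: { into: "ranking" } }
];
```

As in the question, it would be run with await ranking_col.aggregate(pipeline).toArray().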

Scala / MongoDB - removing duplicate

I have seen very similar questions with solutions to this problem, but I am unsure how I would incorporate them into my own query. I'm programming in Scala and using the MongoDB Aggregates "framework".
val getItems = Seq (
Aggregates.lookup(Store...)...
Aggregates.lookup(Store.STORE_NAME, "relationship.itemID", "uniqueID", "item"),
Aggregates.unwind("$item"),
// filter duplicates here ?
Aggregates.lookup(Store.STORE_NAME, "item.content", "ID", "content"),
Aggregates.unwind("$content"),
Aggregates.project(Projections.fields(Projections.include("store", "item", "content")))
)
The query returns duplicate objects which is undesirable. I would like to remove these. How could I go about incorporating Aggregates.group and "$addToSet" to do this? Or any other reasonable solution would be great too.
Note: I have to omit some details about the query, so the store lookup aggregate is not there. However, I want to remove the duplicates later in the query so it hopefully shouldn't matter.
Please let me know if I need to provide more information.
Thanks.
EDIT: 31/ 07/ 2019: 13:47
I have tried the following:
val getItems = Seq (
Aggregates.lookup(Store...)...
Aggregates.lookup(Store.STORE_NAME, "relationship.itemID", "uniqueID", "item"),
Aggregates.unwind("$item"),
Aggregates.group("$item.itemID",
Accumulators.first("ID", "$ID"),
Accumulators.first("itemName", "$itemName"),
Accumulators.addToSet("item", "$item")),
Aggregates.unwind("$items"),
Aggregates.lookup(Store.STORE_NAME, "item.content", "ID", "content"),
Aggregates.unwind("$content"),
Aggregates.project(Projections.fields(Projections.include("store", "items", "content")))
)
But my query now returns zero results instead of the duplicate result.
You can use $first to remove the duplicates.
Suppose I have the following data:
[
{"_id": 1,"item": "ABC","sizes": ["S","M","L"]},
{"_id": 2,"item": "EFG","sizes": []},
{"_id": 3, "item": "IJK","sizes": "M" },
{"_id": 4,"item": "LMN"},
{"_id": 5,"item": "XYZ","sizes": null
}
]
Now, let's aggregate it with $first and with $unwind and see the difference.
First, using $first:
db.collection.aggregate([
{ $sort: {
item: 1
}
},
{ $group: {
_id: "$item",firstSize: {$first: "$sizes"}}}
])
Output
[
{"_id": "XYZ","firstSize": null},
{"_id": "ABC","firstSize": ["S","M","L" ]},
{"_id": "IJK","firstSize": "M"},
{"_id": "EFG","firstSize": []},
{"_id": "LMN","firstSize": null}
]
Now, let's aggregate it using $unwind:
db.collection.aggregate([
{
$unwind: "$sizes"
}
])
Output
[
{"_id": 1,"item": "ABC","sizes": "S"},
{"_id": 1,"item": "ABC","sizes": "M"},
{"_id": 1,"item": "ABC","sizes": "L"},
{"_id": 3,"item": "IJK","sizes": "M"}
]
You can see that $first removes the duplicates, whereas $unwind keeps them.
Using $unwind and $first together:
db.collection.aggregate([
{ $unwind: "$sizes"},
{
$group: {
_id: "$item",firstSize: {$first: "$sizes"}}
}
])
Output
[
{"_id": "IJK", "firstSize": "M"},
{"_id": "ABC","firstSize": "S"}
]
Grouping and then using $addToSet is an effective way to deal with your problem!
It looks like this in the mongo shell:
db.sales.aggregate(
[
{
$group:
{
_id: { day: { $dayOfYear: "$date"}, year: { $year: "$date" } },
itemsSold: { $addToSet: "$item" }
}
}
]
)
In Scala you can do it like this:
Aggregates.group("$groupfield", Accumulators.addToSet("fieldName","$expression"))
If you have multiple fields to group by:
Aggregates.group(new BasicDBObject().append("fieldAname","$fieldA").append("fieldBname","$fieldB"), Accumulators.addToSet("fieldName","$expression"))
Then unwind.
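A likely cause of the empty result in the edited query is that the group stage writes a field named item while the following unwind reads $items, and $unwind drops documents whose path is missing. Below is a sketch of one way to drop duplicates, shown in shell syntax: group on the joined item's key, keep the first full document with $first and $$ROOT, then restore it with $replaceRoot. Grouping on item.uniqueID is an assumption taken from the lookup's foreign field, and dedupeBy is a hypothetical plain-JavaScript mirror used only to check the idea locally:

```javascript
// Stages to place where the "filter duplicates here" comment sits.
const dedupeStages = [
  // one group per joined item; $$ROOT keeps the whole document
  { $group: { _id: "$item.uniqueID", doc: { $first: "$$ROOT" } } },
  // put the kept document back at the top level
  { $replaceRoot: { newRoot: "$doc" } }
];

// Hypothetical local mirror: keep the first document seen for each key.
function dedupeBy(docs, keyOf) {
  const seen = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    if (!seen.has(key)) seen.set(key, doc);
  }
  return [...seen.values()];
}

const rows = [
  { store: "s1", item: { uniqueID: "a" } },
  { store: "s1", item: { uniqueID: "a" } },
  { store: "s1", item: { uniqueID: "b" } }
];
console.log(dedupeBy(rows, (r) => r.item.uniqueID).length); // 2
```

In the Scala driver these stages should correspond to Aggregates.group with Accumulators.first("doc", "$$ROOT") followed by Aggregates.replaceRoot("$doc").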

Is it possible to use `$sum` in the `$match` stage of mongo aggregation and how?

I have a gifts collection in MongoDB with four items in it. How do I query the database so that I get only the gifts whose amounts sum to less than or equal to 5500?
For example, from these four gifts in the database:
{
"_id": 1,
"amount": 3000,
},
{
"_id": 2,
"amount": 2000,
},
{
"_id": 3,
"amount": 1000,
},
{
"_id": 4,
"amount": 5000,
}
The query should return the first two only:
{
"_id": 1,
"amount": 3000,
},
{
"_id": 2,
"amount": 2000,
},
I think I should use mongo aggregation? If so, what is the syntax?
I did some googling and I know how to use $sum in the $group stage, but I don't know how to use it in the $match stage. Is it even possible to do so?
P.S.: I assumed I should use $sum in $match. Am I supposed to group them first? If so, how do I tell mongo to make a group where the sum of the amounts in that group is less than or equal to 5500?
Thanks for any help you are able to provide.
You're going the right way.
First store your $sum in a variable then filter them with $match:
db.gifts.aggregate([
{$match: {}}, // Initial query
{$group: {
_id: '$code', // Assume your gift could be grouped by a unique code
sum: {$sum: '$amount'}, // Sum all amount per group
items: {$push: '$$ROOT'} // Push all gift item to an array
}},
{$match: {sum: {$lte: 5500}}}, // Filter group where sum <= 5500
{$unwind: '$items'}, // Unwind items array to get all match field
{$replaceRoot: {newRoot: '$items'}} // Use this stage to get back the original items
])
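If the goal is a running total over the gifts in _id order rather than a per-group sum, a window function may fit better. A sketch, assuming MongoDB 5.0+ and that _id defines the order in which gifts are added up. runningTotalFilter is a hypothetical plain-JavaScript mirror for checking the logic on the sample data:

```javascript
// Keep gifts for as long as the cumulative amount stays within the limit.
const pipeline = [
  { $setWindowFields: {
    sortBy: { _id: 1 },
    output: { runningTotal: {
      $sum: "$amount",
      // sum from the first document up to and including the current one
      window: { documents: ["unbounded", "current"] }
    } }
  } },
  { $match: { runningTotal: { $lte: 5500 } } },
  { $unset: "runningTotal" }
];

// Hypothetical local mirror of the pipeline above (not a MongoDB API).
function runningTotalFilter(gifts, limit) {
  let total = 0;
  return [...gifts]
    .sort((a, b) => a._id - b._id)
    .filter((g) => (total += g.amount) <= limit);
}

const gifts = [
  { _id: 1, amount: 3000 }, { _id: 2, amount: 2000 },
  { _id: 3, amount: 1000 }, { _id: 4, amount: 5000 }
];
console.log(runningTotalFilter(gifts, 5500).map((g) => g._id)); // [ 1, 2 ]
```

On the sample data the running totals are 3000, 5000, 6000 and 11000, so only the first two gifts pass the $match.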

MongoDb Access array of objects with certain property

I have one document as follows:
{
user: 'hvt07',
photos: [
{
link: 'http://link.to.com/image1.jpg',
isPrivate: true
},
{
link: 'http://link.to.com/image2.jpg',
isPrivate: false
}
]
}
I want to get all photos which are with:
isPrivate: false
I am using the following query:
db.collection_name.find({ photos:{ $elemMatch:{isPrivate: false} } }).pretty()
I have also tried:
db.collection_name.find({'photos.isPrivate': true}).pretty()
But both return all elements in the array even ones that are set as :
isPrivate: true
Please suggest.
Aggregation is the solution.
You need to deconstruct the photos array using the $unwind operator. Next, use $match to select the elements where isPrivate is false. Then, with $group, you can regroup your documents by _id and reconstruct your photos array using the $push operator:
db.collection_name.aggregate(
[
{$unwind: "$photos"},
{$match: {"photos.isPrivate": false}},
{$group: {"_id": {"id": "$_id", "user": "$user"}, photos: {$push: "$photos"}}},
{$project: {"_id": "$_id.id", "user": "$_id.user", "photos": 1}}
]
)
You can also use $elemMatch in the projection, like this:
db.collection_name.find(
{ photos:{ $elemMatch:{isPrivate: false} } }, //1
{photos:{$elemMatch:{isPrivate: false}}}) //2
1. Find all documents that have at least one photo that is not private.
2. Project the photos array for the documents found. Note that an $elemMatch projection returns only the first matching element of the array, not every non-private photo.
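To get every non-private photo without unwinding and regrouping, $filter can rewrite the array in place. A sketch, assuming MongoDB 4.2+ for $set ($addFields does the same on older versions) and the document shape from the question. publicPhotos is a hypothetical plain-JavaScript mirror used only to check the result locally:

```javascript
const pipeline = [
  // skip documents that have no public photo at all
  { $match: { "photos.isPrivate": false } },
  // keep only the array elements whose isPrivate is false
  { $set: { photos: { $filter: {
    input: "$photos",
    as: "p",
    cond: { $eq: ["$$p.isPrivate", false] }
  } } } }
];

// Hypothetical local mirror of the $filter expression (not a MongoDB API).
function publicPhotos(doc) {
  return { ...doc, photos: doc.photos.filter((p) => p.isPrivate === false) };
}

const doc = {
  user: "hvt07",
  photos: [
    { link: "http://link.to.com/image1.jpg", isPrivate: true },
    { link: "http://link.to.com/image2.jpg", isPrivate: false }
  ]
};
console.log(publicPhotos(doc).photos.length); // 1
```

It would be run as db.collection_name.aggregate(pipeline).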