Second $project Stage Producing Unexpected Result in MongoDB Aggregation

I am trying to run some aggregation in my MongoDB backend where I calculate a value and then add that calculated value to another value. The first step is working, but the second step produces a value of null, and I'm trying to understand why, and how to fix it.
This is what my aggregation looks like:
db.staff.aggregate({
$project: {
_id: 1,
"workloadSatisfactionScore": {
$cond: [ { $eq: [ "$workload.shiftAvg", 0 ] }, "N/A", { $divide: [ "$workload.shiftAvg", "$workload.weeklyShiftRequest.minimum" ] } ]
}
},
$project: {
_id: 1,
totalScore: {
$sum: [ "$workloadSatisfactionScore", 10 ]
},
}
})
Even though the first $project stage produces documents with a numeric result or null for 'workloadSatisfactionScore', after the second $project stage, ALL documents have a value of null for 'totalScore'.
What I should get is whatever the value of 'workloadSatisfactionScore' is, added to 10. But as I say, all I get is null for all documents. What looks incorrect here?
As an example, one particular document in my collection returns a value of 0.9166666666666666 for "workloadSatisfactionScore". So when that is plugged into the second $project stage I'd expect a value of 10.9166666666666666 for 'totalScore'. But, as I say, instead I get null for that document, and all other documents.

By the time the second $project stage runs, workloadSatisfactionScore may be a string (the value "N/A"), and using $add or $sum with a non-numeric value will not give you the number you expect.
There is no need for the second $project stage. You can add the 10 inside the same conditional that handles the non-numeric case, so the string is never passed down the pipeline:
db.staff.aggregate([
    {
        "$project": {
            "_id": 1,
            "totalScore": {
                "$cond": [
                    { "$eq": [ "$workload.shiftAvg", 0 ] },
                    "N/A",
                    { "$add": [
                        10,
                        { "$divide": [
                            "$workload.shiftAvg",
                            "$workload.weeklyShiftRequest.minimum"
                        ] }
                    ] }
                ]
            }
        }
    }
])
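Separately from the "N/A" case, it may be worth checking how the two stages are passed. In the question, both $project keys sit inside one object literal, and a JavaScript object keeps only one value per key, so only the last $project would reach the server. A minimal sketch in plain JavaScript (the projections are simplified placeholders, not the asker's full expressions):

```javascript
// One object literal, two identical keys: the second value replaces the first.
const collapsed = {
  $project: { workloadSatisfactionScore: "$workload.shiftAvg" },
  $project: { totalScore: { $sum: ["$workloadSatisfactionScore", 10] } },
};

console.log(Object.keys(collapsed).length);   // 1
console.log(Object.keys(collapsed.$project)); // [ 'totalScore' ]

// An array keeps each stage as its own element, in order.
const stages = [
  { $project: { workloadSatisfactionScore: "$workload.shiftAvg" } },
  { $project: { totalScore: { $sum: ["$workloadSatisfactionScore", 10] } } },
];

console.log(stages.length); // 2
```

If you do want two stages, passing them as an array (as in `stages`) keeps the first one from being dropped.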

Related

MongoDB decrement until zero

I would like to achieve an operation in MongoDB that would be analogous to doc.value = max(doc.value - amount, 0). I could do it by fetching document, updating its value then saving it, but is it possible with an atomic operation to avoid problems with synchronisation of parallel decrements?
It is, in fact, possible to achieve with a single operation. All you need is an aggregation pipeline inside an update operator.
Let's say you have a doc that looks like this:
{
"key": 1,
value: 30
}
You want to subtract x from value and, if the result is less than zero, set value to 0; otherwise set it to value - x. Here is the update, using an aggregation pipeline, that does this. In this example I am subtracting 20 from value.
db.collection.update(
  { key: 1 },
  [
    {
      $set: {
        value: {
          $cond: [
            { $gt: [ { $subtract: [ "$value", 20 ] }, 0 ] },
            { $subtract: [ "$value", 20 ] },
            0
          ]
        }
      }
    }
  ]
)
The result will be:
{
"key": 1,
"value": 10
}
But if you change 20 to, say, 44, the result is:
{
"key": 1,
"value": 0
}
Here is a Playground for you: https://mongoplayground.net/p/Y9yO6v9Oca8
Kudos to codemonkey's response for providing a solution that uses an aggregation pipeline to make the update atomic.
Here's a slightly simpler aggregation pipeline that takes advantage of the $max operator:
db.collection.update(
  {},
  [
    {
      $set: {
        value: { $max: [ 0, { $subtract: [ "$value", 20 ] } ] }
      }
    }
  ],
  { multi: true }
)
The pipeline sets the value to the maximum of 0 and the result of the decrement.
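If the amount is not always 20, the $max pipeline can be generated instead of typed out. The sketch below is only an illustration: `buildDecrementPipeline` and `clampedDecrement` are hypothetical helper names, and the second function is a plain-JavaScript mirror of what the pipeline computes for each document, not something MongoDB provides.

```javascript
// Hypothetical helper: builds the $max-based update pipeline for a given
// field name and decrement amount.
function buildDecrementPipeline(field, amount) {
  if (typeof amount !== "number" || amount < 0) {
    throw new RangeError("amount must be a non-negative number");
  }
  return [
    { $set: { [field]: { $max: [0, { $subtract: ["$" + field, amount] }] } } },
  ];
}

// Plain-JavaScript mirror of the per-document result: max(value - amount, 0).
function clampedDecrement(value, amount) {
  return Math.max(0, value - amount);
}

const pipeline = buildDecrementPipeline("value", 20);
console.log(JSON.stringify(pipeline));
console.log(clampedDecrement(30, 20)); // 10
console.log(clampedDecrement(30, 44)); // 0
```

The returned array can be passed as the second argument of the update call in place of the inline pipeline, assuming a server version that accepts pipelines in updates (MongoDB 4.2 or later).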

How do I find the total number of subjects that have no prerequisites using aggregation?

I have tried several queries, but none of them worked.
Here is an example from the database. One subject has a prerequisite and the other does not, and I would like to find the total number of subjects with no prerequisites:
db.Subject.insert(
{
"_id":ObjectId(),
"subject":{
"subCode":"CSCI321",
"subTitle":"Final Year Project",
"credit":6,
"type":"Core",
"assessments": [
{ "assessNum": 1,
"weight":30,
"assessType":"Presentation",
"description":"Prototype demonstration" },
{ "assignNum": 2,
"weight":70,
"assessType":"Implementation and Presentation",
"description":"Final product Presentation and assessment of product implementation by panel of project supervisors" }
]
}
}
)
db.Subject.insert(
{
"_id":ObjectId(),
"subject":{
"subCode":"CSCI203",
"subTitle":"Algorithm and Data Structures",
"credit":3,
"type":"Core",
"prerequisite": ["csci103"]
}})
One of the queries I tried:
db.Subject.aggregate({$group:{"prerequisite":{"$exists": null}, count:{$sum:1}}});
Result:
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:534:17
assert.commandWorked@src/mongo/shell/assert.js:618:16
DB.prototype._runAggregate@src/mongo/shell/db.js:260:9
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1062:12
@(shell):1:1
You can use $match to eliminate unwanted documents and $group to calculate the sum:
db.collection.aggregate([
  { $match: { "subject.prerequisite": { $exists: false } } },
  { $group: { _id: null, total: { $sum: 1 } } }
])
This can be achieved within a single aggregation pipeline stage, i.e. the $group step, where you can use the BSON type comparison order to separate the documents where the 'subject.prerequisite' field is missing or empty from those where it exists and has at least one element. The condition can be used as the group-by key, i.e. the _id field in $group.
Consider running the following aggregation pipeline to get the desired results:
db.Subject.aggregate([
  { $group: {
    _id: {
      $cond: [
        {
          $or: [
            { $lte: [ '$subject.prerequisite', null ] },
            { $eq: [ { $size: { $ifNull: [ '$subject.prerequisite', [] ] } }, 0 ] }
          ]
        },
        'noPrerequisite',
        'havePrerequisite'
      ]
    },
    count: { $sum: 1 }
  } }
])
The first condition in the $or returns true if a document does not have the embedded prerequisite field (or the field is null). The second condition returns true when the array is empty:
if length of ( prerequisite || [] ) is zero
In the above, $cond takes a logical condition as its first argument (if), then returns the second argument when the condition is true (then) or the third argument when it is false (else). When used as the expression for the _id field in $group, it evaluates the condition for each document and maps true to "noPrerequisite" and false to "havePrerequisite", which becomes the group key.
The results will contain both counts: one for documents where the prerequisite field is missing or an empty array, and one for documents where it has at least one element.
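To make the grouping rule concrete, here is a plain-JavaScript mirror of the group key, run over trimmed copies of the two inserted subjects. It is only a sketch of the logic, not how the server evaluates it, and the third document (HYPO000, with an empty array) is a hypothetical addition to exercise the empty-array branch.

```javascript
const subjects = [
  { subject: { subCode: "CSCI321" } },
  { subject: { subCode: "CSCI203", prerequisite: ["csci103"] } },
  { subject: { subCode: "HYPO000", prerequisite: [] } }, // hypothetical document
];

function groupKey(doc) {
  const prerequisite = doc.subject.prerequisite;
  // Mirrors { $lte: [ field, null ] }: the field is missing or null.
  if (prerequisite === undefined || prerequisite === null) {
    return "noPrerequisite";
  }
  // Mirrors the $size check: an empty array also counts as "no prerequisite".
  return prerequisite.length === 0 ? "noPrerequisite" : "havePrerequisite";
}

// Mirrors count: { $sum: 1 } per group key.
const counts = {};
for (const doc of subjects) {
  const key = groupKey(doc);
  counts[key] = (counts[key] || 0) + 1;
}

console.log(counts); // { noPrerequisite: 2, havePrerequisite: 1 }
```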

mongo aggregation - number of documents where field in one array is also in another one

I have a Movies collection
...
{
...
"cast":[ "First Actor", "Second Actor" ],
"directors":[ "First Director", "Second Director" ]
},
{
...
"cast": [ "Actor Director", "First Actor" ],
"directors": [ "Actor Director", "Firt Director" ]
}
...
Using the aggregation framework, I need to get the number of documents where at least one value from the directors array is also in the cast array. How could I achieve this?
You can use $setIntersection to find the entries common to both arrays, then keep only the documents where the $size of the result is greater than 0 (meaning at least one element is common to both arrays), and finally use a $count stage to count the documents that match this condition.
-- EDIT : Add $addFields stage in case of no array present for cast or directors
If any document doesn't contain the cast or directors array, you will get an error, because $size expects an array and receives a null value.
To avoid this, add an $addFields stage that substitutes an empty array for a missing cast or directors field.
Here's the query:
db.collection.aggregate([
  {
    $addFields: {
      directors: {
        $cond: { if: { $isArray: "$directors" }, then: "$directors", else: [] }
      },
      cast: {
        $cond: { if: { $isArray: "$cast" }, then: "$cast", else: [] }
      }
    }
  },
  {
    $match: {
      $expr: {
        $gt: [ { $size: { $setIntersection: [ "$cast", "$directors" ] } }, 0 ]
      }
    }
  },
  {
    $count: "have_common_value"
  }
])
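For anyone who wants to sanity-check the expected count outside the database, the same three steps (default to an empty array, intersect, count) can be mirrored in plain JavaScript. This is only a sketch: `asArray` and `hasCommonValue` are hypothetical helper names, and the third document is a made-up case with no directors array.

```javascript
const movies = [
  { cast: ["First Actor", "Second Actor"], directors: ["First Director", "Second Director"] },
  { cast: ["Actor Director", "First Actor"], directors: ["Actor Director", "Firt Director"] },
  { cast: ["Solo Actor"] }, // hypothetical document without a directors array
];

// Mirrors the $addFields stage: anything that is not an array becomes [].
function asArray(value) {
  return Array.isArray(value) ? value : [];
}

// Mirrors $size of $setIntersection being greater than 0.
function hasCommonValue(doc) {
  const cast = new Set(asArray(doc.cast));
  return asArray(doc.directors).some((name) => cast.has(name));
}

// Mirrors the $count stage.
const haveCommonValue = movies.filter(hasCommonValue).length;
console.log({ have_common_value: haveCommonValue }); // { have_common_value: 1 }
```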

Find Index of first Matching Element $gte with $indexOfArray

MongoDB has $indexOfArray to let you find the element's array index, for example:
$indexOfArray: ["$article.date", ISODate("2019-03-29")]
Is it possible to use comparison operators with $indexOfArray together, like:
$indexOfArray: ["$article.date", {$gte: ISODate("2019-03-29")}]
No, it's not possible with $indexOfArray, as that will only look for an equality match to the expression given as the second argument.
Instead you can make a construct like this:
db.data.insertOne({
"_id" : ObjectId("5ca01e301a97dd8b468b3f55"),
"array" : [
ISODate("2018-03-01T00:00:00Z"),
ISODate("2018-03-02T00:00:00Z"),
ISODate("2018-03-03T00:00:00Z")
]
})
db.data.aggregate([
  { "$addFields": {
    "matchedIndex": {
      "$let": {
        "vars": {
          "matched": {
            "$arrayElemAt": [
              { "$filter": {
                "input": {
                  "$zip": {
                    "inputs": [ "$array", { "$range": [ 0, { "$size": "$array" } ] }]
                  }
                },
                "cond": { "$gte": [ { "$arrayElemAt": ["$$this", 0] }, new Date("2018-03-02") ] }
              }},
              0
            ]
          }
        },
        "in": {
          "$arrayElemAt": [{ "$ifNull": [ "$$matched", [0,-1] ] },1]
        }
      }
    }
  }}
])
Which would return for the $gte of Date("2018-03-02"):
{
"_id" : ObjectId("5ca01e301a97dd8b468b3f55"),
"array" : [
ISODate("2018-03-01T00:00:00Z"),
ISODate("2018-03-02T00:00:00Z"),
ISODate("2018-03-03T00:00:00Z")
],
"matchedIndex" : 1
}
Or -1 where the condition was not met in order to be consistent with $indexOfArray.
The basic premise is to use $zip to "pair" each element with its array index position, where the positions are generated from $range and the $size of the array. This can be fed to a $filter condition which will return ALL elements matching the supplied condition. Here the first element of the "pair" (the original array content), obtained via $arrayElemAt, is tested against the specified condition using $gte:
{ "$gte": [ { "$arrayElemAt": ["$$this", 0] }, new Date("2018-03-02") ] }
The $filter will return either ALL matching elements (for $gte on a sorted array, the first match and everything after it) or an empty array where nothing was found. Consistent with $indexOfArray you only want the first match, which is done with another wrapping $arrayElemAt on the output for the 0 position.
Since the result could be an omitted value (which is what happens with $arrayElemAt: [[], 0]), you use $ifNull to test the result and pass back a two-element array with -1 as the second element when the output is not defined. In either case, that "paired" array has its second element (index 1) extracted, again via $arrayElemAt, to get the first matched index for the condition.
Of course since you want to refer to that whole expression, it just reads a little cleaner in the end within a $let, but that is optional as you can "inline" with the $ifNull if wanted.
So it is possible, it's just a little more involved than placing a range expression inside of $indexOfArray.
Note that any expression which actually returns a single value for equality match is just fine. But since operators like $gte return a boolean, then that would not be equal to any value in the array, and thus the sort of processing with $filter and then extraction is what you require.
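The zip, filter, first-match and fallback steps described above can be followed one by one in plain JavaScript. This is just a sketch to show the data flow, with `indexOfFirstMatch` as a hypothetical name; in ordinary JavaScript you would reach for Array.prototype.findIndex instead.

```javascript
function indexOfFirstMatch(array, predicate) {
  // $zip with $range: pair every element with its index position.
  const pairs = array.map((value, index) => [value, index]);
  // $filter: keep the pairs whose first element satisfies the condition.
  const matches = pairs.filter(([value]) => predicate(value));
  // $arrayElemAt ... 0 plus $ifNull: first match, or the [0, -1] fallback pair.
  const matched = matches.length > 0 ? matches[0] : [0, -1];
  // $arrayElemAt ... 1: the index half of the pair.
  return matched[1];
}

const dates = [
  new Date("2018-03-01T00:00:00Z"),
  new Date("2018-03-02T00:00:00Z"),
  new Date("2018-03-03T00:00:00Z"),
];

const threshold = new Date("2018-03-02");
console.log(indexOfFirstMatch(dates, (d) => d >= threshold));              // 1
console.log(indexOfFirstMatch(dates, (d) => d >= new Date("2019-01-01"))); // -1
```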

How to get multiple counts in one query for one field?

So what I'm trying to do is to get multiple counts of one field depending on min and max values.
Collection holds something like
{name:'hi',price:100},
{name:'hi',price:134},
{name:'hi',price:500}
What I want to get is, for example, the count of items with a price between 100-200, 200-300, 300-400 and 400-500.
Is there a way to do this in MongoDB with one query? Is there a way to get the query without knowing the min and max?
You want .aggregate() here with the $cond ternary operator to determine the grouping id within $group:
db.collection.aggregate([
  { "$match": {
    "price": { "$gte": 100, "$lte": 500 }
  }},
  { "$group": {
    "_id": {
      "$cond": [
        { "$lte": [ "$price", 200 ] },
        "100-200",
        { "$cond": [
          { "$lte": [ "$price", 300 ] },
          "200-300",
          { "$cond": [
            { "$lte": [ "$price", 400 ] },
            "300-400",
            "400-500"
          ]}
        ]}
      ]
    },
    "count": { "$sum": 1 }
  }}
])
As a "ternary" if/then/else the $cond will evaluate the expression in the first argument "if" and then either return the second argument "then" where true or the third "else" where false.
The cascading logic means that you "nest" each ternary operation inside the false assertion till you reach an eventual result.
With the grouping _id value provided by conditions, you then just use $sum with an argument of 1 to "count" the matches in the group.
This gives you a response on the sample as:
{ "_id": "100-200", "count": 2 }
{ "_id": "400-500", "count": 1 }
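If the ranges are likely to change, the cascading $cond in the pipeline above does not have to be nested by hand. The sketch below generates the same expression from a list of upper bounds; `buildRangeCond` is a hypothetical helper name, and the bucket list simply restates the boundaries used in the answer.

```javascript
// Hypothetical helper: folds the buckets from the right so that each $cond
// lands in the "else" slot of the one before it.
function buildRangeCond(field, buckets, fallbackLabel) {
  return buckets.reduceRight(
    (elseBranch, { upTo, label }) => ({
      $cond: [{ $lte: [field, upTo] }, label, elseBranch],
    }),
    fallbackLabel
  );
}

const groupId = buildRangeCond(
  "$price",
  [
    { upTo: 200, label: "100-200" },
    { upTo: 300, label: "200-300" },
    { upTo: 400, label: "300-400" },
  ],
  "400-500"
);

console.log(JSON.stringify(groupId, null, 2));
```

The resulting object can be used as the "_id" value of the $group stage.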
The $match is making sure that all results are in the "ranges" that will be tested. If you exclude that then you likely want a last $cond "else" condition to return another value if the "price" was outside of an expected "range".
If you are looking to return "each" range, then you are better off inspecting the result and inserting a 0 count entry for every range that is not returned.
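A minimal sketch of that post-processing step in client-side JavaScript, assuming the aggregation result has already been read into an array. `fillMissingRanges` and `expectedRanges` are hypothetical names, and the `aggregated` array is the sample response shown above.

```javascript
const expectedRanges = ["100-200", "200-300", "300-400", "400-500"];

// Returns one entry per expected range, using 0 where the server sent nothing.
function fillMissingRanges(results, ranges) {
  const countsById = new Map(results.map((r) => [r._id, r.count]));
  return ranges.map((range) => ({
    _id: range,
    count: countsById.get(range) || 0,
  }));
}

const aggregated = [
  { _id: "100-200", count: 2 },
  { _id: "400-500", count: 1 },
];

const filled = fillMissingRanges(aggregated, expectedRanges);
console.log(filled);
// [ { _id: '100-200', count: 2 }, { _id: '200-300', count: 0 },
//   { _id: '300-400', count: 0 }, { _id: '400-500', count: 1 } ]
```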