Find Index of first Matching Element $gte with $indexOfArray - mongodb

MongoDB has $indexOfArray to let you find the element's array index, for example:
$indexOfArray: ["$article.date", ISODate("2019-03-29")]
Is it possible to use comparison operators with $indexOfArray together, like:
$indexOfArray: ["$article.date", {$gte: ISODate("2019-03-29")}]

No, it's not possible with $indexOfArray, as that will only look for an equality match to an expression as the second argument.
Instead you can make a construct like this:
db.data.insertOne({
"_id" : ObjectId("5ca01e301a97dd8b468b3f55"),
"array" : [
ISODate("2018-03-01T00:00:00Z"),
ISODate("2018-03-02T00:00:00Z"),
ISODate("2018-03-03T00:00:00Z")
]
})
db.data.aggregate([
{ "$addFields": {
"matchedIndex": {
"$let": {
"vars": {
"matched": {
"$arrayElemAt": [
{ "$filter": {
"input": {
"$zip": {
"inputs": [ "$array", { "$range": [ 0, { "$size": "$array" } ] }]
}
},
"cond": { "$gte": [ { "$arrayElemAt": ["$$this", 0] }, new Date("2018-03-02") ] }
}},
0
]
}
},
"in": {
"$arrayElemAt": [{ "$ifNull": [ "$$matched", [0,-1] ] },1]
}
}
}
}}
])
Which would return for the $gte of Date("2018-03-02"):
{
"_id" : ObjectId("5ca01e301a97dd8b468b3f55"),
"array" : [
ISODate("2018-03-01T00:00:00Z"),
ISODate("2018-03-02T00:00:00Z"),
ISODate("2018-03-03T00:00:00Z")
],
"matchedIndex" : 1
}
Or -1 where the condition was not met in order to be consistent with $indexOfArray.
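For example, re-running the same pipeline with new Date("2018-04-01") as the comparison value should produce "matchedIndex" : -1 for the sample document, since no array element satisfies the condition.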
The basic premise is using $zip to "pair" the array values with their index positions, which are generated from $range and the $size of the array. This can be fed to a $filter condition, which returns ALL elements matching the supplied condition. Here the condition tests the first element of each "pair" ( being the original array content ), extracted via $arrayElemAt, against the specified $gte comparison:
{ "$gte": [ { "$arrayElemAt": ["$$this", 0] }, new Date("2018-03-02") ] }
The $filter will return either ALL matching elements ( in the case of $gte here, everything from the first matching date onward ) or an empty array where nothing was found. Consistent with $indexOfArray you only want the first match, which is obtained with another wrapping $arrayElemAt on the output, taking position 0.
Since the result could be an omitted value ( which is what $arrayElemAt: [[], 0] produces ), you use $ifNull to test the result and pass back a two-element array with -1 as the second element in the case where the output was undefined. In either case that "paired" array has its second element ( index 1 ) extracted again via $arrayElemAt in order to get the first matched index of the condition.
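To make that concrete with the sample document, the $zip stage pairs each value with its index, producing roughly:
[ [ ISODate("2018-03-01T00:00:00Z"), 0 ], [ ISODate("2018-03-02T00:00:00Z"), 1 ], [ ISODate("2018-03-03T00:00:00Z"), 2 ] ]
The $filter with the $gte of new Date("2018-03-02") keeps the last two pairs, the outer $arrayElemAt takes the first of those, [ ISODate("2018-03-02T00:00:00Z"), 1 ], and extracting its second element gives the matchedIndex of 1.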
Of course since you want to refer to that whole expression, it just reads a little cleaner in the end within a $let, but that is optional as you can "inline" with the $ifNull if wanted.
So it is possible, it's just a little more involved than placing a range expression inside of $indexOfArray.
Note that any expression which actually returns a single value for an equality match is just fine. But since operators like $gte return a boolean, that boolean would not be equal to any value in the array, and thus the sort of processing with $filter and then extraction is what you require.
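As a rough illustration of that point ( the operands here are hypothetical, since $gte needs two expressions to form a valid aggregation expression ), something like { "$indexOfArray": [ "$array", { "$gte": [ ISODate("2019-03-29"), ISODate("2019-01-01") ] } ] } evaluates its second argument to the boolean true and then searches the array of dates for the value true, returning -1 rather than applying the comparison to each element.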

Related

if mongodb match inside aggregation returns nothing, how to make a new query?

I use match to select some documents from the collection, and then output all other documents except those found.
If match doesn't find any documents, then I need to display all available documents from the collection.
How can this be done?
Without an example I don't know if I've understood correctly, but you can try this aggregation query (or add these aggregation stages to your query).
The idea is to use $facet to create two branches:
First branch: match the value
Second branch: get everything
Then use $project to output one of these branches using $cond and $size.
In the $project, if the size of the array returned by the "exists" branch is 0 (no results), the output is no_exists (i.e. all values); otherwise it is the exists value.
db.collection.aggregate([
{
"$facet": {
"exists": [
{
"$match": {
// your match
}
}
],
"no_exists": []
}
},
{
"$project": {
"result": {
"$cond": {
"if": {
"$eq": [
{
"$size": "$exists"
},
0
]
},
"then": "$no_exists",
"else": "$exists"
}
}
}
}
])
For example, where the matched value exists only that value is output, and where it does not exist the whole collection is output.
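As a rough sketch with hypothetical documents ( not taken from the original question ): if the collection holds { "_id": 1, "key": "a" } and { "_id": 2, "key": "b" }, a $match of { "key": "a" } makes the $facet stage emit a single document
{ "exists": [ { "_id": 1, "key": "a" } ], "no_exists": [ { "_id": 1, "key": "a" }, { "_id": 2, "key": "b" } ] }
so the $project returns the exists array; with a $match of { "key": "z" } the exists array is empty and the whole no_exists array is returned instead.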

Second $project Stage Producing Unexpected Result in MongoDB Aggregation

I am trying to run some aggregation in my MongoDB backend where I calculate a value and then add that calculated value to another value. The first step is working, but the second step produces a value of null, and I'm trying to understand why, and how to fix it.
This is what my aggregation look like:
db.staff.aggregate([
  {
    $project: {
      _id: 1,
      "workloadSatisfactionScore": {
        $cond: [ { $eq: [ "$workload.shiftAvg", 0 ] }, "N/A", { $divide: [ "$workload.shiftAvg", "$workload.weeklyShiftRequest.minimum" ] } ]
      }
    }
  },
  {
    $project: {
      _id: 1,
      totalScore: {
        $sum: [ "$workloadSatisfactionScore", 10 ]
      }
    }
  }
])
Even though the first $project stage produces documents with a numeric result or null for 'workloadSatisfactionScore', after the second $project stage, ALL documents have a value of null for 'totalScore'.
What I should get is whatever the value of 'workloadSatisfactionScore' is, added to 10. But as I say, all I get is null for all documents. What looks incorrect here?
As an example, one particular document in my collection returns a value of 0.9166666666666666 for "workloadSatisfactionScore". So when that is plugged into the second $project stage I'd expect a value of 10.9166666666666666 for 'totalScore'. But, as I say, instead I get null for that document, and all other documents.
It's possible that by the time the second $project stage is reached, workloadSatisfactionScore could be a string, i.e. with the value "N/A", which will result in null when $add or $sum is used with a non-numerical value.
There is no need for the second $project stage; you can add the value within the conditional that handles the non-numerical part, without passing it down the pipeline:
db.staff.aggregate({
"$project": {
"_id": 1,
"totalScore": {
"$cond": [
{ "$eq": [ "$workload.shiftAvg", 0 ] },
"N/A",
{ "$add": [
10,
{ "$divide": [
"$workload.shiftAvg",
"$workload.weeklyShiftRequest.minimum"
] }
] }
]
}
}
})
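As a hypothetical check ( values chosen to reproduce the 0.9166666666666666 score mentioned in the question ), a document with workload.shiftAvg of 11 and workload.weeklyShiftRequest.minimum of 12 should now come out as { "totalScore" : 10.916666666666666 }, while a document with workload.shiftAvg of 0 still yields "N/A".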

MongoDB sum arrays from multiple documents on a per-element basis

I have the following document structure (simplified for this example)
{
_id : ObjectId("sdfsdf"),
result : [1, 3, 5, 7, 9]
},
{
_id : ObjectId("asdref"),
result : [2, 4, 6, 8, 10]
}
I want to get the sum of those result arrays, but not a total sum, instead a new array corresponding to the sum of the original arrays on an element basis, i.e.
result : [3, 7, 11, 15, 19]
I have searched through the myriad questions here and a few come close, but I can't quite get there.
I can get the sum of each array fine
aggregate(
[
{
"$unwind" : "$result"
},
{
"$group": {
"_id": "$_id",
"results" : { "$sum" : "$result"}
}
}
]
)
which gives me
[ { _id: sdfsdf, results: 25 },
{ _id: asdref, results: 30 } ]
but I can't figure out how to get the sum of each element
You can use includeArrayIndex if you have MongoDB 3.2 or newer.
Then you should change the $unwind stage.
Your code should look like this:
.aggregate(
[
{
"$unwind" : { path: "$result", includeArrayIndex: "arrayIndex" }
},
{
"$group": {
"_id": "$arrayIndex",
"results" : { "$sum" : "$result"}
}
},
{
$sort: { "_id": 1}
},
{
"$group":{
"_id": null,
"results":{"$push":"$results"}
}
},
{
"$project": {"_id":0,"results":1}
}
]
)
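With the two sample documents this pipeline should return something like:
{ "results" : [ 3, 7, 11, 15, 19 ] }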
There is an alternate approach to this, though mileage may vary on how practical it is. It involves using $push to create an "array of arrays" and then applying $reduce, as introduced in MongoDB 3.4, to $sum those array elements into a single array result:
db.collection.aggregate([
{ "$group": {
"_id": null,
"result": { "$push": "$result" }
}},
{ "$addFields": {
"result": {
"$reduce": {
"input": "$result",
"initialValue": [],
"in": {
"$map": {
"input": {
"$zip": {
"inputs": [ "$$this", "$$value" ],
"useLongestLength": true
}
},
"as": "el",
"in": { "$sum": "$$el" }
}
}
}
}
}}
])
The real trick there is that in the "input" to $map we use the $zip operation, which creates a transposed "pairwise" list from the two array inputs.
In the first iteration this takes the empty array supplied to $reduce and returns the "zipped" output against the first array found, as in:
[ [0,1], [0,3], [0,5], [0,7], [0,9] ]
So the useLongestLength option substitutes the empty array with 0 values out to the length of the current array and "zips" them together as above.
Processing with $map, each element is subject to $sum which "reduces" the returned results as:
[ 1, 3, 5, 7, 9 ]
On the second iteration, the next entry in the "array of arrays" would be picked up and processed by $zip along with the previous "reduced" content as:
[ [1,2], [3,4], [5,6], [7,8], [9,10] ]
Which is then subject to the $map for each element using $sum again to produce:
[ 3, 7, 11, 15, 19 ]
And since there were only two arrays pushed into the "array of arrays" that is the end of the operation, and the final result. But otherwise the $reduce would keep iterating until all array elements of the input were processed.
So in some cases this would be the more performant option and what you should be using. But note that, particularly when using null for $group, you are asking "every" document to $push content into an array for the result.
This could be a cause of breaking the BSON Limit in extreme cases, and therefore when aggregating positional array content over large results, it is probably best to use $unwind with the includeArrayIndex option instead.
Or indeed take a good look at the process: if the "positional array" in question is actually the result of some other aggregation operation, then you should rather look at the previous pipeline stages that were used to create it. And if you want those positions "aggregated further" into new totals, then you should in fact do that "before" the positional result is obtained.

Check last element in array matches a condition

I have an array of numbers in my mongodb documents and need to check if the last number in that array meets my conditions.
My documents are stored like this:
{
name: String,
data: {
dates: Array,
numbers: Array
}
}
and I need to check if the last number in numbers "lies between" two other numbers.
Any suggestions on how to do this would be appreciated.
Right now the most efficient way you have of doing this is using the JavaScript evaluation of $where, as you can simply find the value of the last array element and test it programmatically.
With sample documents:
{ "a": [1,2,3] },
{ "a": [1,2,4] },
{ "a": [1,2,5] }
And to query:
db.collection.find(function() { var a = this.a.pop(); return ( a > 2 ) && ( a < 5 ) })
Or simply presented with $where as a string for evaluation:
Model.find(
{
"$where": "var a = this.a.pop(); return ( a > 2 ) && ( a < 5 )"
},
function(err,results) {
// handling here
}
);
Which is a really simple way to do this and does not have "overhead" such as $unwind in the aggregation framework, created to "denormalize" and process arrays. Not really efficient there.
In the "future" however, it will be. As is currently available in development releases, there is a $slice operator for the aggregation framework. This operator will allow easy access to the "last" array element for testing.
Since the aggregation framework operators are in "native code" and not JavaScript to be interpreted, a single pipeline stage becomes more efficient than the JavaScript form, though the listing to do this looks longer in submission:
db.collection.aggregate([
{ "$redact": {
"$cond": {
"if": {
"$anyElementTrue": {
"$map": {
"input": { "$slice": ["$a",-1] },
"as": "el",
"in":{
"$and": [
{ "$gt": [ "$$el", 2 ] },
{ "$lt": [ "$$el", 5 ] }
]
}
}
}
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
The $redact operator that already exists is used to "logically filter" with a comparison expression here. Based on the true/false match conditions it either "keeps" or "prunes" the document from the results respectively.
The $slice operator itself, in its aggregation framework form, will still ultimately return an array, albeit a single-element array in this case. This is why $map is used to "transform" each element into a true/false condition and the $anyElementTrue operator reduces the "array" to a singular response, as is expected by $cond.
So when that is released, it will be the most efficient way to do this. But until then, stick with the JavaScript as it is presently the fastest way to do this evaluation.
Both query forms return just the first two documents of the sample here:
{ "a": [1,2,3] },
{ "a": [1,2,4] }
MongoDB aggregation may be a feasible way, assuming the name field in your document is unique.
Say you have the following sample document:
{
name: "allen",
data: {
dates: ["2015-08-08"],
numbers: [20, 21, 22, 23]
}
}
The following code is used to do the check. Since the db.collection.aggregate() method returns a cursor, we can use the cursor's hasNext() to decide whether the last number lies between the two given numbers.
var result = db.last_one.aggregate(
[
{
// deconstruct the array field numbers
$unwind: "$data.numbers"
},
{
$group: {
_id: "$name",
// lastNumber is 23 in this case
lastNumber: { $last: "$data.numbers" }
}
},
{
$match: {
lastNumber: { $gt: num1, $lt: num2 }
}
}
]
).hasNext()
if (result) print("matched"); else print("not matched")
For example, if num1 is 22, num2 is 24, the result is matched; if num1 is 21, num2 is 22, the result is not matched.
But actually, grouping on name is not a good idea. It's much better to group on the document's unique _id instead, as sketched below.
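A minimal sketch of that variation ( assuming the same document shape and the same num1/num2 values as above ), grouping on the unique _id instead of name:
var result = db.last_one.aggregate(
  [
    {
      // deconstruct the array field numbers
      $unwind: "$data.numbers"
    },
    {
      $group: {
        // group on the document's unique _id rather than name
        _id: "$_id",
        // keep the last number in the array per document
        lastNumber: { $last: "$data.numbers" }
      }
    },
    {
      // test whether the last number lies between num1 and num2
      $match: {
        lastNumber: { $gt: num1, $lt: num2 }
      }
    }
  ]
).hasNext()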

How to search embedded array

I want to get all matching values, using $elemMatch.
// create test data
db.foo.insert({values:[0,1,2,3,4,5,6,7,8,9]})
db.foo.find({},{
'values':{
'$elemMatch':{
'$gt':3
}
}
}) ;
My expected result is {values:[3,4,5,6,7,8,9]}, but the actual result is {values:[4]}.
I read the MongoDB documentation, and I understand this is by design.
How do I search for multiple values?
Additionally, I use 'skip' and 'limit'.
Any ideas?
Using Aggregation:
db.foo.aggregate([
{$unwind:"$values"},
{$match:{"values":{$gt:3}}},
{$group:{"_id":"$_id","values":{$push:"$values"}}}
])
You can add further filter conditions in the $match, if you would like to.
You can't achieve this using the $elemMatch operator since, as the MongoDB docs say:
The $elemMatch projection operator limits the contents of an array
field that is included in the query results to contain only the array
element that matches the $elemMatch condition.
Note
The elements of the array are documents.
If you look carefully at the documentation on $elemMatch or the counterpart to query of the positional $ operator then you would see that only the "first" matched element is returned by this type of "projection".
What you are looking for is actually "manipulation" of the document contents where you want to "filter" the content of the array in the document rather than return the original or "matched" element, as there can be only one match.
For true "filtering" you need the aggregation framework, as there is more support there for document manipulation:
db.foo.aggregate([
// No point selecting documents that do not match your condition
{ "$match": { "values": { "$gt": 3 } } },
// Unwind the array to de-normalize as documents
{ "$unwind": "$values" },
// Match to "filter" the array
{ "$match": { "values": { "$gt": 3 } } },
// Group by to the array form
{ "$group": {
"_id": "$_id",
"values": { "$push": "$values" }
}}
])
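With the sample document inserted above this should return something like:
{ "_id" : ObjectId("..."), "values" : [ 4, 5, 6, 7, 8, 9 ] }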
Or with modern versions of MongoDB from 2.6 and onwards, where the array values are "unique" you could do this:
db.foo.aggregate([
{ "$project": {
"values": {
"$setDifference": [
{ "$map": {
"input": "$values",
"as": "el",
"in": {
"$cond": [
{ "$gt": [ "$$el", 3 ] },
"$$el",
false
]
}
}},
[false]
]
}
}}
])
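For the same sample document this should likewise yield the values greater than 3, e.g. { "values" : [ 4, 5, 6, 7, 8, 9 ] }, though note that $setDifference is a set operator and the order of its output elements is not guaranteed.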