MongoDB $filter nested array by date does not work - mongodb

I have a document with a nested array which looks like this:
[
{
"id": 1,
data: [
[
ISODate("2000-01-01T00:00:00Z"),
2,
3
],
[
ISODate("2000-01-03T00:00:00Z"),
2,
3
],
[
ISODate("2000-01-05T00:00:00Z"),
2,
3
]
]
},
{
"id": 2,
data: []
}
]
As you can see, we have an array of arrays. For each element in the data array, the first element is a date.
I want to create an aggregation pipeline that keeps only the elements of data whose date is later than a given date.
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$project": {
"data": {
"$filter": {
"input": "$data",
"as": "entry",
"cond": {
"$gt": [
"$$entry.0",
ISODate("2000-01-04T00:00:00Z")
]
}
}
}
}
}
])
The problem is that with $gt, this just returns an empty array for data. With $lt this returns all elements. So the filtering clearly does not work.
Expected result:
[
{
"id": 1,
"data": [
[
ISODate("2000-01-05T00:00:00Z"),
2,
3
]
]
}
]
Any ideas?

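I believe the issue is that $$entry.0 does not index into the array. In an aggregation expression the part after the dot is treated as a field name, so MongoDB looks for a field called "0" inside the elements of entry. Those elements are dates and numbers rather than documents, so the path resolves to an empty array. An array sorts before a date in the BSON comparison order, which is why $gt keeps nothing and $lt keeps everything. You could use the $first array operator (available from MongoDB 4.4) to get the first element, like so: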
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$project": {
"data": {
"$filter": {
"input": "$data",
"as": "entry",
"cond": {
"$gt": [
{
$first: "$$entry"
},
ISODate("2000-01-04T00:00:00Z")
]
}
}
}
}
}
])

I don't think $$entry.0 works for getting the first element of the array. Instead, use the $arrayElemAt operator.
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$project": {
"data": {
"$filter": {
"input": "$data",
"as": "entry",
"cond": {
"$gt": [
{
"$arrayElemAt": [
"$$entry",
0
]
},
ISODate("2000-01-04T00:00:00Z")
]
}
}
}
}
}
])

To specify which element of the array you are comparing, use $arrayElemAt instead of $$entry.0. $arrayElemAt takes two arguments: the array, which in your case is $$entry, and the index, which in your case is 0.
This is the solution I came up with:
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$project": {
"data": {
"$filter": {
"input": "$data",
"as": "entry",
"cond": {
"$gt": [
{
"$arrayElemAt": [
"$$entry",
0
]
},
ISODate("2000-01-04T00:00:00Z")
]
}
}
}
}
}
])
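To check the expected result without a database, here is a minimal sketch in plain JavaScript. The helper names buildDataAfterPipeline and filterAfter are made up for illustration, and Date objects stand in for the ISODate values. The builder returns the same $arrayElemAt pipeline as above with the cutoff passed in as a parameter, and filterAfter is only an in-memory analogue of the $filter stage, not MongoDB itself.

```javascript
// Hypothetical helper: builds the $arrayElemAt-based pipeline for a given id and cutoff date.
function buildDataAfterPipeline(id, cutoff) {
  return [
    { $match: { id: id } },
    {
      $project: {
        data: {
          $filter: {
            input: "$data",
            as: "entry",
            // Compare the first element of each inner array with the cutoff.
            cond: { $gt: [{ $arrayElemAt: ["$$entry", 0] }, cutoff] },
          },
        },
      },
    },
  ];
}

// Hypothetical in-memory analogue of the $filter stage, used to check the expected output.
function filterAfter(data, cutoff) {
  return data.filter((entry) => entry[0] > cutoff);
}

const sampleData = [
  [new Date("2000-01-01T00:00:00Z"), 2, 3],
  [new Date("2000-01-03T00:00:00Z"), 2, 3],
  [new Date("2000-01-05T00:00:00Z"), 2, 3],
];
const cutoff = new Date("2000-01-04T00:00:00Z");

const pipeline = buildDataAfterPipeline(1, cutoff);
const kept = filterAfter(sampleData, cutoff);
console.log(kept.length); // 1
console.log(kept[0][0].toISOString()); // 2000-01-05T00:00:00.000Z
```

With a real connection, the array returned by buildDataAfterPipeline(1, cutoff) would be passed to db.collection.aggregate(...).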

Related

Can I get the count of subdocuments that match a filter?

I have the following document
[
{
"_id": "624713340a3d2901f2f5a9c0",
"username": "fotis",
"exercises": [
{
"_id": "624713530a3d2901f2f5a9c3",
"description": "Sitting",
"duration": 60,
"date": "2022-03-24T00:00:00.000Z"
},
{
"_id": "6247136a0a3d2901f2f5a9c6",
"description": "Coding",
"duration": 999,
"date": "2022-03-31T00:00:00.000Z"
},
{
"_id": "624713a00a3d2901f2f5a9ca",
"description": "Sitting",
"duration": 999,
"date": "2022-03-30T00:00:00.000Z"
}
],
"__v": 3
}
]
And I am trying to get the count of exercises returned with the following aggregation (I know it is way easier to do it in my code, but I am trying to understand how to use mongodb queries)
db.collection.aggregate([
{
"$match": {
"_id": "624713340a3d2901f2f5a9c0"
}
},
{
"$project": {
"username": 1,
"exercises": {
"$slice": [
{
"$filter": {
"input": "$exercises",
"as": "exercise",
"cond": {
"$eq": [
"$$exercise.description",
"Sitting"
]
}
}
},
1
]
},
"count": {
"$size": "exercises"
}
}
}
])
When I try to access the exercises field using "$size": "exercises", I get an error query failed: (Location17124) Failed to optimize pipeline :: caused by :: The argument to $size must be an array, but was of type: string.
But when I access the subdocument exercises using "$size": "$exercises" I get the count of all the subdocuments contained in the document.
Note: I know that in this example I use $slice and I set the limit to 1, but in my code it is a variable.
You are actually on the right track. You don't really need the $slice; $filter alone performs the filtering. The count does not work because the filtering and the $size are in the same stage: every expression in a $project reads the document as it entered the stage, so "$exercises" there still refers to the unfiltered array. You can resolve this by computing the count in a following $addFields stage.
db.collection.aggregate([
{
"$match": {
"_id": "624713340a3d2901f2f5a9c0"
}
},
{
"$project": {
"username": 1,
"exercises": {
"$filter": {
"input": "$exercises",
"as": "exercise",
"cond": {
"$eq": [
"$$exercise.description",
"Sitting"
]
}
}
}
}
},
{
"$addFields": {
"count": {
$size: "$exercises"
}
}
}
])
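The difference between one stage and two can be sketched in plain JavaScript. This is only an in-memory illustration under the assumption that each stage reads the document handed to it; the function names oneStage and twoStages are invented here, and the sample document is trimmed to the fields the example needs.

```javascript
// Trimmed-down copy of the sample document (only the fields the example needs).
const doc = {
  username: "fotis",
  exercises: [
    { description: "Sitting", duration: 60 },
    { description: "Coding", duration: 999 },
    { description: "Sitting", duration: 999 },
  ],
};

const isSitting = (exercise) => exercise.description === "Sitting";

// One stage: both fields are computed from the incoming document,
// so the count sees the unfiltered array.
function oneStage(input) {
  return {
    username: input.username,
    exercises: input.exercises.filter(isSitting),
    count: input.exercises.length,
  };
}

// Two stages: the second stage reads the output of the first,
// so the count sees the filtered array.
function twoStages(input) {
  const projected = {
    username: input.username,
    exercises: input.exercises.filter(isSitting),
  };
  return { ...projected, count: projected.exercises.length };
}

console.log(oneStage(doc).count); // 3
console.log(twoStages(doc).count); // 2
```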

How to filter an array of objects in mongoose by date field only selecting the most recent date

I'm trying to filter through an array of objects in a user collection on MongoDB. The structure of this particular collection looks like this:
name: "John Doe"
email: "john#doe.com"
progress: [
{
_id : ObjectId("610be25ae20ce4872b814b24")
challenge: ObjectId("60f9629edd16a8943d2cab9b")
date_unlocked: 2021-08-05T12:15:32.129+00:00
completed: true
date_completed: 2021-08-06T12:15:32.129+00:00
}
{
_id : ObjectId("611be24ae32ce4772b814b32")
challenge: ObjectId("60g6723efd44a6941l2cab81")
date_unlocked: 2021-08-06T12:15:32.129+00:00
completed: true
date_completed: 2021-08-07T12:15:32.129+00:00
}
]
date: 2021-08-04T13:06:34.129+00:00
How can I query the database using mongoose to return only the challenge with the most recent 'date_unlocked'?
I have tried: User.findById(req.user.id).select('progress.challenge progress.date_unlocked').sort({'progress.date_unlocked': -1}).limit(1);
but instead of returning a single challenge with the most recent 'date_unlocked', it is returning the whole user progress array.
Any help would be much appreciated, thank you in advance!
You can try this.
db.collection.aggregate([
{
"$unwind": {
"path": "$progress"
}
},
{
"$sort": {
"progress.date_unlocked": -1
}
},
{
"$limit": 1
},
{
"$project": {
"_id": 0,
"latestChallenge": "$progress.challenge"
}
}
])
An alternative solution is to use $reduce on that array.
db.collection.aggregate([
{
"$addFields": {
"latestChallenge": {
"$arrayElemAt": [
{
"$reduce": {
"input": "$progress",
"initialValue": [
"0",
""
],
"in": {
"$let": {
"vars": {
"info": "$$value",
"progress": "$$this"
},
"in": {
"$cond": [
{
"$gt": [
"$$progress.date_unlocked",
{
"$arrayElemAt": [
"$$info",
0
]
}
]
},
[
"$$progress.date_unlocked",
"$$progress.challenge"
],
"$$info"
]
}
}
}
}
},
1
]
}
}
},
{
"$project": {
"_id": 0,
"latestChallenge": 1
}
},
])
Mongoose can run raw aggregation pipelines, so you can pass either of these pipelines to your model's aggregate() method.
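The idea behind the $reduce approach is that the accumulator has to carry the newest date seen so far together with the challenge that belongs to it. A minimal plain JavaScript sketch of that logic, with an invented function name latestChallenge and placeholder challenge ids instead of real ObjectIds:

```javascript
// Hypothetical sample data; the challenge ids are shortened placeholders.
const progress = [
  { challenge: "challenge-A", date_unlocked: new Date("2021-08-05T12:15:32.129Z") },
  { challenge: "challenge-B", date_unlocked: new Date("2021-08-06T12:15:32.129Z") },
];

// Carries [latestDateSoFar, challengeOfThatDate] through the array,
// the same shape as the accumulator in the $reduce pipeline.
function latestChallenge(entries) {
  const [, challenge] = entries.reduce(
    (info, entry) =>
      info[0] === null || entry.date_unlocked > info[0]
        ? [entry.date_unlocked, entry.challenge] // newer date: replace both values
        : info, // otherwise keep what we have
    [null, ""]
  );
  return challenge;
}

console.log(latestChallenge(progress)); // challenge-B
```

Because both the date and the challenge are replaced together, the result does not depend on the order of the array.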

Return first element if no match found in array

I have the following document:
{
_id: 123,
state: "AZ",
products: [
{
product_id: 1,
desc: "P1"
},
{
product_id: 2,
desc: "P2"
}
]
}
I need to write a query to return a single element from the products array where state is "AZ" and product_id is 2. If the matching product_id is not found, then return the first (or any) element from the products array.
For example: If product_id is 2 (match found), then the result should be:
products: [
{
product_id: 2,
desc: "P2"
}
]
If the product_id is 3 (not found), then the result should be:
products: [
{
product_id: 1,
desc: "P1"
}
]
I was able to meet one condition when the match is found but not sure how to satisfy the second condition in the same query:
db.getCollection('test').find({"state": "AZ"}, {_id: 0, state: 0, products: { "$elemMatch": {"product_id": "2"}}})
I tried using the aggregation pipeline as well but could not find a working solution.
Note: This is different from the following question as I need to return a default element if the match is not found:
Retrieve only the queried element in an object array in MongoDB collection
You can try below aggregation
Basically, you $filter the products array and use $cond to check whether the result is empty (equal to []). If it is, you $slice the first element off the products array; otherwise you return the filtered result.
db.collection.aggregate([
{ "$addFields": {
"products": {
"$cond": [
{
"$eq": [
{ "$filter": {
"input": "$products",
"cond": { "$eq": ["$$this.product_id", 2] }
}},
[]
]
},
{ "$slice": ["$products", 1] },
{ "$filter": {
"input": "$products",
"cond": { "$eq": ["$$this.product_id", 2] }
}}
]
}
}}
])
or even using $let aggregation
db.collection.aggregate([
{ "$addFields": {
"products": {
"$let": {
"vars": {
"filt": {
"$filter": {
"input": "$products",
"cond": { "$eq": ["$$this.product_id", 2] }
}
}
},
"in": {
"$cond": [
{ "$eq": ["$$filt", []] },
{ "$slice": ["$products", 1] },
"$$filt"
]
}
}
}
}}
])
If you don't care which element you get back then this is the way to go (you'll get the last element in the array in case of no match since $indexOfArray will return -1):
db.getCollection('test').aggregate([{
$addFields: {
"products": {
$arrayElemAt: [ "$products", { $indexOfArray: [ "$products.product_id", 2 ] } ]
},
}
}])
If you want the first then do this instead ($max will take care of transforming -1 into index 0 which is the first element):
db.getCollection('test').aggregate([{
$addFields: {
"products": {
$arrayElemAt: [ "$products", { $max: [ 0, { $indexOfArray: [ "$products.product_id", 2 ] } ] } ]
},
}
}])
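The index arithmetic can be sketched in plain JavaScript, where findIndex also returns -1 on a miss. The helper name pickProduct is invented for illustration, and this is an in-memory analogue rather than the pipeline itself.

```javascript
const products = [
  { product_id: 1, desc: "P1" },
  { product_id: 2, desc: "P2" },
];

// Hypothetical helper mirroring $arrayElemAt + $max + $indexOfArray.
function pickProduct(items, productId) {
  // findIndex returns -1 when nothing matches, like $indexOfArray.
  const index = items.findIndex((item) => item.product_id === productId);
  // Math.max(0, -1) is 0, so a miss falls back to the first element.
  return items[Math.max(0, index)];
}

console.log(pickProduct(products, 2).desc); // P2
console.log(pickProduct(products, 3).desc); // P1
```

As with $arrayElemAt, the result is a single element rather than a one-element array.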
Here is a version that should work on v3.2 as well:
db.getCollection('test').aggregate([{
"$project": {
"products": {
$slice: [{
$concatArrays: [{
$filter: {
"input": "$products",
"cond": { "$eq": ["$$this.product_id", 2] }
}},
"$products" // simply append the "products" array
// alternatively, you could append only the first or a specific item like this [ { $arrayElemAt: [ "$products", 0 ] } ]
]
},
1 ] // take first element only
}
}
}])
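The concatenate-then-slice trick can be illustrated in plain JavaScript. The function name firstMatchOrFirst is an assumption made for this sketch; it only mirrors the shape of the $filter, $concatArrays and $slice combination.

```javascript
const products = [
  { product_id: 1, desc: "P1" },
  { product_id: 2, desc: "P2" },
];

// Hypothetical helper mirroring $slice over $concatArrays of [$filter result, "$products"].
function firstMatchOrFirst(items, productId) {
  const matches = items.filter((item) => item.product_id === productId);
  // Matches come first; the full array is appended as the fallback.
  return [...matches, ...items].slice(0, 1);
}

console.log(JSON.stringify(firstMatchOrFirst(products, 2))); // [{"product_id":2,"desc":"P2"}]
console.log(JSON.stringify(firstMatchOrFirst(products, 3))); // [{"product_id":1,"desc":"P1"}]
```

Unlike the index-based versions, this one keeps the result as an array, which matches the output format asked for in the question.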

Query specific element of nested array or default

Given documents in the collection test as follows:
{a:1, list:[{lang:"en", value:"Mother"}, {lang:"de", value:"Mutter"}] }
{a:2, list:[{lang:"en", value:"Iddqd"}] }
I would like to build a query that matches list.value for a selected language, but returns any available item of the list if that language is absent. For the example above and the query language de, I need a projection like:
{a:1, label:"Mutter"},
{a:2, label:"Iddqd"} //since no label matched 'de' let's select any available
Server version: MongoDB 3.2+
You need to filter the list, take the first match with the $arrayElemAt operator, and assign it to a variable using the $let operator. If nothing matched, the variable has no value, so $ifNull falls back to a default. In this case, I simply return the value of the first sub-document.
db.coll.aggregate([
{ "$project": {
"a": 1,
"label": {
"$let": {
"vars": {
"values": {
"$arrayElemAt": [
{ "$filter": {
"input": "$list",
"as": "lst",
"cond": { "$eq": [ "$$lst.lang", "de" ] }
}},
0
]
}
},
"in": {
"$ifNull": [
"$$values.value",
{ "$let": {
"vars": {
"default": {
"$arrayElemAt": [ "$list", 0 ]
}
},
"in": "$$default.value"
}}
]
}
}
}
}}
])
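The same fallback logic, written as a plain JavaScript sketch so the expected labels can be checked by hand. The helper name labelFor is invented for illustration, and returning null for an empty list is an assumption of this sketch.

```javascript
const docs = [
  { a: 1, list: [{ lang: "en", value: "Mother" }, { lang: "de", value: "Mutter" }] },
  { a: 2, list: [{ lang: "en", value: "Iddqd" }] },
];

// Hypothetical helper: the value for the requested language, else the first available value.
function labelFor(list, lang) {
  const match = list.find((item) => item.lang === lang);
  // Fall back to the first item when no entry has the requested language.
  const chosen = match !== undefined ? match : list[0];
  return chosen !== undefined ? chosen.value : null;
}

const labels = docs.map((doc) => ({ a: doc.a, label: labelFor(doc.list, "de") }));
console.log(JSON.stringify(labels));
// [{"a":1,"label":"Mutter"},{"a":2,"label":"Iddqd"}]
```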

Mongo Query to Return only a subset of SubDocuments

Using the example from the Mongo docs:
{ _id: 1, results: [ { product: "abc", score: 10 }, { product: "xyz", score: 5 } ] }
{ _id: 2, results: [ { product: "abc", score: 8 }, { product: "xyz", score: 7 } ] }
{ _id: 3, results: [ { product: "abc", score: 7 }, { product: "xyz", score: 8 } ] }
db.survey.find(
{ id: 12345, results: { $elemMatch: { product: "xyz", score: { $gte: 6 } } } }
)
How do I return survey 12345 (regardless of whether it HAS surveys or not) but only return surveys with a score greater than 6? In other words I don't want the document disqualified from the results based on the subdocument, I want the document but only a subset of subdocuments.
What you are asking for is not so much a "query" but is basically just a filtering of content from the array in each document.
You do this with .aggregate() and $project:
db.survey.aggregate([
{ "$project": {
"results": {
"$setDifference": [
{ "$map": {
"input": "$results",
"as": "el",
"in": {
"$cond": [
{ "$and": [
{ "$eq": [ "$$el.product", "xyz" ] },
{ "$gte": [ "$$el.score", 6 ] }
]},
"$$el",
false
]
}
}},
[false]
]
}
}}
])
So rather than "constrain" results to documents that have an array member matching the condition, all this is doing is "filtering" out the array members that do not match the condition, while still returning the document with an empty array if need be.
The fastest present way to do this is with $map to inspect all elements and $setDifference to filter out any values of false returned from that inspection. The possible downside is a "set" must contain unique elements, so this is fine as long as the elements themselves are unique.
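The uniqueness caveat is easy to see in a plain JavaScript sketch of the map-then-set-difference idea. The function name mapThenSetDifference is invented here; the sketch keeps input order for readability, whereas set operators in MongoDB do not promise any particular output order.

```javascript
// Hypothetical in-memory analogue of $map followed by $setDifference with [false].
function mapThenSetDifference(results, predicate) {
  // $map step: keep matching elements, replace the rest with false.
  const mapped = results.map((el) => (predicate(el) ? el : false));
  // Set difference step: drop false and collapse duplicate values.
  const seen = new Set();
  const output = [];
  for (const el of mapped) {
    if (el === false) continue;
    const key = JSON.stringify(el);
    if (!seen.has(key)) {
      seen.add(key);
      output.push(el);
    }
  }
  return output;
}

const isGoodXyz = (el) => el.product === "xyz" && el.score >= 6;

const unique = [{ product: "abc", score: 8 }, { product: "xyz", score: 7 }];
const withDuplicate = [{ product: "xyz", score: 7 }, { product: "xyz", score: 7 }];

console.log(mapThenSetDifference(unique, isGoodXyz).length); // 1
console.log(mapThenSetDifference(withDuplicate, isGoodXyz).length); // 1, not 2
```

The second call shows the downside: two identical matching elements come back as one.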
Future releases will have a $filter method, which is similar to $map in structure, but directly removes non-matching results, whereas $map just returns them (via the $cond, as either the matching element or false), and is therefore better suited.
Otherwise, if the elements are not unique or the MongoDB server version is less than 2.6, you have to do this using $unwind, in a non-performant way:
db.survey.aggregate([
{ "$unwind": "$results" },
{ "$group": {
"_id": "$_id",
"results": { "$push": "$results" },
"matched": {
"$sum": {
"$cond": [
{ "$and": [
{ "$eq": [ "$results.product", "xyz" ] },
{ "$gte": [ "$results.score", 6 ] }
]},
1,
0
]
}
}
}},
{ "$unwind": "$results" },
{ "$match": {
"$or": [
{
"results.product": "xyz",
"results.score": { "$gte": 6 }
},
{ "matched": 0 }
]
}},
{ "$group": {
"_id": "$_id",
"results": { "$push": "$results" },
"matched": { "$first": "$matched" }
}},
{ "$project": {
"results": {
"$cond": [
{ "$ne": [ "$matched", 0 ] },
"$results",
[]
]
}
}}
])
Which is pretty horrible in both design and performance. As such, you are probably better off doing the filtering per document in client code instead.
You can use $filter in MongoDB 3.2:
db.survey.aggregate([{
$match: { id: 12345 }
}, {
$project: {
results: {
$filter: {
input: "$results",
as: "results",
cond:{$gt: ['$$results.score', 6]}
}
}
}
}]);
It will return all the sub-documents that have a score greater than 6. If you want to return only the first matching sub-document, then you can use the '$' (positional) operator.
You can use $redact in this way:
db.survey.aggregate( [
{ $match : { _id : 12345 }},
{ $redact: {
$cond: {
if: {
$or: [
{ $eq: [ "$_id", 12345 ] },
{ $and: [
{ $eq: [ "$product", "xyz" ] },
{ $gte: [ "$score", 6 ] }
]}
]
},
then: "$$DESCEND",
else: "$$PRUNE"
}
}
}
] );
It will $match by _id: 12345 first and then "$$PRUNE" every subdocument that is not product "xyz" with a score greater than or equal to 6. I added the condition { $eq: [ "$_id", 12345 ] } to the $cond so that it wouldn't prune the whole document before it reaches the subdocuments.
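To make the level-by-level evaluation easier to follow, here is a simplified plain JavaScript stand-in for $redact. It is a sketch under stated assumptions, not how the server implements it: the names redact, decide, DESCEND and PRUNE are invented, only the keep-and-descend and prune outcomes are modelled, and the sample survey document is hypothetical.

```javascript
const DESCEND = "DESCEND";
const PRUNE = "PRUNE";

function isPlainObject(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    !Array.isArray(value) &&
    Object.getPrototypeOf(value) === Object.prototype
  );
}

// Simplified stand-in for $redact: decide(doc) is asked at every document level.
function redact(doc, decide) {
  if (decide(doc) === PRUNE) return undefined; // drop this level entirely
  const result = {};
  for (const [key, value] of Object.entries(doc)) {
    if (Array.isArray(value)) {
      // Descend into embedded documents inside arrays and drop the pruned ones.
      result[key] = value
        .map((item) => (isPlainObject(item) ? redact(item, decide) : item))
        .filter((item) => item !== undefined);
    } else if (isPlainObject(value)) {
      const kept = redact(value, decide);
      if (kept !== undefined) result[key] = kept;
    } else {
      result[key] = value;
    }
  }
  return result;
}

// Same condition as the $cond above: keep the top-level document by _id,
// keep subdocuments only when product is "xyz" and score is at least 6.
const decide = (doc) =>
  doc._id === 12345 || (doc.product === "xyz" && doc.score >= 6) ? DESCEND : PRUNE;

// Hypothetical survey document for illustration.
const survey = {
  _id: 12345,
  results: [
    { product: "abc", score: 10 },
    { product: "xyz", score: 5 },
    { product: "xyz", score: 7 },
  ],
};

console.log(JSON.stringify(redact(survey, decide)));
// {"_id":12345,"results":[{"product":"xyz","score":7}]}
```

Without the _id check in decide, the top-level document would itself be pruned, because it has no product or score fields.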