mongodb using projection in $filter - mongodb

I have inserted a sample document
db.test.insert({
x:1,
a:[
{b:1,c:1,d:1},
{b:2,c:2}
]
})
I am facing 2 problems when I try to use the $filter aggregation operator, as in the query below:
db.test.aggregate(
{$project:{
a:{$filter:{
input : '$a',
as : 'item',
cond : '$$item.d'
}}
}}
)
Element Existence
1] How do I test for the existence of element a.d? I found a way using just cond : '$$item.d', but I think there should be a better way of doing it.
Selective Projection
2] How do I selectively project the b and d nodes?
I tried the code below and it works, but I think projections can be pipelined as well. Therefore I applied projection twice on the same node: first to filter the array elements, then to select the nodes of each array element.
db.test.aggregate(
{$project:{
a:{$filter:{
input : '$a',
as : 'item',
cond : '$$item.d'
}},
a:{b:1, d:1}
}}
)
I seem to get the solution, but I think there may be a better way. Thanks for reply!

(1) It appears to me that the $exists operator is not yet available in aggregation pipelines. You may wish to check whether there is a JIRA ticket requesting this; if so, watch it and vote for it, and if not, add one.
Your workaround, I believe, will only return elements where item.d evaluates to true, rather than elements where it exists. So if item.d is null, false, or 0, the element will not be returned. I would suggest trying this instead:
cond : { $gte : [ '$$item.d', null ] }
(2) I'm not 100% sure I understood the question, but if I do, I think the way to do it is to have two $project's in the pipeline. So something like this:
db.test.aggregate(
[ { $project:
{a:{$filter:{input:'$a',as:'item',cond:{$gte:['$$item.d',null]}}}}
},
{ $project: { a : { b : 1, d : 1 } } }
]
)
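The difference between the truthiness test and the existence test described in (1) can be illustrated outside MongoDB. The following is a plain JavaScript sketch, not MongoDB's implementation: keepTruthy and keepExisting are hypothetical helper names, and the third array element (with d: 0) is an assumed sample added only to show where the two conditions diverge.

```javascript
// Sample array modelled on the question's document, plus one assumed
// element whose d is 0 (present, but falsy).
const a = [
  { b: 1, c: 1, d: 1 },
  { b: 2, c: 2 },
  { b: 3, c: 3, d: 0 },
];

// Models cond: '$$item.d' for this sample: keeps an element only when
// d evaluates to true.
const keepTruthy = (items) => items.filter((item) => Boolean(item.d));

// Models cond: { $gte: ['$$item.d', null] } as described in the answer:
// keeps an element whenever the field d is present, whatever its value.
const keepExisting = (items) => items.filter((item) => "d" in item);

console.log(keepTruthy(a).map((item) => item.b));   // [ 1 ]
console.log(keepExisting(a).map((item) => item.b)); // [ 1, 3 ]
```

The element with d: 0 is dropped by the first condition but kept by the second, which is the case the answer warns about.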

Related

Is there a way to prevent mongo queries "branching" on arrays?

If I have the following documents:
{a: {x:1}} // without array
{a: [{x:1}]} // with array
Is there a way to query for {'a.x':1} that will return the first one but not the second one? IE, I want the document where a.x is 1, and a is not an array.
Please note that a future version of MongoDB will incorporate the $isArray aggregation expression. In the meantime...
...the following code will do the trick as the $elemMatch operator matches only documents having an array field:
> db.test.find({"a.x": 1, "a": {$not: {$elemMatch: {x:1}}}})
Given that dataset:
> db.test.find({},{_id:0})
{ "a" : { "x" : 1 } }
{ "a" : [ { "x" : 1 } ] }
{ "a" : [ { "x" : 0 }, { "x" : 1 } ] }
It will return:
> db.test.find({"a.x": 1, "a": {$not: {$elemMatch: {x:1}}}}, {_id:0})
{ "a" : { "x" : 1 } }
Please note this should be considered a short-term solution. The MongoDB team took great care to ensure that [{x:1}] and {x:1} behave the same (see dot-notation or $type for arrays). So you should consider that at some point in the future, $elemMatch might be updated (see JIRA issue SERVER-6050). In the meantime, it may be worth fixing your data model so that it is no longer necessary to distinguish between an array containing one subdocument and a bare subdocument.
You can do this by adding a second term that ensures a has no elements. That second term will always be true when a is a plain subdoc, and always false when a is an array (as otherwise the first term wouldn't have matched).
db.test.find({'a.x': 1, 'a.0': {$exists: false}})
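To see why the 'a.0' term separates the two shapes, here is a simplified plain JavaScript model of how a dotted path branches into arrays. It is a sketch under stated assumptions, not MongoDB's matcher: valuesAt and matches are hypothetical names, and the model ignores cases such as a terminal value that is itself an array.

```javascript
// Collect every value reachable from `value` along the dotted path `parts`,
// branching into array elements the way dot notation does in queries.
function valuesAt(value, parts) {
  if (parts.length === 0) return [value];
  if (value === null || typeof value !== "object") return [];
  const [head, ...rest] = parts;
  if (Array.isArray(value)) {
    let found = [];
    // A numeric path component can address an array position, e.g. 'a.0'.
    if (/^\d+$/.test(head) && Number(head) < value.length) {
      found = found.concat(valuesAt(value[Number(head)], rest));
    }
    // The same path is also tried inside every subdocument element.
    for (const element of value) {
      if (element !== null && typeof element === "object" && !Array.isArray(element)) {
        found = found.concat(valuesAt(element, parts));
      }
    }
    return found;
  }
  return Object.prototype.hasOwnProperty.call(value, head)
    ? valuesAt(value[head], rest)
    : [];
}

// Models {'a.x': 1, 'a.0': {$exists: false}}.
const matches = (doc) =>
  valuesAt(doc, ["a", "x"]).includes(1) && valuesAt(doc, ["a", "0"]).length === 0;

const docs = [
  { a: { x: 1 } },
  { a: [{ x: 1 }] },
  { a: [{ x: 0 }, { x: 1 }] },
];

console.log(docs.filter(matches)); // [ { a: { x: 1 } } ]
```

Both array documents satisfy the 'a.x' term through branching, but each has an element at position 0, so only the plain subdocument survives the second term.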

MongoDb: querying against a collection's own fields

I've done some research and it seems that it's possible to query (i.e. compare) two fields in the same collection using the aggregation framework. It's also possible with the $where operator but I want to avoid a low performance Javascript solution.
Here's an example document:
{
"_id" : ObjectId("541ba14d2208236d06ff1e57"),
"a" : "foo",
"d" : {
"e" : "foo"
}
}
{
"_id" : ObjectId("541ba14d2208236d06ff1e58"),
"a" : "foo",
"d" : {
"e" : "bar"
}
}
I'd like to pick the documents where 'a' != 'd.e'. I've attempted the following without success:
db.test.aggregate([{$match: {$ne: ['$a', '$d.e']}}]);
As you said the query can be done with JavaScript by issuing a $where condition in your query:
db.test.find(function() { return this.a != this.d.e } )
Which is the short form of the query.
While you can do other manipulation in the aggregation framework, it does not change the basic nature of the query in that you cannot place a query condition that compares the values of two fields. This is why $match alone cannot do this because it follows the same rules.
What you "can" do is $project another field value that matches the same logical conditions that you want to enforce. Depending on your actual implementation this may or may not be better for performance:
db.test.aggregate([
{ "$project": {
"a": 1,
"d": 1,
"notEqual": { "$ne": [ "$a", "$d.e" ] }
}},
{ "$match": { "notEqual": true } }
])
That probably is not going to make a lot of sense on its own unless some other filtering is done in the overall process though. But the general comparison is done with a comparison operator to return a true/false result that can then be filtered.
So the best thing to do if you can is to actually maintain the result of this in a similar way by a field that is present on your document. Then you have a basic query condition to look for that value rather than the comparison. This is if you need to regularly do these kinds of checks.
But for "ad-hoc" purposes, you either stick with the JavaScript evaluation or use the "projection" form in aggregation queries ( where you cannot use a $where clause ) in order to do the field level comparison.
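The two-stage idea (compute a boolean per document, then filter on it) can be sketched in plain JavaScript. This only models the data flow of the $project and $match stages shown earlier in this answer; it is not how the server executes them, and the numeric _id values are simplified stand-ins for the ObjectIds in the question.

```javascript
// Sample documents modelled on the question (the _id values are assumed).
const docs = [
  { _id: 1, a: "foo", d: { e: "foo" } },
  { _id: 2, a: "foo", d: { e: "bar" } },
];

// Stage 1, modelling $project: keep a and d, add the computed notEqual flag.
const projected = docs.map((doc) => ({
  _id: doc._id,
  a: doc.a,
  d: doc.d,
  notEqual: doc.a !== doc.d.e,
}));

// Stage 2, modelling $match: keep only documents whose flag is true.
const matched = projected.filter((doc) => doc.notEqual === true);

console.log(matched.map((doc) => doc._id)); // [ 2 ]
```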

How to use $elemMatch on aggregate's projection?

This is my object:
{ "_id" : ObjectId("53fdcb6796cb9b9aa86f05b9"), "list" : [ "a", "b" ], "complist" : [ { "a" : "a", "b" : "b" }, { "a" : "c", "b" : "d" } ] }
And this is what I want to accomplish: check if "list" contains a certain element and get only the field "a" from the objects on "complist" while reading the document regardless of any of these values. I'm building a forum system, this is the query that will return the details of a forum. I need to read the forum information while knowing if the user is in the forum's white list.
With a find I can use the query
db.itens.find({},{list:{$elemMatch:{$in:["a"]}}})
to get only the first element that matches a certain value. This way I can just check if the returned array is not empty and I know if "list" contains the value I'm looking for. I can't do it on the query because I want the document regardless of it containing the value I'm looking for in the "list" value. I need the document AND know if "list" has a certain value.
With an aggregate I can use the query
db.itens.aggregate({$project:{"complist.a":1}})
to read only the field "a" of the objects contained in complist. This is going to get the forum's threads basic information, I don't want all the information of the threads, just a couple of things.
But when I try to use the query
db.itens.aggregate({$project:{"complist.b":1,list:{$elemMatch:{$in:["a"]}}}})
to try and do both, it throws me an error saying the operator $elemMatch is not valid.
Am I doing something wrong here with the $elemMatch in aggregate? Is there a better way to accomplish this?
Quite an old question, but literally none of the proposed answers are good.
TLDR:
You can't use $elemMatch in a $project stage, but you can achieve the same result using other aggregation operators like $filter.
db.itens.aggregate([
{
$project: {
compList: {
$filter: {
input: "$complist",
as: "item",
cond: {$eq: ["$$item.a", "a"]}
}
}
}
}
])
And if you want just the first item from the array that matches the condition, similarly to what $elemMatch does, you can incorporate $arrayElemAt.
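A minimal sketch of that combination, written as a pipeline object so it can be inspected before being passed to aggregate. The output field name firstMatch and the variable wanted are hypothetical names chosen for illustration; the value "a" is taken from the sample document in the question.

```javascript
// Value to look for in each element's "a" field.
const wanted = "a";

const pipeline = [
  {
    $project: {
      firstMatch: {
        // $filter keeps every element of complist whose a equals `wanted`;
        // $arrayElemAt then picks index 0 of that filtered array.
        $arrayElemAt: [
          {
            $filter: {
              input: "$complist",
              as: "item",
              cond: { $eq: ["$$item.a", wanted] },
            },
          },
          0,
        ],
      },
    },
  },
];

// In the mongo shell this would be run as: db.itens.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Wrapping $filter in $arrayElemAt returns a single subdocument rather than a one-element array, which is closer to how the $elemMatch projection is typically consumed.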
In Depth Explanation:
First let's understand $elemMatch:
$elemMatch is a query operator. A projection version of it also exists, but that refers to the projection of a find query and not to the $project aggregation stage.
So what does this have to do with anything? Well, a $project stage accepts only certain input structures, and the one we want to use is:
<field>: <expression>
What is a valid expression?
Expressions can include field paths, literals, system variables, expression objects, and expression operators. Expressions can be nested.
So we want to use an expression operator, but as you can see from the docs, $elemMatch is not one of them. Hence it's not a valid expression to use in an aggregation $project stage.
For some reason $elemMatch doesn't work in an aggregation $project stage. You need to use the new $filter operator in Mongo 3.2. See https://docs.mongodb.org/manual/reference/operator/aggregation/filter/
The answer to this question may help.
db.collection_name.aggregate({
"$match": {
"complist": {
"$elemMatch": {
"a": "a"
}
}
}
});
Actually, the simplest solution is to just $unwind your array, then $match the appropriate documents. You can wind-up the appropriate documents again using $group and $push.
Although the question is old, here is my contribution for November 2017.
I had a similar problem, and doing two consecutive match operations worked for me. The code below is a subset of my whole code and I changed the element names, so it's not tested. Anyway, this should point you in the right direction.
db.collection.aggregate([
{
"$match": {
"_id": "ID1"
}
},
{
"$unwind": "$sub_collection"
},
{
"$match": {
"sub_collection.field_I_want_to_match": "value"
}
}
])
For aggregations simply use $expr:
db.items.aggregate([
{
"$match": {
"$expr": {"$in": ["a", "$list"]}
}
},
])
Well, it happens you can use "array.field" on a find's projection block.
db.itens.find({},{"complist.b":1,list:{$elemMatch:{$in:["a"]}}})
did what I needed.

Query returns more than expected results

Bear with me, this is not really my question. Just trying to get someone to understand.
Author's note:
The possible duplicate question's solution allows $elemMatch to constrain because all of the elements are arrays. This is a little different.
So, in the accepted answer the main point has been brought up. This behavior is well
documented and you should not "compare 'apples' with 'oranges'". The fields are of
different types, and while there is a workaround for this, the best solution for the real
world is: don't do this.
Happy reading :)
I have a collection of documents I am trying to search, the collection contains the following:
{ "_id" : ObjectId("52faa8a695fa10cc7d2b7908"), "x" : 1 }
{ "_id" : ObjectId("52faa8ab95fa10cc7d2b7909"), "x" : 5 }
{ "_id" : ObjectId("52faa8ad95fa10cc7d2b790a"), "x" : 15 }
{ "_id" : ObjectId("52faa8b095fa10cc7d2b790b"), "x" : 25 }
{ "_id" : ObjectId("52faa8b795fa10cc7d2b790c"), "x" : [ 5, 25 ] }
So I want to find the results where x falls between the values of 10 and 20. So this is the query that seemed logical to me:
db.collection.find({ x: {$gt: 10, $lt: 20} })
But the problem is this returns two documents in the result:
{ "_id" : ObjectId("52faa8ad95fa10cc7d2b790a"), "x" : 15 }
{ "_id" : ObjectId("52faa8b795fa10cc7d2b790c"), "x" : [ 5, 25 ] }
I am not expecting to see the second result as none of the values are between 10 and 20.
Can someone explain why I do not get the result I expect? I think { "x": 15 } should be the only document returned.
So furthermore, how can I get what I expect?
This behaviour is expected and explained in mongo documentation here.
Query a Field that Contains an Array
If a field contains an array and your query has multiple conditional
operators, the field as a whole will match if either a single array
element meets the conditions or a combination of array elements
meet the conditions.
Mongo seems to be willing to play "smug", by giving back results when a combination of array elements match all conditions independently.
In our example, 5 matches the $lt:20 condition and 25 matches the $gt:10 condition. So, it's a match.
Both of the following will return the [5,25] result:
db.collection.find({ x: {$gt: 10, $lt: 20} })
db.collection.find({ $and : [{x: {$gt: 10}},{x:{ $lt: 20}} ] })
If this is user expected behaviour, opinions can vary. But it certainly is documented, and should be expected.
Edit, for Neil's sadistic yet highly educational edit to original answer, asking for a solution:
Use of the $elemMatch can make "stricter" element comparisons for arrays only.
db.collection.find({ x: { $elemMatch:{ $gt:10, $lt:20 } } })
Note: this will match both x:[11,12] and x:[11,25]
I believe that when a query like this is needed, a combination of two queries is required, with the results combined. Below is a query that returns correct results for documents where x is not an array:
db.collection.find( { $where : "!Array.isArray(this.x)", x: {$gt: 10, $lt: 20} } )
But the best approach in this case is to change the type of x to always be an array, even when it only contains one element. Then, only the $elemMatch query is required to get correct results, with expected behaviour.
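The matching rules described above can be modelled in a few lines of plain JavaScript. This is an illustrative sketch rather than MongoDB's matcher: plainRange and elemMatchRange are hypothetical names, and values holds just the x values from the sample collection.

```javascript
// Treat a scalar as a one-element array so both shapes share one code path.
const asArray = (x) => (Array.isArray(x) ? x : [x]);

// Models { x: { $gt: 10, $lt: 20 } }: each condition may be satisfied
// by a different array element.
const plainRange = (x) =>
  asArray(x).some((v) => v > 10) && asArray(x).some((v) => v < 20);

// Models { x: { $elemMatch: { $gt: 10, $lt: 20 } } }: x must be an array
// and a single element must satisfy both conditions.
const elemMatchRange = (x) =>
  Array.isArray(x) && x.some((v) => v > 10 && v < 20);

const values = [1, 5, 15, 25, [5, 25]];

console.log(values.filter(plainRange));     // [ 15, [ 5, 25 ] ]
console.log(values.filter(elemMatchRange)); // []
console.log(elemMatchRange([11, 25]));      // true
```

The plain query matches [5, 25] because 25 satisfies the lower bound and 5 the upper bound, while the $elemMatch form skips the scalar 15 entirely, which is why neither query alone covers a collection with mixed types.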
You can first check that the field is not an array and then provide a filter for the desired values:
db.collection.find(
{
$and :
[
{ $where : "!Array.isArray(this.x)" },
{ x: { $gt: 10, $lt: 20 } }
]
}
)
which returns:
{ "_id" : ObjectId("52fb4ec1cfe34ac4b9bab163"), "x" : 15 }

Mongodb query with fields in the same documents

I have the following json:
{
"a1": {"a": "b"},
"a2": {"a": "c"}
}
How can I request all documents where a1 and a2 are not equal in the same document?
You could use $where:
db.myCollection.find( { $where: "this.a1.a != this.a2.a" } )
However, be aware that this won't be very fast, because it will have to spin up the JavaScript engine, iterate over every document, and check the condition for each.
If you need to do this query for large collections, or very often, it's best to introduce a denormalized flag, like areEqual. Still, such low-selectivity fields don't yield good index performance, because the candidate set is still large.
update
Using the new $expr operator, available as of MongoDB 3.6, you can use aggregation expressions in a find query like this:
db.myCollection.find({$expr: {$ne: ["$a1.a", "$a2.a"] } });
Although this comment solves the problem, I think a better match for this use case would be to use $addFields operator available as of version 3.4 instead of $project.
db.myCollection.aggregate([
{"$match":{"a1":{"$exists":true},"a2":{"$exists":true}}},
{"$addFields": {
"aEq": {"$eq":["$a1.a","$a2.a"]}
}
},
{"$match":{"aEq": false}}
]);
To avoid JavaScript use the aggregation framework:
db.myCollection.aggregate([
{"$match":{"a1":{"$exists":true},"a2":{"$exists":true}}},
{"$project": {
"a1":1,
"a2":1,
"aCmp": {"$cmp":["$a1.a","$a2.a"]}
}
},
{"$match":{"aCmp":0}}
])
On our development server the equivalent JavaScript query takes 7x longer to complete.
Update (10 May 2017)
I just realized my answer didn't answer the question, which wanted values that are not equal (sometimes I'm really slow). This will work for that:
db.myCollection.aggregate([
{"$match":{"a1":{"$exists":true},"a2":{"$exists":true}}},
{"$project": {
"a1":1,
"a2":1,
"aEq": {"$eq":["$a1.a","$a2.a"]}
}
},
{"$match":{"aEq": false}}
])
$ne could be used in place of $eq if the match condition was changed to true but I find using $eq with false to be more intuitive.
The $where operator evaluates JavaScript, where objects are compared by reference, so
{"a": "b"} == {"a": "b"}
would be false.
So to compare them you would have to compare a field from each: a1.a == a2.a
To do this in MongoDB you would use the $where operator
db.myCollection.find({$where: "this.a1.a != this.a2.a"});
This assumes that each embedded document will have a property "a". If that isn't the case things get more complicated.
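The comparison behaviour this answer relies on can be checked directly in any JavaScript runtime. The snippet below is a small illustration; the variable names simply mirror the a1 and a2 fields from the question.

```javascript
// Two distinct objects with identical contents.
const a1 = { a: "b" };
const a2 = { a: "b" };

// Objects are compared by reference, so two separate objects are never equal.
console.log(a1 == a2);     // false

// Comparing the primitive field values gives the expected answer.
console.log(a1.a == a2.a); // true
```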
Starting in Mongo 4.4, for those that want to compare sub-documents and not only primitive values (since {"a": "b"} == {"a": "b"} is false), we can use the new $function aggregation operator that allows applying a custom javascript function:
// { "a1" : { "x" : 1, "y" : 2 }, "a2" : { "x" : 1, "y" : 2 } }
// { "a1" : { "x" : 1, "y" : 2 }, "a2" : { "x" : 3, "y" : 2 } }
db.collection.aggregate(
{ $match:
{ $expr:
{ $function: {
body: function(a1, a2) { return JSON.stringify(a1) != JSON.stringify(a2); },
args: ["$a1", "$a2"],
lang: "js"
}}
}
}
)
// { "a1" : { "x" : 1, "y" : 2 }, "a2" : { "x" : 3, "y" : 2 } }
$function takes 3 parameters:
body, which is the function to apply, whose parameters are the two fields to compare.
args, which contains the fields from the record that the body function takes as parameter. In our case, both "$a1" and "$a2".
lang, which is the language in which the body function is written. Only js is currently available.
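One caveat worth keeping in mind with the JSON.stringify comparison used in the body function: the result depends on key order, so two sub-documents with the same fields in a different order are reported as different. A short illustration in plain JavaScript (sameOrder and swappedOrder are names chosen for this example):

```javascript
// Same keys, same order: the serialized strings are identical.
const sameOrder =
  JSON.stringify({ x: 1, y: 2 }) === JSON.stringify({ x: 1, y: 2 });

// Same keys, different order: the serialized strings differ.
const swappedOrder =
  JSON.stringify({ x: 1, y: 2 }) === JSON.stringify({ y: 2, x: 1 });

console.log(sameOrder);    // true
console.log(swappedOrder); // false
```

Whether that is acceptable depends on whether field order is considered significant for the sub-documents being compared.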
Thanks all for solving my problem -- concerning the answers that use aggregate(), one thing that confused me at first is that $eq (or $in, or lots of other operators) has different meaning depending on where it is used. In a find(), or the $match phase of aggregation, $eq takes a single value, and selects matching documents:
db.items.aggregate([{$match: {_id: {$eq: ObjectId("5be5feb45da16064c88e23d4")}}}])
However, in the $project phase of aggregation, $eq takes an Array of 2 expressions, and makes a new field with value true or false:
db.items.aggregate([{$project: {new_field: {$eq: ["$_id", "$foreignID"]}}}])
In passing, here's the query I used in my project to find all items whose list of linked items (due to a bug) linked to themselves:
db.items.aggregate([{$project: {idIn: {$in: ["$_id","$header.links"]}, "header.links": 1}}, {$match: {idIn: true}}])