I have a find query that uses $in to check whether the specified array is contained within the collection string array:
db.Doc.find({ tags: { '$in': ['tag1','tag2'] } })
I am in the process of refactoring this query to use the aggregation framework, but I can't find the equivalent $in comparison operator at the $project or $match aggregation stages.
Is it possible to use the $in comparison operator at the $project or $match stages of an aggregation query?
To answer your question: yes, but not as you would expect. It is possible to use the $in operator at the $project or $match stages of an aggregation query, but the usage and the purpose aren't quite the same in each.
There are two very different operators that share the same $in name, which causes confusion:
Non-aggregational $in: Usually narrows down the results, like a filter. It has no way to add information to the result set when a document doesn't match. It can be used both within the find() collection method and inside the $match aggregation stage (which adds to the confusion).
Aggregational $in: Usually adds boolean information to the result set, can be used as a logical expression inside $cond, and can also remove results when used with $redact. It can be used in $project, $addFields, etc., but it cannot be used directly within find() or $match (only when wrapped in $expr). The structure is { $in: [ <needle expression>, <array haystack expression> ] }, and the whole expression evaluates to either true or false (I borrowed the needle/haystack terminology from PHP's in_array documentation to explain this better). So, { $in: [ 'foo', [ 'foo', 'bar', 'baz' ] ] } is true because 'foo' is inside the array.
In contrast, with the non-aggregational $in, a query such as { maybeFooField: { $in: [ 'foo', 'bar', 'baz' ] } } simply narrows down the result set; it doesn't produce a boolean true or false.
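The difference between the two forms can be modelled in plain JavaScript. This is only an illustrative sketch of the semantics, not MongoDB code: queryIn and exprIn are hypothetical helper names, and the model only handles primitive values (it ignores details such as regular expressions in a query $in).

```javascript
// Plain-JavaScript model of the two $in flavours (illustrative only; not MongoDB API).

// Query-style $in: a filter. A document passes when its field value, or any
// element of it when the field is an array, equals one of the listed values.
function queryIn(docs, field, values) {
  return docs.filter((doc) => {
    const fieldValue = doc[field];
    const candidates = Array.isArray(fieldValue) ? fieldValue : [fieldValue];
    return candidates.some((c) => values.includes(c));
  });
}

// Aggregation-style $in: an expression. It takes a needle and a haystack
// array and evaluates to true or false.
function exprIn(needle, haystack) {
  if (!Array.isArray(haystack)) {
    throw new TypeError("$in requires an array as a second argument");
  }
  return haystack.includes(needle);
}

const sampleDocs = [
  { _id: 1, tags: ["tag1", "tag3"] },
  { _id: 2, tags: ["tag4"] },
];

// The filter removes documents; the expression yields a boolean.
const filteredDocs = queryIn(sampleDocs, "tags", ["tag1", "tag2"]);
const fooIsPresent = exprIn("foo", ["foo", "bar", "baz"]);
```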
Going back to your refactoring, the question is: what are your intended results? Why did you switch to the aggregation framework in the first place?
If you only want to narrow down or filter out the result set, and then use some other aggregation computations, use the simple non-aggregational $in operator.
db.Doc.aggregate([
{ $match: { tags: {$in: ['tag1','tag2'] } } } // non-aggregational $in
])
However, if you want to add information based on the existence or absence of a certain tag, use the aggregational $in operator.
db.Doc.aggregate([
{ $project: { hasTag1: { $in: ['tag1', '$tags'] } } } // aggregational $in: is 'tag1' an element of the tags array?
])
Note that the aggregational $in tests a single value against an array, so checking for any of several tags needs one of the other aggregational array operators, such as $setIntersection or $setIsSubset.
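Checking whether a document has any of several tags could be sketched with $setIntersection and $size. The helper name hasAnyTagStage and the output field hasAnyTag are made up for illustration, and the sketch assumes that tags, when present, is always an array.

```javascript
// Hypothetical helper: builds a $project stage that flags documents whose
// tags array shares at least one element with wantedTags.
function hasAnyTagStage(wantedTags) {
  return {
    $project: {
      hasAnyTag: {
        $gt: [
          {
            $size: {
              // $ifNull substitutes an empty array when tags is missing,
              // so $size always receives an array.
              $setIntersection: [{ $ifNull: ["$tags", []] }, wantedTags],
            },
          },
          0,
        ],
      },
    },
  };
}

const anyTagPipeline = [hasAnyTagStage(["tag1", "tag2"])];
// In mongosh: db.Doc.aggregate(anyTagPipeline)
```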
The query: db.Doc.find({ tags: { '$in': ['tag1','tag2'] } }) is equivalent to:
db.Doc.aggregate([
{$match:{tags: {$in: ['tag1','tag2'] }}}
])
And when you use the aggregation form of $in in a projection, like below:
db.Doc.aggregate([
{$project:{tags: {$in: ['tag1', '$tags'] }}}
])
the result will be tags:true or tags:false, depending on whether the tags array contains 'tag1'.
db.col.aggregate([
{
$match: {
field_nest: { $elemMatch: { /* conditions */ } }
}
}
])
This is my current setup. In addition to matching the parent document, it also needs to return only the subdocument that matches the $elemMatch.
Otherwise, I would have to $unwind and $match again. But this would no longer be able to use the index. The idea is to be able to use the indexes.
No, the $match stage selects documents to pass along the pipeline; it does not modify the documents being passed along.
You can use $elemMatch in the $match stage to select the documents, and then use $filter in an $addFields stage to filter out the non-matching elements.
Perhaps something like:
db.col.aggregate([
{$match: {
field_nest: { $elemMatch: { /* conditions */ } }
}},
{$addFields: {
field_nest: {$filter:{
input: "$field_nest",
as: "item",
cond: { /* conditions */ }
}}
}}
])
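With concrete conditions filled in, the two stages might look like the sketch below. The helper name matchAndFilterStages is hypothetical, and the conditions (equality on a and b) are only an example. Note that $filter takes aggregation expressions, so the query-style { a: 1, b: 2 } has to be rewritten with $eq and $and.

```javascript
// Hypothetical helper: keeps documents that have at least one field_nest
// element with the given a and b values, then trims field_nest down to
// only those elements.
function matchAndFilterStages(a, b) {
  return [
    // Query syntax: this is the part that may be able to use an index.
    { $match: { field_nest: { $elemMatch: { a: a, b: b } } } },
    {
      $addFields: {
        field_nest: {
          $filter: {
            input: "$field_nest",
            as: "item",
            // Expression syntax: the same conditions, applied per element.
            cond: { $and: [{ $eq: ["$$item.a", a] }, { $eq: ["$$item.b", b] }] },
          },
        },
      },
    },
  ];
}

const elemMatchPipeline = matchAndFilterStages(1, 2);
// In mongosh: db.col.aggregate(elemMatchPipeline)
```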
This may be able to use an index, depending on the exact conditions and available indexes.
For example, if the query were
db.col.aggregate([
{$match: {
field_nest:{$elemMatch:{a:1,b:2}}
}}
])
It could use an index on {"field_nest.a":1,"field_nest.b":1}, but it could not use an index on {field_nest:1} or {"field_nest.c":1, "field_nest.a":1}.
If the query were
db.col.aggregate([
{$match: {
top_field: "name",
some_value: {$gte: "never"},
field_nest:{$elemMatch:{a:1,b:2}}
}}
])
the query executor would look at all of the available indexes, but may use an index that does not include the array field.
If the query were
db.col.aggregate([
{$match: {
top_field: "name",
some_value: {$gte: "never"},
field_nest:{$elemMatch:{a:{$regex:"unanchored"},b:2}}
}}
])
it would not be able to use an index for selecting field_nest.a, but might be able to use one for field_nest.b.
The determination of whether or not an index will be used depends greatly on the exact nature of the match conditions and available indexes.
I want to match documents in my pipeline based on whether the field to match is contained within an array that is within my documents.
Example document to match:
{
'wishlist': ['123','456','789'],
'productId': '123'
}
Example match aggregation:
{
$match: {
'productId': {$in: '$wishlist'}
}
}
This isn't working - error is '$in needs an array' - but '$wishlist' is an array? so clearly the stage isn't picking up the path reference.
How would I get something like this to work?
Thanks!
If you want to match against another field of the same document, you can use the $expr operator. Since that field holds an array, you have to use the aggregation syntax of the $in operator:
{
$match: {
$expr: {
$in: ["$productId", '$wishlist']
}
}
}
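The aggregation $in raises an error when its second argument is not an array, so a document without a wishlist field would make the whole pipeline fail. A guarded variant could be sketched as follows; the constant name matchWishlisted is made up for illustration.

```javascript
// Sketch: only evaluate $in when wishlist really is an array; documents
// with a missing or non-array wishlist are treated as non-matching.
const matchWishlisted = {
  $match: {
    $expr: {
      $cond: [
        { $isArray: "$wishlist" },
        { $in: ["$productId", "$wishlist"] },
        false,
      ],
    },
  },
};

// In mongosh, assuming a collection named "products" (hypothetical):
// db.products.aggregate([matchWishlisted])
```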
How can I get documents from mongo with an array containing some elements but IN THE SAME ORDER?
I know that $all does the job, but it ignores the order of the elements. The order is important in my case, and I can't sort my arrays because each one describes a path whose order I want to keep.
111,222,333 is not the same as 222,111,333
Is there a way to do it using $all or maybe another operator in mongo aggregation framework?
You can drop the first "intersect" field; it is only there for debugging, to show what MongoDB returns for this $setIntersection. You should build the conditions inside the $and operator dynamically.
db.Test6.aggregate([
{
$project: {
_id:1,
pages:1,
intersect: {$setIntersection: [[111,666], "$pages"]},
theCondition: {$let: {
vars: {
intersect: {$setIntersection: [[111,666], "$pages"]}
},
in: {
$cond:[ {$and:[
{$eq:[{$arrayElemAt:["$$intersect", 0]}, 111]},
{$eq:[{$arrayElemAt:["$$intersect", 1]}, 666]}
]} , true, false]
}
}
}
}
}
]);
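One caveat: the MongoDB documentation states that the order of the elements returned by $setIntersection is unspecified, so comparing positions in its result may not be reliable. An alternative sketch is to keep only the wanted elements with $filter, which preserves the array's own order, and compare the result to the wanted sequence. The helper names below are hypothetical, and the sketch assumes that pages is an array in which each wanted value appears at most once.

```javascript
// Hypothetical helper: matches documents whose pages array contains the
// wanted values in the given relative order.
function inOrderMatchStage(wanted) {
  return {
    $match: {
      $expr: {
        $eq: [
          // Keep only the wanted values, in the order they occur in pages.
          { $filter: { input: "$pages", as: "p", cond: { $in: ["$$p", wanted] } } },
          wanted,
        ],
      },
    },
  };
}

// The same check in plain JavaScript, to show the intended logic.
function keepsOrder(pages, wanted) {
  const kept = pages.filter((p) => wanted.includes(p));
  return kept.length === wanted.length && kept.every((p, i) => p === wanted[i]);
}

const orderedStage = inOrderMatchStage([111, 666]);
// In mongosh: db.Test6.aggregate([orderedStage])
```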
Is it possible to find only those documents in a collection that have the same value in two given fields?
{
_id: 'fewSFDewvfG20df',
start: 10,
end: 10
}
As here start and end have the same value, this document would be selected.
I think about something like...
Collection.find({ start: { $eq: end } })
... which wouldn't work, as end has to be a value.
You can use $expr in MongoDB 3.6 or later to compare two fields from the same document.
db.collection.find({ "$expr": { "$eq": ["$start", "$end"] } })
or with aggregation
db.collection.aggregate([
{ "$match": { "$expr": { "$eq": ["$start", "$end"] }}}
])
You have two options here. The first one is to use the $where operator.
Collection.find( { $where: "this.start === this.end" } )
The second option is to use the aggregation framework and the $redact operator.
Collection.aggregate([
{ "$redact": {
"$cond": [
{ "$eq": [ "$start", "$end" ] },
"$$KEEP",
"$$PRUNE"
]
}}
])
Which one is better?
The $where operator does a JavaScript evaluation and can't take advantage of indexes, so a query using $where can cause a drop in performance in your application. See the considerations in the $where documentation. If you use $where, each of your documents is converted from BSON to a JavaScript object before the $where operation runs, which adds to the cost. Of course, your query can be improved if you also have an indexed filter. There is also a security risk if you're building your query dynamically based on user input.
Like $where, $redact doesn't use indexes and performs a collection scan, but query performance is better with $redact because it uses standard MongoDB operators instead of JavaScript. That being said, the aggregation option is far better because you can always filter your documents first using the $match operator.
$where is fine here but could be avoided. I also believe that you only need $where when you have a schema design problem. For example, adding another boolean field with an index to the document can be a good option here.
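The boolean-field idea could be sketched like this. The field name startEqualsEnd is hypothetical, and the sketch assumes MongoDB 4.2 or later, where an update can take an aggregation pipeline. The flag would also have to be kept up to date whenever start or end changes.

```javascript
// Hypothetical name for the precomputed flag field.
const FLAG_FIELD = "startEqualsEnd";

// Update pipeline: store the result of the comparison on each document.
const setFlagUpdate = [{ $set: { [FLAG_FIELD]: { $eq: ["$start", "$end"] } } }];

// Index specification and query that rely on the stored flag.
const flagIndex = { [FLAG_FIELD]: 1 };
const flagQuery = { [FLAG_FIELD]: true };

// In mongosh, assuming a collection named "collection":
// db.collection.updateMany({}, setFlagUpdate);
// db.collection.createIndex(flagIndex);
// db.collection.find(flagQuery);
```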
This query is fast, since it involves the fewest function calls:
Collection.find("this.start == this.end");
This is my object:
{ "_id" : ObjectId("53fdcb6796cb9b9aa86f05b9"), "list" : [ "a", "b" ], "complist" : [ { "a" : "a", "b" : "b" }, { "a" : "c", "b" : "d" } ] }
And this is what I want to accomplish: check if "list" contains a certain element and get only the field "a" from the objects on "complist" while reading the document regardless of any of these values. I'm building a forum system, this is the query that will return the details of a forum. I need to read the forum information while knowing if the user is in the forum's white list.
With a find I can use the query
db.itens.find({},{list:{$elemMatch:{$in:["a"]}}})
to get only the first element that matches a certain value. This way I can just check if the returned array is not empty and I know if "list" contains the value I'm looking for. I can't do it on the query because I want the document regardless of it containing the value I'm looking for in the "list" value. I need the document AND know if "list" has a certain value.
With an aggregate I can use the query
db.itens.aggregate({$project:{"complist.a":1}})
to read only the field "a" of the objects contained in complist. This is going to get the forum's threads basic information, I don't want all the information of the threads, just a couple of things.
But when I try to use the query
db.itens.aggregate({$project:{"complist.b":1,list:{$elemMatch:{$in:["a"]}}}})
to try and do both, it throws me an error saying the operator $elemMatch is not valid.
Am I doing something wrong here with the $elemMatch in aggregate? Is there a better way to accomplish this?
Quite an old question, but literally none of the proposed answers are good.
TL;DR:
You can't use $elemMatch in a $project stage, but you can achieve the same result using other aggregation operators like $filter.
db.itens.aggregate([
{
$project: {
compList: {
$filter: {
input: "$complist",
as: "item",
cond: {$eq: ["$$item.a", 1]}
}
}
}
}
])
And if you want just the first item from the array that matches the condition, similar to what $elemMatch does, you can incorporate $arrayElemAt.
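A minimal sketch of that combination, assuming the same complist field; the helper name firstMatchStage and the output field firstMatch are made up for illustration.

```javascript
// Hypothetical helper: projects only the first complist element whose
// "a" field equals the given value.
function firstMatchStage(value) {
  return {
    $project: {
      firstMatch: {
        $arrayElemAt: [
          {
            $filter: {
              input: "$complist",
              as: "item",
              cond: { $eq: ["$$item.a", value] },
            },
          },
          0, // index 0 is the first element that passed the filter
        ],
      },
    },
  };
}

const firstMatchPipeline = [firstMatchStage("a")];
// In mongosh: db.itens.aggregate(firstMatchPipeline)
```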
In Depth Explanation:
First let's understand $elemMatch:
$elemMatch is a query operator. A projection version of it also exists, but that refers to the projection of a find() query and not to the $project aggregation stage.
So what does this have to do with anything? Well, a $project stage only accepts certain input structures, and the one we want to use is:
<field>: <expression>
What is a valid expression?
Expressions can include field paths, literals, system variables, expression objects, and expression operators. Expressions can be nested.
So we want to use an expression operator, but as you can see from the docs, $elemMatch is not one of them. Hence it's not a valid expression to use in an aggregation $project stage.
$elemMatch doesn't work as an expression in an aggregation $project stage. You need to use the $filter operator, which is new in MongoDB 3.2. See https://docs.mongodb.org/manual/reference/operator/aggregation/filter/
The answer to this question may help.
db.collection_name.aggregate({
"$match": {
"complist": {
"$elemMatch": {
"a": "a"
}
}
}
});
Actually, the simplest solution is to just $unwind your array, then $match the appropriate documents. You can wind-up the appropriate documents again using $group and $push.
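That approach could be sketched as the pipeline below. The match condition on complist.a is only an example, and the constant name is hypothetical. Note that documents without any matching element drop out of the result entirely.

```javascript
// Sketch: unwind the array, keep the matching elements, then rebuild
// one document per original _id.
const unwindMatchRegroup = [
  { $unwind: "$complist" },
  { $match: { "complist.a": "a" } },
  {
    $group: {
      _id: "$_id",
      list: { $first: "$list" },         // carry the untouched field along
      complist: { $push: "$complist" },  // re-collect the matching elements
    },
  },
];

// In mongosh: db.itens.aggregate(unwindMatchRegroup)
```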
Although the question is old, here is my contribution for November 2017.
I had a similar problem, and doing two consecutive match operations worked for me. The code below is a subset of my whole code and I changed the element names, so it's not tested. Anyway, this should point you in the right direction.
db.collection.aggregate([
{
"$match": {
"_id": "ID1"
}
},
{
"$unwind": "$sub_collection"
},
{
"$match": {
"sub_collection.field_I_want_to_match": "value"
}
}
])
For aggregations simply use $expr:
db.itens.aggregate([
{
"$match": {
"$expr": {"$in": ["a", "$list"]}
}
},
])
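The $match above drops documents whose list does not contain the value. To keep every document and only report membership, as the question asks, the same aggregation $in could go into a $project stage instead. In this sketch the output field name inList is made up for illustration.

```javascript
// Sketch: return complist.a for every document, plus a boolean that says
// whether "a" is an element of list.
const forumDetailsPipeline = [
  {
    $project: {
      "complist.a": 1,
      // $ifNull keeps $in from failing on documents without a list field.
      inList: { $in: ["a", { $ifNull: ["$list", []] }] },
    },
  },
];

// In mongosh: db.itens.aggregate(forumDetailsPipeline)
```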
Well, it happens you can use "array.field" on a find's projection block.
db.itens.find({},{"complist.b":1,list:{$elemMatch:{$in:["a"]}}})
did what I needed.