I have a document which looks like this:
"tokens":
[
{
"index": 1,
"word": "I",
"pos": "NNP",
},
{
"index": 2,
"word": "played",
"pos": "VBZ",
},
{
"index": 3,
"word": "football",
"pos": "IN",
}
]
And my query is:
db.test.find({
$and: [
{
'tokens.word': 'I'
},
{
tokens: {
$elemMatch: {
word: /f.*/,
pos: 'IN'
}
}
}
]
})
My query returns the document above, but I expected no match, because I am searching for
word: "I" followed by [word: /f.*/ and pos:'IN']
That sequence does not occur in the tokens array: the token I is followed by played, and only then by football. The filters in my query are ordered accordingly: the search starts with
word: "I"
followed by
f.*
[football in this case].
The $and operator is a purely logical one where the order of the filter conditions does not play any role.
So the following queries are absolutely equivalent from a result set point of view:
$and: [
{ "a": 1 },
{ "b": 2 }
]
$and: [
{ "b": 2 },
{ "a": 1 }
]
The documentation states:
$and performs a logical AND operation on an array of two or more
expressions (e.g. <expression1>, <expression2>, etc.) and selects the
documents that satisfy all the expressions in the array. The $and
operator uses short-circuit evaluation. If the first expression (e.g.
<expression1>) evaluates to false, MongoDB will not evaluate the
remaining expressions.
In your example, the token with "index": 1 matches the first filter "tokens.word": "I" and the token with "index": 3 matches the second filter, the $elemMatch. Both conditions are satisfied, so the document is returned.
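To make this concrete, here is a minimal sketch in plain JavaScript that models how the two filters are evaluated. It does not talk to MongoDB; the helper names (firstFilter, secondFilter, matchesAnd, matchesAndReversed) are made up for illustration.

```javascript
// Plain-JavaScript model of the two filters; names are hypothetical.
const doc = {
  tokens: [
    { index: 1, word: "I", pos: "NNP" },
    { index: 2, word: "played", pos: "VBZ" },
    { index: 3, word: "football", pos: "IN" },
  ],
};

// 'tokens.word': 'I' is satisfied if ANY array element has word === "I".
const firstFilter = (d) => d.tokens.some((t) => t.word === "I");

// $elemMatch is satisfied if ANY single element meets both conditions.
const secondFilter = (d) =>
  d.tokens.some((t) => /f.*/.test(t.word) && t.pos === "IN");

// $and only combines the two booleans; element positions are never compared.
const matchesAnd = (d) => firstFilter(d) && secondFilter(d);
const matchesAndReversed = (d) => secondFilter(d) && firstFilter(d);

console.log(matchesAnd(doc), matchesAndReversed(doc)); // true true
```

Each filter scans the whole array on its own, which is why the relative position of the two matching tokens has no influence on the result.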
UPDATE:
Here is an idea - more of a starting point really - for you to get closer to what you want:
db.collection.aggregate([{
    $addFields: { // create a temporary field
        "differenceBetweenMatchPositions": { // called "differenceBetweenMatchPositions"
            $subtract: [ // which holds the difference between
                { $indexOfArray: [ "$tokens.word", "I" ] }, // the index of the first element matching the first condition and
                { $min: { // the lowest index of
                    $filter: { // a filtered array
                        input: { $range: [ 0, { $size: "$tokens" } ] }, // that holds the indices [ 0, 1, 2, ..., n ] where n is the number of items in the "tokens" array - 1
                        as: "this", // which we want to access using "$$this"
                        cond: {
                            $let: {
                                vars: { "elem": { $arrayElemAt: [ "$tokens", "$$this" ] } }, // get the n-th element in our "tokens" array
                                in: {
                                    $and: [
                                        { $eq: [ "$$elem.pos", "IN" ] }, // "pos" field must be "IN"
                                        { $eq: [ { $substrBytes: [ "$$elem.word", 0, 1 ] }, "f" ] } // 1st character of the "word" must be "f"
                                    ]
                                }
                            }
                        }
                    }
                }}
            ]
        }
    }
}, {
    $match: {
        "differenceBetweenMatchPositions": { $lt: 0 } // we only want documents where the first condition is matched by an array item that comes before the item matching the second condition
    }
}])
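The arithmetic in that pipeline can be checked outside the database. The following is only a sketch in plain JavaScript: the function mirrors the field name from the pipeline, and the null guard is an addition of this sketch, not part of the pipeline above. Without such a guard, a document that has no "I" token at all would get -1 from $indexOfArray and could still pass the $lt: 0 check.

```javascript
// Hypothetical helper mirroring the pipeline's arithmetic in plain JavaScript.
function differenceBetweenMatchPositions(tokens) {
  // Like $indexOfArray: index of the first token whose word is "I", or -1.
  const first = tokens.findIndex((t) => t.word === "I");

  // Like $filter over the index range: indices of tokens with pos "IN"
  // whose word starts with "f".
  const candidates = tokens
    .map((t, i) => (t.pos === "IN" && t.word.startsWith("f") ? i : -1))
    .filter((i) => i !== -1);

  // Guard added in this sketch only: no result if either side has no match.
  if (first === -1 || candidates.length === 0) return null;

  // Like $subtract with $min: negative when "I" comes before the other match.
  return first - Math.min(...candidates);
}

const tokens = [
  { index: 1, word: "I", pos: "NNP" },
  { index: 2, word: "played", pos: "VBZ" },
  { index: 3, word: "football", pos: "IN" },
];

console.log(differenceBetweenMatchPositions(tokens)); // -2
```

A value of -2 means the second match comes two positions after the first. If "followed by" is meant as "immediately followed by", one possible refinement is to test for a difference of exactly -1 instead of any negative value.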
Related
I need to match all documents where the field arr, which is an array of ObjectIds, contains exactly the same elements as a given array, regardless of their position.
I tried:
[
{
$match:{
arr: givenArr
}
}
]
or
[
{
$match:{
arr: {
$in: givenArr
}
}
}
]
The first pipeline matches only the documents whose arr has the same elements in the same positions.
The second pipeline matches every document whose arr contains at least one element of the given array.
For example if I have a couple of documents like:
[
{
_id: ObjectId("639a0e4cc0f6595d90a84de4"),
arr: [
ObjectId("639a0e4cc0f6595d90a84de1"),
ObjectId("639a0e4cc0f6595d90a84de2"),
ObjectId("639a0e4cc0f6595d90a84de3"),
]
},
{
_id: ObjectId("639a0e4cc0f6595d90a84de5"),
arr: [
ObjectId("639a0e4cc0f6595d90a84de7"),
ObjectId("639a0e4cc0f6595d90a84de8"),
ObjectId("639a0e4cc0f6595d90a84de9"),
]
},
]
If I need to match all documents whose arr has the same elements as
[
ObjectId("639a0e4cc0f6595d90a84de8"),
ObjectId("639a0e4cc0f6595d90a84de9"),
ObjectId("639a0e4cc0f6595d90a84de7"),
]
I want to get only the second document.
How could I do that?
You could treat each array as a set and use "$setEquals".
db.collection.find({
$expr: {
$setEquals: [
"$arr",
[
ObjectId("639a0e4cc0f6595d90a84de8"),
ObjectId("639a0e4cc0f6595d90a84de9"),
ObjectId("639a0e4cc0f6595d90a84de7")
]
]
}
})
Try it on mongoplayground.net.
You can compare the sorted versions of the two arrays, computed with $sortArray (available from MongoDB 5.2):
db.collection.find({
$expr: {
$eq: [
{
$sortArray: {
input: "$arr",
sortBy: 1
}
},
{
$sortArray: {
input: [
ObjectId("639a0e4cc0f6595d90a84de8"),
ObjectId("639a0e4cc0f6595d90a84de9"),
ObjectId("639a0e4cc0f6595d90a84de7"),
],
sortBy: 1
}
}
]
}
})
Mongo Playground
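The two approaches differ when an array contains duplicates: $setEquals ignores duplicate entries, while comparing sorted arrays does not. Here is a small plain-JavaScript sketch of that difference; the helpers setEquals and sortedEquals are illustrative stand-ins, and short strings take the place of the ObjectId values.

```javascript
// Short strings stand in for ObjectId values in this sketch.
const setEquals = (a, b) => {
  const sa = new Set(a);
  const sb = new Set(b);
  // Same distinct elements, regardless of order or repetition.
  return sa.size === sb.size && [...sa].every((x) => sb.has(x));
};

const sortedEquals = (a, b) => {
  if (a.length !== b.length) return false;
  const sa = [...a].sort();
  const sb = [...b].sort();
  // Same elements with the same multiplicity, regardless of order.
  return sa.every((x, i) => x === sb[i]);
};

const given = ["de8", "de9", "de7"];

console.log(setEquals(["de7", "de8", "de9"], given));           // true
console.log(sortedEquals(["de7", "de8", "de9"], given));        // true
console.log(setEquals(["de7", "de7", "de8", "de9"], given));    // true
console.log(sortedEquals(["de7", "de7", "de8", "de9"], given)); // false
```

If arr can never contain repeated ids, both queries select the same documents; otherwise pick the one whose treatment of duplicates matches what you need.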
I’m using $indexOfCP for locating the index of some specific words.
Unfortunately, it only returns the first result. I want all the occurrences.
Plus, $indexOfCP is case sensitive and I want it to be case-insensitive.
Here is an example:
db.inventory.aggregate(
[
{
$project:
{
cpLocation: { $indexOfCP: [ "$item", "foo" ] },
}
}
]
)
{ $indexOfCP: [ "cafeteria", "e" ] }
result: 3
You can use $regexFindAll, which returns an array of matches, each with its position in the idx key. You can then add a stage that extracts cpLocation.idx, like this:
Adding "options": "i" also makes the regex case-insensitive.
db.collection.aggregate([
{
$project: {
cpLocation: {
"$regexFindAll": {
"input": "$item",
"regex": "e",
"options": "i"
}
},
}
},
{
"$addFields": {
cpLocation: "$cpLocation.idx"
}
}
])
Example here
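For comparison, the same idea can be sketched in plain JavaScript with String.prototype.matchAll. The helper name allMatchIndexes is made up for this example. One caveat: JavaScript reports positions in UTF-16 code units, whereas idx in $regexFindAll is a code point index, so the numbers are only guaranteed to agree for text without characters outside the Basic Multilingual Plane.

```javascript
// Hypothetical helper: positions of every case-insensitive match of `pattern`.
function allMatchIndexes(input, pattern) {
  // "g" finds all occurrences, "i" ignores case (like "options": "i").
  const re = new RegExp(pattern, "gi");
  return [...input.matchAll(re)].map((m) => m.index);
}

console.log(allMatchIndexes("cafeteria", "e")); // [ 3, 5 ]
console.log(allMatchIndexes("Cafeteria", "c")); // [ 0 ]
```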
I have an array of objects and I want to check if there is an object that matches multiple properties. I have tried using $in and $and but it does not work the way I want it to.
Here is my current implementation.
I have an array like
"choices": [
{
"name": "choiceA",
"id": 0,
"l": "k"
},
{
"name": "choiceB",
"id": 1,
"l": "j"
},
{
"name": "choiceC",
"id": 2,
"l": "l"
}
]
I am trying to write aggregation code that can check if there is an object that contains both "id":2 and "l":"j" properties. My current implementation checks if there is an object containing the first property then checks if there is an object containing the second one.
How can I get my desired results?
Below, see my aggregation query. The full code is here
db.poll.aggregate([
{
"$match": {
"_id": 100
}
},
{
$project: {
numberOfVotes: {
$and: [
{
$in: [
2,
"$choices.id"
]
},
{
$in: [
"j",
"$choices.l"
]
}
]
},
}
}
])
The above query returns true even though no object in the array has both of the properties "id": 2 and "l": "j". I understand why the query behaves this way. How can I get my desired results?
You want to use something like $elemMatch:
db.collection.find({
choices: {
$elemMatch: {
id: 2,
l: "j"
}
}
})
MongoPlayground
EDIT
In an aggregation $project stage I would use $filter
db.poll.aggregate([
{
"$match": {
"_id": 100
}
},
{
$project: {
numberOfVotes: {
$gt: [
{
$size: {
$filter: {
input: "$choices",
as: "choice",
cond: {
$and: [
{
$eq: [
"$$choice.id",
2
]
},
{
$eq: [
"$$choice.l",
"j"
]
}
]
}
}
}
},
0
]
}
}
}
])
MongoPlayground
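The difference between the two $in checks and the $elemMatch / $filter approach can be reproduced in plain JavaScript. This is only an illustrative sketch using the choices array from the question; the variable names separateChecks and sameElement are made up.

```javascript
const choices = [
  { name: "choiceA", id: 0, l: "k" },
  { name: "choiceB", id: 1, l: "j" },
  { name: "choiceC", id: 2, l: "l" },
];

// What the two $in checks do: each property may come from a different element.
const separateChecks =
  choices.map((c) => c.id).includes(2) &&
  choices.map((c) => c.l).includes("j");

// What $elemMatch and $filter do: both properties must be on the same element.
const sameElement =
  choices.filter((c) => c.id === 2 && c.l === "j").length > 0;

console.log(separateChecks, sameElement); // true false
```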
I have a collection with a structure like this:
[
{
"toystore": 22,
"toystore_name": "Toystore A",
"toys": [
{
"toy": "buzz",
"code": 17001,
"price": 500
},
{
"toy": "woddy",
"code": 17002,
"price": 1000
},
{
"toy": "pope",
"code": 17003,
"price": 300
}
]
},
{
"toystore": 11,
"toystore_name": "Toystore B",
"toys": [
{
"toy": "jessie",
"code": 17005,
"price": 500
},
{
"toy": "rex",
"code": 17006,
"price": 2000
}
]
}
]
I have n toy stores, and each store lists its available toys in the toys field (an array).
The codes I want to search for may be repeated:
[ { "toys.code": 17001 }, { "toys.code": 17003 }, { "toys.code": 17005 }, { "toys.code": 17005 }]
I want one result entry for each of these toys.code values, even when a code is repeated. Currently a repeated code (for example 17005) produces only one entry.
This is my current output:
[
{
"_id": "Toystore A",
"toy_array": [
{
"price_original": 500,
"toy": "buzz"
},
{
"price_original": 300,
"toy": "pope"
}
]
},
{
"_id": "Toystore B",
"toy_array": [
//**********
//as i searched 2 times the code:17005, this object should be shown 2 times. only is showed 1 time.
{
"price_original": 500,
"toy": "jessie"
}
]
}
]
How can I get one result for every entry in my search array, including repeated ones?
This is my live code:
db.collection.aggregate([
{
$unwind: "$toys"
},
{
$match: {
$or: [
{
"toys.code": 17001
},
{
"toys.code": 17003
},
{
"toys.code": 17005
},
{
"toys.code": 17005
}
],
}
},
{
$group: {
_id: "$toystore_name",
toy_array: {
$push: {
price_original: "$toys.price",
toy: "$toys.toy"
},
},
},
},
])
https://mongoplayground.net/p/g1-oST015y0
The $match stage examines each document in the pipeline and evaluates the provided criteria, and either eliminates the document, or passes it along to the next stage. It does not iterate the match criteria and examine the entire stream of documents for each one, which is what needs to happen in order to duplicate the document that is referenced twice.
This can be done, but you will need to pass the array of codes twice in the pipeline, once to eliminate documents that don't match at all, and again to allow the duplication you are looking for.
The stages needed are:
1. $match to eliminate toy stores that don't have any of the requested toys
2. $project using
   o $map to iterate over the search array
   o $filter to select the matching toys
   o $reduce to eliminate empty arrays and recombine the entries into a single array
3. an additional $project to remove the codes from toy_array
var codearray = [17001, 17003, 17005, 17005];
db.collection.aggregate([
{$match: {"toys.code": {$in: codearray }}},
{$project: {
_id: "$toystore_name",
toy_array: {
$reduce: {
input: {
$map: {
input: codearray,
as: "qcode",
in: {
$filter: {
input: "$toys",
as: "toy",
cond: {$eq: [ "$$toy.code","$$qcode" ]}
}
}
}
},
initialValue: [],
in: {
$cond: {
if: {$eq: ["$$this",[]]},
then: "$$value",
else: {$concatArrays: ["$$value", "$$this"]}
}
}
}
}
}},
{$project: {"toy_array.code": 0}}
])
Playground
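The $map / $filter / $reduce combination behaves much like the array methods of the same names in JavaScript. Below is a rough plain-JavaScript sketch of what the $project stages compute for one store; toysForCodes is a made-up helper name and the store is the "Toystore B" document from the question.

```javascript
const codearray = [17001, 17003, 17005, 17005];

const storeB = {
  toystore_name: "Toystore B",
  toys: [
    { toy: "jessie", code: 17005, price: 500 },
    { toy: "rex", code: 17006, price: 2000 },
  ],
};

// Hypothetical helper mirroring the two $project stages for one document.
function toysForCodes(toys, codes) {
  return codes
    // $map + $filter: one (possibly empty) array of matching toys per code.
    .map((qcode) => toys.filter((toy) => toy.code === qcode))
    // $reduce: skip empty arrays and concatenate the rest in search order.
    .reduce(
      (value, current) =>
        current.length === 0 ? value : value.concat(current),
      []
    )
    // Final $project: drop the code field from every entry.
    .map(({ code, ...rest }) => rest);
}

console.log(toysForCodes(storeB.toys, codearray));
```

Because the search array is iterated once per code, the repeated code 17005 contributes the "jessie" entry twice.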
Example dataset:
{
"source": "http://adress.com/",
"date": ISODate("2016-08-31T08:41:00.000Z"),
"author": "Some Guy",
"thread": NumberInt(115265),
"commentID": NumberInt(2693454),
"title": ["A", "title", "for", "a", "comment"],
"comment": ["This", "is", "a", "comment", "with", "a", "duplicate"]
}
The dataset I'm using is basically a comment from a user, with a unique commentID. The comment itself is held as an array of words. I've managed to unwind the array, match the buzzword and get back all finds.
My problem now is getting rid of duplicates, where buzzwords show up several times in a comment. I suppose I have to use a group, but can't find a way to do it.
The current pipeline is:
[
{"$unwind": "$comment"},
{"$match": {"comment": buzzword } }
]
Which does work just fine. But if I'm searching for the buzzword "a", in the above example it will find the comment twice, as the word "a" shows up twice.
What I need is a JSON for the pipeline to drop all duplicates past the first.
You could run a single pipeline without $unwind that takes advantage of the array operators $arrayElemAt and $filter. The former will give you the first element in a given array and this array will be a result of filtering elements using the latter, $filter.
Follow this example to get the desired result:
db.collection.aggregate([
{ "$match": { "comment": buzzword } },
{
"$project": {
"source": 1,
"date": 1,
"author": 1,
"thread": 1,
"commentID": 1,
"title": 1,
"comment": 1,
"distinct_matched_comment": {
"$arrayElemAt": [
{
"$filter": {
"input": "$comment",
"as": "word",
"cond": {
"$eq": ["$$word", buzzword]
}
}
}, 0
]
}
}
}
])
Explanations
In the above pipeline, the trick is to first filter the comment array by selecting just the elements which satisfy a given criterion. For example, to demonstrate this concept, run this pipeline:
db.collection.aggregate([
{
"$project": {
"filtered_comment": {
"$filter": {
"input": ["This", "is", "a", "comment", "with", "a", "duplicate"], /* hardcoded input array for demo */
"as": "word", /* The variable name for the element in the input array.
The as expression accesses each element in the input array by this variable.*/
"cond": { /* this condition determines whether to include the element in the resulting array. */
"$eq": ["$$word", "a"] /* condition where the variable equals the buzzword "a" */
}
}
}
}
}
])
Output
{
"_id" : ObjectId("57dbd747be80cdcab63703dc"),
"filtered_comment" : [
"a",
"a"
]
}
As the $filter's input parameter accepts an expression that resolves to an array, you can use an array field instead.
Taking the result above further, we can show how the $arrayElemAt operator works:
db.collection.aggregate([
{
"$project": {
"distinct_matched_comment": {
"$arrayElemAt": [
["a", "a"], /* array produced by the above $filter expression */
0 /* the index position of the element we want to return, here being the first */
]
}
}
}
])
Output
{
"_id" : ObjectId("57dbd747be80cdcab63703dc"),
"distinct_matched_comment": "a"
}
Since the expression in the $arrayElemAt operator
{ "$arrayElemAt": [ <array>, <idx> ] }
can be any valid expression as long as it resolves to an array, you can use the $filter expression from the beginning of this example as the array argument. The final pipeline then looks like this:
db.collection.aggregate([
{
"$project": {
"distinct_matched_comment": {
"$arrayElemAt": [
{ /* expression that produces an array with elements that match a condition */
"$filter": {
"input": "$comment",
"as": "word",
"cond": {
"$eq": ["$$word", buzzword]
}
}
},
0 /* the index position of the element we want to return, here being the first */
]
}
}
}
])
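The same two steps can be tried in plain JavaScript, which may help when reasoning about the pipeline. This is just a sketch with the comment array from the example dataset; filter plays the role of $filter and indexing with [0] plays the role of $arrayElemAt.

```javascript
const comment = ["This", "is", "a", "comment", "with", "a", "duplicate"];
const buzzword = "a";

// Like $filter: keep only the words equal to the buzzword.
const filtered = comment.filter((word) => word === buzzword);

// Like $arrayElemAt with index 0: take the first of those matches.
const distinctMatched = filtered[0];

console.log(filtered);        // [ 'a', 'a' ]
console.log(distinctMatched); // a
```

The document itself is never multiplied, because the array is inspected in place instead of being unwound.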
One possible solution is to use $group, like so:
...
{ $unwind: "$comment"},
{ $match: {"comment": buzzword } },
{
$group: {
_id : "$_id",
source: { $first: "$source" },
date: { $first: "$date" },
author: { $first: "$author" },
thread: { $first: "$thread" },
commentID: { $first: "$commentID" },
title: { $first: "$title" }
}
}
...
Another way would be to use $project before unwinding the array to get rid of the duplicate words, like so:
...
{
$project: {
source: 1,
date: 1,
author: 1,
thread: 1,
commentID: 1,
title: 1,
comment: { $setUnion: ["$comment"] }
}
},
{$unwind: "$comment"},
{$match: {"comment": buzzword } }
...
Update due to comment:
To retain the comment array, you could project the deduplicated array to another field and unwind that instead, like so:
...
{
$project: {
source: 1,
date: 1,
author: 1,
thread: 1,
commentID: 1,
title: 1,
comment: 1,
commentWord: { $setUnion: ["$comment"] }
}
},
{$unwind: "$commentWord"},
{$match: {"commentWord": buzzword } }
...
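As a rough plain-JavaScript illustration of this last variant: a Set plays the role of $setUnion, map plays the role of $unwind and filter plays the role of $match. The document below is a trimmed copy of the example dataset. Note that, unlike a Set, $setUnion does not guarantee any particular order of the resulting elements.

```javascript
const doc = {
  commentID: 2693454,
  comment: ["This", "is", "a", "comment", "with", "a", "duplicate"],
};
const buzzword = "a";

// Like $setUnion on a single array: remove duplicate words.
const commentWord = [...new Set(doc.comment)];

// Like $unwind on commentWord: one copy of the document per distinct word.
const unwound = commentWord.map((word) => ({ ...doc, commentWord: word }));

// Like $match: keep the copies whose commentWord equals the buzzword.
const matched = unwound.filter((d) => d.commentWord === buzzword);

console.log(matched.length);         // 1
console.log(matched[0].comment.length); // 7
```

The comment is found exactly once, and the original comment array, duplicates included, is still available on the result.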
Hope that helps