Getting the position of specific words in documents in MongoDB

I'm using $indexOfCP to locate the index of specific words.
Unfortunately, it only returns the first occurrence; I want all of them.
Also, $indexOfCP is case-sensitive and I want the search to be case-insensitive.
Here is an example:
db.inventory.aggregate(
[
{
$project:
{
cpLocation: { $indexOfCP: [ "$item", "foo" ] },
}
}
]
)
{ $indexOfCP: [ "cafeteria", "e" ] }
result: 3

You can use $regexFindAll (available since MongoDB 4.2), which returns an array of matches, each with its starting index in the key idx.
Adding "options": "i" makes the regex case-insensitive. You can then add a stage that keeps only cpLocation.idx, like this:
db.collection.aggregate([
{
$project: {
cpLocation: {
"$regexFindAll": {
"input": "$item",
"regex": "e",
"options": "i"
}
},
}
},
{
"$addFields": {
cpLocation: "$cpLocation.idx"
}
}
])
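To sanity-check which positions the pipeline above should report, the same idea can be sketched in plain JavaScript outside the database. This is only an illustration: findAllIndexes is a hypothetical helper, not a MongoDB API, and JavaScript string indexes count UTF-16 code units, which can differ from the code point indexes that $regexFindAll reports when the text contains characters outside the Basic Multilingual Plane.

```javascript
// Hypothetical helper (not part of MongoDB): returns the start index of every
// case-insensitive occurrence of `word` in `input`.
function findAllIndexes(input, word) {
  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // g = find all matches, i = ignore case (like "options": "i" above).
  const pattern = new RegExp(escaped, "gi");
  return Array.from(input.matchAll(pattern), (m) => m.index);
}

console.log(findAllIndexes("cafeteria", "e"));        // → [ 3, 5 ]
console.log(findAllIndexes("Foo bar foo FOO", "foo")); // → [ 0, 8, 12 ]
```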

Related

MongoDB: return a single field from the matched item of a sub-document array

Consider I have the following document structure:
{
"_id": ObjectID(),
"foo": "FOO",
"bar": "BAR",
"items": [
{
"foo": "FOO",
"bar": "BAR",
"name": "hello",
"value": "50"
},
{
"foo": "FOO",
"bar": "BAR",
"name": "bye",
"value": "300"
},
{
"foo": "FOO",
"bar": "BAR",
"name": "welcome",
"value": "500"
}
],
}
I would like to find all items that match both the following conditions:
name = "hello"
value != 0
And for each matched item I would like to return only the value field. I don't need all the other fields (foo/bar in this example).
So the ideal result should look like this:
[
{ value: "50" },
{ value: "100" },
{ value: "30" },
…
]
How do I do this with MongoDB?
I've tried this query:
// filter
{
items: {
$elemMatch: {
name: "hello",
value: { $ne: "0" },
},
}
}
// projection
{
"_id": 0,
"items.$": 1
}
It matches the items correctly, but it returns the whole items and I want only a single field from it.
Sadly, I can't use projection like this: "items.$.value": 1.
I've also tried the following aggregation:
{
$unwind: {
path: "$items"
}
}
{
$match: {
"items.name": "hello",
"items.value": { $ne: "0" },
}
}
{
$replaceRoot: {
newRoot: "$items"
}
}
{
$project: {
"value": 1
}
}
It works perfectly and returns the expected result, but I have a feeling that it will have poorer performance.
Is there a way to achieve what I want with optimal performance?
Maybe something like this:
db.collection.aggregate([
{
$match: {
items: {
$elemMatch: {
"name": "hello",
"value": {
$ne: "0"
}
}
}
}
},
{
$project: {
items: {
"$map": {
input: {
"$filter": {
"input": "$items",
"as": "i",
"cond": {
$and: [
{
$ne: [
"$$i.value",
"0"
]
},
{
$eq: [
"$$i.name",
"hello"
]
}
]
}
}
},
as: "m",
"in": {
"value": "$$m.value"
}
}
}
}
},
{
$unwind: "$items"
},
{
"$replaceRoot": {
"newRoot": "$items"
}
}
])
Explained:
Match only documents that have at least one items element with name: "hello" and value != "0". An index on items.name + items.value helps here, and this $match stage reduces the data passed to the later stages (less data = better performance).
In the $project stage, $filter keeps only the matching items and $map keeps only their value field.
(This removes the unneeded array elements and fields; again, less data = better performance.)
Unwind only the already-filtered array (keeping $unwind in the later stages saves a lot of resources if the collection is big).
Replace the root with the remaining values only (this gives the output the expected format).
Indeed, as identified, a simple $match on the dotted field names will not give correct results here, so $elemMatch must be used in the first $match stage.
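For readers who want to see what the $filter, $map, $unwind and $replaceRoot stages compute, here is a rough in-memory equivalent in plain JavaScript. It is a sketch, not a replacement for the pipeline: matchedValues and the sample docs array are made up for illustration, and values are compared as strings because that is how the question stores them.

```javascript
// In-memory sketch of the $filter + $map + $unwind + $replaceRoot stages.
// `docs` stands in for the collection (illustrative data, not from a real database).
function matchedValues(docs) {
  return docs.flatMap((doc) =>
    (doc.items || [])
      // $filter: same element must have name "hello" and a value other than "0"
      .filter((item) => item.name === "hello" && item.value !== "0")
      // $map: keep only the value field of each surviving element
      .map((item) => ({ value: item.value }))
  );
}

const docs = [
  {
    foo: "FOO",
    bar: "BAR",
    items: [
      { foo: "FOO", bar: "BAR", name: "hello", value: "50" },
      { foo: "FOO", bar: "BAR", name: "bye", value: "300" },
      { foo: "FOO", bar: "BAR", name: "hello", value: "0" },
    ],
  },
];

console.log(matchedValues(docs)); // → [ { value: '50' } ]
```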

MongoDB Aggregation: How to check if an object containing multiple properties exists in an array

I have an array of objects and I want to check if there is an object that matches multiple properties. I have tried using $in and $and but it does not work the way I want it to.
Here is my current implementation.
I have an array like
"choices": [
{
"name": "choiceA",
"id": 0,
"l": "k"
},
{
"name": "choiceB",
"id": 1,
"l": "j"
},
{
"name": "choiceC",
"id": 2,
"l": "l"
}
]
I am trying to write aggregation code that can check if there is an object that contains both "id":2 and "l":"j" properties. My current implementation checks if there is an object containing the first property then checks if there is an object containing the second one.
How can I get my desired results?
My aggregation query is below.
db.poll.aggregate([
{
"$match": {
"_id": 100
}
},
{
$project: {
numberOfVotes: {
$and: [
{
$in: [
2,
"$choices.id"
]
},
{
$in: [
"j",
"$choices.l"
]
}
]
},
}
}
])
The above query returns true even though no object in the array has both of the properties "id": 2 and "l": "j". I know the code is doing what I wrote. How can I get my desired results?
You want to use something like $elemMatch
db.collection.find({
choices: {
$elemMatch: {
id: 2,
l: "j"
}
}
})
EDIT
In an aggregation $project stage I would use $filter
db.poll.aggregate([
{
"$match": {
"_id": 100
}
},
{
$project: {
numberOfVotes: {
$gt: [
{
$size: {
$filter: {
input: "$choices",
as: "choice",
cond: {
$and: [
{
$eq: [
"$$choice.id",
2
]
},
{
$eq: [
"$$choice.l",
"j"
]
}
]
}
}
}
},
0
]
}
}
}
])
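The difference between the two approaches is easier to see on a plain array. The sketch below uses the choices array from the question and ordinary JavaScript array methods as a stand-in for the aggregation operators; the variable names are only for illustration.

```javascript
const choices = [
  { name: "choiceA", id: 0, l: "k" },
  { name: "choiceB", id: 1, l: "j" },
  { name: "choiceC", id: 2, l: "l" },
];

// What the two separate $in checks compute:
// each condition may be satisfied by a different element.
const separateChecks =
  choices.some((c) => c.id === 2) && choices.some((c) => c.l === "j");

// What $elemMatch (or $filter with $and) computes:
// a single element must satisfy both conditions.
const sameElement = choices.some((c) => c.id === 2 && c.l === "j");

console.log(separateChecks); // → true  (choiceC has id 2, choiceB has l "j")
console.log(sameElement);    // → false (no single choice has both)
```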

Match in an array within the current document

In an aggregation pipeline operating on documents like
{
"availablePackages": [
{
"title": "Silver",
"code": "001",
},
{
"title": "Gold",
"code": "002",
},
{
"title": "Platinum",
"code": "003",
}
],
"selectedPackageCode": "002"
}
I need to replace everything in the above document with the title of the package whose code matches selectedPackageCode. So I want the pipeline to end up with
{
"packageTitle": "Gold"
}
This is not a lookup, because it's in the current document. I thought I might be able to use $let to create a variable and then a $match to find the right array element, but I have not found a syntax that works.
You need $filter to match availablePackages against selectedPackageCode and $arrayElemAt to get the first matching element. To do it in one aggregation stage, you can use $let to define a temporary variable:
db.col.aggregate([
// ... other stages
{
$project: {
packageTitle: {
$let: {
vars: {
selectedPackage: {
$arrayElemAt: [
{ $filter: { input: "$availablePackages", cond: { $eq: [ "$$this.code", "$selectedPackageCode" ] } } }, 0
]
}
},
in: "$$selectedPackage.title"
}
}
}
}
])
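As a rough mental model, the $let / $filter / $arrayElemAt combination above behaves like a find over the array. The plain JavaScript below is only a sketch of that logic; packageTitle is a hypothetical helper, and returning null when nothing matches is a choice made for this sketch, not a statement about what the pipeline outputs in that case.

```javascript
// Hypothetical helper mirroring $filter + $arrayElemAt(..., 0) + $let.
function packageTitle(doc) {
  // First package whose code equals the document's own selectedPackageCode.
  const selected = doc.availablePackages.find(
    (pkg) => pkg.code === doc.selectedPackageCode
  );
  // Sketch-only convention: null when no package matches.
  return selected === undefined ? null : { packageTitle: selected.title };
}

const doc = {
  availablePackages: [
    { title: "Silver", code: "001" },
    { title: "Gold", code: "002" },
    { title: "Platinum", code: "003" },
  ],
  selectedPackageCode: "002",
};

console.log(packageTitle(doc)); // → { packageTitle: 'Gold' }
```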

MongoDB: Ordered matching of array entries

I have a document which looks like this:
"tokens":
[
{
"index": 1,
"word": "I",
"pos": "NNP",
},
{
"index": 2,
"word": "played",
"pos": "VBZ",
},
{
"index": 3,
"word": "football",
"pos": "IN",
}
]
And my query is:
db.test.find({
$and: [
{
'tokens.word': 'I'
},
{
tokens: {
$elemMatch: {
word: /f.*/,
pos: 'IN'
}
}
}
]
})
The output of my query is the document above. But the result should be no match in this case as I'm searching for
word: "I" followed by [word: /f.*/ and pos:'IN']
which doesn't match the tokens array in the document, since the token I is followed by played and only then by football. In the query, however, the filters are listed in that order: the search starts with
word: "I"
followed by
f.*
[football in this case].
The $and operator is a purely logical one where the order of the filter conditions does not play any role.
So the following queries are absolutely equivalent from a result set point of view:
$and: [
{ "a": 1 },
{ "b": 2 }
]
$and: [
{ "b": 2 },
{ "a": 1 }
]
The documentation states:
$and performs a logical AND operation on an array of two or more
expressions (e.g. <expression1>, <expression2>, etc.) and selects the
documents that satisfy all the expressions in the array. The $and
operator uses short-circuit evaluation. If the first expression (e.g.
<expression1>) evaluates to false, MongoDB will not evaluate the
remaining expressions.
In your example, the entry with index 1 matches the first filter "tokens.word": "I" and the entry with index 3 matches the second ($elemMatch) filter. So the document has to be returned.
UPDATE:
Here is an idea - more of a starting point really - for you to get closer to what you want:
db.collection.aggregate({
$addFields: { // create a temporary field
"differenceBetweenMatchPositions": { // called "differenceBetweenMatchPositions"
$subtract: [ // which holds the difference between
{ $indexOfArray: [ "$tokens.word", "I" ] }, // the index of first element matching first condition and
{ $min: { // the lowest index of
$filter: { // a filtered array
input: { $range: [ 0, { $size: "$tokens" } ] }, // that holds the indices [ 0, 1, 2, ..., n ] where n is the number of items in the "tokens" array - 1
as: "this", // which we want to access using "$$this"
cond: {
$let: {
vars: { "elem": { $arrayElemAt: [ "$tokens", "$$this" ] } }, // get the n-th element in our "tokens" array
in: {
$and: [
{ $eq: [ "$$elem.pos", "IN" ] }, // "pos" field must be "IN"
{ $eq: [ { $substrBytes: [ "$$elem.word", 0, 1 ] }, "f" ] } // 1st character of the "word" must be "f"
]
}
}
}
}
}
}]
}
}
}, {
$match: {
"differenceBetweenMatchPositions": { $lt: 0 } // we only want documents where the first condition is matched by an item in our array before the second one gets matched, too.
}
})
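To make the position comparison concrete, here is the same idea in plain JavaScript on the tokens array from the question. It is a sketch under a few assumptions: the helper names are made up, the second condition is "pos is IN and the word starts with f" (as in the pipeline, not the unanchored /f.*/ of the original query), and unlike the pipeline it requires both conditions to match somewhere. The second function is a stricter variant that assumes "followed by" means "immediately followed by".

```javascript
const tokens = [
  { index: 1, word: "I", pos: "NNP" },
  { index: 2, word: "played", pos: "VBZ" },
  { index: 3, word: "football", pos: "IN" },
];

const isFirst = (t) => t.word === "I";
const isSecond = (t) => t.pos === "IN" && /^f/.test(t.word);

// Same idea as the pipeline: the first position matching condition 1
// lies before the first position matching condition 2.
function occursBefore(tokens) {
  const a = tokens.findIndex(isFirst);
  const b = tokens.findIndex(isSecond);
  return a !== -1 && b !== -1 && a < b;
}

// Stricter variant (an assumption about the intent): some token matching
// condition 1 is directly followed by a token matching condition 2.
function immediatelyFollowedBy(tokens) {
  return tokens.some(
    (t, i) => isFirst(t) && i + 1 < tokens.length && isSecond(tokens[i + 1])
  );
}

console.log(occursBefore(tokens));          // → true
console.log(immediatelyFollowedBy(tokens)); // → false ("played" sits in between)
```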

Select documents with sub-arrays that match some criteria

I have a collection with documents such as:
{
_id: "1234",
_class: "com.acme.classA",
a_collection: [
{
otherdata: 'somedata',
type: 'a'
},
{
otherdata: 'bar',
type: 'a'
},
{
otherdata: 'foo',
type: 'b'
}
],
lastChange: ISODate("2014-08-17T22:25:48.918Z")
}
I want to find all documents by id together with a subset of the sub-array. For example, I want to find all documents with id "1234", keeping only the a_collection elements whose type is 'a', giving this result:
{
_id: "1234",
_class: "com.acme.classA",
a_collection: [
{
otherdata: 'somedata',
type: 'a'
},
{
otherdata: 'bar',
type: 'a'
}
],
lastChange: ISODate("2014-08-17T22:25:48.918Z")
}
I have tried this :
db.collection_name.aggregate({
$match: {
'a_collection.type': 'a'
}
},
{
$unwind: "$a_collection"
},
{
$match: {
"a_collection.type": 'a'
}
},
{
$group: {
_id: "$_id",
a_collection: {
$addToSet: "$a_collection"
},
}
}).pretty()
But this doesn't return the other properties (such as 'lastChange').
What is the correct way to do this?
Are you using PHP?
And is this the only way you can get the "text"?
Maybe you can rewrite it so that it is a JSON element,
something like this:
{
"_id": "1234",
"_class": "com.acme.classA",
"a_collection": [
{
"otherdata": "somedata",
"type": "a"
},
{
"otherdata": "bar",
"type": "a"
},
{
"otherdata": "foo",
"type": "b"
}
]
}
Then you can use the json_decode() function from PHP to make an array and then you can search and return only the needed data.
Edit: I misread the question. Are you looking for a query like this?
db.inventory.find( {
$or: [ { _id: "1234" }, { 'a_collection.type': 'a' }]
} )
I found the code here: http://docs.mongodb.org/manual/tutorial/query-documents/
This is the correct query:
db.collection_name.aggregate({
$match: {
'a_collection.type': 'a'
}
},
{
$unwind: "$a_collection"
},
{
$match: {
"a_collection.type": 'a'
}
},
{
$group: {
_id: "$_id",
a_collection: {
$addToSet: "$a_collection"
},
lastChange : { $first : "$lastChange" }
}
}).pretty()
Something is very strange about your desired query (and your pipelines). First of all, _id is a reserved field with a unique index on it. The result of finding all documents with _id = "1234" can only be 0 or 1 documents. Second, to find documents with a_collection.type = "a" for some element of the array a_collection, you don't need the aggregation framework. You just need a find query:
> db.test.find({ "a_collection.type" : "a" })
So all the work here appears to be winnowing the subarray of one document down to just those elements with a_collection.type = "a". Why do you have these objects in the same document if most of what you do is split them up and eliminate some to find a result set? How common and how truly necessary is it to harvest just the array elements with a_collection.type = "a"? Perhaps you want to model your data differently so a query like
> db.test.find({ <some condition>, "a_collection.type" : "a" })
returns you the correct documents. I can't say how you can do it best with the given information, but I can say that your current approach strongly suggests revision is needed (and I'm happy to help with suggestions if you include further information or post a new question).
I would agree with the answer you submitted yourself, except that in MongoDB 2.6 and greater there is a better way to do this with $map and $setDifference, which were both introduced in that version. Where available, this approach is much faster:
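db.collection.aggregate([
{ "$match": { "a_collection.type": "a" } },
{ "$project": {
// keep the other top-level fields the question asks for
"_class": 1,
"lastChange": 1,
"a_collection": {
"$setDifference": [
// replace every non-matching element with false ...
{ "$map": {
"input": "$a_collection",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.type", "a" ] },
"$$el",
false
]
}
}},
// ... then remove the false entries
[false]
]
}
}}
])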
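This has no $group or initial $unwind, both of which can be costly, nor the second $match stage. So where MongoDB 2.6 is available, this is the better option. Note that $setDifference treats its inputs as sets, so duplicate elements are collapsed, just as with $addToSet.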
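If it helps to see the intended result without a database, the narrowing of a_collection can be sketched in plain JavaScript. This is only an in-memory illustration of the desired output (keepType is a made-up helper, and the date is kept as a string for simplicity); unlike $setDifference, it preserves order and duplicates.

```javascript
// In-memory sketch: keep every top-level field, but narrow a_collection
// to the elements whose type matches.
function keepType(doc, type) {
  return {
    ...doc,
    a_collection: doc.a_collection.filter((el) => el.type === type),
  };
}

const doc = {
  _id: "1234",
  _class: "com.acme.classA",
  a_collection: [
    { otherdata: "somedata", type: "a" },
    { otherdata: "bar", type: "a" },
    { otherdata: "foo", type: "b" },
  ],
  lastChange: "2014-08-17T22:25:48.918Z",
};

const result = keepType(doc, "a");
console.log(result.a_collection.length); // → 2
console.log(result.lastChange);          // → 2014-08-17T22:25:48.918Z
```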