Getting the position of specific words in documents in MongoDB

I'm using $indexOfCP to locate the index of specific words.
Unfortunately, it only returns the first occurrence, and I want all of them.
In addition, $indexOfCP is case-sensitive, and I want the search to be case-insensitive.
Here is an example:
db.inventory.aggregate(
[
{
$project:
{
cpLocation: { $indexOfCP: [ "$item", "foo" ] },
}
}
]
)
{ $indexOfCP: [ "cafeteria", "e" ] }
result: 3

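You can use $regexFindAll (available since MongoDB 4.2), which returns an array of matches; each match stores its starting code point index in the idx field. Setting "options": "i" makes the regex case-insensitive.
A second stage then reduces cpLocation to just the indexes via cpLocation.idx, like this: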
db.collection.aggregate([
{
$project: {
cpLocation: {
"$regexFindAll": {
"input": "$item",
"regex": "e",
"options": "i"
}
},
}
},
{
"$addFields": {
cpLocation: "$cpLocation.idx"
}
}
])
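To make the expected result concrete, here is a minimal plain-JavaScript sketch of what the pipeline above computes for a single string. It does not talk to MongoDB; the helper name allIndexesOf is made up for illustration, and the only assumption is that indexes are counted in code points, as $indexOfCP does.

```javascript
// Hypothetical helper, not a MongoDB API: returns the code point index of every
// case-insensitive, non-overlapping occurrence of `word` in `text`.
function allIndexesOf(text, word) {
  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(escaped, "giu");
  // match.index is counted in UTF-16 code units; convert it to code points.
  return Array.from(text.matchAll(pattern), (match) =>
    Array.from(text.slice(0, match.index)).length
  );
}

console.log(allIndexesOf("cafeteria", "E")); // [ 3, 5 ]
```

For plain ASCII input the code unit and code point indexes coincide; the conversion only matters for characters outside the Basic Multilingual Plane.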

Related

MongoDB: return a single field from the matched item of a sub-document array

Consider I have the following document structure:
{
"_id": ObjectID(),
"foo": "FOO",
"bar": "BAR",
"items": [
{
"foo": "FOO",
"bar": "BAR",
"name": "hello",
"value": "50"
},
{
"foo": "FOO",
"bar": "BAR",
"name": "bye",
"value": "300"
},
{
"foo": "FOO",
"bar": "BAR",
"name": "welcome",
"value": "500"
}
],
}
I would like to find all items that match both the following conditions:
name = "hello"
value != 0
And for each matched item I would like to return only the value field. I don't need all the other fields (foo/bar in this example).
So the ideal result should look like this:
[
{ value: "50" },
{ value: "100" },
{ value: "30" },
…
]
How do I do this with MongoDB?
I've tried this query:
// filter
{
items: {
$elemMatch: {
name: "hello",
value: { $ne: "0" },
},
}
}
// projection
{
"_id": 0,
"items.$": 1
}
It matches the items correctly, but it returns each matched item in full, and I want only a single field from it.
Sadly, I can't use projection like this: "items.$.value": 1.
I've also tried the following aggregation:
{
$unwind: {
path: "$items"
}
}
{
$match: {
"items.name": "hello",
"items.value": { $ne: "0" },
}
}
{
$replaceRoot: {
newRoot: "$items"
}
}
{
$project: {
"value": 1
}
}
It works perfectly and returns the expected result, but I have a feeling that it will have poorer performance.
Is there a way to achieve what I want with optimal performance?

Maybe something like this:
db.collection.aggregate([
{
$match: {
items: {
$elemMatch: {
"name": "hello",
"value": {
$ne: "0"
}
}
}
}
},
{
$project: {
items: {
"$map": {
input: {
"$filter": {
"input": "$items",
"as": "i",
"cond": {
$and: [
{
$ne: [
"$$i.value",
"0"
]
},
{
$eq: [
"$$i.name",
"hello"
]
}
]
}
}
},
as: "m",
"in": {
"value": "$$m.value"
}
}
}
}
},
{
$unwind: "$items"
},
{
"$replaceRoot": {
"newRoot": "$items"
}
}
])
Explanation:
1. $match keeps only documents that have at least one items element with name "hello" and a value other than "0". An index on items.name and items.value is good to have here, and this stage reduces the data passed to the later stages (less data, better performance).
2. $project uses $filter to keep only the matching array elements and $map to keep only their value field. This drops the unnecessary array elements and fields, which again means less data.
3. $unwind runs only on the already filtered array. Keeping $unwind in the later stages saves a lot of resources if the collection is big.
4. $replaceRoot promotes each remaining element to the root so that the output has the expected format.
As pointed out, a plain $match on "items.name" and "items.value" does not give correct results, because the two conditions can be satisfied by different array elements; $elemMatch must be used in the first $match stage.
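The $filter and $map step can be pictured with a short plain-JavaScript sketch. It runs on an in-memory object rather than against MongoDB; the sample document sampleDoc and the helper matchedValues are invented for illustration.

```javascript
// Made-up document shaped like the one in the question.
const sampleDoc = {
  items: [
    { foo: "FOO", bar: "BAR", name: "hello", value: "50" },
    { foo: "FOO", bar: "BAR", name: "bye", value: "300" },
    { foo: "FOO", bar: "BAR", name: "hello", value: "0" },
  ],
};

// Hypothetical helper: keep the elements that satisfy both conditions
// (the $filter part), then keep only their `value` field (the $map part).
function matchedValues(document) {
  return document.items
    .filter((item) => item.name === "hello" && item.value !== "0")
    .map((item) => ({ value: item.value }));
}

console.log(matchedValues(sampleDoc)); // [ { value: '50' } ]
```

Both conditions are tested on the same element, which is the same guarantee $elemMatch gives in the first stage.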

MongoDB Aggregation: How to check if an object containing multiple properties exists in an array

I have an array of objects and I want to check if there is an object that matches multiple properties. I have tried using $in and $and but it does not work the way I want it to.
Here is my current implementation.
I have an array like
"choices": [
{
"name": "choiceA",
"id": 0,
"l": "k"
},
{
"name": "choiceB",
"id": 1,
"l": "j"
},
{
"name": "choiceC",
"id": 2,
"l": "l"
}
]
I am trying to write aggregation code that can check if there is an object that contains both "id":2 and "l":"j" properties. My current implementation checks if there is an object containing the first property then checks if there is an object containing the second one.
How can I get my desired results?
My aggregation query is below.
db.poll.aggregate([
{
"$match": {
"_id": 100
}
},
{
$project: {
numberOfVotes: {
$and: [
{
$in: [
2,
"$choices.id"
]
},
{
$in: [
"j",
"$choices.l"
]
}
]
},
}
}
])
The above query returns true even though no object in the array has both properties id: 2 and l: "j". I understand that the code is doing what it was written to do. How can I get my desired result?

You want to use something like $elemMatch
db.collection.find({
choices: {
$elemMatch: {
id: 2,
l: "j"
}
}
})
EDIT
In an aggregation $project stage I would use $filter and check whether the filtered array is non-empty:
db.poll.aggregate([
{
"$match": {
"_id": 100
}
},
{
$project: {
numberOfVotes: {
$gt: [
{
$size: {
$filter: {
input: "$choices",
as: "choice",
cond: {
$and: [
{
$eq: [
"$$choice.id",
2
]
},
{
$eq: [
"$$choice.l",
"j"
]
}
]
}
}
}
},
0
]
}
}
}
])
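The difference between the two approaches is easy to see in a plain-JavaScript sketch that uses the choices array from the question. The function names are made up; the first mimics the two independent $in checks, the second the per-element test that $elemMatch and $filter perform.

```javascript
const pollChoices = [
  { name: "choiceA", id: 0, l: "k" },
  { name: "choiceB", id: 1, l: "j" },
  { name: "choiceC", id: 2, l: "l" },
];

// Like $and over two $in checks: each property may come from a different element.
function matchesSeparately(choices) {
  return choices.some((c) => c.id === 2) && choices.some((c) => c.l === "j");
}

// Like $elemMatch / $filter: both properties must be on the same element.
function matchesSameElement(choices) {
  return choices.some((c) => c.id === 2 && c.l === "j");
}

console.log(matchesSeparately(pollChoices));  // true
console.log(matchesSameElement(pollChoices)); // false
```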

Match in an array within the current document

In an aggregation pipeline operating on documents like
{
"availablePackages": [
{
"title": "Silver",
"code": "001",
},
{
"title": "Gold",
"code": "002",
},
{
"title": "Platinum",
"code": "003",
}
],
"selectedPackageCode": "002"
}
I need to replace everything in the above document with the title of the package whose code matches the selectedPackageCode. So I want the pipeline to end up with
{
"packageTitle": "Gold"
}
This is not a lookup, because it's in the current document. I thought I might be able to use $let to create a variable and then a $match to find the right array element, but I have not found a syntax that works.

You need $filter to match availablePackages against selectedPackageCode and $arrayElemAt to get the first matching element. To do it in one aggregation stage, you can use $let to define a temporary variable:
db.col.aggregate([
// ... other stages
{
$project: {
packageTitle: {
$let: {
vars: {
selectedPackage: {
$arrayElemAt: [
{ $filter: { input: "$availablePackages", cond: { $eq: [ "$$this.code", "$selectedPackageCode" ] } } }, 0
]
}
},
in: "$$selectedPackage.title"
}
}
}
}
])
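As a rough mental model, the $let, $filter and $arrayElemAt combination behaves like a find on the array. The sketch below is plain JavaScript on an in-memory copy of the question's document; packageDoc and packageTitle are illustrative names, and returning an empty object when nothing matches is an assumption of this sketch.

```javascript
const packageDoc = {
  availablePackages: [
    { title: "Silver", code: "001" },
    { title: "Gold", code: "002" },
    { title: "Platinum", code: "003" },
  ],
  selectedPackageCode: "002",
};

// Hypothetical helper: pick the first package whose code equals the selected code
// and return only its title.
function packageTitle(document) {
  const selected = document.availablePackages.find(
    (pkg) => pkg.code === document.selectedPackageCode
  );
  return selected ? { packageTitle: selected.title } : {};
}

console.log(packageTitle(packageDoc)); // { packageTitle: 'Gold' }
```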

MongoDB: Ordered matching of array entries

I have a document which looks like this:
"tokens":
[
{
"index": 1,
"word": "I",
"pos": "NNP",
},
{
"index": 2,
"word": "played",
"pos": "VBZ",
},
{
"index": 3,
"word": "football",
"pos": "IN",
}
]
And my query is:
db.test.find({
$and: [
{
'tokens.word': 'I'
},
{
tokens: {
$elemMatch: {
word: /f.*/,
pos: 'IN'
}
}
}
]
})
The query returns the document above, but I expected no match in this case, because I'm searching for
word: "I" followed by [word: /f.*/ and pos:'IN']
and that sequence does not occur in the tokens array of the document: the token I is followed by played and only then by football. The query, however, seems to ignore the order and matches
word: "I"
together with
f.*
[football in this case] wherever they appear in the array.

The $and operator is a purely logical one where the order of the filter conditions does not play any role.
So the following queries are absolutely equivalent from a result set point of view:
$and: [
{ "a": 1 },
{ "b": 2 }
]
$and: [
{ "b": 2 },
{ "a": 1 }
]
The documentation states:
$and performs a logical AND operation on an array of two or more
expressions (e.g. <expression1>, <expression2>, etc.) and selects the
documents that satisfy all the expressions in the array. The $and
operator uses short-circuit evaluation. If the first expression (e.g.
<expression1>) evaluates to false, MongoDB will not evaluate the
remaining expressions.
In your example, the element with index 1 matches the first filter "tokens.word": "I" and the element with index 3 matches the second $elemMatch filter, so the document is returned.
UPDATE:
Here is an idea - more of a starting point really - for you to get closer to what you want:
db.collection.aggregate({
$addFields: { // create a temporary field
"differenceBetweenMatchPositions": { // called "differenceBetweenMatchPositions"
$subtract: [ // which holds the difference between
{ $indexOfArray: [ "$tokens.word", "I" ] }, // the index of first element matching first condition and
{ $min: { // the lowest index of
$filter: { // a filtered array
input: { $range: [ 0, { $size: "$tokens" } ] }, // that holds the indices [ 0, 1, 2, ..., n ] where n is the number of items in the "tokens" array - 1
as: "this", // which we want to access using "$$this"
cond: {
$let: {
vars: { "elem": { $arrayElemAt: [ "$tokens", "$$this" ] } }, // get the n-th element in our "tokens" array
in: {
$and: [
{ $eq: [ "$$elem.pos", "IN" ] }, // "pos" field must be "IN"
{ $eq: [ { $substrBytes: [ "$$elem.word", 0, 1 ] }, "f" ] } // 1st character of the "word" must be "f"
]
}
}
}
}
}
}]
}
}
}, {
$match: {
"differenceBetweenMatchPositions": { $lt: 0 } // we only want documents where the first condition is matched by an item in our array before the second one gets matched, too.
}
})
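The idea behind the pipeline can be sketched in plain JavaScript on the tokens array from the question. The function names are invented. Unlike the pipeline, firstBeforeSecond also rejects arrays in which "I" is missing, and immediatelyFollowed is a stricter variant (direct adjacency) that the pipeline above does not implement.

```javascript
const sampleTokens = [
  { index: 1, word: "I", pos: "NNP" },
  { index: 2, word: "played", pos: "VBZ" },
  { index: 3, word: "football", pos: "IN" },
];

// True when `token` satisfies the second condition: pos "IN" and a word starting with "f".
function isSecondMatch(token) {
  return token.pos === "IN" && token.word.startsWith("f");
}

// Same comparison as the pipeline: the first "I" must sit at a lower position
// than the first element matching the second condition.
function firstBeforeSecond(tokens) {
  const first = tokens.findIndex((t) => t.word === "I");
  const second = tokens.findIndex(isSecondMatch);
  return first !== -1 && second !== -1 && first < second;
}

// Stricter variant: some "I" is directly followed by a matching element.
function immediatelyFollowed(tokens) {
  return tokens.some(
    (t, i) => t.word === "I" && i + 1 < tokens.length && isSecondMatch(tokens[i + 1])
  );
}

console.log(firstBeforeSecond(sampleTokens));   // true
console.log(immediatelyFollowed(sampleTokens)); // false
```

The second result is what the question expects for this document, since played sits between I and football.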

Select documents with sub-arrays that match some criteria

I have a collection with documents such as:
{
_id: "1234",
_class: "com.acme.classA",
a_collection: [
{
otherdata: 'somedata',
type: 'a'
},
{
otherdata: 'bar',
type: 'a'
},
{
otherdata: 'foo',
type: 'b'
}
],
lastChange: ISODate("2014-08-17T22:25:48.918Z")
}
I want to find documents by id and return only a subset of the sub-array. For example, I want to find the document with id "1234" and keep only the a_collection elements whose type is 'a', giving this result:
{
_id: "1234",
_class: "com.acme.classA",
a_collection: [
{
otherdata: 'somedata',
type: 'a'
},
{
otherdata: 'bar',
type: 'a'
}
],
lastChange: ISODate("2014-08-17T22:25:48.918Z")
}
I have tried this:
db.collection_name.aggregate({
$match: {
'a_collection.type': 'a'
}
},
{
$unwind: "$a_collection"
},
{
$match: {
"a_collection.type": 'a'
}
},
{
$group: {
_id: "$_id",
a_collection: {
$addToSet: "$a_collection"
},
}
}).pretty()
but this doesn't return the other properties (such as 'lastChange').
What is the correct way to do this?

Are you using PHP?
And is this the only way you can get the "text"?
Maybe you can rewrite it so that it is a JSON element, something like this:
{
"_id": "1234",
"_class": "com.acme.classA",
"a_collection": [
{
"otherdata": "somedata",
"type": "a"
},
{
"otherdata": "bar",
"type": "a"
},
{
"otherdata": "foo",
"type": "b"
}
]
}
Then you can use PHP's json_decode() function to turn it into an array, and then search it and return only the data you need.
Edit: I misread the question. Are you looking for a function like this?
db.inventory.find( {
$or: [ { _id: "1234" }, { 'a_collection.type': 'a' }]
} )
I found the code here: http://docs.mongodb.org/manual/tutorial/query-documents/

This is the correct query:
db.collection_name.aggregate({
$match: {
'a_collection.type': 'a'
}
},
{
$unwind: "$a_collection"
},
{
$match: {
"a_collection.type": 'a'
}
},
{
$group: {
_id: "$_id",
a_collection: {
$addToSet: "$a_collection"
},
lastChange : { $first : "$lastChange" }
}
}).pretty()

Something is very strange about your desired query (and your pipelines). First of all, _id is a reserved field with a unique index on it. The result of finding all documents with _id = "1234" can only be 0 or 1 documents. Second, to find documents with a_collection.type = "a" for some element of the array a_collection, you don't need the aggregation framework. You just need a find query:
> db.test.find({ "a_collection.type" : "a" })
So all the work here appears to be winnowing the subarray of one document down to just those elements with a_collection.type = "a". Why do you have these objects in the same document if most of what you do is split them up and eliminate some to find a result set? How common and how truly necessary is it to harvest just the array elements with a_collection.type = "a"? Perhaps you want to model your data differently so a query like
> db.test.find({ <some condition>, "a_collection.type" : "a" })
returns you the correct documents. I can't say how you can do it best with the given information, but I can say that your current approach strongly suggests revision is needed (and I'm happy to help with suggestions if you include further information or post a new question).

I would agree with the answer you submitted yourself, except that in MongoDB 2.6 and greater there is a better way to do this with $map and $setDifference, both of which were introduced in that version. Where available, this approach is much faster:
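db.collection.aggregate([
    { "$match": { "a_collection.type": "a" } },
    { "$project": {
        // keep the other fields the question asks for
        "_class": 1,
        "lastChange": 1,
        "a_collection": {
            "$setDifference": [
                // $map replaces every non-matching element with false ...
                { "$map": {
                    "input": "$a_collection",
                    "as": "el",
                    "in": {
                        "$cond": [
                            { "$eq": [ "$$el.type", "a" ] },
                            "$$el",
                            false
                        ]
                    }
                }},
                // ... and $setDifference then removes those false placeholders
                [false]
            ]
        }
    }}
])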
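This needs no $group or initial $unwind, both of which can be costly; only the $match stage remains, so on MongoDB 2.6 and later this is the better approach. Note that $setDifference treats the array as a set, so duplicate elements are collapsed and the element order is not guaranteed.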
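The goal of all the pipelines in this thread, keeping the document's other fields while winnowing the sub-array, can be sketched in plain JavaScript. This runs on an in-memory object, not on MongoDB; storedDoc and keepType are made-up names, and lastChange is written as a string here only to keep the sketch self-contained.

```javascript
const storedDoc = {
  _id: "1234",
  _class: "com.acme.classA",
  a_collection: [
    { otherdata: "somedata", type: "a" },
    { otherdata: "bar", type: "a" },
    { otherdata: "foo", type: "b" },
  ],
  lastChange: "2014-08-17T22:25:48.918Z",
};

// Hypothetical helper: copy every top-level field and replace a_collection
// with only the elements of the requested type.
function keepType(document, type) {
  return {
    ...document,
    a_collection: document.a_collection.filter((el) => el.type === type),
  };
}

const winnowed = keepType(storedDoc, "a");
console.log(winnowed.a_collection.length); // 2
console.log(winnowed.lastChange);          // 2014-08-17T22:25:48.918Z
```

On MongoDB 3.2 and later the $filter operator expresses the same per-element test directly, without the false placeholders.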
