MongoDB: Indexing a sub-document field that can be either text or an array

I have a collection of documents representing messages. Each message has multiple fields that change from message to message. They are stored in a "fields" array of sub-documents.
Each element in this array contains the label and value of a field.
Some fields may contain long lists of strings (IP addresses, URLs, etc.) - each string appears in a new line within that field. Lists can be thousands of lines long.
For that purpose, each element also stores a "type" - type 1 represents a standard text, while type 2 represents a list. When there's a type 2 field, the "value" in the sub-document is an array of the list.
It looks something like this:
"fields" : [
{
"type" : 1,
"label" : "Observed on",
"value" : "01/09/2016"
},
{
"type" : 1,
"label" : "Indicator of",
"value" : "Malware"
},
{
"type" : 2,
"label" : "Relevant IP addresses",
"value" : [
"10.0.0.0",
"190.15.55.21",
"11.132.33.55",
"109.0.15.3"
]
}
]
I want all field values to be searchable and indexed, whether they are a standard string or an array within "value".
Would setting up a standard index on "fields.value" index both type 1 and type 2 content, or do I need to set up two indexes?
Thanks in advance!

When you create an index, MongoDB automatically makes it a multikey index if the indexed field holds an array in any document.
This means that simply:
collection.createIndex( { "fields.value": 1 } )
should work just fine.
see: https://docs.mongodb.com/v3.2/core/index-multikey/
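As a rough illustration of why one index covers both cases, here is a plain-JavaScript model of which values an index on "fields.value" would hold for a single message. It is only a sketch of the idea, not MongoDB's implementation, and indexKeys is a hypothetical helper name.

```javascript
// Plain-JavaScript model of multikey index entries (NOT MongoDB internals).
// indexKeys is a hypothetical helper: it lists the values an index on
// "fields.value" would hold for one document.
function indexKeys(doc) {
  const keys = [];
  for (const field of doc.fields) {
    if (Array.isArray(field.value)) {
      // type 2: every array element becomes its own index entry
      keys.push(...field.value);
    } else {
      // type 1: the string itself is the index entry
      keys.push(field.value);
    }
  }
  return keys;
}

const message = {
  fields: [
    { type: 1, label: "Indicator of", value: "Malware" },
    { type: 2, label: "Relevant IP addresses", value: ["10.0.0.0", "190.15.55.21"] }
  ]
};

console.log(indexKeys(message));
// In the mongo shell, both of these equality queries can use the one index:
//   db.collection.find({ "fields.value": "Malware" })
//   db.collection.find({ "fields.value": "190.15.55.21" })
```

Note that such an index helps exact-value lookups; it does not provide substring or full-text search.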

Related

MongoDB match a subdocument inside array (not positional reference)

My MongoDB has a key-value pair structure, inside my document has a data field which is an array that contains many subdocuments of two fields: name and value.
How do I search for a subdocument, e.g. {"name":"position", "value":"manager"}, and also for multiple subdocuments (e.g. adding {"name":"age", "value" : {$gte: 30}})?
EDIT: I am not looking for a specific subdocument, as I mentioned in the title (not positional reference); rather, I want to retrieve the entire document, but it must match both subdocuments exactly.
Here are 2 queries to find the following record:
{
"_id" : ObjectId("sometobjectID"),
"data" : [
{
"name" : "position",
"value" : "manager"
}
]
}
// Both value and name (in the same array element):
db.demo.find({"data": {$elemMatch: {"value": "manager", "name":"position"}}})
// Both value and name (not necessarily in the same array element):
db.demo.find({"data.value": "manager", "data.name":"position"})
// Just value:
db.demo.find({"data.value": "manager"})
Note how dot notation (.) is used; it works for all subdocuments, even if they are in an array.
You can use any operator you like here, including $gte.
Edit: $elemMatch was added to the answer because of @Veeram's response.
This answer explains the difference between $elemMatch and dot notation.
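The difference between the two matching rules can be sketched in plain JavaScript. This is a model of the semantics only, not MongoDB code; matchesElemMatch and matchesDotNotation are hypothetical helper names, and the sample document is made up for illustration.

```javascript
// Plain-JavaScript model of the two matching rules (not MongoDB code).

// Models { data: { $elemMatch: { name: n, value: v } } }:
// a single array element must satisfy both conditions.
function matchesElemMatch(doc, name, value) {
  return doc.data.some((el) => el.name === name && el.value === value);
}

// Models { "data.name": n, "data.value": v }:
// each condition may be satisfied by a different element.
function matchesDotNotation(doc, name, value) {
  return doc.data.some((el) => el.name === name) &&
         doc.data.some((el) => el.value === value);
}

// Hypothetical document: "position" and "manager" live in different elements.
const doc = {
  data: [
    { name: "position", value: "clerk" },
    { name: "nickname", value: "manager" }
  ]
};

console.log(matchesElemMatch(doc, "position", "manager"));   // false
console.log(matchesDotNotation(doc, "position", "manager")); // true
```

So when both conditions must hold for the same name/value pair, the $elemMatch form is the one to use.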

How to filter a Meteor Collection by elements in an array data field?

I'm currently designing a system similar to Gmail's labeling system. In my "Messages" Collection, I have a field that holds an array containing the IDs of the labels associated with the current Message, which are held in a different Collection. The JSON data for some Message looks like this:
{
"_id" : "W9uCWJCqx8ozsbX6t",
"name" : "Issue",
// ... some more data fields ...
"labels" : [ "R2syna2dnRdf4TDfC", "FHrjNbAT7Da2dRR5F" ] // IDs of labels in an array
}
How would I use something along the lines of the .find() method to search for all Messages that contain a certain label ID in their labels field?
You can use MongoDB's $elemMatch operator.
Example query:
Messages.find({labels : {$elemMatch : {$eq: id}}});
More use cases can be found in the docs
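For an array of plain strings such as labels, an equality condition on the array field matches when any element equals the value, so Messages.find({labels: id}) should behave the same as the $elemMatch form. A plain-JavaScript model of that rule follows; it is not Meteor code, and hasLabel and the second sample message are hypothetical.

```javascript
// Plain-JavaScript model (not Meteor or MongoDB code).
// hasLabel is a hypothetical helper mirroring the query { labels: id }.
function hasLabel(message, id) {
  return message.labels.includes(id);
}

const messages = [
  { _id: "W9uCWJCqx8ozsbX6t", name: "Issue", labels: ["R2syna2dnRdf4TDfC", "FHrjNbAT7Da2dRR5F"] },
  { _id: "hypotheticalId0001", name: "Other", labels: [] } // made-up second message
];

const matched = messages.filter((m) => hasLabel(m, "R2syna2dnRdf4TDfC"));
console.log(matched.map((m) => m._id));
```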

Search full document in mongodb for a match

Is there a way to match a value against every array and sub-document inside a document in a MongoDB collection, and return the document?
{
"_id" : "2000001956",
"trimline1" : "abc",
"trimline2" : "xyz",
"subtitle" : "www",
"image" : {
"large" : 0,
"small" : 0,
"tiled" : 0,
"cropped" : false
},
"Kytrr" : {
"count" : 0,
"assigned" : 0
}
}
For example, if I search the above document for "xyz", "ab", "xy", "z" or "0", this document should be returned.
I actually have to achieve this at the back end using the C# driver, but a mongo query would also help greatly.
Please advise.
Thanks
You could probably do this using $where:
db.mycollection.find({$where: "JSON.stringify(this).indexOf('xyz') != -1"})
This converts the whole record to one big string and then checks whether your search term appears in it. It will produce false matches if xyz also appears in the field names!
Alternatively, you can iterate through the fields, build a big string from the values only, and search that.
This isn't the most elegant way and will involve a full collection scan. It will be faster if you look through the individual fields!
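A minimal sketch of the "iterate through the fields" idea, written as plain JavaScript so it can be tried outside the database. It assumes JSON-like documents (strings, numbers, booleans, arrays, sub-documents); collectValues and containsValue are hypothetical helper names, and the same logic would still need to be adapted into a $where function to run server-side.

```javascript
// collectValues is a hypothetical helper: it gathers every leaf value
// (not the field names) of a JSON-like document as strings.
function collectValues(node, out = []) {
  if (node !== null && typeof node === "object") {
    // Object.values works for both arrays and sub-documents
    for (const child of Object.values(node)) collectValues(child, out);
  } else {
    out.push(String(node));
  }
  return out;
}

// True if any leaf value contains the search term as a substring.
function containsValue(doc, needle) {
  return collectValues(doc).some((v) => v.includes(needle));
}

const doc = {
  _id: "2000001956",
  trimline1: "abc",
  trimline2: "xyz",
  image: { large: 0, cropped: false }
};

console.log(containsValue(doc, "xy"));       // true: substring of "xyz"
console.log(containsValue(doc, "trimline")); // false: appears only in field names
// The stringify approach also matches field names:
console.log(JSON.stringify(doc).indexOf("trimline") !== -1); // true
```

This avoids the field-name false matches, but it is still a per-document scan with no index support.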
While Malcolm's answer above would work, you'll see it fall over pretty quickly once your collection gets large or you have high traffic, for two reasons. First, evaluating JavaScript for every document is expensive; second, this will always be a full collection scan because $where can't use an index.
Text indexes are enabled by default as of MongoDB 2.6 (the feature was in beta in 2.4). With one, you can build a full-text index over all the string fields in the document. Bear in mind that $text matches whole (stemmed) words, so a partial term such as "ab" will not match "abc". The documentation gives the following example, which creates a text index on every field and names the index "TextIndex".
db.collection.ensureIndex(
{ "$**": "text" },
{ name: "TextIndex" }
)

mongoDB text index on subdocuments

I have a collection that looks something like this
{ "text1" : "text",
"url" : "http:....",
"title" : "the title",
......,
"search_metadata" : { "tags" : [ "tag1", "tag2", "tag3" ],
"title" : "the title",
"topcis": [ "topic1", "topic2"]
}
}
I want to be able to add a text index to search_metadata and all its subdocuments.
ensureIndex({search_metadata:"text"}) gives me no results,
and:
ensureIndex({"$**":"text"}) gives me irrelevant data.
How can I make it happen?
From the text indexes page:
"text indexes can include any field whose value is a string or an array of string elements. To perform queries that access the text index, use the $text query operator"
Your search_metadata field is a sub-document, not a string or an array of strings, so as currently structured it is not in the right format to make use of a text index in MongoDB.
Now, embedded in search_metadata you have both strings and arrays of strings, and those can carry a text index; an index on {"search_metadata.tags": "text"}, for example, fits the criteria and should work just fine.
Hence, it's a choice between restructuring the field to meet the text index criteria and indexing the relevant sub-fields. If you take the latter approach you may find that you don't need text indexes on each of the fields, and that a simpler (and far smaller) index serves you just as well (using a normal index on tags and then $elemMatch, for example).

How do I remove an element in an array based on content?

I am working with MongoDB and Perl. Here is my data structure:
{
"_id" : ObjectId("501976f8005c8b541d000000"),
"err_id" : "err",
"solution" : [
{
"attachment" : "attach",
"macr" : "macrs",
"yammer" : "yam",
"resolution" : "l",
"salesforce" : "salesforce",
"username" : "bob"
},
{
"attachment" : "attach",
"macr" : "macrs",
"yammer" : "yam",
"resolution" : "losssss",
"salesforce" : "salesforce",
"username" : "bob"
}
]
}
As you can see, I have an array with objects inside. I have created this using the Perl MongoDB library.
I am familiar with some syntax for manipulating arrays in the Perl MongoDB lib. For example, I use this to find entries with a username the same as $username.
$users->find({"solution.username" => $username});
I thought removing an element would be as simple:
$users->remove({"solution.username" => $username});
But alas, it is not so. I have tried this and using pull, but to no avail! I've had a hard time finding this. Does anybody know the syntax to remove an array element based on the contents of one of its fields?
The MongoDB::Collection remove() method removes the documents matched by your query, so it is definitely not what you are looking for.
To delete specific fields you should use $unset.
Your solution.username values are actually inside an array, so you would have to include an array index for the fields to delete, e.g.:
$users->update({"_id" => '123'}, {
'$unset' => {
'solution.0.username' => 1,
'solution.1.username' => 1
}
});
I'm not aware of a shorter syntax to unset all fields matching username within the solution array, but you can add multiple solution.#.username fields to the $unset command.
My example above deletes the first two username entries from the array. If the matching document(s) had more than two username entries, each time you ran this update you would delete up to two more entries (if they exist).
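Note that $unset only removes a field and leaves the array element itself in place. If the goal is to drop whole elements based on their content, as the question asks, MongoDB's $pull update operator with a condition document is the usual tool. The sketch below models its effect in plain JavaScript; pullWhere is a hypothetical helper, the sample document is trimmed for brevity, and the shell command in the comment is an assumed equivalent that has not been run against the asker's data.

```javascript
// Plain-JavaScript model of $pull with a condition document (not MongoDB code).
// Assumed mongo shell form of the same update:
//   db.users.update({ err_id: "err" }, { $pull: { solution: { resolution: "l" } } })
// pullWhere is a hypothetical helper: it drops every element of
// doc[arrayField] whose fields equal all key/value pairs in condition.
function pullWhere(doc, arrayField, condition) {
  const matches = (el) =>
    Object.entries(condition).every(([key, val]) => el[key] === val);
  return { ...doc, [arrayField]: doc[arrayField].filter((el) => !matches(el)) };
}

const doc = {
  err_id: "err",
  solution: [
    { resolution: "l", username: "bob" },
    { resolution: "losssss", username: "bob" }
  ]
};

const updated = pullWhere(doc, "solution", { resolution: "l" });
console.log(updated.solution.length); // 1
```

Pulling on { username: "bob" } instead would remove both elements, since each of them matches.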