mongoDB text index on subdocuments - mongodb

I have a collection that looks something like this
{ "text1" : "text",
"url" : "http:....",
"title" : "the title",
......,
"search_metadata" : { "tags" : [ "tag1", "tag2", "tag3" ],
"title" : "the title",
"topcis": [ "topic1", "topic2"]
}
}
I want to be able to add a text index to search_metadata and all it's subdocuments.
ensureIndex({search_metadata:"text"}) Gives me no results
and:
ensureIndex({"$**":"text"}) will give me irrelevant data
How can I make it happen?

From the text indexes page:
text indexes can include any field whose value is a string or an array
of string elements. To perform queries that access the text index, use
the $text query operator
Your search_metadata field is a series of sub-documents, not a string or an array of strings, so it basically is not in the right format to make use of a text index in MongoDB as it is currently structured.
Now, embedded in search_metadata you have both strings and arrays of strings, so you could use a text index on those, so an index on {search_metadata.tags : "text"} for example fits the criteria and should work just fine.
Hence, it's a choice between restructuring the field to meet the text index criteria, or a matter of indexing the relevant sub-fields. If you take the latter approach you may find that you don't need text indexes on each of the fields and a simpler (and far smaller) index may serve you just as well (using a normal index on tags and then $elemMatch for example).

Related

MongoDb Indexing full text search

I have a document named "posts", it is like this :
{
"_id" : ObjectId("5afc22290c06a67f081fa463"),
"title" : "Cool",
"description" : "this is amazing"
}
And I have putted index on title and description :
db.posts.createIndex( { title: "text", description: "text" } )
The problem is when I search and type for example "amaz" it return the data with "this is amazing" above, while it should return data only when I type "amazing"
db.posts.find({ $text: { $search: 'amaz' } }, (err, results) => {
return res.json(results);
});
Credit to #amenadiel for the original data here:
https://stackoverflow.com/a/24316510/7948962
From the MongoDB docs:
https://docs.mongodb.com/manual/core/index-text/
Index Entries
text index tokenizes and stems the terms in the indexed fields for the index entries. text index stores one index entry for each unique stemmed term in each indexed field for each document in the collection. The index uses simple language-specific suffix stemming.
This is to allow you to search partial "stem" terms in the index, and have the database return all related results. In your specific scenario, amaz is a bit of an odd token as it is a bit irregular compared to other words such as talking, which is tokenized to the word talk, or talked to talk. Similarly walking and walked to walk.
In your case, the word amazing in your text will be tokenized as amaz. If your column contained data such as amazed, it would receive the same amaz token as well. And those results would also be returned from a search of amaz.

How to filter a Meteor Collection by elements in an array data field?

I'm currently designing a system similar to Gmail's labeling system. In my "Messages" Collection, I have a field that holds an array containing the IDs of the labels associated with the current Message, which are held in a different Collection. The JSON data for some Message looks like this:
{
"_id" : "W9uCWJCqx8ozsbX6t",
"name" : "Issue",
// ... some more data fields ...
"labels" : [ "R2syna2dnRdf4TDfC", "FHrjNbAT7Da2dRR5F" ] // IDs of labels in an array
}
How would I use something along the lines of the .find() method to search for all Messages that contain a certain label ID in its labels field?
You can use $elemMatch operator of MongoDB.
Example query :
Messages.find({labels : {$elemMatch : {$eq: id}}});
More use cases can be found in the docs

Get text words from query

I've read the MongoDB documentation on getting the indexes within a collection, and have also searched SO and Google for my question. I want to get the actual indexed values.
Or maybe my understanding of how MongoDB indexes is incorrect. If I've been indexing a field called text that contains paragraphs, am I right in thinking that what gets indexed is each word in the paragraph?
Either case I want to retrieve the values that were indexed, which db.collection.getIndexes() doesn't seem to be returning.
Well yes and no, in summary.
Indexes work on the "values" of the fields they are supplied to index, and are much like a "card index" in that there is a point of reference to look at to find the location of something that matches that term.
What "you" seem to be asking about here is "text indexes". This is a special index format in MongoDB and other databases as well that looks at the "text" content of a field and breaks down every "word" in that content into a value in that "index".
Typically we do:
db.collection.createIndex({ "text": "text" })
Where the "field name" here is "text" as you asked, but more importantly the type of index here is "text".
This allows you to then insert data like this:
db.collection.insert({ "text": "The quick brown fox jumped over the lazy dog" })
And then search like this, using the $text operator:
db.collection.find({ "$text": { "$search": "brown fox" } })
Which will return and "rank" in order the terms you gave in your query depending how they matched the given "text" of your field in the index on your collection.
Note that a "text" index and it's query does not interact on a specific field. But the index itself can be made over multiple fields. The query and the constraints on the "index" itself are that there can "only be one" text index present on any given collection otherwise errors will occur.
As per mongodb's docs:
"db.collection.getIndexes() returns an array of documents that hold index information for the collection. Index information includes the keys and options used to create the index. For information on the keys and index options, see db.collection.createIndex()."
You first have to create the index on the collection, using the createIndex() method:
db.records.createIndex( { userid: 1 } )
Queries on the userid field are supported by the index:
Example:
db.records.find( { userid: 2 } )
db.records.find( { userid: { $gt: 10 } } )
Indexes help you avoid scanning the whole document. They basically are references or pointers to specific parts of your collection.
The docs explain it better:
http://docs.mongodb.org/manual/tutorial/create-an-index/

Search full document in mongodb for a match

Is there a way to match a value with every array and sub document inside the document in mongodb collection and return the document
{
"_id" : "2000001956",
"trimline1" : "abc",
"trimline2" : "xyz",
"subtitle" : "www",
"image" : {
"large" : 0,
"small" : 0,
"tiled" : 0,
"cropped" : false
},
"Kytrr" : {
"count" : 0,
"assigned" : 0
}
}
for eg if in the above document I am searching for xyz or "ab" or "xy" or "z" or "0" this document should be returned.
I actually have to achieve this at the back end using C# driver but a mongo query would also help greatly.
Please advice.
Thanks
You could probably do this using '$where'
db.mycollection({$where:"JSON.stringify(this).indexOf('xyz')!=-1"})
I'm converting the whole record to a big string and then searching to see if your element is in the resulting string. Probably won't work if your xyz is in the fieldnames!
You can make it iterate through the fields to make a big string and then search it though.
This isn't the most elegant way and will involve a full tablescan. It will be faster if you look through the individual fields!
While Malcolm's answer above would work, when your collection gets large or you have high traffic, you'll see this fall over pretty quickly. This is because of 2 things. First, dropping down to javascript is a big deal and second, this will always be a full table scan because $where can't use an index.
MongoDB 2.6 introduced text indexing which is on by default (it was in beta in 2.4). With it, you can have a full text index on all the fields in the document. The documentation gives the following example where a text index is created for every field and names the index "TextIndex".
db.collection.ensureIndex(
{ "$**": "text" },
{ name: "TextIndex" }
)

How to Retrieve any element value from mongoDB?

Suppose I have following collection :
{ _id" : ObjectId("4f1d8132595bb0e4830d15cc"),
"Data" : "[
{ "id1": "100002997235643", "from": {"name": "Joannah" ,"id": "100002997235643"} , "label" : "test" } ,
{ "id1": "100002997235644", "from": {"name": "Jon" ,"id": "100002997235644"} , "label" : "test1" }
]" ,
"stat" : "true"
}
How can I retrieve id1 , name , id ,label or any other element?
I am able to get _id field , DATA (complete array) but not the inner elements in DATA.
You cannot query for embedded structures. You always query for top level documents. If you want to query for individual elements from your array you will have to make those element top level documents (so, put them in their own collection) and maintain an array of _ids in this document.
That said, unless the array becomes very large it's almost always more efficient to simply grab your entire document and find the appropriate element in your app.
I don't think you can do that. It is explained here.
If you want to access specific fields, then following MongoDB Documentation,
you could add a flag parameter to your query, but you should redesign your documents for this to be useful:
Field Selection
In addition to the query expression, MongoDB queries can take some additional arguments. For example, it's possible to request only certain fields be returned. If we just wanted the social security numbers of users with the last name of 'Smith,' then from the shell we could issue this query:
// retrieve ssn field for documents where last_name == 'Smith':
db.users.find({last_name: 'Smith'}, {'ssn': 1});
// retrieve all fields *except* the thumbnail field, for all documents:
db.users.find({}, {thumbnail:0});