Exclude some text fields in MongoDB text index - mongodb

I have a collection that has a text index like this:
db.mycollection.createIndex({ "$**" : "text" }, { "name" : "AllTextIndex" })
The collection's documents have many blocks of text data. I want to exclude some of them so that I don't get results that match text from the excluded blocks.
However, I don't want to list every field in the text index, as below, just to exclude one block (NotTextSearchableBlock, for example):
db.application.ensureIndex({
"SearchableTextBlock1": "text",
"SearchableTextBlock2": "text",
"SearchableTextBlock3": "text"
})
Here is a document example:
{
"_id": "e0832e2d-6fb3-47d8-af79-08628f1f0d84",
"_class": "com.important.enterprise.acme",
"fields": {
"Organization": "testing"
},
"NotTextSearchableBlock": {
"something": {
"SomethingInside": {
"Text":"no matcheable text"
}
}
},
"SearchableTextBlock1": {
"someKey": "someTextCool"
},
"SearchableTextBlock2": {
"_id": null,
"fields": {
"Status": "SomeText"
}
},
"SearchableTextBlock3": {
"SomeSubBlockArray": [
{
"someText": "Pepe"
}
]
}
}

To answer your question: there is no documented way to exclude certain fields from a text index. (See Text Indexes in MongoDB's documentation.)
As you already know:
When establishing a text index, you have to identify the specific text field(s) to index. The field must be a string or an array of string elements.
db.mycollection.createIndex({
mystringfield : "text",
mystringarray : "text"
})
You may also index all text found within a collection's documents by using the wildcard specifier $**.
db.mycollection.createIndex(
{ "$**" : "text" },
{ name : "mytextindex" }
)
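If typing every field by hand is the main objection, the explicit index specification could be generated from a sample document instead. The sketch below is plain JavaScript, not a MongoDB feature: buildTextIndexSpec is a hypothetical helper, and it assumes that one sample document is representative of the collection's shape and that the blocks to skip are top-level keys.

```javascript
// Hypothetical helper: collect the dotted path of every string leaf in a
// sample document, skipping the top-level keys listed in `excluded`.
function buildTextIndexSpec(sampleDoc, excluded) {
  const spec = {};
  const skip = new Set(excluded);

  function walk(value, path) {
    if (typeof value === "string") {
      spec[path] = "text";
    } else if (Array.isArray(value)) {
      // Array positions are not part of an index key path.
      value.forEach((item) => walk(item, path));
    } else if (value !== null && typeof value === "object") {
      for (const [key, child] of Object.entries(value)) {
        walk(child, path + "." + key);
      }
    }
    // Numbers, booleans and nulls are ignored.
  }

  for (const [key, child] of Object.entries(sampleDoc)) {
    if (!skip.has(key)) walk(child, key);
  }
  return spec;
}

// Trimmed copy of the document from the question.
const sample = {
  _id: "e0832e2d-6fb3-47d8-af79-08628f1f0d84",
  fields: { Organization: "testing" },
  NotTextSearchableBlock: { something: { SomethingInside: { Text: "no matcheable text" } } },
  SearchableTextBlock1: { someKey: "someTextCool" },
  SearchableTextBlock2: { _id: null, fields: { Status: "SomeText" } },
  SearchableTextBlock3: { SomeSubBlockArray: [{ someText: "Pepe" }] },
};

const spec = buildTextIndexSpec(sample, ["_id", "NotTextSearchableBlock"]);
console.log(spec);
```

The resulting object could then be passed to createIndex in place of the wildcard specification. A collection can have only one text index, so the existing wildcard index would have to be dropped first, and fields that occur only in other documents would be missed by this approach.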

Related

how to update partial document of an array

I have a person document that has a list of pets:
{
"personId": "kjadfh97r0",
"pets": [
{
"petId": "dfjkh32476",
"name": "kitty",
"kind": "cat"
},
{
"petId": "askdjfh2794857",
"name": "rexy",
"kind": "dog"
}
]
}
I want to find a certain pet inside a certain person and update just some of its fields, so I did something like:
db.people.findAndModify({
query: { "personId": "kjadfh97r0", "pets.petId": "dfjkh32476" },
update: {"$set":{"pets.$":{"kind":"tiger"}}}
})
but what happens is that the whole pet subdocument is replaced with {"kind":"tiger"}, and I just wanted to update the "kind" field and keep the rest.
You should specify the entire path in $set when you update a nested document using the positional operator; otherwise the matched array element is replaced:
db.people.findAndModify({
query: { "personId": "kjadfh97r0", "pets.petId": "dfjkh32476" },
update: { $set: {"pets.$.kind": "tiger"} }
})
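To make the difference between the two update documents concrete, here is a small in-memory imitation in plain JavaScript. It is only an illustration, not MongoDB's update engine: applyPositionalSet is a hypothetical helper that substitutes the index of the first matching array element for "$" and then assigns at the resulting path.

```javascript
// In-memory illustration only; this is not MongoDB's update engine.
// Applies a $set-style spec to the first element of `arrayField` that
// satisfies `predicate`, substituting that element's index for "$".
function applyPositionalSet(doc, arrayField, predicate, setSpec) {
  const copy = JSON.parse(JSON.stringify(doc)); // leave the input untouched
  const index = copy[arrayField].findIndex(predicate);
  if (index === -1) return copy; // no match, nothing to update
  for (const [path, value] of Object.entries(setSpec)) {
    const parts = path.replace("$", String(index)).split(".");
    let node = copy;
    for (const part of parts.slice(0, -1)) node = node[part];
    node[parts[parts.length - 1]] = value;
  }
  return copy;
}

const person = {
  personId: "kjadfh97r0",
  pets: [
    { petId: "dfjkh32476", name: "kitty", kind: "cat" },
    { petId: "askdjfh2794857", name: "rexy", kind: "dog" },
  ],
};
const isKitty = (pet) => pet.petId === "dfjkh32476";

// "pets.$" addresses the whole element, so petId and name are lost.
const replaced = applyPositionalSet(person, "pets", isKitty, { "pets.$": { kind: "tiger" } });
// "pets.$.kind" addresses one field, so the rest of the element survives.
const patched = applyPositionalSet(person, "pets", isKitty, { "pets.$.kind": "tiger" });

console.log(replaced.pets[0]);
console.log(patched.pets[0]);
```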

Index on nested document in MongoDB

I have a nested JSON document like:
{
"docId": 1901603742,
"sl": [ {"slid","val"}],
"accounts": {
"123": {
"smartAccountId": "123",
"smartAccountName": "Dummy name",
"101": {
"virtualAccountId": "101",
"virtualAccountName": "DEFAULT"
},
"102": {
"virtualAccountId": "102",
"virtualAccountName": "DEFAULT"
}
},
"234": {
"smartAccountId": "234",
"smartAccountName": "Dummy name",
"201": {
"virtualAccountId": "201",
"virtualAccountName": "DEFAULT"
}
}
}
}
Here I need to put an index on "smartAccountId" and "virtualAccountId". The problem is that the key for the nested document is not fixed; it's the "smartAccountId" or "virtualAccountId" value that we use as the key (123 in the example). How can we get such a document indexed in MongoDB?
Thanks
PS: I already have an array in the original document, so I can't introduce one more array, as we won't be able to index more than one array in a given document.

mongodb fast tags query

I have a very large collection (more than 800k) and I need to implement an auto-complete query (based on word beginnings only) on tags. My documents look like this:
{
"_id": "theid",
"somefield": "some value",
"tags": [
{
"name": "abc tag1",
"vote": 5
},
{
"name": "hij tag2",
"vote": 22
},
{
"name": "abc tag3",
"vote": 5
},
{
"name": "hij tag4",
"vote": 77
}
]
}
If, for example, my query is for all tags that start with "ab" on documents whose "somefield" is "some value", the result would be "abc tag1", "abc tag3" (names only).
I care about the speed of the queries much more than the speed of the inserts and updates.
I assume that the aggregation framework is the right way to go here, but what would be the best pipeline and indexes for very fast querying?
The documents are not 'tag' documents; they represent a client object and contain many more data fields that I left out for simplicity. Each client has several tags and another field (I changed its name so it won't be confused with the tags array). I need to get a set, without duplicates, of all tags that a group of clients have.
Your document structure doesn't make sense - I'm assuming tags is an array and not an object. Try queries like this
db.tags.find({ "somefield" : "some value", "tags.name" : /^abc/ })
with an index on { "somefield" : 1, "tags.name" : 1 }. MongoDB optimizes case-sensitive, left-anchored regex queries into range queries, which can be fulfilled efficiently using an index (see the $regex docs).
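As a rough picture of why a left-anchored prefix can become a range scan, a prefix match is equivalent to a half-open interval on the sorted strings. The plain JavaScript sketch below uses two hypothetical helpers, prefixRange and startsWithViaRange, and assumes a non-empty prefix and simple code-unit ordering; it does not reproduce the query planner.

```javascript
// Hypothetical helper: the half-open range [lower, upper) that contains
// exactly the strings starting with `prefix`, under code-unit ordering.
// Assumes a non-empty prefix whose last character is not U+FFFF.
function prefixRange(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper };
}

// Prefix test expressed purely as two comparisons, as an index scan would.
function startsWithViaRange(value, prefix) {
  const { lower, upper } = prefixRange(prefix);
  return value >= lower && value < upper;
}

const names = ["abc tag1", "hij tag2", "abc tag3", "hij tag4"];
console.log(prefixRange("ab"));
console.log(names.filter((name) => startsWithViaRange(name, "ab")));
```

A sorted index can jump straight to the lower bound and stop at the upper bound, which is why the anchored form is so much cheaper than an unanchored regex that has to examine every key.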
You can get just the tags from this document structure using an aggregation pipeline:
db.tags.aggregate([
{ "$match" : { "somefield" : "some value", "tags.name" : /^abc/ } },
{ "$unwind" : "$tags" },
{ "$match" : { "tags.name" : /^abc/ } },
{ "$project" : { "_id" : 0, "tag_name" : "$tags.name" } }
])
The index only helps the first $match, so use the same index for the pipeline as for the query.
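For reference, here is what the pipeline above computes, written as plain JavaScript over an in-memory array, with one extra de-duplication step because the question asks for a set without duplicates. distinctTagNames is a hypothetical name and the sample clients are made up; inside the pipeline itself, a { "$group" : { "_id" : "$tags.name" } } stage would be one way to get the same de-duplication.

```javascript
// In-memory equivalent of the pipeline stages, plus de-duplication.
function distinctTagNames(clients, somefield, prefix) {
  const names = clients
    // first $match: right somefield and at least one tag with the prefix
    .filter((c) => c.somefield === somefield &&
                   c.tags.some((t) => t.name.startsWith(prefix)))
    // $unwind: one entry per tag
    .flatMap((c) => c.tags)
    // second $match: drop the tags that do not start with the prefix
    .filter((t) => t.name.startsWith(prefix))
    // $project: keep only the name
    .map((t) => t.name);
  return [...new Set(names)]; // keeps first-seen order, drops repeats
}

// Made-up sample data in the shape shown in the question.
const clients = [
  { somefield: "some value", tags: [
    { name: "abc tag1", vote: 5 }, { name: "hij tag2", vote: 22 },
    { name: "abc tag3", vote: 5 } ] },
  { somefield: "some value", tags: [{ name: "abc tag1", vote: 1 }] },
  { somefield: "other value", tags: [{ name: "abz tag9", vote: 3 }] },
];

console.log(distinctTagNames(clients, "some value", "ab"));
```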

How to use full text search for unknown number of children of a field in Mongodb?

I have a document with one field description like this:
{
"_id": "item0",
"description": {
"parlist": [
{
"listitem": {
"text": {
"child": "page rous lady",
"keyword": "officer e"
}
}
},
{
"listitem": {
"text": "shepherd noble "
}
}
]
}
}
How do I create a text index on description and search for a specific word? I don't know how deep description can go or how many children it will have. I tried creating an index like this:
db.collection.ensureIndex({description:"text"})
and then querying like this:
db.collection.runCommand("text",{$search:"shepherd"})
But it doesn't work.
You can simply build a text index over all text fields:
db.collection.createIndex({ "$**": "text" })
Then search with the $text operator:
db.collection.find( { $text: { $search: "shepherd" } } )
Text indexes do not work on whole subdocuments, but you can create a text index that includes more than one field:
db.collection.createIndex(
{
"description.parlist.listitem.text": "text",
"description.parlist.listitem.text.child": "text",
"description.parlist.listitem.text.keyword": "text"
}
)
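To see which strings in a document of unknown depth a search could reach, a recursive walk is enough. The sketch below is plain JavaScript with two hypothetical helpers, collectStrings and containsWord; it matches whole lowercase words only and does not model the stemming or stop-word handling that MongoDB's text search applies.

```javascript
// Gather every string value in a document, however deeply it is nested.
function collectStrings(value, out = []) {
  if (typeof value === "string") {
    out.push(value);
  } else if (value !== null && typeof value === "object") {
    // Object.values works for both plain objects and arrays.
    for (const child of Object.values(value)) collectStrings(child, out);
  }
  return out;
}

// Naive whole-word lookup over the collected strings (no stemming).
function containsWord(doc, word) {
  const target = word.toLowerCase();
  return collectStrings(doc).some((text) =>
    text.toLowerCase().split(/\s+/).includes(target));
}

const item = {
  _id: "item0",
  description: { parlist: [
    { listitem: { text: { child: "page rous lady", keyword: "officer e" } } },
    { listitem: { text: "shepherd noble " } },
  ] },
};

console.log(collectStrings(item));
console.log(containsWord(item, "shepherd"));
```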

Nested documents and _id indexes in mongodb

I have a collection with nested documents in it. Each document also has an _id field.
Here's an example of a documents structure
{
"_id": ObjectId("top_level_doc"),
"title": "Cadernos",
"parent": "4fd55bbc5d1709793b000008",
"criterias": {
"0": {
"_id": ObjectId("a_nested_doc"),
"value": "caderno",
"operator": "contains",
"field": "design0"
}
}
}
I want to be able to find the nested document just by searching for its _id.
With this query
{
"criterias._id" : ObjectId("a_nested_doc")
}
It returns the parent document (I just want the nested one).
Ideally I would do this
{
"_id" : ObjectId("a_nested_doc")
}
And it would return the document with that id (whether it's nested or not).
Ps. I edited the "_id" values for the sake of simplicity just for this example.
You may have to live with selecting on criterias._id (unless you write a wrapper around the query), but you can return just the nested part by retrieving a subset of the fields.
http://www.mongodb.org/display/DOCS/Retrieving+a+Subset+of+Fields
// The simplest case converted to your use case (a dotted key must be quoted)
db.collection.find( { "criterias._id" : ObjectId("a_nested_doc") }, { criterias : 1 } );
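If a wrapper is acceptable, the client-side half of it could be a recursive lookup over the returned document. The sketch below is plain JavaScript; findById is a hypothetical helper, and ids are compared as plain strings for simplicity, whereas real ObjectId values would need the driver's own equality check.

```javascript
// Return the first (sub)document whose _id equals `id`, or undefined.
// Works whether the match is the top-level document or nested inside it.
function findById(value, id) {
  if (value === null || typeof value !== "object") return undefined;
  if (!Array.isArray(value) && value._id === id) return value;
  for (const child of Object.values(value)) {
    const hit = findById(child, id);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

// Same shape as the document in the question, with string ids.
const parent = {
  _id: "top_level_doc",
  title: "Cadernos",
  criterias: {
    "0": { _id: "a_nested_doc", value: "caderno", operator: "contains", field: "design0" },
  },
};

console.log(findById(parent, "a_nested_doc"));
console.log(findById(parent, "top_level_doc") === parent);
```

The server still returns the parent (or its projected criterias field); the helper only picks the nested part out of that result.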