Suppose I have the following sample documents:
[
{ name: 'hello world', description: { key: 'something' } },
{ name: 'user', description: { key: 'hello world' } },
]
with index
db.fulltext.createIndex({ name: 'text', '$**': 'text' }, { weights: { name: 10, '$**': 5 } })
I am finding documents with the query
db.fulltext.find({ $text: { $search: 'hello world' } }, { score: { $meta: 'textScore' } })
But it gives me a score of 15.0 for both documents. Is it impossible to add a weight to the wildcard specifier? Why does the second document get the multiplied score from the name key?
The wildcard specifier "$**" includes every string field in the document in the text index. In the scenario above, name is a string field that was given a weight of 10, while all string fields (including name, because the wildcard covers it) were given a weight of 5. So one weight overrides the other.
When the text search runs, all string fields are weighted equally. The score is therefore the same for both documents, because no field has any relative significance over the other indexed fields (that is, because the wildcard was used when creating the index).
The $text operator assigns a score to each document that contains the
search term in the indexed fields. The score represents the relevance
of a document to a given text search query.
When different fields need different weights, you have to name the fields explicitly when creating the index. In other words, you should not give a weight to one string field and also include a wildcard weight for all string fields, because one weight will override the other.
If you change the index as shown below, you can see the difference.
Create Index:-
db.fulltext.createIndex({ name: 'text', 'description.key' : 'text' }, { weights: { name: 10, 'description.key' : 5 } })
Search:-
db.fulltext.find({ $text: { $search: 'hello world' } }, { score: { $meta: 'textScore' } })
Result:-
{
"_id" : ObjectId("57e119cbf522cc85b5595797"),
"name" : "hello world",
"description" : {
"key" : "something"
},
"score" : 15
}
{
"_id" : ObjectId("57e119cbf522cc85b5595798"),
"name" : "user",
"description" : {
"key" : "hello world"
},
"score" : 7.5
}
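As a small sketch of the advice above (list each field explicitly instead of mixing a named field with "$**"), the helper below builds the two createIndex arguments from a plain field-to-weight map. The name textIndexSpec is made up for illustration; only createIndex and its weights option come from the answer.

```javascript
// Hypothetical helper: turn { field: weight } into createIndex arguments.
function textIndexSpec(fieldWeights) {
  const keys = {};
  const weights = {};
  for (const [field, weight] of Object.entries(fieldWeights)) {
    if (field === '$**') {
      // Follows the answer's advice: no wildcard next to named fields.
      throw new Error('list fields explicitly instead of using the wildcard');
    }
    keys[field] = 'text';
    weights[field] = weight;
  }
  return { keys, options: { weights } };
}

const spec = textIndexSpec({ name: 10, 'description.key': 5 });
console.log(JSON.stringify(spec));
// In the mongo shell this would be passed on as:
// db.fulltext.createIndex(spec.keys, spec.options)
```

Keeping the weights in one map makes it harder to index a field without deciding on its weight.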
Related
I have 3 text fields (tx_field1, tx_field2, tx_field3) in a MongoDB document and created a compound text index on all three of them.
createIndex({ "tx_field1": "text", "tx_field2": "text", "tx_field3": "text" })
I have to match exact text on these 3 fields. Consider the documents below.
{
"tx_field1" : "Control Number",
"tx_field2" : "Education & Employment History",
"tx_field3" : "Expires"
}
{
"tx_field1" : "Self Identify",
"tx_field2" : "Form",
"tx_field3" : "Education"
}
I am using the below query to match exact text
{ $text: { $search: "\"Education\"", "$caseSensitive": true } }
The query returns both documents above because both contain the exact phrase. Is there a way to match the entire field value so that only the second document is returned, that is, so that it matches only Education and not Education & Employment History?
The only way I could do that is with the following query:
{ $text: { $search: "\"Education\"","$caseSensitive": true },
$or : [
{
"tx_field1":"Education"
},{
"tx_field2":"Education"
},{
"tx_field3":"Education"
}
]}
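For what it's worth, a plain equality match already requires the whole field value to be equal, so it is the $or clause in the query above that narrows the result. A minimal sketch that builds that clause for a list of fields and checks it against the two sample documents in memory (exactFieldFilter and matchesFilter are hypothetical names, and the matcher only understands this flat $or-of-equalities shape):

```javascript
// Build { $or: [ { field1: value }, { field2: value }, ... ] }.
function exactFieldFilter(fields, value) {
  return { $or: fields.map((field) => ({ [field]: value })) };
}

// Tiny in-memory stand-in for the server, for illustration only.
function matchesFilter(doc, filter) {
  return filter.$or.some((clause) =>
    Object.entries(clause).every(([field, value]) => doc[field] === value)
  );
}

const sampleDocs = [
  { tx_field1: 'Control Number', tx_field2: 'Education & Employment History', tx_field3: 'Expires' },
  { tx_field1: 'Self Identify', tx_field2: 'Form', tx_field3: 'Education' },
];

const exactFilter = exactFieldFilter(['tx_field1', 'tx_field2', 'tx_field3'], 'Education');
const exactHits = sampleDocs.filter((doc) => matchesFilter(doc, exactFilter));
console.log(exactHits.map((doc) => doc.tx_field1)); // [ 'Self Identify' ]
```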
For example, I have this structure :
{
name:{
"en":"london",
"fr":"londres",
"sq":"londra"
},
...
},
{
name:{
"de":"barcelona",
"sv":"barcelone"
},
...
}
...
I would like to know how I can retrieve, in this example, all cities whose name contains "lon", without specifying the key ("de" or "fr").
So, not this:
db.cities.find({$or:[{"name.en":/lon/},{"name.fr":/lon/}, ...]})
But something like:
db.cities.find({"name":/lon/})
-> search in the children of "name", regardless of the key
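To get this you could create a text index that includes all fields:
db.cities.createIndex( { "$**": "text" } )
and then use $search in your query. Note that $text matches whole words (after stemming), not substrings, so a partial term such as "lon" will not match "london"; search for the full word instead.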
db.cities.find( { $text: {
$search: "lon",
$caseSensitive: true,
$diacriticSensitive: true
} } )
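To make the "any key under name" idea concrete, here is a client-side sketch in plain JavaScript. It is not a MongoDB query, and anyNameMatches is a hypothetical helper; it only shows the matching rule the question describes.

```javascript
// Hypothetical client-side predicate: does any value under `name` match?
function anyNameMatches(doc, pattern) {
  return Object.values(doc.name || {}).some(
    (value) => typeof value === 'string' && pattern.test(value)
  );
}

const cities = [
  { name: { en: 'london', fr: 'londres', sq: 'londra' } },
  { name: { de: 'barcelona', sv: 'barcelone' } },
];

// "contains lon" also matches "barcelona" (barce-lon-a), so both cities qualify.
const containsLon = cities.filter((city) => anyNameMatches(city, /lon/));
// Anchoring the pattern keeps only names that start with "lon".
const startsWithLon = cities.filter((city) => anyNameMatches(city, /^lon/));

console.log(containsLon.length, startsWithLon.length); // 2 1
```

The surprise with "barcelona" is worth keeping in mind when choosing between an unanchored and an anchored pattern.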
Is it possible to search a string if I have some data stored like
Names:
{
name: 'john'
},
{
name: 'pete'
},
{
name: 'jack smith'
}
Then I perform a query like
{ $stringContainsKeys: 'pete said hi to jack smith' }
and it would return
{
name: 'pete'
},
{
name: 'jack smith'
}
I'm not sure whether this is even possible in MongoDB, or whether this kind of search has a specific name.
Yes, quite possible indeed through the use of the $text operator which performs a text search on the content of the fields indexed with a text index.
Suppose you have the following test documents:
db.collection.insert([
{
_id: 1, name: 'john'
},
{
_id: 2, name: 'pete'
},
{
_id: 3, name: 'jack smith'
}
])
First you need to create a text index on the name field of your document:
db.collection.createIndex( { "name": "text" } )
Then perform a logical OR search on each term of a space-delimited search string, which returns documents that contain any of the terms.
The following query specifies a $search string of six terms delimited by spaces, "pete said hi to jack smith":
db.collection.find( { "$text": { "$search": "pete said hi to jack smith" } } )
This query returns documents that contain either pete or said or hi or to or jack or smith in the indexed name field:
/* 0 */
{
"_id" : 3,
"name" : "jack smith"
}
/* 1 */
{
"_id" : 2,
"name" : "pete"
}
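The OR behaviour can be imitated in a few lines of plain JavaScript. This is a deliberately simplified stand-in, not what the server does: it has no stemming, stop words, or scoring, and matchAnyTerm is a hypothetical name.

```javascript
// Simplified stand-in for $text OR matching on a single field.
function matchAnyTerm(docs, field, search) {
  const terms = new Set(search.toLowerCase().split(/\s+/).filter(Boolean));
  return docs.filter((doc) =>
    String(doc[field]).toLowerCase().split(/\s+/).some((word) => terms.has(word))
  );
}

const people = [
  { _id: 1, name: 'john' },
  { _id: 2, name: 'pete' },
  { _id: 3, name: 'jack smith' },
];

const anyTermHits = matchAnyTerm(people, 'name', 'pete said hi to jack smith');
console.log(anyTermHits.map((doc) => doc._id)); // [ 2, 3 ]
```

One matching term is enough, so a name such as 'jack jones' would also be returned; that differs from requiring the whole name to appear in the sentence.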
Starting from MongoDB 2.6 you can search a collection for documents that match any of the search terms.
db.names.find( { $text: { $search: "pete said hi to jack smith" } } )
This will search for each of the terms separated by space.
You can find more information about this at
http://docs.mongodb.org/manual/reference/operator/query/text/#match-any-of-the-search-terms
However, this works only with individual terms. If you have to search for an exact phrase rather than a single term, e.g. you want to find "jack smith" but not "smith jack", it will not work, so you will have to search for a phrase, which matches exact phrases in the text:
http://docs.mongodb.org/manual/reference/operator/query/text/#search-for-a-phrase
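A phrase search wraps the phrase in double quotes inside the $search string, which is easy to get wrong when escaping by hand. A minimal sketch, assuming a hypothetical helper named phraseQuery that refuses phrases which already contain double quotes rather than guessing how to escape them:

```javascript
// Hypothetical helper: build a $text phrase query for one phrase.
function phraseQuery(phrase) {
  if (phrase.includes('"')) {
    throw new Error('phrase must not contain double quotes');
  }
  return { $text: { $search: `"${phrase}"` } };
}

const jackSmithQuery = phraseQuery('jack smith');
console.log(JSON.stringify(jackSmithQuery));
// In the mongo shell: db.names.find(jackSmithQuery)
```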
If you need more advanced text-based search features in your application, you might consider using something like Elasticsearch https://www.elastic.co/guide/en/elasticsearch/reference/1.3/query-dsl-mlt-field-query.html.
Zoran
In MongoDB, I need to use group aggregation (I believe) to count the documents in a collection that share the same value. I need the results returned from greatest to least, along with the common value for each result.
E.g.:
1. I have a normal query with a range (e.g. field "value" > 5). I assume I should use the "match" stage for this when aggregating.
2. Get all documents with the same value for the "id" field that also match the above query parameters.
3. Sort the results from most matching documents to least.
4. Give me the common value of "id" for each result.
Sample documents:
#1. Type: "Like", value: 6, id: 123
#2. Type: "Like", value: 7, id: 123
#3. Type: "Like", value: 7, id: 123
#4. Type: "Like", value: 8, id: 12345
#5. Type: "Like", value: 7, id: 12345
#6. Type: "Like", value: 6, id: 1234
#7. Type: "Like", value: 2, id: 1234
#8. Type: "Like", value: 2, id: 1234
#9. Type: "Like", value: 2, id: 1234
Expected output (assume I have a limit of 3 documents, and the query asks for only documents with the "value" field > 5):
1. id: 123
2. id: 12345
3. id: 1234
I expect these in this order, as the id 123 is most popular, and 1234 is least popular, of the documents where the "value" field > 5.
Ideally, I would have a method that would return something like a String[] of the resulting Ids, in order.
db.data.aggregate([
{$match: {value:{$gt:5}}},
{$group: {'_id':"$id", num:{$sum:1}, avg:{$avg:"$value"}}},
{$sort:{num:-1}}, { $limit : 50}
])
db.getCollection('my_collection').aggregate([
//Only include documents whose field named "value" is greater than 5
{
$match: {
value: {
$gt:5
}
}
},
//Using the documents gathered from the $match above, create a new set of
// documents grouped by the "id" field, and use the "id" field as the "_id"
// for the group. Make a new field called "num" that increments by 1 for
// every matching document. Make a new field named "avg" that is the average
// of the field named "value".
{
$group: {
'_id' : "$id",
num : {
$sum : 1
},
avg : {
$avg : "$value"
}
}//end $group
},
// -- //
// Note: you could do another $match here, which would run on the new
// documents created by $group.
// -- //
//Sort the new documents by the "num" field in descending order
{
$sort : {
num : -1
}
},
//Only return the first 3 of the new documents
{
$limit : 3
}
])
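To see what the pipeline computes without a database at hand, the same match, group, sort, and limit steps can be walked through in plain JavaScript over the sample documents from the question. This is only an in-memory illustration; popularIds is a hypothetical name, and it returns the ids as strings because the question asks for something like a String[].

```javascript
function popularIds(docs, minValue, limit) {
  const counts = new Map();
  for (const doc of docs) {
    if (doc.value > minValue) {                          // $match: value > minValue
      counts.set(doc.id, (counts.get(doc.id) || 0) + 1); // $group with num: { $sum: 1 }
    }
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // $sort: { num: -1 }
    .slice(0, limit)             // $limit
    .map(([id]) => String(id));
}

const likes = [
  { value: 6, id: 123 }, { value: 7, id: 123 }, { value: 7, id: 123 },
  { value: 8, id: 12345 }, { value: 7, id: 12345 },
  { value: 6, id: 1234 }, { value: 2, id: 1234 },
  { value: 2, id: 1234 }, { value: 2, id: 1234 },
];

const topIds = popularIds(likes, 5, 3);
console.log(topIds); // [ '123', '12345', '1234' ]
```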
I am trying to count word usage using MongoDB. My collection currently looks like this:
{'_id':###, 'username':'Foo', words:[{'word':'foo', 'count':1}, {'word':'bar', 'count':1}]}
When a new post is made, I extract all the new words to an array, but I'm trying to figure out how to upsert into the words array and increment the count if the word already exists.
In the example above, for example, if the user "Foo" posted "lorem ipsum foo", I'd add "lorem" and "ipsum" to the users words array but increment the count for "foo".
Is this possible in one query? Currently I am using addToSet:
'$addToSet':{'words':{'$each':word_array}}
But that doesn't seem to offer any way of increasing the words count.
Would very much appreciate some help :)
If you're willing to switch from a list to hash (object), you can atomically do this.
From the docs: "$inc ... increments field by the number value if field is present in the object, otherwise sets field to the number value."
{ $inc : { field : value } }
So, if you could refactor your container and object:
words: [
{
'word': 'foo',
'count': 1
},
...
]
to:
words: {
'foo': 1,
'other_word': 2,
...
}
you could use the operation update with:
{ $inc: { 'words.foo': 1 } }
which would create { 'foo': 1 } if 'foo' doesn't exist, else increment foo.
E.g.:
$ db.bar.insert({ id: 1, words: {} });
$ db.bar.find({ id: 1 })
[
{ ..., "words" : { }, "id" : 1 }
]
$ db.bar.update({ id: 1 }, { $inc: { 'words.foo': 1 } });
$ db.bar.find({ id: 1 })
[
{ ..., "id" : 1, "words" : { "foo" : 1 } }
]
$ db.bar.update({ id: 1 }, { $inc: { 'words.foo': 1 } });
$ db.bar.find({ id: 1 })
[
{ ..., "id" : 1, "words" : { "foo" : 2 } }
]
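With the hash layout, a whole post can be folded into one $inc document. A minimal sketch, assuming a hypothetical helper named incUpdateForWords; it rejects words that are empty, contain a dot, or start with "$", since those do not work as plain field names in a dotted path.

```javascript
// Hypothetical helper: one $inc document for all words in a post.
function incUpdateForWords(words) {
  const inc = {};
  for (const word of words) {
    if (word === '' || word.includes('.') || word.startsWith('$')) {
      throw new Error(`unsafe as a field name: ${JSON.stringify(word)}`);
    }
    const path = `words.${word}`;
    inc[path] = (inc[path] || 0) + 1; // repeated words add up
  }
  return { $inc: inc };
}

const wordUpdate = incUpdateForWords(['lorem', 'ipsum', 'foo', 'foo']);
console.log(JSON.stringify(wordUpdate));
// {"$inc":{"words.lorem":1,"words.ipsum":1,"words.foo":2}}
// In the mongo shell: db.bar.update({ id: 1 }, wordUpdate)
```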
Unfortunately, it is not possible to do this in a single update with your schema. Your schema is a bit questionable and should probably be changed to use a dedicated collection of word counters, e.g.:
db.users {_id:###, username:'Foo'}
db.words.counters {_id:###, word:'Word', userId: ###, count: 1}
That will avoid quite a few issues, such as:
Running into maximum document size limits
Forcing mongo to keep moving around your documents as you increase their size
Both scenarios require two updates to do what you want, which introduces atomicity issues. Updating per word by looping through word_array is better and safer (and is possible with both solutions).
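One way to read "updating per word" with the dedicated counters collection is one upsert per distinct word. The sketch below only builds the operations; wordCounterOps is a hypothetical helper, and it assumes the userId, word, and count fields shown above plus a unique index on (userId, word) so that concurrent upserts cannot create duplicate counters.

```javascript
// Hypothetical helper: one upsert operation per distinct word.
function wordCounterOps(userId, words) {
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return [...counts].map(([word, n]) => ({
    updateOne: {
      filter: { userId, word },
      update: { $inc: { count: n } }, // creates count on insert, increments otherwise
      upsert: true,
    },
  }));
}

const counterOps = wordCounterOps('user-1', ['lorem', 'ipsum', 'foo', 'foo']);
console.log(counterOps.length); // 3
// In the mongo shell (3.2 or later): db.words.counters.bulkWrite(counterOps)
```

Each operation is atomic on its own document, but the batch as a whole is not a transaction.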