One of my collections no longer returns anything for some search values. Here is a console dump to illustrate the problem:
meteor:PRIMARY> db['test'].insert({ sku: 'Barrière' });
WriteResult({ "nInserted" : 1 })
meteor:PRIMARY> db['test'].insert({ sku: 'Bannière' });
WriteResult({ "nInserted" : 1 })
meteor:PRIMARY> db['test'].createIndex({ sku: 'text' });
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
meteor:PRIMARY> db['test'].find({ sku: /ba/i });
{ "_id" : ObjectId("57bbb447fc77800b1e63ba64"), "sku" : "Barrière" }
{ "_id" : ObjectId("57bbb455fc77800b1e63ba65"), "sku" : "Bannière" }
meteor:PRIMARY> db['test'].find({ $text: { $search: 'ba' } });
meteor:PRIMARY> db['test'].find({ $text: { $search: 'Ba' } });
meteor:PRIMARY>
The search returned nothing, even though I clearly added two documents that should match. What's going on? What option/config am I missing?
** Edit **
I tried this query
meteor:PRIMARY> db['test'].find({ $or: [ { $text: { $search: 'ba' } }, { sku: { $regex: 'ba', $options: 'i' } } ] });
Error: error: {
"waitedMS" : NumberLong(0),
"ok" : 0,
"errmsg" : "error processing query: ns=meteor.testTree: $or\n sku regex /ba/\n TEXT : query=ba, language=english, caseSensitive=0, diacriticSensitive=0, tag=NULL\nSort:
{}\nProj: {}\n planner returned error: Failed to produce a solution for TEXT under OR - other non-TEXT clauses under OR have to be indexed as well.",
"code" : 2
}
But I'm not sure how I can make an index to search partial values (i.e. using $regex or other operator). Using a third party indexer seems overkill to me... Surely there is a way to perform a full-text search, as well as a pattern match at once?
Is my only solution to perform two queries and merge the results manually?
Try this:
db['test'].insert({ sku: 'ba ba' });
db['test'].find({ $text: { $search: 'ba' } });
Also refer to the MongoDB documentation:
If the search string is a space-delimited string, $text operator performs a logical OR search on each term and returns documents that contains any of the terms.
I think MongoDB's $text $search just splits the string on whitespace and matches whole words. If you need to match part of a word, you may need another tool, or you can use $regex instead.
If the only requirement is to query the value by prefix, you can use $regex: it can use an index when the pattern is anchored to the start of the value. Otherwise it will scan the whole collection.
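To make the prefix idea concrete, here is a minimal sketch of building such a filter from user input. The helper names escapeRegex and prefixFilter are made up for illustration, and the pattern is kept case-sensitive on the assumption that case-insensitive regular expressions generally cannot use an index effectively.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so that user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a "field starts with prefix" filter.
// The pattern is anchored with ^ and deliberately case-sensitive.
function prefixFilter(field, prefix) {
  return { [field]: { $regex: '^' + escapeRegex(prefix) } };
}

const filter = prefixFilter('sku', 'Ba');
// In the shell this object could be passed as: db['test'].find(filter)
console.log(JSON.stringify(filter));

// Local check of the same pattern against a few sample values.
const pattern = new RegExp(filter.sku.$regex);
console.log(['Barrière', 'Bannière', 'Cabane'].filter((s) => pattern.test(s)));
```

Because the pattern is case-sensitive, a search for 'ba' would not match 'Barrière' here; storing a normalised copy of the value is one way around that, but that is a design choice outside what the question shows.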
Related
I use MongoDB version 3.4.5 and I tried to exclude a term with - (minus).
For some reason it does not work.
These are my attempts:
db.Product.find()
{ "_id" : ObjectId("59cbfcd01889a9fd89a3565c"), "name" : "Produkt Neu", ...
{ "_id" : ObjectId("59cc7d941889a4f4c2f43b14"), "name" : "Produkt2", ...
db.Product.find( { $text: { $search: 'Produkt -Neu' } } );
db.Product.find( { $text: { $search: "Produkt -Neu" } } );
db.Product.find( { $text: { $search: "Produkt2" } } );
{ "_id" : ObjectId("59cc7d941889a4f4c2f43b14"), "name" : "Produkt2", ...
db.Product.dropIndexes()
db.Product.createIndex({ name: "text" })
{
"nIndexesWas" : 2,
"msg" : "non-_id indexes dropped for collection",
"ok" : 1
}
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
db.Product.find( { $text: { $search: "Produkt -Neu" } } );
db.Product.find( { $text: { $search: "Produkt Neu" } } );
{ "_id" : ObjectId("59cbfcd01889a9fd89a3565c"), "name" : "Produkt Neu", ...
Does anyone know what I have to do to get it to work with - (minus)?
I created a collection: Product with the following documents ...
{
"_id" : ObjectId("59d0ada3c26584cd8b79fc51"),
"name" : "Produkt Neu"
}
{
"_id" : ObjectId("59d0adafc26584cd8b79fc54"),
"name" : "Produkt2"
}
... and I declared a text index on this collection as follows:
db.Product.createIndex({ name: "text" })
I ran the following queries which faithfully reproduce the situation described in your question:
// returns one document since there is one document
// which has the text indexed value: "Produkt Neu"
db.Product.find( { $text: { $search: "Produkt Neu" } } );
// returns no documents: the only document containing the term "Produkt"
// also contains the negated term "Neu", and "Produkt2" is a different term
db.Product.find( { $text: { $search: "Produkt -Neu" } } )
You are, I think, expecting this query ...
db.Product.find( { $text: { $search: "Produkt -Neu" } } )
... to return the second document on the grounds that excluding Neu should allow a match on the document having name=Produkt2 but this is not how MongoDB $text searches work. MongoDB $text searches do not support partial matching so the search term Produkt -Neu (which evaluates as Produkt) will not match Produkt2. To verify this, I ran the following query:
db.Product.find( { $text: { $search: "Produkt2 -Neu" } } )
This query returns the second document (i.e. the one with name=Produkt2) which proves that the hyphen-minus (-) successfully negated the term: Neu.
On a side note: MongoDB text indexes do support language stemming. To verify this behaviour I added the following document...
{
"_id" : ObjectId("59d0b2b4c26584cd8b79fd7c"),
"name" : "Produkts"
}
...and then ran this query ...
db.Product.find( { $text: { $search: "Produkt -Neu" } } );
This query returns the document with name=Produkts because Produkt is the stem of Produkts.
In summary, a $text search will find matches where each search term either (a) matches a whole word in the text index or (b) is a recognised stem of a whole word in the text index. Note: there are also phrase matches but those are not relevant to the examples in your question. Use of the hyphen-minus serves to change the search terms but it does not change how the search term is evaluated.
More details in the docs and there is an open issue with MongoDB relating to supporting partial matching on text indexes.
If you really need to support partial matching then you'll probably want to discard the text index and use the $regex operator instead. It's worth noting, though, that index use with the $regex operator is probably not what you expect. The brief summary is this: if your pattern is anchored to the start of the value (e.g. ^Produk rather than rodukt) then MongoDB can use an index efficiently, but otherwise it cannot.
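The matching rules summarised above can be imitated with a toy model in plain JavaScript. This is only a sketch and not how MongoDB implements $text: it assumes terms are split on whitespace, and it ignores stemming, stop words and phrases. The function names are invented for the example.

```javascript
// Toy model of $text term matching: whole terms only, "-term" negates.
// Assumption: no stemming, stop words or phrases (the real index handles those).
function tokenize(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

function toyTextSearch(names, search) {
  const terms = tokenize(search);
  const wanted = terms.filter((t) => !t.startsWith('-'));
  const negated = terms.filter((t) => t.startsWith('-')).map((t) => t.slice(1));
  return names.filter((name) => {
    const words = new Set(tokenize(name));
    const hit = wanted.some((t) => words.has(t));       // logical OR over wanted terms
    const blocked = negated.some((t) => words.has(t));  // any negated term excludes
    return hit && !blocked;
  });
}

const names = ['Produkt Neu', 'Produkt2'];
console.log(toyTextSearch(names, 'Produkt -Neu'));   // nothing: "produkt" is not "produkt2"
console.log(toyTextSearch(names, 'Produkt2 -Neu'));  // only "Produkt2"
console.log(toyTextSearch(names, 'Produkt Neu'));    // only "Produkt Neu"
```

The point of the model is that negation only removes documents; it never turns "Produkt" into a partial match for "Produkt2".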
Problem Description
MongoDB version: 3.4.4
Documents in the MongoDB collection were created from the XML files (not GridFS) and look like this one:
{
...
"СвНаимЮЛ" : {
"#attributes" : {
"НаимЮЛПолн" : "ОБЩЕСТВО С ОГРАНИЧЕННОЙ ОТВЕТСТВЕННОСТЬЮ \"КОНСАЛТИНГОВАЯ КОМПАНИЯ \"ГОТЛИБ ЛИМИТИД\"",
...
},
...
}
...
}
Language is Russian. Collection has about 10,000,000 documents and a text index on the field "СвНаимЮЛ.#attributes.НаимЮЛПолн".
Search by one word is very fast:
db.records.find({
$text: {
$search: "ГОТЛИБ"
}
})
But searching for several words with a logical AND is so slow that I can't even wait for it to finish to get explain('executionStats') results.
For example, the following query is very slow. It finds all documents which contain the words "ГОТЛИБ" AND "ЛИМИТИД":
db.records.find({
$text: {
$search: "\"ГОТЛИБ\" \"ЛИМИТИД\""
}
})
Searching for a phrase is also slow. For example, find all documents which contain the phrase "ГОТЛИБ ЛИМИТИД":
db.records.find({
$text: {
$search: "\"ГОТЛИБ ЛИМИТИД\""
}
})
getIndexes() output:
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "egrul.records"
},
...
{
"v" : 2,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "СвНаимЮЛ.#attributes.НаимЮЛПолн_text",
"ns" : "egrul.records",
"default_language" : "russian",
"weights" : {
"СвНаимЮЛ.#attributes.НаимЮЛПолн" : 1
},
"language_override" : "language",
"textIndexVersion" : 3
}
]
Question
Can I somehow speed up search by several words (with logical AND) or search by phrase?
Edited
Just found that search by multiple words with logical OR is also slow:
db.records.find({
$text: {
$search: "ГОТЛИБ ЛИМИТИД"
}
})
It looks like the problem is not that searching for multiple words is slow, but that the search is slow when the search term appears in many documents.
E.g. the word "МИЦУБИСИ" appears in only 24 (out of 10,000,000) documents, so the query
db.records.find({
$text: {
$search: "МИЦУБИСИ"
}
}).count()
is very fast.
But the word "СЕРВИС" appears in 160,000 documents and the query
db.records.find({
$text: {
$search: "СЕРВИС"
}
}).count()
is very slow (takes about 40 minutes).
Query
db.records.find({
$text: {
$search: "\"МИЦУБИСИ\" \"СЕРВИС\""
}
}).count()
is also slow because (I suppose) MongoDB looks up the terms "МИЦУБИСИ" (fast) and "СЕРВИС" (slow) and then computes an intersection or something similar.
Now I want to find a way to limit the number of results, something like "find 10 documents and stop", because limit() doesn't work with text queries.
Or maybe upgrade my server hardware.
Or look at Elasticsearch.
Is it possible to have $and operator on multiple $text index search in mongo?
I have documents in tp collection of my db
> db.tp.find()
{ "_id" : ObjectId("...."), "name" : "tp", "dict" : { "item1" : "random", "item2" : "some" } }
{ "_id" : ObjectId("...."), "name" : "tp", "dict" : { "item3" : "rom", "item4" : "tttt" } }
Then I do
> db.tp.createIndex({ "$**": "text" })
> db.tp.find({ $and: [{$text : { $search: "random" } }, {$text : { $search: "redruth" } }]})
And it fails with
Error: error: {
"waitedMS" : NumberLong(0),
"ok" : 0,
"errmsg" : "Too many text expressions",
"code" : 2
}
A text index search works for a single search, so is it not possible to combine multiple text searches with the $and operator? By the way, I am using the wildcard specifier $** for indexing because I want to search over the entire document.
Based on the MongoDB docs, a logical AND can be expressed directly in the search string by combining quoted phrases separated by spaces. For example, to search for "ssl certificate" AND "authority key", the query should look like:
> db.tp.find({'$text': {'$search': '"ssl certificate" "authority key"'}})
A query can specify at most one $text expression. See:
https://docs.mongodb.com/manual/reference/operator/query/text/
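Since only one $text expression is allowed, the terms have to be folded into a single search string. A small sketch of that, assuming the quoted-phrase AND behaviour described above; andSearch is a made-up helper name, and embedded double quotes are simply dropped rather than escaped:

```javascript
// Hypothetical helper: wrap each term in double quotes so that $text
// treats it as a required phrase, then join the phrases with spaces.
// Assumption: embedded double quotes are dropped rather than escaped.
function andSearch(terms) {
  return terms
    .map((t) => t.replace(/"/g, '').trim())
    .filter((t) => t.length > 0)
    .map((t) => '"' + t + '"')
    .join(' ');
}

// A single $text expression that asks for both terms.
const filter = { $text: { $search: andSearch(['random', 'redruth']) } };
// In the shell this object could be passed as: db.tp.find(filter)
console.log(JSON.stringify(filter));
```

With the two sample documents from the question this would still return nothing, because no document contains "redruth"; the sketch only shows how to avoid the "Too many text expressions" error.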
I have mongodb with a $text-Index and elements like this:
{
foo: "my super cool item"
}
{
foo: "your not so cool item"
}
If I search with
mycoll.find({ $text: { $search: "super"} })
I get the first item (correct).
But I also want to search with "uper" and get the first item. If I try:
mycoll.find({ $text: { $search: "uper"} })
I don't get any results.
My question:
Is there a way to use $text so that it finds documents in which the search string is only part of a word (e.g. like '%uper%' in MySQL)?
Note: I'm not asking for a regex-only search. I'm asking for a regex search within a $text search!
It's not possible to do it with the $text operator.
Text indexes are built from the terms contained in the string value (or in an array of strings), and the search is based on those indexed terms.
You can only group terms into a phrase; you cannot match part of a term.
See the $text operator reference and the text indexes description.
The best solution is to use both a text index and a regex.
The text index will provide excellent speed but won't match as many documents as a regex.
The regex will allow a fallback in case the index doesn't return enough results.
db.mycoll.createIndex({ foo: 'text' });
db.mycoll.createIndex({ foo: 1 });
db.mycoll.find({
$or: [
{ $text: { $search: 'uper' } },
{ foo: { $regex: 'uper' } }
]
});
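For even better performance (but different results), anchor the regex with ^. Note that an anchored pattern only matches values that start with the term, so '^uper' would not match "my super cool item":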
db.mycoll.find({
$or: [
{ $text: { $search: 'uper' } },
{ foo: { $regex: '^uper' } }
]
});
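If the regex is used as a fallback in a second query rather than inside one $or, the two result sets have to be merged by hand. A rough sketch of that merge step, using plain arrays in place of real cursors; mergeById and the sample _id values are invented for the example:

```javascript
// Merge two result lists, keeping text-index hits first and skipping
// documents that were already seen (compared by the string form of _id).
function mergeById(textHits, regexHits, limit = Infinity) {
  const seen = new Set();
  const merged = [];
  for (const doc of [...textHits, ...regexHits]) {
    if (merged.length >= limit) break;
    const key = String(doc._id);
    if (seen.has(key)) continue;
    seen.add(key);
    merged.push(doc);
  }
  return merged;
}

// Stand-ins for the results of the $text query and the $regex query.
const textHits = [{ _id: 'a', foo: 'my super cool item' }];
const regexHits = [
  { _id: 'a', foo: 'my super cool item' },
  { _id: 'b', foo: 'supernova' },
];
console.log(mergeById(textHits, regexHits));
```

Keeping the text hits first is just one possible ordering; it assumes whole-word matches are the more relevant ones.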
What you are trying to do in your second example is a leading-wildcard search (matching "uper" inside "super") in your collection mycoll on the field foo. This is not something the text search feature is designed for, and it is not possible with the $text operator: text search does not support wildcard matching on any token in the indexed field. However, you can perform a regex search instead, as others suggested. Here is my walkthrough:
>db.mycoll.find()
{ "_id" : ObjectId("53add9364dfbffa0471c6e8e"), "foo" : "my super cool item" }
{ "_id" : ObjectId("53add9674dfbffa0471c6e8f"), "foo" : "your not so cool item" }
> db.mycoll.find({ $text: { $search: "super"} })
{ "_id" : ObjectId("53add9364dfbffa0471c6e8e"), "foo" : "my super cool item" }
> db.mycoll.count({ $text: { $search: "uper"} })
0
The $text operator supports searching for a single word, for one or more words, or for a phrase. The kind of search you want is not supported.
The regex solution:
> db.mycoll.find({foo:/uper/})
{ "_id" : ObjectId("53add9364dfbffa0471c6e8e"), "foo" : "my super cool item" }
>
The answer to your final question: to do a MySQL-style %super% match in MongoDB you would most likely have to do:
db.mycoll.find( { foo : /.*super.*/ } );
It should work with /uper/.
See http://docs.mongodb.org/manual/reference/operator/query/regex/ for details.
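As a quick check that the surrounding .* adds nothing, the patterns can be compared locally. This sketch uses JavaScript's own RegExp on the two sample strings, on the assumption that such simple patterns behave the same way in MongoDB's regex engine:

```javascript
// The two sample values of the foo field from the question.
const docs = ['my super cool item', 'your not so cool item'];

// Return the strings that a given pattern matches somewhere in the text.
const match = (re) => docs.filter((s) => re.test(s));

// An unanchored pattern already matches anywhere in the string,
// so the leading and trailing .* are redundant.
console.log(match(/.*super.*/));
console.log(match(/super/));
console.log(match(/uper/));
```

All three patterns select the same single string, the one containing "super".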
Edit:
As per request in the comments:
The solution wasn't necessarily meant to actually give what the OP requested, but what he needed to solve the problem.
Since $regex searches don't work with text indices, a simple regex search over an indexed field should give the expected result, though not using the requested means.
Actually, it is pretty easy to do this:
db.collection.insert( {foo: "my super cool item"} )
db.collection.insert( {foo: "your not so cool item"})
db.collection.createIndex({ foo: 1 })
db.collection.find({'foo': /uper/})
gives us the expected result:
{ "_id" : ObjectId("557f3ba4c1664dadf9fcfe47"), "foo" : "my super cool item" }
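An added explain shows us that the index was used (an IXSCAN rather than a collection scan), although the bounds cover every string key because the pattern is unanchored: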
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.collection",
"indexFilterSet" : false,
"parsedQuery" : {
"foo" : /uper/
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"filter" : {
"foo" : /uper/
},
"keyPattern" : {
"foo" : 1
},
"indexName" : "foo_1",
"isMultiKey" : false,
"direction" : "forward",
"indexBounds" : {
"foo" : [
"[\"\", {})",
"[/uper/, /uper/]"
]
}
}
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
// skipped
},
"ok" : 1
}
To make a long story short: no, you cannot reuse a $text index, but you can do the query efficiently. As described in "Implement auto-complete feature using MongoDB search", one could probably be even more efficient by using a map/reduce approach, eliminating redundancy and unnecessary stop words from the indices, at the cost of no longer being real time.
As francadaval said, a text index searches by terms, but if you combine a regex with the text index you should be good.
mycoll.find({$or: [
{
$text: {
$search: "super"
}
},
{
'column-name': {
$regex: 'uper',
$options: 'i'
}
}
]})
Also, make sure the column has a normal index in addition to the text index.
With the regex alone you can match "super cool" but not "super item"; to cover both, use an $or query with $text and $regex for the search term.
Make sure both the text index and the normal index exist for this to work.
You could achieve this with:
db.mycoll.find( {foo: { $regex : /uper/i } })
Here 'i' is an option that denotes a case-insensitive search.
Using pymongo I am trying to retrieve the documents in a collection that have a SmallUrl different from null. I'm trying to get the names key and the SmallUrl key.
If I query on the Name only, the query runs fine. However, since I want to filter out the documents that have a null value for SmallUrl, I added that condition to the query, and now the query returns nothing.
This is the MongoDB structure:
{u'Images': {u'LargeUrl': u'http://somelink.com/images/7960/53.jpg',
u'SmallUrl': u'http://somelink.com/images/7960/41.jpg'}
u'Name': u'Some Name',
u'_id': ObjectId('512b90f157dd5920ee87049e')}
{u'Images': {u'LargeUrl': u'http://somelink.com/images/8001/53.jpg',
u'SmallUrl': null}
u'Name': u'Some Name Variation',
u'_id': ObjectId('512b90f157dd5820ee87781e')}
This is the function for the query:
def search_title(search_title):
    ''' Returns a tuple with cursor (results) and the size of the cursor'''
    query = {'Name': {'$regex': search_title, '$options': 'i'}, 'SmallUrl': {'$exists': True}}
    projection = {'Name': 1, 'Images': 1}
    try:
        results = movies.find(query, projection)
    except:
        print "Unexpected error: ", sys.exc_info()[0]
    return results, results.count()
I am new to MongoDB and have already tried different queries. I have used $and, $not, and {'$ne': 'null'}. I also ran the queries in the mongo shell, with the same result. This is an example of what I ran in the shell:
db.myCollection.find({'Name': {'$regex': 'luis', '$options': 'i'}, 'SmallUrl': {'$ne': null}})
I would like to know what I am doing wrong.
The pymongo equivalent of null is Python's None, so the query should look like:
query = {
'Name': {'$regex': search_title, '$options': 'i'},
'Images.SmallUrl': {'$ne': None}}
Your query does not work because you should use 'Images.SmallUrl' instead of 'SmallUrl' as the key in the query.
My test collection:
> db.t.find()
{ "_id" : ObjectId("512cdbb365fa12a0db9d8c35"), "Images" : { "LargeUrl" : "http://aaa.com", "SmallUrl" : "http://bb.com" }, "Name" : "yy" }
{ "_id" : ObjectId("512cdc1765fa12a0db9d8c36"), "Images" : { "LargeUrl" : "http://aaa.com", "SmallUrl" : null }, "Name" : "yy" }
and my test query:
> db.t.find({'Images.SmallUrl': {$ne: null}})
{ "_id" : ObjectId("512cdbb365fa12a0db9d8c35"), "Images" : { "LargeUrl" : "http://aaa.com", "SmallUrl" : "http://bb.com" }, "Name" : "yy" }
Hope to help ^_^
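To see why the key matters, here is a toy model of dot notation in plain JavaScript. It is only an illustration of the lookup, not MongoDB's matcher: getPath and notNull are invented names, and arrays are ignored.

```javascript
// Resolve a dotted path such as 'Images.SmallUrl' against a plain object.
// Returns undefined when any step of the path is missing.
function getPath(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Toy equivalent of { path: { $ne: null } }: the field must exist and be non-null.
function notNull(docs, path) {
  return docs.filter((doc) => getPath(doc, path) != null);
}

const docs = [
  { Name: 'Some Name', Images: { SmallUrl: 'http://somelink.com/images/7960/41.jpg' } },
  { Name: 'Some Name Variation', Images: { SmallUrl: null } },
];

// 'SmallUrl' does not exist at the top level, so nothing passes the filter.
console.log(notNull(docs, 'SmallUrl').length);
// The nested path finds the one document with a non-null URL.
console.log(notNull(docs, 'Images.SmallUrl').length);
```

The first count is 0 and the second is 1, which mirrors the empty result in the question and the single document returned by the corrected query.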