Is it possible to perform a query and have the results ranked by how many of the query words they match? Please see my example below.
Example objects:
ID="1" product="car"
ID="2" product="racing car"
ID="3" product="electric racing car"
Example search:
racing car
Returns the following objects in order: 2, 3, 1
To consider all words as optional with Algolia, you can pass an array composed of all the words of the query in your request:
index.search('racing car', {
  optionalWords: ['racing', 'car']
});
This will give you the results in the order you expect.
Algolia provides another related option, removeWordsIfNoResults, which will consider some words as optional if and only if it doesn't find results matching every word of the query.
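To see why treating every word as optional produces the order 2, 3, 1, here is a minimal local sketch. It is a toy ranking, not Algolia's actual ranking formula, and the function name rankByMatchedWords is made up for illustration: objects matching more query words come first, and ties go to the object with fewer non-matching words.

```javascript
// Illustrative only: a toy ranking, NOT Algolia's real ranking formula.
// rankByMatchedWords is a hypothetical helper name.
function rankByMatchedWords(objects, query) {
  const queryWords = query.toLowerCase().split(/\s+/).filter(Boolean);
  return objects
    .map((obj) => {
      const words = obj.product.toLowerCase().split(/\s+/);
      const matched = queryWords.filter((w) => words.includes(w)).length;
      return { obj, matched, extra: words.length - matched };
    })
    // Every word is optional, but at least one must match.
    .filter((entry) => entry.matched > 0)
    // More matched words first; on a tie, fewer non-matching words first.
    .sort((a, b) => b.matched - a.matched || a.extra - b.extra)
    .map((entry) => entry.obj.ID);
}

const objects = [
  { ID: "1", product: "car" },
  { ID: "2", product: "racing car" },
  { ID: "3", product: "electric racing car" },
];

console.log(rankByMatchedWords(objects, "racing car")); // [ '2', '3', '1' ]
```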
Related
I'm very confused by this behavior. It seems inconsistent and strange, especially since I've read that MongoDB isn't supposed to support partial search terms in full text search. I'm using version 3.4.7 of MongoDB Community Server. I'm running these tests from the mongo shell.
So, I have a MongoDB collection with a text index assigned. I created the index like this:
db.submissions.createIndex({"$**":"text"})
There is a document in this collection that contains these two values:
"Craig"
"Dr. Bob".
My goal is to do a text search for a document that has multiple matching terms in it.
So, here are tests I've run, and their inconsistent output:
SINGLE TERM, COMPLETE
db.submissions.find({"$text":{"$search":"\"Craig\""}})
Result: Gets me the document with this value in it.
SINGLE TERM, PARTIAL
db.submissions.find({"$text":{"$search":"\"Crai\""}})
Result: Returns nothing, because this partial search term doesn't exactly match anything in the document.
MULTIPLE TERMS, COMPLETE
db.submissions.find({"$text":{"$search":"\"Craig\" \"Dr. Bob\""}})
Result: Returns the document with both of these terms in it.
MULTIPLE TERMS, ONE PARTIAL
db.submissions.find({"$text":{"$search":"\"Craig\" \"Dr. Bo\""}})
Result: Returns the document with both terms in it, despite the fact that one term is partial. There is nothing in the document that matches "Dr. Bo"
MULTIPLE TERMS, BOTH PARTIAL
db.submissions.find({"$text":{"$search":"\"Crai\" \"Dr. Bo\""}})
Result: Returns the document with both terms in it, despite the fact that both terms are partial and incomplete. There is nothing in the document that matches either "Crai" or "Dr. Bo".
Question
So, it all boils down to: why? Why is it that when I do a text search with a single partial term, nothing gets returned, but when I do a text search with two partial terms, I get the matching result? It just seems so strange and inconsistent.
MongoDB $text searches do not support partial matching. MongoDB allows text search queries on string content with support for case insensitivity, delimiters, stop words and stemming. And the terms in your search string are, by default, OR'ed.
Taking your (very useful :) examples one by one:
SINGLE TERM, PARTIAL
// returns nothing because there is no whole word with the value `Crai` in your
// text index and there is no whole word for which `Crai` is a recognised stem
db.submissions.find({"$text":{"$search":"\"Crai\""}})
MULTIPLE TERMS, COMPLETE
// returns the document because it contains all of these words
// note in the text index Dr. Bob is not a single entry since "." is a delimiter
db.submissions.find({"$text":{"$search":"\"Craig\" \"Dr. Bob\""}})
MULTIPLE TERMS, ONE PARTIAL
// returns the document because it contains the whole word "Craig" and it
// contains the whole word "Dr"
db.submissions.find({"$text":{"$search":"\"Craig\" \"Dr. Bo\""}})
MULTIPLE TERMS, BOTH PARTIAL
// returns the document because it contains the whole word "Dr"
db.submissions.find({"$text":{"$search":"\"Crai\" \"Dr. Bo\""}})
Bear in mind that the $search string is ...
A string of terms that MongoDB parses and uses to query the text index. MongoDB performs a logical OR search of the terms unless specified as a phrase.
So, if at least one term in your $search string matches then MongoDB matches that document.
To verify this behaviour, if you edit your document changing Dr. Bob to DrBob then the following queries will return no documents:
db.submissions.find({"$text":{"$search":"\"Craig\" \"Dr. Bo\""}})
db.submissions.find({"$text":{"$search":"\"Crai\" \"Dr. Bo\""}})
These now return no matches: Dr is no longer followed by the . delimiter, so it is no longer a whole word in your text index.
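The role of the delimiter can be illustrated with a small local model. This is only a sketch of the explanation above, not MongoDB's implementation: it assumes a tokenizer that splits on any non-alphanumeric character, and it ignores stemming, stop words and quoted-phrase handling, so it will not reproduce every real result. The helper names tokenize and orMatch are made up for illustration.

```javascript
// Simplified model (assumption): split on non-alphanumerics and lowercase.
// No stemming, no stop words, no phrase handling.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Logical OR: the document matches if at least one search term
// is a whole word in the set of indexed words.
function orMatch(documentValues, search) {
  const indexed = new Set(documentValues.flatMap(tokenize));
  return tokenize(search).some((term) => indexed.has(term));
}

console.log(tokenize("Dr. Bob")); // [ 'dr', 'bob' ]
console.log(tokenize("DrBob"));   // [ 'drbob' ]

const doc = ["Craig", "Dr. Bob"];
console.log(orMatch(doc, "Crai"));         // false: "crai" is not a whole word
console.log(orMatch(doc, "Crai Dr. Bo"));  // true: "dr" is a whole word
console.log(orMatch(["Craig", "DrBob"], "Crai Dr. Bo")); // false: "dr" is gone
```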
You can do partial searching with Mongoose by using an external Mongoose library called mongoose-fuzzy-search, which breaks the search text into n-grams (short character sequences).
User.fuzzySearch('jo').sort({ age: -1 }).exec(function (err, users) {
  if (err) {
    console.error(err);
    return;
  }
  console.log(users);
});
Example:
{
shortName: "KITT",
longName: "Knight Industries Two Thousand",
fromZeroToSixty: 2,
year: 1982,
manufacturer: "Pontiac",
/* 25 more fields */
}
The ability to query by at least 20 fields, which means that only 10 fields are left unindexed
There are 3 fields (all numeric) that could be used for sorting (in both directions)
This leaves me wondering how sites with lots of searchable fields do it, e.g. real estate or car sale sites where you can filter by every small detail and choose between several sort options.
How could I pull this off with MongoDB? How should I index that kind of collection?
I'm aware that there are databases made specifically for searching, but there must be general rules of thumb for doing this (even if less performant) in any database. I'm sure not everybody uses Elasticsearch or similar.
---
Optional reading:
My reasoning is that the index could be huge, but the index order matters. You always make sure that the fields that return the fewest results come first and the most generic fields come last in the index. However, what if the user chooses only generic fields? Should I include the non-generic fields in the query anyway? How do I handle sorting in both directions? Or does index intersection save the day, so that I should just add 20 different indexes?
A text index is your friend.
Read up on it here: https://docs.mongodb.com/v3.2/core/index-text/
In short, it's a way to tell MongoDB that you want full text search over a specific field, multiple fields, or all fields (yay!).
To allow text indexing of all fields, use the special symbol $** and define it with type 'text':
db.collection.createIndex( { "$**": "text" } )
You can also configure it with case insensitivity or diacritic insensitivity, and more.
To perform text searches using the index, use the $text query helper, see: https://docs.mongodb.com/v3.2/reference/operator/query/text/#op._S_text
Update:
In order to allow user to select specific fields to search on, it's possible to use weights when creating the text-index: https://docs.mongodb.com/v3.2/core/index-text/#specify-weights
If you carefully select your fields' weights, for example using different prime numbers only, and then add the $meta text score to your results you may be able to figure out from the "textScore" which field was matched on this query, and so filter out the results that didn't get a hit from a selected search field.
Read more here: https://docs.mongodb.com/v3.2/tutorial/control-results-of-text-search/
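As a rough sketch of the pieces mentioned above, the following builds a weighted text index specification and the arguments for a $text query that exposes the textScore. The field names shortName and longName come from the example document in the question; the weight values 3 and 2 and the helper name buildTextSearch are assumptions chosen only for illustration. Nothing here talks to a database, it just constructs the plain objects you would pass to the shell or a driver.

```javascript
// Index specification with per-field weights (the weight values are arbitrary examples).
const indexKeys = { shortName: "text", longName: "text" };
const indexOptions = { weights: { shortName: 3, longName: 2 } };

// Hypothetical helper: builds the filter, projection and sort for a $text search
// that returns the relevance score and orders results by it.
function buildTextSearch(searchString) {
  return {
    filter: { $text: { $search: searchString } },
    projection: { score: { $meta: "textScore" } },
    sort: { score: { $meta: "textScore" } },
  };
}

const { filter, projection, sort } = buildTextSearch("Knight Pontiac");
console.log(JSON.stringify(filter));

// In the mongo shell these objects would be used roughly as:
//   db.collection.createIndex(indexKeys, indexOptions)
//   db.collection.find(filter, projection).sort(sort)
```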
I am trying to search on multiple conditions. The query runs, but it does not behave the way I expect.
Assuming I have a search query like
Orders.find({$or: {"status":{"$in":["open", "closed"]},"paymentStatus":{"$in":["unpaid"]}}}
)
and I add another filter parameter such as approvalStatus, it does not keep the previously found items. Instead it treats the query like an AND, which returns an empty collection if one of the conditions does not match.
How can I write a query that, regardless of what is passed into it, retains the previously found items even if no record matches one of the conditions?
Like a simple OR query in SQL.
I hope I explained this well enough.
Using $or here is the right approach, but its value needs to be an array of query expressions, not an object.
So your query should look something like this instead:
Orders.find({$or: [
  {"status": {"$in": ["open", "closed"]}},
  {"paymentStatus": {"$in": ["unpaid"]}},
  {"approvalStatus": {"$in": ["approved"]}}
]})
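If the set of filters varies from request to request, the $or array can be assembled from whatever was passed in. The sketch below assumes the filters arrive as an object mapping field names to arrays of accepted values; buildOrQuery is a hypothetical helper name. Filters with no values are skipped, and an empty filter object is returned when nothing is left, because MongoDB rejects an empty $or array.

```javascript
// Hypothetical helper: turns { field: [values...] } into a MongoDB $or filter.
function buildOrQuery(filters) {
  const clauses = Object.entries(filters)
    // Skip filters that were not supplied or have no values.
    .filter(([, values]) => Array.isArray(values) && values.length > 0)
    .map(([field, values]) => ({ [field]: { $in: values } }));
  // $or must be a non-empty array, so fall back to "match everything".
  return clauses.length > 0 ? { $or: clauses } : {};
}

const query = buildOrQuery({
  status: ["open", "closed"],
  paymentStatus: ["unpaid"],
  approvalStatus: [], // nothing selected: ignored
});
console.log(JSON.stringify(query));
// The result could then be passed on, e.g. Orders.find(query)
```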
I have a list of about 50 tags in an array, and want to search through my documents to find records that match these tags.
Because they're user-submitted and MongoDB is case-sensitive, I'm using /wildcard/i as a means of searching. I know this is not the fastest way to do a search, but I can't think of a better solution.
I can do my query in two ways. The first is to run a for loop over my tags array, and for each result, perform:
db.collection.find({tags: /<tag[x]>/i})
Or, I can collect all of the tags and run one single lookup using $or, like so:
db.collection.find({$or:[{tags:/<tag1>/i},{tags:/<tag2>/i},{tags:/<tag3>/i}, ... {tags:/<tag50>/i}]});
I have tried both, and found using $or to be significantly faster - but because of the work-in-progress state of my application, it's very difficult to tell whether this is because it's actually faster or whether my app is causing significant overhead in other areas (it is).
So for clarification, in MongoDB is a big query performed once faster than small queries performed many times?
EDIT: Another example would be whether looking up 3 individual records based on _id is faster than doing one lookup using {$or:[{_id: ObjectId([id1])},{_id: ObjectId([id2])},{_id: ObjectId([id3])}]}. Is less more?
I recommend you adjust your schema so it keeps a normalized array of tags. When you insert a new document, do it like this:
tags : [ "business", "Computing", "PayPal" ],
lowercaseTags : [ "business", "computing", "paypal" ]
Similarly when you update the tags, update both arrays.
Create an index on lowercaseTags, and then when you want to query them, use a single query with the $in operator, and the normalized form of the search terms.
For example, to search for business iTunes YouTube, use this query:
db.collection.find( { lowercaseTags : { $in: [ "business", "itunes", "youtube" ] } } )
This answer gives an example of this approach. It should be loads faster than what you have.
An alternate approach you can take is to create a text index and use the text command.
Both of these approaches are geared toward index optimization, and designing your schema to work well with Mongo. The payoff should be a lot higher than whatever difference there is between a single $or query and 50 simpler queries.
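The normalized-array approach can be sketched as follows. The helper names normalizeTags, withLowercaseTags and buildTagQuery are made up for illustration, and the code only builds the document and the query object; it assumes an index on lowercaseTags already exists.

```javascript
// Lowercase, trim and de-duplicate a list of tags.
function normalizeTags(tags) {
  return [...new Set(tags.map((tag) => tag.trim().toLowerCase()))];
}

// Add (or refresh) the normalized copy whenever a document is inserted or updated.
function withLowercaseTags(doc) {
  return { ...doc, lowercaseTags: normalizeTags(doc.tags) };
}

// One query for all search terms, using $in against the normalized field.
function buildTagQuery(searchTags) {
  return { lowercaseTags: { $in: normalizeTags(searchTags) } };
}

const doc = withLowercaseTags({ tags: ["business", "Computing", "PayPal"] });
console.log(doc.lowercaseTags); // [ 'business', 'computing', 'paypal' ]

const query = buildTagQuery(["business", "iTunes", "YouTube"]);
console.log(JSON.stringify(query));
// {"lowercaseTags":{"$in":["business","itunes","youtube"]}}
```

Because $in with plain strings can use the index, this avoids the case-insensitive regular expressions from the question.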
I have some documents which have 2 sets of attributes: tag and lieu. Here is an example of what they look like:
{
title: "doc1",
tag: ["mountain", "sunny", "forest"],
lieu: ["france", "luxembourg"]
},
{
title: "doc2",
tag: ["sunny", "lake"],
lieu: ["france", "germany"]
},
{
title: "doc3",
tag: ["sunny"],
lieu: ["belgium", "luxembourg", "france"]
}
How can I map/reduce and query my DB to be able to retrieve only the intersection of documents that match these criteria:
lieu: ["france", "luxembourg"]
tag: ["sunny"]
Returns: doc1 and doc3
I cannot figure out any format that map/reduce could return that would let me do this with only one query. What I am doing now is this: emit every lieu/tag as a key with the related document's id as the value, then reduce so that every key has an array of document ids. Then my app queries this view, intersects the documents on the app side (taking only the docs that have all 3 keys: luxembourg, france and sunny), and then queries CouchDB again with these ids to retrieve the actual docs. I feel that's not the right/best way to do it?
I am using lists to do the intersection job, and that works quite well. But I still need another request to get the documents by their ids. Any idea what I could do differently to retrieve the documents directly?
Thank you!
This is going to be awkward. The basic idea is that you have to build a view where the map function emits every possible combination of tags and countries as the key, and there's no reduce function. This way, looking for ["france","luxembourg"] would return all documents that emitted that key (and therefore are in the intersection), because views without a reduce function return the emitting document for every entry. This way, you only have to do one request.
This causes a lot of emits to happen, but you can lower that number by sorting the tags both when emitting and when searching (automatically turn ["luxembourg","france"] into ["france","luxembourg"]), and by taking advantage of the ability of CouchDB to query prefixes (this means that emitting ["belgium","france","luxembourg"] will let you match searches for ["belgium"] and ["belgium","france"]).
In your example above, for the countries, you would only emit:
// doc 1
emit(["luxembourg"],null);
emit(["france","luxembourg"],null);
// doc 2
emit(["germany"],null);
emit(["france","germany"],null);
// doc 3
emit(["luxembourg"],null);
emit(["belgium","luxembourg"],null);
emit(["france","luxembourg"],null);
emit(["belgium","france","luxembourg"],null);
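The emitted keys above follow a pattern: after sorting, a combination only needs to be emitted if it ends with the last value, because every other combination is a prefix of one that does. A minimal sketch of that rule is shown below. The function names keysToEmit and prefixRange are hypothetical; inside a real map function you would call emit(key, null) for each key instead of returning them. The startkey/endkey pair uses the usual trick that an object sorts after strings in CouchDB's view collation.

```javascript
// Returns the sorted combinations that are not a prefix of any other combination:
// every subset of the first n-1 values, followed by the last value.
function keysToEmit(values) {
  // Default sort order; any order works as long as emitting and searching agree.
  const sorted = [...new Set(values)].sort();
  if (sorted.length === 0) return [];
  const last = sorted[sorted.length - 1];
  const rest = sorted.slice(0, -1);
  const keys = [];
  // Each bitmask selects one subset of `rest`. Note the count doubles per value.
  for (let mask = 0; mask < (1 << rest.length); mask++) {
    const subset = rest.filter((_, i) => mask & (1 << i));
    keys.push([...subset, last]);
  }
  return keys;
}

// Query range that matches the key itself and every longer key starting with it.
function prefixRange(searchValues) {
  const key = [...new Set(searchValues)].sort();
  return { startkey: key, endkey: [...key, {}] };
}

console.log(JSON.stringify(keysToEmit(["belgium", "luxembourg", "france"])));
// [["luxembourg"],["belgium","luxembourg"],["france","luxembourg"],["belgium","france","luxembourg"]]

console.log(JSON.stringify(prefixRange(["luxembourg", "france"])));
// {"startkey":["france","luxembourg"],"endkey":["france","luxembourg",{}]}
```

The number of keys grows as 2^(n-1) for n values, so this is only practical when each document has a handful of tags or countries.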
Anyway, for complex queries like this one, consider looking into a CouchDB-Lucene combination.