I am trying to make a query that searches for the exact keyword but case insensitively.
It works, but the problem is that it returns any record whose value CONTAINS my search term instead of only exact matches.
mongoTemplate.findOne(Query.query(Criteria.where("resourceID").regex(id, "i")), Resource.class);
I need the equivalent of the following shell query, but in Java:
db.stuff.find( { foo: /^bar$/i } );
Resource resource = mongoTemplateGoVacation.findOne(Query.query(Criteria.where("resourceID").regex("^"+id+"$", "i")), Resource.class);
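One caveat with concatenating "^" + id + "$": if the id contains regex metacharacters such as "." or "+", they are interpreted as pattern syntax. The sketch below shows the idea of escaping the value before anchoring it, using Python's re module purely for illustration. exact_ci_pattern is a hypothetical helper name; in Java, java.util.regex.Pattern.quote plays a similar escaping role.

```python
import re

def exact_ci_pattern(value: str) -> str:
    """Return regex source that matches `value` exactly (hypothetical helper)."""
    # re.escape neutralises metacharacters such as '.', '+' or '$' in the input.
    return "^" + re.escape(value) + "$"

# The case-insensitive flag corresponds to the "i" option in the query above.
pattern = re.compile(exact_ci_pattern("ab.1"), re.IGNORECASE)

matches_exact = bool(pattern.search("AB.1"))       # same text, different case
matches_longer = bool(pattern.search("xab.1y"))    # merely contains the term
matches_wildcard = bool(pattern.search("abX1"))    # '.' must not act as a wildcard
```

Only the first check should succeed: the anchors rule out "contains" matches, and the escaping keeps "." literal.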
I have indexed an object which has a list of strings like ["ZZA-KL-2A", "ZZA-KL-ZZB"]. I want to search and get all items which start with a certain 3-letter code, so I want to check each item in the list with something like 'StartsWith'.
I can see from the documentation that there are methods like Match and MatchContained, but nothing like 'starts with' for a list of strings.
Please note, before flagging, that this question is not about ordinary string comparison in C# or LINQ.
Just use a filter
var searchQuery = client.Search<MyContent>()
.Filter(x => x.OrderNumber.StartsWith("Find"));
https://world.episerver.com/documentation/developer-guides/search-navigation/NET-Client-API/searching/Filtering/
You may use Prefix or PrefixCaseInsensitive:
Matching by beginning of a string (startsWith)
The Prefix method lets you match by the beginning of a string. The
following search matches blog posts titled Find and Find rocks! but
not find, Finding or Hello Find.
var searchQuery = client.Search<BlogPost>().Filter(x =>
x.Title.Prefix("Find"));
Use the PrefixCaseInsensitive method to match by the beginning of a
string in a case-insensitive way. The following search matches blog
posts titled Find, Find rocks! and find but not Finding or Hello Find.
var searchQuery = client.Search<BlogPost>().Filter(x =>
x.Title.PrefixCaseInsensitive("Find"));
Source: https://docs.developers.optimizely.com/digital-experience-platform/v1.1.0-search-and-navigation/docs/strings#matching-by-beginning-of-a-string-startswith
I am writing an app, and one feature I am working on lets users search for other users in the search bar. At the moment it works, but the search is case sensitive. Is there any way to make the search case-insensitive? For example, when I did a similar thing in PHP, I would do something like this:
$searchTerm = strtolower($searchTerm);
I would then compare it to the username converted to lower case too.
Here is the kind of thing I am using right now:
var findUsers:PFQuery = PFUser.query()
if !name.isEmpty{
findUsers.whereKey("username", containsString: name) //name is what the user entered
}
Instead of using containsString, use matchesRegex with a case-insensitive pattern built from the input: "(?i)\(name)"
I am using Whoosh to index over 200,000 books, but I have encountered some problems with it.
The Whoosh query parser returns NullQuery for words like "C#" and "C++" that contain meta-characters, and also for some other short words. These words are used in the title and body of some documents, so I am not using the keyword type for them. I guess the problem is in the analysis or query-parsing phase of searching or indexing, but I can't touch my data blindly. Can anyone help me correct this issue? Thanks.
I fixed the problem by creating a StandardAnalyzer with a regex pattern that meets my requirements. Here is the regex pattern:
'\w+[#+.\w]*'
With this, fields are tokenized successfully, and searching works well too.
But when I use queries like "some query++*" or "some##*", the parsed query is a single Every query, just the '*'. I also found that this is not related to my analyzer and is Whoosh's default behavior. So here is my new question: is this behavior correct, or is it a bug?
Note: removing the WildcardPlugin from the query parser solves this problem, but I also need the WildcardPlugin.
Now I am using the following code:
from whoosh import analysis, fields
from whoosh.util import rcompile
# for matching words like '.NET', 'C++' and 'C#'
word_pattern = rcompile(r'(\.|[\w]+)(\.?\w+|#|\+\+)*')
# I don't need words shorter than two characters, so I keep the default minsize
analyzer = analysis.StandardAnalyzer(expression=word_pattern)
... now in my schema:
...
title = fields.TEXT(analyzer=analyzer),
...
This solves my first problem, yes. But the main problem is in searching. I don't want to let users search using the Every query or *, but when I parse queries like C++* I end up with an Every(*) query. I know that there is some problem but I can't figure out what it is.
I had the same issue and found out that StandardAnalyzer() uses minsize=2 by default. So in your schema, you have to tell it otherwise.
schema = whoosh.fields.Schema(
name = whoosh.fields.TEXT(stored=True, analyzer=whoosh.analysis.StandardAnalyzer(minsize=1)),
# ...
)
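A rough way to see both effects (the token pattern and the minsize=2 default) without Whoosh is to mimic them with Python's re module. This is only a sketch: PLAIN_WORDS is an assumed ordinary word pattern, not necessarily the one Whoosh ships, and tokenize is a hypothetical helper, not a Whoosh API. CUSTOM_WORDS is the pattern from the question.

```python
import re

# Assumption: an ordinary "word characters only" pattern for comparison.
PLAIN_WORDS = re.compile(r"\w+(\.?\w+)*")
# The custom pattern from the question, which also accepts '#', '++' and a leading '.'.
CUSTOM_WORDS = re.compile(r"(\.|[\w]+)(\.?\w+|#|\+\+)*")

def tokenize(pattern, text, minsize=2):
    """Lowercase every match and drop tokens shorter than minsize."""
    tokens = [m.group(0).lower() for m in pattern.finditer(text)]
    return [t for t in tokens if len(t) >= minsize]

sample = "C++ C# .NET"
plain = tokenize(PLAIN_WORDS, sample)               # "C" tokens are too short and vanish
custom = tokenize(CUSTOM_WORDS, sample)             # symbols stay attached to the word
plain_min1 = tokenize(PLAIN_WORDS, sample, minsize=1)
```

With the plain pattern, "C++" and "C#" are reduced to a one-character token that the length filter then discards, which would leave nothing to search for. Either keeping the symbols in the token or lowering minsize avoids the empty result.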
I have tried this myself for a considerable period and looked all over the net, but have been unable to find ANY examples of fuzzy phrase searching via Lucene.NET 2.9.2 (C#).
Is someone able to advise how to do this in detail and/or provide some example code? I would seriously appreciate any help, as I am totally stuck.
I assume that you have Lucene running and created a search index with some fields in it. So let's assume further that:
var fields = ... // a string[] of the field names you wish to search in
var version = Version.LUCENE_29; // your Lucene version
var queryString = "some string to search for";
Once you have all of these you can go ahead and define a search query on multiple fields like this:
var analyzer = LuceneIndexProvider.CreateAnalyzer();
var query = new MultiFieldQueryParser(version, fields, analyzer).Parse(queryString);
Maybe you already got that far and are only missing the fuzzy part. I simply add a tilde ~ to every word in the queryString to tell Lucene to do a fuzzy search for all words in the queryString:
if (fuzzy && !string.IsNullOrEmpty(queryString)) {
// first escape the queryString so that e.g. ~ will be escaped
queryString = QueryParser.Escape(queryString);
// now split, add ~ and join the queryString back together
queryString = string.Join("~ ",
queryString.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)) + "~";
// now queryString will be "some~ string~ to~ search~ for~"
}
The key point here is that Lucene uses fuzzy search only for terms that end with a ~. That and some more helpful info was found on
http://scatteredcode.wordpress.com/2011/05/26/performing-a-fuzzy-search-with-multiple-terms-through-multiple-lucene-net-document-fields/.
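The escape, split and join steps can also be sketched outside of C#. The Python version below is only an illustration: escape_query is a hypothetical stand-in for QueryParser.Escape, and its character list is an assumption about which characters the query syntax treats as special.

```python
import re

# Assumption: an approximation of the characters the query syntax treats as special.
_SPECIAL = re.compile(r'([\\+\-!(){}\[\]^"~*?:|&])')

def escape_query(text: str) -> str:
    """Backslash-escape special characters (hypothetical stand-in for QueryParser.Escape)."""
    return _SPECIAL.sub(r"\\\1", text)

def make_fuzzy(query_string: str) -> str:
    """Escape the input, then append '~' to every whitespace-separated word."""
    words = escape_query(query_string).split()
    return " ".join(word + "~" for word in words)

fuzzy = make_fuzzy("some string  to search for")   # repeated spaces are collapsed
escaped = make_fuzzy("c++ rocks")                  # '+' is escaped before '~' is added
```

Escaping first matters: a "~" or "*" typed by the user is treated as literal text, and only the tildes added afterwards act as fuzzy operators.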
string q = "a";
Query query = new QueryParser("company", new StandardAnalyzer()).Parse(q+"*");
will result in query being a prefixQuery :company:a*
Still I will get results like "Fleet Africa" where it is rather obvious that the A is not at the start and thus gives me undesired results.
Query query = new TermQuery(new Term("company", q+"*"));
will result in query being a termQuery :company:a* and not returning any results. Probably because it interprets the query as an exact match and none of my values are the "a*" literal.
Query query = new WildcardQuery(new Term("company", q+"*"));
will return the same results as the PrefixQuery.
What am I doing wrong?
StandardAnalyzer will tokenize "Fleet Africa" into "fleet" and "africa". Your a* search will match the latter term.
If you want to consider "Fleet Africa" as one single term, use an analyzer that does not break up your string on whitespaces. KeywordAnalyzer is an example, but you may still want to lowercase your data so queries are case insensitive.
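The difference between the two analyzers can be illustrated with a small Python sketch. The functions here are rough stand-ins invented for this example (they are not Lucene APIs): one splits and lowercases like a word-based analyzer, the other keeps the whole value as a single lowercased term.

```python
import re

def standard_terms(value):
    """Rough stand-in for a word-splitting, lowercasing analyzer."""
    return [w.lower() for w in re.findall(r"\w+", value)]

def keyword_terms(value):
    """Rough stand-in for a keyword-style analyzer plus lowercasing: one term per field."""
    return [value.lower()]

def prefix_hits(terms, prefix):
    """A prefix query matches if any indexed term starts with the prefix."""
    return any(t.startswith(prefix.lower()) for t in terms)

company = "Fleet Africa"
tokenized_a = prefix_hits(standard_terms(company), "a")  # matches the term "africa"
keyword_a = prefix_hits(keyword_terms(company), "a")     # whole value starts with "f"
keyword_f = prefix_hits(keyword_terms(company), "f")
```

With word-level terms, a* finds "Fleet Africa" through its second word. With a single term per field, only prefixes of the full value match.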
The short answer: none of your queries constrain the search to the start of the field.
You need an EdgeNGramTokenFilter or something like it.
See this question for an implementation of autocomplete in Lucene.
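As a minimal sketch of the edge n-gram idea, assuming the whole field value is treated as a single token: every leading substring is indexed as its own term, so a lookup becomes an exact term match that can only succeed at the start of the field. The names edge_ngrams, index and lookup are hypothetical, and the company list is made up for illustration; this is not how the Lucene filter is configured.

```python
def edge_ngrams(value, min_len=1, max_len=10):
    """Return the leading substrings of the lowercased value."""
    text = value.lower()
    upper = min(max_len, len(text))
    return [text[:n] for n in range(min_len, upper + 1)]

# Toy inverted index: n-gram -> set of document ids.
index = {}
for doc_id, company in enumerate(["Fleet Africa", "Africa Tours"]):
    for gram in edge_ngrams(company):
        index.setdefault(gram, set()).add(doc_id)

def lookup(prefix):
    """Exact term lookup against the indexed n-grams."""
    return index.get(prefix.lower(), set())

hits_a = lookup("a")     # only the value that starts with "a"
hits_fl = lookup("Fl")   # only the value that starts with "fl"
```

Here "a" finds "Africa Tours" but not "Fleet Africa", because no leading substring of "fleet africa" equals "a".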
Another solution could be to use a StringField to store the data, e.g. "Fleet Africa".
Then use a WildcardQuery. Now f* or F* would give results but A* or a* won't.
StringField is indexed but not tokenized.