How can I perform an SQL LIKE search using Hibernate Search? - hibernate-search

I want to perform a LIKE search (e.g. all words containing 'abc', i.e. %abc%), but using the Hibernate Search API.
Is there a way to do it with the existing analyzers?
If so, which performs better in this case: SQL or Hibernate Search?

Maybe have a look at this:
http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/search/RegexpQuery.html?is-external=true
But note this:
"Note this query can be slow, as it needs to iterate over many terms. In order to prevent extremely slow RegexpQueries, a Regexp term should not start with the expression .*"
RegexpQuery is part of Lucene, so it should be available in Hibernate Search.
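To see why a pattern with a leading wildcard is slow, here is a minimal sketch of a sorted term dictionary. It is a toy model built on `TreeSet`, not Lucene's actual index structure, and the class and method names are made up for illustration. A prefix search can jump straight to the first candidate and stop early, while a "contains" search has to visit every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model of a sorted term dictionary; hypothetical, not Lucene's real structure.
final class TermDictionarySketch {
    private final NavigableSet<String> terms = new TreeSet<>();

    void add(String term) {
        terms.add(term);
    }

    // Prefix search (abc%): seek to the first term >= prefix and stop at the
    // first term that no longer starts with it.
    List<String> startingWith(String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            hits.add(term);
        }
        return hits;
    }

    // Infix search (%abc%): the sort order does not help, so every term is visited.
    List<String> containing(String infix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.contains(infix)) {
                hits.add(term);
            }
        }
        return hits;
    }
}
```

With millions of distinct terms, the second method is the expensive one, which is the situation the RegexpQuery note warns about.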

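Correct, Hibernate Search is much more efficient for this than a SQL LIKE criterion.
The StandardAnalyzer (org.apache.lucene.analysis.standard.StandardAnalyzer) is a good fit; other analyzers do more advanced text splitting. Note that StandardAnalyzer splits text into whole words, so a true substring match such as %abc% generally also needs an n-gram based analyzer or a wildcard/regexp query.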

Related

Are MongoDB "starts with" or "contains" queries as fast as a FindByID query?

I want to query using part of an id to get all the matching documents. So I tried "starts with" and "contains", which work fine, but is there any performance issue for a large collection?
The best way to optimize this search:
Add a text index on the fields you want to search. This is really important because internally it tokenizes your strings, so that you can search for part of them.
Use a regex, which is also quick to set up. A case-sensitive regex anchored at the start (e.g. ^abc) can make good use of an index on the field, while an unanchored "contains" regex cannot.
If you are using aggregate, read the official MongoDB documentation on aggregation pipeline optimization, which might help you implement this efficiently: https://docs.mongodb.com/manual/core/aggregation-pipeline-optimization/
Last but not least, if you are not yet fully committed to MongoDB and the project is fresh, look at Elasticsearch, which is based on Lucene. It's extremely powerful for these kinds of searches.

Is There a Way to Search by Phrases and Using Regular Expressions in Hibernate Search?

I know the Hibernate Search 5.5 Reference Guide describes phrase queries in section 5.1.2.4, "Phrase queries", but that kind of phrase query only allows slop factors, not regular expressions.
Is there a way in Hibernate Search to search by phrases and use regular expressions? Thanks.
Section 5.1.2.4 shows a simple example of phrase queries built with the Hibernate Search DSL.
When you use the DSL you get some help, such as automatic type conversion, so it's the suggested way for most simple use cases. However, you can bypass the DSL, create any Lucene Query using the Lucene APIs, and use the Query instance as if it had been built using the DSL.
For regex queries, see org.apache.lucene.search.RegexpQuery.
All Apache Lucene query instances are compatible with Hibernate Search.

Is there a direct comparison between Lucene.Net syntax and Amazon CloudSearch syntax?

I have a large application that has hundreds of lines of complex queries in Lucene.Net, and I want to be able to move to Amazon CloudSearch.
Instead of rewriting all the queries, I was thinking of writing some sort of converter. Before I do, though, I wanted to make sure that there is a direct equivalent for every type of Lucene query, things like inner clauses etc.
Better yet, is there already a library that does it?
I am aware that there is a .NET library for querying CloudSearch, and also the AWS SDK, but I want something that allows easy switching between local Lucene.Net and CloudSearch.
It's way easier than that -- just select CloudSearch's Lucene query parser via the parameter q.parser=lucene with your queries. http://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching.html
lucene—specify search criteria using the Apache Lucene query parser
syntax. If you currently use the Lucene syntax, using the lucene query
parser enables you to migrate your search services to an Amazon
CloudSearch domain without having to completely rewrite your search
queries in the Amazon CloudSearch structured search syntax.
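As a rough illustration of the q.parser=lucene parameter, the sketch below builds a search request URL from an existing Lucene-syntax query string. The endpoint host and the field names in the query are placeholders, the 2013-01-01 path reflects my understanding of the CloudSearch search API version, and the helper class is hypothetical; check all of them against your own domain.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a CloudSearch search URL that uses the Lucene parser.
final class CloudSearchUrlSketch {

    // searchEndpoint is your domain's search endpoint (placeholder in the example below).
    static String luceneSearchUrl(String searchEndpoint, String luceneQuery) {
        return searchEndpoint + "/2013-01-01/search?q=" + encode(luceneQuery)
                + "&q.parser=lucene";
    }

    // Form-encodes the query string so it is safe inside a URL.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `CloudSearchUrlSketch.luceneSearchUrl("https://search-example.invalid", "title:star AND actors:ford")` leaves the Lucene query itself unchanged and only encodes it for transport.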

Does MongoDB support soundex or fuzzy matching?

Does MongoDB support soundex or fuzzy matching? I want to spot dupes of basic contact name and address fields. I'm using the official C# driver. Thanks
MongoDB doesn't support soundex matching, but it has full-text search.
Also,
You can always just store the
soundex-encoded string in a separate
field in mongo and search against
that. Soundex is a really trivial
algorithm and should only take a handful of
lines.
-- from mongodb-user
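To give an idea of how small such an encoder is, here is a sketch of the common American Soundex rules in Java. It is not taken from the quoted post, it handles ASCII letters only, and the class name is arbitrary. The resulting code would be stored in its own field next to the original name and queried with a plain equality match.

```java
import java.util.Locale;

// Sketch of American Soundex; ASCII letters only, other characters are skipped.
final class Soundex {
    // Codes for A..Z; '0' marks letters that are not coded (vowels, H, W, Y).
    private static final String CODES = "01230120022455012623010202";

    static String encode(String name) {
        StringBuilder out = new StringBuilder();
        char lastCode = '0';
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') {
                continue;
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);          // the first letter is kept as is
                lastCode = code;
            } else if (c == 'H' || c == 'W') {
                continue;               // H and W do not separate equal codes
            } else {
                if (code != '0' && code != lastCode) {
                    out.append(code);
                }
                lastCode = code;        // a vowel resets the run of equal codes
            }
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() > 0 && out.length() < 4) {
            out.append('0');            // pad short codes with zeros
        }
        return out.toString();
    }
}
```

Names that sound alike, such as "Robert" and "Rupert", end up with the same code, which is what makes the separate field usable for spotting likely duplicates.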
MongoDB does not support real full-text search or anything like soundex (which is a very poor choice for matching terms; something like a Levenshtein distance calculation is much better).
In addition look at my last comment here:
Full-text search in NoSQL databases
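For reference, the edit-distance calculation mentioned in this answer can be sketched in a few lines of Java. This is the textbook dynamic-programming version, run in application code rather than inside MongoDB; the class name is arbitrary and any threshold for "close enough" is something you would have to tune for your own data.

```java
// Textbook Levenshtein distance using two rolling rows of the DP table.
final class Levenshtein {

    // Returns the minimum number of single-character insertions, deletions
    // and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;                 // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;                 // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1], prev[j]) + 1;
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

Two contact names could then be flagged as possible duplicates when the distance is small relative to their length.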

Advanced searching in MongoDB using MongoMapper, Sunspot/Solr or Sphinx?

I am using MongoDB with MongoMapper to store all my products. Each product belongs to multiple categories that have many levels, i.e. category, subcategory etc.
Each product has many search fields that are embedded documents in product.
All this is working and I now want to add search to the app.
The search system needs text search: multiple, dynamic, faceted search including min/max range search.
I have been looking into the sunspot gem but am having difficulty setting it up on dev, let alone trying to run it in production! I have also looked at Sphinx.
But I am wondering whether using just MongoMapper / MongoDB will be quick enough and the best way, as it's quite a complex search system?
Any help / suggestions / experiences / tutorials and examples on this would be most appreciated.
Thanks a lot,
Rick
I've been involved with a very large Sphinx-powered search and I think it's awful. It is very difficult to configure if you want anything beyond a very simple full-text search. Solr/Lucene, on the other hand, is incredibly flexible and was unbelievably easier to set up and get running.
I am now using Solr in conjunction with MongoDB to power full-text search with all the extra goodies, like facets. Depending on how you configure Solr, you may not even need to hit MongoDB for data. Or you may tell Solr to index fields but not store them, and instead store just the ObjectIds that correspond to the data inside MongoDB.
If your search truly is a complex search system, I very strongly recommend that you do not use MongoDB for search and go with Solr. One big reason is that MongoDB doesn't have a full-text feature; instead, it has regular expression matches. The regex matches work wonderfully but will only use indexes in certain cases.