Is There a Way to Search by Phrases and Using Regular Expressions in Hibernate Search? - hibernate-search

I know the Hibernate Search 5.5 Reference Guide describes phrase queries in section 5.1.2.4, "Phrase queries", but that kind of phrase query only supports a slop factor, not regular expressions.
Is there a way in Hibernate Search to search by phrase and also use regular expressions? Thanks.

Section 5.1.2.4 shows a simple example of a phrase query built with the Hibernate Search DSL.
When you use the DSL you get some help, such as automatic type conversion, so it's the suggested way for most simple use cases. However, you can bypass the DSL, create any Lucene Query using the Lucene APIs, and use that Query instance as if it had been built using the DSL.
For regex queries, see org.apache.lucene.search.RegexpQuery.
All Apache Lucene query instances are compatible with Hibernate Search.

Related

Self join(hierarchical data) in Hibernate-search

I have two fields in an entity:
id
parentId
I want a self-join to fetch the children of a given parent id (hierarchical data).
Something like Oracle hierarchical queries:
At the moment, Hibernate Search does not expose runtime join capabilities.
If your goal is to order results "parents first", I think you may be able to create a getter that creates a string similar to "rootId.grandParentId.parentId.thisId", and index the result of that getter. Then you can sort on that string. That would clearly be a hack, but it may work.
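To make the "path string" idea above concrete, here is a minimal sketch in plain Java, with no Hibernate Search API involved. The `Node` and `HierarchyPaths` names are hypothetical, and the zero-padding of each id is my own assumption (not part of the suggestion above): it keeps the string order consistent with the numeric order of ids. It also assumes non-negative ids of at most 10 digits and a hierarchy without cycles. In a real mapping, the string returned by `pathOf` is what you would index and sort on.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical entity holding the two fields from the question.
final class Node {
    final long id;
    final Long parentId; // null for a root node

    Node(long id, Long parentId) {
        this.id = id;
        this.parentId = parentId;
    }
}

final class HierarchyPaths {

    // Builds "rootId.grandParentId.parentId.thisId" by walking up the parent chain.
    // Each id is zero-padded (an assumption of this sketch) so that comparing
    // the strings character by character agrees with comparing the ids numerically.
    static String pathOf(Node node, Map<Long, Node> byId) {
        Deque<String> parts = new ArrayDeque<>();
        Node current = node;
        while (current != null) {
            parts.addFirst(String.format(Locale.ROOT, "%010d", current.id));
            current = current.parentId == null ? null : byId.get(current.parentId);
        }
        return String.join(".", parts);
    }

    // Sorts by the path string: a parent's path is a prefix of every
    // descendant's path, so the parent always comes first.
    static List<Node> parentsFirst(List<Node> nodes) {
        Map<Long, Node> byId = new HashMap<>();
        for (Node n : nodes) {
            byId.put(n.id, n);
        }
        List<Node> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.comparing((Node n) -> pathOf(n, byId)));
        return sorted;
    }
}
```

Note that the path has to be recomputed (and the entity reindexed) whenever a node is moved to another parent, which is part of why this remains a hack.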
Alternatively, you may be able to leverage the native join capabilities of Lucene or Elasticsearch within Hibernate Search, but that will require extensive knowledge of Lucene or Elasticsearch.
With Hibernate Search 5, you may be able to implement it for Lucene, but probably not for Elasticsearch. Unfortunately, documentation of Lucene features is sparse.
With Hibernate Search 6, you may be able to implement it in both cases.
You will need:
native fields (Lucene/ES)
native predicates
obviously a good deal of knowledge of advanced Lucene/Elasticsearch features. For Lucene, documentation is sparse. For Elasticsearch, here is a good place to start: https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html

NLP integration with mongodb

Has anyone integrated NLP with a MongoDB database?
There are currently multiple APIs available to identify entities in natural language.
The basic requirement is to generate a query.
Regards,
Jalpesh
My own .NET library AboditNLP can do this. (Excuse the plug but there are no others that can do this as far as I know)
It converts an English query into an expression tree and can then rewrite that expression tree into a SQL, LINQ-to-objects or MongoDB query.
e.g.

Is there a direct comparison between Lucene.Net syntax and Amazon Cloud Search syntax

I have a large application that has hundreds of lines of complex queries in lucene.net, and I want to be able to move to Amazon Cloud Search.
Instead of re-writing all the queries, I was thinking of writing some sort of converter. Before I do though, I thought I would make sure that there is a direct comparison for every type of Lucene Query? Things like inner clauses etc.
Better yet, is there already a library that does it?
I am aware that there is a .NET library for querying CloudSearch, and also the AWS SDK, but I want something that allows easy switching between local Lucene.Net and ACS.
It's way easier than that -- just select CloudSearch's Lucene query parser via the parameter q.parser=lucene with your queries. http://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching.html
lucene—specify search criteria using the Apache Lucene query parser
syntax. If you currently use the Lucene syntax, using the lucene query
parser enables you to migrate your search services to an Amazon
CloudSearch domain without having to completely rewrite your search
queries in the Amazon CloudSearch structured search syntax.
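As a rough illustration of what such a request looks like, the sketch below builds a search URL that passes an existing Lucene-syntax query string together with `q.parser=lucene`. The `CloudSearchUrls` class and the endpoint used in the usage note are hypothetical placeholders; the `/2013-01-01/search` path is the CloudSearch search API path as I understand it, so check it against the linked developer guide for your domain.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a CloudSearch search URL for a Lucene-syntax query.
final class CloudSearchUrls {

    // searchEndpoint is your domain's search endpoint (placeholder in this sketch),
    // luceneQuery is the query string you already use with the Lucene query parser.
    static String luceneSearchUrl(String searchEndpoint, String luceneQuery) {
        // The query must be URL-encoded; spaces become '+', ':' becomes %3A, etc.
        String encoded = URLEncoder.encode(luceneQuery, StandardCharsets.UTF_8);
        return searchEndpoint + "/2013-01-01/search?q=" + encoded + "&q.parser=lucene";
    }
}
```

For example, `CloudSearchUrls.luceneSearchUrl("https://search-example.us-east-1.cloudsearch.amazonaws.com", "title:star AND year:[1977 TO 1983]")` yields a URL whose `q` parameter carries the original Lucene query unchanged apart from the encoding.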

How can i perform an sql LIKE search using Hibernate Search?

I want to perform a LIKE search (e.g. all words containing 'abc', i.e. %abc%), but using the Hibernate Search API.
Is there a way to do it using the existing analyzers?
If so, which is better in terms of performance for this case: SQL or Hibernate Search?
Maybe have a look at this:
http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/search/RegexpQuery.html?is-external=true
But note this:
"Note this query can be slow, as it needs to iterate over many terms. In order to prevent extremely slow RegexpQueries, a Regexp term should not start with the expression .*"
This should be included in Hibernate-Search
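To see why the quoted warning about a leading `.*` matters, here is a simplified model in plain Java. It is not Lucene code: a sorted `TreeSet` stands in for the index's sorted term dictionary, and the `TermScanSketch` class is hypothetical. The point is only that a pattern with a fixed prefix can jump to a small range of terms, while a `%abc%`-style pattern has no prefix to seek on and has to test every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Simplified model of a sorted term dictionary (not the Lucene implementation).
final class TermScanSketch {

    // "abc%": seek directly to the range of terms that start with the prefix.
    // Only the terms inside that range are visited.
    static List<String> prefixMatches(NavigableSet<String> terms, String prefix) {
        String upperBound = prefix + Character.MAX_VALUE;
        return new ArrayList<>(terms.subSet(prefix, true, upperBound, false));
    }

    // "%abc%": nothing to seek on, so every term in the dictionary is tested.
    static List<String> infixMatches(NavigableSet<String> terms, String infix) {
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            if (term.contains(infix)) {
                matches.add(term);
            }
        }
        return matches;
    }
}
```

With a dictionary of millions of distinct terms, the second method's cost grows with the whole dictionary, which is the behavior the Lucene note warns about.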
Correct, Hibernate Search is much more efficient for this than using a SQL LIKE criteria.
The StandardAnalyzer (org.apache.lucene.analysis.standard.StandardAnalyzer) is a good fit; other analyzers will do more advanced text splitting.

Scala integration with Mongodb

We're using MongoDB, and rewriting parts of our stack in Scala. I'm wondering if I should stick with Morphia, or use a Scala MongoDB library such as Subset.
The question is: what do I get out of Subset? E.g. with Morphia I don't have to manually define the MongoDB field names like I have to do in Subset...
Is subset really worth using?
We use casbah + salat and it works well in almost all cases.
With Scala you should consider using Casbah, which is an officially supported interface for MongoDB that builds on the Java driver.
Casbah's approach is intended to add fluid, Scala-friendly syntax on top of MongoDB and handle conversions of common types. If you try to save a Scala List or Seq to MongoDB, we automatically convert it to a type the Java driver can serialize. If you read a Java type, we convert it to a comparable Scala type before it hits your code. All of this is intended to let you focus on writing the best possible Scala code using Scala idioms. A great deal of effort is put into providing you the functional and implicit conversion tools you’ve come to expect from Scala, with the power and flexibility of MongoDB.
Casbah provides improved interfaces to GridFS, Map/Reduce and the core Mongo APIs. It also provides a fluid query syntax which emulates an internal DSL and allows you to write code which looks like what you might write in the JS Shell. There is also support for easily adding new serialization/deserialization mechanisms for common data types.
In addition to the ORM mapper/client libraries, I would suggest you give Rogue a try. It provides a nice query DSL for Mongo. Rogue 1.x only supports Lift-MongoDB, but version 2.x (which will ship in the very near future) will work with a lot more MongoDB libraries.
A sample query would be (pure Scala code with compile-time type checking):
Venue where (_.mayor eqs 1234) and (_.categories contains "Thai") fetch(10)
which queries for 10 entries in the Venue collection where 1234 is the mayor and Thai is one of its categories.
I am the author of Subset. I would say Subset is not really an ORM library. It has no methods for working with databases and collections, leaving that to the Java/Scala drivers. Instead, it is focused on transformations of MongoDB documents. This transformation core is rather generic and suitable not only for reading/writing fields, but also for applications that need to perform e.g. document migrations. The query/update builders that Subset provides are built on top of this "core".
That said, if you need an ORM, there are indeed simpler alternatives. I never intended Subset to compete with true ORM libraries; I just filled a gap I ran into in my own projects.