According to this post on how to do query auto-completion/suggestions in Lucene, "Did You Mean" functionality is best implemented with a LuceneDictionary. Before reading it, I would probably have used a fuzzy query for this. Now I'm wondering which is faster, and which is easier to implement?
Have you looked at the NGram wrappers for Lucene? They are the best way to do "did you mean" functionality in Lucene. I found this page for the docs.
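To make the n-gram idea concrete, here is a minimal sketch in plain Python. It is not Lucene's implementation: it only shows the general technique of comparing character trigrams between the query and a word list. The function names, the Jaccard scoring, and the 0.3 threshold are all assumptions chosen for illustration.

```python
def ngrams(word, n=3):
    """Return the set of character n-grams of a word.

    The word is wrapped in ^ and $ so that its first and last
    letters also contribute distinctive n-grams.
    """
    padded = f"^{word.lower()}$"
    count = max(len(padded) - n + 1, 1)
    return {padded[i:i + n] for i in range(count)}


def similarity(a, b):
    """Jaccard similarity between the n-gram sets of two words."""
    grams_a, grams_b = ngrams(a), ngrams(b)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def did_you_mean(query, dictionary, threshold=0.3):
    """Return the closest dictionary word, or None.

    None is returned when the query is already a known word or
    when no candidate is similar enough (hypothetical threshold).
    """
    if not dictionary or query.lower() in (w.lower() for w in dictionary):
        return None
    best = max(dictionary, key=lambda w: similarity(query, w))
    return best if similarity(query, best) >= threshold else None


words = ["lucene", "solr", "mongodb", "suggest"]
print(did_you_mean("lucine", words))
```

A fuzzy query does roughly the same job with edit distance instead of shared n-grams, so the trade-off between the two is mostly about index size and lookup speed.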
Related
I am new to MongoDB, but have gone pretty far with studying some of the fundamentals. So far, I have been using the mongo shell to write my queries. I was wondering if anyone here can introduce me to a simple and straightforward user-friendly interface I can use to type out my queries.
I have examined Studio3T and Robo3T. Neither really meets my requirements, as they are too heavy and full of menus I don't need at the moment. I am tempted to go on a wild search and download spree, but I figured asking for help here might be a better approach.
I need it to be the following:
-Simple and lightweight
-Completes curly and square brackets
-May or may not offer error checks while writing queries
-May or may not offer auto-complete of keywords.
I'd really appreciate any pointers. Thanks
My project needs some natural language processing. I'm completely new to the field.
What I'm trying to achieve is this: when the user enters a product description, I look in my database for the nearest description and suggest its category, product group and sub-group (the product's tree).
For this, I have extracted 250 product titles for each subgroup.
What is the specific term in NLP for doing this? I tried googling for a while, but had no luck since I don't know the term. Are there any good tutorials to start with? Are there any good libraries for doing this specific task?
Thank you.
From what I can tell, autocomplete or text prediction/predictive search isn't really a big research area in NLP. It wasn't even covered in any of my graduate-level classes, and I do research in this area. I think the reason is that existing solutions are good enough for the vast majority of real-world problems.
I'm not sure which language you work in, but the library you want is probably Lucene if you are dealing with Java, perhaps with a Solr instance on top if this is a general problem for you and you are dealing with a large number of ontologies.
You can find some reasonable tutorials/examples here on Stack Overflow, such as:
How to implements auto suggest using Lucene's new AnalyzingInfixSuggester API?
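For a feel of what an infix-style suggester does, here is a minimal sketch in plain Python. It is not Lucene's AnalyzingInfixSuggester and uses no index or ranking; it only illustrates the idea of matching each typed term against the start of any word in a stored description. The `suggest` function and the sample product list are hypothetical.

```python
def suggest(query, entries, limit=5):
    """Return entries where every query term is a prefix of some word.

    Matching is case-insensitive and keeps the original entry order;
    a real suggester would also rank results by weight.
    """
    terms = query.lower().split()
    if not terms:
        return []
    results = []
    for entry in entries:
        tokens = entry.lower().split()
        if all(any(tok.startswith(t) for tok in tokens) for t in terms):
            results.append(entry)
    return results[:limit]


# Hypothetical product descriptions, for illustration only.
products = [
    "Cordless power drill",
    "Drill bit set",
    "Garden hose",
    "Power bank",
]
print(suggest("dri", products))
print(suggest("pow dr", products))
```

A linear scan like this is fine for a few thousand descriptions; beyond that, an indexed solution such as Lucene or Solr is the more realistic choice.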
I come from an SQL background, where grasping the possible relationships between different models and schemas seems to be quite straightforward to me.
How can I shift the same thing to the MEAN world? For example, let's just assume I have a basic blog engine with a posts table and a comments table, where a post has many comments and each comment belongs to a post. While coding this is easy in, say, Rails, I'm getting stuck here and couldn't find good tutorials.
Also, I'm not sure if adding authors to the party is any more complicated - let's just say posts and comments each have an author, and the author has many comments and also has many posts (once I get this I think highlighting "OP" comments is just the matter of a query).
Can you give me a guideline regarding the differences between what I've been used to in Rails and the approach I need now?
You are used to thinking in terms of normalization. NoSQL databases let you design your data model as structured documents, meaning you can denormalize your data. This has advantages like data locality and atomicity, but can suffer from redundancy and inconsistency.
An example would be embedding the comments inside each post. Thus, you don't have several collections / tables, and can access your data swiftly.
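As a rough illustration of that embedding, here is a sketch using plain Python dicts as stand-ins for MongoDB documents. All field names (`author_id`, `comments`, `body`) are hypothetical, and keeping authors as ids pointing to a separate collection is an assumption, not something prescribed above.

```python
# One post document with its comments embedded, instead of a separate
# comments table joined by post_id. Field names are illustrative only.
post = {
    "_id": 1,
    "title": "Hello MEAN",
    "author_id": 10,  # assumed reference to a separate authors collection
    "comments": [
        {"author_id": 11, "body": "Nice post"},
        {"author_id": 10, "body": "Thanks!"},
    ],
}


def op_comments(post):
    """Return the comments written by the post's own author (the "OP")."""
    return [c for c in post["comments"] if c["author_id"] == post["author_id"]]


print(op_comments(post))
```

Reading a post together with its comments is then a single document fetch. The cost is that anything copied into the comments, such as an author's display name, has to be updated in every place it was embedded.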
I advise you to read the book MongoDB Applied Design Patterns to better understand the benefits you would gain.
In my application I want to know which field of the Document has been matched during search.
So I opted for searcher.explain(). However, I later found out that this is a costly approach.
As an alternative, I looked through the contrib DLLs and saw that the FastVectorHighlighter class has an IsFieldMatch() API. But there is no documentation stating whether it can be used without a performance penalty.
So kindly let me know which is better: searcher.explain() or FastVectorHighlighter.IsFieldMatch()?
It would also be great if you could suggest any alternative approach.
Anyone got any good hints for working with an RTree in Perl? Either a pure RTree implementation which is performant or something I could hijack from a GIS project? Or would it be easier to use something like SQLite's spatial index support?
Cheers
Did you try Tree::R?
There doesn't seem to be much activity on that module, so it may not be good enough, but then again, it might be just what you're looking for. Just play with it for a bit.