analyzed for search and not_analyzed for visualizations in Kibana charts

I have a problem with analyzed versus not_analyzed string fields: I need "analyzed" data for searches, and "not_analyzed" data when I build a top-N report, for example with a "vertical line chart".
Is it possible to use a filter or an exclude/include pattern to keep the full, unanalyzed string in visualizations?
I'm using Kibana version 4.1.1.
Thanks a lot!

Elasticsearch distinguishes between an index analyzer, applied when documents are indexed, and a search analyzer, applied to the query text. You can configure a different analyzer for each.
This link might point you to a solution: https://www.elastic.co/guide/en/elasticsearch/guide/current/_controlling_analysis.html
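A related approach, different from the index/search analyzer split described above, is to index the same string twice with a multi-field mapping: once analyzed for searching and once not_analyzed for aggregations. The sketch below only builds the mapping body as a Python dict. It assumes a pre-5.x Elasticsearch (where string fields accept `index: not_analyzed`), and the type name `product` and field name `title` are hypothetical placeholders.

```python
import json

# Hypothetical names: the doc type "product" and the field "title" are placeholders.
mapping = {
    "mappings": {
        "product": {
            "properties": {
                "title": {
                    "type": "string",  # analyzed by default, used for full-text search
                    "fields": {
                        "raw": {
                            "type": "string",
                            "index": "not_analyzed",  # whole string kept as a single term
                        }
                    },
                }
            }
        }
    }
}

# Print the JSON body that would be sent when creating the index.
print(json.dumps(mapping, indent=2))
```

With a mapping like this, searches would go against `title`, while a top-N visualization would aggregate on `title.raw`.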

Related

How can I filter WordNet synonym results / remove negative synonyms from the results?

Suppose I am building an app that returns synonyms via WordNet, based on Lucene.Net.
I have built a custom Synonym Analyzer and a WordNet Synonym Engine, and they work.
I am currently using a synonym dictionary downloaded from http://wordnetcode.princeton.edu/3.0/WNprolog-3.0.tar.gz,
but unfortunately the results are sometimes not what I expect: some synonyms have negative connotations and might
cause a horrible experience (see 'black', for example). So here is the problem I need to solve:
How can I filter out or remove the negative synonyms from the results?
Thanks in advance. Any help is greatly appreciated.

Using regex in Adobe CQ Query Builder

Is there any way to use regular expressions in Query Builder?
Does JCR support this?
Any pointers on this would be helpful.
Thanks in advance.
San
If this QueryBuilder API documentation is to be believed as definitive, then no, I would not say there is regex support. However, there does seem to be some wildcard support that may be useful. What I would do in this case is craft a query around all the properties you know about your nodes that can identify them. For example, using the debug tool at http://x.x.x.x:4502/libs/cq/search/content/querydebug.html, a query like the following may give you some ideas:
type=cq:Page
path=/content/myapp
nodename=*s
1_relativedaterange.property=jcr:content/cq:lastModified
1_relativedaterange.lowerBound=-48h
Here I'm looking for pages in my app content whose names end in 's' and that have been modified in the last 48 hours. You can even filter by resourceType, template, and any other property that can help you find those nodes. You may even consider adding your own property just for this query.
Alternatively, you could have a Sling job in which Java code iterates over the node names (or whatever you need to match), where you do have regex available, and tags the matching nodes with a meaningful property that you can then query with the Query Builder.
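The post-filtering idea can be sketched as follows. This is Python standing in for the Java job described above, not AEM API code: the function name and the sample paths are hypothetical, and the list simply represents hits already returned by a wildcard query such as `nodename=*s`.

```python
import re

def filter_by_regex(node_paths, pattern):
    """Keep the paths whose last segment (the node name) fully matches the pattern."""
    compiled = re.compile(pattern)
    return [
        path for path in node_paths
        if compiled.fullmatch(path.rsplit("/", 1)[-1])
    ]

# Hypothetical hits, standing in for the result of the wildcard query.
hits = [
    "/content/myapp/products",
    "/content/myapp/news",
    "/content/myapp/contacts",
    "/content/myapp/faqs",
]

# Narrow the wildcard hits with a regex the query itself could not express.
matches = filter_by_regex(hits, r"[a-z]+ts")
print(matches)
```

The wildcard query keeps the candidate set small, and the regex is applied only to that set; the matching nodes are the ones that would then be tagged with a property.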

Lucene.Net/SpellChecker - multi-word/phrase based auto-suggest

I've implemented Lucene.NET on my site, using it to index my products, which are theatre shows, tours and attractions around London.
I want to implement a "Did you mean?" feature for when users misspell product names that takes the whole product titles into account and not just single words. For example,
If the user typed:
Lodnon Eye
I would like to auto-suggest:
London
London Eye
I assume I need to have the analyzer index the titles as if they were a single entity, so that SpellChecker can nearest-match on the phrase as well as on the individual words.
How would I do this?
There is an excellent blog series here:
Lucene.NET
Introduction to Lucene
Indexing basics
Search basics
Did you mean..
Faceted Search
Class Reference
I have also found another project called SimpleLucene which you can use to maintain your lucene indexes whenever you need to update or delete a document. Read about it here
I've just recently implemented a phrase auto-suggest system in Lucene.NET.
Basically, the Java version of Lucene has a ShingleFilter in one of the contrib folders, which breaks a sentence down into all possible phrase combinations. Unfortunately, Lucene.NET's contrib filters aren't quite there yet, so we don't have a shingle filter.
But a Lucene index written in Java can be read by Lucene.NET as long as the versions are the same. So what I did was the following:
I created a spell index in Lucene.NET using the SpellChecker.IndexDictionary method, as laid out in the "Did you mean" section of Jake Scott's link. Please note that this only creates a spelling index of single words, not phrases.
I then created a Java app that uses the shingle filter to create phrases from the text I'm searching and saves them in a temporary index.
I then wrote another method in .NET to open this temporary index and add each of the phrases as a line or document to my spelling index, which already contains the single words. The trick is to make sure the documents you're adding have the same form as the rest of the spell documents, so I ripped out the methods used in the SpellChecker code in the Lucene.NET project and edited those.
Once you've done that, you can call the SpellChecker.SuggestSimilar method, pass it a misspelled phrase, and it will return a valid suggestion.
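To make the shingle step concrete, here is a minimal sketch of what "all possible phrase combinations" means for a title. It is plain Python written for illustration and does not try to mirror the options or output order of Lucene's own shingle filter; the function name and the size limits are assumptions.

```python
def shingles(text, min_size=2, max_size=3):
    """Return every run of min_size..max_size consecutive words as a phrase."""
    words = text.lower().split()
    phrases = []
    for size in range(min_size, max_size + 1):
        # Slide a window of `size` words across the title.
        for start in range(len(words) - size + 1):
            phrases.append(" ".join(words[start:start + size]))
    return phrases

phrases = shingles("Tower of London Tour")
print(phrases)
```

Each phrase produced this way would become one entry in the spelling index, next to the single words that are already there.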
This is probably not the best solution, and I definitely would use the answer suggested by spaceman, but here is another possible solution: use the KeywordAnalyzer or the KeywordTokenizer on each title. This will not break the title down into separate tokens but will keep it as one token, so the SuggestSimilar method would return whole titles as suggestions.
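The effect of treating each title as one token can be illustrated outside Lucene. The sketch below uses Python's standard `difflib` as a stand-in for the spell checker's similarity matching (it is not the same algorithm), and the title list is hypothetical.

```python
import difflib

def suggest(query, titles, limit=3, cutoff=0.6):
    """Suggest whole titles that are close to the (possibly misspelled) query."""
    # Each title is kept as a single lowercase token, never split into words.
    lookup = {title.lower(): title for title in titles}
    close = difflib.get_close_matches(query.lower(), list(lookup), n=limit, cutoff=cutoff)
    return [lookup[token] for token in close]

# Hypothetical product titles.
titles = ["London Eye", "London Dungeon", "Tower of London"]
best = suggest("Lodnon Eye", titles, limit=1)
print(best)
```

Because the comparison runs over the whole string, the misspelled phrase is matched against complete titles. Suggesting the single word "London" as well would still require the individual words to be indexed alongside the titles.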

Lucene.NET faceted search

I found a great tutorial on performing a faceted search.
http://www.devatwork.nl/articles/lucenenet/faceted-search-and-drill-down-lucenenet/
This article does not explain how to retrieve the narrowed available attributes to filter from (for further drill down).
Let's say I am looking for planners that are red. When I perform the faceted search, I want to return all available attributes to filter from that are red. Then when I add a "weekly format" filter, I want the attribute list to get even smaller, containing only filters available for the segmented group.
I would love to use Solr/SolrNET, but I am in a shared hosting situation with limited access to the actual server.
I am fairly new to lucene.net, so examples are much appreciated.
IIUC, you get a BitArray containing the list of the filtered results. In the tutorial's example, you will have combinedResults as this list. If you want to further narrow this down, you need to reiterate the process: run another searchQuery and intersect the results with the BitArray you have for combinedResults.
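A minimal sketch of that intersect-and-recount loop, using Python integers as bit sets in place of BitArray. The facet names, document numbering and bit patterns are invented for illustration; in the tutorial's setup each bit set would come from running a query against the index.

```python
def facet_counts(selected_bits, facet_bitsets):
    """Count the selected documents under each facet value, dropping empty ones."""
    counts = {}
    for name, bits in facet_bitsets.items():
        hits = bin(selected_bits & bits).count("1")  # size of the intersection
        if hits:
            counts[name] = hits
    return counts

# Hypothetical index of 6 documents; bit i is set when document i has the attribute.
facets = {
    "color:red": 0b001111,      # docs 0, 1, 2, 3
    "color:blue": 0b110000,     # docs 4, 5
    "format:weekly": 0b010101,  # docs 0, 2, 4
    "format:daily": 0b101010,   # docs 1, 3, 5
    "size:pocket": 0b000011,    # docs 0, 1
}

red = facets["color:red"]
after_red = facet_counts(red, facets)

# Drill down: intersect the current selection with the next filter and recount.
red_weekly = red & facets["format:weekly"]
after_red_weekly = facet_counts(red_weekly, facets)
print(after_red)
print(after_red_weekly)
```

Facet values whose intersection with the current selection is empty disappear from the list, which is how the available filters shrink at each drill-down step.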
"I would love to use Solr/SolrNET, but I am in a shared hosting situation with limited access to the actual server."
You can always use an off-site, hosted Solr solution. See this question for more information.

Lucene.NET - Search phrase containing "and"

Looking for advice on handling ampersands and the word "and" in Lucene queries. My test queries are (including quotes):
"oil and gas field" (complete phrase)
"research and development" (complete phrase)
"r&d" (complete phrase)
Ideally, I'd like to use the QueryParser as the input is coming from the user.
During testing and doc reading, I found that using the StandardAnalyzer doesn't work for what I want. For the first two queries, a QueryParser.Parse converts them to:
contents:"oil gas field"
contents:"research development"
Which isn't what I want. If I use a PhraseQuery instead, I get no results (presumably because "and" isn't indexed).
If I use a SimpleAnalyzer, then I can find the phrases but QueryParser.Parse converts the last term to:
contents:"r d"
Which again, isn't quite what I'm looking for.
Any advice?
If you want to search for "and", you have to index it. Write your own Analyzer or remove "and" from the list of stop words. The same applies to "r&d": write your own Analyzer that creates 3 words from the text: "r", "d", "r&d".
Step one of working with Lucene is to accept that pretty much all of the work is done at the time of indexing. If you want to search for something then you index it. If you want to ignore something then you don't index it. It is this that allows Lucene to provide such high speed searching.
The upshot of this is that for an index to work effectively you have to anticipate up front what your analyzer needs to do. In this case I would write my own analyzer that doesn't strip any stop words and also transforms & to 'and' (and optionally @ to 'at', etc.). In the case of r&d matching research & development, you are almost certainly going to have to implement some domain-specific logic.
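The analysis behaviour described here can be sketched in a few lines. This is plain Python illustrating the token stream such an analyzer would produce, not a Lucene Analyzer implementation; the function names are hypothetical.

```python
import re

def analyze(text):
    """Lowercase, turn '&' into the word 'and', and keep every word (no stop list)."""
    normalized = text.lower().replace("&", " and ")
    return re.findall(r"[a-z0-9]+", normalized)

def contains_phrase(doc_tokens, phrase_tokens):
    """True when phrase_tokens occur as a consecutive run inside doc_tokens."""
    size = len(phrase_tokens)
    return any(
        doc_tokens[start:start + size] == phrase_tokens
        for start in range(len(doc_tokens) - size + 1)
    )

doc = analyze("Our R&D team surveyed the oil and gas field.")
print(doc)
print(contains_phrase(doc, analyze("oil and gas field")))
print(contains_phrase(doc, analyze("r&d")))
```

Because the same function is applied to documents and queries, "r&d" and "r and d" become the same three tokens, and "and" is available for phrase matching. Mapping "r&d" to "research and development" would still need the domain-specific logic mentioned above.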
There are other ways of dealing with this. If you can differentiate between phrase searches and normal keyword searches then there is no reason you can't maintain two or more indexes to handle different types of search. This gives very quick searching but will require some more maintenance.
Another option is to use the high speed of Lucene to filter your initial results down to something more manageable using an analyzer that doesn't give false negatives. You can then run some detailed filtering over the full text of those documents that it does find to match the correct phrases.
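A sketch of that two-stage approach, under stated assumptions: stage one is simulated here with a stop-word-stripping term match (in practice it would be a Lucene query), stage two checks the exact phrase against the full text, and the stop list and documents are invented for illustration.

```python
import re

# Assumption: a tiny stop list standing in for the analyzer's real one.
STOP_WORDS = {"a", "and", "of", "or", "the"}

def coarse_terms(text):
    """Terms a stop-word-stripping analyzer would keep."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS}

def search_phrase(phrase, documents):
    """Stage 1: cheap term match. Stage 2: exact phrase check on the full text."""
    wanted = coarse_terms(phrase)
    candidates = [doc for doc in documents if wanted <= coarse_terms(doc)]
    needle = " ".join(phrase.lower().split())
    return [doc for doc in candidates if needle in " ".join(doc.lower().split())]

documents = [
    "The oil and gas field was sold.",
    "Gas prices rose; the oil field closed.",
    "Research and development budgets grew.",
]
results = search_phrase("oil and gas field", documents)
print(results)
```

The first stage lets both oil-related documents through, so it produces no false negatives, and the second stage discards the one that does not contain the phrase verbatim.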
Ultimately, I think you are going to find that Lucene sacrifices accuracy in more advanced searches in order to provide speed. It is generally good enough for most people. You are probably in uncharted waters trying to tweak your analyzer this much.