How to get autocomplete and spatial search in one query with Solr?

Current Features:
Autocomplete using the Solr TermsComponent with the terms.prefix parameter.
Bounding box for geolocation searches.
Things Tried:
I have tried to combine both queries into one, but the results are never filtered by geolocation.
Instead I get everything from the terms.prefix search.
I have also tried dismax and edismax together with the bbox geolocation search. I know that dismax won't work because it doesn't have a prefix parameter.
I have searched Google day and night trying to figure this out.
I would hate to have to stem my "names" field so that every letter gets treated as a keyword.
Any help is really appreciated.

Unfortunately you cannot do this with the TermsComponent, because it does not support filtering on fields other than the one you are running the terms component on.
The simplest solution is to use the standard request handler (i.e. <requestHandler name="standard" class="solr.SearchHandler">) with your bounding box filter:
fq={!bbox}&sfield=store&pt=45.15,-93.85&d=5
and a facet on the field that you want to list terms for (assuming your field name is 'names'):
facet=true&facet.field=names&f.names.facet.prefix=$yourprefix$
you will end up with a query like:
/select?q=*:*&fq={!bbox}&sfield=store&pt=45.15,-93.85&d=5&facet=true&facet.field=names&f.names.facet.prefix=$yourprefix$
giving a result like:
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">
<lst name="name">
<int name="maxtor">1</int>
<int name="memory">1</int>
<int name="mobile">1</int>
<int name="mp500">1</int>
<int name="mb">0</int>
<int name="mini">0</int>
</lst>
</lst>
</lst>
(in the facet section)
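A minimal Python sketch of the approach above: it assembles the combined query and reads the suggestions out of the facet section of the XML response. The base URL, the function names, and the extra rows=0 and facet.mincount=1 parameters are assumptions added for illustration. facet.mincount=1 hides terms such as "mb" and "mini" in the sample output, which exist in the index but have no documents inside the bounding box. The field names 'names' and 'store' come from the example query.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_autocomplete_url(base_url, prefix, lat, lon, distance_km,
                           facet_field="names", spatial_field="store"):
    """Build a /select URL that filters by bounding box and facets by prefix."""
    params = [
        ("q", "*:*"),                  # match everything; the fq does the filtering
        ("rows", "0"),                 # assumption: only the facet counts are needed
        ("fq", "{!bbox}"),             # bounding-box filter
        ("sfield", spatial_field),     # spatial field used by the filter
        ("pt", f"{lat},{lon}"),        # centre point as "lat,lon"
        ("d", str(distance_km)),       # distance in kilometres
        ("facet", "true"),
        ("facet.field", facet_field),
        (f"f.{facet_field}.facet.prefix", prefix),
        ("facet.mincount", "1"),       # assumption: drop terms with no matching docs
    ]
    # urlencode percent-encodes "{!bbox}" and "*:*", which Solr decodes again.
    return f"{base_url}/select?{urlencode(params)}"


def suggestions_from_response(xml_text, facet_field="names"):
    """Return (term, count) pairs from the facet_fields section of the response."""
    root = ET.fromstring(xml_text)
    path = f".//lst[@name='facet_fields']/lst[@name='{facet_field}']/int"
    return [(node.get("name"), int(node.text)) for node in root.findall(path)]


# Hypothetical host and a hand-written response fragment, for illustration only.
url = build_autocomplete_url("http://localhost:8983/solr", "m", 45.15, -93.85, 5)
sample = """<response><lst name="facet_counts"><lst name="facet_fields">
<lst name="names"><int name="maxtor">1</int><int name="memory">1</int></lst>
</lst></lst></response>"""
terms = suggestions_from_response(sample)
```

Because the terms come from facets over the filtered result set, their counts reflect only the documents inside the bounding box, which is what the TermsComponent cannot provide.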

Related

EdgeNGramFilterFactory change in solr5

Short version:
Does anyone know if something changed with EdgeNGramFilterFactory in Solr 5? It used to work fine on Solr 4, but I just upgraded to Solr 5 and the cores whose fields use this filter refuse to load ...
Long story:
This configuration used to work in solr4.10 (schema.xml):
<field name="NAME" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="PP" type="text_prefix" indexed="true" stored="false" required="false" multiValued="false"/>
<copyField source="NAME" dest="PP"/>
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
</analyzer>
</fieldType>
And the documentation says I did it right (no clear mention if it is for solr4 or solr5).
However, when I am trying to add a collection using this configuration, it fails with the following message:
<lst name="failure">
<str>
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://localhost:8983/solr: Error CREATEing SolrCore 'test_collection': Unable to create core [test_collection] Caused by: Unknown parameters: {side=front}</str>
</lst>
I removed the side=front "unknown" parameter, started from scratch and it worked - meaning no more errors.
So, while it used to work in Solr 4 without any additional change, it no longer works in Solr 5. Did something change? Did I miss any documentation regarding this filter? Is there an extra library I need to load to make this work?
And finally, if the above is meant to be like this (bug/feature/whatever), is there any workaround to get this "side-substring" indexing functionality without having to generate the values myself when I add documents to Solr?
Update: with the "hacked" schema (i.e. without side=front), I indexed the documents and changed the PP field to be stored. When I searched, it looked like it indexes the entire value. For example, for NAME:ELEPHANT, I found PP:ELEPHANT ...
The side attribute was removed in Lucene 4.4 as part of LUCENE-3907. The filter now always behaves as if side="front" had been given, so since you were using it the "front" way, you can simply remove the attribute.
As you can read in the conversation of the linked Lucene issue:
If you need reverse n-grams, you could always add a filter to do that
afterwards. There is no need to have this as separate logic in this
filter. We should split logic and keep filters as simple as possible.
And this is what has been done: the side attribute has been removed from the filter.
The change was made in Lucene, not directly in Solr. Since Lucene is a Java API, it is mentioned in the Javadoc of the filter:
As of Lucene 4.4, this filter does not support
EdgeNGramTokenFilter.Side.BACK (you can use ReverseStringFilter
up-front and afterward to get the same behavior), handles
supplementary characters correctly and does not update offsets
anymore.
This may be the reason why you do not find a word about it in the Solr documentation. But this change has also been mentioned in Lucene's Change Log.
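To make the behaviour concrete, here is a small plain-Python illustration of what front edge n-grams look like and how the "reverse, n-gram, reverse" trick from the Javadoc yields back n-grams. This only mimics the effect; it is not Lucene's implementation, and the function names are made up for the example. The gram sizes (2 and 15) and the ELEPHANT value are taken from the question.

```python
def edge_ngrams(token, min_gram, max_gram):
    """Front edge n-grams: prefixes of token with length min_gram..max_gram."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def back_edge_ngrams(token, min_gram, max_gram):
    """Emulate the removed side="back": reverse, take front n-grams, reverse again."""
    return [gram[::-1] for gram in edge_ngrams(token[::-1], min_gram, max_gram)]


print(edge_ngrams("ELEPHANT", 2, 15))
# ['EL', 'ELE', 'ELEP', 'ELEPH', 'ELEPHA', 'ELEPHAN', 'ELEPHANT']
print(back_edge_ngrams("ELEPHANT", 2, 15))
# ['NT', 'ANT', 'HANT', 'PHANT', 'EPHANT', 'LEPHANT', 'ELEPHANT']
```

In a Solr schema, the equivalent of the second function would be a reverse filter placed before and after the edge n-gram filter in the index analyzer, as the Javadoc quote suggests.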

Typo3 6.2: Table Records in FCE (Flux)

I am trying to get a list of table records in my FCE.
In the documentation, I found that the "items" attribute can use a query.
But I cannot find a way to make it work.
<flux:field.select name="myRecord" items="NOTHING WORKS HERE" label="Choose" maxItems="1" minItems="1" size="5" multiple="false" />
Does anybody know how the items can be filled with table records?
If you are trying to get a select box with all items, you could switch to this:
<flux:field.relation size="1" minItems="0" table="tx_{YourExtensionName}_domain_model_{YourObjectName}" maxItems="1" name="package">
</flux:field.relation>
Of course, you can use any table from the DB, such as "pages".
Hope it helps!

Using SOLR Autocomplete for multiple terms (i.e. comma-separated locations)

I've got SOLR up and running, indexing data via the DIH, and properly returning results for queries. I'm trying to set up another core to run the suggester, in order to autocomplete geographical locations. We have a web application that needs to take a city, state / region, country input. We'd like to do this in a single entry box. Here are some examples:
Brooklyn, New York, United States of America
Philadelphia, Pennsylvania, United States of America
Barcelona, Catalunya, Spain
Assume for now that every location around the world can be split into this 3-form input. I've set up my DIH to create a TemplateTransformer field that combines the 4 tables (city, state and country are all independent tables connected to each other by a master places table) into a field called "fullplacename":
<field column="fullplacename" template="${city_join.plainname},
${region_join.plainname}, ${country_join.plainname}"/>
I've defined a "text_auto" field in schema.xml:
<fieldType class="solr.TextField" name="text_auto">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
and have defined these two fields as well:
<field name="name_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="true" />
<copyField source="fullplacename" dest="name_autocomplete" />
Now, here's my problem. This works fine for the first term, i.e. if I type "brooklyn" I get the results I'd expect, using this URL to query:
http://localhost:8983/solr/places/suggest?q=brooklyn
However, as soon as I put a comma and/or a space in there, it breaks them up into 2 suggestions, and I get a suggestion for each:
http://localhost:8983/solr/places/suggest?q=brooklyn%2C%20ny
Gives me a suggestion for "brooklyn" and a suggestion for "ny" instead of a suggestion that matches "brooklyn, ny". I've tried every solution I can find via google and haven't had any luck. Is there something simple that I've missed, or is this the wrong approach?
Thanks!
EDIT: Just in case, here's the searchComponent and requestHandler definition:
<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
<searchComponent name="suggest" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">name_autocomplete</str>
</lst>
</searchComponent>
The problem lies in the suggester. Like the spellchecker it tokenizes on whitespace.
http://lucene.472066.n3.nabble.com/suggester-issues-tp3262718p3266140.html has a solution for this problem.
You are using the KeywordTokenizer which will not create separate tokens for "Brooklyn", "NY" and "United States".
Your example queries do not look so much like autocomplete but more like regular searches.
Autocomplete query (IMHO) contains only partial terms:
http://localhost:8983/solr/places/suggest?q=brook
for type ahead lists. You want to use EdgeNGram for that: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory
Most probably in combination with StandardTokenizer and/or WordDelimiterFilterFactory.
For your query example:
http://localhost:8983/solr/places/suggest?q=brooklyn%2C%20ny
StandardTokenizer in combination with LowerCaseFilter and the dismax request handler with a good configuration of the mm parameter (restricting hits to those that contain all input terms) would work well, see: http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29
I feel the accepted answer is a bit too complex. An elegant way of doing it would be to use http://localhost:8983/solr/places/suggest?spellcheck.q=brooklyn in place of http://localhost:8983/solr/places/suggest?q=brooklyn. As mentioned here
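The symptom described in the question can be modelled in a few lines of plain Python. This is a simplified illustration of the difference between looking up the whole input as one token and splitting it on whitespace first; it is not how Solr's Suggester is implemented, and the function names are invented for the example. The place names are the lowercased examples from the question.

```python
PLACES = [
    "brooklyn, new york, united states of america",
    "philadelphia, pennsylvania, united states of america",
    "barcelona, catalunya, spain",
]


def suggest_whole_input(user_input, entries=PLACES):
    """Treat the whole input as a single token and match it as a prefix."""
    needle = user_input.lower()
    return [entry for entry in entries if entry.startswith(needle)]


def suggest_per_token(user_input, entries=PLACES):
    """Split on whitespace first, then look up every piece on its own."""
    return {token: suggest_whole_input(token, entries)
            for token in user_input.split()}


whole = suggest_whole_input("Brooklyn, n")
# one lookup for "brooklyn, n": matches the full Brooklyn entry
split = suggest_per_token("Brooklyn, n")
# two independent lookups, one for "Brooklyn," and one for "n"
```

With the input kept whole, the comma and space are simply part of the prefix, which is the behaviour the single entry box needs.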

NSXMLParser : requesting guidance with making grouped tables from an RSS/XML feed

I'm making a grouped table that is populated from an XML/RSS feed. I've managed to parse the data into the table just fine, but I'm stuck on how to make the table grouped.
That is, I want an events listing, and I want to organise the events in groups, using the month for each group. How would I achieve this?
Below is my XML structure; it's pretty basic:
<EventsUpcoming>
<Event id="1">
<month>July</month>
<title>Ian Moss</title>
<date>Saturday, July 1st</date>
<ticket>$35 On the door</ticket>
<description>
Ian Moss from Cold Chisel fame will be touring Australia and the only venue to secure him in Perth is the Blvd.
</description>
</Event>
<Event id="2">
<month>August</month>
<title>Cold Chisel</title>
<date>Saturday, August 3rd</date>
<ticket>$25 on the door</ticket>
<description>
From Khe San fame, Cold Chisel is back with the legendary Jimmy Barnes, dont miss out this gig. Its gonna go down in the books for sure!
</description>
</Event>
<Event id="3">
<month>September</month>
<title>Australian Crawl</title>
<date>Saturday, September 1st</date>
<ticket>Free</ticket>
<description>
They're one of Australia's most iconic band names, be sure to come down and check them out before they die.
</description>
</Event>
</EventsUpcoming>
If anyone knows any tutorial sites that might be helpful, or just has tips on how to go about doing this, that'd be much appreciated. Thanks in advance :)
NSMutableSet doesn't store duplicate values; it only stores distinct ones. So while parsing, you can use an NSMutableSet to store the 'month' value of each XML element, and set the number of sections in the table view to the count of your NSMutableSet.
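The grouping idea can be sketched independently of Cocoa. The Python example below is only an illustration of the data structure, not NSXMLParser code: it collects the distinct months and keeps the events for each one. Since a set has no defined order, the sketch uses a dictionary, which preserves first-seen order, so the sections appear in feed order. The feed is a shortened copy of the XML from the question, and the function name is made up.

```python
import xml.etree.ElementTree as ET

FEED = """<EventsUpcoming>
  <Event id="1"><month>July</month><title>Ian Moss</title></Event>
  <Event id="2"><month>August</month><title>Cold Chisel</title></Event>
  <Event id="3"><month>September</month><title>Australian Crawl</title></Event>
</EventsUpcoming>"""


def group_by_month(xml_text):
    """Map each distinct month to the titles of its events, in feed order."""
    sections = {}  # dicts keep insertion order in Python 3.7+
    for event in ET.fromstring(xml_text).iter("Event"):
        month = event.findtext("month")
        sections.setdefault(month, []).append(event.findtext("title"))
    return sections


sections = group_by_month(FEED)
month_names = list(sections)

# How this maps onto a grouped table:
#   number of sections       -> len(month_names)
#   title of section i       -> month_names[i]
#   number of rows in i      -> len(sections[month_names[i]])
#   row j in section i       -> sections[month_names[i]][j]
```

The same shape in Objective-C would be an ordered array of month names plus a dictionary from month name to an array of events.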

iPhone: How to handle Core Data relationships

I'm new to Core Data and databases in general. Now I need to parse an XML file and store the contents in Core Data. The XML looks something like this:
<books>
<book id="123A" name="My Book">
<page id="95D" name="Introduction">
<text id="69F" type="header" value="author text"/>
<text id="67F" type="footer" value="author text"/>
</page>
<page id="76F" name="Chapter 1">
<text id="118" type="selection">
<value data="1">value1</value>
<value data="2">value2</value>
<value data="3">value3</value>
</text>
</page>
</book>
<book id="124A"...
From my understanding I would need four Entities, like "Books", "Book", "Pages" and "Text". I wonder how to set the relationships correctly, how to add, for example, a Page object to a Book object, and how to retrieve a Text object attribute's value. The tutorials I have found mostly deal with one Entity, so I didn't really get the idea. Grateful for any help!
No, you'd need three entities. You can think of "Books" as the CoreData database you're using. The CoreData database then includes a number of entities called book.
I think the data model you have is a bit weird, but I guess it makes sense for your application. To map it to CoreData I would:
Add the entities Book, Page, Text
Add a bookId, pageId, textId to them, respectively.
Then add a relation from Page to Book, and from Text to Page.
By then you should be able to print out a whole book by asking for all Pages that have
Book = the book you're interested in
and then order all those Pages by their pageId
and in order, ask for all texts that have
Page = the current page
then order those Texts by their textId.
What might be a problem is that a Text can have multiple Values, as seen in your XML above. You could handle this by adding another entity called Value, but I would probably solve it by adding the attributes "value" and "type" to the Text entity directly. (You could then use "value" as a second sort key when printing out a page.)
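The entity layout described above can be modelled in plain Python to show how the relationships fit together. This is a conceptual sketch only, not Core Data API: in a real project the three classes would be entities in the data model with to-many relationships and their inverses, and the sorting would be done with sort descriptors. All class, attribute and function names here are illustrative, and the sample values come from the XML in the question.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Book:
    book_id: str
    name: str
    pages: List["Page"] = field(default_factory=list)   # to-many: Book -> Page


@dataclass(eq=False)
class Page:
    page_id: str
    name: str
    book: Optional[Book] = field(default=None, repr=False)   # inverse: Page -> Book
    texts: List["Text"] = field(default_factory=list)        # to-many: Page -> Text


@dataclass(eq=False)
class Text:
    text_id: str
    type: str
    value: str
    page: Optional[Page] = field(default=None, repr=False)   # inverse: Text -> Page


def add_page(book, page):
    """Set both sides of the Book <-> Page relationship."""
    page.book = book
    book.pages.append(page)


def add_text(page, text):
    """Set both sides of the Page <-> Text relationship."""
    text.page = page
    page.texts.append(text)


def book_lines(book):
    """Pages ordered by id; texts ordered by id, then by value."""
    lines = []
    for page in sorted(book.pages, key=lambda p: p.page_id):
        lines.append(page.name)
        for text in sorted(page.texts, key=lambda t: (t.text_id, t.value)):
            lines.append(f"  [{text.type}] {text.value}")
    return lines


book = Book("123A", "My Book")
intro, chapter = Page("95D", "Introduction"), Page("76F", "Chapter 1")
add_page(book, intro)
add_page(book, chapter)
add_text(intro, Text("69F", "header", "author text"))
add_text(intro, Text("67F", "footer", "author text"))
for value in ("value3", "value1", "value2"):
    add_text(chapter, Text("118", "selection", value))   # one Text row per value
```

Note that sorting by the string ids puts "76F" before "95D", so if reading order matters a separate numeric position attribute would be a safer sort key.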
Check out these links:
http://developer.apple.com/iphone/library/documentation/Cocoa/Conceptual/CoreData/
http://developer.apple.com/cocoa/coredatatutorial/index.html (for regular Cocoa, but the same principles hold so this should help)