Sphinx: JSON meta attributes stopped working

I'm currently experimenting with a Sphinx realtime index. I inserted 4.5 million documents.
Everything was working OK while my json meta attributes were like this:
{"result_type":"publications","publication_type":"essay"}
But yesterday I wanted to add another value under the 'publication_type' key, and the JSON
became:
{"result_type":"publications","publication_type":["essay","big_text"]}
Now I can't find the document for either 'essay' or 'big_text'.
The SphinxQL query I'm using looks like this:
select * from url where meta.publication_type='essay';
The Sphinx server version is 2.1.1-beta (rel21-r3701), running on Debian.
I hope you can help me. Is my JSON string wrong? Where is my mistake?
Thanks in advance.

SELECT *, ANY(x='essay' FOR x IN meta.publication_type) as p FROM url WHERE p=1;
Supported in 2.2.1-dev since r4217.

This was answered on the sphinx forum:
http://sphinxsearch.com/forum/view.html?id=11486
When you store arrays, you access the values by index.
So you could do
select * from url where meta.publication_type[0]='essay';
It doesn't appear to be easy to search 'in any position', so if 'essay' is ever not at the first index, this won't work.
Note: I can't claim credit for figuring this out; I'm just passing the information on.
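To see why the original filter stopped matching, here is a small plain-Python model of the three comparisons discussed above. It is only an illustration of the semantics, not Sphinx code; the function names are made up for this sketch, and the two documents are the JSON strings from the question.

```python
import json

# The two JSON shapes from the question.
old_doc = json.loads('{"result_type":"publications","publication_type":"essay"}')
new_doc = json.loads('{"result_type":"publications","publication_type":["essay","big_text"]}')


def scalar_equals(meta, wanted):
    """Models meta.publication_type='essay': compares the whole value."""
    return meta.get("publication_type") == wanted


def first_element_equals(meta, wanted):
    """Models meta.publication_type[0]='essay': only looks at index 0."""
    value = meta.get("publication_type")
    return isinstance(value, list) and len(value) > 0 and value[0] == wanted


def any_element_equals(meta, wanted):
    """Models ANY(x='essay' FOR x IN meta.publication_type)."""
    value = meta.get("publication_type")
    return isinstance(value, list) and any(item == wanted for item in value)


print(scalar_equals(old_doc, "essay"))            # scalar value: matches
print(scalar_equals(new_doc, "essay"))            # array value: no match
print(first_element_equals(new_doc, "big_text"))  # not at index 0: no match
print(any_element_equals(new_doc, "big_text"))    # any position: matches
```

The index-based form only helps when the wanted value is known to sit at a fixed position; the ANY form is the one that covers every position.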

Related

How to query from ListColumn[String] in cassandra using phantom

I am new to Cassandra (I started learning it out of my own interest a few days ago) and am looking for help with the problem below.
I have a Cassandra table "User" and a ListColumn "interests extends ListColumn[String]".
Now, I want to fetch all users with an interest, say "playing".
like: select from user where interests.contains("playing")!
I scanned through the ListColumn API but was not able to find anything. I also searched on Google but found no helpful posts.
Any help guys please... Thanks in Advance :)
There is a contains operator among the query operators. It looks like it should work like any other operator, so just go for database.userTable.select.where(_.interests contains "playing").fetch(), adjusted to your own conventions.
This is possible with a secondary index on a collection column, which only works with a Set column, and not with a List.
Bottom line:
object interests extends SetColumn[String](this) with Index[Set[String]]
And then you can execute the following:
select.where(_.interests contains "test").fetch()
You can also use multiple restrictions if you allow filtering.
select.where(_.interests contains "test")
.and(_.interests contains "test2")
.allowFiltering()
.fetch()
The above will only match if both interests are found in a record.
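As a rough illustration of the matching rule (not of phantom or Cassandra themselves), the chained contains restrictions behave like a subset test on each record's set. The user names and interests below are hypothetical sample data.

```python
# Hypothetical sample data: user name -> set of interests.
users = {
    "alice": {"test", "test2", "playing"},
    "bob": {"test"},
    "carol": {"test2", "reading"},
}


def users_with_all(records, *wanted):
    """Return the names whose interest set contains every wanted value."""
    required = set(wanted)
    return sorted(name for name, interests in records.items() if required <= interests)


print(users_with_all(users, "test"))           # single contains restriction
print(users_with_all(users, "test", "test2"))  # both restrictions must hold
```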

Mysql to Sphinx query conversion

How can I write this query in Sphinx: select * from vehicle_details where make LIKE "%john%" OR id IN (1,2,3,4)? Can anyone help me? I've searched a lot and can't find the answer. Please help.
Well, if you really want to use Sphinx, you could perhaps make the id into a fake keyword so you can use it in the MATCH, e.g.
sql_query = SELECT id, CONCAT('id',id) as _id, make, description FROM ...
Now you have an id-based keyword you can match on.
SELECT * FROM index WHERE MATCH('(@make *john*) | (@_id id1|id2|id3|id4)')
But do read up on Sphinx keyword matching: by default Sphinx only matches whole words, so you need to enable part-word matching with wildcards (e.g. with min_infix_len) to get close to a simple LIKE %..% match (which doesn't take words into account).
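A minimal sketch of building that MATCH expression from a list of ids in application code. The helper name is hypothetical, the field names make and _id come from the sql_query above, and the sketch uses the @field operator of Sphinx's extended query syntax. It assumes the search fragment is a single plain word, so no query-syntax escaping is attempted.

```python
def build_match_expression(make_fragment, ids):
    """Build '(@make *frag*) | (@_id id1|id2|...)' for use inside MATCH('...')."""
    # Assumption: only simple alphanumeric fragments are accepted, so there is
    # nothing to escape in the extended query syntax.
    if not make_fragment.isalnum():
        raise ValueError("make_fragment must be a single alphanumeric word")
    if not ids:
        raise ValueError("ids must not be empty")
    # Each id becomes the fake keyword produced by CONCAT('id', id).
    id_terms = "|".join(f"id{int(doc_id)}" for doc_id in ids)
    return f"(@make *{make_fragment}*) | (@_id {id_terms})"


print(build_match_expression("john", [1, 2, 3, 4]))
```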
This is actually pretty hard to do, because you are mixing a string search (the LIKE, which would become a MATCH) with an attribute filter.
I would suggest two separate queries: one to Sphinx for the text filter, and the IN filter run directly in the database (MySQL?). Merge the results in the application.
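The merge step for the two-query approach could look roughly like this. It assumes each query has already returned a list of document ids; the function name and the sample ids are made up for the example, and the OR semantics of the original SQL is modelled as an order-preserving union.

```python
def merge_ids(sphinx_ids, db_ids):
    """Union of two id lists, keeping first-seen order and dropping duplicates."""
    seen = set()
    merged = []
    for doc_id in list(sphinx_ids) + list(db_ids):
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


# Hypothetical results: ids matched by the Sphinx text query, then the IN list.
print(merge_ids([7, 2, 9], [1, 2, 3, 4]))
```

Putting the Sphinx results first keeps their relevance ordering at the top of the merged list; swap the arguments if the explicit id list should win.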

Django-Haystack autocomplete---get distinct results

I would like my autocomplete results with django-haystack to be distinct. However, if multiple objects in my database have a certain value for an attribute on which I am autocompleting, the result appears multiple times.
I am using Haystack with Solr as my backend. My query, as in the tutorial, looks like:
SearchQuerySet().autocomplete(content_auto=request.GET.get('q', ''))[:5]
I'm new to Haystack, and the documentation seems limited.
Any help would be greatly appreciated.
Thanks!

I cannot make the CQL parameter IN work

When using the CQL IN operator in my URL I get no result. If I use CQL_FILTER=id=229539 the URL works fine. I really need to specify a list, so I tried the CQL IN operator with no luck. I tried CQL_FILTER=id IN (229539) and CQL_FILTER=id IN ('229539'); neither works. Why?
The CQL
id IN ('229539')
should actually be fine; we have such queries and they do work.
What you could do to troubleshoot:
Make sure that the URL is correctly encoded. Maybe some of the quotes or brackets get interpreted incorrectly.
Make sure that nothing else in the request interacts with your filter.
Make sure you don't request two or more layers.
Turn on verbose logging and look into the GeoServer logs. I can't give you the exact pointer, but it is possible to turn on verbose logging, and then you'll see how the query is processed (down to the SQL). This may give good hints.
If nothing helps, post your metadata (WFS GetCapabilities) and exact URLs you're trying.
You may also want to ask here.
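For the URL-encoding check in the list above, a small sketch using Python's standard library shows what the filter looks like once percent-encoded. The helper name is hypothetical; it only builds the CQL_FILTER query-string parameter, not a complete GeoServer request.

```python
from urllib.parse import quote, unquote, urlencode


def cql_filter_param(ids):
    """Return a percent-encoded CQL_FILTER parameter for an id IN (...) filter."""
    # int() guards against anything but numeric ids ending up in the filter.
    id_list = ", ".join(f"'{int(feature_id)}'" for feature_id in ids)
    cql = f"id IN ({id_list})"
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return urlencode({"CQL_FILTER": cql}, quote_via=quote)


encoded = cql_filter_param([229539])
print(encoded)
print(unquote(encoded))  # decodes back to the readable filter
```

Comparing the encoded parameter with what actually reaches the server (for example in the access log) is a quick way to rule out mangled quotes or brackets.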
The problem was not the CQL. The CQL was correct. The problem was the definition of the view used for the request.
CREATE OR REPLACE VIEW data.yellowgreen AS
SELECT
markblok.id AS id,
temp_kvadrats.value,
st_intersection(markblok.geom, temp_kvadrats.geom) AS geom,
temp_kvadrats.session_id
FROM data.temp_kvadrats
JOIN data.markblok ON st_intersects(markblok.geom, temp_kvadrats.geom);
The CQL only works with 'markblok.id AS id' and not with just 'markblok.id', even though listing the view's columns shows the name id in both cases.

Sphinx search debugging

We use Sphinx Search at work, and I am having an issue with a new index I'm setting up. Does anyone know of a tool or technique I can use to look at the data stored in Sphinx?
Basically I want to do something like - "show me the first 5 records in index 'X'", just to be sure that it is actually storing data. At the moment I'm about 90% sure that my query code is correct but have no way of knowing that my index is correct.
Cheers
SELECT * FROM index_name on the Sphinx side should give you a list of IDs. This requires MySQL protocol support to be enabled in the Sphinx conf file:
listen = 9306:mysql41
To find the IDs in an index, you can do this:
/usr/local/sphinx/bin/indextool -c /usr/local/sphinx/etc/sphinx.conf --dumpdocids indexname
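A minimal sketch of the "show me the first 5 records" check from application code, assuming searchd is listening on 9306 with the mysql41 protocol as configured above. The function names are hypothetical; peek_rows takes any DB-API style cursor (for example one obtained from a MySQL client library connected to port 9306), so no particular driver is assumed.

```python
import re


def peek_query(index_name, limit=5):
    """Build a SphinxQL statement that returns the first few rows of an index."""
    # Index names cannot be bound as parameters, so validate before interpolating.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", index_name):
        raise ValueError(f"suspicious index name: {index_name!r}")
    return f"SELECT * FROM {index_name} LIMIT {int(limit)}"


def peek_rows(cursor, index_name, limit=5):
    """Run the peek query on a DB-API cursor and return the fetched rows."""
    cursor.execute(peek_query(index_name, limit))
    return cursor.fetchall()


print(peek_query("X"))
```

If rows come back, the index holds data and any remaining problem is more likely in the search query than in indexing.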