SphinxSE Returning Strange Results - sphinx

I have the following SphinxSE query that is returning some strange results:
select distinct
model,
keywords,
`name`,
image,
products_parent_status,
`status`,
final_price,
source,
source_id,
description,
subdescription,
language_id,
feature,
var_val,
weight
from __search
where query='#(keywords,name,model,image,description,subdescription,feature,var_val) about;
fieldweights=keywords,11,model,10,name,9,feature,8,var_val,7,description,6,subdescription,5,image,4;
mode=extended;
maxmatches=500000;
ranker=proximity_bm25;
limit=20'
order by weight desc, `name`;
If I search for "about", I get exactly the number of results I was anticipating, but if I search for "abo" I get no results. Does it have something to do with the ranker I am using? I have tried other rankers, but I still get the same strange behavior. Any help with this will be greatly appreciated.

Closing this issue. I added asterisks around the search term and set min_prefix_len on the index.
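For anyone building the query string in application code, here is a minimal Python sketch of the "asterisks around the search term" part of that fix. The helper names `star_wrap` and `build_query` are hypothetical, the options are copied from the query in the question, and it assumes the index has already been rebuilt with min_prefix_len set as described above.

```python
def star_wrap(term: str) -> str:
    """Wrap each whitespace-separated word in asterisks: 'abo' -> '*abo*'."""
    return " ".join(f"*{word}*" for word in term.split())


def build_query(term: str) -> str:
    """Build the text for the SphinxSE `query=` column (hypothetical helper).

    Only the wildcard wrapping is new; mode and limit are the options
    already used in the question.
    """
    return f"{star_wrap(term)};mode=extended;limit=20"


print(build_query("abo"))
```

The resulting string would go inside the `where query='...'` condition. Escaping of quotes and Sphinx operators in user input is left out of this sketch.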

Related

Get entire record with max field for each group

There are a lot of answers about this problem, but none of them retrieves the entire record, only the ID... and I need the whole record.
So, I have a table status_changes that is made up of the following columns:
issue_id : the issue the change refers to
id: the id of the change, just a SERIAL
status_from and status_to: the status that the issue had before the change and the status it has after it
when: a timestamp of when this happened
Nothing too crazy, but now, I would like to have the "most recent status_change" for each issue.
I tried something like:
select id
from change
group by issue_id
having when = max(when)
But this obviously has 2 big problems:
1. select contains fields that are not in the group by
2. having can't contain an aggregate function in this way
I thought of ordering every group by when and using something like top(1), but I can't figure out how to do it...
Use PostgreSQL's DISTINCT ON:
SELECT DISTINCT ON (issue_id)
id, issue_id, status_from, status_to, "when"
FROM change
-- "when" is a reserved word in PostgreSQL, so it has to be double-quoted
ORDER BY issue_id, "when" DESC;
This will return the first result (the one with the greatest when) for each issue.
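To make the semantics concrete, here is a small plain-Python sketch of what DISTINCT ON (issue_id) with that ORDER BY does: order the rows by issue_id and then by when descending, and keep the first row of each issue_id group. The sample rows and the function name `latest_per_issue` are made up for illustration; this is not how PostgreSQL implements it.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data shaped like the table in the question.
rows = [
    {"id": 1, "issue_id": 10, "status_from": "open", "status_to": "review", "when": "2020-01-01T09:00"},
    {"id": 2, "issue_id": 10, "status_from": "review", "status_to": "closed", "when": "2020-01-03T09:00"},
    {"id": 3, "issue_id": 11, "status_from": "review", "status_to": "closed", "when": "2020-01-02T09:00"},
    {"id": 4, "issue_id": 11, "status_from": "open", "status_to": "review", "when": "2020-01-01T12:00"},
]


def latest_per_issue(rows):
    """Return the whole most recent row for each issue_id."""
    # Sort by "when" descending first, then by issue_id. Python's sort is
    # stable, so inside each issue the newest row stays first.
    ordered = sorted(rows, key=itemgetter("when"), reverse=True)
    ordered.sort(key=itemgetter("issue_id"))
    # Keep only the first row of every issue_id group.
    return [next(group) for _, group in groupby(ordered, key=itemgetter("issue_id"))]


for row in latest_per_issue(rows):
    print(row)
```

Note that entire rows come back, not just the IDs, which is the point of the question.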

Django-Haystack autocomplete---get distinct results

I would like my autocomplete results with django-haystack to be distinct. However, if multiple objects in my database have a certain value for an attribute on which I am autocompleting, the result appears multiple times.
I am using Haystack with Solr as my backend. My query, as in the tutorial, looks like:
SearchQuerySet().autocomplete(content_auto=request.GET.get('q', ''))[:5]
I'm new to Haystack, and the documentation seems limited.
Any help would be greatly appreciated.
Thanks!

FOR LAST - Query, giving wrong result

I'm looking to use the following query to find the last tender id.
FOR EACH tender-table NO-LOCK WHERE tender-table.kco = 1 BY tender-table.id:
DISPLAY tender-table.id.
END.
This query looks at all the tender IDs and returns all of them in ascending order. The results I get are:
1,035
1,036
......
1,060
1,061
1,062
1,063
1,064
1,065
1,066
FOR LAST tender-table NO-LOCK WHERE tender-table.kco = 1 BY tender-table.id:
DISPLAY tender-table.id.
END.
However, when I use this query to find the last ID, I get the result
1,061
When I should be seeing the result 1,066. Can anyone suggest why this is happening?
FOR LAST is a very deceptive statement. (So is FOR FIRST.) It does not behave in an intuitive manner. The sort order is NOT specified by the BY statement. You will get the LAST record according to the index that is used, and no sorting will take place. When the BY refers to an unindexed field (or one that does not sort in the order of the index actually used), or when the WHERE clause does not obviously map to an index in the order that you are hoping for, you will have mysterious records chosen.
Personally, I strongly suggest that you forget about using FOR FIRST & FOR LAST. A better option, which always sorts as expected, would be:
FOR EACH tableName WHERE someCriteria BREAK BY sortOrder:
LEAVE.
END.
DISPLAY whatEver.
(Add "DESCENDING" to flip from FIRST to LAST...)
Just in case anyone needs convincing -- try this with the "sports" database:
for first customer no-lock by discount:
display name discount.
end.
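For readers who do not have a Progress environment at hand, the difference described above can be sketched as a rough Python analogy. This only models the behavior as the answer describes it (records come back in the order of the index used, and BY does not re-sort them). The sample records, the assumption that the index is on name, and both function names are hypothetical.

```python
# Records in the order of the index actually used (assumed here: an index on name).
customers_in_index_order = [
    {"name": "Aardvark", "discount": 35},
    {"name": "Beta", "discount": 10},
    {"name": "Zeta", "discount": 20},
]


def for_first_as_described(records):
    """Model of FOR FIRST per the answer: first record in index order, BY ignored."""
    return records[0]


def for_each_break_by(records, key, descending=False):
    """Model of FOR EACH ... BREAK BY ...: LEAVE. Sort, then stop after one record."""
    for record in sorted(records, key=key, reverse=descending):
        return record
    return None


print(for_first_as_described(customers_in_index_order)["name"])
print(for_each_break_by(customers_in_index_order, key=lambda r: r["discount"])["name"])
```

In this toy data the first function returns the customer that is first by name, while the second returns the one with the lowest discount, which is what "first by discount" was meant to find.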
Sorry, I have managed to figure it out: the records with ID 1,066 didn't have tender-table.kco = 1. This solves the problem. Thanks for your time.

FQL subquery breaks top query and never returns

We have a FQL query that used to work and stopped somewhere around Oct 16. No help from Facebook on this.
This code used to work:
SELECT object_id, metric, end_time, period, value
FROM insights
WHERE object_id IN
(
SELECT page_id
FROM page_admin
WHERE uid=123
AND page_id<>456
AND page_id<>789
)
AND metric="page_audio_plays"
AND end_time=end_time_date("2011-11-11" )
AND period=86400
If I run the inner SELECT on its own, it returns a large list of page_ids. If I remove the inner SELECT and replace it with a list of comma-separated IDs like this:
...where object_id in ( 123, 456, 8778, 999)
The overall query runs.
With the original code above, the query never returns and times out.
Question: Is anyone aware of something on the FB side that broke around the middle of October in this regard? Or is there something inherently wrong with doing a subquery like this?
Any suggestions on how to work around?
Net: the query returned TOO much data. If you have this problem, break up the result set somehow so that each query returns a smaller set of data. It would be nice if the API returned some discernible status telling you so, but....
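One way to break up the result set, sketched in Python: fetch the page IDs first, split them into batches, and issue one insights query per batch. The batch size, the helper names `chunked` and `build_insights_query`, and the sample IDs are assumptions for illustration. The query text is the one from the question, and actually sending it to Facebook is left out.

```python
def chunked(ids, size):
    """Yield consecutive slices of `ids` with at most `size` elements each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_insights_query(page_ids):
    """Build the insights query from the question for an explicit list of IDs."""
    id_list = ", ".join(str(page_id) for page_id in page_ids)
    return (
        "SELECT object_id, metric, end_time, period, value "
        "FROM insights "
        f"WHERE object_id IN ({id_list}) "
        'AND metric="page_audio_plays" '
        'AND end_time=end_time_date("2011-11-11") '
        "AND period=86400"
    )


# Hypothetical IDs; in practice these come from the page_admin query.
page_ids = [123, 456, 8778, 999, 1001]
BATCH_SIZE = 2  # arbitrary; tune until each query returns reliably

queries = [build_insights_query(batch) for batch in chunked(page_ids, BATCH_SIZE)]
for query in queries:
    print(query)
```

Each query then covers only a slice of the pages, and the partial results can be concatenated on the client.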

Facebook - max number of parameters in “IN” clause?

In Facebook query language(FQL), you can specify an IN clause, like:
SELECT uid1, uid2 FROM friend WHERE uid1 IN (1000, 1001, 1002)
Does anyone know what's the maximum number of parameters you can pass into IN?
I think the maximum result set size is 5000
It may seem like an odd number (so perhaps I miscounted, but it's within about 1), but I cannot seem to query more than 73 IN items. This is my query:
SELECT object_id, metric, value
FROM insights
WHERE object_id IN ( ~73 PAGE IDS HERE~ )
AND metric='page_fans'
AND end_time=end_time_date('2011-06-04')
AND period=period('lifetime')
This is using the JavaScript FB.api() call.
Not sure if there is an undocumented limit or it might be an issue of facebook fql server timeout.
You should check whether an error 500 is returned from the FB web server, which might indicate that you are passing a GET request that is too long (see Facebook query language - long query)
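A quick way to see whether the request is getting long is to build the URL locally and measure it before sending anything. The sketch below does that in Python. The base URL, the parameter name `q`, and the 2000-character threshold are placeholders chosen for illustration, not documented Facebook values.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.invalid/fql"  # placeholder, not a real endpoint
MAX_URL_LENGTH = 2000                     # assumed threshold, adjust as needed


def fql_get_url(query: str) -> str:
    """Return the GET URL that would carry `query` (hypothetical parameter `q`)."""
    return BASE_URL + "?" + urlencode({"q": query})


def is_too_long(query: str) -> bool:
    """True when the encoded URL exceeds the assumed length threshold."""
    return len(fql_get_url(query)) > MAX_URL_LENGTH


short_query = "SELECT uid1, uid2 FROM friend WHERE uid1 IN (1000, 1001, 1002)"
many_ids = ", ".join(str(100000000000000 + i) for i in range(1000))
long_query = f"SELECT uid1, uid2 FROM friend WHERE uid1 IN ({many_ids})"

print(len(fql_get_url(short_query)), is_too_long(short_query))
print(len(fql_get_url(long_query)), is_too_long(long_query))
```

URL encoding makes the request noticeably longer than the raw query, because each ", " separator becomes several characters.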
I realized my GET was too long, so instead of putting many numbers in the IN statement, I put a sub-query there that fetches those numbers from FB FQL. Unfortunately, it looks like FB couldn't handle the query and returned an 'unknown error' in the JSON, which really doesn't help us understand the problem.
There should not be a maximum number of parameters, as there isn't one in SQL's IN as far as I know.
http://www.sql-tutorial.net/SQL-IN.asp
Just don't use more parameters than you have values for the function to check, because you will not get any results (I don't know whether it will raise an error, as I never tried it).