Using custom variables in Sphinx queries - sphinx

I need to execute two queries in one batch and pass the result of the first query into an IF expression in the second, as I would with ordinary MySQL queries.
For example, I am trying to pass the @average variable into the second query:
SET @average=(SELECT AVG(weight()) avg_rank FROM common WHERE match('query text') OPTION ranker=expr('sum(word_count)*100 + sum(lcs*user_weight)*100 + bm25 + sum(exact_order)*200'));
SELECT *, weight() as rank, 2000 * exp( - 9.594E-5 * abs(1486121357 - _rank_date)/1000) AS date_rank, IF(_importance > @average,_importance,0) AS importance_rank, (rank + date_rank + importance_rank) as total_rank FROM common WHERE match('query text') OPTION ranker=expr('sum(word_count)*100 + sum(lcs*user_weight)*100 + bm25 + sum(exact_order)*200')
But I get a parse error. How can I do this?

I don't think you will be able to do that in Sphinx as such.
The application would just have to run the first query, capture the value, and write it explicitly into the second query.
But it also seems that the expression just modifies what is returned anyway (rather than, say, reordering or filtering results), so rather than getting Sphinx to compute the IF expression, just do it in the application.
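A minimal sketch of the application-side approach, in Python. It assumes the average has already been captured from the first query and that the rows of the second query have been fetched as dictionaries. The function name apply_importance and the dictionary keys are illustrative, not part of any Sphinx client API.

```python
def apply_importance(rows, average):
    """Add importance_rank and total_rank to rows fetched from Sphinx.

    rows:    list of dicts with 'rank', 'date_rank' and '_importance' keys
             (an assumed shape for the second query's result set).
    average: the AVG(weight()) value captured from the first query.
    """
    ranked = []
    for row in rows:
        # Application-side equivalent of IF(_importance > average, _importance, 0).
        importance_rank = row["_importance"] if row["_importance"] > average else 0
        ranked.append({
            **row,
            "importance_rank": importance_rank,
            "total_rank": row["rank"] + row["date_rank"] + importance_rank,
        })
    return ranked


# Made-up sample rows, only to show the call.
sample = [
    {"rank": 10, "date_rank": 5, "_importance": 30},
    {"rank": 20, "date_rank": 1, "_importance": 3},
]
result = apply_importance(sample, average=15.0)
```

If the results must also be ordered by total_rank, the application can sort the returned list itself, keeping in mind that only the rows Sphinx actually returned are available for sorting.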

Related

REGEXP_LIKE QUERY IN ORACLE DB

I currently need to match both 00000012345 and 12345 in my DB search query.
I am currently using the following query:
SELECT *
FROM BPP.CHECK_REGISTER
WHERE CHECK_NO like CONCAT('%',:checkNum)
for searching, but here % can match any characters, not only leading zeros, so I have replaced this query with the following:
SELECT *
FROM BPP.CHECK_REGISTER
WHERE REGEXP_LIKE (CHECK_NO,'^(0*12345)$')
but in this query I don't want to hard-code 12345; I want to supply it as a user-entered parameter, like :checkNum in the first query.
How do I rephrase the REGEXP_LIKE condition to use the user input :checkNum with only 2 arguments, as Oracle DB allows only a maximum of 2 arguments? (another problem)
You can concatenate the parameter:
SELECT *
FROM BPP.CHECK_REGISTER
WHERE REGEXP_LIKE (CHECK_NO,'^(0*'||:checkNum||')$');
Alternatively, add the regex part to the user-entered value (in your application code) before passing it to the query.
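As a rough sketch of that second option, the pattern could be built in application code like this (Python here; the helper name leading_zero_pattern is hypothetical). It assumes check numbers consist of ASCII digits only, which also keeps user input from injecting regex syntax. Python's re module is used only to sanity-check the pattern; Oracle's regex engine is a different implementation, but this pattern uses only anchors, grouping and *.

```python
import re


def leading_zero_pattern(check_num):
    """Build '^(0*<digits>)$' for a user-entered check number."""
    # Reject anything that is not plain ASCII digits, so characters such as
    # '.' or '*' in the input cannot change the meaning of the regex.
    if not re.fullmatch(r"[0-9]+", check_num):
        raise ValueError("check number must contain digits only")
    return "^(0*" + check_num + ")$"


pattern = leading_zero_pattern("12345")

# Quick local check of what the pattern accepts and rejects.
matches_padded = re.match(pattern, "00000012345") is not None
matches_plain = re.match(pattern, "12345") is not None
matches_other = re.match(pattern, "912345") is not None
```

The resulting string would then be passed as a single bind variable, for example REGEXP_LIKE (CHECK_NO, :pattern), where the bind name :pattern is an assumption for illustration.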

Oracle Text : How to not count a part of the query for scoring?

I have a multicolumn datastore indexed using Oracle Text, and I am running queries using Contains keyword.
To weight the different columns differently, I proceed as follows.
If the user searches for "horrible", the query issued to Oracle will look like this:
WHERE CONTAINS(indexname,
'((horrible WITHIN column1) * 3)
OR ((horrible WITHIN column2) * 2)') > 1
But to add a category filter that is also indexed, I do this :
WHERE CONTAINS(indexname,
'(((horrible WITHIN Column1) * 3)
OR ((horrible WITHIN Column2) * 2))
AND (movie WITHIN CategoryColumn)', 1) > 1
This filters by category, but it completely messes up the scoring, because Oracle Text takes the lowest score from either side of the AND operator.
Instead, I would like to instruct Oracle to ignore the right side of my AND.
Is there a way to have this specific part of the query ignored by the scoring?
Basically, I want to score according to
((horrible WITHIN Column1) * 3)
OR ((horrible WITHIN Column2) * 2)
but I want to select according to
(((horrible WITHIN Column1) * 3)
OR ((horrible WITHIN Column2) * 2))
AND (movie WITHIN CategoryColumn)
There is a mention of
Specify how the score from child elements of OR and AND operators should be merged.
in the Oracle docs, in the Alternative and User-defined Scoring section, but there are not a lot of examples.
Using query relaxation might be simpler in this case (if it works), e.g.:
where CONTAINS (indexname,
'<query>
<textquery lang="ENGLISH" grammar="CONTEXT">
<progression>
<seq>(horrible WITHIN Column1) AND (movie WITHIN CategoryColumn)</seq>
<seq>(horrible WITHIN Column2) AND (movie WITHIN CategoryColumn)</seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="COUNT"/>
</query>')>0;
This way you don't need to assign weights, as scoring from the more relaxed query never exceeds the previous one in sequence.

In TSQL, is there any performance difference between SUM(A + B) vs SUM(A) + SUM(B)?

I have to do a sum of 2 fields that are then also summed. Is there any difference from a performance standpoint between doing the addition of the fields first or after the columns have been summed?
Method 1 = SELECT SUM(columnA + columnB)
Method 2 = SELECT SUM(columnA) + SUM(columnB)
(Environment = SQL Server 2008 R2)
I have checked on this, and what I see is that SUM(x) + SUM(y) is faster.
Why? SUM is an aggregate function, and aggregates skip NULL values. When you combine two fields inside the aggregate, the engine has to check for every row whether one of the fields is NULL, since a row can contain both a value and a NULL. Adding NULL (or UNKNOWN or NOTHING, if you like) to something still yields NULL. So this has to be checked for each record.
When you look at your execution plan and check the Compute Scalar operator, you'll see exactly this behavior.
For the SUM(x) + SUM(y) method you see an estimated CPU cost of 0,0000001, whereas the other method takes up to 0,0000041. That is noticeably more!
Also, when you take a closer look, you'll see that SUM(x + y) is turned into something like
[Expr1004] = Scalar Operator(CASE WHEN [Expr1006]=(0) THEN NULL ELSE [Expr1007] END)
So, in the end, SUM(x) + SUM(y) can be considered faster.
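The NULL handling described above also means the two forms can return different results, not just different costs. The sketch below shows this with Python's built-in sqlite3 module. SQLite stands in for SQL Server 2008 R2 here, on the assumption that both follow the standard rules: NULL + value is NULL, and SUM skips NULLs. The table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
# The first row has a NULL in column b.
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, None), (2, 3)])

# Row 1: 1 + NULL is NULL and is skipped by SUM, so only row 2 (2 + 3) counts.
sum_of_rows = conn.execute("SELECT SUM(a + b) FROM t").fetchone()[0]

# SUM(a) = 1 + 2 and SUM(b) = 3 (the NULL is skipped), then the two are added.
sum_of_sums = conn.execute("SELECT SUM(a) + SUM(b) FROM t").fetchone()[0]

print(sum_of_rows, sum_of_sums)  # prints: 5 6
conn.close()
```

So before picking the faster form, it is worth confirming that the columns are NOT NULL or that the difference in results is acceptable.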

same query, two different ways, vastly different performance

I have a Postgres table with more than 8 million rows. Given the following two ways of doing the same query via DBD::Pg, I get wildly different results.
$q .= '%';
## query 1
my $sql = qq{
SELECT a, b, c
FROM t
WHERE Lower( a ) LIKE '$q'
};
my $sth1 = $dbh->prepare($sql);
$sth1->execute();
## query 2
my $sth2 = $dbh->prepare(qq{
SELECT a, b, c
FROM t
WHERE Lower( a ) LIKE ?
});
$sth2->execute($q);
Query 2 is at least an order of magnitude slower than query 1. It seems that query 2 is not using the index, while query 1 is.
Would love to hear why.
With LIKE expressions, b-tree indexes can only be used if the search pattern is left-anchored, i.e. it starts with a constant string and the % wildcard comes only at the end. More details in the manual.
Thanks to @evil otto for the link. This link points to the current version.
Your first query provides this essential information at prepare time, so the query planner can use a matching index.
Your second query does not provide any information about the pattern at prepare time, so the query planner cannot use any indexes.
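To make "left-anchored" concrete, here is a small Python illustration (not part of DBD::Pg or Postgres). The hypothetical helper like_prefix returns the constant prefix of a LIKE pattern, which is the part a b-tree index could be searched with. It assumes the pattern contains no escape characters.

```python
def like_prefix(pattern):
    """Return the literal text before the first LIKE wildcard (% or _).

    Simplifying assumption: the pattern contains no escaped wildcards.
    """
    prefix = []
    for ch in pattern:
        if ch in "%_":
            break
        prefix.append(ch)
    return "".join(prefix)


anchored = like_prefix("foo%")      # constant prefix "foo": left-anchored
unanchored = like_prefix("%foo")    # empty prefix: not left-anchored
partial = like_prefix("fo_o%")      # only "fo" is usable as a prefix
```

With the literal '$q' interpolated into the SQL text, the planner can see such a prefix at prepare time. With a ? placeholder, the pattern is unknown when the plan is built.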
I suspect that in the first case the query compiler/optimizer detects that the clause is a constant, and can build an optimal query plan. In the second it has to compile a more generic query because the bound variable can be anything at run-time.
Are you running both test cases from the same file using the same $dbh object?
I think the reason for the speed difference in the second case is that you are using a prepared statement, which is already parsed (but maybe I am wrong :)).
Ahh, I see - I will drop out after this comment since I don't know Perl. But I would trust that the editor is correct in highlighting the $q as a constant. I'm guessing that you need to concatenate the value into the string, rather than just directly referencing the variable. Perl concatenates strings with the . operator, so use something like:
my $sql = qq{
SELECT a, b, c
FROM t
WHERE Lower( a ) LIKE '} . $q . qq{'};
(Note: unless the language is tightly integrated with the database, such as Oracle/PLSQL, you usually have to create a completely valid SQL string before submitting to the database, instead of expecting the compiler to 'interpolate'/'Substitute' the value of the variable.)
I would again suggest that you get the COUNT() of the statements, to make sure that you are comparing apples to apples.
I don't know Postgres at all, but I think in Line 7 (WHERE Lower( a ) LIKE '$q'), $q is actually a constant. It looks like your editor thinks so too, since it is highlighted in red. You probably still need to use the ? for the variable.
To test, do a COUNT(*), and make sure they match - I could be way off base.

Google Fusion Tables SQL Query ORDER BY and GROUP BY in same query doesn't work

I'm writing a mobile web page using Google Fusion Tables for my data and I need to use both ORDER BY (lat and long) as well as GROUP BY (in place of DISTINCT - which GFT doesn't support).
But, it seems the two do not play well together. If I use GROUP BY, the statement seems to simply ignore the ORDER BY.
SQL statement:
SELECT Count(), Facility_Name FROM 2206340 GROUP BY Facility_Name
ORDER BY ST_DISTANCE(Lat, LATLNG(" + lat + "," + lng + "))LIMIT 10
Has anyone else run into this scenario?
You can't order by a column that isn't included in the SELECT or GROUP BY clause (unless you use an aggregate function).