Does sphinxql support 'not between'? - sphinx

I tried
SELECT count(*) as count FROM rt_item WHERE MATCH('') AND sale between 1 and 2;
and got many rows.
But sphinx complains
ERROR 1064 (42000): sphinxql: syntax error, unexpected BETWEEN, expecting IN near 'between 1 and 2'
when I tried
SELECT count(*) as count FROM rt_item WHERE MATCH('') AND sale not between 1 and 2;
I searched Sphinx official document, which says :
WHERE clause. This clause will map both to fulltext query and filters. Comparison operators (=, !=, <, >, <=, >=), IN, AND, NOT, and BETWEEN are all supported and map directly to filters. OR is not supported yet but will be in the future. MATCH('query') is supported and maps to fulltext query.
But it has no example about NOT BETWEEN.
Can anybody tell me whether sphinxql support NOT BETWEEN? If if supports, what's the correct grammar?

No, no NOT BETWEEN. There is a NOT IN() operator, which might help in this specific case
SELECT count(*) as count FROM rt_item WHERE MATCH('') AND sale not in (1,2);
(gets cumbersome with long ranges)
It's a bit convoluted but can do
SELECT count(*) as count,sale>=1 AND sale<=2 AS f FROM rt_item WHERE f=0
Create a virtual attribute to mimic a 'between', then just check for false.
Might also find
SELECT count(*) as count,sale < 1 OR sale > 2 AS f FROM rt_item WHERE f=1
clearer. Inverts the logic.
(empty match() does nothing, can be omitted)

Related

Postgresql, how to query the database based on a certain condition?

I was trying to write a postgresql statement that would allow me to execute a certain query only if a condition is satisfied:
select SUM("TotalBookings") AS Availability,
from (select count (*) as "TotalBookings"
from bookings
where hotel_id=2 and ('2023-02-10') between symmetric check_in_date and check_out_date
union all
select -count (*) as "TotalRooms"
from hotels
inner join rooms on rooms.hotel_id = hotels.hotel_id
where hotels.hotel_id=2 ) Availability;
case
when Availability >= 1
then select *
from hotels
where hotel_id=2
end
I basically would like to show my hotels only when the sum is greater than one.
Below the error returned:
syntax error at or near "from"
Any ideas on whether mine is only a syntax error or if my logic is wrong as well ?
Thank you in advance.

select count(*) in WHERE Clause needing a comparison operator

Kinda new to SQL so I was reading up on some queries and chanced upon this (https://iggyfernandez.wordpress.com/2011/12/04/day-4-the-twelve-days-of-sql-there-way-you-write-your-query-matters/)
The part that got me curious is the aggregate query in the WHERE Clause. This is probably my misunderstanding but how does the author's code (shown below) run? I presumed that Count(*) - or rather aggregate functions cannot be used in the WHERE clause and you need a HAVING for that ?
SELECT per.empid, per.lname
FROM personnel per
WHERE (SELECT count(*) FROM payroll pay WHERE pay.empid = per.empid AND pay.salary = 199170) > 0;
My second question would be why the comparison operator (>0) is needed ? I was playing around and noticed that it would not run in PostgreSQL without the >0; also reformatting it to have a HAVING by Clause massively improves the query execution time
SELECT per.empid, per.lname
FROM personnel per
WHERE EXISTS (SELECT per.empid FROM payroll pay WHERE pay.empid = per.empid AND pay.salary = 199170)
GROUP BY per.empid, per.lname
HAVING COUNT(*) > 0;
Omit the GROUP BY and HAVING clauses in your version, then your query will be a more efficient query that is equivalent to the original.
In the original query, count(*) appears in the SELECT list of a subquery. You can use a parenthesized subquery almost anywhere in an SQL statement.

Are there plans to add 'OR' to attribute searches in Sphinx?

A little background is in order for this question since it is on surface too generic:
Recently I ran into an issue where I had to move the attribute values I was pushing into my sphinxql query as full-text because the attribute needed to be part of an 'OR' query.
In other words I was doing:
Select * from idx_test where MATCH('Terms') and name_id in (1,2,3)
When I tried to add an 'OR' to the attributes such as:
Select * from idx_test where MATCH('Terms') and name_id in (1,2,3) OR customer_id in (4,5,6)
it failed because Sphinx 2.* does not support OR in the attribute query.
I was also unable to simply put the name and customer IDs in to the query:
Select * from idx_test where MATCH('Terms ((#(name_id) 1|2|3)|(#customer_id) 4|5|6))')
Because (as far as I can tell) you can't push integer fields into the full_text search.
My solution was to index the id fields a second time appended by _text:
Select name_id, name_id as name_id_text
and then add that to the field list:
sql_attr_uint = name_id
sql_field_string = name_id_text
sql_attr_uint = customer_id
sql_field_string = customer_id_text
So now I can do my OR query as full_text:
Select * from idx_test where MATCH('Terms ((#(name_id_text) 1|2|3)|(#customer_id_text) 4|5|6))')
However recently I found an article that discussed the tradeoff between attribute and full-text searches. The upshot is that "it could reduce performance of queries that otherwise match few records". Which is precisely what my name_id/city_id query does. In an ideal world then I'd be able to go back to:
Select * from idx_test where MATCH('Terms') and name_id in (1,2,3) OR customer_id in (4,5,6)
If Sphinx would only allow for OR between attributes since as far as I can tell once I have a query that is filtering down to a relatively low # of results I'd have a much faster query using attributes vs full_text.
So my two-part question therefor is:
Am I in fact correct that this is the case (a query that would reduce the # of results significantly is better served doing attributes then full-text)?
If so are there plans to add OR to the attribute part of the SphinxQL query?
If so, when?
OR filter has been added in the Sphinx fork (from 2.3 branch) - Manticore, see https://github.com/manticoresoftware/manticore/commit/76b04de04feb8a4db60d7309bf1e57114052e298
For now it's only between attributes, OR between MATCH and attributes is not supported yet.
While yes, OR is not supported directly in WHERE, can still run the query. Your
Select * from idx_test where MATCH('Terms') and name_id in (1,2,3) OR customer_id in (4,5,6)
example can be written as
Select *, IN(name_id,1,2,3) + IN(customer_id,4,5,6) as filter
from idx_test where MATCH('Terms') and filter > 0
It is a bit more cumbersome, but should work. You still get the full benefit of the full-text inverted index, so performance actully shoudnt be bad. The fitler is only executed against docs matching the terms.
(this may look crazy, if coming from say mysql background, but remeber sphinxQL isnt mysql :)
You dont get 'short circuiting (ie customer_id filter, will still be run, even if matches name_id), so perhaps
Select *, IF(IN(name_id,1,2,3) OR IN(customer_id,4,5,6),1,0) as filter
from idx_test where MATCH('Terms') and filter =1
is even better, the if function has an OR operator! (as sphinx could potentially short-circuit, but don't know if it does)
(but also yes, if the 'filter' is highly selective (matching few rows), than including in the full-text query can be good. As it discards the rows earlier in processing. The problem with non-selective filters, is they have lots of matching rows, so a long doclist to process during text-query processing)

Postgres undefined column when using CASE

Using this SQL, I can cast a boolean column to a text:
SELECT *, (CASE WHEN bars.some_cond THEN 'Yes' ELSE 'No' END) AS some_cond_alpha
FROM "foos"
INNER JOIN "bars" ON "bars"."id" = "foos"."bar_id";
So why do I get a PG::UndefinedColumn: ERROR: column "some_cond_alpha" does not exist when I try to use it in a WHERE clause?
SELECT *, (CASE WHEN bars.some_cond THEN 'Yes' ELSE 'No' END) AS some_cond_alpha
FROM "foos"
INNER JOIN "bars" ON "bars"."id" = "foos"."bar_id"
WHERE (some_cond_alpha ILIKE '%y%');
This is because the column is created on-the-fly and does not exist. Possibly in later editions of PG it will, but right now you can not refer to an alias'd column in the WHERE clause, although for some reason you can refer to the alias'd column in the GROUP BY clause (don't ask me why they more friendly in the GROUP BY)
To get around this, I would make the query into a subquery and then query the column OUTSIDE the subquery as follows:
select *
from (
SELECT *, (CASE WHEN bars.some_cond THEN 'Yes' ELSE 'No' END) AS some_cond_alpha
FROM "foos"
INNER JOIN "bars" ON "bars"."id" = "foos"."bar_id"
) x
WHERE (x.some_cond_alpha ILIKE '%y%')
NOTE: It is possible at some point in the future you will be able to refer to an alias'd column in the WHERE clause. In prior versions, you could not refer to the alias in the GROUP BY clause but since 9.4 + it is possible...
SQL evaluates queries in a rather counterintuitive way. It starts with the FROM and WHERE clauses, and only hits the SELECT towards the end. So aliases defined in the SELECT don't exist yet when we're in the WHERE. You need to do a subquery if you want to have access to an alias, as shown in Walker Farrow's answer.
When I read an SQL query, I try to do so in roughly this order:
Start at the FROM. You can generally read one table/view/subquery at a time from left to right (or top to bottom, depending on how the code is laid out); it's normally not permissible for one item to refer to something that hasn't been mentioned yet.
Go down, clause by clause, in the order they're written. Again, read from left to right, top to bottom; nothing should reference anything that hasn't been defined yet. Stop right before you hit ORDER BY or something which can only go after ORDER BY (if there is no ORDER BY/etc., stop at the end).
Jump up to the SELECT and read it.
Go back down to where you were and resume reading.
If at any point you see a subquery, apply this algorithm recursively.
If the query begins with WITH RECURSIVE, go read the Postgres docs for 20 minutes and figure it out.

Proper GROUP BY syntax

I'm fairly proficient in mySQL and MSSQL, but I'm just getting started with postgres. I'm sure this is a simple issue, so to be brief:
SQL error:
ERROR: column "incidents.open_date" must appear in the GROUP BY clause or be used in an aggregate function
In statement:
SELECT date(open_date), COUNT(*)
FROM incidents
GROUP BY 1
ORDER BY open_date
The type for open_date is timestamp with time zone, and I get the same results if I use GROUP BY date(open_date).
I've tried going over the postgres docs and some examples online, but everything seems to indicate that this should be valid.
The problem is with the unadorned open_date in the ORDER BY clause.
This should do it:
SELECT date(open_date), COUNT(*)
FROM incidents
GROUP BY date(open_date)
ORDER BY date(open_date);
This would also work (though I prefer not to use integers to refer to columns for maintenance reasons):
SELECT date(open_date), COUNT(*)
FROM incidents
GROUP BY 1
ORDER BY 1;
"open_date" is not in your select list, "date(open_date)" is.
Either of these will work:
order by date(open_date)
order by 1
You can also name your columns in the select statement, and then refer to that alias:
select date(open_date) "alias" ... order by alias
Some databases require the keyword, AS, before the alias in your select.