How can I search on "/" and "$" in Sphinx? - sphinx

I am trying to add a NOT condition to a Sphinx search:
!($|/) NEAR/3 (HR)
however Sphinx returns "unexpected $" or "unexpected /" so I am assuming those are characters used as syntax searches. How can I tell Sphinx for this search to use as characters?

You should escape those symbols using \\:
select * from my_index where match('!(\\$|\\/) NEAR/3 (HR)');

Related

Oddities in fail2ban regex

This appears to be a bug in fail2ban, with different behaviour between the fail2ban-regex tool and a failregex filter
I am attempting to develop a new regex rule for fail2ban, to match:
\"%20and%20\"x\"%3D\"x
When using fail2ban-regex, this appears to produce the desired result:
^<HOST>.*GET.*\\"%20and%20\\"x\\"%3D\\"x.* 200.*$
As does this:
^<HOST>.*GET.*\\\"%20and%20\\\"x\\\"%3D\\\"x.* 200.*$
However, when I put either of these into a filter, I get the following error:
Failed during configuration: '%' must be followed by '%' or '(', found:…
To have this work in a filter you have to double-up the ‘%’, ie ‘%%’:
^<HOST>.*GET.*\\\"%%20and%%20\\\"x\\\"%%3D\\\"x.* 200.*$
While this gets the required hits running as a filter, it gets none running through fail2ban-regex.
I tried the \\\\ as Andre suggested below, but this gets no results in fail2ban-regex.
So, as this appears to be differential behaviour, I am going to file it as a bug.
According to Python's own site a singe backslash "\" has to be written as "\\\\" and there's no mention of %.
Regular expressions use the backslash character ('') to indicate
special forms or to allow special characters to be used without
invoking their special meaning. This collides with Python’s usage of
the same character for the same purpose in string literals; for
example, to match a literal backslash, one might have to write '\\'
as the pattern string, because the regular expression must be \, and
each backslash must be expressed as \ inside a regular Python string
literal
I would just go with:
failregex = (?i)^<HOST> -.*"(GET|POST|HEAD|PUT).*20and.*3d.*$
the .* wil match anything inbetween anyways and (?i) makes the entire regex case-insensitive

Why does my LIKE statement fail with '\\_' for matching?

I have a database entry that has entries that look like this:
id | name | code_set_id
I have this particular entry that I need to find:
674272310 | raphodo/qrc_resources.py | 782732
In my rails app (2.3.8), I have a statement that evaluates to this:
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc\\_resources.py%';
From reading up on escaping, the above query is correct. This is supposed to correctly double escape the underscore. However this query does not find the record in the database. These queries will:
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc\_resources.py%';
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc_resources.py%';
Am I missing something here? Why is the first SQL statement not finding the correct entry?
A single backslash in the RHS of a LIKE escapes the following character:
9.7.1. LIKE
[...]
To match a literal underscore or percent sign without matching other characters, the respective character in pattern must be preceded by the escape character. The default escape character is the backslash but a different one can be selected by using the ESCAPE clause. To match the escape character itself, write two escape characters.
So this is a literal underscore in a LIKE pattern:
\_
and this is a single backslash followed by an "any character" pattern:
\\_
You want LIKE to see this:
raphodo/qrc\_resources.py%
PostgreSQL used to interpret C-stye backslash escapes in strings by default but no longer, now you have to use E'...' to use backslash escapes in string literals (unless you've changed the configuration options). The String Constants with C-style Escapes section of the manual covers this but the simple version is that these two:
name LIKE E'raphodo/qrc\\_resources.py%'
name LIKE 'raphodo/qrc\_resources.py%'
do the same thing as of PostgreSQL 9.1.
Presumably your Rails 2.3.8 app (or whatever is preparing your LIKE patterns) is assuming an older version of PostgreSQL than the one you're actually using. You'll need to adjust things to not double your backslashes (or prefix the pattern string literals with Es).

How to use Like Operator with "_" in MyBatis

I want to search "parent_id" with underscore "_"
t.parent_id like '%'+#parentPartID#+'%'
I want the results only _xxx or xx_xx or xxx_
Use ESCAPE subclause to specify escape symbol and use it to escape underscore which is special symbol handled by LIKE: t.parent_id LIKE '%\_%' ESCAPE '\'

PSQLException: ERROR: syntax error in tsquery

Which characters must be avoided to make sure PSQLException: ERROR: syntax error in tsquery will not occur?
The documentation does not say anything about how to escape the search string: http://www.postgresql.org/docs/8.3/static/datatype-textsearch.html
Use quotes around your terms if you want them as phrases/verbatim or they contain characters used in the syntax:
select to_tsquery('"hello there" | hi');
Bear in mind that you shouldn't really have crazy characters in your terms, since they are not going to match anything in the tsvector.
The (non-token) characters recognized by the tsquery parser are: \0 (null), (, ),   (whitespace), |, &, :, * and !. But how you tokenize your query should be based on how you have setup your dictionary. There are a great many other characters that you will likely not want in your query, not because they will cause a syntax error but because it means you are not tokenizing your query correctly.
Use the plainto_tsquery version if it's a simple AND query and you don't want to deal with creating the query manually.

PostgreSQL regexp_replace with matched expression

I am using PostgreSQL regexp_replace function to escape square brackets, parentheses and backslash in a string so that I could use that string as a regex pattern itself (there are other manipulations done on this string as well before using it, but they are outside the scope of this question. The idea is to replace:
[ with \[
] with \]
( with \(
) with \)
\ with \\
Postgres documentation page on regular expressions states the following:
The replacement string can contain \n, where n is 1 through 9, to
indicate that the source substring matching the n'th parenthesized
subexpression of the pattern should be inserted, and it can contain \&
to indicate that the substring matching the entire pattern should be
inserted. Write \ if you need to put a literal backslash in the
replacement text.
However regexp_replace('abc [def]', '([\[\]\(\)\\])', E'\\\1', 'g'); produces abc \ def\.
Further down on that same page, an example is given, which uses \\1 notation - so I tried that.
Yet, regexp_replace('abc [def]', '([\[\]\(\)\\])', E'\\\\1', 'g'); produces abc \1def\1.
I would guess this is expected, but regexp_replace('abc [def]', '([\[\]\(\)\\])', E'.\\1', 'g'); produces abc .[def.]. That is, escaping works with characters other than the standard backslash.
At this point I don't know how to proceed. What can I do to actually give me the replacement I want?
OK, found the answer. Apparently, I need to double-escape the backslash in the replacement. Also, I need to E-prefix and double-escape backslashes in the search pattern on older versions of postgres (8.3 in my case). The final code looks like this:
regexp_replace('abc [def]', E'([\\[\\]\\(\\)\\\\\?\\|_%])', E'\\\\\\1', 'g')
Yes, it looks horrible, but it works :)
it's simpliest way
select regexp_replace('abc [def]', '([\[\]\(\)\\])', '\\\1', 'g')