Why does my LIKE statement fail with '\\_' for matching? - postgresql

I have a database entry that has entries that look like this:
id | name | code_set_id
I have this particular entry that I need to find:
674272310 | raphodo/qrc_resources.py | 782732
In my rails app (2.3.8), I have a statement that evaluates to this:
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc\\_resources.py%';
From reading up on escaping, the above query is correct. This is supposed to correctly double escape the underscore. However this query does not find the record in the database. These queries will:
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc\_resources.py%';
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc_resources.py%';
Am I missing something here? Why is the first SQL statement not finding the correct entry?

A single backslash in the RHS of a LIKE escapes the following character:
9.7.1. LIKE
[...]
To match a literal underscore or percent sign without matching other characters, the respective character in pattern must be preceded by the escape character. The default escape character is the backslash but a different one can be selected by using the ESCAPE clause. To match the escape character itself, write two escape characters.
So this is a literal underscore in a LIKE pattern:
\_
and this is a single backslash followed by an "any character" pattern:
\\_
You want LIKE to see this:
raphodo/qrc\_resources.py%
PostgreSQL used to interpret C-stye backslash escapes in strings by default but no longer, now you have to use E'...' to use backslash escapes in string literals (unless you've changed the configuration options). The String Constants with C-style Escapes section of the manual covers this but the simple version is that these two:
name LIKE E'raphodo/qrc\\_resources.py%'
name LIKE 'raphodo/qrc\_resources.py%'
do the same thing as of PostgreSQL 9.1.
Presumably your Rails 2.3.8 app (or whatever is preparing your LIKE patterns) is assuming an older version of PostgreSQL than the one you're actually using. You'll need to adjust things to not double your backslashes (or prefix the pattern string literals with Es).

Related

Postgres escape double quotes

I am working with a malformed database which seems to have double quotes as part of the column names.
Example:
|"Market" |
|---------|
|Japan |
|UK |
|USA |
And I want to select like below
SELECT "\"Market\"" FROM mytable; /* Does not work */
How does one select such a thing?
The documentation says
[A] delimited identifier or quoted identifier […] is formed by enclosing an arbitrary sequence of characters in double-quotes ("). […]
Quoted identifiers can contain any character, except the character with code zero. (To include a double quote, write two double quotes.)
So you'll want to use
SELECT """Market""" AS "Market" FROM mytable;
An alternative would be
A variant of quoted identifiers allows including escaped Unicode characters identified by their code points. This variant starts with U& (upper or lower case U followed by ampersand) immediately before the opening double quote, without any spaces in between, for example U&"foo". […] Inside the quotes, Unicode characters can be specified in escaped form by writing a backslash followed by the four-digit hexadecimal code point number or alternatively a backslash followed by a plus sign followed by a six-digit hexadecimal code point number.
which in your case would mean
SELECT U&"\0022Market\0022" AS "Market" FROM mytable;
SELECT U&"\+000022Market\+000022" AS "Market" FROM mytable;
Disclaimer: your database may not actually have double quotes as part of the name itself. As mentioned in the comments, this might just be the way in which the tool you are using does display a column named Market (not market) since
Quoting an identifier also makes it case-sensitive
So all you might need could be
SELECT "Market" FROM mytable;

POSTGRES 🐘- combine LIKE with exception(E') literal

I'm trying to match a certain text that includes a single quote (i.e. 'company's report...')
normally I would have used the E' literal + ' or double single quotes.
but when it gets to using the LIKE '%' operator, things got complicated.
what is the best approach to match a text with a single quote?
You can escape single quote with another single quote. Example:
WHERE column LIKE 'RSNboim''s'
From https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS
To include a single-quote character within a string constant, write two adjacent single quotes, e.g., 'Dianne''s horse'. Note that this is not the same as a double-quote character (").
You can use Dollar-quoted String Constants at Lexical Structure
Your condition should be something like below;
select * from atable
where afield like $$Dianne's %$$

Oddities in fail2ban regex

This appears to be a bug in fail2ban, with different behaviour between the fail2ban-regex tool and a failregex filter
I am attempting to develop a new regex rule for fail2ban, to match:
\"%20and%20\"x\"%3D\"x
When using fail2ban-regex, this appears to produce the desired result:
^<HOST>.*GET.*\\"%20and%20\\"x\\"%3D\\"x.* 200.*$
As does this:
^<HOST>.*GET.*\\\"%20and%20\\\"x\\\"%3D\\\"x.* 200.*$
However, when I put either of these into a filter, I get the following error:
Failed during configuration: '%' must be followed by '%' or '(', found:…
To have this work in a filter you have to double-up the ‘%’, ie ‘%%’:
^<HOST>.*GET.*\\\"%%20and%%20\\\"x\\\"%%3D\\\"x.* 200.*$
While this gets the required hits running as a filter, it gets none running through fail2ban-regex.
I tried the \\\\ as Andre suggested below, but this gets no results in fail2ban-regex.
So, as this appears to be differential behaviour, I am going to file it as a bug.
According to Python's own site a singe backslash "\" has to be written as "\\\\" and there's no mention of %.
Regular expressions use the backslash character ('') to indicate
special forms or to allow special characters to be used without
invoking their special meaning. This collides with Python’s usage of
the same character for the same purpose in string literals; for
example, to match a literal backslash, one might have to write '\\'
as the pattern string, because the regular expression must be \, and
each backslash must be expressed as \ inside a regular Python string
literal
I would just go with:
failregex = (?i)^<HOST> -.*"(GET|POST|HEAD|PUT).*20and.*3d.*$
the .* wil match anything inbetween anyways and (?i) makes the entire regex case-insensitive

YAML, Docker Compose, Spaces & Quotes

Under what circumstances must one use quotes in a YAML file, specifically when using docker-compose.
For instance,
service:
image: "my-registry/repo:tag1"
environment:
ENV1: abc
ENV2: "abc"
ENV3: "a b c"
If spaces are required, for example, must one use quotes around the environment variable, as depicted in ENV3?
After some googling I've found a blog post
that touches this problem as I understood it.
I'll cite the most important part here:
plain scalars:
- a string
- a string with a \ backslash that doesn't need to be escaped
- can also use " quotes ' and $ a % lot /&?+ of other {} [] stuff
single quoted:
- '& starts with a special character, needs quotes'
- 'this \ backslash also does not need to be escaped'
- 'just like the " double quote'
- 'to express one single quote, use '' two of them'
double quoted:
- "here we can use predefined escape sequences like \t \n \b"
- "or generic escape sequences \x0b \u0041 \U00000041"
- "the double quote \" needs to be escaped"
- "just like the \\ backslash"
- "the single quote ' and other characters must not be escaped"
literal block scalar: |
a multiline text
line 2
line 3
folded block scalar: >
a long line split into
several short
lines for readability
Also I have not seen such docker-compose syntax to set env variables. Documentation suggests using simple values like
environment:
- ENV1=abc
- "ENV2=abc"
Where quotes " or ' are optional in this particular example according to what I've said earlier.
To see how to include spaces in env variables you can check out this so answer
Whether or not you need quotes, depends on the parser. Docker-compose AFAIK is still relying on the PyYAML module and that implements most of YAML 1.1 and has a few quirks of its own.
In general you only need to quote what could otherwise be misinterpreted or clash with some YAML construct that is not a scalar string. You also need (double) quotes for things that cannot be represented in plain scalars, single quoted scalars or block style literal or folded scalars.
Misinterpretation
You need to quote strings that look like some of the other data structures:
booleans: "True", "False", but PyYAML also assumes alternatives words like "Yes", "No", "On", "Off" represent boolean values ( and the all lowercase, all uppercase versions should be considered as well). Please note that the YAML 1.2 standard removed references to these alternatives.
integers: this includes string consisting of numbers only. But also hex (0x123) and octal number (0123). The octals in YAML 1.2 are written as 0o123, but PyYAML doesn't support this, however it is best to quote both.
A special integer that PyYAML still supports but again not in the YAML 1.2 specification are sexagesimals: base 60 number separated by colon (:), time indications, but also MAC addresses can be interpreted as such if the values between/after the colons are in the range 00-59
floats: strings like 1E3 (with optional sign ans mantissa) should be quoted. Of course 3.14 needs to be quoted as well if it is a string. And sexagesimal floats (with a mantissa after the number after the final colon) should be quoted as well.
timestamps: 2001-12-15T02:59:43.1Z but also iso-8601 like strings should be quoted to prevent them from being interpreted as timestamps
The null value is written as the empty string, as ~ or Null (in all casing types), so any strings matching those need to be quoted.
Quoting in the above can be done with either single or double quotes, or block style literal or folded scalars can be used. Please note that for the block-style you should use |- resp. >- in order not to introduce a trailing newline that is not in the original string.
Clashes
YAML assigns special meaning to certain characters or character combinations. Some of these only have special meaning at the beginning of a string, others only within a string.
characters fromt the set !&*?{[ normally indicate special YAML constructs. Some of these might be disambiguated depending on the following character, but I would not rely on that.
whitespace followed by # indicates an end of line comment
wherever a key is possible (and within block mode that is in many places) the combination of colon + space (:) indicates a value will be following. If that combination is part of your scalar string, you have to quote.
As with the misinterpretation you can use single or double quoting or block-style literal or folding scalars. There can be no end-of-line comments beyond the first line of a block-style scalar.
PyYAML can additionally get confused by any colon + space within a plain scalar (even when this is in a value) so always quote those.
Representing special characters
You can insert special characters or unicode code-points in a YAML file, but if you want these to be clearly visible in all cases, you might want to use escape sequences. In that case you have to use double quotes, this is the only mode that
allows backslash escapes. And e.g. \u2029. A full list of such escapes can be taken from the standard, but note that PyYAML doesn't implement e.g \/ (or at least did not when I forked that library).
One trick to find out what to quote or not is to use the library used to dump the strings that you have. My ruamel.yaml and PyYAML used by docker-compose, when potentially dumping a plain scalar, both try to read back (yes, by parsing the result) the plain scalar representation of a string and if that results in something different than a string, it is clear quotes need to be applied. You can do so too: when in doubt write a small program dumping the list of strings that you have using PyYAML's safe_dump() and apply quotes anywhere that PyYAML does.

SCALA Replace with $

I want replace a Letter with a literal $. I tried:
var s = string.replaceAll("Register","$10")
I want that this text Register saved to be changed to: $10 saved
Illegal group reference is the error I get.
If you look at the scaladoc for replaceAll, you'll see that it takes a regular expression string as the parameter. Escape the $ with a \, or use replaceAllLiterally
replaceAll uses a regular expressions to find the match. In the replacement string $ is a special character that refers to a specific capture group in the matching string. You have no capture groups so this is an error. It's not what you want anyway since you want the literal text "$10".
Usereplaceinstead ofreplaceAll`. It just does a direct string replacement.