Search problem with special characters in PostgreSQL

SELECT * FROM "main_parse_user"
WHERE ("main_parse_user"."bio"::text ~* '\mFounder of JoJoWorld | Python'
OR "main_parse_user"."first_name"::text ~* '\mFounder of JoJoWorld | Python')
I'm searching for text with this query.
Sometimes the search terms contain '|'.
How can I make '|' be treated as an ordinary character?
With text that has no such characters, everything works correctly.

You'll have to escape characters that have a special meaning in regular expressions with a backslash to deprive them of their special meaning. Per the documentation:
\k (where k is a non-alphanumeric character) matches that character taken as an ordinary character, e.g., \\ matches a backslash character
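As a rough illustration of that rule, the search term could be escaped on the client before it is placed in the pattern. The sketch below is only an assumption about how that might look: the helper name `escape_pg_regex` and the exact set of characters treated as special are choices made for this example, not something from the PostgreSQL documentation.

```python
import re

# Characters escaped by this sketch (an assumed set of regex metacharacters).
_SPECIAL = re.compile(r'([.^$*+?()\[\]{}|\\])')


def escape_pg_regex(text):
    """Put a backslash in front of every character in the assumed set."""
    return _SPECIAL.sub(r'\\\1', text)


term = 'Founder of JoJoWorld | Python'
# \m (start of word) is kept as the only intentional regex construct.
pattern = r'\m' + escape_pg_regex(term)
print(pattern)
```

The resulting string is meant to be passed to the query as a bind parameter, so that no extra quoting of the SQL string literal is needed.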

Related

Postgres escape double quotes

I am working with a malformed database which seems to have double quotes as part of the column names.
Example:
|"Market" |
|---------|
|Japan |
|UK |
|USA |
And I want to select like below
SELECT "\"Market\"" FROM mytable; /* Does not work */
How does one select such a thing?
The documentation says
[A] delimited identifier or quoted identifier […] is formed by enclosing an arbitrary sequence of characters in double-quotes ("). […]
Quoted identifiers can contain any character, except the character with code zero. (To include a double quote, write two double quotes.)
So you'll want to use
SELECT """Market""" AS "Market" FROM mytable;
An alternative would be
A variant of quoted identifiers allows including escaped Unicode characters identified by their code points. This variant starts with U& (upper or lower case U followed by ampersand) immediately before the opening double quote, without any spaces in between, for example U&"foo". […] Inside the quotes, Unicode characters can be specified in escaped form by writing a backslash followed by the four-digit hexadecimal code point number or alternatively a backslash followed by a plus sign followed by a six-digit hexadecimal code point number.
which in your case would mean
SELECT U&"\0022Market\0022" AS "Market" FROM mytable;
SELECT U&"\+000022Market\+000022" AS "Market" FROM mytable;
Disclaimer: your database may not actually have double quotes as part of the name itself. As mentioned in the comments, this might just be how the tool you are using displays a column named Market (not market), since
Quoting an identifier also makes it case-sensitive
So all you might need could be
SELECT "Market" FROM mytable;
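If such identifiers have to be built in application code, the doubling rule quoted from the documentation can be sketched as a small helper. The function name `quote_ident` is an assumption made for this example; it only illustrates the rule and is not a replacement for whatever quoting facility your database driver provides.

```python
def quote_ident(name):
    """Wrap a name in double quotes, doubling any embedded double quote."""
    if "\x00" in name:
        # Quoted identifiers cannot contain the character with code zero.
        raise ValueError("identifier must not contain a NUL character")
    return '"' + name.replace('"', '""') + '"'


# A column whose name really contains the two quote characters:
print("SELECT " + quote_ident('"Market"') + " FROM mytable;")
# A column that is simply case-sensitive:
print("SELECT " + quote_ident('Market') + " FROM mytable;")
```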

Find and replace a href value with PowerShell?

I have a HTML file with a load of links in it.
They are in the format
http:/oldsite/showFile.asp?doc=1234&lib=lib1
I'd like to replace them with
http://newsite/?lib=lib1&doc=1234
(1234 and lib1 are variable)
Any idea on how to do that?
Thanks
P
I don't think your examples are correct.
http:/oldsite/showFile.asp?doc=1234&lib=lib1 should be
http:/oldsite/showFile.asp?doc=1234&lib=lib1
and
http://newsite/?lib=lib1&doc=1234 should be http://newsite?lib=lib1&doc=1234
To do the replacement on these, you can do
'http:/oldsite/showFile.asp?doc=1234&lib=lib1' -replace 'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)', 'http://newsite?$2&$1'
which returns http://newsite?lib=lib1&doc=1234
To replace these in a file you can use:
(Get-Content -Path 'X:\TheHtmlFile.html' -Raw) -replace 'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)', 'http://newsite?$2&$1' |
Set-Content -Path 'X:\TheNewHtmlFile.html'
Regex details:
http:/oldsite/showFile Match the characters “http:/oldsite/showFile” literally
\. Match the character “.” literally
asp Match the characters “asp” literally
\? Match the character “?” literally
( Match the regular expression below and capture its match into backreference number 1
doc= Match the characters “doc=” literally
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
& Match the character “&” literally
( Match the regular expression below and capture its match into backreference number 2
lib= Match the characters “lib=” literally
\w Match a single character that is a “word character” (letters, digits, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
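For comparison, the same capture-and-swap idea can be sketched outside PowerShell. The Python version below is only an illustration under the same assumptions as the answer (the link format shown in the question); the function name `rewrite_links` is made up for this example.

```python
import re

# Same pattern as above: group 1 is the doc=... part, group 2 the lib=... part.
OLD_LINK = re.compile(r'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)')


def rewrite_links(html):
    """Replace every old-style link, swapping the two query parameters."""
    return OLD_LINK.sub(r'http://newsite?\2&\1', html)


sample = '<a href="http:/oldsite/showFile.asp?doc=1234&lib=lib1">file</a>'
print(rewrite_links(sample))
```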
Read in the file, loop through each line, replace the old value with the new value, and send the output to a new file:
gc file.html | % { $_.Replace('oldsite...','newsite...') } | out-file new-file.html

Why does my LIKE statement fail with '\\_' for matching?

I have a database table with entries that look like this:
id | name | code_set_id
I have this particular entry that I need to find:
674272310 | raphodo/qrc_resources.py | 782732
In my rails app (2.3.8), I have a statement that evaluates to this:
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc\\_resources.py%';
From reading up on escaping, the above query is correct. This is supposed to correctly double escape the underscore. However this query does not find the record in the database. These queries will:
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc\_resources.py%';
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc_resources.py%';
Am I missing something here? Why is the first SQL statement not finding the correct entry?
A single backslash in the RHS of a LIKE escapes the following character:
9.7.1. LIKE
[...]
To match a literal underscore or percent sign without matching other characters, the respective character in pattern must be preceded by the escape character. The default escape character is the backslash but a different one can be selected by using the ESCAPE clause. To match the escape character itself, write two escape characters.
So this is a literal underscore in a LIKE pattern:
\_
and this is a single backslash followed by an "any character" pattern:
\\_
You want LIKE to see this:
raphodo/qrc\_resources.py%
PostgreSQL used to interpret C-style backslash escapes in string literals by default, but it no longer does: you now have to use E'...' to get backslash escapes in string literals (unless you've changed the configuration options). The String Constants with C-style Escapes section of the manual covers this, but the simple version is that these two:
name LIKE E'raphodo/qrc\\_resources.py%'
name LIKE 'raphodo/qrc\_resources.py%'
do the same thing as of PostgreSQL 9.1.
Presumably your Rails 2.3.8 app (or whatever is preparing your LIKE patterns) is assuming an older version of PostgreSQL than the one you're actually using. You'll need to adjust things to not double your backslashes (or prefix the pattern string literals with Es).
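To make the single-escape rule concrete, here is a minimal sketch of how a search term could be turned into a LIKE pattern on the client side. It assumes the default backslash escape character and assumes the pattern is sent as a bind parameter (or in a plain '...' literal on 9.1 or later), so no second layer of backslashes is needed. The function name `escape_like` is made up for this example.

```python
def escape_like(term, escape="\\"):
    """Escape the LIKE wildcards % and _ (and the escape character itself)."""
    return (
        term.replace(escape, escape + escape)  # must run first
            .replace("%", escape + "%")
            .replace("_", escape + "_")
    )


# Prefix search for a name that contains a literal underscore.
pattern = escape_like("raphodo/qrc_resources.py") + "%"
print(pattern)
```

The escape character is handled first so that the backslashes added for % and _ are not doubled afterwards.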

Why does this sed command contain # symbols?

I don't understand why the following sed command contains a # symbol:
sed 's#session\s*required\s*pam_loginuid.so#session optional pam_loginuid.so#g' -i /etc/pam.d/sshd
I've looked at /etc/pam.d/sshd for the before/after effects of this command:
BEFORE:
...
# Set the loginuid process attribute.
session required pam_loginuid.so
...
AFTER:
...
# Set the loginuid process attribute.
session optional pam_loginuid.so
....
Is the # symbol possibly part of regex or sed syntax?
Could not find any doco on this.
Note: The above sed command is actually part of a Dockerfile RUN command in tutorial:
https://docs.docker.com/examples/running_ssh_service/
These are alternate delimiters for the regular expressions and replacement string. Handy when your regex or replacement string includes '/'.
From the sed manual
The syntax of the s (as in substitute) command is ‘s/regexp/replacement/flags’. The / characters may be uniformly replaced by any other single character within any given s command. The / character (or whatever other character is used in its stead) can appear in the regexp or replacement only if it is preceded by a \ character.
From the POSIX specification:
[2addr]s/BRE/replacement/flags
Substitute the replacement string for instances of the BRE in the pattern space. Any character other than <backslash> or <newline> can be used instead of a <slash> to delimit the BRE and the replacement. Within the BRE and the replacement, the BRE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
As others say, it is a delimiter other than the traditional / in the s/// command. This is usually done when / is part of the pattern or replacement, for example when searching for (or replacing with) a Unix path, where each / would otherwise need escaping:
s/\/my\/path/\/Your\/path/
# same as
s#/my/path#/Your/path#
You usually pick a non-alphanumeric character (although an alphanumeric one is allowed). The only practical constraint is to avoid a character with a special meaning in regular expressions (such as ^$[]{}()+\*.): the command still works, but it is hard to read, and the character cannot be used with its special meaning inside the pattern:
echo "b(a)l" | sed 's(.)()('
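The delimiter rule can also be illustrated with a small parser. This is a simplified sketch, not sed's actual implementation: the function name `split_s_command` is invented for the example, and it only models the two points made above, namely that the character right after s becomes the delimiter and that a backslash-escaped delimiter is taken literally.

```python
def split_s_command(command):
    """Split a sed-style s command into (pattern, replacement, flags)."""
    if len(command) < 2 or command[0] != "s":
        raise ValueError("not an s command")
    delim = command[1]
    if delim in ("\\", "\n"):
        raise ValueError("backslash and newline cannot be delimiters")
    parts, current, i = [], [], 2
    while i < len(command) and len(parts) < 2:
        ch = command[i]
        if ch == "\\" and i + 1 < len(command):
            nxt = command[i + 1]
            # An escaped delimiter becomes a literal delimiter character;
            # any other escape sequence is kept unchanged.
            current.append(nxt if nxt == delim else ch + nxt)
            i += 2
        elif ch == delim:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if len(parts) != 2:
        raise ValueError("unterminated s command")
    return parts[0], parts[1], command[i:]


# Both spellings describe the same pattern and replacement.
print(split_s_command(r"s/\/my\/path/\/Your\/path/"))
print(split_s_command("s#/my/path#/Your/path#"))
```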

What does this Perl while loop mean?

while ($aaa =~ m/= "(\D.*?)"/g)
I figured that it matches while $aaa is like anything = "something" it returns something (without the quotation mark).
But what does this piece of code mean?
m/= "(\D.*?)"/
You seem to have figured out most of it. The =, the space, and the " each match those characters literally. The () capture part of the matched string and make it available as $1. The part inside the parentheses matches one non-digit character (\D), followed by zero or more (*?) non-newline characters (.), as few as possible, up to the next ". A plain * would also match zero or more times, but it prefers to match as many characters as possible, so it would end up matching up to the last " in the string instead of the next one, as *? does.
All of this is documented in perlre.
The equals sign and quotation mark are taken literally, \D means any non-digit character, and .*? means zero or more characters of any kind other than a newline, as few as possible.
From left to right:
m/= "(\D.*?)"/g
match operator,
start regex:
equals sign, whitespace, double quotation mark,
start group:
one non-digit character, zero or more characters,
end group,
double quotation mark,
end regex
match globally
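The difference between the lazy .*? and a greedy .* is easiest to see by running both. The sketch below uses Python's re module rather than Perl, purely as an illustration; the sample string is made up, and re.findall plays the role of the /g loop by returning every captured group.

```python
import re

# Hypothetical input: three assignments, the last value starts with a digit.
text = 'x = "foo" y = "bar" z = "9x"'

# Lazy version, as in the question: stops at the next double quote.
lazy = re.findall(r'= "(\D.*?)"', text)

# Greedy version: runs on to the last double quote in the string.
greedy = re.findall(r'= "(\D.*)"', text)

print(lazy)    # ['foo', 'bar']
print(greedy)  # ['foo" y = "bar" z = "9x']
```

The value "9x" is skipped by the lazy pattern because \D requires the first character after the opening quote to be a non-digit.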