confused about what must be escaped for sed - sed

I want to replace specific strings in PHP files automatically using sed. Some replacements work and some do not. I have already established that the problem lies in the string to be replaced, not in the replacement string. I tried escaping [ and ] with no success. The culprit seems to be the whitespace inside the parentheses, not whitespace in general: the first spaces (around the =) cause no problems. Can someone please point me to the problem?
sed -e "1,\$s/$adm = substr($path . rawurlencode($upload['name']) , 16);/$adm = rawurlencode($upload['name']); # fix 23/g" -i administration/identify.php
I also tried shortening the string to be replaced: if I cut it directly after $path it works, but with the following whitespace included it does not. Escaping the whitespace has no effect.

what must be escaped for sed
The following characters have special meaning in sed and must be escaped with \ for the regex to be taken literally:
\
[
the character that separates the parts of the s command, i.e. / here
.
*
& (only in the replacement string)
A newline is handled specially because it ends the string, but it can be written as \n.
So first escape all special characters in the input and then pass it to sed:
# Escape each $ so the shell does not expand it inside double quotes.
rgx="\$adm = substr(\$path . rawurlencode(\$upload['name']) , 16);"
rgx_escaped=$(sed 's/[\\\[\.\*\/&]/\\&/g' <<<"$rgx")
sed "s/$rgx_escaped/ etc."
See Escape a string for a sed replace pattern for a generic escaping solution.
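The escaping step can be sketched as a small helper, applied here to the string from the question. This is only a sketch under stated assumptions: the function name escape_sed_pattern is hypothetical, the pattern is a POSIX BRE, the delimiter is /, and the input contains no newlines. It covers the search pattern only, not the replacement side.

```shell
# Hypothetical helper: escape a literal string so it can be used as a
# sed BRE pattern with "/" as the delimiter.
# The helper itself uses "|" as its delimiter so that "/" and "\" can
# sit inside the bracket expression without extra escaping.
escape_sed_pattern() {
    printf '%s\n' "$1" | sed 's|[[\\/.*^$]|\\&|g'
}

# Single quotes stop the shell from expanding $adm, $path and $upload;
# each embedded single quote is written as '\''.
rgx='$adm = substr($path . rawurlencode($upload['\''name'\'']) , 16);'
rgx_escaped=$(escape_sed_pattern "$rgx")

# A sample input line identical to the text we want to match.
line='$adm = substr($path . rawurlencode($upload['\''name'\'']) , 16);'
result=$(printf '%s\n' "$line" | sed "s/$rgx_escaped/REPLACED/")

printf '%s\n' "$rgx_escaped"
printf '%s\n' "$result"
```

The escaped pattern is interpolated inside double quotes on purpose: the shell expands $rgx_escaped once and does not rescan the result, so the backslashes added by the helper reach sed intact.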

You may use
sed -i 's/\$adm = substr(\$path \. rawurlencode(\$upload\['"'"'name'"'"']) , 16);/$adm = rawurlencode($upload['"'"'name'"'"']); # fix 23/g' administration/identify.php
Note:
The sed command is wrapped in single quotes, so the shell performs no variable expansion inside it.
In POSIX BRE syntax, ( and ) match literal parentheses and need no escaping, but [ and . must be escaped to match themselves.
Each literal single quote has to be written as '"'"': close the single-quoted string, add a double-quoted single quote, and reopen the single-quoted string.
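The effect of the quoting style can be made visible with a shortened version of the pattern. This is a minimal sketch, assuming the shell variables adm and path are unset (as they would be in a typical session); the shortened pattern and the sample line are made up for illustration.

```shell
# Make sure the demonstration does not depend on the environment.
unset adm path

# Double quotes: the shell expands $adm and $path (to nothing here)
# before sed ever sees the script.
double="s/$adm = substr($path . x/new/"
# Single quotes: the text reaches sed unchanged.
single='s/$adm = substr($path . x/new/'

printf '%s\n' "$double"
printf '%s\n' "$single"

line='$adm = substr($path . x'
# The double-quoted script no longer contains "$path", so it cannot match.
with_double=$(printf '%s\n' "$line" | sed "$double")
# The single-quoted script matches the whole line.
with_single=$(printf '%s\n' "$line" | sed "$single")

printf '%s\n' "$with_double"
printf '%s\n' "$with_single"
```

This also suggests why cutting the pattern right after $path appeared to work: with the variables expanded to nothing, the remaining pattern " = substr(" still occurs in the file, whereas the longer pattern expects a space directly after the opening parenthesis.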

Related

What do I miss in this sed Expression?

I'd like to replace the database server of a horde config file from "localhost" to a remote server (I use "database.contoso.com" as a placeholder).
The file in question is /var/www/horde/config/conf.php.
The line in the file looks like this:
$conf['sql']['hostspec'] = 'localhost';
Now I have created a sed line like so:
sed s/\$conf\[\'sql\'\]\[\'hostspec\'\]\ \=\ \'localhost\'\;/\$conf\[\'sql\'\]\[\'hostspec\'\]\ \=\ \'database\.contoso\.com\'\;/ /var/www/horde/config/conf.php
But for whatever reason, it does not work. (I am leaving out the -i option for now.)
While trying to figure out why it does not work, I did this:
echo "\$conf['sql']['hostspec'] = 'localhost';"|sed s/\$conf\[\'sql\'\]\[\'hostspec\'\]\ \=\ \'localhost\'\;/\$conf\[\'sql\'\]\[\'hostspec\'\]\ \=\ \'database\.contoso\.com\'\;/
which returns this:
$conf['sql']['hostspec'] = 'localhost';
but it should return:
$conf['sql']['hostspec'] = 'database.contoso.com';
What am I missing?
Following "Escape a string for a sed replace pattern", this would work in your case:
KEYWORD="\$conf['sql']['hostspec'] = 'localhost';"
REPLACE="\$conf['sql']['hostspec'] = 'database.contoso.com';"
ESCAPED_REPLACE=$(printf '%s\n' "$REPLACE" | sed -e 's/[\/&]/\\&/g')
ESCAPED_KEYWORD=$(printf '%s\n' "$KEYWORD" | sed -e 's/[]\/$*.^[]/\\&/g');
sed "s/$ESCAPED_KEYWORD/$ESCAPED_REPLACE/"
The immediate problem is that you are not quoting enough. Without quotes, the shell consumes every backslash before sed sees the script, so sed receives a bare [ where you typed \[. To match a regex metacharacter literally, sed must receive a literal backslash followed by the character, for example \[. The simplest solution by far is to put single quotes around the expression and then backslash only the characters that are regex metacharacters.
Literal single quotes inside single quotes are still challenging. Here, I have chosen to end the single-quoted string, insert a backslash-escaped but otherwise unquoted single quote, and add an opening single quote to continue with another single-quoted string. The shell glues these together into a single string.
echo "\$conf['sql']['hostspec'] = 'localhost';" |
sed 's/\$conf\['\''sql'\''\]\['\''hostspec'\''\] = '\''localhost'\'';/$conf['\''sql'\'']['\''hostspec'\''] = '\''database.contoso.com'\'';/'
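The gluing of quoted pieces may be easier to see on a shorter string. The following is a minimal sketch; the variable names and the example.invalid host are made up for illustration.

```shell
# Three pieces glued together by the shell:
#   'it'   single-quoted text
#   \'     one backslash-escaped, unquoted single quote
#   's'    single-quoted text again
glued='it'\''s'
printf '%s\n' "$glued"

# The same technique inside a sed script that must match literal quotes.
result=$(printf '%s\n' "host = 'localhost';" |
    sed 's/'\''localhost'\''/'\''example.invalid'\''/')
printf '%s\n' "$result"
```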
A better solution generally is to use backreferences to quote back part of the matched string so you don't have to repeat it.
echo "\$conf['sql']['hostspec'] = 'localhost';" |
sed 's/\(\$conf\['\''sql'\''\]\['\''hostspec'\''\] = '\''\)[^'\'']*'\'';/\1database.contoso.com'\'';/'
Demo: https://ideone.com/RA0MSi
A much much much better solution is to change your PHP script so that this setting can be overridden with an option, an environment variable, and/or a configuration file.
This might work for you (GNU sed & shell):
sed -E 's/(\$conf\[('\'')sql\2]\[\2hostspec\2\] = )\2localhost\2;/\1\2database.contoso.com\2;/' file
Use pattern matching to match and replace.
N.B. Certain metacharacters, i.e. $, [ and ], must be escaped or quoted, and because the sed command is surrounded by single quotes, each single quote within the substitution command must be written as '\'' (see here for reasoning). Back references can be used in both the RHS and the LHS of the substitution command; in the LHS especially, they shorten the overall command and perhaps make the regexp more readable.
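A back reference in the LHS can also be shown with plain BRE syntax, which does not need -E. This is a small sketch with made-up input lines: the group captures whichever quote character opens the value, and \1 in the pattern requires the same character to close it.

```shell
# \(["']\) captures the opening quote; the \1 inside the pattern demands
# the same quote character after "localhost". In the replacement, \1
# puts that quote character back on both sides.
script='s/\(["'\'']\)localhost\1/\1database.contoso.com\1/'

single_quoted=$(printf '%s\n' "host = 'localhost';" | sed "$script")
double_quoted=$(printf '%s\n' 'host = "localhost";' | sed "$script")
# Opening and closing quotes differ, so the pattern does not match.
mismatched=$(printf '%s\n' "host = 'localhost\";" | sed "$script")

printf '%s\n' "$single_quoted" "$double_quoted" "$mismatched"
```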

Using sed -e to replace slash

I am trying to understand what the command below does, in particular the -e option and the exclamation marks:
sed -e "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" "templates/network-policies-${ns}.yaml"
This command replaced VPC_CIDR with 1.2.3.4/16.
Could someone shed some light on this, please?
The -e option just tells sed that the next argument is the script to execute. "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" is the script.
The use of " is strange. I would just write "s!VPC_CIDR!$(get_cluster_vpc_cidr)!g". Because $(get_cluster_vpc_cidr) is not within " quotes, its result undergoes word splitting and filename expansion, i.e. it will fail on spaces, and * or ? characters may behave strangely.
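The word-splitting problem can be observed by counting the arguments the shell produces. This is a sketch only: get_cidr and count_args are hypothetical stand-ins, and the output of get_cidr deliberately contains a space.

```shell
# Hypothetical stand-in for get_cluster_vpc_cidr; the space in the
# output is there on purpose to trigger word splitting.
get_cidr() { printf '%s\n' '10.0.0.0/16 extra'; }

# Prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

# Command substitution outside the quotes: the result is split on the space.
unquoted=$(count_args "s!VPC_CIDR!"$(get_cidr)"!g")
# Command substitution inside the quotes: the script stays one argument.
quoted=$(count_args "s!VPC_CIDR!$(get_cidr)!g")

printf 'unquoted: %s argument(s)\n' "$unquoted"
printf 'quoted:   %s argument(s)\n' "$quoted"
```

With the split version, sed would receive an incomplete script as its first argument and treat the second word as a file name.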
"s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" is a sed script. According to man 1 sed, the s command does the following:
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
You might object that ! is not /. But, as man 1 sed itself says, "This is just a brief synopsis of sed commands to serve as a reminder to those who already know sed." The POSIX sed page or man 7 sed sheds some more light:
[2addr]s/BRE/replacement/flags
Substitute the replacement string for instances of the BRE in the pattern space. Any character other than <backslash> or <newline> can be used instead of a <slash> to delimit the BRE and the replacement. Within the BRE and the replacement, the BRE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
Any character. You can even pass byte 0x01, as in sed $'s\x01BRE\x01replacement\x01', and it is a valid script.
So the s!VPC_CIDR!$(get_cluster_vpc_cidr)!g command replaces every occurrence (the g flag means global) of the VPC_CIDR string (the string is literal; it contains no special regex characters) with the output of $(get_cluster_vpc_cidr) (except that & and \1 and such are interpreted specially in the replacement part).
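Both points, the alternative delimiter and the special meaning of & in the replacement, can be tried on a one-line input. This is a minimal sketch; the CIDR value and the input line are made up.

```shell
cidr='1.2.3.4/16'

# With ! as the delimiter, the / inside the replacement needs no escaping.
with_bang=$(printf '%s\n' 'cidr: VPC_CIDR' | sed "s!VPC_CIDR!$cidr!g")

# In the replacement, & stands for the text that the pattern matched.
with_amp=$(printf '%s\n' 'cidr: VPC_CIDR' | sed 's!VPC_CIDR!<&>!g')

printf '%s\n' "$with_bang" "$with_amp"
```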

Why does q/\\a/ equal q/\a/?

The following example prints "SAME":
if (q/\\a/ eq q/\a/) {
print "SAME\n";
}
else {
print "DIFFERENT\n";
}
I understand this is consistent with the documentation, but I think this behavior is undesirable. Is there a need to escape a backslash literal in a single-quoted string? If I wanted 2 backslashes, I'd need to specify 4; this does not seem convenient.
Shouldn't Perl detect whether a backslash serves as an escape character or not? For instance, when a backslash does not precede a delimiter, it should be treated as a literal; and if that were the case, I wouldn't need 3 backslashes to express two, e.g.,
q<a\\b>
instead of
q<a\\\b>.
Is there a need to escape a backslash in a single-quoted string?
Yes, if the backslash is followed by another backslash, or is the last character in the string:
$ perl -e'print q/C:\/'
Can't find string terminator "/" anywhere before EOF at -e line 1.
$ perl -e'print q/C:\\/'
C:\
This makes it possible to include any character in a single-quoted string, including the delimiter and the escape character.
If I wanted 2 backslashes, I'd need to specify 4; this does not seem convenient.
Actually, you only need three (because the second backslash isn't followed by another backslash). But as an alternative, if your string contains a lot of backslashes you can use a single-quoted heredoc, which requires no escaping:
my $path = <<'END';
C:\a\very\long\path
END
chomp $path;
print $path; # C:\a\very\long\path
Note that the heredoc adds a newline to the end, which you can remove with chomp.
In single-quoted string literals,
A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated.
In other words,
You must escape delimiters.
You must escape \ that are followed by \ or the delimiter.
You may escape \ that aren't followed by \ or the delimiter.
So,
q/\// ⇒ /
q/\\\\a/ ⇒ \\a
q/\\\a/ ⇒ \\a
q/\\a/ ⇒ \a
q/\a/ ⇒ \a
Is there a need to escape a backslash in a single-quoted string?
Yes, if it's followed by another backslash or the delimiter.
If I wanted 2 backslashes, I'd need to specify 4
Three would suffice.
this does not seem convenient.
It's more convenient than double-quoted strings, where backslashes must always be escaped. Single-quoted strings require the minimum amount of escaping possible without losing the ability to produce the delimiter.

Why does sed command contain at symbols

I don't understand why the following sed command contains a # symbol:
sed 's#session\s*required\s*pam_loginuid.so#session optional pam_loginuid.so#g' -i /etc/pam.d/sshd
I've looked at /etc/pam.d/sshd for the before/after effects of this command:
BEFORE:
...
# Set the loginuid process attribute.
session required pam_loginuid.so
...
AFTER:
...
# Set the loginuid process attribute.
session optional pam_loginuid.so
....
Is the # symbol possibly part of regex or sed syntax?
I could not find any documentation on this.
Note: The above sed command is actually part of a Dockerfile RUN command in tutorial:
https://docs.docker.com/examples/running_ssh_service/
These are alternate delimiters for the regular expressions and replacement string. Handy when your regex or replacement string includes '/'.
From the sed manual
The syntax of the s (as in substitute) command is ‘s/regexp/replacement/flags’. The / characters may be uniformly replaced by any other single character within any given s command. The / character (or whatever other character is used in its stead) can appear in the regexp or replacement only if it is preceded by a \ character.
From the POSIX specification:
[2addr]s/BRE/replacement/flags
Substitute the replacement string for instances of the BRE in the pattern space. Any character other than <backslash> or <newline> can be used instead of a <slash> to delimit the BRE and the replacement. Within the BRE and the replacement, the BRE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
As others say, it is a delimiter other than the traditional / in the s/// action. It is usually used when / is part of the pattern or replacement, such as a Unix path, which would otherwise need every / escaped:
s/\/my\/path/\/Your\/path/
# same as
s#/my/path#/Your/path#
You usually pick a character that is not alphanumeric (although you can use one). The only (logical) constraint is to avoid characters with special meaning in a regex (such as ^$[]{}()+\*.): they still work as delimiters, but the command becomes hard to read and the character loses its special meaning in the pattern:
echo "b(a)l" | sed 's(.)()('
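That the two spellings of the path substitution behave identically can be checked directly. This is a small sketch; the input line is made up.

```shell
input='cd /my/path'

# Default delimiter: every / in the pattern and replacement is escaped.
slashes=$(printf '%s\n' "$input" | sed 's/\/my\/path/\/Your\/path/')
# # as delimiter: the paths can be written as they are.
hashes=$(printf '%s\n' "$input" | sed 's#/my/path#/Your/path#')

printf '%s\n' "$slashes" "$hashes"
```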

How can I get sed to remove `\` followed by anything?

I am trying to write a sed script to convert LaTeX coded tables into tab delimited tables.
To do this I need to convert & into \t and strip out anything that is preceded by \.
This is what I have so far:
s/&/\t/g
s/\*/" "/g
The first line works as intended. In the second line I try to replace \ followed by anything with a space but it doesn't alter the lines with \ in them.
Any suggestions are appreciated. Also, can you briefly explain what suggested scripts "say"? I am new to sed and that really helps with the learning process!
Thanks
Assuming you're running this as a sed script, and not directly on the command line:
s/\\.*/ /g
Explanation:
\\ - a double backslash matches a literal backslash (a single \ is interpreted as "escape the following character"). It is followed by .* (. matches any single character, * repeats it arbitrarily many times).
You need to escape the backslash as it is a special character.
To denote "any character", use . (a period).
The second expression should be:
s/\\.//g
I hope I understood your intention and you want to strip the character after the backslash. If you want to delete all the characters in the line after the backslash, add a star (*) after the period.
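The difference between the two suggested expressions can be seen on a single sample line. This is a minimal sketch run from the command line with single-quoted scripts, where the backslashes reach sed unchanged; the sample line is made up.

```shell
line='x\hline y'

# \\.* matches the backslash and everything after it on the line.
strip_rest=$(printf '%s\n' "$line" | sed 's/\\.*/ /')
# \\. matches the backslash and exactly one following character.
strip_one=$(printf '%s\n' "$line" | sed 's/\\.//g')

printf '[%s]\n' "$strip_rest" "$strip_one"
```

The first form leaves "x " (the rest of the line is replaced by one space); the second removes only the backslash and the h, leaving "xline y".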