sed rare-delimiter (other than & | / ?...) - sed

I am using the Unix sed command on a string that can contain all types of characters (&, |, !, /, ?, etc).
Is there a complex delimiter (with two characters?) that can fix the error:
sed: -e expression #1, char 22: unknown option to `s'

The characters in the input file are of no concern - sed parses them fine. There may be an issue, however, if you have most of the common characters in your pattern - or if your pattern may not be known beforehand.
At least on GNU sed, you can use a non-printable character that is highly improbable to exist in your pattern as a delimiter. For example, if your shell is Bash:
$ echo '|||' | sed s$'\001''|'$'\001''/'$'\001''g'
In this example, Bash replaces $'\001' with the character that has the octal value 001 - in ASCII it's the SOH character (start of heading).
Since such characters are control/non-printable characters, it's doubtful that they will exist in the pattern. Unless, that is, you are doing something weird like modifying binary files - or Unicode files without the proper locale settings.
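For shells without the $'...' quoting, the same idea can be sketched portably by building the SOH character with printf. The helper name replace_all is hypothetical, and the pattern is still interpreted as a basic regular expression; only the delimiter problem goes away.

```shell
# Assumption: replace_all is a hypothetical helper name, not part of sed.
soh=$(printf '\001')    # the SOH control character, built without $'...'

replace_all() {
    # $1 = pattern (still a basic regular expression), $2 = replacement
    sed "s${soh}$1${soh}$2${soh}g"
}

# The pattern contains both / and |, which are common delimiter choices.
result=$(printf '%s\n' 'a/b|c' | replace_all '/b|' '-')
printf '%s\n' "$result"
```

Because the delimiter is a control character, neither the / nor the | in the pattern needs escaping.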

Another way to do this is to use Shell Parameter Substitution.
${parameter/pattern/replace} # substitute replace for pattern once
or
${parameter//pattern/replace} # substitute replace for pattern everywhere
Here is a quite complex example that is difficult with sed:
$ parameter="Common sed delimiters: [sed-del]"
$ pattern="\[sed-del\]"
$ replace="[/_%:\\#]"
$ echo "${parameter//$pattern/$replace}"
result is:
Common sed delimiters: [/_%:\#]
However, this only works on Bash parameters, not on files, where sed excels.

There is no such option for multi-character expression delimiters in sed, but I doubt you need that. The delimiter character should not occur in the pattern, but if it appears in the string being processed, it's not a problem. And unless you're doing something extremely weird, there will always be some character that doesn't appear in your search pattern that can serve as a delimiter.
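Picking such a character can be automated. The following is a minimal sketch, assuming a short list of candidate delimiters is enough; pick_delim is a hypothetical helper name, and the pattern and replacement are still subject to sed's usual regex and & rules.

```shell
# Assumption: pick_delim is a hypothetical helper; the candidate list is arbitrary.
pick_delim() {
    # Print the first candidate that occurs in neither $1 nor $2.
    for d in '/' '|' '#' ',' ':' '@' '%'; do
        case "$1$2" in
            *"$d"*) ;;                          # candidate is taken, try the next
            *) printf '%s' "$d"; return 0 ;;
        esac
    done
    return 1                                    # every candidate appears somewhere
}

pattern='a/b'
replacement='c|d'
d=$(pick_delim "$pattern" "$replacement")
result=$(printf '%s\n' 'xa/by' | sed "s${d}${pattern}${d}${replacement}${d}")
printf '%s\n' "$result"
```

Here / and | are both in use, so the helper falls through to #.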

You need the nested delimiter facility that Perl offers. It lets you match, substitute, and transliterate without worrying about the delimiter appearing in your contents. Since Perl is a superset of sed, you should be able to use it for whatever you use sed for.
Consider this:
$ perl -nle 'print if /something/' inputs
Now if your something contains a slash, you have a problem. The way to fix this is to change the delimiter, preferably to a bracketing one. So, for example, you could have anything you like in the $WHATEVER shell variable (provided the brackets are balanced), which gets interpolated by the shell before Perl is even called here:
$ perl -nle "print if m($WHATEVER)" /usr/share/dict/words
That works even if you have correctly nested parens in $WHATEVER. The four bracketing pairs which correctly nest like this in Perl are < >, ( ), [ ], and { }. They allow arbitrary contents that include the delimiter if that delimiter is balanced.
If it is not balanced, then do not use a delimiter at all. If the pattern is in a Perl variable, you don’t need to use the match operator provided you use the =~ operator, so:
$whatever = "some arbitrary string ( / # [ etc";
if ($line =~ $whatever) { ... }

With the help of Jim Lewis, I finally did a test before using sed :
if [ `echo $1 | grep '|'` ]; then
grep ".*$1.*:" $DB_FILE | sed "s#^.*$1*.*\(:\)## "
else
grep ".*$1.*:" $DB_FILE | sed "s|^.*$1*.*\(:\)|| "
fi
Thanks for help

Wow. I totally did not know that you could use any character as a delimiter.
At least half the time I use sed and BREs, it's on paths, code snippets, junk characters, things like that. I end up with a bunch of horribly unreadable escapes that I'm not even sure won't die on some combination I didn't think of. But if you can rule out just some character class (or even just one character), you can use that as the delimiter:
echo '#01Y $#1+!' | sed -e 'sa$#1+ashita' -e 'su#01YuHolyug'
Holy shit!
That's so much easier.

Escaping the delimiter inline for BASH to parse is cumbersome and difficult to read (although the delimiter does need escaping for sed's benefit when it's first used, per-expression).
To pull together thkala's answer and user4401178's comment:
DELIM=$(echo -en "\001");
sed -n "\\${DELIM}${STARTING_SEARCH_TERM}${DELIM},\\${DELIM}${ENDING_SEARCH_TERM}${DELIM}p" "${FILE}"
This example prints every line from the first one matching ${STARTING_SEARCH_TERM} through the next one matching ${ENDING_SEARCH_TERM}. It works as long as neither term contains the SOH (start of heading) character, ASCII code 001.

There's no universal separator, but it can be escaped by a backslash for sed to not treat it like separator (at least unless you choose a backslash character as separator).
Depending on the actual application, it might be handy to just escape those characters in both pattern and replacement.
If you're in a bash environment, you can use bash substitution to escape sed separator, like this:
safe_replace () {
sed "s/${1//\//\\\/}/${2//\//\\\/}/g"
}
It's pretty self-explanatory, except for the bizarre part.
Explanation to that:
${1//\//\\\/}
${ - bash expansion starts
1 - first positional argument - the pattern
// - bash pattern substitution pattern separator "replace-all" variant
\/ - literal slash
/ - bash pattern substitution replacement separator
\\ - literal backslash
\/ - literal slash
} - bash expansion ends
example use:
$ input="ka/pus/ta"
$ pattern="/pus/"
$ replacement="/re/"
$ safe_replace "$pattern" "$replacement" <<< "$input"
ka/re/ta

Related

What do I miss in this sed Expression?

I'd like to replace the database server of a horde config file from "localhost" to a remote server (I use "database.contoso.com" as a placeholder).
The file in question is /var/www/horde/config/conf.php.
The line in the file looks like this:
$conf['sql']['hostspec'] = 'localhost';
Now I have created a sed line like so:
sed s/\$conf\[\'sql\'\]\[\'hostspec\'\]\ \=\ \'localhost\'\;/\$conf\[\'sql\'\]\[\'hostspec\'\]\ \=\ \'database\.contoso\.com\'\;/ /var/www/horde/config/conf.php
But for whatever reason, it does not work. (I am leaving out the -i option for now.)
While trying to figure out why it does not work, I did this:
echo "\$conf['sql']['hostspec'] = 'localhost';"|sed s/\$conf\[\'sql\'\]\[\'hostspec\'\]\ \=\ \'localhost\'\;/\$conf\[\'sql\'\]\[\'hostspec\'\]\ \=\ \'database\.contoso\.com\'\;/
which returns this:
$conf['sql']['hostspec'] = 'localhost';
but it should return:
$conf['sql']['hostspec'] = 'database.contoso.com';
What am I missing?
Following the approach from "Escape a string for a sed replace pattern", this would work in your case:
KEYWORD="\$conf['sql']['hostspec'] = 'localhost';"
REPLACE="\$conf['sql']['hostspec'] = 'database.contoso.com';"
ESCAPED_REPLACE=$(printf '%s\n' "$REPLACE" | sed -e 's/[\/&]/\\&/g')
ESCAPED_KEYWORD=$(printf '%s\n' "$KEYWORD" | sed -e 's/[]\/$*.^[]/\\&/g');
sed "s/$ESCAPED_KEYWORD/$ESCAPED_REPLACE/"
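The two escaping steps above can be wrapped into small functions. This is only a sketch: the function names escape_pattern, escape_replacement, and literal_replace are hypothetical, it assumes single-line strings, and it uses / as the fixed delimiter, as the commands above do.

```shell
# Assumption: all three function names are hypothetical wrappers around the
# escaping commands shown above; they handle single-line strings only.
escape_pattern() {
    printf '%s\n' "$1" | sed -e 's/[]\/$*.^[]/\\&/g'
}

escape_replacement() {
    printf '%s\n' "$1" | sed -e 's/[\/&]/\\&/g'
}

literal_replace() {
    # $1 = text to find (taken literally), $2 = text to put in its place
    sed "s/$(escape_pattern "$1")/$(escape_replacement "$2")/"
}

result=$(printf '%s\n' "\$conf['sql']['hostspec'] = 'localhost';" |
    literal_replace "\$conf['sql']['hostspec'] = 'localhost';" \
                    "\$conf['sql']['hostspec'] = 'database.contoso.com';")
printf '%s\n' "$result"
```

The caller passes both strings as they appear in the file; the backslashes sed needs are added inside the helper.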
The immediate problem is that you are not quoting enough. To match a regex metacharacter literally, you need to pass in a literal backslash \\ followed by a literal, like for example \[. But the simplest solution by far is to use single quotes around your expression, and then only backslash the characters which are regex metacharacters.
Literal single quotes inside single quotes are still challenging. Here, I have chosen to end the single-quoted string, insert a backslash-escaped but otherwise unquoted single quote, and add an opening single quote to continue with another single-quoted string. The shell glues these together into a single string.
echo "\$conf['sql']['hostspec'] = 'localhost';" |
sed 's/\$conf\['\''sql'\''\]\['\''hostspec'\''\] = '\''localhost'\'';/$conf['\''sql'\'']['\''hostspec'\''] = '\''database.contoso.com'\'';/'
A better solution generally is to use backreferences to quote back part of the matched string so you don't have to repeat it.
echo "\$conf['sql']['hostspec'] = 'localhost';" |
sed 's/\(\$conf\['\''sql'\''\]\['\''hostspec'\''\] = '\''\)[^'\'']*'\'';/\1database.contoso.com'\'';/'
Demo: https://ideone.com/RA0MSi
A much much much better solution is to change your PHP script so that this setting can be overridden with an option, an environment variable, and/or a configuration file.
This might work for you (GNU sed & shell):
sed -E 's/(\$conf\[('\'')sql\2]\[\2hostspec\2\] = )\2localhost\2;/\1\2database.contoso.com\2;/' file
Use pattern matching to match and replace.
N.B. Certain metacharacters must be escaped/quoted i.e. $,[,] and then because the sed commands are surrounded by single quotes, each single quote (within the substitution command) must be replaced by '\'' (see here for reasoning). Also, back references can be used both in the RHS and the LHS of the substitution command. The back references in the LHS especially allow for the shortening of the overall command and perhaps make the regexp more readable.

Using sed -e to replace slash

I am trying to understand what the command below does, in particular the -e option and the exclamation marks:
sed -e "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" "templates/network-policies-${ns}.yaml"
This command replaced VPC_CIDR with 1.2.3.4/16.
Could someone shed some light on this, please?
-e option just tells sed that the next argument is the script to execute. "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" is the script.
The " usage is strange. I would just write "s!VPC_CIDR!$(get_cluster_vpc_cidr)!g". Because $(get_cluster_vpc_cidr) is not within " quotes, the result will undergo word splitting and filename expansion, i.e. it will break if the output contains spaces, and * or ? characters may behave unexpectedly.
The "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" is a sed script. The s command does, from man 1 sed:
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
You might object that ! is not /. But as man 1 sed itself notes, "This is just a brief synopsis of sed commands to serve as a reminder to those who already know sed." The POSIX sed specification (or the man 7 sed page) sheds more light:
[2addr]s/BRE/replacement/flags
Substitute the replacement string for instances of the BRE in the pattern space. Any character other than <backslash> or <newline> can be used instead of a <slash> to delimit the BRE and the replacement. Within the BRE and the replacement, the BRE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
Any character. You can even pass byte 0x01, like sed $'s\x01BRE\x01replacement\x01', and it's a valid script.
So the s!VPC_CIDR!$(get_cluster_vpc_cidr)!g command replaces every occurrence (the g flag means global) of the string VPC_CIDR (it is matched literally, since it contains no regex metacharacters) with the output of $(get_cluster_vpc_cidr) (except that & and \1 and the like are interpreted specially in the replacement part).
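As a small illustration of why ! is a convenient delimiter here, consider the sketch below. The get_cluster_vpc_cidr function is a hypothetical stand-in that prints a fixed CIDR; the real function from the question is not shown in the source.

```shell
# Assumption: this stub stands in for the real get_cluster_vpc_cidr,
# whose implementation is not shown in the question.
get_cluster_vpc_cidr() {
    printf '%s\n' '10.0.0.0/16'
}

# The command substitution sits inside the double quotes, so its output is
# not subject to word splitting or filename expansion.
result=$(printf '%s\n' 'cidr: VPC_CIDR' |
    sed -e "s!VPC_CIDR!$(get_cluster_vpc_cidr)!g")
printf '%s\n' "$result"
```

The / in the CIDR value is harmless because the script is delimited by ! rather than /.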

How to use capture groups with sed?

I'm trying to replace some text in a file using sed but I'm having troubles.
sed -ir 's/(\$hello = )true/\1false/' /path/to/my/file.txt gives the error sed: -e expression #1, char 27: invalid reference \1 on 's' command's RHS.
I want to replace $hello = true with $hello = false, so in order to avoid typing $hello = twice I wanted to use capture groups - which isn't working.
What am I doing wrong?
If the r in -ir was meant to enable extended regex mode, then you don't have to escape the parentheses. But if you want both -i and -r, you have to pass them separately or write -ri instead of -ir, because sed interprets whatever follows -i as an optional backup suffix. So -ir creates a backup with the suffix r and leaves the script in basic regex mode, where unescaped ( ) are literal characters and \1 has no group to refer to.
From sed manual
Because -i takes an optional argument, it should
not be followed by other short options:
sed -Ei '...' FILE
Same as -E -i with no backup suffix - FILE will be edited in-place without creating a backup.
sed -iE '...' FILE
This is equivalent to --in-place=E, creating FILEE as backup
of FILE
In basic regex mode (without -E or -r), you must escape the parentheses with backslashes, \(...\), for them to act as a grouping.
See THE SED FAQ, section "3.1.2. Escape characters on the right side of "s///"" has an example:
3.1.2. Escape characters on the right side of "s///"
The right-hand side (the replacement part) in "s/find/replace/" is
almost always a string literal, with no interpolation of these
metacharacters:
. ^ $ [ ] { } ( ) ? + * |
Three things are interpolated: ampersand (&), backreferences, and
options for special seds. An ampersand on the RHS is replaced by
the entire expression matched on the LHS. There is never any
reason to use grouping like this:
s/\(some-complex-regex\)/one two \1 three/
And later in section "F. GNU sed v2.05 and higher versions":
F. GNU sed v2.05 and higher versions
...
Undocumented -r switch:
Beginning with version 3.02, GNU sed has an undocumented -r switch
(undocumented till version 4.0), activating Extended Regular
Expressions in the following manner:
? - 0 or 1 occurrence of previous character
+ - 1 or more occurrences of previous character
| - matches the string on either side, e.g., foo|bar
(...) - enable grouping without backslash
{...} - enable interval expression without backslash
When the -r switch (mnemonic: "regular expression") is used, prefix
these symbols with a backslash to disable the special meaning.
For documentation of regular expression syntax used in (GNU) sed, see Overview of basic regular expression syntax
5.3 Overview of basic regular expression syntax
...
\(regexp\)
Groups the inner regexp as a whole, this is used to:
Apply postfix operators, like \(abcd\)*: this will search for zero or more whole sequences of ‘abcd’, while abcd* would search for ‘abc’ followed by zero or more occurrences of ‘d’. Note that support for \(abcd\)* is required by POSIX 1003.1-2001, but many non-GNU implementations do not support it and hence it is not universally portable.
Use back references (see below).
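The two grouping syntaxes discussed in this thread can be compared side by side. This sketch reads from standard input rather than editing a file in place, so the -i suffix issue does not come into play.

```shell
# Extended regex mode (-E): plain parentheses form the capture group.
ere_result=$(printf '%s\n' '$hello = true' |
    sed -E 's/(\$hello = )true/\1false/')

# Basic regex mode (default): the parentheses must be backslash-escaped.
bre_result=$(printf '%s\n' '$hello = true' |
    sed 's/\(\$hello = \)true/\1false/')

printf '%s\n' "$ere_result" "$bre_result"
```

Both commands print $hello = false; the only difference is how the group is written.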

sed command not working properly on ubuntu

I have a file named config_3_setConfigPW.ldif containing the following line:
{pass}
On the terminal, I used the following commands:
SLAPPASSWD='Pwd&0011'
sed -i "s#{pass}#$SLAPPASSWD#" config_3_setConfigPW.ldif
It should replace {pass} with Pwd&0011, but it generates Pwd{pass}0011.
The reason is that the SLAPPASSWD shell variable is expanded before sed sees it. So sed sees:
sed -i "s#{pass}#Pwd&0011#" config_3_setConfigPW.ldif
When an "&" is on the right hand side of a pattern it means "copy the matched input", and in your case the matched input is "{pass}".
The real problem is that you would have to escape all the special characters that might arise in SLAPPASSWD, to prevent sed doing this. For example, if you had character "#" in the password, sed would think it was the end of the substitute command, and give a syntax error.
Because of this, I wouldn't use sed for this. You could try gawk or perl?
e.g., this awk command prints the modified file (though it still assumes that SLAPPASSWD contains no " character):
awk -F \{pass\} ' { print $1"'${SLAPPASSWD}'"$2 } ' config_3_setConfigPW.ldif
That's because $SLAPPASSWD contains the character &, which is a metacharacter in sed: in the replacement part of the s command it evaluates to the matched text. Meaning:
sed 's/{pass}/match: &/' <<< '{pass}'
would give you:
match: {pass}
Some time ago I asked this question: "Is it possible to escape regex metacharacters reliably with sed". Answers there show how to reliably escape the password before using it as the replacement part:
pwd="Pwd&0011"
pwdEscaped="$(sed 's/[&/\]/\\&/g' <<< "$pwd")"
# Now you can safely pass $pwdEscaped to sed
sed -i "s/{pass}/$pwdEscaped/" config_3_setConfigPW.ldif
Bear in mind that sed NEVER operates on strings. The thing sed searches for is a regexp and the thing it replaces it with is string-like but has some metacharacters you need to be aware of, e.g. & or \<number>, and all of it needs to avoid using the sed delimiters, / typically.
If you want to operate on strings you need to use awk:
awk -v old="{pass}" -v new="$SLAPPASSWD" 's=index($0,old){ $0 = substr($0,1,s-1) new substr($0,s+length(old))} 1' file
Even the above would need to be tweaked if old or new contained escape characters, because awk -v processes backslash escape sequences in the assigned value.
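One way to sidestep the escape-sequence issue is to hand the strings to awk through the environment, since values read from ENVIRON are not escape-processed. The following is a sketch under that assumption; literal_sub is a hypothetical function name, and it replaces only the first occurrence on each line.

```shell
# Assumption: literal_sub is a hypothetical helper name. The two strings are
# passed via the environment, so awk does not interpret backslashes in them.
literal_sub() {
    old=$1 new=$2 awk '
        BEGIN { old = ENVIRON["old"]; new = ENVIRON["new"] }
        {
            s = index($0, old)      # plain string search, no regex involved
            if (s)
                $0 = substr($0, 1, s - 1) new substr($0, s + length(old))
            print
        }'
}

# The replacement contains & and a backslash, both of which stay literal.
result=$(printf '%s\n' 'pw={pass}' | literal_sub '{pass}' 'Pwd&0011\n')
printf '%s\n' "$result"
```

Neither the & nor the \n in the replacement is given any special meaning.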

Substitution contains sed delimiter and extended regex features

I am trying to use sed to match and replace a string of the following nature somedate:"12/12/2012" using sed 's/somedate:"[0-9]{2}\/[0-9]{2}\/[0-9]{4}"//g'
This expression does not match anything in the string something and somedate:"12/12/2012" and something else
What am I doing incorrectly?
A few points:
You use / as the delimiter for sed but the substitution contains /, so choose a different delimiter such as | or escape each / in the substitution with \. Any delimiter can be used with sed.
The quantifier {n} is part of the extended regexp class so use the -r option (or -E for BSD derivatives of sed) or again escape the extended features like \{2\}.
The g flag may or may not be needed depending if you have multiple matches on a single line. It doesn't make a difference for your given example but it's worth pointing out.
You probably want {1,2} for the days and months, e.g. to match 1/1/2012.
I would do:
$ sed -r 's|somedate:"[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}"||' file
something and and something else
Alternatively by escaping everything:
$ sed 's/somedate:"[0-9]\{1,2\}\/[0-9]\{1,2\}\/[0-9]\{4\}"//' file
something and and something else
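As a quick check that the {1,2} quantifier also covers single-digit days and months, the two commands can be run on a hypothetical input line with the date 1/1/2012. The -E spelling is used here in place of -r.

```shell
# Assumption: this input line is made up to exercise a single-digit date.
line='something and somedate:"1/1/2012" and something else'

# Extended regex with | as the delimiter: nothing needs escaping.
ere=$(printf '%s\n' "$line" |
    sed -E 's|somedate:"[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}"||')

# Basic regex with / as the delimiter: braces and slashes are escaped.
bre=$(printf '%s\n' "$line" |
    sed 's/somedate:"[0-9]\{1,2\}\/[0-9]\{1,2\}\/[0-9]\{4\}"//')

printf '%s\n' "$ere" "$bre"
```

Both forms remove the same text; the two spaces left behind come from the spaces on either side of the deleted field.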