cannot reference capture group in sed - sed

I want to append a character on to a string.
I have this:
sed -r "s/\(.+:.+\)/\1,f/" "123:abc"
I simply want to append a ,f to the end of the string and am trying to reference the capture group \(.+:.+\). But, it does not work. I keep getting this error when I try to reference the capture group \1:
sed: -e expression #1, char 17: invalid reference \1 on `s' command's RHS
Any ideas?

You're using POSIX Basic syntax (with escaped parentheses) even though you specified the -r flag, which selects POSIX Extended syntax.
Don't escape the parentheses, and this should work. Sed is complaining because, with -r, it sees literal parentheses to match rather than a group to reference.
... "s/(.+:.+)/\1,f/" ...
i.e.
>echo "123:abc" | sed -r "s/(.+:.+)/\1,f/"
123:abc,f
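To see the two syntaxes side by side, here is a minimal sketch (the variable names are only for illustration). It assumes a sed that accepts -E, which GNU sed treats as a synonym for -r. In basic mode the parentheses are escaped; in extended mode they are not.

```shell
input="123:abc"

# Basic regex (default): groups are written \( ... \).
# "..*" means "one or more characters" without relying on the GNU-only \+.
bre_result=$(printf '%s\n' "$input" | sed 's/\(..*:..*\)/\1,f/')

# Extended regex (-E, or -r on GNU sed): groups are written ( ... ).
ere_result=$(printf '%s\n' "$input" | sed -E 's/(.+:.+)/\1,f/')

printf '%s\n%s\n' "$bre_result" "$ere_result"
```

Both commands should print 123:abc,f.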

Related

Using sed -e to replace slash

I am trying to understand what the command below does, in particular the -e option and the exclamation marks:
sed -e "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" "templates/network-policies-${ns}.yaml"
This command replaced VPC_CIDR with 1.2.3.4/16.
Could someone shed some light on this, please?
-e option just tells sed that the next argument is the script to execute. "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" is the script.
The " usage is strange. I would just write "s!VPC_CIDR!$(get_cluster_vpc_cidr)!g". Because $(get_cluster_vpc_cidr) is not within " quotes in the original, its result undergoes word splitting and filename expansion, i.e. the command fails if the output contains spaces, and * or ? characters may behave strangely.
The "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" is a sed script. The s command does, from man 1 sed:
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
But, you may object, ! is not /. As man 1 sed itself says, "This is just a brief synopsis of sed commands to serve as a reminder to those who already know sed." The POSIX sed or man 7 sed page sheds some more light:
[2addr]s/BRE/replacement/flags
Substitute the replacement string for instances of the BRE in the pattern space. Any character other than <backslash> or <newline> can be used instead of a <slash> to delimit the BRE and the replacement. Within the BRE and the replacement, the BRE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
Any character. You can even pass byte 0x01, like sed $'s\x01BRE\x01replacement\x01', and it's a valid script.
So the s!VPC_CIDR!$(get_cluster_vpc_cidr)!g command replaces every occurrence (that is what the g, global, flag does) of the VPC_CIDR string (the string is literal, there are no special regex characters in it) with the output of $(get_cluster_vpc_cidr) (except that & and \1 and such are interpreted specially in the replacement part).
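As a small illustration of why the alternate delimiter helps here, the sketch below defines a stand-in get_cluster_vpc_cidr function (hypothetical; the real one from the question is not shown) that prints a CIDR containing a slash. The template text is also made up for the example.

```shell
# Hypothetical stand-in for the function used in the question.
get_cluster_vpc_cidr() { printf '%s\n' '10.0.0.0/16'; }

template='cidr: VPC_CIDR'

# With ! as the delimiter, the / inside the CIDR needs no escaping.
# The command substitution sits inside the double quotes, so its output
# is not subject to word splitting or filename expansion.
result=$(printf '%s\n' "$template" | sed -e "s!VPC_CIDR!$(get_cluster_vpc_cidr)!g")

printf '%s\n' "$result"
```

With / as the delimiter, the same script would end early at the slash in 10.0.0.0/16 and sed would report an error.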

How to use capture groups with sed?

I'm trying to replace some text in a file using sed but I'm having troubles.
sed -ir 's/(\$hello = )true/\1false/' /path/to/my/file.txt gives the error sed: -e expression #1, char 27: invalid reference \1 on 's' command's RHS.
I want to replace $hello = true with $hello = false, so in order to avoid typing $hello = twice I wanted to use capture groups - which isn't working.
What am I doing wrong?
In extended regex mode you don't have to escape parentheses, which is presumably what you intended with the r in -ir. However, if you want both -i and -r, you have to keep them apart or write -ri instead of -ir, because sed interprets whatever follows -i as an optional backup suffix.
From sed manual
Because -i takes an optional argument, it should not be followed by other short options:
sed -Ei '...' FILE
Same as -E -i with no backup suffix - FILE will be edited in-place without creating a backup.
sed -iE '...' FILE
This is equivalent to --in-place=E, creating FILEE as backup of FILE
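A quick way to check the corrected option order is to try it on a throwaway file. This is only a sketch and assumes GNU sed (BSD sed requires an explicit argument after -i); the temporary directory and file name are made up for the example.

```shell
tmpdir=$(mktemp -d)
file="$tmpdir/file.txt"
printf '%s\n' '$hello = true' > "$file"

# -E and -i are kept apart, so no backup suffix is implied.
sed -E -i 's/(\$hello = )true/\1false/' "$file"

result=$(cat "$file")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The file should now contain $hello = false, and no backup copy is created.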
In basic regex mode (without -r or -E), you must escape the parentheses with backslashes, \(...\), for them to act as grouping.
THE SED FAQ has an example in section "3.1.2. Escape characters on the right side of "s///"":
3.1.2. Escape characters on the right side of "s///"
The right-hand side (the replacement part) in "s/find/replace/" is
almost always a string literal, with no interpolation of these
metacharacters:
. ^ $ [ ] { } ( ) ? + * |
Three things are interpolated: ampersand (&), backreferences, and
options for special seds. An ampersand on the RHS is replaced by
the entire expression matched on the LHS. There is never any
reason to use grouping like this:
s/\(some-complex-regex\)/one two \1 three/
And later in section "F. GNU sed v2.05 and higher versions":
F. GNU sed v2.05 and higher versions
...
Undocumented -r switch:
Beginning with version 3.02, GNU sed has an undocumented -r switch
(undocumented till version 4.0), activating Extended Regular
Expressions in the following manner:
? - 0 or 1 occurrence of previous character
+ - 1 or more occurrences of previous character
| - matches the string on either side, e.g., foo|bar
(...) - enable grouping without backslash
{...} - enable interval expression without backslash
When the -r switch (mnemonic: "regular expression") is used, prefix
these symbols with a backslash to disable the special meaning.
For documentation of regular expression syntax used in (GNU) sed, see Overview of basic regular expression syntax
5.3 Overview of basic regular expression syntax
...
\(regexp\)
Groups the inner regexp as a whole, this is used to:
Apply postfix operators, like \(abcd\)*: this will search for zero or more whole sequences of ‘abcd’, while abcd* would search for ‘abc’ followed by zero or more occurrences of ‘d’. Note that support for \(abcd\)* is required by POSIX 1003.1-2001, but many non-GNU implementations do not support it and hence it is not universally portable.
Use back references (see below).
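The FAQ excerpt mentions that an ampersand on the right-hand side stands for the whole match. A minimal sketch of that equivalence (the sample text and variable names are illustrative only):

```shell
line="hello world"

# & expands to everything the regex matched.
with_ampersand=$(printf '%s\n' "$line" | sed 's/w[a-z]*/one & three/')

# Grouping the whole regex and using \1 gives the same result.
with_group=$(printf '%s\n' "$line" | sed 's/\(w[a-z]*\)/one \1 three/')

printf '%s\n%s\n' "$with_ampersand" "$with_group"
```

Both lines should read hello one world three. A group is only needed when the replacement reuses part of the match rather than all of it.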

Variable in sed command

I can't get this sed command to work.
remote_file_path=$(tac $local_file_name | sed -nr '1,/$file_name/ d; /^\\/ { p; q }')
If I replace the single quotes with doubles, it breaks the rest of the command and I get this error:
sed: -e expression #1, char 31: unterminated address regex
Basically, I'm using tac to read the file backwards so that I can locate the preceding line that starts with a backslash and assign it to the variable remote_file_path.
Thank you
Single quotes don't expand variables. Double quotes do, but if the filename contains a slash (or some other character special to sed), it can break the expression. You can use a different regex delimiter (e.g. \=$file_name= d), but only if you're sure the file name can never contain it (or other special sed characters, e.g. a dot).
Alternatively, use a language with real variables (variables in shell are just macros). For example, you can use Perl:
f=$file_name perl -ne 'next if 1 .. /\Q$ENV{f}/; print, last if /^\\/'
\Q makes all the special characters in $f literal (see quotemeta).
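If you would rather stay with sed, one possible approach is to escape the special characters in the variable before interpolating it. The sketch below assumes basic regex syntax and / as the address delimiter; the sample file name and input lines are invented for the example.

```shell
file_name='dir/report.v1.txt'

# Put a backslash in front of ], [, \, ., *, ^, $ and / so that each is
# matched literally. | is used as the delimiter of this helper command so
# that the / inside the bracket expression cannot end the regex early.
escaped=$(printf '%s\n' "$file_name" | sed 's|[][\\.*^$/]|\\&|g')

# Double quotes let the shell expand $escaped inside the sed script.
result=$(printf '%s\n' 'dir/reportXv1Ytxt' 'dir/report.v1.txt' | sed -n "/$escaped/p")

printf '%s\n' "$result"
```

Only dir/report.v1.txt should be printed; without the escaping, the unescaped dots would also match dir/reportXv1Ytxt and the slash would terminate the address early.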

Terminal File replace testA with a

Is there a way to convert testThatMy to thatMy using the Terminal?
This is what I have now:
sed -i 's/test//g' MyJavaFile.java
The only thing missing would be to convert the character after test now to lower case.
Also, for some reason, referencing a capture group does not seem to work:
sed -i 's/test([A-Z]{1})/\1/g' MyJavaFile.java
You can use the following GNU sed command:
sed -r 's/test([[:upper:]])([^[:space:]]*)/\L\1\E\2/g' file.java
For in-place editing you need to pass -i, but I would test the command first.
Pattern Explanation:
-r enables extended POSIX regular expressions.
[[:upper:]] matches an uppercase character.
[^[:space:]]* matches zero or more non-space characters.
Replacement Explanation:
\L converts the text that follows to lowercase until \E (or \U) is reached, so only \1 is lowercased. \1 is the content of the first capturing group. \2 is the content of the second capturing group, which is left unchanged.
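The case-conversion escapes are GNU sed extensions, so the following sketch assumes GNU sed. It compares \l (lowercase the next character only), \L ... \E (lowercase a delimited stretch), and \L without a terminating \E, which lowercases the rest of the replacement. The variable names are illustrative.

```shell
input="testThatMy"

# \l lowercases only the next character of the replacement.
with_l=$(printf '%s\n' "$input" | sed -E 's/test([[:upper:]])/\l\1/g')

# \L ... \E lowercases everything between the two escapes.
with_L_E=$(printf '%s\n' "$input" | sed -E 's/test([[:upper:]])([^[:space:]]*)/\L\1\E\2/g')

# Without \E, \L also lowercases \2.
no_E=$(printf '%s\n' "$input" | sed -E 's/test([[:upper:]])([^[:space:]]*)/\L\1\2/g')

printf '%s\n%s\n%s\n' "$with_l" "$with_L_E" "$no_E"
```

The first two should give thatMy, while the third gives thatmy.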

sed: Replace part of a line

How can one replace a part of a line with sed?
The line
DBSERVERNAME xxx
should be replaced to:
DBSERVERNAME yyy
The value xxx can vary and there are two tabs between dbservername and the value. This name-value pair is one of many from a configuration file.
I tried with the following backreference:
echo "DBSERVERNAME xxx" | sed -rne 's/\(dbservername\)[[:blank:]]+\([[:alpha:]]+\)/\1 yyy/gip'
and that resulted in an error: invalid reference \1 on `s' command's RHS.
What's wrong with the expression? I'm using GNU sed.
This works:
sed -rne 's/(dbservername)\s+\w+/\1 yyy/gip'
(When you use the -r option, you don't have to escape the parens.)
Bit of explanation:
-r is extended regular expressions - makes a difference to how the regex is written.
-n does not print unless specified - sed prints by default otherwise,
-e means what follows it is an expression. Let's break the expression down:
s/// is the command for search-replace, and what's between the first pair is the regex to match, and the second pair the replacement,
gip, which follows the search replace command; g means global, i.e., every match instead of just the first will be replaced in a line; i is case-insensitivity; p means print when done (remember the -n flag from earlier!),
The parentheses mark a capture group, which is referenced later. So dbservername is the first capture group,
\s is whitespace, + means one or more (vs *, zero or more) occurrences,
\w is a word character, that is any letter, digit or underscore,
\1 is a back reference that inserts the text matched by the first parenthesized group in the accompanying search.
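To see how -n, the p flag and the i flag interact, here is a minimal sketch on a two-line sample. It assumes GNU sed (for \s, \w and the i flag); the DBPORT line is invented to show a non-matching line.

```shell
# Two tab-separated name-value pairs; only the first should be rewritten.
config=$(printf 'DBSERVERNAME\t\txxx\nDBPORT\t\t5432\n')

# -n suppresses automatic printing, so only lines where the substitution
# succeeded (and the p flag fired) are printed.
result=$(printf '%s\n' "$config" | sed -E -n -e 's/(dbservername)\s+\w+/\1 yyy/gip')

printf '%s\n' "$result"
```

This should print only DBSERVERNAME yyy: the group captured the name in its original upper case, and the DBPORT line is dropped because nothing printed it. Drop -n and p if you want the whole file back with just that one line changed.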
Others have already mentioned the escaping of parentheses, but why do you need a back reference at all, if the first part of the line is constant?
You could simply do
sed -e 's/dbservername.*$/dbservername yyy/g'
You're escaping your ( and ). With -r you don't need to do that. Try:
sed -rne 's/(dbservername)[[:blank:]]+([[:alpha:]]+)/\1 yyy/gip'
You shouldn't be escaping the parentheses when you use -r, i.e.
echo "DBSERVERNAME xxx" | sed -rne 's/(dbservername[[:blank:]]+)([[:alpha:]]+)/\1 yyy/gip'
You shouldn't be escaping your parens. Try:
echo "DBSERVERNAME xxx" | sed -rne 's/(dbservername)[[:blank:]]+([[:alpha:]]+)/\1 yyy/gip'
This might work for you:
echo "DBSERVERNAME xxx" | sed 's/\S*$/yyy/'
DBSERVERNAME yyy
Try this
sed -re 's/(DBSERVERNAME[ \t]+)[^ \t]+/\1yyy/ig' temp.txt
or this
awk '{if($1=="DBSERVERNAME") $2 ="YYY"} {print $0;}' temp.txt