Variable in sed command - sed

I can't get this sed command to work.
remote_file_path=$(tac $local_file_name | sed -nr '1,/$file_name/ d; /^\\/ { p; q }')
If I replace the single quotes with doubles, it breaks the rest of the command and I get this error:
sed: -e expression #1, char 31: unterminated address regex
Basically, I'm using tac to search through a file backwards so that I can locate the preceding line that starts with a backslash and assign it to the variable remote_file_path.
Thank you.

Single quotes don't expand variables. Double quotes do, but inside them the shell also reduces \\ to \, so /^\\/ reaches sed as /^\/, an escaped slash with no closing delimiter; that is what produces the "unterminated address regex" error. Even with the backslashes doubled, if the filename contains a slash (or some other character special to sed), it can break the expression. You can use a different regex delimiter (e.g. \=$file_name= d), but only if you're sure the file name can never contain it (or other special sed characters, e.g. a dot).
Use a real language with variables; variables in shell are just macros. For example, you can use Perl:
f=$file_name perl -ne 'next if 1 .. /\Q$ENV{f}/; print, last if /^\\/'
\Q makes all the special characters in $ENV{f} literal (see quotemeta).
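If you would rather stay in the shell, the variable can be escaped before it is spliced into the sed script. The following is only a minimal sketch: the helper names escape_sed_pattern and find_preceding_backslash_line are made up for illustration, and it assumes GNU tac and a search string that contains no newline.

```shell
# Hypothetical helper: backslash-escape the BRE metacharacters and the "/"
# delimiter so that sed treats the string as literal text.
# "|" is used as the delimiter here so that "/" can sit in the bracket list.
escape_sed_pattern() {
    printf '%s\n' "$1" | sed 's|[][\.*^$/]|\\&|g'
}

# Hypothetical helper: in file $1, find the last line containing the literal
# string $2 and print the closest earlier line that starts with a backslash.
# (Assumes the matching line is not the very last line of the file.)
find_preceding_backslash_line() {
    escaped=$(escape_sed_pattern "$2")
    # The script stays in single quotes; only the escaped variable is
    # spliced in between them, inside double quotes.
    tac "$1" | sed -n '1,/'"$escaped"'/ d; /^\\/ { p; q; }'
}

# Usage, with the variable names from the question:
# remote_file_path=$(find_preceding_backslash_line "$local_file_name" "$file_name")
```

Keeping the sed script in single quotes means the backslashes in /^\\/ reach sed unchanged, which avoids the "unterminated address regex" error from the question.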

Related

Using sed -e to replace slash

I am trying to understand what the command below does, in particular the -e option to sed and the exclamation marks in the script:
sed -e "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" "templates/network-policies-${ns}.yaml"
This command replaced VPC_CIDR with 1.2.3.4/16.
Could someone shed some light on this, please?
The -e option just tells sed that the next argument is the script to execute. "s!VPC_CIDR!"$(get_cluster_vpc_cidr)"!g" is the script.
The " usage is strange. I would just write "s!VPC_CIDR!$(get_cluster_vpc_cidr)!g". Because $(get_cluster_vpc_cidr) is not within " quotes, the result undergoes word splitting and filename expansion, i.e. it will fail on spaces, and * or ? characters may behave strangely.
Here is what the s command does, from man 1 sed:
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
But, you may think, ! is not /. As man 1 sed itself says, "This is just a brief synopsis of sed commands to serve as a reminder to those who already know sed." The POSIX sed page (man 1p sed, where the POSIX man pages are installed) sheds some more light:
[2addr]s/BRE/replacement/flags
Substitute the replacement string for instances of the BRE in the pattern space. Any character other than <backslash> or <newline> can be used instead of a <slash> to delimit the BRE and the replacement. Within the BRE and the replacement, the BRE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
Any character. You can even pass byte 0x01, like sed $'s\x01BRE\x01replacement\x01', and it's a valid script.
So the s!VPC_CIDR!$(get_cluster_vpc_cidr)!g command replaces every occurrence (i.e. the g global flag) of the VPC_CIDR string (the string is literal; it contains no special regex characters) with the output of $(get_cluster_vpc_cidr) (except that & and \1 and the like are interpreted specially in the replacement part).
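To see the effect without the real cluster tooling, here is a small sketch. The get_cluster_vpc_cidr below is only a stub and the value it prints is an assumption; fill_vpc_cidr is a made-up wrapper name.

```shell
# Stub standing in for the real get_cluster_vpc_cidr from the question;
# the value it prints is an assumption made for this example.
get_cluster_vpc_cidr() {
    printf '%s\n' '1.2.3.4/16'
}

# Hypothetical wrapper: read a template on stdin and fill in the CIDR.
# The whole script sits inside one pair of double quotes, so the command
# substitution is not subject to word splitting or filename expansion.
# With "!" as the delimiter, the "/" in the CIDR needs no escaping.
fill_vpc_cidr() {
    sed -e "s!VPC_CIDR!$(get_cluster_vpc_cidr)!g"
}

printf '%s\n' 'cidr: VPC_CIDR' | fill_vpc_cidr
```

This prints cidr: 1.2.3.4/16. In an interactive bash session the ! may trigger history expansion inside double quotes, so there a different delimiter such as | or # may be more convenient.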

cannot reference capture group in sed

I want to append a character on to a string.
I have this:
sed -r "s/\(.+:.+\)/\1,f/" "123:abc"
I simply want to append a ,f to the end of the string and am trying to reference the capture group \(.+:.+\). But, it does not work. I keep getting this error when I try to reference the capture group \1:
sed: -e expression #1, char 17: invalid reference \1 on `s' command's RHS
Any idea?
You're using POSIX Basic syntax (with escaped parentheses) even though you specified the -r flag, which selects POSIX Extended syntax.
Don't escape the parentheses, and this should work. Sed is complaining because it doesn't think there is a group to reference; instead, it sees literal parentheses to find.
... "s/(.+:.+)/\1,f/" ...
i.e.
>echo "123:abc" | sed -r "s/(.+:.+)/\1,f/"
123:abc,f
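For comparison, here is a short sketch of the same substitution in both syntaxes. The function names are made up for illustration, and -E is assumed to be available as the portable spelling of -r.

```shell
# BRE (the default): groups are written \( \), and the portable spelling of
# "one or more" is ..* because + is not special in POSIX BRE.
append_bre() {
    sed 's/\(..*:..*\)/\1,f/'
}

# ERE (-E, or -r in GNU sed): groups are plain ( ) and + is a quantifier.
append_ere() {
    sed -E 's/(.+:.+)/\1,f/'
}

echo "123:abc" | append_bre
echo "123:abc" | append_ere
```

Both commands print 123:abc,f. Note that sed reads from standard input or from files named as arguments, so the string has to be piped in rather than passed as an argument.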

SED - replace string newline anything with string newline variable

I have the following content in a file
  dhcp_option_domain:
  - test.domain
And what I need to do is this:
Whenever the value 'dhcp_option_domain:' is followed by a newline and then ANY string, replace it with 'dhcp_option_domain:' followed by a newline and a variable.
I.e. if I set a variable dhcp_domain="different.com", then the string above would be converted to:
  dhcp_option_domain:
  - different.com
Note that both lines have, and need to maintain, 2 leading spaces.
I do not want to just do a search and replace on 'test.domain' as I have a few cases to use this and the values could be different each time the sed command is run.
I have tried a few methods such as:
dhcp_domain="something.com"
sed -i 's|dhcp_option_domain:\n.*|dhcp_option_domain:\n - $dhcp_domain|g' filename
However, I cannot get it to work.
Thanks.
As the manual explains:
sed operates by performing the following cycle on each line of input: first, sed reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed
Your regex (dhcp_option_domain:\n.*) does not match because there is no \n in the pattern space in the first place.
A possible solution:
sed '/dhcp_option_domain:$/{n;c\
- '"$dhcp_domain"'
}'
The /dhcp_option_domain:$/ part is an address. The following command is only executed on lines matching that pattern.
The { } command groups multiple commands into a single block.
The n command prints out the current pattern space and replaces it by the next line of input.
The c\ command replaces the current pattern space by whatever follows in the script. Here it gets a bit tricky. We have:
a literal newline in the sed program (required after c\), then
- (placing those characters in the pattern space literally), then
' (part of shell syntax, terminating the single-quoted part started by sed '...), then
" (starting a double-quoted part), then
$dhcp_domain (which, because it's in a double-quoted part, interpolates the contents of the dhcp_domain shell variable), then
" (terminating the double-quoted part), then
' (starting another single-quoted part), then
a literal newline again (terminating the text after c\), then
} (closing the block started by {).
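The quote splicing in the list above can be checked on its own by building the script in a variable and printing it. This sketch swaps the c\ command for a simpler s command so the result fits on one line; the two-space indent and the variable name script are assumptions for the example.

```shell
dhcp_domain='different.com'

# Three adjacent pieces form one shell word:
#   's/.*/  - '      single-quoted, passed through literally
#   "$dhcp_domain"   double-quoted, so the variable is expanded
#   '/'              single-quoted again
script='s/.*/  - '"$dhcp_domain"'/'

# Show the exact text that sed would receive as its script.
printf '%s\n' "$script"
```

This prints s/.*/  - different.com/, which shows that sed never sees the quotes or the variable name, only the already expanded text.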
By default, sed works line by line (using the newline character to separate lines).
$ cat ip.txt
foo baz
  dhcp_option_domain:
  - test.domain
123
  dhcp_option_domain:
$ dhcp_domain='something.com'
$ sed '/^  dhcp_option_domain:/{n; s/.*/  - '"$dhcp_domain"'/}' ip.txt
foo baz
  dhcp_option_domain:
  - something.com
123
  dhcp_option_domain:
/^  dhcp_option_domain:/ condition to match
{} to group more than one command to be executed when this condition is satisfied
n get next line
s/.*/  - '"$dhcp_domain"'/ replace it as required - note that shell variables won't be expanded inside single quotes, see "sed substitution with bash variables" for details
note that the last line in the file didn't trigger the change, as there was no further line
tested on GNU sed; syntax might vary for other implementations
From GNU sed manual
n
If auto-print is not disabled, print the pattern space, then,
regardless, replace the pattern space with the next line of input. If
there is no more input then sed exits without processing any more
commands.
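If the new value might contain characters that are special in the replacement part of s (a backslash, & or /), it can be escaped first. This is a minimal sketch under the assumptions of the question (two leading spaces on both lines); the function name set_dhcp_domain is made up.

```shell
# Hypothetical helper: read the file content on stdin and replace the line
# that follows "  dhcp_option_domain:" with "  - <new value>".
set_dhcp_domain() {
    # Escape \, & and / so that they are taken literally in the replacement.
    escaped=$(printf '%s\n' "$1" | sed 's|[\&/]|\\&|g')
    sed '/^  dhcp_option_domain:$/{n; s/.*/  - '"$escaped"'/;}'
}

printf '%s\n' '  dhcp_option_domain:' '  - test.domain' |
    set_dhcp_domain 'different.com'
```

The output keeps the first line and replaces the second with "  - different.com". To change a file in place with GNU sed, the same script could be passed to sed -i instead of reading from a pipe.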
This might work for you (GNU sed):
sed '/dhcp_option_domain:$/{p;s// - '"${var}"'/;n;d}' file
Match on dhcp_option_domain:, print it, substitute the new domain name (maintaining indent), print the current line and fetch the next (n) and delete it.

sed pattern negation with a comma separated line

I have a text file full of lines looking like:
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
I am trying to change all of the commas , to pipes |, except for the commas within the quotes.
Trying to use sed (which I am new to)... and it is not working. Using:
sed '/".*"/!s/\,/|/g' textfile.csv
Any thoughts?
As a test case, consider this file:
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
foo,foo,"x,y,z",foo,"a,b,c",foo,"yes,no"
"x,y,z",foo,"a,b,c",foo,"yes,no",foo
Here is a sed command to replace non-quoted commas with pipe symbols:
$ sed -r ':a; s/^([^"]*("[^"]*"[^"]*)*),/\1|/g; t a' file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
foo|foo|"x,y,z"|foo|"a,b,c"|foo|"yes,no"
"x,y,z"|foo|"a,b,c"|foo|"yes,no"|foo
Explanation
This looks for commas that appear after pairs of double quotes and replaces them with pipe symbols.
:a
This defines a label a.
s/^([^"]*("[^"]*"[^"]*)*),/\1|/g
If 0, 2, 4, or any other even number of quotes precede a comma on the line, then replace that comma with a pipe symbol.
^
This matches at the start of the line.
(
This starts the main grouping (\1).
[^"]*
This looks for zero or more non-quote characters.
("[^"]*"[^"]*)*
The * outside the parens means that we are looking for zero or more of the pattern inside the parens. The pattern inside the parens consists of a quote, any number of non-quotes, a quote and then any number of non-quotes.
In other words, this grouping only matches pairs of quotes. Because of the * outside the parens, it can match any even number of quotes.
)
This closes the main grouping.
,
This requires that the grouping be followed by a comma.
t a
If the previous s command successfully made a substitution, then the test command tells sed to jump back to label a and try again.
If no substitution was made, then we are done.
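The loop described above can be wrapped in a small function. This is a sketch of the same idea, not a different algorithm: the function name commas_to_pipes is made up, -E is used in place of -r, the label and branch are passed as separate -e scripts for portability, and the g flag is dropped because the ^ anchor lets the pattern match at most once per pass anyway.

```shell
# Hypothetical helper: turn commas outside double quotes into pipes.
commas_to_pipes() {
    # :a  defines the label
    # s   replaces the last comma that is preceded by an even number of quotes
    # ta  jumps back to :a for as long as a substitution was made
    sed -E -e ':a' \
           -e 's/^([^"]*("[^"]*"[^"]*)*),/\1|/' \
           -e 'ta'
}

printf '%s\n' 'foo,foo,"x,y,z",foo,"a,b,c",foo,"yes,no"' | commas_to_pipes
```

Each pass converts one comma, working from the right, so a line with n unquoted commas takes n + 1 passes through the s command.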
Using awk could be easier:
kent$ cat f
foo,foo,"x,y,z",foo,"a,b,c",foo,"yes,no"
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
kent$ awk -F'"' -v OFS='"' '{for(i=1;i<=NF;i++)if(i%2)gsub(",","|",$i)}7' f
foo|foo|"x,y,z"|foo|"a,b,c"|foo|"yes,no"
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
I suggest a language with a proper CSV parser. For example:
ruby -rcsv -ne 'puts CSV.generate_line(CSV.parse_line($_), :col_sep=>"|")' file
Female|$0 to $25,000|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
Here I would have used GNU awk's FPAT. It defines what a field looks like, as opposed to FS, which tells what the separator is. Then you can just set the output separator to |:
awk '{$1=$1}1' OFS=\| FPAT="([^,]+)|(\"[^\"]+\")" file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
If your awk does not support FPAT, this can be used:
awk -F, '{for (i=1;i<NF;i++) {c+=gsub(/\"/,"&",$i);printf "%s"(c%2?FS:"|"),$i}print $NF}' file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
sed 's/"\(.*\),\(.*\)"/"\1##HOLD##\2"/g;s/,/|/g;s/##HOLD##/,/g'
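This matches the text in quotes and puts a placeholder in place of the comma, then switches all the other commas to pipes and turns the placeholder back into a comma. You can change the ##HOLD## text to whatever you want. Note that the greedy first substitution protects only one comma per line, so this is reliable only when a line has a single quoted field containing a single comma.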

sed rare-delimiter (other than & | / ?...)

I am using the Unix sed command on a string that can contain all types of characters (&, |, !, /, ?, etc).
Is there a complex delimiter (with two characters?) that can fix the error:
sed: -e expression #1, char 22: unknown option to `s'
The characters in the input file are of no concern - sed parses them fine. There may be an issue, however, if you have most of the common characters in your pattern - or if your pattern may not be known beforehand.
At least on GNU sed, you can use a non-printable character that is highly improbable to exist in your pattern as a delimiter. For example, if your shell is Bash:
$ echo '|||' | sed s$'\001''|'$'\001''/'$'\001''g'
In this example, Bash replaces $'\001' with the character that has the octal value 001 - in ASCII it's the SOH character (start of heading).
Since such characters are control/non-printable characters, it's doubtful that they will exist in the pattern. Unless, that is, you are doing something weird like modifying binary files - or Unicode files without the proper locale settings.
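The same idea can be written without Bash's $'...' quoting by letting printf produce the control character. This is a sketch only: the names soh and replace_all are made up, and it assumes a sed that accepts a control character as delimiter, as GNU sed does.

```shell
# The SOH control character (octal 001), produced portably with printf.
soh=$(printf '\001')

# Hypothetical helper: $1 is a BRE, $2 the replacement. Neither may contain
# SOH, but both may contain /, |, !, # and other common delimiters.
replace_all() {
    sed "s${soh}$1${soh}$2${soh}g"
}

printf '%s\n' 'path=/usr/local|opt' | replace_all '/usr/local|opt' '#home!'
```

This prints path=#home!. The first argument is still a regular expression and & or a backslash in the second argument keep their special meaning, so the delimiter choice only solves the delimiter problem.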
Another way to do this is to use Shell Parameter Substitution.
${parameter/pattern/replace} # substitute replace for pattern once
or
${parameter//pattern/replace} # substitute replace for pattern everywhere
Here is a quite complex example that is difficult with sed:
$ parameter="Common sed delimiters: [sed-del]"
$ pattern="\[sed-del\]"
$ replace="[/_%:\\#]"
$ echo "${parameter//$pattern/$replace}"
result is:
Common sed delimiters: [/_%:\#]
However, this only works with bash parameters, not with files, where sed excels.
There is no such option for multi-character expression delimiters in sed, but I doubt you need that. The delimiter character should not occur in the pattern, but if it appears in the string being processed, it's not a problem. And unless you're doing something extremely weird, there will always be some character that doesn't appear in your search pattern that can serve as a delimiter.
You need the nested delimiter facility that Perl offers. That allows you to use operations like matching, substituting, and transliterating without worrying about the delimiter being included in your contents. Since Perl is a superset of sed, you should be able to use it for whatever you use sed for.
Consider this:
$ perl -nle 'print if /something/' inputs
Now if your something contains a slash, you have a problem. The way to fix this is to change the delimiter, preferably to a bracketing one. So for example, you could have anything you like in the $WHATEVER shell variable (provided the brackets are balanced), which gets interpolated by the shell before Perl is even called here:
$ perl -nle "print if m($WHATEVER)" /usr/share/dict/words
That works even if you have correctly nested parens in $WHATEVER. The four bracketing pairs which correctly nest like this in Perl are < >, ( ), [ ], and { }. They allow arbitrary contents that include the delimiter if that delimiter is balanced.
If it is not balanced, then do not use a delimiter at all. If the pattern is in a Perl variable, you don’t need to use the match operator provided you use the =~ operator, so:
$whatever = "some arbitrary string ( / # [ etc";
if ($line =~ $whatever) { ... }
With the help of Jim Lewis, I finally did a test before using sed :
if [ `echo $1 | grep '|'` ]; then
grep ".*$1.*:" $DB_FILE | sed "s#^.*$1*.*\(:\)## "
else
grep ".*$1.*:" $DB_FILE | sed "s|^.*$1*.*\(:\)|| "
fi
Thanks for help
Wow. I totally did not know that you could use any character as a delimiter.
At least half the time I use sed and BREs, it's on paths, code snippets, junk characters, things like that. I end up with a bunch of horribly unreadable escapes which I'm not even sure won't die on some combination I didn't think of. But if you can exclude just some character class (or even just one character), you can pick the delimiter from it:
echo '#01Y $#1+!' | sed -e 'sa$#1+ashita' -e 'su#01YuHolyug'
> > > Holy shit!
That's so much easier.
Escaping the delimiter inline for BASH to parse is cumbersome and difficult to read (although the delimiter does need escaping for sed's benefit when it's first used, per-expression).
To pull together thkala's answer and user4401178's comment:
DELIM=$(echo -en "\001");
sed -n "\\${DELIM}${STARTING_SEARCH_TERM}${DELIM},\\${DELIM}${ENDING_SEARCH_TERM}${DELIM}p" "${FILE}"
This example prints every line from the first one matching ${STARTING_SEARCH_TERM} through the next one matching ${ENDING_SEARCH_TERM}. It works as long as neither search term contains the SOH (start of heading) character, ASCII code 001.
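Here is a self-contained sketch of that range address that can be run as is. The name print_range and the sample input are made up for illustration, and printf is used instead of echo -en to produce the delimiter.

```shell
# The SOH control character (octal 001) used as the address delimiter.
soh=$(printf '\001')

# Hypothetical helper: print the lines from the first match of BRE $1
# through the next match of BRE $2. A custom address delimiter must be
# introduced with a backslash, hence the "\\" before each opening ${soh}.
print_range() {
    sed -n "\\${soh}$1${soh},\\${soh}$2${soh}p"
}

printf '%s\n' x a/b mid c/d y | print_range 'a/b' 'c/d'
```

This prints the three lines a/b, mid and c/d, even though both search terms contain a slash.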
There's no universal separator, but the separator can be escaped with a backslash so that sed does not treat it as a separator (at least unless you choose the backslash character itself as the separator).
Depending on the actual application, it might be handy to just escape those characters in both pattern and replacement.
If you're in a bash environment, you can use bash substitution to escape the sed separator, like this:
safe_replace () {
sed "s/${1//\//\\\/}/${2//\//\\\/}/g"
}
It's pretty self-explanatory, except for the bizarre part.
Explanation to that:
${1//\//\\\/}
${ - bash expansion starts
1 - first positional argument - the pattern
// - bash pattern substitution pattern separator "replace-all" variant
\/ - literal slash
/ - bash pattern substitution replacement separator
\\ - literal backslash
\/ - literal slash
} - bash expansion ends
example use:
$ input="ka/pus/ta"
$ pattern="/pus/"
$ replacement="/re/"
$ safe_replace "$pattern" "$replacement" <<< "$input"
ka/re/ta
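The function above only escapes the separator, so regex metacharacters in the pattern and & or backslashes in the replacement are still interpreted by sed. A possible extension, written as a sketch that does not need bash-specific expansion, escapes those as well; the name literal_replace is made up for illustration, and it assumes neither argument contains a newline.

```shell
# Hypothetical helper: replace every occurrence of the literal string $1
# with the literal string $2 in the text read from stdin.
literal_replace() {
    # Escape BRE metacharacters and the "/" separator in the pattern.
    pat=$(printf '%s\n' "$1" | sed 's|[][\.*^$/]|\\&|g')
    # Escape backslash, "&" and the "/" separator in the replacement.
    rep=$(printf '%s\n' "$2" | sed 's|[\&/]|\\&|g')
    sed "s/$pat/$rep/g"
}

printf '%s\n' 'ka/pus/ta' | literal_replace '/pus/' '/re/'
printf '%s\n' 'axb a.b' | literal_replace 'a.b' '&'
```

The first command prints ka/re/ta as before. The second prints axb &, because the dot no longer matches the x and the & is inserted literally.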