Escape line beginning and end in bracket expressions in sed

How do you escape line beginning and line end in bracket expressions in sed?
For example, let's say I want to replace every comma, the line beginning, and the line end in each line with a pipe:
echo "a,b,c" | sed 's/,/|/g'
# a|b|c
echo "a,b,c" | sed 's/^/|/g'
# |a,b,c
echo "a,b,c" | sed 's/$/|/g'
# a,b,c|
echo "a,b,c" | sed 's/[,^$]/|/g'
# a|b|c
I would expect the last command to produce |a|b|c|. I also tried escaping the line beginning and line end via backslash, with no change.

With GNU sed with extended regular expressions, you can do:
$ echo "a,b,c" | /opt/gnu/bin/sed -E 's/^|,|$/|/g'
|a|b|c|
$
The -E option enables the extended regular expressions, as does -r, but -E is also used by other sed variants for the same purpose, unlike -r.
However, for reasons which elude me, the BSD (macOS) variant of sed produces:
$ echo "a,b,c" | sed -E 's/^|,|$/|/g'
|a|b|c
$
I can't think why.
If this variability is unacceptable, go with the three-substitution solution:
$ echo "a,b,c" | sed -e "s/^/|/" -e "s/$/|/" -e "s/,/|/g"
|a|b|c|
$
which should work with any variant of sed. However, note that echo "" | sed …3 subs… produces || whereas the -E variant produces |. I'm not sure if there's an easy fix for that.
You tried this, but it didn't do what you wanted:
$ echo "a,b,c" | sed 's/[,^$]/|/g'
a|b|c
$
This is what should be expected. Inside character classes, most special characters lose their special-ness. There is nothing special about $ (or , but it isn't a metacharacter anyway) in a character class; ^ is only special at the start of the class and it negates the character class. That means that what follows shows the correct, expected behaviour from this permutation of the contents of your character class:
$ echo "a,b\$\$b,c" | sed 's/[^,$]/|/g'
|,|$$|,|
$
It mapped all the non-comma, non-dollar characters to pipes. I should be using single quotes around the echo; then the backslashes wouldn't be necessary. I just followed the question's code quietly.
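To make the point about bracket expressions concrete, here is a minimal sketch. The helper name pipe_literals and the sample input are made up for illustration. It shows that ^ (when it is not the first character) and $ inside the brackets match only those literal characters, never the line boundaries:

```shell
# Hypothetical helper: replace every literal ',', '^' or '$' with '|'.
pipe_literals() {
    sed 's/[,^$]/|/g'
}

# The line boundaries are left alone; only the three literal characters change.
printf '%s\n' 'a^b$c,d' | pipe_literals
# a|b|c|d
```

That is why the anchors have to live outside the brackets, either as alternation branches or as separate substitutions.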

The following sed command may help:
echo "a,b,c" | sed 's/^/|/;s/,/|/g;s/$/|/'
Output will be as follows.
|a|b|c|

Related

Using a single sed call to split and grep

This is mostly out of curiosity; I am trying to get the same behavior as:
echo -e "test1:test2:test3"| sed 's/:/\n/g' | grep 1
in a single sed command.
I already tried
echo -e "test1:test2:test3"| sed -e "s/:/\n/g" -n "/1/p"
But I get the following error:
sed: can't read /1/p: No such file or directory
Any idea on how to fix this and combine different types of commands into a single sed call?
Of course this is overly simplified compared to the real use case, and I know I can get around it by using multiple calls; again, this is just out of curiosity.
EDIT: I am mostly interested in the sed tool, I already know how to do it using other tools, or even combinations of those.
EDIT2: Here is a more realistic script, closer to what I am trying to achieve:
arch=linux64
base=https://chromedriver.storage.googleapis.com
split="<Contents>"
curl $base \
| sed -e 's/<Contents>/<Contents>\n/g' \
| grep $arch \
| sed -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
What I would like to simplify is the curl line, turning it into something like:
curl $base \
| sed 's/<Contents>/<Contents>\n/g' -n '/1/p' -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
Here are some alternatives, awk and sed based:
sed -E "s/(.*:)?([^:]*1[^:]*).*/\2/" <<< "test1:test2:test3"
awk -v RS=":" '/1/' <<< "test1:test2:test3"
# or also
awk 'BEGIN{RS=":"} /1/' <<< "test1:test2:test3"
Or, using your logic, you would need to pipe a second sed command:
sed "s/:/\n/g" <<< "test1:test2:test3" | sed -n "/1/p"
The awk solution looks cleanest.
Details
In the sed solution, the (.*:)?([^:]*1[^:]*).* pattern matches an optional sequence of any 0+ chars followed by a :, then captures into Group 2 any 0 or more chars other than :, a 1, and again 0 or more chars other than :, and then just matches the rest of the line. The replacement keeps only the Group 2 contents. A line with no field containing 1 does not match and is printed unchanged.
In the awk solution, the record separator is set to : and then the /1/ regex is used to only return the records having 1 in them.
This might work for you (GNU sed):
sed 's/:/\n/;/^[^\n]*1/P;D' file
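Replace the first : with a newline and, if the first line in the pattern space contains 1, print it. D then deletes that first line and restarts the cycle on the remainder.
Repeat until the pattern space is empty.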
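The error in the question arises because -n takes no argument, so "/1/p" ends up being read as an input file name. As a minimal sketch (GNU sed is assumed, for \n in the replacement and inside the bracket expression; split_grep is a made-up helper name), each script fragment can be passed with its own -e while -n is given once:

```shell
# Hypothetical helper: print only the ':'-separated fields that contain a 1.
split_grep() {
    # -n suppresses automatic printing; each script fragment gets its own -e.
    #   s/:/\n/       turn the first ':' into a newline
    #   /^[^\n]*1/P   if the first field contains 1, print up to the newline
    #   D             drop the first field and restart on the remainder
    sed -n -e 's/:/\n/' -e '/^[^\n]*1/P' -e 'D'
}

printf '%s\n' 'test1:test2:test3' | split_grep
# test1
```

Because D restarts the cycle without reading new input, every field of the line is tested in turn.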
An alternative:
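sed -Ez 's/:/\n/g;s/^[^1]*$//mg;s/\n+/\n/g;s/^\n//' file
This slurps the whole file into memory and replaces all colons with newlines. All lines that do not contain 1 are emptied and the surplus newlines are then squeezed (the g flag on s/\n+/\n/ is needed so that every run of newlines is squeezed, not just the first one).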
An alternative to the really ugly sed is: grep -o '\w*2\w*'
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | grep -o '\w*2\w*'
test2
bob2
fred2
grep -o: only matching
Or: grep -o '[^:]*2[^:]*'
echo -e "test1:test2:test3" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D'
sed -n doesn't print unless told to
sed -E allows using parens to match (\n|$) which is newline or the end of the pattern space
P prints the pattern buffer up to the first newline.
D trims the pattern buffer up to the first newline
[^\n] is a character class that matches anything except a newline
// (an empty regex) is sed shorthand for reusing the last regex that was applied
//!D therefore runs D only when that regex did not match
So, after you split into newlines, you want to make sure the 2 character is between the start of the pattern buffer ^ and the first newline.
And, if there is not the character you are looking for, you want to D delete up to the first newline.
At that point, it works for one line of input, with one string containing the character you're looking for.
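To expand to several matches within a line, make the D unconditional, so that each field is dropped after it has been tested and the cycle restarts on the remainder. (The :a label and the trailing ta in the command below are never actually used, because D always restarts the cycle before ta is reached.)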
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | \
sed -En ':a s/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D;ta'
test2
bob2
fred2
This is simply NOT a job for sed. With GNU awk for multi-char RS:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '/1/'
test1
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' 'NR%2'
test1
test3
test5
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '!(NR%2)'
test2
test4
test6
$ echo "foo1:bar1:foo2:bar2:foo3:bar3" | awk -v RS='[:\n]' '/foo/ || /2/'
foo1
foo2
bar2
foo3
With any awk you'd just have to strip the \n from the final record before operating on it:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS=':' '{sub(/\n$/,"")} /1/'
test1

How to replace only last match in a line with sed?

With sed, I can replace the first match in a line using
sed 's/pattern/replacement/'
And all matches using
sed 's/pattern/replacement/g'
How do I replace only the last match, regardless of how many matches there are before it?
Copy pasting from something I've posted elsewhere:
$ # replacing last occurrence
$ # can also use sed -E 's/:([^:]*)$/-\1/'
$ echo 'foo:123:bar:baz' | sed -E 's/(.*):/\1-/'
foo:123:bar-baz
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):/\1-/'
456:foo:123:bar:789-baz
$ echo 'foo and bar and baz land good' | sed -E 's/(.*)and/\1XYZ/'
foo and bar and baz lXYZ good
$ # use word boundaries as necessary - GNU sed
$ echo 'foo and bar and baz land good' | sed -E 's/(.*)\band\b/\1XYZ/'
foo and bar XYZ baz land good
$ # replacing last but one
$ echo 'foo:123:bar:baz' | sed -E 's/(.*):(.*:)/\1-\2/'
foo:123-bar:baz
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):(.*:)/\1-\2/'
456:foo:123:bar-789:baz
$ # replacing last but two
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):((.*:){2})/\1-\2/'
456:foo:123-bar:789:baz
$ # replacing last but three
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):((.*:){3})/\1-\2/'
456:foo-123:bar:789:baz
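The "last but N" commands above differ only in the interval count, so they can be folded into one parameterised call. This is only a sketch: colon_to_dash_skipping is a made-up name, $1 is assumed to be a positive integer, and a sed that supports -E is assumed.

```shell
# Hypothetical helper: replace with '-' the colon that has exactly $1 colons
# after it on the line. $1 must be a positive integer.
colon_to_dash_skipping() {
    # Inside double quotes the shell expands $1 but leaves \1 and \2 for sed.
    sed -E "s/(.*):((.*:){$1})/\1-\2/"
}

printf '%s\n' '456:foo:123:bar:789:baz' | colon_to_dash_skipping 2
# 456:foo:123-bar:789:baz
```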
Further Reading:
Buggy behavior if word boundaries are used inside a group with quantifiers - for example: echo 'it line with it here sit too' | sed -E 's/with(.*\bit\b){2}/XYZ/' fails
Greedy vs. Reluctant vs. Possessive Quantifiers
Reference - What does this regex mean?
sed manual: Back-references and Subexpressions
This might work for you (GNU sed):
sed 's/\(.*\)pattern/\1replacement/' file
The greedy .* swallows as much of the line as possible; the regexp engine then steps back from the end of the line, so the first match it finds for the pattern is the last occurrence in the line.
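A minimal sketch of the greedy approach with concrete values. The helper name replace_last and its two parameters are made up for illustration, and both arguments are assumed to contain no / and no capture groups of their own:

```shell
# Hypothetical helper: replace the last match of BRE $1 on each line with $2.
replace_last() {
    # \(.*\) grabs the longest possible prefix, which pushes the match of $1
    # to the last place on the line where it can still succeed.
    sed "s/\(.*\)$1/\1$2/"
}

printf '%s\n' 'foo:123:bar:baz' | replace_last ':' '-'
# foo:123:bar-baz
```

Lines that do not contain the pattern pass through unchanged.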
A fun way to do this is to use rev to reverse the characters of each line and write your sed replacement backwards.
rev input_file | sed 's/nrettap/tnemecalper/' | rev

Using sed to make replacements only within part of a line

How can I replace '.' with '_' only within the part of the line before the '=' char in the inputs below?
I need a single sed command that handles all three cases:
echo "few.num.dots=/home/user/.hidden/folder.dot" | sed 's/\./_/g'
required output => few_num_dots=/home/user/.hidden/folder.dot
echo "var=nodot" | sed 's/\./_/g'
required output => var=nodot
echo "var.one=onedot.notthis" | sed 's/\./_/g'
required output => var_one=onedot.notthis
You can use conditional branching with the t command. It loops until the substitution command fails; each pass replaces one . character that is followed, after any number of non-= characters, by an equal sign:
echo "few.num.dots=/home/user/.hidden/folder.dot" |
sed ':a; s/\.\([^=]*=\)/_\1/; ta'
It yields:
few_num_dots=/home/user/.hidden/folder.dot
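As a quick check, the same loop can be run against all three inputs from the question. This is a sketch only; key_dots_to_underscores is a made-up name, and the script is split across several -e options so that the label ends at a line boundary, which is a conservative assumption about portability:

```shell
# Hypothetical helper: turn '.' into '_' in the part of each line before '='.
key_dots_to_underscores() {
    # :a   label marking the loop start
    # s    replace one '.' that still has an '=' somewhere to its right
    # ta   jump back to :a if the substitution changed anything
    sed -e ':a' -e 's/\.\([^=]*=\)/_\1/' -e 'ta'
}

printf '%s\n' \
    'few.num.dots=/home/user/.hidden/folder.dot' \
    'var=nodot' \
    'var.one=onedot.notthis' | key_dots_to_underscores
# few_num_dots=/home/user/.hidden/folder.dot
# var=nodot
# var_one=onedot.notthis
```

Note that a value which itself contains an = would have the dots before that second = replaced as well.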
perl?
echo "few.num.dots=/home/user/.hidden/folder.dot" |
perl -pe 's/^[^=]+/ ($x=$&) =~ tr{.}{_}; $x /e'
few_num_dots=/home/user/.hidden/folder.dot
awk?
awk -F= -v OFS='=' '{gsub(/\./,"_",$1)} 1'
You can do it this way as well,
echo "few.num.dots=/home/user/.hidden/folder.dot" |
sed -e '1,/./s/\./_/' -e '1,/./s/\./_/'
few_num_dots=/home/user/.hidden/folder.dot
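The first -e replaces the first occurrence of the pattern ., the next -e replaces the next one, and so on. Note that this hard-codes two replacements, so it only fits lines with exactly two dots before the =; on var.one=onedot.notthis the second -e would also change the dot after the =.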
Using awk
$ echo "few.num.dots=/home/user/.hidden/folder.dot" |awk '/=/{gsub(/\./,"_",$1)}1' FS="=" OFS="="
few_num_dots=/home/user/.hidden/folder.dot
$ echo "var.one=onedot.notthis" |awk '/=/{gsub(/\./,"_",$1)}1' FS="=" OFS="="
var_one=onedot.notthis
This might work for you (GNU sed):
sed 's/=/\n&/;h;y/./_/;G;s/\n.*\n.*\n//' file
Insert a marker to divide the line, copy the line, translate the characters, append the original line and using the marker reconstitute the line.
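One caveat with the hold-space approach is a line that has no = at all: no marker is inserted, so the final substitution fails and both copies are printed. A possible guard, sketched here under the assumption of GNU sed (for \n in the replacement) and with a made-up helper name, is to skip such lines up front:

```shell
# Hypothetical helper: same hold-space technique, but lines without '='
# are passed through untouched.
key_dots_guarded() {
    # /=/!b         no '=' on the line: branch to the end of the script
    # s/=/\n&/      put a newline marker in front of the first '='
    # h             save the marked line in the hold space
    # y/./_/        translate every '.' to '_' in the pattern space
    # G             append a newline plus the saved original line
    # s/\n.*\n.*\n// keep the translated key and the original value
    sed -e '/=/!b' -e 's/=/\n&/' -e 'h' -e 'y/./_/' -e 'G' -e 's/\n.*\n.*\n//'
}

printf '%s\n' 'var.one=onedot.notthis' 'no.equals.sign' | key_dots_guarded
# var_one=onedot.notthis
# no.equals.sign
```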

How can sed replace "\ " (backslash + space)?

In a bash script, files with spaces show up as "File\ with\ spaces.txt" and I want to substitute those slashed-spaces with either _ or +.
How can I tell sed to do that? I had no success using:
$1=~/File\ with\ spaces.txt
ext=$1
web=$(echo "$ext" | sed 's/\ /+/')
I'm open to suggestions if there's a better way than through sed.
[EDIT]: Foo Bah's solution works well, but it substitutes only the first space because the text following it is treated as arguments rather than part of the $1. Any way around this?
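sed "s/\\\\ /+/"
Inside double quotes (or backticks), \\\\ evaluates to \\ at the shell level, and then to a literal \ within sed. Inside single quotes the shell leaves backslashes alone, so there the equivalent is 's/\\ /+/'.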
Sed recognises "\ " (backslash + space) as a space just fine:
bee#i20 ~ $ echo file\ 123 | sed 's/\ /+/'
file+123
Your bash script syntax is all wrong, though.
Not sure what you were trying to do with the script, but here is an example of replacing spaces with +:
ext=~/File\ with\ spaces.txt
web=`echo "$ext" | sed 's/\ /+/g'`
echo $web
Upd:
Oh, and you need the g flag to replace all occurrences of space, not only the first one. Fixed above.
You want to escape the backslash:
web=$(echo "$ext" | sed 's/\\ /_/g')
single quotes are your friend
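To illustrate how the quoting level changes the number of backslashes needed, here is a small sketch. The function names and the sample value are made up, and the input is assumed to really contain a backslash before each space:

```shell
# Hypothetical helpers: both replace "backslash + space" with '+'.

# Single quotes: the shell passes \\ through untouched, and sed reads it
# as one literal backslash.
plus_single_quoted() {
    sed 's/\\ /+/g'
}

# Double quotes: the shell first turns \\\\ into \\, and sed then reads
# that as one literal backslash.
plus_double_quoted() {
    sed "s/\\\\ /+/g"
}

input='File\ with\ spaces.txt'
printf '%s\n' "$input" | plus_single_quoted
printf '%s\n' "$input" | plus_double_quoted
# Both print: File+with+spaces.txt
```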
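The following should be used with single-quoted args for $1 and $2:
#!/bin/bash
# Prefix every occurrence of TO_ESCAPE in IN_STRING with a backslash.
ESCAPE='\\'
if [ $# -ne 2 ]; then
    echo "$0 <TO_ESCAPE> <IN_STRING>"
    echo 'args should be in single quotes!!'
    exit 1
fi
TO_ESCAPE="${1}"
IN_STRING="${2}"
# A lone backslash must itself be escaped before it is used as a sed pattern.
# The variable is quoted so that a space or an empty value does not break the test.
if [ "${TO_ESCAPE}" = '\' ]; then
    TO_ESCAPE='\\'
fi
echo "${IN_STRING}" | sed "s/${TO_ESCAPE}/${ESCAPE}${TO_ESCAPE}/g"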

Trim text using sed

How do I remove the first and the last quotes?
echo "\"test\"" | sed 's/"//' | sed 's/"$//'
The above is working as expected, but I guess there must be a better way.
You can combine the sed calls into one:
echo "\"test\"" | sed 's/"//;s/"$//'
The command you posted will remove the first quote even if it's not at the beginning of the line. If you want to make sure that it's only done if it is at the beginning, then you can anchor it like this:
echo "\"test\"" | sed 's/^"//;s/"$//'
Some versions of sed don't like multiple commands separated by semicolons. For them you can do this (it also works in the ones that accept semicolons):
echo "\"test\"" | sed -e 's/^"//' -e 's/"$//'
Maybe you prefer something like this:
echo '"test"' | sed 's/^"\(.*\)"$/\1/'
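A short sketch of how the single anchored substitution behaves on a few inputs. The helper name strip_outer_quotes and the sample strings are made up for illustration:

```shell
# Hypothetical helper: remove one leading and one trailing double quote,
# but only when the line has both.
strip_outer_quotes() {
    sed 's/^"\(.*\)"$/\1/'
}

printf '%s\n' '"test"' '"te"st"' '"unbalanced' | strip_outer_quotes
# test
# te"st
# "unbalanced
```

Unlike the two-substitution version, this leaves a line alone when it has a quote at only one end, and inner quotes are always preserved.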
If you are sure there are no other quotes besides the first and last, just use the g flag:
$ echo "\"test\"" | sed 's/"//g'
test
If you have Ruby(1.9+)
$ echo $s
blah"te"st"test
$ echo $s | ruby -e 's=gets.split("\"");print "#{s[0]}#{s[1..-2].join("\"")+s[-1]}"'
blahte"sttest
Note that in this example the first and last quotes are not at the first and last positions of the string.
example with more quotes
$ s='bl"ah"te"st"tes"t'
$ echo $s | ruby -e 's=gets.split("\"");print "#{s[0]}#{s[1..-2].join("\"")+s[-1]}"'
blah"te"st"test
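For comparison, the same "first and last quote, wherever they are" behaviour can be sketched in sed itself. The helper name is made up; the second substitution relies on a greedy .* to reach the last remaining quote:

```shell
# Hypothetical helper: drop the first and the last double quote on a line,
# regardless of their position.
drop_first_and_last_quote() {
    # 1st expression: remove the first quote.
    # 2nd expression: \(.*\)" matches up to the last quote still present.
    sed -e 's/"//' -e 's/\(.*\)"/\1/'
}

printf '%s\n' 'blah"te"st"test' 'bl"ah"te"st"tes"t' | drop_first_and_last_quote
# blahte"sttest
# blah"te"st"test
```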