How do I correctly use variables with sed? - sed

I need to use a string variable in a sed command. My attempt is given in script.sh. It does not do what I want, I assume because my variable contains characters that I need sed to evaluate. I am working in Bash on Linux.
input.txt
delicious.banana
gross.apple
script.sh
adjectives="delicious\|gross\|bearable\|yummy"
sed "s/\($adjectives\)\.//g" input.txt > output.txt
output.txt desired
banana
apple
output.txt current
deliciousbanana
grossdapple

Non-GNU sed implementations don't support \| in BRE (basic regex mode). I suggest using ERE (extended regex mode) with -E; as a bonus, you can drop all the escaping:
adjectives="delicious|gross|bearable|yummy"
sed -E "s/($adjectives)\.//g" input.txt
banana
apple
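If a word in the list may itself contain regex metacharacters (a literal dot, for example), it has to be escaped before it is interpolated into the pattern. A minimal sketch, assuming a hypothetical helper named escape_ere that is not part of the answer above:

```shell
# Hypothetical helper: backslash-escape ERE metacharacters in $1 so that
# sed -E matches the value literally.
escape_ere() {
    printf '%s\n' "$1" | sed 's,[][\.|$(){}?+*^/],\\&,g'
}

# Usage: strip a prefix whose text comes from a variable.
prefix=$(escape_ere 'so.so')
printf 'so.so.pear\nsoXso.pear\n' | sed -E "s/^($prefix)\.//"
```

The first input line loses its prefix; the second is left alone because the escaped dot no longer matches an arbitrary character.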

Related

How to replace only specific spaces in a file using sed?

I have this content in a file where I want to replace spaces at certain positions with pipe symbol (|). I used sed for this, but it is replacing all the spaces in the string. But I don't want to replace the space for the 3rd and 4th string.
How to achieve this?
Input:
test test test test
My attempt:
sed -e 's/ /|/g' file.txt
Expected Output:
test|test|test test
Actual Output:
test|test|test|test
sed 's/ /\
/3;y/\n / |/'
Because a newline cannot appear in the pattern space when sed first reads a line, you can change the third space to a newline, then use y to turn every newline into a space and every space into a pipe.
GNU sed can use \n in the replacement text:
sed 's/ /\n/3;y/\n / |/'
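One caveat of the newline trick, sketched here on the assumption of single-space-separated input: y still converts every space on lines that have fewer than three spaces. The function name is hypothetical.

```shell
# Hypothetical wrapper around the portable (multi-line) form of the command.
pipe_all_but_third() {
    sed 's/ /\
/3;y/\n / |/'
}

printf 'test test test test\n' | pipe_all_but_third   # third space is kept
printf 'a b c\n' | pipe_all_but_third                 # no third space: both spaces become pipes
```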
If the original input doesn't contain any pipe characters, you can do
sed -e 's/ /|/g' -e 's/|/ /3' file
to retain the third white space. Otherwise see other answers.
You could replace the 'first space' twice, e.g.
sed -e 's/ /|/' -e 's/ /|/' file.txt
Or, if you want to specify the positions (e.g. the 2nd and 1st spaces):
sed -e 's/ /|/2' -e 's/ /|/1' file.txt
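The order of the positional expressions matters, because each expression counts spaces in the line as left by the previous one. A small sketch on made-up input:

```shell
# Higher position first: the 1st space is still the 1st when the second
# expression runs, so the original 2nd and 1st spaces are replaced.
printf 'a b c d\n' | sed -e 's/ /|/2' -e 's/ /|/1'

# Lower position first: once the 1st space is gone, "/2" counts among the
# remaining spaces and hits what was originally the 3rd one.
printf 'a b c d\n' | sed -e 's/ /|/1' -e 's/ /|/2'
```

The first command prints a|b|c d, the second a|b c|d.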
Using GNU sed to replace the first and second one or more whitespace chunks:
sed -i -E 's/\s+/|/;s/\s+/|/' file
Details
-i - edit the file in place
-E - POSIX ERE syntax enabled
s/\s+/|/ - replaces the first run of one or more whitespace chars
; - and then
s/\s+/|/ - replaces the next run of one or more whitespace chars on each line (if present).
Keep it simple and use awk, e.g. using any awk in any shell on every Unix box no matter what other characters your input contains:
$ awk '{for (i=1;i<NF;i++) sub(/ /,"|")} 1' file
test|test|test test
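Each pass through the loop replaces the first remaining " " on the line. Note that sub() modifies $0, so awk re-splits the record and NF shrinks by one per replacement; for the four-field sample the loop therefore runs twice and the last " " survives. If you want to replace a specific number of spaces, e.g. 2, use a constant bound instead of NF: for (i=1;i<=2;i++).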
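Because sub() rewrites $0 and NF is recomputed afterwards, a loop bound based on NF moves while the loop runs. A sketch with a fixed count instead, where the variable n is an assumption introduced for illustration:

```shell
# Replace exactly the first n spaces on each line; n is passed in with -v.
printf 'a b c d e\n' | awk -v n=2 '{for (i=1;i<=n;i++) sub(/ /,"|")} 1'
```

With n=2 this prints a|b|c d e regardless of how many fields the line has.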

sed equivalent of perl -pe

I'm looking for an equivalent of perl -pe. Ideally, it would be replaced with sed, if that's possible. Any help is highly appreciated.
The code is:
perl -pe 's/^\[([^\]]+)\].*$/$1/g'
$ echo '[foo] 123' | perl -pe 's/^\[([^\]]+)\].*$/$1/g'
foo
$ echo '[foo] 123' | sed -E 's/^\[([^]]+)\].*$/\1/'
foo
sed by default accepts code from command line, so -e isn't needed (though it can be used)
printing the pattern space is default, so -p isn't needed and sed -n is similar to perl -n
-E is used here to be as close as possible to Perl regex. sed supports BRE and ERE (not as feature rich as Perl) and even that differs from implementation to implementation.
with BRE, the command for this example would be: sed 's/^\[\([^]]*\)\].*$/\1/'
\ isn't special inside character class unless it is an escape sequence like \t, \x27 etc
backreferences use \N format (and limited to maximum 9)
Also note that g flag isn't needed in either case, as you are using line anchors
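The points above can be tried side by side. A short sketch; the second input line is made up to show what happens when the pattern does not match:

```shell
# ERE and BRE forms of the same substitution; both print "foo".
echo '[foo] 123' | sed -E 's/^\[([^]]+)\].*$/\1/'
echo '[foo] 123' | sed 's/^\[\([^]]*\)\].*$/\1/'

# Lines without a leading [tag] pass through unchanged. To print only the
# lines where the substitution succeeded, combine -n with the p flag
# (similar in spirit to perl -ne 'print if s/.../.../').
printf '[foo] 123\nno tag here\n' | sed -nE 's/^\[([^]]+)\].*$/\1/p'
```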

Need to parse the following sed command: sed -e 's/ /\'$'\n/g'

I stumbled upon the command sed -e 's/ /\'$'\n/g' that supposedly takes an input and splits it at every space into new lines. Still, I don't quite get how the '$' works in the command. I know that s stands for substitute, / / stands for the blank space, \n stands for new line and /g is for global replacement, but I'm not sure how \'$' fits into the picture. Any light you can shed here would be much appreciated.
Basically it's meant for platform portability. With GNU sed it would be just
sed -e 's/ /\n/g'
because GNU sed is able to interpret \n as new line.
However, other versions of sed, like the BSD version (which comes with macOS), do not interpret \n in the replacement as a newline.
That's why the command is built out of two parts:
part1: sed -e 's/ /\'
part2: $'\n/g'
The $'\n/g' is an ANSI-C quoted string, parsed by the shell before it executes sed. Escape sequences like \n get expanded in such strings. This way, the author of the command passes a literal newline (0x0a) to sed rather than the two-character escape sequence \n (0x5c 0x6e).
One more thing, since the newline (0xa) is a command separator in sed, it needs to get escaped. That's why the \ at the end of the first part.
Alternatively you could just use a multiline version:
sed -e 's/ /\
/g'
Btw, I would have written the command like
sed -e 's/ /\'$'\n''/g'
that is, putting only the \n into the ANSI-C string. IMO that's easier to understand.
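The same idea also works in shells that lack $'...' quoting, by keeping the literal newline in a variable. A minimal sketch; the names nl and split_words are hypothetical:

```shell
# Store a literal newline in a variable (works in any POSIX shell).
nl='
'
# Inside double quotes, \\ becomes one backslash and ${nl} the raw newline,
# so sed receives: s/ /\<newline>/g
split_words() {
    sed "s/ /\\${nl}/g"
}

printf 'one two three\n' | split_words
```

This prints each word on its own line, just like the multiline version above.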

sed to replace non-printable character with printable character

Am running BASH and UNIX utilities on Windows 7.
Have a file that contains a vertical tab. Its hexadecimal value is 0x0B and its octal value is 013. I need to replace the character with a blank space.
Have tried this sed approach but it fails:
sed -e 's/'$(echo "octal-value")'/replace-word/g'
Specifically:
sed -e 's/'$(echo "\013")'/ /g'
Update:
Following this advice I use GNU sed and this approach:
sed -i 's:\0x0B: :g' file
but the stubborn vertical tab is still in the file.
What is the correct way to replace a non-printable character with a printable character?
GNU sed recognises hexadecimal escapes of the form \xHH:
sed -e 's/\x0b/ /g'
In answer to why the -e? If you use more than one sed expression, then each one must be preceded by the -e. So, for example:
echo foo bar bas zer | sed -e 's/zer/oh my/g' -e 's/bas/baz/'
would result in:
foo bar baz oh my
thus performing 2 different sed changes ('scripts') with only a single invocation. See the sed man pages for more details.
(the above example is, obviously, contrived. I, however, have seen a sed command in a script with 78 individual -e 'scripts'!)
If you only have one 'script', then the -e is optional, obviously.
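For a one-character replacement like this, tr is a portable alternative, since POSIX tr understands octal escapes. A sketch only; the function name is hypothetical and this is a different tool from the sed approach in the answer above:

```shell
# Hypothetical helper: turn every vertical tab (octal 013) into a space.
vt_to_space() {
    tr '\013' ' '
}

printf 'left\013right\n' | vt_to_space
```

Unlike sed -i, tr cannot edit a file in place, so its output has to be redirected to a new file.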

What is wrong with this sed expression?

sed 's_((checksum|compressed)=\").*(\")_\1\2_' -i filename
I am using this command to replace the values of the checksum and compressed fields with empty strings, but it didn't change anything.
For example, I want to change checksum="XXXXX" to checksum="", and also replace
compressed="XXXX" with compressed="".
What is wrong with my sed command?
It's because sed uses a funny regex dialect by default: you have to escape capturing brackets.
If you want to use the "normal" regex syntax that you're familiar with, use the -r flag (GNU sed) or the -E flag (BSD sed on Mac OS X; recent GNU sed accepts -E as well):
sed -r 's_((checksum|compressed)=\").*(\")_\1\3_' -i filename
Additionally, note that you have three sets of capturing brackets in your sed, and I think you want to change the \1\2 to \1\3. (\1 contains checksum=", \2 contains checksum, and \3 contains ").
(For interest, here's how you would do it without the extended-regexp (-r/-E) flag, note that capturing brackets and the OR | are only considered in the regex sense if they are escaped:
sed 's_\(\(checksum\|compressed\)=\"\).*\(\"\)_\1\3_' -i filename
)
This might work for you:
echo 'checksum="XXXXX" compressed="YYYYYYY"' |
sed 's/\(checksum\|compressed\)="[^"]*"/\1=""/g'
checksum="" compressed=""
In sed (without the -r switch), each of ()|+?{} must have a \ prepended to give it the qualities of grouping, alternation, one or more, zero or one, and intervals. .[]* work as metacharacters either way.
Try:
sed 's/\(\(checksum\|compressed\)\)="[^"]*"/\1=""/' -i filename
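One difference between the answers is worth seeing in action: .* is greedy, while [^"]* stops at the next quote. A small sketch on a made-up line that carries both attributes:

```shell
line='checksum="AAA" compressed="BBB"'

# Greedy .* runs to the last quote on the line, swallowing the second attribute.
printf '%s\n' "$line" | sed -E 's/((checksum|compressed)=").*(")/\1\3/'

# [^"]* stops at the closing quote of each value, so g can handle both.
printf '%s\n' "$line" | sed -E 's/(checksum|compressed)="[^"]*"/\1=""/g'
```

The first command prints checksum="" only; the second prints checksum="" compressed="".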