[[:space:]] is not matching newline in sed

I am trying to replace every occurrence of the word "BBB" in a file with "XXX", but for some reason I can't seem to make [[:space:]] match the newline:
[root#REDHAT]# cat file
AAA
BBB
BBB
CCC
[root#REDHAT]# sed 's/[[:space:]]BBB/XXX/g' file
AAA
BBB
XXX
CCC
Note how only the second BBB was replaced; [[:space:]] didn't match the newline preceding the first occurrence.

As Sundeep points out:
sed reads its input line by line by default.
Each line read doesn't include the trailing newline,
so the only thing [[:space:]] can possibly match is a space or a tab char.
Perhaps the following command does what you want (works with BSD Sed and GNU Sed):
$ sed -E 's/(^|[[:blank:]])BBB/XXX/' file
AAA
XXX
XXX
CCC
(^|[[:blank:]]) matches either the beginning of a line (^) or a single tab or space character ([[:blank:]]).
I've omitted the g option, under the assumption that there's at most one BBB per line.
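The line-by-line behaviour described above can be checked directly. This is a minimal sketch that reuses the sample data from the question; the + separator and the variable names are only illustrative. By default the pattern space never contains a newline, so /\n/ matches nothing. Once N appends the next input line, the embedded newline is there to match.

```shell
# Sample data from the question.
data='AAA
BBB
BBB
CCC'

# Default mode: the trailing newline is stripped before the script runs,
# so a regex containing \n can never match and nothing is printed.
default_hits=$(printf '%s\n' "$data" | sed -n '/\n/p')

# N appends the next input line, joined by a newline, so now \n can match.
joined=$(printf '%s\n' "$data" | sed 'N;s/\n/+/')

printf 'lines matching \\n by default: [%s]\n' "$default_hits"
printf '%s\n' "$joined"
```

The first result is empty, and the second prints AAA+BBB followed by BBB+CCC, because N consumes the lines in pairs.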

Related

sed: get a line number with regex and insert text at that line

I want to get the first line of a file that is not commented out with a hash, then append a line of text just after that line.
I managed to get the number of the line:
sed -n '/^\s*#/!{=;q}' file   # prints 2
and also to insert text (specifying the line manually):
sed '2 a extralinecontent' file
I can't get them working together as a one liner or in a batch.
I tried command substitution (with $(command) and also with backticks) but I get an error from bash:
sed '$(sed -n '/^\s*#/!{=;q}' file) a extralinecontent' file
-bash: !{=: event not found
and also tried many other combinations, but no luck.
I'm using gnu-sed (via brew) on macOS.
This might work for you (GNU sed):
sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba' file
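Lines at the beginning of the file that start with a comment branch straight to the end of the script and are printed unchanged. The first line that is not a comment gets the extra line appended after it, and the loop in the second -e then keeps fetching and printing all the remaining lines of the file, so no later line is tested again.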
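To see that only the first non-comment line is affected, here is the same command run on a small made-up input (the five lines are hypothetical, and \s and the one-line a form assume GNU sed):

```shell
# Hypothetical input: two leading comments, then data, then a later comment.
out=$(printf '%s\n' '#one' '  #two' 'first' '#later comment' 'last' |
  sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba')

# The extra line lands after "first"; the later comment is passed through untouched.
printf '%s\n' "$out"
```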
Here's a way to do it with GNU sed without reading the file twice
$ cat ip.txt
#comment
foo baz good
123 456 7889
$ sed -e '0,/^\s*[^#[:space:]]/ {// a XYZ' -e '}' ip.txt
#comment
foo baz good
XYZ
123 456 7889
GNU sed allows the first address to be 0 if the other address is a regex; that way this works even if the first line matches the condition
/^\s*[^#[:space:]]/ since sed doesn't support possessive quantifiers, the character class has to exclude both # and whitespace, which ensures that the first non-blank character on the line is not a #
// is a handy shortcut that repeats the last regex
a XYZ appends your required line (note that your question mentions insert, so if you want that, use i instead of a)
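For completeness, the two-step idea from the question can also be made to work. The attempt shown there fails because the inner single quotes end the outer single-quoted string, which leaves !{= unquoted for interactive bash's history expansion, and $(...) is not expanded inside single quotes anyway. A rough sketch, assuming GNU sed for the one-line a form and using a hypothetical temporary file; unlike the answers above it reads the file twice:

```shell
# Hypothetical sample file with the same layout as ip.txt above.
f=$(mktemp)
printf '%s\n' '#comment' 'foo baz good' '123 456 7889' > "$f"

# Step 1: print the number of the first line that is not a comment, then quit.
n=$(sed -n '/^[[:space:]]*#/!{=;q;}' "$f")

# Step 2: double quotes let the shell expand $n before sed parses the script.
# The one-line form of "a" is a GNU extension.
result=$(sed "${n}a extralinecontent" "$f")

printf 'first non-comment line: %s\n' "$n"
printf '%s\n' "$result"
rm -f "$f"
```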

what does the "e" modifier to the s/// command mean in GNU sed?

I have read sed manual for the s/// command. There it says:
e
This command allows one to pipe input from a shell command into pattern
space. If a substitution was made, the command that is found in
pattern space is executed and pattern space is replaced with its output.
A trailing newline is suppressed; results are undefined if the command
to be executed contains a nul character. This is a GNU sed extension.
I don't see what this is useful for:
echo "1234" | sed 's/1/echo ss/e'
echo "1234" | sed 's/1/ss/'
These two commands produce the same output, so what is the e modifier for?
$ printf "%s\n" 1234 2345 3456 |
> sed -e 's/\(..\)\(..\)/echo $((\1 * \2))/e'
408
1035
1904
$
This printf command echoes three 4-digit numbers on three lines. The sed script splits each line into a pair of 2-digit numbers, creates a command echo $((12 * 34)), for example, runs it, and the output (408 for the given values) is included in (as) the pattern space — which is then printed. So, for this script, the pairs of 2-digit numbers are multiplied and the result is shown.
You can get fancier if you wish:
$ printf "%s\n" 1234 2345 3456 |
> sed -e 's/\(..\)\(..\)/echo \1 \\* \2 = $((\1 * \2))/e'
12 * 34 = 408
23 * 45 = 1035
34 * 56 = 1904
$
Note the double backslash — that's rather important. You could avoid the need for that using double quotes:
printf "%s\n" 1234 2345 3456 |
sed -e 's/\(..\)\(..\)/echo "\1 * \2 = $((\1 * \2))"/e'
Beware: the notation will run the external command every time the s/// command actually makes a substitution. If you have millions of lines of data in your files, that could mean millions of commands executed.
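If one shell per line is too expensive, the arithmetic can stay inside a single process. This sketch swaps in awk instead of the e flag (so it is a different technique from the one shown above) and uses the same three numbers:

```shell
# One awk process handles every line; no external command is started per line.
out=$(printf '%s\n' 1234 2345 3456 |
  awk '{ a = substr($0, 1, 2); b = substr($0, 3, 2); print a " * " b " = " a * b }')

printf '%s\n' "$out"
```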
The /e option is a GNU sed extension. It causes the result of the replacement to be passed to the shell for evaluation as a command. Observe:
vnix$ sed 's/a/pwd/' <<<a
pwd
vnix$ sed 's/a/pwd/e' <<<a
/home/tripleee
Your example caused identical behavior (echoing back exactly the replaced text) so it was a poorly chosen example.
Out of the box, the pattern space refers to the current input line, but there are ways to put something else in the pattern space. Substitution modifies the pattern space, but there are other sed commands which modify the pattern space in other ways. (For a trivial example, x swaps the pattern space with the hold space, which is basically another built-in variable which you can use for whatever you want.)
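A small illustration of the x command mentioned above (the three input words are made up). The hold space starts out empty, so swapping on every line makes the output lag one line behind the input:

```shell
# Each line is swapped with the hold space before being printed:
# line 1 prints the empty hold space, line 2 prints "one", line 3 prints "two".
out=$(printf '%s\n' one two three | sed x)

printf '%s\n' "$out"
```

The last input line ("three") ends up in the hold space and is never printed.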

Replace game1 with game001 with sed nested commands

cat 1.txt | sed -E 's,game([0-9]+),game$(printf %03d \1),g'
I'm using the command above to try to change 1.txt from:
game1 xxx vs yyy
game11 aaa vs bbb
to:
game001 xxx vs yyy
game011 aaa vs bbb
but the result is:
$ echo "game1 xxx vs yyy" | sed -E 's,game([0-9]+),game$(printf %03d \1),g'
game$(printf %03d 1) xxx vs yyy
How to make printf %03d \1 correctly evaluated?
You need to use double quotes if you want the shell to perform the command substitution:
sed -E "s,game([0-9]+),game$(printf %03d \1),g" 1.txt
Edit:
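However, the command above does not do what you want: the shell expands $(printf %03d \1) once, before sed runs, so every match is replaced with the same game001. I don't think sed can pass the value of \1 to an external command this way. perl can help in this case: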
$ cat 1.txt
game1 xxx vs yyy
game11 aaa vs bbb
game21 aaa vs bbb
$ sed -E "s,game([0-9]+),game$(printf %03d \1),g" 1.txt
game001 xxx vs yyy
game001 aaa vs bbb
game001 aaa vs bbb
$ # can also use: perl -pe 's/game\K\d+/sprintf "%03d", $&/ge'
$ perl -pe 's/game([0-9]+)/sprintf "game%03d", $1/ge' 1.txt
game001 xxx vs yyy
game011 aaa vs bbb
game021 aaa vs bbb
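With GNU sed specifically, the e flag of s/// might offer a way to get \1 into printf, because the command is built per line after sed has filled in the backreferences. This is only a sketch under several assumptions: the line starts with game, the rest of the line contains nothing the shell would interpret inside double quotes (no ", $, backticks or backslashes), and the number has no leading zero that printf would read as octal. Note that the whole pattern space is executed, so the rest of the line has to be captured and passed along too.

```shell
# GNU sed only: the replacement builds a printf command, and e runs it via the shell.
out=$(printf '%s\n' 'game1 xxx vs yyy' 'game11 aaa vs bbb' |
  sed -E 's/^game([0-9]+)(.*)$/printf "game%03d%s" "\1" "\2"/e')

printf '%s\n' "$out"
```

This starts one shell per matching line, so for anything but small files the perl version above is likely the better choice.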
You can't combine shell commands and sed backreferences like this (and if you could, you'd have to double quote the sed command, see other answer). The shell would try to evaluate the command before sed sees it, but \1 wouldn't mean anything to the shell.
You can do it as follows, though:
$ sed -E 's/^(game)([[:digit:]]+)/\100\2/;s/^(game).{0,2}([[:digit:]]{3})/\1\2/' 1.txt
game001 xxx vs yyy
game011 aaa vs bbb
The first substitution, s/^(game)([[:digit:]]+)/\100\2/, adds two zeros in front of the digits after game:
$ sed -E 's/game([[:digit:]]+)/game00\1/' 1.txt
game001 xxx vs yyy
game0011 aaa vs bbb
The second substitution, s/^(game).{0,2}([[:digit:]]{3})/\1\2/ removes up to two characters between game and three digits that follow it, to get rid of unwanted extra zeros.
Notice that
I've used / instead of , as delimiter, just because I'm more used to it.
I've anchored game at the start of the line with ^.
I've used one more capture group for game so I don't have to type it twice per command.
I've used the POSIX character class [[:digit:]] instead of [0-9].
I've used sed '<command>' 1.txt instead of cat 1.txt | sed '<command>' to avoid the useless use of cat.
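As a quick check of the two-step padding, here it is wrapped in a function (the name pad and the third input line are made up for illustration) and run on one-, two- and three-digit numbers:

```shell
# Step 1 always inserts "00"; step 2 drops up to two characters so that
# exactly three digits remain directly after "game".
pad() {
  sed -E 's/^(game)([[:digit:]]+)/\100\2/;s/^(game).{0,2}([[:digit:]]{3})/\1\2/'
}

out=$(printf '%s\n' 'game1 x' 'game11 y' 'game123 z' | pad)
printf '%s\n' "$out"
```

A number that already has three digits comes out unchanged, because the second substitution removes both of the zeros added by the first.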
Just use awk:
$ awk '{sub(/game/,""); $1=sprintf("game%03d",$1)} 1' file
game001 xxx vs yyy
game011 aaa vs bbb
or in general to operate on saved capture groups with GNU awk for the 3rd arg to match():
$ awk 'match($0,/(game)([0-9]+)(.*)/,a){ printf "%s%03d%s\n", a[1], a[2], a[3] }' file
game001 xxx vs yyy
game011 aaa vs bbb
With sed you'd need:
$ sed -E 's/(game)([0-9]) /\10\2 /; s/(game)([0-9]{2}) /\10\2 /' file
game001 xxx vs yyy
game011 aaa vs bbb

How to force sed to print what it does with my file?

My text01.txt file:
aaa
bbb
ccc
ddd
c
ee
My code:
sed -i 's/c/X/g' ./text01.txt
I want to get in terminal something like this:
sed: line 3 change ccc to XXX
sed: line 5 change c to X
sed -i"bak" 's/c/X/g' text01.txt && diff text01.txt text01.txtbak
gives you a diff summary, like:
3c3
< XXX
---
> ccc
5c5
< X
---
> c
You can read the diff man page to adjust the diff output as you like, e.g. with the -c, -u or -y options.
If you want exactly the format you described, you can post-process the diff output as well.
This comes pretty close to your requirement:
$ paste <(cat -n text01.txt) <(sed 's/c/X/g' ./text01.txt)
1 aaa aaa
2 bbb bbb
3 ccc XXX
4 ddd ddd
5 c X
6 ee ee
cat -n prepends line numbers, and the paste command with process substitution prints the file and the sed output next to each other.
Or, more elaborate, with awk:
awk '{ getline mod_line < ARGV[2]
       if ($0 != mod_line) {
           printf "sed line %d change %s to %s\n", NR, $0, mod_line }
     }' text01.txt <(sed 's/c/X/g' text01.txt)
This reads, for each line of text01.txt, the corresponding line as modified by sed. If they are different, the line number and both lines get printed:
sed line 3 change ccc to XXX
sed line 5 change c to X
plus an awk warning because it tries to close an anonymous pipe – this can be suppressed by redirecting stderr, i.e., appending 2> /dev/null to the command.
The closest thing to sed "built-in" debugging is the l command, which prints the current content of the pattern space. If you'd like to go all in, there are proper debuggers, for example sedsed.
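For instance, l makes otherwise invisible characters in the pattern space visible. A tiny sketch with a made-up input line containing a tab and a trailing space:

```shell
# l prints the pattern space unambiguously: a tab shows up as \t
# and the end of the line is marked with $.
out=$(printf 'a\tb \n' | sed -n 'l')

printf '%s\n' "$out"
```

This prints a\tb $, which shows both the tab and the trailing space before the end-of-line marker.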
This might work for you (GNU sed):
sed -i -e 'h;/c/!b;s//X/g;H;x;s/\n/ to /;s/^/sed: changed /w/dev/stdout' -e 'x' file
This makes a copy of each line in the hold space (HS), and if the substitution pattern does not match, no further action takes place. Otherwise, the substitution is made on the line in the pattern space (PS) and the result is appended to the HS. The two spaces are then swapped, so the before and after text is now in the PS, where it is formatted into the message. The formatted line is written to standard output, i.e. the terminal, and finally the spaces are swapped back so that the substituted line is what ends up in the updated file.
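To try this without touching a real file, the following sketch recreates text01.txt from the question in a temporary file (the variable names are made up; GNU sed is assumed for -i and for the special /dev/stdout file name). A space is added after w purely for readability.

```shell
# Recreate text01.txt from the question in a temporary file.
f=$(mktemp)
printf '%s\n' aaa bbb ccc ddd c ee > "$f"

# -i sends the edited lines to the file; w /dev/stdout sends the messages to stdout.
log=$(sed -i -e 'h;/c/!b;s//X/g;H;x;s/\n/ to /;s/^/sed: changed /w /dev/stdout' -e 'x' "$f")
edited=$(cat "$f")

printf '%s\n' "$log"
rm -f "$f"
```

Here log holds the two "sed: changed ..." messages and edited holds the file contents after the substitution.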

Using sed selectively to delete lines

I have a text file (say file)
Name
aaa
bbb
ccc
Name
xxxx
Name
yyyy
tttt
I want to remove "Name" from the file except if it occurs in the header. I know sed removes lines, but if I do
sed '/Name/d' file
it removes all "Name".
Desired output:
Name
aaa
bbb
ccc
xxxx
yyyy
tttt
Can you suggest what options I should use?
Use this:
sed '1!{/Name/d}' file
This applies the delete command to every line except the first.
If you know that the first header is on the first line, skip it like this:
sed '1!{/Name/d}' infile
That means the pattern should apply on all lines except line 1.
Or the other way around:
sed -n '2,${/Name/d};p' infile
Perhaps with awk:
awk '/Name/ && c++ == 0 || !/Name/' infile
Output in all cases:
Name
aaa
bbb
ccc
xxxx
yyyy
tttt
You might find the awk syntax more intuitive:
awk 'NR==1 || !/Name/' file
The above just says: if it's line number 1, or the line doesn't include "Name", print it.
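If the header is not guaranteed to be on line 1, a sed counterpart to the counting awk command above might look like this. It is a sketch that assumes GNU sed (for the 0,/regex/ address form) and uses a made-up input in which a title line precedes the header:

```shell
# Everything up to and including the first "Name" is left alone;
# outside that range, lines matching "Name" are deleted.
out=$(printf '%s\n' 'title' 'Name' 'aaa' 'Name' 'bbb' |
  sed '0,/Name/!{/Name/d}')

printf '%s\n' "$out"
```

Starting the range at 0 rather than 1 means it also behaves correctly when the header is on the very first line.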