sed - What's the difference here?

test.txt file contains:
AAAAA
BBBBB
CCCCC
or in hex:
41 41 41 41 41 0A 42 42 42 42 42 0A 43 43 43 43 43 0A
If I run:
sed s/A/B/g test.txt
it returns:
BBBBB
BBBBB
CCCCC
Likewise:
sed 's/\x41/B/g' test.txt
returns:
BBBBB
BBBBB
CCCCC
but if I run:
sed 's/\x0A/B/g' test.txt
it still returns:
AAAAA
BBBBB
CCCCC
Why?

sed works on one line at a time. For each line of the file, sed strips the trailing newline (\n), places the rest in the pattern space, and runs the script on it. When the script finishes, sed appends the newline again, prints the line (unless the -n option suppresses automatic printing), and reads the next line into the pattern space. This continues until the end of the file is reached.
In your attempt, the newline has already been removed by the time sed applies your substitution, so there is nothing for \x0A to match and the command is a no-op. sed then restores the newline, prints the first line, reads the second line into the pattern space and continues.
To get your desired output, you have to read the entire file into the pattern space, with the lines separated by newline characters.
You can do so by saying:
$ sed ':a;N;s/\x0A/B/;ba' file
AAAAABBBBBBBCCCCC
:a creates a label.
N appends the next line to the pattern space, separated by a newline, so the pattern space now contains line1\nline2.
s/\x0A/B/ replaces the \n in the pattern space with B.
ba tells sed to branch back to label :a and repeat the process.
On the second pass, sed again appends the next line to the pattern space, which now looks like line1Bline2\nline3. When the substitution runs again, you are left with your desired output.
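The loop can be checked end to end with a small script. This is a minimal sketch assuming GNU sed (the one-line form with semicolons after the label is not accepted by every sed); the temporary file is a stand-in for test.txt, and the newline is written as \n.

```shell
# Temporary stand-in for test.txt (created here only for the demonstration).
tmp=$(mktemp)
printf 'AAAAA\nBBBBB\nCCCCC\n' > "$tmp"

# Line at a time: the newline is never in the pattern space, so nothing matches.
per_line=$(sed 's/\n/B/g' "$tmp")

# N appends the next line, so the embedded newline becomes visible to s///.
joined=$(sed ':a;N;s/\n/B/;ba' "$tmp")

printf '%s\n' "$per_line" "$joined"
rm -f "$tmp"
```

The first command prints the file unchanged; the second prints the single joined line.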

Related

combine sed d and sed a scripts to a single script

file xz.txt
123
456
789
I want to merge
sed -i '1d' xz.txt
sed -i '1a abc' xz.txt
I tried
sed -i -e '1d' -e '1a abc' xz.txt
expect to get
456
abc
789
but I got
456
789
sed (GNU sed) 4.7
but it doesn't work, any help?
sed goes line by line. On line 1, the first command (1d) deletes the line and ends processing for it, so the second command (1a abc) is never reached; the next line sed looks at is already line 2. Here is how it should be:
$ sed '1d; 2a abc' f
456
abc
789
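The difference is easy to see side by side. A minimal sketch, with the sample data fed through a pipe instead of xz.txt (an assumption made so the file is not modified) and the a command written in its portable two-line form:

```shell
# Both commands addressed to line 1: d ends the cycle, so a is never reached.
both_on_1=$(printf '123\n456\n789\n' | sed -e '1d' -e '1a\
abc')

# Append addressed to line 2, the first line that survives the delete.
append_on_2=$(printf '123\n456\n789\n' | sed -e '1d' -e '2a\
abc')

printf '%s\n' "$both_on_1" '--' "$append_on_2"
```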
What is going on is that the delete statement automatically ends the processing sequence:
[1addr]a\
text Write text to standard output as described previously (yes there is a new-line here)
[2addr]d: Delete the pattern space and start the next cycle
(Source: POSIX)
As the a command does not modify the pattern space but just writes to stdout, it still takes effect if it comes before d (note that abc is then printed where line 1 was, ahead of 456):
[POSIX]$ sed -e '1a\
abc' -e '1d'
[GNU]$ sed -e '1a abc' -e '1d'
However, the easiest is just to use the replace command c:
[POSIX]$ sed -e '1c\
abc'
[GNU]$ sed -e '1c abc'
Note: The reason the commands a and c write directly to the output and not to the pattern space is most likely that it would mess up the address ranging using line-numbers.
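To see what these variants actually print, here is a small sketch using the question's sample data fed through a pipe (an assumption for the demonstration, since the commands above name no input file):

```shell
# a queues its text for output; d then drops line 1, but the queued text is
# still written at the end of the cycle.
a_then_d=$(printf '123\n456\n789\n' | sed -e '1a\
abc' -e '1d')

# c replaces line 1 with the given text in a single command.
changed=$(printf '123\n456\n789\n' | sed -e '1c\
abc')

printf '%s\n' "$a_then_d" '--' "$changed"
```

Both variants put abc where line 1 was. To get 456, abc, 789 instead, the append has to target line 2, as in sed '1d; 2a abc'.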

[[:space:]] is not matching newline in sed

I am trying to replace every word "BBB" in a file with "XXX", but for some reason I can't seem to make [[:space:]] match the newline:
[root#REDHAT]# cat file
AAA
BBB
BBB
CCC
[root#REDHAT]# sed 's/[[:space:]]BBB/XXX/g' file
AAA
BBB
XXX
CCC
Note how only the second BBB was replaced; [[:space:]] didn't match the newline preceding the first occurrence.
As Sundeep points out:
sed reads its input line by line by default.
Each line read doesn't include the trailing newline,
so the only thing [[:space:]] can possibly match is a space or a tab char.
Perhaps the following command does what you want (works with BSD Sed and GNU Sed):
$ sed -E 's/(^|[[:blank:]])BBB/XXX/' file
AAA
XXX
XXX
CCC
(^|[[:blank:]]) matches either the beginning of a line (^) or a single tab or space character ([[:blank:]]).
I've omitted the g option, under the assumption that there's at most one BBB per line.
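A small sketch comparing the two patterns. The input is an assumption: the outputs shown above are consistent with the second BBB line starting with a space that is not visible in the listing, so the sample below is built that way.

```shell
# Hypothetical input: the second BBB line is assumed to start with a space.
sample=$(printf 'AAA\nBBB\n BBB\nCCC')

# [[:space:]] must consume a real character, so a BBB at the very start of a
# line is missed.
space_only=$(printf '%s\n' "$sample" | sed 's/[[:space:]]BBB/XXX/g')

# (^|[[:blank:]]) also accepts the start of the line.
anchored=$(printf '%s\n' "$sample" | sed -E 's/(^|[[:blank:]])BBB/XXX/')

printf '%s\n' "$space_only" '--' "$anchored"
```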

what does the "e" modifier to the s/// command mean in GNU sed?

I have read sed manual for the s/// command. There it says:
e
This command allows one to pipe input from a shell command into pattern
space. If a substitution was made, the command that is found in
pattern space is executed and pattern space is replaced with its output.
A trailing newline is suppressed; results are undefined if the command
to be executed contains a nul character. This is a GNU sed extension.
I don't see what this is useful for:
echo "1234" | sed 's/1/echo ss/e'
echo "1234" | sed 's/1/ss/'
These two commands produce the same output, so what is the e modifier for?
$ printf "%s\n" 1234 2345 3456 |
> sed -e 's/\(..\)\(..\)/echo $((\1 * \2))/e'
408
1035
1904
$
This printf command echoes three 4-digit numbers on three lines. The sed script splits each line into a pair of 2-digit numbers, creates a command echo $((12 * 34)), for example, runs it, and the output (408 for the given values) is included in (as) the pattern space — which is then printed. So, for this script, the pairs of 2-digit numbers are multiplied and the result is shown.
You can get fancier if you wish:
$ printf "%s\n" 1234 2345 3456 |
> sed -e 's/\(..\)\(..\)/echo \1 \\* \2 = $((\1 * \2))/e'
12 * 34 = 408
23 * 45 = 1035
34 * 56 = 1904
$
Note the double backslash — that's rather important. You could avoid the need for that using double quotes:
printf "%s\n" 1234 2345 3456 |
sed -e 's/\(..\)\(..\)/echo "\1 * \2 = $((\1 * \2))"/e'
Beware: the notation will run the external command every time the s/// command actually makes a substitution. If you have millions of lines of data in your files, that could mean millions of commands executed.
The /e option is a GNU sed extension. It causes the result of the replacement to be passed to the shell for evaluation as a command. Observe:
vnix$ sed 's/a/pwd/' <<<a
pwd
vnix$ sed 's/a/pwd/e' <<<a
/home/tripleee
Your example caused identical behavior (echoing back exactly the replaced text) so it was a poorly chosen example.
Out of the box, the pattern space refers to the current input line, but there are ways to put something else in the pattern space. Substitution modifies the pattern space, but there are other sed commands which modify the pattern space in other ways. (For a trivial example, x swaps the pattern space with the hold space, which is basically another built-in variable which you can use for whatever you want.)
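For a deterministic comparison that does not depend on the current directory, here is a minimal sketch. It assumes GNU sed, since the e flag is a GNU extension; the two-line input is made up for the demonstration.

```shell
# Without e, the replacement text is just text.
plain=$(printf 'a\nb\n' | sed 's/a/echo hi/')

# With e, the pattern space is run as a shell command after a successful
# substitution and replaced by that command's output. Line "b" has no match,
# so nothing is executed for it.
executed=$(printf 'a\nb\n' | sed 's/a/echo hi/e')

printf '%s\n' "$plain" '--' "$executed"
```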

How to force sed to print what it does with my file?

My text01.txt file:
aaa
bbb
ccc
ddd
c
ee
My code:
sed -i 's/c/X/g' ./text01.txt
I want to get in terminal something like this:
sed: line 3 change ccc to XXX
sed: line 5 change c to X
sed -i"bak" 's/c/X/g' text01.txt && diff text01.txt text01.txtbak
will give you a diff summary, like:
3c3
< XXX
---
> ccc
5c5
< X
---
> c
You can read the diff man page to adjust the diff output as you like, e.g. with the -c/-u/-y options.
If you want exactly the format you described, you can post-process the diff output as well.
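The same comparison also works as a dry run, without -i and without a backup file. A minimal sketch; the temporary file is a stand-in for text01.txt, and the original is passed to diff first, so the < and > sides are swapped relative to the output above:

```shell
# Temporary stand-in for text01.txt (created here only for the demonstration).
tmp=$(mktemp)
printf 'aaa\nbbb\nccc\nddd\nc\nee\n' > "$tmp"

# Compare the original with sed's output on stdin; the file is not modified.
# diff exits with status 1 when the inputs differ, hence the "|| true".
preview=$(sed 's/c/X/g' "$tmp" | diff "$tmp" - || true)

printf '%s\n' "$preview"
rm -f "$tmp"
```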
This comes pretty close to your requirement:
$ paste <(cat -n text01.txt) <(sed 's/c/X/g' ./text01.txt)
1 aaa aaa
2 bbb bbb
3 ccc XXX
4 ddd ddd
5 c X
6 ee ee
cat -n prepends line numbers, and the paste command with process substitution prints the file and the sed output next to each other.
Or, more elaborate, with awk:
awk '{ getline mod_line < ARGV[2]
       if ($0 != mod_line) {
           printf "sed line %d change %s to %s\n", NR, $0, mod_line }
     }' text01.txt <(sed 's/c/X/g' text01.txt)
This reads, for each line of text01.txt, the corresponding line as modified by sed. If they are different, the line number and both lines get printed:
sed line 3 change ccc to XXX
sed line 5 change c to X
plus an awk warning because it tries to close an anonymous pipe – this can be suppressed by redirecting stderr, i.e., appending 2> /dev/null to the command.
The closest thing to sed "built-in" debugging is the l command, which prints the current content of the pattern space. If you'd like to go all in, there are proper debuggers, for example sedsed.
This might work for you (GNU sed):
sed -i -e 'h;/c/!b;s//X/g;H;x;s/\n/ to /;s/^/sed: changed /w/dev/stdout' -e 'x' file
This makes a copy of each line in the hold space (HS); if the substitution pattern does not match, no further action takes place. Otherwise, the substitution is made on the line in the pattern space (PS) and the result is appended to the HS. Focus then switches to the HS, where the before and after values are formatted. The formatted line is written to standard output, i.e. the terminal, and finally focus reverts to the PS so that the substituted line ends up in the updated file.
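The hold-space idea can be tried without touching any file. This is a simplified sketch rather than the command above: it drops -i and the w command, prints only the report lines via -n and p, omits line numbers, and reads the sample data from a pipe.

```shell
# h  copy the original line to the hold space
# s  apply the substitution in the pattern space
# H  append the result to the hold space, giving "old\nnew"
# g  copy the hold space back, then format and print it
report=$(printf 'aaa\nbbb\nccc\nddd\nc\nee\n' |
  sed -n '/c/{h;s/c/X/g;H;g;s/\n/ to /;s/^/change /;p;}')

printf '%s\n' "$report"
```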

How do I prepend a zero to a digit with sed?

I would like to add a zero to the middle of a line of formatted text using sed or awk.
Example Input File
line 1
line 2
line 3
Expected Output
line 01
line 02
line 03
There is some ambiguity in your question, but how about using printf with awk to pad the second field with zeros:
$ awk 'NF==2 { printf "%s %02d\n", $1, $2}' file
line 01
line 02
line 03
line 10
line 100
$ awk 'NF==2 { printf "%s %04d\n", $1, $2}' file
line 0001
line 0002
line 0003
line 0010
line 0100
If you don't want blank lines stripped do awk 'NF==2 { printf "%s %04d\n", $1, $2} NF!=2' file.
Use a POSIX Character Class with Backreference
Given /tmp/foo containing:
line 1
line 2
line 3
you can call sed 's/[[:digit:]]/0&/' /tmp/foo in order to get the following output:
line 01
line 02
line 03
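Note that s/[[:digit:]]/0&/ puts a zero in front of the first digit on each line, so a value like 10 becomes 010. If only single-digit values should be padded (an assumption, since the question does not say), one possible refinement is to require a lone digit after a space at the end of the line:

```shell
# Pad every number: the first digit on each line gets a leading zero.
pad_all=$(printf 'line 1\nline 10\n' | sed 's/[[:digit:]]/0&/')

# Pad only a lone digit that follows a space and ends the line.
pad_single=$(printf 'line 1\nline 10\n' | sed 's/ \([0-9]\)$/ 0\1/')

printf '%s\n' "$pad_all" '--' "$pad_single"
```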
In awk this is quite easy:
awk '{print $1, 0$2}' input.txt > output.txt
In this case the input text you have shown is in the file input.txt and I saved it to a new file output.txt.
This is also an easy command to understand. The $1 variable is your "line" text and the $2 variable is your number. This command assumes there are no spaces in your "line" text. Are there?
The simplest answer to your question as posted is:
sed 's/ / 0/'
If that doesn't do it, post some more representative input and expected output as there are many things you might need to take into account.
This might work for you (GNU sed):
sed 's/\</0/2' file
awk 'NF==2{$2="0"$2}1' your_file