sed delete block of lines after pattern1 to pattern2, but not the line matching pattern1 itself?

I am struggling to use sed to process 'testfile.txt' so that, every time it encounters a line that starts with delete_me: abc, it will:
leave the line delete_me: abc intact
but delete all the lines that follow until the next blank line is reached in the file.
eg. I want this input:
delete_me: abc
sSAsaAaSA
AsaSAsaSAsa
asASAsS
^--- <blank line>
...to be changed to just this one line:
delete_me: abc
I have tried:
sed '/delete_me/ {n;d}' jil_testfile.txt
# deletes only the first line after 'delete_me'
sed '/delete_me/,/^$/d' jil_testfile.txt
# nearly works but deletes the 'delete_me' line too which I want to stay preserved.
Any suggestions please?

This might work for you (GNU sed):
sed -n ':a;/delete_me/{p;:b;n;//ba;bb};p' file
Print lines as normal until the first occurrence of delete_me. Print this line and do not print any further lines unless that line contains delete_me.
As the spec has changed since I wrote the first solution, here is a new one:
sed -n '/delete_me/{p;:a;n;/^$/b;ba};p' file
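A quick way to check the second command is to feed it a small sample. The sketch below assumes GNU sed and uses made-up input (the "keep this" line is not from the question). Note that the blank line that ends the block is consumed together with the block:

```shell
# Hypothetical sample: one delete_me block, a blank line, then an unrelated line.
sample=$(printf 'delete_me: abc\nsSAsaAaSA\nAsaSAsaSAsa\n\nkeep this')

# Same command as in the answer (GNU sed), reading stdin instead of a file.
result=$(printf '%s\n' "$sample" | sed -n '/delete_me/{p;:a;n;/^$/b;ba};p')

printf '%s\n' "$result"
# delete_me: abc
# keep this
```

If the blank separator should survive, the script would need to print it before branching; the question as asked does not require that.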

Related

Delete a paragraph from a file using sed

I have a markdown file that looks something like this:
markdown.md
# Title1
line 1
line 2
line 3
# Title2
line 1
line 2
line 3
I'd like to be able to delete one of the paragraphs by searching for the title. I would need to delete the title, the following line, and then every subsequent line that is not blank.
The desired output would be:
# Title2
line 1
line 2
line 3
I was doing some reading about using {} to group multiple commands together but I can't seem to quite get the syntax right.
cat markdown.md | sed '/^# Title1.*/,+1d {/^\s*$/d}'
My thinking was that this would delete the line beginning with '# Title1', then the following line with ,+1d, then subsequent lines until a blank line, but I see the following error:
sed: 1: "/^# Title1.*/,+1d { ...": extra characters at the end of d command
I've tried a few variations but no luck. Any help would be appreciated!
This is the kind of sed puzzle that makes me wish for a slightly different tool.
sed -n -e '/Title1/!{p;d;};n;' -e ':a' -e 'n;/./ba'
Loosely translated: "Don't print anything. If it doesn't contain 'Title1', then all right, print it, then start over with the next line. But if it does contain 'Title1', then grab the next line (which will be blank), enter a loop, and keep grabbing new lines until you come to the next empty line."
Using GNU sed
$ sed -z 's/# Title1[^#]*//' input_file
# Title2
line 1
line 2
line 3
This might work for you (GNU sed):
sed '/^# /h;G;/\n# Title1/!P;d' file
If a line begins # , make a copy.
Append the copy to each line and if that line does not contain \n# Title1, print it.
Delete all lines.
Alternative:
sed '/^# Title1/{:a;N;/\n#/!s/\n//;ta;D}' file
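To see the hold-space version (the h;G;P;d one) in action, here is a small sketch. It assumes the file has a blank line after each title and between paragraphs, as the wording of the question implies; the sample text itself is made up for illustration:

```shell
# Hypothetical markdown sample with blank lines around the titles.
sample=$(printf '# Title1\n\nline 1\nline 2\n\n# Title2\n\nline 1\nline 2')

# Remember the most recent heading in the hold space, append it to every
# line, and print only lines whose current heading is not "# Title1".
result=$(printf '%s\n' "$sample" | sed '/^# /h;G;/\n# Title1/!P;d')

printf '%s\n' "$result"
# # Title2
#
# line 1
# line 2
```

One thing to keep in mind: the address /\n# Title1/ is not anchored at the end, so a heading such as "# Title10" would be dropped as well.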

Can I avoid duplicate strings with the sed "a\" command?

I added the word "apple" under "true" in my file.txt.
The problem is that every time I run the command "apple" is appended.
$ sed -i '/true/a\apple' file.txt ...executed 3 times
$ cat file.txt
true
apple
apple
apple
If the word "apple" already exists, I don't want repeated runs of the sed command to add it again.
I have no idea how to do this, please help me.
...
This is the result I want,
...no matter how many times the sed command is executed
$ cat file.txt
true
apple
It seems you don't want to append the line apple if the line following the true already contains apple. Then this sed command should do the trick.
sed -i.backup '
/true/!b
$!{N;/\napple$/!s/\n/&apple&/;p;d;}
a\
apple
' file.txt
Explanation of sed commands:
If the line doesn't contain true then jump to the end of the script, which will print out the line read (/true/!b).
Otherwise the line contains true:
If it isn't the last line ($!) then
• Read the next line (N).
• If the next line doesn't consist of apple (/\napple$/!) then insert the apple between the two lines (s/\n/&apple&/).
• Print out the pattern space (p) and start a new cycle (d).
Otherwise it is the last line (and contains true)
Append apple (a\ apple)
Edit:
The above sed script won't work properly if two consecutive true lines occur in the file, as pointed out by @potong. The version below should fix this, if I haven't overlooked something.
sed -i.backup ':a
/true/!b
a\
apple
n
/^apple$/d
ba
' file.txt
Explanation:
/true/!b: If the line doesn't contain true, no further processing is required. Jump to the end of the script. This will print the current pattern space.
a\ apple: Otherwise, the line contains true. Append apple.
n: Print the current pattern space and appended line (apple) and replace the pattern space with the next line. This will end the script if no next line available.
/^apple$/d: If the line read consists of string apple then delete it and start a new cycle (because it is already appended before)
ba: Jump to the start of the script (label a) without reading an input line.
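The idempotence of the second script can be checked by running it several times on a scratch file. This is only a sketch: the temporary directory and the extra "other" line are assumptions for the demo, not part of the question.

```shell
# Work on a throwaway copy so nothing real is modified.
tmpdir=$(mktemp -d)
file="$tmpdir/file.txt"
printf 'true\nother\n' > "$file"

# Run the same in-place edit three times; apple should be added only once.
for run in 1 2 3; do
  sed -i.backup ':a
/true/!b
a\
apple
n
/^apple$/d
ba
' "$file"
done

result=$(cat "$file")
printf '%s\n' "$result"
# true
# apple
# other
rm -rf "$tmpdir"
```

On every run the script appends apple after true and then deletes the old apple line if one follows, so the file stops changing after the first run.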
There is no general solution for sed unless the file is sorted. If it is sorted, the following deletes the duplicate lines (it only compares each line with the one right after it, like uniq):
sed '$!N; /^\(.*\)\n\1$/!P; D'
This was taken from this link: https://www.unix.com/shell-programming-and-scripting/146404-command-remove-duplicate-lines-perl-sed-awk.html
Great answer by M. Nejat Aydin but to make things simpler just add grep:
grep -q apple file.txt || sed -i '/true/a\apple' file.txt
This might work for you (GNU sed):
sed -e ':a;/true/!b;$a apple' -e 'n;/apple/b;i apple' -e 'ba' file
If a line does not contain true just print it.
Otherwise, if it is the last line, append the line apple.
Otherwise, print that line and fetch the next.
If that line contains apple just print it.
Otherwise, insert a line apple and jump to the first sed instruction since the fetched line might be one containing true.
N.B. This uses both the a command (for end of file condition) and the i command for when there is a following line.

how to understand dollar sign ($) in sed script programming?

Hi everybody.
I don't understand the dollar sign ($) in sed scripts: does it stand for the last line of a file or for a counter in sed?
I want to reverse the order of lines (emulating "tac") of /etc/passwd, like the following:
$ cat /etc/passwd | wc -l ----> 52 // number of lines
$ sed '1!G;h;$!d' /etc/passwd | wc -l ----> 52 // working correctly
$ sed '1!G;h;$d' /etc/passwd | wc -l ----> 1326 // no ! after $
$ sed '1!G;h;$p' /etc/passwd | wc -l ----> 1430 // p instead of !d
The last two examples don't work right. Who can tell me what the dollar sign stands for?
All the commands "work right." They just do something you don't expect. Let's consider the first version:
sed '1!G;h;$!d'
Start with the first two commands:
1!G; h
After these two commands have been executed, the pattern space and the hold space both contain all the lines read so far but in reverse order.
At this point, if we do nothing, sed would take its default action which is to print the pattern space. So:
After the first line is read, it would print the first line.
After the second line is read, it would print the second line followed by the first line.
After the third line is read, it would print the third line, followed by the second line, followed by the first line.
And so on.
If we are emulating tac, we don't want that. We want it to print only after it has read in the last line. So, that is where the following command comes in:
$!d
$ means the last line. $! means not-the-last-line. $!d means delete if we are not on the last line. Thus, this tells sed to delete the pattern space unless we are on the last line, in which case it will be printed, displaying all lines in reverse order.
With that in mind, consider your second example:
sed '1!G;h;$d'
This prints all the partial tacs except the last one.
Your third example:
sed '1!G;h;$p'
This prints all the partial tacs up through the last one but the last one is printed twice: $p is an explicit print of the pattern space for the last line in addition to the implicit print that would happen anyway.
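The three variants are easier to compare on a tiny input. The sketch below uses a made-up three-line input (1, 2, 3) in place of /etc/passwd:

```shell
# Hypothetical 3-line input standing in for /etc/passwd.
input=$(printf '1\n2\n3')

# $!d: delete every partial result except the one built on the last line.
tac_out=$(printf '%s\n' "$input" | sed '1!G;h;$!d')

# $d: print every partial result, but delete the final (complete) one.
del_out=$(printf '%s\n' "$input" | sed '1!G;h;$d')

# $p: print every partial result, and print the final one twice.
dup_out=$(printf '%s\n' "$input" | sed '1!G;h;$p')

printf '%s\n' "$tac_out"   # 3 2 1             (3 lines)
printf '%s\n' "$del_out"   # 1 2 1             (3 lines)
printf '%s\n' "$dup_out"   # 1 2 1 3 2 1 3 2 1 (9 lines)
```

For an n-line file the $d variant prints 1 + 2 + ... + (n-1) = n(n-1)/2 lines, and the $p variant prints n(n+1)/2 + n lines. With n = 52 that gives 1326 and 1430, which are exactly the counts shown in the question.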

sed: replace pattern only if followed by empty line

I need to replace a pattern in a file, only if it is followed by an empty line. Suppose I have following file:
test
test
test
...
The following command would replace all occurrences of test with xxx:
cat file | sed 's/test/xxx/g'
But I need to replace test only if the next line is empty. I have tried matching a hex code, but that does not work:
cat file | sed 's/test\x0a/xxx/g'
The desired output should look like this:
test
xxx
xxx
...
Suggested solutions for sed, perl and awk:
sed
sed -rn '1h;1!H;${g;s/test([^\n]*\n\n)/xxx\1/g;p;}' file
I got the idea from sed multiline search and replace. Basically slurp the entire file into sed's hold space and do global replacement on the whole chunk at once.
perl
$ perl -00 -pe 's/test(?=[^\n]*\n\n)$/xxx/m' file
-00 triggers paragraph mode which makes perl read chunks separated by one or several empty lines (just what OP is looking for). Positive look ahead (?=) to anchor substitution to the last line of the chunk.
Caveat: -00 will squash multiple empty lines into single empty lines.
awk
$ awk 'NR==1 {l=$0; next}
/^$/ {gsub(/test/,"xxx", l)}
{print l; l=$0}
END {print l}' file
Basically store previous line in l, substitute pattern in l if current line is empty. Print l. Finally print the very last line.
Output in all three cases
test
xxx
xxx
...
This might work for you (GNU sed):
sed -r '$!N;s/test(\n\s*)$/xxx\1/;P;D' file
Keep a window of 2 lines throughout the length of the file and if the second line is empty and the first line contains the pattern then make a substitution.
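A small run of the two-line window version, assuming GNU sed (for -r and \s). The sample input is made up: the first test is followed by a non-empty line, the other two are followed by empty lines.

```shell
# Hypothetical input: test, test, <empty>, test, <empty>, end
sample=$(printf 'test\ntest\n\ntest\n\nend')

# N appends the next line, s only fires when "test" is followed by a newline
# and nothing but whitespace up to the end of the window, then P prints the
# first line of the window and D drops it and restarts without reading input.
result=$(printf '%s\n' "$sample" | sed -r '$!N;s/test(\n\s*)$/xxx\1/;P;D')

printf '%s\n' "$result"
# test
# xxx
#
# xxx
#
# end
```

Because the regex has no wildcard between test and the newline, only lines that end in test are changed; a line such as "test foo" followed by an empty line would be left alone.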
Using sed
sed -r ':a;$!{N;ba};s/test([^\n]*\n(\n|$))/xxx\1/g'
explanation
:a # set label a
$ !{ # if not end of file
N # Add a newline to the pattern space, then append the next line of input to the pattern space
b a # Unconditionally branch to label. The label may be omitted, in which case the next cycle is started.
}
# simply put, the above command :a;$!{N;ba} reads the whole file into the pattern space.
s/test([^\n]*\n(\n|$))/xxx\1/g # replace the keyword if the next line is empty (\n\n) or is the empty last line of the input (\n followed by $, the end of the pattern space)

Move line starting with # to the end of next line. Preferred AWK or SED

I need to move every line starting with # to the end of next line ( AWK/SED ? ).
Testfile.txt:
# FIRST COMMENT
alias1: john#domain.com, tom#domain.com
alias2: betty#domain.com
# SECOND COMMENT
alias3: anna#domain.com, mark#domain.com
alias4: dan#domain.com
Expected output:
alias1: john#domain.com, tom#domain.com # FIRST COMMENT
alias2: betty#domain.com
alias3: anna#domain.com, mark#domain.com # SECOND COMMENT
alias4: dan#domain.com
I managed to do it this way (but I'm sure it's not the best solution):
sed '/^#/ N;s/\n/$/' testfile.txt | sed -e 's/\(.*\)$\(.*\)/\2 \1/'
The first sed merges each comment line with the next one, using $ as a separator.
The second sed swaps the text on either side of the $ character.
Any advice on how to do this in a better way (performance and readability)?
Thanks
I need to move every line starting with # to the end of next line
try this line:
awk '/^#/{x=$0;next}{if(x)print $0,x;else print;x=0}' file
or
awk '/^#/{x=$0;next}{print $0 (x? FS x:"");x=0}' file
test with your example:
kent$ echo "# FIRST COMMENT
alias1: john#domain.com, tom#domain.com
alias2: betty#domain.com
# SECOND COMMENT
alias3: anna#domain.com, mark#domain.com
alias4: dan#domain.com"|awk '/^#/{x=$0;next}{if(x)print $0,x;else print;x=0}'
alias1: john#domain.com, tom#domain.com # FIRST COMMENT
alias2: betty#domain.com
alias3: anna#domain.com, mark#domain.com # SECOND COMMENT
alias4: dan#domain.com
This might work for you (GNU sed):
sed -r '/^#/{$!N;s/(.*)\n(.*)/\2 \1/}' file
If the line begins with a hash, then append a newline and the next line (unless the current line is the last line in the file) and then swap that line with the current line and put a space between them.
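For completeness, a short run of the sed one-liner, assuming GNU sed (for -r). The alias lines are shortened, made-up stand-ins for the ones in the question:

```shell
# Hypothetical, shortened version of Testfile.txt.
sample=$(printf '# FIRST COMMENT\nalias1: john\nalias2: betty\n# SECOND COMMENT\nalias3: anna')

# On a comment line, pull in the next line and swap the two halves.
result=$(printf '%s\n' "$sample" | sed -r '/^#/{$!N;s/(.*)\n(.*)/\2 \1/}')

printf '%s\n' "$result"
# alias1: john # FIRST COMMENT
# alias2: betty
# alias3: anna # SECOND COMMENT
```

Compared with the two-process pipeline in the question, this does the join and the swap in a single sed invocation and does not rely on $ being absent from the data.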