How to understand the dollar sign ($) in sed script programming? - sed

Hi everybody,
I don't understand the dollar sign ($) in sed scripts: does it stand for the last line of a file, or is it some kind of counter?
I want to reverse the order of the lines of /etc/passwd (emulating "tac"), like this:
$ cat /etc/passwd | wc -l ----> 52 // number of lines
$ sed '1!G;h;$!d' /etc/passwd | wc -l ----> 52 // works correctly
$ sed '1!G;h;$d' /etc/passwd | wc -l ----> 1326 // no ! after $
$ sed '1!G;h;$p' /etc/passwd | wc -l ----> 1430 // p instead of !d
The last two examples don't work right. Can anyone tell me what the dollar sign stands for?

All the commands "work right." They just do something you don't expect. Let's consider the first version:
sed '1!G;h;$!d'
Start with the first two commands:
1!G; h
After these two commands have been executed, the pattern space and the hold space both contain all the lines read so far, but in reverse order.
At this point, if we do nothing, sed would take its default action which is to print the pattern space. So:
After the first line is read, it would print the first line.
After the second line is read, it would print the second line followed by the first line.
After the third line is read, it would print the third line, followed by the second line, followed by the first line.
And so on.
If we are emulating tac, we don't want that. We want it to print only after it has read in the last line. So, that is where the following command comes in:
$!d
$ means the last line. $! means not-the-last-line. $!d means delete if we are not on the last line. Thus, this tells sed to delete the pattern space unless we are on the last line, in which case it will be printed, displaying all lines in reverse order.
With that in mind, consider your second example:
sed '1!G;h;$d'
This prints all the partial tacs except the last one.
Your third example:
sed '1!G;h;$p'
This prints all the partial tacs up through the last one but the last one is printed twice: $p is an explicit print of the pattern space for the last line in addition to the implicit print that would happen anyway.
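The three variants can be compared on a tiny input. This is a minimal sketch; the three-line sample and the helper name sample are made up for illustration and stand in for /etc/passwd.

```shell
# Three-line stand-in for /etc/passwd (hypothetical sample data).
sample() { printf '%s\n' a b c; }

# Full tac emulation: only the last cycle's pattern space is printed.
sample | sed '1!G;h;$!d'

# '$d' drops the final, complete reversal: 1 + 2 = 3 lines remain.
sample | sed '1!G;h;$d' | wc -l

# '$p' prints the final reversal twice: 1 + 2 + 3 + 3 = 9 lines.
sample | sed '1!G;h;$p' | wc -l
```

The same arithmetic explains the numbers in the question: for 52 lines, 1 + 2 + ... + 51 = 1326, and 1 + 2 + ... + 52 + 52 = 1430.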

Related

Can I avoid duplicate strings with the sed "a\" command?

I added the word "apple" under "true" in my file.txt.
The problem is that every time I run the command "apple" is appended.
$ sed -i '/true/a\apple' file.txt ...execute 3 time
$ cat file.txt
true
apple
apple
apple
If the word "apple" already exists, I don't want repeated runs of the sed command to add it again.
I have no idea how to do this, please help me.
...
I want to do this,
...execute sed command anytime
$ cat file.txt
true
apple
It seems you don't want to append the line apple if the line following the true already contains apple. Then this sed command should do the trick.
sed -i.backup '
/true/!b
$!{N;/\napple$/!s/\n/&apple&/;p;d;}
a\
apple
' file.txt
Explanation of sed commands:
If the line doesn't contain true then jump to the end of the script, which will print out the line read (/true/!b).
Otherwise the line contains true:
If it isn't the last line ($!), then:
• Read the next line (N).
• If the next line doesn't consist of apple (/\napple$/!), insert apple between the two lines (s/\n/&apple&/).
• Print out the pattern space (p) and start a new cycle (d).
Otherwise it is the last line (and contains true):
• Append apple (a\ apple).
Edit:
The above sed script won't work properly if two consecutive true lines occur in the file, as pointed out by @potong. The version below should fix this, if I haven't overlooked something.
sed -i.backup ':a
/true/!b
a\
apple
n
/^apple$/d
ba
' file.txt
Explanation:
/true/!b: If the line doesn't contain true, no further processing is required. Jump to the end of the script. This will print the current pattern space.
a\ apple: Otherwise, the line contains true. Append apple.
n: Print the current pattern space and appended line (apple) and replace the pattern space with the next line. This will end the script if no next line available.
/^apple$/d: If the line read consists of string apple then delete it and start a new cycle (because it is already appended before)
ba: Jump to the start of the script (label a) without reading an input line.
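To check that the second script is idempotent, it can be run twice through a pipe instead of editing a file in place. This is only a sketch: the function name ensure_apple and the two-line sample input are made up, and the behaviour of n at end of input is assumed to be that of GNU sed.

```shell
# Same script as above, reading stdin instead of editing file.txt in place.
ensure_apple() {
  sed ':a
/true/!b
a\
apple
n
/^apple$/d
ba'
}

# First run adds "apple" after "true"; the second run must change nothing.
once=$(printf '%s\n' true other | ensure_apple)
twice=$(printf '%s\n' "$once" | ensure_apple)
printf '%s\n' "$twice"
```

Both runs should print true, apple, other, one per line.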
There is no general solution for sed unless the file is sorted. If sorted, the following deletes the duplicate lines:
sed '$!N; /^\(.*\)\n\1$/!P; D'
This was taken from this link: https://www.unix.com/shell-programming-and-scripting/146404-command-remove-duplicate-lines-perl-sed-awk.html
Great answer by M. Nejat Aydin but to make things simpler just add grep:
grep -q apple file.txt || sed -i '/true/a\apple' file.txt
This might work for you (GNU sed):
sed -e ':a;/true/!b;$a apple' -e 'n;/apple/b;i apple' -e 'ba' file
If a line does not contain true just print it.
Otherwise, if it is the last line, append the line apple.
Otherwise, print that line and fetch the next.
If that line contains apple just print it.
Otherwise, insert a line apple and jump to the first sed instruction since the fetched line might be one containing true.
N.B. This uses both the a command (for end of file condition) and the i command for when there is a following line.

How to change the first occurrence of a line containing a pattern?

I need to find the line with first occurrence of a pattern, then I need to replace the whole line with a completely new one.
I found this command that replaces the first occurrence of a pattern, but not the whole line:
sed -e "0,/something/ s//other-thing/" <in.txt >out.txt
If in.txt is
one two three
four something
five six
something seven
As a result I get in out.txt:
one two three
four other-thing
five six
something seven
However, when I try to modify this code to replace the whole line, as follows:
sed -e "0,/something/ c\COMPLETE NEW LINE" <in.txt >out.txt
This is what I get in out.txt:
COMPLETE NEW LINE
five six
something seven
Do you have any idea why the first line is lost?
When used with two addresses, the c\ command deletes every line from the first matching address through the second matching address (inclusive) and prints the text following c\ once, when the second address matches. If no input line matches the second address, it simply deletes every line from the first matching address through the last line. Since you want to replace only one line, you shouldn't use c\ on an address range. In normal usage, c\ is immediately followed by a newline character.
The 0,/regexp/ address range is a GNU sed extension, which will try to match regexp in the first input line too, which is different from 1,/regexp/ in that aspect. So, the correct command in GNU sed could be
sed '0,/something/{/something/c\
COMPLETE NEW LINE
}' < in.txt
or simplified as pointed out by Sundeep
sed '0,/something/{//c\
COMPLETE NEW LINE
}' < in.txt
or a one-liner,
sed -e '0,/something/{//cCOMPLETE NEW LINE' -e '}' < in.txt
if a literal new-line character is not desirable.
This one-liner also works as pointed out by potong:
sed '0,/something/!b;//cCOMPLETE NEW LINE' in.txt
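The difference between applying c to the whole range and applying it to a single line inside the range can be seen side by side. A minimal sketch, assuming GNU sed (for the 0,/re/ address); the function names are made up for illustration and the input is the in.txt from the question.

```shell
sample() {
  printf '%s\n' 'one two three' 'four something' 'five six' 'something seven'
}

# c on the bare range: lines 1-2 are deleted and the text is printed once.
range_c() {
  sed '0,/something/c\
COMPLETE NEW LINE'
}

# c restricted to the matching line inside the range: only line 2 changes.
line_c() {
  sed '0,/something/{//c\
COMPLETE NEW LINE
}'
}

sample | range_c
echo ---
sample | line_c
```

The first command reproduces the lost first line from the question; the second keeps "one two three" and replaces only "four something".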
This might work for you (GNU sed):
sed '1!b;:a;/something/!{n;ba};cCOMPLETE NEW LINE' file
Set up a loop that will only operate from the first line.
Within the loop, if the key word is not found in the current line, print the current line, fetch the next and repeat until the end of the file or a match is found.
When a match is found, change the contents of the current line to the required result.
N.B. The c command terminates any further processing of sed commands in the same way the d command does.
If there are lines in the input following the key word match, the negation of address at the start of the sed cycle will capture these lines and result in their printing and no further processing.
An alternative:
sed 'x;/./{x;b};x;/something/h;//cCOMPLETE NEW LINE' file
Or (specific to GNU and bash):
sed $'0,/something/{//cCOMPLETE NEW LINE\n}' file
Just use awk:
$ awk '!done && sub(/something/,"other-thing"){done=1} {print}' file
one two three
four other-thing
five six
something seven
$ awk '!done && sub(/.*something.*/,"other-thing"){done=1} {print}' file
one two three
other-thing
five six
something seven
$ awk '!done && /something/{$0="other-thing"; done=1} {print}' file
one two three
other-thing
five six
something seven
and look what you can trivially do if you want to replace the Nth occurrence of something:
$ awk -v n=1 '/something/ && (++cnt == n){$0="other-thing"} {print}' file
one two three
other-thing
five six
something seven
$ awk -v n=2 '/something/ && (++cnt == n){$0="other-thing"} {print}' file
one two three
four something
five six
other-thing

Sed - replace with variable first occurrence only [duplicate]

I would like to update a large number of C++ source files with an extra include directive before any existing #includes. For this sort of task, I normally use a small bash script with sed to re-write the file.
How do I get sed to replace just the first occurrence of a string in a file rather than replacing every occurrence?
If I use
sed s/#include/#include "newfile.h"\n#include/
it replaces all #includes.
Alternative suggestions to achieve the same thing are also welcome.
A sed script that will only replace the first occurrence of "Apple" by "Banana"
Example
Input: Output:
Apple Banana
Apple Apple
Orange Orange
Apple Apple
This is the simple script: Editor's note: works with GNU sed only.
sed '0,/Apple/{s/Apple/Banana/}' input_filename
The first two parameters 0 and /Apple/ are the range specifier. The s/Apple/Banana/ is what is executed within that range. So in this case: "within the range from the beginning (0) up to the first instance of Apple, replace Apple with Banana". Only the first Apple will be replaced.
Background: In traditional sed the range specifier is also "begin here" and "end here" (inclusive). However the lowest "begin" is the first line (line 1), and if the "end here" is a regex, then it is only attempted to match against on the next line after "begin", so the earliest possible end is line 2. So since range is inclusive, smallest possible range is "2 lines" and smallest starting range is both lines 1 and 2 (i.e. if there's an occurrence on line 1, occurrences on line 2 will also be changed, not desired in this case). GNU sed adds its own extension of allowing specifying start as the "pseudo" line 0 so that the end of the range can be line 1, allowing it a range of "only the first line" if the regex matches the first line.
Or a simplified version (an empty RE like // means to re-use the one specified before it, so this is equivalent):
sed '0,/Apple/{s//Banana/}' input_filename
And the curly braces are optional for the s command, so this is also equivalent:
sed '0,/Apple/s//Banana/' input_filename
All of these work on GNU sed only.
You can also install GNU sed on OS X using Homebrew: brew install gnu-sed.
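The background paragraph above can be checked on an input whose first two lines both match. This is a small sketch assuming GNU sed; the helper name sample and the three-line input are made up.

```shell
# Hypothetical input with a match on line 1 and on line 2.
sample() { printf '%s\n' Apple Apple Orange; }

# 1,/Apple/: the end regex is first tried on line 2, so both lines change.
sample | sed '1,/Apple/s/Apple/Banana/'

echo ---

# 0,/Apple/ (GNU extension): the range can end on line 1, so only it changes.
sample | sed '0,/Apple/s/Apple/Banana/'
```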
# sed script to change "foo" to "bar" only on the first occurrence
1{x;s/^/first/;x;}
1,/foo/{x;/first/s///;x;s/foo/bar/;}
#---end of script---
or, if you prefer: Editor's note: works with GNU sed only.
sed '0,/foo/s//bar/' file
An overview of the many helpful existing answers, complemented with explanations:
The examples here use a simplified use case: replace the word 'foo' with 'bar' in the first matching line only.
Due to use of ANSI C-quoted strings ($'...') to provide the sample input lines, bash, ksh, or zsh is assumed as the shell.
GNU sed only:
Ben Hoffstein's answer shows us that GNU provides an extension to the POSIX specification for sed that allows the following 2-address form: 0,/re/ (re represents an arbitrary regular expression here).
0,/re/ allows the regex to match on the very first line also. In other words: such an address will create a range from the 1st line up to and including the line that matches re - whether re occurs on the 1st line or on any subsequent line.
Contrast this with the POSIX-compliant form 1,/re/, which creates a range that matches from the 1st line up to and including the line that matches re on subsequent lines; in other words: this will not detect the first occurrence of an re match if it happens to occur on the 1st line and also prevents the use of shorthand // for reuse of the most recently used regex (see next point). [1]
If you combine a 0,/re/ address with an s/.../.../ (substitution) call that uses the same regular expression, your command will effectively only perform the substitution on the first line that matches re.
sed provides a convenient shortcut for reusing the most recently applied regular expression: an empty delimiter pair, //.
$ sed '0,/foo/ s//bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar # only 1st match of 'foo' replaced
Unrelated
2nd foo
3rd foo
A POSIX-features-only sed such as BSD (macOS) sed (will also work with GNU sed):
Since 0,/re/ cannot be used and the form 1,/re/ will not detect re if it happens to occur on the very first line (see above), special handling for the 1st line is required.
MikhailVS's answer mentions the technique, put into a concrete example here:
$ sed -e '1 s/foo/bar/; t' -e '1,// s//bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar # only 1st match of 'foo' replaced
Unrelated
2nd foo
3rd foo
Note:
The empty regex // shortcut is employed twice here: once for the endpoint of the range, and once in the s call; in both cases, regex foo is implicitly reused, allowing us not to have to duplicate it, which makes both for shorter and more maintainable code.
POSIX sed needs actual newlines after certain functions, such as after the name of a label or even its omission, as is the case with t here; strategically splitting the script into multiple -e options is an alternative to using actual newlines: end each -e script chunk where a newline would normally need to go.
1 s/foo/bar/ replaces foo on the 1st line only, if found there.
If so, t branches to the end of the script (skips remaining commands on the line). (The t function branches to a label only if the most recent s call performed an actual substitution; in the absence of a label, as is the case here, the end of the script is branched to).
When that happens, range address 1,//, which normally finds the first occurrence starting from line 2, will not match, and the range will not be processed, because the address is evaluated when the current line is already 2.
Conversely, if there's no match on the 1st line, 1,// will be entered, and will find the true first match.
The net effect is the same as with GNU sed's 0,/re/: only the first occurrence is replaced, whether it occurs on the 1st line or any other.
NON-range approaches
potong's answer demonstrates loop techniques that bypass the need for a range; since he uses GNU sed syntax, here are the POSIX-compliant equivalents:
Loop technique 1: On first match, perform the substitution, then enter a loop that simply prints the remaining lines as-is:
$ sed -e '/foo/ {s//bar/; ' -e ':a' -e '$!{n;ba' -e '};}' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar
Unrelated
2nd foo
3rd foo
Loop technique 2, for smallish files only: read the entire input into memory, then perform a single substitution on it.
$ sed -e ':a' -e '$!{N;ba' -e '}; s/foo/bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar
Unrelated
2nd foo
3rd foo
[1] 1.61803 provides examples of what happens with 1,/re/, with and without a subsequent s//:
sed '1,/foo/ s/foo/bar/' <<<$'1foo\n2foo' yields $'1bar\n2bar'; i.e., both lines were updated, because line number 1 matches the 1st line, and regex /foo/ - the end of the range - is then only looked for starting on the next line. Therefore, both lines are selected in this case, and the s/foo/bar/ substitution is performed on both of them.
sed '1,/foo/ s//bar/' <<<$'1foo\n2foo\n3foo' fails: with sed: first RE may not be empty (BSD/macOS) and sed: -e expression #1, char 0: no previous regular expression (GNU), because, at the time the 1st line is being processed (due to line number 1 starting the range), no regex has been applied yet, so // doesn't refer to anything.
With the exception of GNU sed's special 0,/re/ syntax, any range that starts with a line number effectively precludes use of //.
sed '0,/pattern/s/pattern/replacement/' filename
this worked for me.
example
sed '0,/<Menu>/s/<Menu>/<Menu><Menu>Sub menu<\/Menu>/' try.txt > abc.txt
Editor's note: both work with GNU sed only.
You could use awk to do something similar..
awk '/#include/ && !done { print "#include \"newfile.h\""; done=1;}; 1;' file.c
Explanation:
/#include/ && !done
Runs the action statement between {} when the line matches "#include" and we haven't already processed it.
{print "#include \"newfile.h\""; done=1;}
This prints #include "newfile.h", we need to escape the quotes. Then we set the done variable to 1, so we don't add more includes.
1;
This means "print out the line" - an empty action defaults to print $0, which prints out the whole line. A one liner and easier to understand than sed IMO :-)
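As a quick check of the awk one-liner, it can be fed a few lines on stdin instead of file.c. This is only a sketch; the function name add_include and the sample source lines are made up.

```shell
# The awk program from above, reading stdin (hypothetical wrapper name).
add_include() {
  awk '/#include/ && !done { print "#include \"newfile.h\""; done=1 } 1'
}

printf '%s\n' '// demo' '#include <stdio.h>' '#include <stdlib.h>' | add_include
```

Only the first #include should get the new line in front of it; input without any #include passes through unchanged.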
Quite a comprehensive collection of answers on linuxtopia sed FAQ. It also highlights that some answers people provided won't work with non-GNU version of sed, eg
sed '0,/RE/s//to_that/' file
in non-GNU version will have to be
sed -e '1s/RE/to_that/;t' -e '1,/RE/s//to_that/'
However, this version won't work with gnu sed.
Here's a version that works with both:
-e '/RE/{s//to_that/;:a' -e '$!N;$!ba' -e '}'
ex:
sed -e '/Apple/{s//Banana/;:a' -e '$!N;$!ba' -e '}' filename
With GNU sed's -z option you could process the whole file as if it was only one line. That way a s/…/…/ would only replace the first match in the whole file. Remember: s/…/…/ only replaces the first match in each line, but with the -z option sed treats the whole file as a single line.
sed -z 's/#include/#include "newfile.h"\n#include/'
In the general case you have to rewrite your sed expression since the pattern space now holds the whole file instead of just one line. Some examples:
s/text.*// can be rewritten as s/text[^\n]*//. [^\n] matches everything except the newline character. [^\n]* will match all symbols after text until a newline is reached.
s/^text// can be rewritten as s/(^|\n)text// (with -E, since the alternation uses extended regex syntax).
s/text$// can be rewritten as s/text(\n|$)// (again with -E).
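A short sketch of the -z approach, assuming GNU sed. The function name prepend_include and the sample lines are made up; the second command uses a capture group and \1 so that the newline consumed by (^|\n) is written back.

```shell
# Insert a new include before the first #include only (GNU sed, -z).
prepend_include() {
  sed -z 's/#include/#include "newfile.h"\n#include/'
}
printf '%s\n' 'int x;' '#include <a.h>' '#include <b.h>' | prepend_include

echo ---

# With -E, (^|\n) stands in for ^ and \1 puts the consumed newline back.
printf '%s\n' 'a' 'text b' 'text c' | sed -Ez 's/(^|\n)text/\1TEXT/'
```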
#!/bin/sed -f
1,/^#include/ {
/^#include/i\
#include "newfile.h"
}
How this script works: For lines between 1 and the first #include (after line 1), if the line starts with #include, then prepend the specified line.
However, if the first #include is in line 1, then both line 1 and the next subsequent #include will have the line prepended. If you are using GNU sed, it has an extension where 0,/^#include/ (instead of 1,) will do the right thing.
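Just add the number of the occurrence at the end (Editor's note: the numeric flag selects the Nth match within each line, not within the whole file, so this still changes every line containing #include):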
sed s/#include/#include "newfile.h"\n#include/1
A possible solution:
/#include/!{p;d;}
i\
#include "newfile.h"
:a
n
ba
Explanation:
read lines until we find the #include, print these lines then start new cycle
insert the new include line
enter a loop that just reads lines (by default sed will also print these lines), we won't get back to the first part of the script from here
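The script above can be tried on stdin. A minimal sketch: the function name add_include and the sample lines are made up, and the last line being printed once when n hits end of input is assumed to follow GNU sed behaviour.

```shell
# The script from above as a filter (hypothetical wrapper name).
add_include() {
  sed '/#include/!{p;d;}
i\
#include "newfile.h"
:a
n
ba'
}

printf '%s\n' '// demo' '#include <stdio.h>' '#include <stdlib.h>' | add_include
```

The new include should appear once, directly before the first #include.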
I know this is an old post but I had a solution that I used to use:
grep -E -m 1 -n 'old' file | sed 's/:.*$//' - | sed 's/$/s\/old\/new\//' - | sed -f - file
Basically, use grep to print the first occurrence and stop there, with its line number prepended, i.e. 5:line. Pipe that into sed to remove the : and everything after it, so that only the line number is left. Pipe that into another sed, which appends s/old/new/ to the line number, producing a one-line script that is piped into the last sed and run against the file.
So if old = #include and new = blah and the first occurrence grep finds is on line 5, then the data piped to the last sed would be 5s/#include/blah/.
Works even if first occurrence is on the first line.
I would do this with an awk script:
BEGIN {i=0}
(i==0) && /#include/ {print "#include \"newfile.h\""; i=1}
{print $0}
END {}
then run it with awk:
awk -f awkscript headerfile.h > headerfilenew.h
might be sloppy, I'm new to this.
As an alternative suggestion you may want to look at the ed command.
man 1 ed
teststr='
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
'
# for in-place file editing use "ed -s file" and replace ",p" with "w"
# cf. http://wiki.bash-hackers.org/howto/edit-ed
cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' | ed -s <(echo "$teststr")
H
/# *include/i
#include "newfile.h"
.
,p
q
EOF
I finally got this to work in a Bash script used to insert a unique timestamp in each item in an RSS feed:
sed "1,/====RSSpermalink====/s/====RSSpermalink====/${nowms}/" \
production-feed2.xml.tmp2 > production-feed2.xml.tmp.$counter
It changes the first occurrence only.
${nowms} is the time in milliseconds set by a Perl script, $counter is a counter used for loop control within the script, \ allows the command to be continued on the next line.
The file is read in and stdout is redirected to a work file.
The way I understand it, 1,/====RSSpermalink====/ tells sed when to stop by setting a range limitation, and then s/====RSSpermalink====/${nowms}/ is the familiar sed command to replace the first string with the second.
In my case I put the command in double quotation marks because I am using it in a Bash script with variables.
Using FreeBSD ed and avoid ed's "no match" error in case there is no include statement in a file to be processed:
teststr='
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
'
# using FreeBSD ed
# to avoid ed's "no match" error, see
# http://codesnippets.joyent.com/posts/show/11917
cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' | ed -s <(echo "$teststr")
H
,g/# *include/u\
u\
i\
#include "newfile.h"\
.
,p
q
EOF
This might work for you (GNU sed):
sed -si '/#include/{s//& "newfile.h"\n&/;:a;$!{n;ba}}' file1 file2 file....
or if memory is not a problem:
sed -si ':a;$!{N;ba};s/#include/& "newfile.h"\n&/' file1 file2 file...
If anyone came here to replace a character for the first occurrence in all lines (like myself), use this:
sed '/old/s/old/new/1' file
-bash-4.2$ cat file
123a456a789a
12a34a56
a12
-bash-4.2$ sed '/a/s/a/b/1' file
123b456a789a
12b34a56
b12
By changing 1 to 2 for example, you can replace all the second a's only instead.
The use case may be that your occurrences are spread throughout the file, but you know that only the first 10, 20 or 100 lines concern you.
Then simply addressing those lines fixes the issue, even though the OP's wording asks for the first occurrence only.
sed '1,10s/#include/#include "newfile.h"\n#include/'
The following command removes the first occurrence of a string, within a file. It removes the empty line too. It is presented on an xml file, but it would work with any file.
Useful if you work with xml files and you want to remove a tag. In this example it removes the first occurrence of the "isTag" tag.
Command:
sed -e 0,/'<isTag>false<\/isTag>'/{s/'<isTag>false<\/isTag>'//} -e 's/ *$//' -e '/^$/d' source.txt > output.txt
Source file (source.txt)
<xml>
<testdata>
<canUseUpdate>true</canUseUpdate>
<isTag>false</isTag>
<moduleLocations>
<module>esa_jee6</module>
<isTag>false</isTag>
</moduleLocations>
<node>
<isTag>false</isTag>
</node>
</testdata>
</xml>
Result file (output.txt)
<xml>
<testdata>
<canUseUpdate>true</canUseUpdate>
<moduleLocations>
<module>esa_jee6</module>
<isTag>false</isTag>
</moduleLocations>
<node>
<isTag>false</isTag>
</node>
</testdata>
</xml>
ps: it didn't work for me on Solaris SunOS 5.10 (quite old), but it works on Linux 2.6, sed version 4.1.5
Nothing new but perhaps a little more concrete answer: sed -rn '0,/foo(bar).*/ s%%\1%p'
Example: xwininfo -name unity-launcher produces output like:
xwininfo: Window id: 0x2200003 "unity-launcher"
Absolute upper-left X: -2980
Absolute upper-left Y: -198
Relative upper-left X: 0
Relative upper-left Y: 0
Width: 2880
Height: 98
Depth: 24
Visual: 0x21
Visual Class: TrueColor
Border width: 0
Class: InputOutput
Colormap: 0x20 (installed)
Bit Gravity State: ForgetGravity
Window Gravity State: NorthWestGravity
Backing Store State: NotUseful
Save Under State: no
Map State: IsViewable
Override Redirect State: no
Corners: +-2980+-198 -2980+-198 -2980-1900 +-2980-1900
-geometry 2880x98+-2980+-198
Extracting window ID with xwininfo -name unity-launcher|sed -rn '0,/^xwininfo: Window id: (0x[0-9a-fA-F]+).*/ s%%\1%p' produces:
0x2200003
POSIXly (also valid in sed), Only one regex used, need memory only for one line (as usual):
sed '/\(#include\).*/!b;//{h;s//\1 "newfile.h"/;G};:1;n;b1'
Explained:
sed '
/\(#include\).*/!b # Only one regex used. On lines not matching
# the text `#include` **yet**,
# branch to end, cause the default print. Re-start.
//{ # On first line matching previous regex.
h # hold the line.
s//\1 "newfile.h"/ # append ` "newfile.h"` to the `#include` matched.
G # append a newline and the held (original) line.
} # end of replacement.
:1 # Once **one** replacement got done (the first match)
n # Loop continually reading a line each time
b1 # and printing it by default.
' # end of sed script.
A possible solution here might be to tell the compiler to include the header without it being mentioned in the source files. In GCC there are these options:
-include file
Process file as if "#include "file"" appeared as the first line of
the primary source file. However, the first directory searched for
file is the preprocessor's working directory instead of the
directory containing the main source file. If not found there, it
is searched for in the remainder of the "#include "..."" search
chain as normal.
If multiple -include options are given, the files are included in
the order they appear on the command line.
-imacros file
Exactly like -include, except that any output produced by scanning
file is thrown away. Macros it defines remain defined. This
allows you to acquire all the macros from a header without also
processing its declarations.
All files specified by -imacros are processed before all files
specified by -include.
Microsoft's compiler has the /FI (forced include) option.
This feature can be handy for some common header, like platform configuration. The Linux kernel's Makefile uses -include for this.
I needed a solution that would work both on GNU and BSD, and I also knew that the first line would never be the one I'd need to update:
sed -e "1,/pattern/s/pattern/replacement/"
Trying the // feature to not repeat the pattern did not work for me, hence needing to repeat it.
I will make a suggestion that is not exactly what the original question asks for, but for those who also want to specifically replace perhaps the second occurrence of a match, or any other specifically enumerated regular expression match. Use a python script, and a for loop, call it from a bash script if needed. Here's what it looked like for me, where I was replacing specific lines containing the string --project:
import re

def replace_models(file_path, pixel_model, obj_model):
    # Match the whole --project option up to the end of the line
    pattern = re.compile(r'--project.*')
    new_file = ""
    with open(file_path, 'r') as f:
        match = 1
        for line in f:
            # Remove the line ending before we do the replacement
            line = line.rstrip("\n")
            # Replace the first --project match with the pixel model
            if match == 1:
                result = re.sub(pattern, "--project='" + pixel_model + "'", line)
            # Replace the second --project match with the object model
            elif match == 2:
                result = re.sub(pattern, "--project='" + obj_model + "'", line)
            else:
                result = line
            # Check that a substitution was actually made
            if result != line:
                # Add a backslash to the replaced line
                result += " \\"
                print("\nReplaced ", line, " with ", result)
                # Increment number of matches found
                match += 1
            # Add the potentially modified line to our new file
            new_file = new_file + result + "\n"
    # The with block has closed the input file; write the result back
    with open(file_path, "w") as fout:
        fout.write(new_file)
sed -e 's/pattern/REPLACEMENT/1' <INPUTFILE

How does this sed command: "sed -e :a -e '$d;N;2,10ba' -e 'P;D' " work?

I saw a sed command to delete the last 10 rows of data:
sed -e :a -e '$d;N;2,10ba' -e 'P;D'
But I don't understand how it works. Can someone explain it for me?
UPDATE:
Here is my understanding of this command:
The first script indicates that a label “a” is defined.
The second script indicates that it first determines whether the
line currently reading pattern space is the last line. If it is,
execute the "d" command to delete it and restart the next cycle; if
not, skip the "d" command; then execute "N" command: append a new
line from the input file to the pattern space, and then execute
"2,10ba": if the line currently reading the pattern space is a line
in the 2nd to 10th lines, jump to label "a".
The third script indicates that if the line currently read into
pattern space is not a line from line 2 to line 10, first execute "P" command: the first line
in pattern space is printed, and then execute "D" command: the first line in pattern
space is deleted.
My understanding of "$d" is that "d" will be executed when sed reads the last line into the pattern space. But it seems that every time "ba" is executed, "d" will be executed, regardless of whether the current line read into pattern space is the last line. Why?
:a is a label. $ in the address means the last line, d means delete. N appends the next input line to the pattern space. 2,10 means lines 2 to 10, b means branch (i.e. goto), P prints the first line of the pattern space. D is like d, except that if the pattern space contains a newline it deletes only up to that first newline and restarts the script without reading new input.
In other words, you create a sliding window of the size 10. Each line is stored into it, and once it has 10 lines, lines start to get printed from the top of it. Every time a line is printed, the current line is stored in the sliding window at the bottom. When the last line gets printed, the sliding window is deleted, which removes the last 10 lines.
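The sliding window can be observed on numbered input. This is a minimal sketch; the function name drop_last_10 is made up, and seq is assumed to be available to generate the sample lines.

```shell
# The command from the question, wrapped in a hypothetical helper.
drop_last_10() {
  sed -e :a -e '$d;N;2,10ba' -e 'P;D'
}

# 15 input lines: the last 10 (6..15) are dropped, so 1..5 are printed.
seq 1 15 | drop_last_10

# Fewer than 10 lines: the window never fills, so nothing is printed.
seq 1 5 | drop_last_10
```

In the second case every N is followed by the 2,10ba branch until the last line is read, at which point $d discards the whole window.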
You can modify the commands to see what's getting deleted (()), stored (<>), and printed by the P ([]):
$ printf '%s\n' {1..20} | \
sed -e ':a ${s/^/(/;s/$/)/;p;d};s/^/</;s/$/>/;N;2,10ba;s/^/[/;s/$/]/;P;D'
[<<<<<<<<<<1>
[<2>
[<3>
[<4>
[<5>
[<6>
[<7>
[<8>
[<9>
[<10>
(11]>
12]>
13]>
14]>
15]>
16]>
17]>
18]>
19]>
20])
A simpler approach with GNU sed, assuming your data is in a file named d:
sed -Ez 's/(.*\n)(.*\n){10}$/\1/' d
The 10 in {10} is the number of trailing lines to remove.
Move the capture group to invert the result, i.e. to keep only the last 10 lines:
sed -Ez 's/.*\n((.*\n){10})$/\1/' d

sed: replace pattern only if followed by empty line

I need to replace a pattern in a file, only if it is followed by an empty line. Suppose I have following file:
test
test

test

...
the following command would replace all occurrences of test with xxx
cat file | sed 's/test/xxx/g'
but I need to replace test only if the next line is empty. I have tried matching a hex code, but that does not work:
cat file | sed 's/test\x0a/xxx/g'
The desired output should look like this:
test
xxx

xxx

...
Suggested solutions for sed, perl and awk:
sed
sed -rn '1h;1!H;${g;s/test([^\n]*\n\n)/xxx\1/g;p;}' file
I got the idea from sed multiline search and replace. Basically slurp the entire file into sed's hold space and do global replacement on the whole chunk at once.
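The slurp approach can be tried on a small input. A sketch assuming GNU sed (for -r and \n inside a bracket expression); the function name replace_before_blank is made up, and the sample input assumes an empty line after the second and after the third test, as the desired output implies.

```shell
# Collect every line in the hold space, then substitute once at the end.
replace_before_blank() {
  sed -rn '1h;1!H;${g;s/test([^\n]*\n\n)/xxx\1/g;p;}'
}

# Lines: test, test, (empty), test, (empty), ...
printf 'test\ntest\n\ntest\n\n...\n' | replace_before_blank
```

The first test is followed by a non-empty line and stays as it is; the other two are followed by empty lines and become xxx.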
perl
$ perl -00 -pe 's/test(?=[^\n]*\n\n)$/xxx/m' file
-00 triggers paragraph mode which makes perl read chunks separated by one or several empty lines (just what OP is looking for). Positive look ahead (?=) to anchor substitution to the last line of the chunk.
Caveat: -00 will squash multiple empty lines into single empty lines.
awk
$ awk 'NR==1 {l=$0; next}
/^$/ {gsub(/test/,"xxx", l)}
{print l; l=$0}
END {print l}' file
Basically store previous line in l, substitute pattern in l if current line is empty. Print l. Finally print the very last line.
Output in all three cases
test
xxx

xxx

...
This might work for you (GNU sed):
sed -r '$!N;s/test(\n\s*)$/xxx\1/;P;D' file
Keep a window of 2 lines throughout the length of the file and if the second line is empty and the first line contains the pattern then make a substitution.
Using sed
sed -r ':a;$!{N;ba};s/test([^\n]*\n(\n|$))/xxx\1/g'
explanation
:a # set label a
$ !{ # if not end of file
N # Add a newline to the pattern space, then append the next line of input to the pattern space
b a # Unconditionally branch to label. The label may be omitted, in which case the next cycle is started.
}
# In short, the command :a;$!{N;ba} above reads the whole file into the pattern space.
s/test([^\n]*\n(\n|$))/xxx\1/g # replace the key word if the next line is empty (\n\n), including an empty last line (\n$)