I thought I understood sed but I guess not. I have the following two files, in which I want to replace the "why" and "huh" lines with one different line. No whitespace at all.
test.txt:
hi
why
huh
hi
why
huh
test2.txt:
1
hi
why
huh
hi
why
huh
The following two commands give the following results:
sed "N; s/<why\/>\n<huh\/>/yo/g" test.txt > out.txt
out.txt:
hi
why
huh
hi
yo
sed "N; s/<why\/>\n<huh\/>/yo/g" test2.txt > out2.txt
out2.txt:
1
hi
yo
hi
why
huh
What am I not understanding about sed? Why don't both output files contain the following:
hi
yo
hi
yo
Your expression is almost correct, but it has two problems:
If you want to match why as a word, you should put \< and \> around it. You put only < and \/> around it. So, the first correction is:
$ sed 'N; s/\<why\>\n\<huh\>/yo/g' test.txt
But it will not work, either:
$ sed 'N; s/\<why\>\n\<huh\>/yo/g' test.txt
hi
why
huh
hi
yo
Why does it replace only the second pair of lines? On the first line, the N command appends why to hi, leaving the string hi\nwhy in the pattern space. The s/// command does not match this string, so both lines are printed unchanged. On the next cycle, huh is in the pattern space and hi is appended to it, which again does not match. Only on the third cycle does the pattern space contain why\nhuh, which is then replaced.
The solution is to concatenate the next line only when your current line is why, using the address /^why$/:
$ sed '/^why$/ {N; s/\<why\>\n\<huh\>/yo/g}' test.txt
hi
yo
hi
yo
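To see the pairing that a bare N produces, it can help to print each pattern space on a single line. This is only an illustrative sketch: the function name show_pairs and the " + " separator are made up for the demonstration, and the input is the content of test.txt fed through printf.

```shell
# show_pairs: print every pattern space that a bare N builds,
# with the embedded newline shown as " + ".
show_pairs() {
    sed -n 'N; s/\n/ + /; p'
}

printf 'hi\nwhy\nhuh\nhi\nwhy\nhuh\n' | show_pairs
# hi + why
# huh + hi
# why + huh
```

Only the third pair holds why followed by huh, which is why only the second occurrence gets replaced in test.txt.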
The reason why it didn't replace both pairs of lines is explained beautifully in brandizzi's answer.
However, let's take it one step further. Say we have the following file and we want to replace "apple\njuice" with "pure\nmilk".
test.txt
water water water water
water water water apple
juice water water apple
juice water water water
Filtering by address pattern does not work here:
sed '/apple/ {N; s/apple\njuice/pure\nmilk/g}' test.txt
water water water water
water water water pure
milk water water apple
juice water water water
This is because the 2nd apple in test.txt had already been appended to the previous line by N, so the address /apple/ never got a chance to match it.
One solution I can think of is to use a branch loop to concatenate all lines and then do the replacement.
sed ':a;N;$!ba;s/apple\njuice/pure\nmilk/g' test.txt
water water water water
water water water pure
milk water water pure
milk water water water
It looks dumb, but I haven't thought of a better way yet.
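One possible alternative that avoids reading the whole file into the pattern space is a two-line sliding window built from N, P and D. This is a sketch of a different technique from the branch loop above, not a drop-in equivalent: it assumes GNU sed (for \n in the replacement), and the function name replace_pair is made up for the example.

```shell
# replace_pair: keep a two-line window in the pattern space.
#   $!N  append the next input line (skipped on the last line)
#   s    replace the pair if it straddles the newline
#   P    print the first line of the window
#   D    delete that first line and restart the cycle
#        without reading new input
replace_pair() {
    sed '$!N; s/apple\njuice/pure\nmilk/; P; D'
}

printf '%s\n' \
    'water water water water' \
    'water water water apple' \
    'juice water water apple' \
    'juice water water water' | replace_pair
# water water water water
# water water water pure
# milk water water pure
# milk water water water
```

Because D keeps the second line of the window for the next cycle, every adjacent pair of lines is tested, so the second apple is no longer skipped.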
This should work for test.txt file:
sed '/hi/! { N ; s/why\nhuh/yo/ }' test.txt
It means:
When a line does not contain hi (so it must be why), read the next line and replace the why\nhuh pair with yo. Otherwise (the line is hi), print it as is.
Output:
hi
yo
hi
yo
Suppose a file with such content:
a
x
y
z
c
some to be omitted
a
b
c
I want to print all those lines which are between the "a" line and "c" line (both with and without the "c" line are ok):
a
x
y
z
c
a
b
c
I tried the sed command:
sed -n '/a/{:my_tag /./p; N; /c/ t end; t my_tag; :end}'
and it gives me the output:
a
a
Does the N in the command work only once? I cannot see any loop happening here.
The help info of sed is a little confusing, and it seems I was going about this the wrong way.
Or maybe a tool other than sed would be more helpful and efficient for this problem; if so, please show me how.
Could you please try the following (if you are OK with awk)?
awk '/^a/{flag=1} flag; /^c/{flag=""}' Input_file
Output will be as follows.
a
x
y
z
c
a
b
c
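For readers new to the flag idiom, here is the same logic spelled out with comments. It is only a sketch: the function name between_a_and_c and the variable name inside are chosen for the example, and the sample input is fed through printf instead of Input_file.

```shell
# between_a_and_c: print each block that starts at a line beginning
# with "a" and ends at a line beginning with "c" (both inclusive).
between_a_and_c() {
    awk '
        /^a/ { inside = 1 }   # opening line: switch printing on
        inside                # a bare pattern prints the current line
        /^c/ { inside = 0 }   # closing line: switch off after printing it
    '
}

printf '%s\n' a x y z c 'some to be omitted' a b c | between_a_and_c
# a
# x
# y
# z
# c
# a
# b
# c
```

The order of the three rules matters: the closing rule comes after the print rule, so the "c" line is still printed before the flag is cleared.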
With sed:
sed -n '/^a$/,/^c$/p' file
or
sed '/^a$/,/^c$/!d' file
Output:
a
x
y
z
c
a
b
c
I have a huge text file that has several iterations of the same thing at different times, with a basic structure of:
Header (5 lines)
Data (thousands of lines)
Header (5 lines)
Data (thousands of lines)
Header (5 lines)
Data (thousands of lines)
This repeats and goes on for a while.
I want to cull this file, by removing every other set of Header + Data. I was thinking I'd use sed, but I can't figure out how.
It might be of help that each "cycle" starts with the same line (for the purpose of this example, imagine it says Program X output) and that exact line only appears once, at the beginning of each "cycle".
Thanks
Keep track of how often you see the keywords, and print only when this count is an odd number:
awk '/Program X output/ {n++} n%2 == 1' <<END
Program X output
a
b
c
Program X output
d
e
Program X output
f
g
h
i
j
Program X output
m
n
o
END
Program X output
a
b
c
Program X output
f
g
h
i
j
Sounds like all you need is:
awk '/Program X output/ && c++{exit} 1' file
e.g.
$ seq 50 | awk '/2/ && c++{exit} 1'
1
2
3
4
5
6
7
8
9
10
11
If that's not all you need then edit your question to clarify your requirements and show us concise, testable sample input and expected output.
This might work for you (GNU sed):
sed -r '/Program X output/{x;s/^/x/;x};G;/\n(x{2})*$/!P;d' file
When a header line is encountered, an x is added to a counter kept in the hold space (HS). The HS is then appended to every line, and the first line of the pattern space (PS) is printed only if the counter is not a multiple of 2, i.e. for every other set.
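A small run on made-up input may make the counter easier to follow. This sketch assumes GNU sed (for -r); the function name keep_odd_blocks and the one-line "data" blocks a, b, c are invented for the demonstration.

```shell
# keep_odd_blocks: keep the 1st, 3rd, 5th ... header+data set.
# The hold space collects one "x" per header seen so far; G appends
# it to the current line, and P prints the line unless the number
# of x characters is even.
keep_odd_blocks() {
    sed -r '/Program X output/{x;s/^/x/;x};G;/\n(x{2})*$/!P;d'
}

printf '%s\n' \
    'Program X output' a \
    'Program X output' b \
    'Program X output' c | keep_odd_blocks
# Program X output
# a
# Program X output
# c
```

Changing x{2} to x{3} would presumably drop every third set instead, since the test only looks at the length of the x run.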
I have a data file which has the following arrangement :
#REY2_0 REY1_0 alpha1 alpha2 omega
1000 10000 (-3,0) (1,0) (-0.21259151,-0.17763971)
I have to use REY2_0, REY1_0 and the second element of omega, i.e. -0.17763971 in this case. How can I use these in splot? Can I give gnuplot multiple separators and then use the resulting columns? How is this done? Or can I change the data file using sed?
Edit :
The sample output would be :
#REY2_0 REY1_0 alpha1 alpha2 omega
1000 10000 -3 0 1 0 -0.21259151 -0.17763971
You can use this sed,
sed 's/[(,)]/\t/g' yourfile
If you want to make the changes in the file itself (keeping a .bak backup),
sed -i.bak 's/[(,)]/\t/g' yourfile
To get proper formatted output,
sed 's/[(,)]/\t/g' yourfile | column -t > newupdatedfile
It is working for your sample input file.
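If only the three values needed for splot are wanted, a variation on the same idea is to split on the brackets and commas and pick the fields directly. This is a sketch using tr and awk rather than sed; the function name extract_columns is made up, and the field number 8 assumes exactly the column layout shown in the question.

```shell
# extract_columns: turn "(", "," and ")" into spaces, skip the
# "#" header line, then print REY2_0, REY1_0 and the second
# component of omega (field 8 once the line is split on whitespace).
extract_columns() {
    tr '(,)' '   ' | awk '!/^#/ { print $1, $2, $8 }'
}

printf '%s\n' \
    '#REY2_0 REY1_0 alpha1 alpha2 omega' \
    '1000 10000 (-3,0) (1,0) (-0.21259151,-0.17763971)' | extract_columns
# 1000 10000 -0.17763971
```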
I am trying to read a text file into matlab where the text file has been designed so that the columns are right-aligned so that my columns look like,
  3   6     10.5
 13  12      9.5
104   5   200000
This has given me two situations that I'm not sure how to handle in matlab: the first is the whitespace before the first value, and the other is the variable number of whitespace characters in each row, which seems to be beyond my knowledge of textscan. I'm tempted to use sed to reformat the text file, but I'm sure this is trivial to someone. Is there a way that I can use an arbitrary amount of whitespace as the delimiter (and have the line start with the delimiter)?
Use regexp on every line, capturing each run of non-whitespace characters as a token:
M = regexp(str, '(\S+)', 'tokens')
Use the load command:
l = load('C:\myFile.txt')
It will work as long as you have only numbers, and same number of columns.
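If reformatting the file before it reaches matlab is acceptable, as the question hints, a sed pass along these lines should reduce it to single-space-separated columns. It is a sketch only: the function name normalize_ws is made up, and the sample rows assume the right-aligned layout described in the question.

```shell
# normalize_ws: strip leading and trailing whitespace, then squeeze
# every remaining run of whitespace down to one space.
normalize_ws() {
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e 's/[[:space:]]\{1,\}/ /g'
}

printf '  3   6     10.5\n 13  12      9.5\n104   5   200000\n' | normalize_ws
# 3 6 10.5
# 13 12 9.5
# 104 5 200000
```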
I have several equations mixed throughout a document, appearing in the following forms:
5^4 %A
3^-1 %B
5.01 x 10^2.05 %C
5.01 x 10^2 %D
-5 x 10^3 %E
In other words, they fit in the format of x^y, or z * x^y, where z, x, and y can be any integer or rational number (expressed with a decimal point), positive or negative.
I wish to convert these to math mode for TeX. E.g.:
$5.01 \cdot 10^2$
With much assistance from others, I have managed to create this BASH script with sed to solve items A and B:
sed "s/\-\{0,1\}[0-9]\{1,\}^\-\{0,1\}[0-9]\{1,\}/$&$/" input > output
This is able to convert items A and B to math mode, but I found it only converts the first occurrence it finds within a line. For instance, if a line says 5^10 is greater than 1^2 it converts this to $5^10$ is greater than 1^2. A second pass with the script results in $$5^10$$ is greater than 1^2.
I managed to modify the above script to match items C, D, and E, but cannot figure out how to write the replacement part (I have marked it with "???"):
sed "s/\-\{0,1\}[0-9]\{1,\}\ x\ \-\{0,1\}[0-9]\{1,\}^\-\{0,1\}[0-9]\{1,\}/???/" input > output
This presents a problem:
Even if the above could work, if I first run the first sed script, then run the second, the first confuses the second, i.e. I would end up with 5.01 x $10^2.05$. If I ran the second script first, I would end up with $5.01 x $10^2.05$$ after running the second script.
In short, how can I perform this kind of conversion for all items within a document?
5^4 --> $5^4$
3^-1 --> $3^-1$
5.01 x 10^2.05 --> $5.01 \cdot 10^2.05$
5.01 x 10^2 --> $5.01 \cdot 10^2$
-5 x 10^3 --> $-5 \cdot 10^3$
but I found it only converts the first occurrence it finds within a line
Use the /g global replacement flag.
Converting your text is best done in several passes
Pass 1
sed 's/\(-\?[0-9]\+\.\?[0-9]*\) x \(-\?[0-9]\+\)\^\(-\?[0-9]\+\.\?[0-9]*\)/$\1 \\cdot \2^^\3$/g' input > tmp
What we've done here is capture \(...\) x \(...\)^\(...\) into the sed remembered patterns \1, \2 and \3, which we then use to convert the text.
This deals with your %C, %D, %E and, for example, converts 5.01 x 10^2.05 into $5.01 \cdot 10^^2.05$. Note that we have temporarily converted the occurrences of ^ into ^^.
Pass 2
sed -i 's/-\?[0-9]\+\^-\?[0-9]\+/$&$/g' tmp
This deals with your examples %A and %B. As we previously converted the ^ in 10^2.05 to ^^, it is ignored by pass 2, which solves the problem you noted.
Pass 3
sed -i 's/\^^/^/g' tmp
This simply converts the ^^ back into ^.
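The three passes can also be chained as three expressions in a single sed call, which avoids the temporary file. The following is a sketch under a few assumptions: GNU sed (for \? and \+ in basic regular expressions), the decimal point escaped, \cdot written with its backslash, and a made-up function name to_tex_math.

```shell
# to_tex_math: wrap x^y and "z x x^y" expressions in $...$.
#   1st expression: "z x a^b" form; ^ is parked as ^^ so that
#                   the 2nd expression cannot match it again
#   2nd expression: plain "a^b" form, every occurrence on the line
#   3rd expression: restore ^^ to ^
to_tex_math() {
    sed -e 's/\(-\?[0-9]\+\.\?[0-9]*\) x \(-\?[0-9]\+\)\^\(-\?[0-9]\+\.\?[0-9]*\)/$\1 \\cdot \2^^\3$/g' \
        -e 's/-\?[0-9]\+\^-\?[0-9]\+/$&$/g' \
        -e 's/\^\^/^/g'
}

printf '%s\n' '5^4' '3^-1' '5.01 x 10^2.05' '-5 x 10^3' \
    '5^10 is greater than 1^2' | to_tex_math
# $5^4$
# $3^-1$
# $5.01 \cdot 10^2.05$
# $-5 \cdot 10^3$
# $5^10$ is greater than $1^2$
```

Since each expression works on one line at a time, running them in sequence inside one sed call gives the same result as three separate passes over the file.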
Based on the output you need, will this following method work for you?
[jaypal~/Temp]$ cat file0
5^4
3^-1
5.01 x 10^2.05
5.01 x 10^2
-5 x 10^3
[jaypal~/Temp]$ sed -e 's/^/\$/' -e 's/$/\$/' -e 's/x/\\cdot/' file0
$5^4$
$3^-1$
$5.01 \cdot 10^2.05$
$5.01 \cdot 10^2$
$-5 \cdot 10^3$
This might work for you:
sed -i 's/\(-\?[0-9]\+\(\.[0-9]\+\)\? \)x\( -\?[0-9]\+\^-\?[0-9]\+\(\.[0-9]\+\)\?\)\|\(-\?[0-9]\+\^-\?[0-9]\+\)/$\1\\cdot\3\5$/g;s/\$\\cdot/$/g' file
although the GNU sed -r switch makes it look a lot less cluttered:
sed -ri 's/(-?[0-9]+(\.[0-9]+)? )x( -?[0-9]+\^-?[0-9]+(\.[0-9]+)?)|(-?[0-9]+\^-?[0-9]+)/$\1\\cdot\3\5$/g;s/\$\\cdot/$/g' file