How to find and replace every even-numbered appearance of a match in BASH? - perl

I am using sed -i 's/AAA/ZZZ/g' filename to replace every occurrence of "AAA" with "ZZZ" in a file. Instead, I need to replace only every even-numbered occurrence of "AAA" with "ZZZ", e.g.:
This is a AAA sentence. AAA
This is another AAA sentence.
This is yet AAA another AAA sentence.
This is AAA stillAAA AAA yet AAA another AAA sentence.
This would become:
This is a AAA sentence. ZZZ
This is another AAA sentence.
This is yet ZZZ another AAA sentence.
This is ZZZ stillAAA ZZZ yet AAA another ZZZ sentence.
How to replace every even-numbered appearance of a match?

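Here is a short GNU awk version (a multi-character RS is not POSIX, and RT is gawk-specific). Checking RT keeps a stray AAA from being appended after the last record:
awk '{ORS = (RT == "") ? "" : (NR%2==0 ? "ZZZ" : RT)} 1' RS="AAA" file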
This is a AAA sentence. ZZZ
This is another AAA sentence.
This is yet ZZZ another AAA sentence.
This is ZZZ stillAAA ZZZ yet AAA another ZZZ sentence.

awk is a better tool for this than sed. Consider this awk command:
awk -F 'AAA' '{for (i=1; i<NF; i++) {OFS=c%2?"ZZZ":FS; printf "%s%s", $i, OFS; c++}
print $NF}' file
This is a AAA sentence. ZZZ
This is another AAA sentence.
This is yet ZZZ another AAA sentence.
This is ZZZ stillAAA ZZZ yet AAA another ZZZ sentence.
This awk command sets the input field separator to AAA and toggles the output separator between AAA and ZZZ depending on whether a counter is odd or even. Whenever the counter is even, OFS is set to AAA, and when it is odd, OFS is set to ZZZ.

Here is a perl solution:
$ cat inp
This is a AAA sentence. AAA
This is another AAA sentence.
This is yet AAA another AAA sentence.
This is AAA stillAAA AAA yet AAA another AAA sentence.
$ perl -e 'my $line = ""; while (<>) { $line .= $_ } $line =~ s/(.*?AAA.*?)AAA/$1ZZZ/gs; print $line;' < inp
This is a AAA sentence. ZZZ
This is another AAA sentence.
This is yet ZZZ another AAA sentence.
This is ZZZ stillAAA ZZZ yet AAA another ZZZ sentence.
Here, I first accumulate the entire file in the variable $line (so -e is used rather than -p, which would consume the first line itself). Then I replace every second occurrence of AAA with ZZZ, using non-greedy matching.

Perl:
perl -wpe 'BEGIN{$/="AAA"} $.%2 or s/AAA/ZZZ/' foo.txt

You can do it with sed too:
sed -n -e '1,$ {
:oddline s/AAA/\n/g; :odd s/\n/AAA/m; t even ;p;N;s/.*\n//;b oddline ;
:evenline s/AAA/\n/g; :even s/\n/ZZZ/m; t odd ; p;N;s/.*\n//;b evenline ;
}' << _END_
This is a AAA sentence. AAA
This is another AAA sentence.
This is yet AAA another AAA sentence.
This is AAA stillAAA AAA yet AAA another AAA sentence.
_END_
The sed script loops through all lines and remembers odd/even replacements (across lines). In the pattern space, all AAAs are first replaced by newlines and then replaced one at a time by either AAA or ZZZ. In order to switch to the next line it is first appended (N) and then the previous one is deleted (s/.*\n//).

sed "1 h;1 !H;$ {x;s/=/=e/g;s/²/=c/g;s/AAA/²/g;s/²\([^²]\{1,\}\)²/²\1ZZZ/g;s/²/AAA/g;s/=c/²/g;s/=e/=/g;}" YourFile
This uses a substitute character (², because a pattern such as AAA.*AAA could itself span another AAA). The escaping done before the substitution and undone after it ensures this works even if the substitute character already occurs in the file.

This might work for you (GNU sed):
sed -r ':a;$!{N;ba};/\x00/q1;s/AAA/\x00/g;s/(\x00)([^\x00]*)\1/AAA\2ZZZ/g' file
This slurps the file into memory and then replaces all occurrences of AAA with a unique character. Then every odd and even occurrence of the unique character is replaced by AAA and ZZZ respectively.
N.B. If the unique character is not unique, no change is made to the file and an error code of 1 is set.
This second method is more long-winded but can be used to change the N'th value and does not rely on a unique value:
sed -r 's/AAA/\n&/g;/\n/!b;G;:a;s/$/#/;s/#{2}$//;/\n$/s/\nAAA/\nZZZ/;s/\n//;/\n.*\n/ba;P;s/^.*\n//;h;d' file
It stores the number of occurrences of the required pattern in the hold space and retrieves it when it encounters a line with such a pattern.

Related

Using sed to extract data from a file. I know the string I'm looking but I need to get the whole block of data that this string is in

I'm using sed to extract data from a file that contains many blocks of data in the same style. I want to find every occurrence of a specific string, but the string is part of a block of information, and I want to extract the whole block that contains it.
Example data in file:
123
AAA
ABC
ZZZ
123
KJG
HJY
ZZZ
123
LPC
ABC
TRY
ZZZ
In this example 123 is the start of the block of data I want and ZZZ the end. ABC is the string I search for. So from this example my output should be:
123
AAA
ABC
ZZZ
123
LPC
ABC
TRY
ZZZ
sed -n '/ABC/{:a;p;n;/123/b;ba};' testfile.txt > testfile2.txt
The output with this is:
ABC
ZZZ
ABC
TRY
ZZZ
So I'm not getting the data that comes before ABC in the block.
This might work for you (GNU sed):
sed -n '/123/{:a;N;/ZZZ/!ba;/ABC/p}' file
Gather up lines between 123 and ZZZ and then print them if they contain ABC.
N.B. n replaces the current line with the next one (printing the current line first unless -n is in effect), whereas N appends the next line to the pattern space, inserting a newline between them. The appended lines thus stay in the pattern space and remain searchable.
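The same gather-then-test idea can be sketched in awk, which some may find easier to read. This is a sketch under assumptions: the function name extract_blocks is hypothetical, and it assumes the start marker 123 and the end marker ZZZ each occupy a whole line, as in the example data.

```shell
# Hypothetical helper: print each 123...ZZZ block that contains ABC.
extract_blocks() {
    awk '
        /^123$/ { buf = ""; hit = 0; inblock = 1 }          # block start: reset the buffer
        inblock { buf = buf $0 "\n"; if (/ABC/) hit = 1 }   # collect lines, remember a match
        /^ZZZ$/ { if (inblock && hit) printf "%s", buf; inblock = 0 }  # block end: print if matched
    ' "$@"
}

printf '123\nAAA\nABC\nZZZ\n123\nKJG\nHJY\nZZZ\n' | extract_blocks
```

Because a block is only printed once its end marker has been seen, a trailing block with no ZZZ line is silently dropped.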

how to use markdown to create orderlist in github

Does anyone know how I can create an ordered (numbered) list using Markdown on GitHub? For an unordered list I can write
* aaa
* bbb
* ccc
which then looks like
• aaa
• bbb
• ccc
but I want it to look like
1. aaa
2. bbb
3. ccc
The GitHub Flavored Markdown specification for list items does include:
An ordered list marker is a sequence of 1–9 arabic digits (0-9), followed by either a . character or a ) character.
(The reason for the length limit is that with 10 digits we start seeing integer overflows in some browsers.)
So your markdown source must include those digits explicitly:
1. aaa
2. aaa
3. aaa
As flaxel points out, you also have lazy numbering, which means the markdown source would be:
1. aaa
1. aaa
1. aaa

Delete string after '#' using sed

I have a text file that looks like:
#filelists.txt
a
# aaa
b
#bbb
c #ccc
I want to delete the part of each line that starts with '#', and if a line itself starts with '#', delete the whole line.
So I used this sed command in my shell:
sed -e "s/#*//g" -e "/^$/d" filelists.txt
I expected the result to be:
a
b
c
but the actual result is:
filelists.txt
a
aaa
b
bbb
c ccc
What's wrong with my sed command?
I understand '*' to mean "any", so I thought '#*' would mean the string after "#".
Isn't that right?
You may use
sed 's/#.*//;/^$/d' file > outfile
The s/#.*// removes # and all the rest of the line and /^$/d drops empty lines.
See an online test:
s="#filelists.txt
a
# aaa
b
#bbb
c #ccc"
sed 's/#.*//;/^$/d' <<< "$s"
Output:
a
b
c
Another idea: match lines having #, then remove # and the rest of the line there and drop if the line is empty:
sed '/#/{s/#.*//;/^$/d}' file > outfile
This way, you keep the original empty lines.
* does not mean "any" (at least not in regular expression context). * means "zero or more of the preceding pattern element". Which means you are deleting "zero or more #". Since you only have one #, you delete it, and the rest of the line is intact.
You need s/#.*//: "delete # followed by zero or more of any character".
EDIT: was suggesting grep -v, but didn't notice the third example (# in the middle of the line).
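One detail worth noting: with s/#.*// the line c #ccc becomes c followed by a trailing space. A small sketch that also removes the whitespace in front of the # (the function name strip_comments is hypothetical, and it assumes # never appears inside data you want to keep):

```shell
# Hypothetical helper: strip "#" comments plus any whitespace before them,
# then drop lines that end up empty.
strip_comments() {
    sed -e 's/[[:space:]]*#.*//' -e '/^$/d' "$@"
}

printf '#filelists.txt\na\n# aaa\nb\n#bbb\nc #ccc\n' | strip_comments
```

Like the first command above, this also drops lines that were empty in the original file.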

perl : copy line with specific string and replace strings in copied line

Here is my question. Let's say I have a file, a.txt, with the following contents:
a aa aaa
b bb bbb
c cc ccc
b bb bbb
d dd ddd
Then I want to find every line (one by one) containing a string, say "bb", duplicate that line, and replace the string "bb" with "ee" in the newly duplicated line.
So my final output, i.e. the contents of file a.txt, will be as follows:
a aa aaa
b bb bbb
b ee bbb
c cc ccc
b bb bbb
b ee bbb
d dd ddd
Can anybody help me with a perl command for this? Or is there a better option than perl to achieve this?
This is a pretty straightforward Perl program; just read lines, match them with whatever you're searching for, and substitute text as necessary.
Run this as perl script.pl < a.txt:
while (<>) {
    print;
    if (m/bb/) {
        s/bb/ee/g;
        print;
    }
}
Note that if you only want to substitute whole words (instead of things that just contain bb), you'll need to adjust the regexes with \b word boundaries accordingly:
while (<>) {
    print;
    if (m/\bbb\b/) {
        s/\bbb\b/ee/g;
        print;
    }
}
Edit: Since the request was specifically for a "perl command for this," here's a one-liner that embeds the script directly:
perl -n -e 'print; if (m/\bbb\b/) { s/\bbb\b/ee/g; print; }' < a.txt

tricky multiline erase in SED

here is the input:
aaa
bbb
ccc
ddd
eee
fff
What I want is something like sed "/ccc/,/(eee)/d", but it should also delete the "bbb" line (the line before "ccc"),
so that the output is:
aaa
fff
Any ideas?
This might work for you (GNU sed):
sed ':a;$!{N;/\nccc/!{P;D};/\neee/!ba;d}' file
If you are fine with awk, this should do:
$ awk '/ccc/,/eee/{if(i!=1){i=1;x="";}next}{if (x)print x;x=$0;}END{print x}' file
aaa
fff
This prints each line one step late, holding the previous line in x. The range pattern /ccc/,/eee/ skips the lines inside the range, and on entering the range x is cleared so that the line just before the range is not printed.
Update:
sed solution:
$ sed '${x;p;};/ccc/,/eee/{/ccc/{s/.*//;x;};d;};1{h;d;};x;/^$/d;' file
You could do this in a simple 2-pass approach, first pass to identify the lines to delete and the second pass to print only the lines that are not marked for deletion:
awk '/ccc/,/eee/{d[NR]=d[NR-1]=1} NR!=FNR && !d[FNR]' file file
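If the file may contain more than one ccc...eee range, or empty lines, a single-pass variant with an explicit "held line" flag is one way to cover those cases. This is a sketch, not a drop-in replacement: the function name drop_range_and_prev and the variable names are hypothetical, and the patterns ccc and eee are taken from the question.

```shell
# Hypothetical helper: delete every ccc..eee range plus the line before it.
drop_range_and_prev() {
    awk '
        /ccc/ && !skip { skip = 1; have = 0 }          # range starts: discard the held line
        skip { if (/eee/) skip = 0; next }             # inside the range: drop the line
        { if (have) print prev; prev = $0; have = 1 }  # outside: print one line late
        END { if (have) print prev }                   # flush the last held line
    ' "$@"
}

printf 'aaa\nbbb\nccc\nddd\neee\nfff\n' | drop_range_and_prev
```

Holding one line back is what makes it possible to drop the line before ccc without reading the file twice.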