How to print specific lines when a certain pattern is hit - sed

Whenever the input contains a line with ; followed later by abc[0], I need to print the first line just after the ; and the line that contains abc[0].
I have something like this:
blah blah;
blah blah blah;
xyz blah blah,
blah blah
abc[2]
abc[1],
abc[0]
blah blah,
blah blah
abc[1],
abc[0]
blah blah
blah blah;
pqr blah blah
blah blah blah
abc[0]
The required output is as shown below:
xyz blah blah,
abc[0]
pqr blah blah
abc[0]
Thanks.

awk '/;/ { f=1; next } f{ print $0 ; f=0; next} /abc\[0\]/ { print }' inputfile
Explanation:
/;/ { f=1; next } - Set the flag to 1 when you encounter a line containing `;`.
Since I believe you want to print the line after the `;` and not the one with the `;`,
next skips the remaining pattern-action statements for that line.
f{ print $0 ; f=0; next} - If the flag is set, print the line, clear the flag
and skip the rest.
/abc\[0\]/ { print } - If the line matches the second pattern, print it.
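Note that the one-liner above prints every abc[0] line, so on the sample input it emits five lines (the second abc[0] of the first block is included), while the required output has four. If only the first abc[0] after each ; block is wanted, a three-state flag is one way to do it. This is a minimal sketch, not the answerer's method; the state values 0/1/2 and the temporary file are assumptions made for illustration.

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
blah blah;
blah blah blah;
xyz blah blah,
blah blah
abc[2]
abc[1],
abc[0]
blah blah,
blah blah
abc[1],
abc[0]
blah blah
blah blah;
pqr blah blah
blah blah blah
abc[0]
EOF

# f=1: a ";" line was just seen; f=2: waiting for the first abc[0]; f=0: idle.
out=$(awk '
  /;/                  { f = 1; next }         # skip the ";" line itself
  f == 1               { print; f = 2; next }  # first line after the ";" run
  f == 2 && /abc\[0\]/ { print; f = 0 }        # first abc[0] only, then go idle
' "$input")
printf '%s\n' "$out"
rm -f "$input"
```

Unlike the sed answer further down, this prints the line after ; immediately, even if no abc[0] ever follows it.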

Using GNU Grep
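With the input from your question stored as /tmp/corpus, the following prints the line after each ; and every abc[0] line, then filters out the ; lines and the -- group separators. Note that -A1 also emits the line following each abc[0] match, and every abc[0] is matched rather than only the first one per block, so the result is a superset of the required output.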
{ egrep -A1 ';|abc\[0\]' | egrep -v ';|^--'; } < /tmp/corpus
If you have GNU Grep, but your shell isn't Bash or doesn't support the redirection as above, the following pipeline is equivalent but (in my opinion) less readable:
egrep -A1 ';|abc\[0\]' /tmp/corpus | egrep -v ';|^--'

Pure GNU sed
sed -n '/;/ {:k n;h; // bk}; /abc\[0\]/ {x ;//! {p;x;p;x}}' file
xyz blah blah,
abc[0]
pqr blah blah
abc[0]

Related

Using sed, how can I append a string to lines that contain a pattern and don't already have the string appended?

E.g.
I have file with the following contents
ERR001 just some random text
ERR002 blah blah blah
ERR001 again some text //IGNORE
ERR001 blah blahblah blah blah
ERR002 abc def ghi
I want to write a sed command that appends //IGNORE to all lines that contain ERR001 but do not already have //IGNORE appended. So the sed command should give the output below for the above file:
ERR001 just some random text //IGNORE
ERR002 blah blah blah
ERR001 again some text //IGNORE
ERR001 blah blahblah blah blah //IGNORE
ERR002 abc def ghi
sed solution:
sed '/\/\/IGNORE$/! s/^ERR001 .*/& \/\/IGNORE/' inputfile
/\/\/IGNORE$/! - negated match: the command that follows runs only on lines that do not end with //IGNORE (! is the negation sign)
s/^ERR001 .*/& \/\/IGNORE/ - on a line that starts with ERR001, replace the whole line with itself (&) followed by //IGNORE
The output:
ERR001 just some random text //IGNORE
ERR002 blah blah blah
ERR001 again some text //IGNORE
ERR001 blah blahblah blah blah //IGNORE
ERR002 abc def ghi
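The escaped slashes make the command hard to read. As a sketch of the same idea with a different delimiter: sed accepts \# to start an address regex delimited by #, and s accepts # as its delimiter, so //IGNORE can be written literally. The choice of # is an assumption; any character not present in the patterns would do.

```shell
# Same logic as above, with "#" as the regex delimiter so "/" needs no escaping.
out=$(printf '%s\n' \
  'ERR001 just some random text' \
  'ERR002 blah blah blah' \
  'ERR001 again some text //IGNORE' \
  'ERR001 blah blahblah blah blah' \
  'ERR002 abc def ghi' |
  sed '\#//IGNORE$#! s#^ERR001 .*#& //IGNORE#')
printf '%s\n' "$out"
```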

Sed - substitute only within the line containing braces

I have been struggling with this all day. I am trying to turn words into variables, but only in the sections of a line that are enclosed in square brackets.
Lines look like this:
blah blah [ae b c] blah [zv y] blah
I need to make this:
blah blah [$ae $b $c] blah [$zv $y] blah
There must be an easy way to do this. However, whenever I try
$ echo "blah blah [ae b c] blah [zv y] blah" | sed 's/\[\(\b.*\b\)\]/$\1/g'
I get greedy matching and just one variable:
blah blah $ae b c] blah [zv y blah
Is there something better?
Thanks,
$ echo "blah blah [ae b c] blah [zv y] blah" | sed -r ':b; s/([[][^]$]* )([[:alnum:]]+)/\1$\2/g; t b; s/[[]([[:alnum:]])/[$\1/g'
blah blah [$ae $b $c] blah [$zv $y] blah
How it works
-r
This turns on extended regex.
:b
This creates a label b.
s/([[][^]$]* )([[:alnum:]]+)/\1$\2/g
This looks for [, followed by anything except ] or $, followed by a space, followed by any alphanumeric characters. It puts a $ in front of the alphanumeric characters.
Note the bracket-expression convention (also used by awk) that makes [[] match [ while [^]$] matches anything except ] and $. This is more portable than attempting to escape these characters with backslashes.
t b
If the command above resulted in a substitution, this branches back to label b so that the substitution is attempted again.
s/[[]([[:alnum:]])/[$\1/g
The last step is to look for [ followed by an alphanumeric character and put a $ between them.
Because [[:alnum:]] is used, this code is unicode-safe.
Mac OSX (BSD) Version
BSD sed (OSX) limits the ability to combine statements with semicolons. Try this instead:
sed -E -e ':b' -e 's/([[][^]$]* )([[:alnum:]]+)/\1$\2/g' -e 't b' -e 's/[[]([[:alnum:]])/[$\1/g'
To disable it being greedy, instead of matching any character, match any character except closing bracket:
sed 's/\[\(\b[^]]*\b\)\]/$\1/g'
The task you want to do cannot be done with sed because context-sensitive matching cannot be described with regular grammar.
It's difficult to solve this using sed. As an alternative, you can use perl with the help of the Text::Balanced module, which extracts text between balanced delimiters such as square brackets. Each call returns an array with the content between the delimiters, the text after them and the text before them, so you can apply the regex that inserts the $ sign to the relevant part of the string only.
perl -MText::Balanced=extract_bracketed -lne '
BEGIN { $result = q||; }
do {
@result = extract_bracketed($_, q{[]}, q{[^[]*});
if (! defined $result[0]) {
$result .= $result[1];
last;
}
$result[0] =~ s/(\[|\s+)/$1\$/g;
$result .= $result[2] . $result[0];
$_ = $result[1];
} while (1);
END { printf qq|%s\n|, $result; }
' infile
It yields:
blah blah [$ae $b $c] blah [$zv $y] blah
sed 's/\[\([^]]*\)\]/[ \1]/g
:loop
s/\(\(\[[^]$]*\)\([[:blank:]]\)\)\([^][:blank:]$][^]]*\]\)/\1\$\4/g
t loop
s/\[ \([^]]*\)\]/[\1]/g' YourFile
POSIX version,
assuming there are no brackets inside brackets, like [a b[c] d ]
Algorithm:
add a space after each opening bracket (a blank is used as the word-start separator, and the first word often has none in front of it)
set a label as the anchor for a loop
add a $ in front of the last word between brackets that does not have one yet (i.e. does not start with $). This is done for each bracket group in the line, but only one $ is added per group per pass
if a substitution occurred, go back to the label loop and try again
remove the space that was added in the first step
This might work for you (GNU sed):
sed -r 'h;s/\</$/g;T;G;s/^/\n/;:a;s/\n[^[]*(\[[^]]*\])(.*\n)([^[]*)[^]]*\]/\3\1\n\2/;ta;s/\n(.*)\n(.*)/\2/' file
Make a copy of the current line. Insert $ in front of all start-of-word boundaries. If nothing is substituted, print the current line and bail out. Otherwise append the copy of the unadulterated line and insert a newline at the start of the adulterated current line. Using substitution and pattern matching, replace the parts of the line between [...] with the original matching parts, using the newline to move the match forwards through the line. When all matches have been made, replace the end of the original line and remove the newlines.
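For comparison, the same transformation can be sketched in awk by handling one bracket group at a time, which avoids the looping substitution. This is an illustrative alternative, not one of the answers above; it assumes brackets are not nested and that no word inside them already starts with $ (such a word would end up with two).

```shell
out=$(echo "blah blah [ae b c] blah [zv y] blah" |
awk '{
  res = ""
  # Peel off one [...] group per iteration.
  while (match($0, /\[[^]]*\]/)) {
    inner = substr($0, RSTART + 1, RLENGTH - 2)   # text between the brackets
    gsub(/[^ ]+/, "$&", inner)                    # prefix every word with $
    res = res substr($0, 1, RSTART - 1) "[" inner "]"
    $0 = substr($0, RSTART + RLENGTH)             # continue after this group
  }
  print res $0                                    # append the unbracketed tail
}')
printf '%s\n' "$out"
```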

Perl conditional extraction of lines from file

Could anyone suggest a method to extract a few lines of text from a file while reading it?
file sample structure:
A blah blah string1
B blah blah
C blah string2
D blah string3 blah
E blah blah
F blah string2
G blah string3 blah
H blah blah string1
I blah blah
J blah string2
Here I want to extract the blocks of lines that start at a line containing "string1" and end at the next line containing "string2".
In effect I want lines A-C and H-J in the above example.
My experiments are failing with the presence of line F which I would want to ignore.
Perl one liner and the Flip-flop Operator ..:
$ perl -ne 'print if /\bstring1\b/ .. /\bstring2\b/' file
A blah blah string1
B blah blah
C blah string2
H blah blah string1
I blah blah
J blah string2
\b in the above regex is called a word boundary. It matches between a word character and a non-word character.
From perl --help
-n assume "while (<>) { ... }" loop around program
-e program one line of program (several -e's allowed, omit programfile)
This can be done as well in awk and sed by indicating the patterns between which you want to print the lines:
sed -n '/string1/,/string2/p' file
awk '/string1/,/string2/' file
In Perl you can say:
perl -e 'while (<>){print if (/string1/../string2/);}' file
Which is equivalent to the following, where the -n switch supplies the while (<>) loop:
perl -ne '{print if (/string1/../string2/)}' file
All of them return:
A blah blah string1
B blah blah
C blah string2
H blah blah string1
I blah blah
J blah string2
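A quick way to check the range form against the sample, including the stray string2 on line F, is sketched below. The second half uses a made-up four-line input (an assumption, not from the question) to show one difference between the tools: when the start and end patterns occur on the same line, awk closes the range on that line, whereas sed only starts looking for the end pattern on the following line.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
A blah blah string1
B blah blah
C blah string2
D blah string3 blah
E blah blah
F blah string2
G blah string3 blah
H blah blah string1
I blah blah
J blah string2
EOF
# Line F is outside any open range, so it is not printed.
out=$(awk '/string1/,/string2/' "$sample")
printf '%s\n' "$out"

# Hypothetical edge case: both patterns on the first line.
printf '%s\n' 'x string1 string2' 'y' 'z string2' 'w' > "$sample"
awk_same=$(awk '/string1/,/string2/' "$sample")      # range is line 1 only
sed_same=$(sed -n '/string1/,/string2/p' "$sample")  # range runs through line 3
printf 'awk:\n%s\nsed:\n%s\n' "$awk_same" "$sed_same"
rm -f "$sample"
```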

Replace complete line getting number from variable

I have a file with a certain line, let's say...
AAA BBB CCC
I need to replace that entire line, after finding it, so I did:
q1=`grep -Hnm 1 "AAA" FILE | cut -d : -f 2`
That stores the line number of the first occurrence in q1 (the file has more than one occurrence). Now here comes my problem: in a previous step I was using this sed to replace a certain line in the file:
sed -e '3s/.*/WHATEVER/' FILE
To replace (in the example, line 3) the full line with WHATEVER, but now if I try to use $q1 instead of the "3" indicating the line number it doesn't work:
sed -e '$q1s/.*/WHATEVER/' FILE
It's probably a stupid syntax mistake, any help is welcome; thanks in advance
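Single quotes stop the shell from expanding $q1. Use double quotes, with braces so that the variable name is read as q1 rather than q1s. Try: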
sed -e "${q1}s/.*/WHATEVER/" FILE
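An end-to-end sketch of that approach, assuming a small temporary file in place of FILE. Here grep is run without -H, so the line number is the first colon-separated field rather than the second.

```shell
file=$(mktemp)
printf '%s\n' a b 'AAA BBB CCC' d 'AAA foo bar' > "$file"

# Line number of the first match only (-m 1 stops after one match).
q1=$(grep -n -m 1 'AAA' "$file" | cut -d: -f1)

# Double quotes let the shell expand ${q1}; the braces separate it from "s".
out=$(sed -e "${q1}s/.*/WHATEVER/" "$file")
printf '%s\n' "$out"
rm -f "$file"
```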
I'd use awk for this:
awk '/AAA/ && !r {print "WHATEVER"; r=1; next} {print}' <<END
a
b
AAA BBB CCC
d
e
AAA foo bar
f
END
a
b
WHATEVER
d
e
AAA foo bar
f
If you want to replace the first occurrence of a string in a file, you could use this awk script:
awk '/occurrence/ && !f++ {print "replacement"; next}1' file
The replacement will only be printed the first time, as !f++ will only evaluate to true once (on subsequent evaluations, f will be greater than zero so !f will be false). The 1 at the end is always true, so for each line other than the matched one, awk does the default action and prints the line.
Testing it out:
$ cat file
blah
blah
occurrence 1 and some other stuff
blah
blah
some more stuff and occurrence 2
blah
$ awk '/occurrence/ && !f++ {print "replacement"; next}1' file
blah
blah
replacement
blah
blah
some more stuff and occurrence 2
blah
The "replacement" string could easily be set to the value of a shell variable in the following way:
awk -v rep="$r" '/occurrence/ && !f++ {print rep; next}1' file
where $r is a shell variable.
Using the same file as above and the example variable in your comment:
$ q2="x=\"Second\""
$ awk -v rep="$q2" '/occurrence/ && !f++ {print rep; next}1' file
blah
blah
x="Second"
blah
blah
some more stuff and occurrence 2
blah
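With the line number, you can also use the c (change) command. Inside double quotes the backslash has to be doubled, otherwise the shell removes the backslash-newline pair before sed sees it:
sed "${q1} c\\
WHATEVER" YourFile
But you can do it directly, without looking up the line number first (after the first match is replaced, the N loop reads the rest of the file into the pattern space, so no later line is replaced):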
sed '/YourPatternToFound/ {s/.*/WHATEVER/
:a
N;$!ba
}' YourFile

SED renaming with unknown amount of characters before a "

I have a file1 that has some PHP code in it. I need to find the following: action="blahblah" and replace it with action="error.php". Problem is, I don't know how many characters are between the quotes in the original.
Here's what I have that doesn't work:
sed 's:action="^[^"]*":action="error.php":' <file1> file2
How can I do this?
Why have you got the ^ start-of-line marker before the character class? Try it with:
sed 's:action="[^"]*":action="error.php":' <file1 > file2
Here's a transcript showing your version alongside that correction:
pax$ echo 'blah action="something" blah' | sed '
...$ s:action="^[^"]*":action="error.php":'
blah action="something" blah
pax$ echo 'blah action="something" blah' | sed '
...$ s:action="[^"]*":action="error.php":'
blah action="error.php" blah
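One thing to watch for: without a flag, s replaces only the first match on each line. If a line can hold more than one action="..." attribute (an assumption about the PHP file, which the question does not show), adding the g flag covers all of them. A minimal sketch with a made-up input line:

```shell
# Two attributes on one line; the g flag rewrites both.
out=$(echo '<form action="a.php"><form action="b.php">' |
  sed 's:action="[^"]*":action="error.php":g')
printf '%s\n' "$out"
```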