sed - add new line with matched group's value

I have input text containing lines that match the pattern '([\w_]+TAG) = "\w+";'. After each matching line I want to append a new line built from the captured group, such as 'found \1'. For example:
input text:
abcTAG = "hello";
efgTAG = "world";
expected output:
abcTAG = "hello";
found abcTAG
efgTAG = "world";
found efgTAG
I tried the sed command below, but it does not work:
sed -E '/(\w+TAG) = "\w+";/a found \1' a.txt
Current output:
abcTAG = "hello";
found 1
efgTAG = "world";
found 1

Backreferences such as \1 are expanded only in the replacement part of the s command, not in the text of the a command. Please try this instead:
sed -E 's/(\w+TAG) = "\w+";/&\nfound \1/' a.txt
Output:
abcTAG = "hello";
found abcTAG
efgTAG = "world";
found efgTAG
Please note it assumes GNU sed which supports \w and \n.
[Edit]
If the input file has CRLF line endings and you want the appended line to end the same way, please try:
sed -E 's/(\w+TAG) = "\w+";(\r?)/&\nfound \1\2/' a.txt
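For comparison, here is a minimal Python sketch of the same idea: keep every line, and after each line that matches, emit an extra line built from the captured group. The helper name append_found is made up for this illustration, and Python's \w is assumed to be close enough to GNU sed's for this input.

```python
import re

# Same pattern as the sed command: capture the name that ends in TAG.
PATTERN = re.compile(r'(\w+TAG) = "\w+";')

def append_found(text: str) -> str:
    """Return text with a 'found <name>' line added after each matching line."""
    out = []
    for line in text.splitlines():
        out.append(line)
        m = PATTERN.search(line)
        if m:
            # group(1) plays the role of \1 in the sed replacement
            out.append(f"found {m.group(1)}")
    return "\n".join(out) + "\n"

if __name__ == "__main__":
    print(append_found('abcTAG = "hello";\nefgTAG = "world";\n'), end="")
```

Like the s command above, this leaves non-matching lines untouched.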

Related

How to use sed to find the line and its following line

I would like to get the lines which begin with "Xboy" and its following lines which begins with "+". How can I do this by using sed?
The input looks like below:
Xapple
+apple1
+apple2
.ends
Xboy
+boy1
+boy2
V2
Xcat
+cat1
+cat2
Xcat
The output should look like below:
Xboy
+boy1
+boy2
This will do the job in sed, but really this problem is more complicated than sed is intended for. You'd be better off using perl or python.
$ cat foo.txt
Xapple
+apple1
+apple2
.ends
Xboy
+boy1
+boy2
V2
Xcat
+cat1
+cat2
Xcat
$ sed ':section;/Xboy/!d;:plusline;n;/^+/b plusline;b section' foo.txt
Xboy
+boy1
+boy2
In a proper programming language, the nested loop structure of the data becomes clearer, and we can be more confident there are no edge cases we've forgotten about.
In Perl:
my $line = <>;
while (defined($line)) {
    chomp($line);
    if ($line eq "Xboy") {
        print $line, "\n";
        $line = <>;
        while (defined($line) && $line =~ /^\+/) {
            print $line;
            $line = <>;
        }
    }
    else {
        $line = <>;
    }
}
In Python:
import fileinput
lines = fileinput.input()
line = lines.readline()
while line != '':
    line = line.rstrip('\n')
    if line == 'Xboy':
        print(line)
        line = lines.readline()
        while line != '' and line.startswith('+'):
            print(line, end='')
            line = lines.readline()
    else:
        line = lines.readline()
An awk version
awk '/Xboy/ {f=1;print;next} {/^+/?a=1:f=0} a&&f' file
Xboy
+boy1
+boy2
This might work for you (GNU sed):
sed -n ':a;/Xboy/{:b;p;n;/^+/bb;ba}' file
If a line contains Xboy, print it and any following lines that begin with +; otherwise print nothing.
I assume this is what you intended. If instead lines beginning with other non-word characters should be skipped without ending the block, use:
sed -n ':a;/Xboy/{:b;p;:c;n;/^+/bb;/^\W/bc;ba}' file
or perhaps you meant this:
sed -n ':a;/Xboy/{:b;p;:c;n;/^+/bb;/^[^[:upper:]]/bc;ba}' file
If you only want to print Xboy when it is followed by a line that begins with +, use:
sed -n ':a;/Xboy/{$d;h;:b;n;/^+/{H;$!bb};x;/\n/p;x;ba}' file

How to replace a block of code between two patterns with blank lines?

I am trying to replace a block of code between two patterns with blank lines.
I tried the command below:
sed '/PATTERN-1/,/PATTERN-2/d' input.pl
But it deletes the whole block, including the pattern lines, instead of replacing the lines between the patterns with blank lines.
PATTERN-1 : "=head"
PATTERN-2 : "=cut"
input.pl contains below text
=head
hello
hello world
world
morning
gud
=cut
Required output (the five lines between the markers become blank):
=head





=cut
Can anyone help me on this?
$ awk '/=cut/{f=0} {print (f ? "" : $0)} /=head/{f=1}' file
=head





=cut
To modify the given sed command, try
$ sed '/=head/,/=cut/{//! s/.*//}' ip.txt
=head





=cut
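//! selects the lines inside the range that match neither the start nor the end pattern, because the empty regex // reuses the last regex that was applied. Whether it tracks both range patterns dynamically or only one of them statically may depend on the sed implementation; this works with GNU sed.
s/.*// clears those lines.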
awk '/=cut/{found=0}found{print "";next}/=head/{found=1}1' infile
# OR
# ^ to take care of line starts with regexp
awk '/^=cut/{found=0}found{print "";next}/^=head/{found=1}1' infile
Explanation:
awk '/=cut/{ # if line contains regexp
found=0 # set variable found = 0
}
found{ # if variable found is nonzero value
print ""; # print ""
next # go to next line
}
/=head/{ # if line contains regexp
found=1 # set variable found = 1
}1 # 1 at the end does default operation
# print current line/row/record
' infile
Test Results:
$ cat infile
=head
hello
hello world
world
morning
gud
=cut
$ awk '/=cut/{found=0}found{print "";next}/=head/{found=1}1' infile
=head





=cut
This might work for you (GNU sed):
sed '/=head/,/=cut/{//!z}' file
Zap the lines between =head and =cut.
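The flag logic shared by the awk answers above can also be sketched in Python. This is only an illustration: the function name blank_between and its start/end parameters are invented here, and the markers are assumed to appear at the start of a line, as in the anchored awk variant.

```python
def blank_between(lines, start="=head", end="=cut"):
    """Empty every line strictly between a start marker and an end marker."""
    out = []
    inside = False
    for line in lines:
        if line.startswith(end):
            inside = False          # the end marker itself is kept
        out.append("" if inside else line)
        if line.startswith(start):
            inside = True           # blank from the next line onwards
    return out

if __name__ == "__main__":
    sample = ["=head", "hello", "hello world", "=cut", "print 1;"]
    print("\n".join(blank_between(sample)))
```

The order of the three steps matters in the same way as in the awk one-liner: the flag is cleared before printing and set after printing, so both marker lines survive.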

Merge two lines into one within a configuration file

I have several AIX systems with a configuration file, let's call it /etc/bar/config. The file may or may not have a line declaring values for foo. An example would be:
foo = A_1,GROUP_1,USER_1,USER_2,USER_3
The foo line may or may not be the same on all systems. Different systems may have different values and a different number of values. My task is to add "bare minimum" values in the config file on all systems. The bare minimum line will look like this.
foo = A_1,USER_1,SYS_1,SYS_2
If the line does not exist, I must create it. If the line does exist, I must merge the two lines. Using my examples, the result would be this. The order of the values does not matter.
foo = A_1,GROUP_1,USER_1,USER_3,USER_2,SYS_1,SYS_2
Obviously I want a script to do my work. I have the standard sh, ksh, awk, sed, grep, perl, cut, etc. Since this is AIX, I do not have access to the GNU versions of these utilities.
Originally, I had a script with these commands to replace the entire foo line.
cp /etc/bar/config /etc/bar/config.$$
sed "s/foo = .*/foo = A_1,USER_1,SYS_1,SYS_2/" /etc/bar/config.$$ > /etc/bar/config
But this simply replaces the line. It does not take into consideration any pre-existing configuration, nor a line that is missing. And I'm doing other configuration modifications in the script, such as adding completely unique lines to other files and restarting a process, so I'd prefer this to be some type of shell-based code snippet I can add to my change script. I am open to other options, especially if the solution is simpler.
Some dirty bash/sed:
#!/usr/bin/bash
input_file="some_filename"
# grep -n prints "N:line"; appending "0:" makes cut yield 0 when nothing matched
v=$(grep -n '^foo *=' "$input_file")
lineno=$(cut -d: -f1 <<< "${v}0:")
base="A_1,USER_1,SYS_1,SYS_2,"
if [[ "$lineno" == 0 ]]; then
    echo "foo = A_1,USER_1,SYS_1,SYS_2" >> "$input_file"
else
    # prepend the defaults, split on commas, de-duplicate, and rejoin
    all=$(sed -n ${lineno}'s/^foo *= */'"$base"'/p' "$input_file" | \
        tr ',' '\n' | sort | uniq | tr '\n' ',' | \
        sed -e 's/^/foo = /' -e 's/, *$//' -e 's/  */ /g')
    sed -i "${lineno}"'s/.*/'"$all"'/' "$input_file"
fi
Untested bash, etc.
config=/etc/bar/config
default=A_1,USER_1,SYS_1,SYS_2
pattern='^foo[[:blank:]]*=[[:blank:]]*' # shared with grep and sed
# test grep alone: a pipeline returns the status of its last command (sed),
# which succeeds even when grep found nothing
if grep "$pattern" "$config" > /dev/null
then
    current=$( sed -n "s/$pattern//p" "$config" )
    new=$( echo "$current,$default" | tr ',' '\n' | sort | uniq | paste -sd, - )
    sed "s/$pattern.*/foo = $new/" "$config" > "$config.$$.tmp" &&
    mv "$config.$$.tmp" "$config"
else
    echo "foo = $default" >> "$config"
fi
A vanilla perl solution:
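perl -i -lpe '
    BEGIN {%foo = map {$_ => 1} qw/A_1 USER_1 SYS_1 SYS_2/}
    if (s/^foo\s*=\s*//) {
        $found=1;
        $foo{$_}=1 for split /,/;
        $_ = "foo = " . join(",", keys %foo);
    }
    # under -i, output printed in END goes to STDOUT rather than the file,
    # so append the missing line after the last input line instead
    $_ .= "\nfoo = " . join(",", keys %foo) if !$found && eof;
' /etc/bar/config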
This Perl code will do as you ask. It expects the path to the file to be modified as a parameter on the command line.
Note that it reads the entire input file into the array @config and then overwrites the same file with the modified data.
It works by building a hash %values from a combination of the items already present in the foo = line and the list of default items in @defaults. The combination is sorted in alphabetical order and joined with commas.
use strict;
use warnings;
my @defaults = qw/ A_1 USER_1 SYS_1 SYS_2 /;
my ($file) = @ARGV;
my @config = <>;
open my $out_fh, '>', $file or die $!;
select $out_fh;
for ( @config ) {
    if ( my ($pfx, $vals) = /^(foo \s* = \s* ) (.+) /x ) {
        my %values;
        ++$values{$_} for $vals =~ /[^,\s]+/g;
        ++$values{$_} for @defaults;
        print $pfx, join(',', sort keys %values), "\n";
    }
    else {
        print;
    }
}
close $out_fh;
output
foo = A_1,GROUP_1,SYS_1,SYS_2,USER_1,USER_2,USER_3
Since you didn't provide sample input and expected output I couldn't test this but this is the right approach:
awk '
/foo = / { old = ","$3; next }
{ print }
END {
split("A_1,USER_1,SYS_1,SYS_2"old,all,/,/)
for (i in all)
if (!seen[all[i]]++)
new = (new ? new "," : "") all[i]
print "foo =", new
}
' /etc/bar/config > tmp && mv tmp /etc/bar/config
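The merge step that all of these answers share (split on commas, add the defaults, drop duplicates, rejoin, and create the line if it is missing) can be sketched in Python. This assumes Python is available, which the question does not say, and the function name merge_foo is invented for the example.

```python
import re

FOO_LINE = re.compile(r"^foo\s*=\s*(.*)$")

def merge_foo(lines, defaults=("A_1", "USER_1", "SYS_1", "SYS_2")):
    """Merge defaults into an existing 'foo = ...' line, or append one if absent."""
    out = []
    found = False
    for line in lines:
        m = FOO_LINE.match(line)
        if m:
            found = True
            existing = [v.strip() for v in m.group(1).split(",") if v.strip()]
            # dict.fromkeys drops duplicates while keeping first-seen order
            merged = list(dict.fromkeys(existing + list(defaults)))
            out.append("foo = " + ",".join(merged))
        else:
            out.append(line)
    if not found:
        out.append("foo = " + ",".join(defaults))
    return out

if __name__ == "__main__":
    print("\n".join(merge_foo(["foo = A_1,GROUP_1,USER_1,USER_2,USER_3"])))
```

Existing values keep their original order and the missing defaults are added at the end, which is fine here because the order of the values does not matter.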

Sed: Print lines between string and another string in one line

I have 100 html files in a directory
I need to print a line from each file that matches a regex and at the same time print the lines between 2 regex.
The commands below provide the results, correctly
sed -n '/string1/p' *.html >result.txt
sed -n '/string2/,/string3/p' *.html > result2.txt
but I need them in one result.txt file, in the format
string1
string2
string3
I have been trying with grep, awk and sed and have searched but I have not found the answer.
Any help would be appreciated.
This might work for you:
sed -n '/string1/p;/string2/,/string3/p' INPUTFILE > OUTPUTFILE
Or here's an awk solution:
awk '/string1/ { print }
/string2/ { p = 1 }
p == 1 { print }
/string3/ { p = 0 }' INPUTFILE > OUTPUTFILE
Simply put both sed expressions in one invocation:
echo $'a\nstring1\nb\nstring2\nc\nstring3\nd\n' | \
sed -n -e '/string1/p' -e '/string2/,/string3/p'
Input is:
a
string1
b
string2
c
string3
d
Output is:
string1
string2
c
string3
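The same single-pass selection can be sketched in Python. The function name select_lines and its parameters are invented for this illustration, and plain substring tests stand in for the regexes. One deliberate difference from the sed command: a line that matches string1 while inside the string2..string3 range is emitted once here, not twice.

```python
def select_lines(lines, single="string1", start="string2", end="string3"):
    """Keep lines containing `single`, plus every line from `start` through `end`."""
    out = []
    in_range = False
    for line in lines:
        started_here = False
        if not in_range and start in line:
            in_range = True
            started_here = True
        if in_range or single in line:
            out.append(line)
        # as in a sed range, the end pattern is not tested on the starting line
        if in_range and not started_here and end in line:
            in_range = False
    return out

if __name__ == "__main__":
    data = ["a", "string1", "b", "string2", "c", "string3", "d"]
    print("\n".join(select_lines(data)))
```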

multiple line tag content replacement if content matches

I am not very proficient in perl, awk, or sed and I have been searching the web for a solution to my problem for some while now, but wasn't very successful.
I would like to replace
<math> ... </math>
with
<math>\begin{align} ... \end{align}</math>
if ... contains \\. My problem is that the string between the <math> tags can span multiple lines. I managed to replace the tags within one line with sed but couldn't get it to run for multiple lines.
Any simple solution with perl, awk, or sed is very welcome. Thanks a lot.
Use a separate expression for each tag, so it does not matter whether the tags are on different lines (note that this wraps every <math> block, whether or not it contains \\):
sed -e 's,<math>,&\\begin{align},g' -e 's,</math>,\\end{align}&,g'
Edit:
Multiline awk version:
awk '/<math>/,/<\/math>/ {
    if (index($0, "<math>")) {
        a=$0
    } else {
        b = b $0
    }
    if (index($0, "</math>")) {
        if (index(b,"\\\\")) {
            sub("<math>","&\\begin{align}", a)
            sub("</math>","\\end{align}&", b)
        };
        print a,b
        a=""
        b=""
    }
}'
Try the following perl command. It reads the whole file in slurp mode into the variable $f, then applies a regex with the s flag (so that . also matches newlines) to insert \begin{align} and \end{align} when \\ is found between the math tags. Because .* is greedy and there is no g flag, it handles a single <math> block per file.
perl -e '
do {
$/ = undef;
$f = <>
};
$f =~ s#(<math>)(.*\\\\.*)(</math>)#$1\\begin{align}$2\\end{align}$3#s;
printf qq|%s|, $f
' infile
This might work for you (GNU sed):
sed ':a;$!{N;ba}
/[\x00\x01\x02]/q1
s/<math>/\x00/g
s/<\/math>/\x01/g
s/\\\\/\x02/g
s/\x00\([^\x01\x02]*\)\x01/<math>\1<\/math>/g
s/\x00/<math>\\begin{align}/g
s/\x01/\\end{align}<\/math>/g
s/\x02/\\\\/g' file
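For comparison, a minimal Python sketch of the conditional replacement, assuming the whole file fits in memory. The function name wrap_align is invented here. A non-greedy match with re.DOTALL handles blocks that span lines and several blocks per file, and a callback as the replacement avoids any backslash escaping in the inserted text.

```python
import re

# Non-greedy so that each <math>...</math> block is matched on its own.
MATH_BLOCK = re.compile(r"<math>(.*?)</math>", re.DOTALL)

def wrap_align(text: str) -> str:
    """Wrap the body of a <math> block in an align environment if it contains \\\\."""
    def repl(m):
        body = m.group(1)
        if "\\\\" in body:  # two literal backslashes, the LaTeX line break
            return "<math>\\begin{align}" + body + "\\end{align}</math>"
        return m.group(0)   # leave blocks without a line break unchanged
    return MATH_BLOCK.sub(repl, text)

if __name__ == "__main__":
    sample = "<math>a &= b \\\\\nc &= d</math> and <math>x</math>"
    print(wrap_align(sample))
```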