unix extract multiple lines - sed

I have the following file:
$ cat somefile
Line1 T:10 Hello
Var1 = value1
Var2 = value2
Line2 T:2 Where
VarX1 = ValueX1
VarX2 = ValueX2
Line3 T:10 AAAA
Var10 = Val1
Var11 = Val11
Line4 T:10 ABCC
Var101 = Val110
...
What I need is to give a search criterion and get back every block of lines that contains it.
For example, if the search criterion is T:10, it should give
Line1 T:10 Hello
Var1 = value1
Var2 = value2
Line3 T:10 AAAA
Var10 = Val1
Var11 = Val11
I tried the sed command
sed -ne '/T:10/,/^$/p' somefile
But this is not working properly; it sometimes picks up other lines too.
Is there anything I am doing wrong here?

This is a "paragraph" grep. Linux/GNU grep doesn't have a paragraph mode and I haven't done it in sed, but you can use perl.
perl -00 -ne 'print if /T:10/' somefile
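awk's paragraph mode is another option: with RS set to the empty string, each blank-line-separated block becomes one record, so a rough equivalent (assuming, as with perl -00, that the blocks really are separated by blank lines) is:
awk 'BEGIN{RS=""; ORS="\n\n"} /T:10/' somefile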

Here's a bash solution
#!/bin/bash
data=$(<file)
search="T:10"
OLDIFS="$IFS"
IFS="|"
data=(${data//$'\n\n'/|})
for i in "${!data[#]}"
do
case "${data[$i]}" in
*"${search}"* ) echo "$i : ${data[$i]}" ;;
esac
done
IFS="$OLDIFS"

This might work for you (GNU sed):
sed -n '/T:10/{:a;N;/^$/M!ba;p}' file
Automatic printing is turned off with the -n option. Lines are gathered from a match on T:10 up to the next empty line and printed; everything else is not printed.
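For reference, here is the same command spelled out as a commented script file (the name collect.sed is just an example); it would be run as sed -nf collect.sed file:
# start collecting when a line contains T:10
/T:10/{
:a
# append the next input line to the pattern space
N
# if no empty line has been appended yet, branch back to :a
/^$/M!ba
# print the collected block
p
}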

Related

sed read entire lines from one file and replace lines in another file

I have two files, file1 and file2, with content as follows:
file1:
f1 line1
f1 line2
f1 line3
file2:
f2 line1
f2 line2
f2 line3
f2 line4
I wonder how I can use sed to read lines 1 to 3 from file1 and use them to replace lines 2 to 3 in file2.
The output should be like this:
f2 line1
f1 line1
f1 line2
f1 line3
f2 line4
Can anyone help? Thanks.
If you have GNU sed, which supports the e flag on s///, you can use it to call head -n3 file1.
I think (without having tested it yet):
sed '2d;3s/.*/head -n3 file1/e' file2
I'll go verify...
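Since file1 here contains exactly the lines to be inserted, the plain r command is another way to do it; a sketch (only checked against this sample) that deletes lines 2-3 of file2 and reads file1 in after line 1:
sed -e '2,3d' -e '1r file1' file2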
sed is the best tool for doing s/old/new on individual strings. That's not what you're trying to do, so you shouldn't be trying to use sed for it. Using any awk in any shell on every UNIX box:
$ cat tst.awk
NR==FNR {
    if ( FNR <= num ) {
        new = new $0 ORS
    }
    next
}
(beg <= FNR) && (FNR <= end) {
    if ( !done++ ) {
        printf "%s", new
    }
    next
}
{ print }
$ awk -v num=3 -v beg=2 -v end=3 -f tst.awk file1 file2
f2 line1
f1 line1
f1 line2
f1 line3
f2 line4
To read a different number of lines from file1, just change the value of num; to use the replacement text for a different range of lines in file2, just change the values of beg and end. For example, suppose you wanted to use 10 lines of data from a pipe (e.g. seq 15 |) instead of file1, and to replace lines 3 through 17 of a file2 like yours but with 20 lines instead of 4. You'd leave the awk script as-is and just tweak how you call it:
$ seq 15 | awk -v num=10 -v beg=3 -v end=17 -f tst.awk - file2
f2 line1
f2 line2
1
2
3
4
5
6
7
8
9
10
f2 line18
f2 line19
f2 line20
Try doing the same with sed for such a minor change in requirements, and notice how much of the script has to change and that the result is not portable to other sed versions.

How to delete the records that have '?' in a file using sed

How can I delete the records that have '?' in the file?
Input file data
12345 Line1
?
34567 Line2
?
89101 Line3
Expected Output
12345 Line1
34567 Line2
89101 Line3
sed '/?/d' yourfile
or
grep -v '?' yourfile
If you want to match only a line that is just a '?' and nothing else, use '^?$' instead of just the ?.
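For example, the anchored version of the sed command above would be:
sed '/^?$/d' yourfile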
It seems you want to remove each line that consists of just a ? symbol, together with its newline. If so, try the perl command below.
perl -0777 -pe 's/\n\?\n//g' file
Example:
$ perl -0777 -pe 's/\n\?\n//g' file
12345 Line1
34567 Line2
89101 Line3

Sed: Print lines between string and another string in one line

I have 100 html files in a directory
I need to print, from each file, the line that matches one regex and, at the same time, the lines between two other regexes.
The commands below provide the correct results:
sed -n '/string1/p' *.html >result.txt
sed -n '/string2/,/string3/p' *.html > result2.txt
but I need them in one result.txt file, in the format
string1
string2
string3
I have been trying with grep, awk and sed and have searched but I have not found the answer.
Any help would be appreciated.
This might work for you:
sed -n '/string1/p;/string2/,/string3/p' INPUTFILE > OUTPUTFILE
Or here's an awk solution:
awk '/string1/ { print }
     /string2/ { p = 1 }
     p == 1    { print }
     /string3/ { p = 0 }' INPUTFILE > OUTPUTFILE
Simply put both sed expressions in one invocation:
echo $'a\nstring1\nb\nstring2\nc\nstring3\nd\n' | \
sed -n -e '/string1/p' -e '/string2/,/string3/p'
Input is:
a
string1
b
string2
c
string3
d
Output is:
string1
string2
c
string3

Find duplicate records in file

I have a text file with lines like below:
name1#domainx.com, name1
info#domainy.de, somename
name2#domainz.com, othername
name3#domainx.com, name3
How can I find duplicate domains like domainx.com with sed or awk?
With GNU awk you can do:
$ awk -F'[#,]' '{a[$2]++}END{for(k in a) print a[k],k}' file
1 domainz.com
2 domainx.com
1 domainy.de
You can use sort to order the output i.e. ascending numerical with -n:
$ awk -F'[#,]' '{a[$2]++}END{for(k in a) print a[k],k}' file | sort -n
1 domainy.de
1 domainz.com
2 domainx.com
Or just to print duplicate domains:
$ awk -F'[#,]' '{a[$2]++}END{for(k in a)if (a[k]>1) print k}' file
domainx.com
Here:
sed -n '/#domainx.com/ p' yourfile.txt
(Actually, grep is what you should use for that.)
Would you like to count them? Add a | nl to the end.
Using the mini-list you gave, the sed line with | nl outputs this:
1 name1#domainx.com, name1
2 name3#domainx.com, name3
What if you need to count how many repetitions each domain has? For that, try this:
for line in `sed -n 's/.*#\([^,]*\).*/\1/p' yourfile.txt | sort | uniq` ; do
  echo "$line `grep -c $line yourfile.txt`"
done
The output of that is:
domainx.com 2
domainy.de 1
domainz.com 1
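The same counts can also be produced in a single pass, without re-grepping the file for every domain, for example:
sed -n 's/.*#\([^,]*\).*/\1/p' yourfile.txt | sort | uniq -c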
Print only duplicate domains
awk -F"[#,]" 'a[$2]++==1 {print $2}'
domainx.com
Print a "*" in front of line that are listed duplicated.
awk -F"[#,]" '{a[$2]++;if (a[$2]>1) f="* ";print f$0;f=x}'
name1#domainx.com, name1
info#domainy.de, somename
name2#domainz.com, othername
* name3#domainx.com, name3
This version paints all lines with a duplicated domain red:
awk -F"[#,]" '{a[$2]++;b[NR]=$0;c[NR]=$2} END {for (i=1;i<=NR;i++) print ((a[c[i]]>1)?"\033[1;31m":"\033[0m") b[i] "\033[0m"}' file
name1#domainx.com, name1 <-- This line is red
info#domainy.de, somename
name2#domainz.com, othername
name3#domainx.com, name3 <-- This line is red
Improved version (reading the file twice):
awk -F"[#,]" 'NR==FNR{a[$2]++;next} a[$2]>1 {$0="\033[1;31m" $0 "\033[0m"}1' file file
name1#domainx.com, name1 <-- This line is red
info#domainy.de, somename
name2#domainz.com, othername
name3#domainx.com, name3 <-- This line is red
If you have GNU grep available, you can use the PCRE matcher to do a positive look-behind to extract the domain name. After that sort and uniq can find duplicate instances:
<infile grep -oP '(?<=#)[^,]*' | sort | uniq -d
Output:
domainx.com
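If you also want the number of occurrences, uniq can count and filter in one go, e.g.:
<infile grep -oP '(?<=#)[^,]*' | sort | uniq -cd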

merging matched lines with sed

I saw some answers here, but can't make them work for me.
I have text like this:
line1
line2 text=^M
line3
line4
Basically what I need is to replace =^M\n with an empty string, something like s/=^M\n//, so that the output is as below (^M is the special character entered with Ctrl+V Ctrl+M):
line1
line2 textline3
line4
I know this involves sed branching, but I'm having trouble making it work.
One way:
$ sed '/^M/{N;s/=^M\n//;}' file
line1
line2 textline3
line4
Where ^M has to be typed as: Ctrl-V + Ctrl-M
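If ^M really is a literal carriage return, a perl slurp in the style of the -0777 example above should do the same join without having to type the control character:
perl -0777 -pe 's/=\r\n//g' file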
An awk solution for this:
# awk -f myawk.sh temp.txt
BEGIN { print "Start Records" }
{
    # \r is the carriage return that shows up as ^M
    if ($0 ~ /=\r$/) {
        sub(/=\r$/, "")     # strip the trailing =^M marker
        prev = $0           # remember the line so the next one can be appended
        f = 1
    } else if (f == 1) {
        print prev $0       # join the stored line with the current one
        prev = ""
        f = 0
    } else {
        print $0
    }
}
END { print "Process Complete" }