Filter text based on multiline match criteria

I have the following sed command, which I need to run as a single line:
cat File | sed -n '
/NetworkName/ {
N
/\n.*ims3/ p
}' | sed -n 1p | awk -F"=" '{print $2}'
I need to run the command above on a single line. Can anyone please help?
Assume that the contents of File are:
System.DomainName=shayam
System.Addresses=Fr6
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=AS
System.DomainName=ims5.com
System.DomainName=Ram
System.Addresses=Fr9
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims7.com
System.DomainName=mani
System.Addresses=Hello
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims3.com
After running the command, the only output is Peer. Can anyone please help me out?

You can use a single nawk command, and you can lose the useless cat:
nawk -F"=" '/NetworkName/{n=$2;getline;if($2~/ims3/){print n} }' file
You can use sed as well, as proposed by others, but I prefer less regex and less clutter.
The command above saves the value of the network name in "n". It then reads the next line and checks the second field against "ims3". If it matches, it prints the value of "n".
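A minimal way to try the getline approach, assuming a POSIX awk is available as awk (nawk may not be installed everywhere) and using a shortened, hypothetical copy of the sample data:

```shell
# Shortened copy of the sample data from the question (hypothetical temp file).
file=$(mktemp)
cat > "$file" <<'EOF'
System.Infrastructure=No
System.NetworkName=AS
System.DomainName=ims5.com
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims3.com
EOF

# Remember the network name, read the following line with getline,
# and print the remembered name only when that line's value contains ims3.
result=$(awk -F= '/NetworkName/ { n = $2; if ((getline) > 0 && $2 ~ /ims3/) print n }' "$file")
echo "$result"
rm -f "$file"
```

Checking the return value of getline is an extra guard for the case where NetworkName is the last line of the file.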

Put that code in a separate .sh file, and run it as your single-line command.

cat File | sed -n '/NetworkName/ { N; /\n.*ims3/ p }' | sed -n 1p | awk -F"=" '{print $2}'

Assuming that you want the network name for the domain ims3, this command line works without sed:
grep -B 1 ims3 File | head -n 1 | awk -F"=" '{print $2}'
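grep -B is a GNU/BSD extension rather than a POSIX option, so where it is unavailable the same idea (take the value from the line just before the match) could be sketched in awk. The file below is a shortened, hypothetical copy of the sample data:

```shell
# Shortened copy of the sample data (hypothetical temp file).
file=$(mktemp)
cat > "$file" <<'EOF'
System.NetworkName=AS
System.DomainName=ims5.com
System.NetworkName=Peer
System.DomainName=ims3.com
EOF

# Keep the value of the previous line in prev; on the first ims3 line,
# print that remembered value and stop.
result=$(awk -F= '/ims3/ { print prev; exit } { prev = $2 }' "$file")
echo "$result"
rm -f "$file"
```

Like the grep pipeline, this relies on the NetworkName line sitting directly above the DomainName line.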

So, you want the network name where the domain name on the following line includes 'ims3', and not the one where the following line includes 'ims7' (even though the network names in the example are the same).
sed -n '/NetworkName/{N;/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;};}' File
This avoids abuse of felines, too (not to mention reducing the number of commands executed).
Tested on MacOS X 10.6.4, but there's no reason to think it won't work elsewhere too.
However, empirical evidence shows that Solaris sed is different from MacOS sed. It can all be done in one sed command, but it needs three lines:
sed -n '/NetworkName/{N
/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;}
}' File
Tested on Solaris 10.

You just need to put -e pretty much everywhere you'd break the command at a newline or have a semicolon. You don't need the extra calls to sed, awk, or cat.
sed -n -e '/NetworkName/ {' -e 'N' -e '/\n.*ims3/ s/.*NetworkName=\(.*\)\n.*/\1/p' -e '}' File
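Here is a sketch of the -e splitting applied to a shortened, hypothetical copy of the sample data, with each piece of the script that would otherwise sit on its own line passed as a separate -e argument:

```shell
# Shortened copy of the sample data (hypothetical temp file).
file=$(mktemp)
cat > "$file" <<'EOF'
System.NetworkName=AS
System.DomainName=ims5.com
System.NetworkName=Peer
System.DomainName=ims3.com
EOF

# On a NetworkName line, append the next line (N); if the pair mentions ims3,
# keep only the text between "NetworkName=" and the embedded newline, and print it.
result=$(sed -n -e '/NetworkName/{' -e 'N' \
                -e '/ims3/s/.*NetworkName=\(.*\)\n.*/\1/p' -e '}' "$file")
echo "$result"
rm -f "$file"
```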

Related

How to replace consecutive symbols using only one sed command?

I have a simple .csv file with lines that hold 't' values. Here is an example:
2ABC;t;t;t;tortuga;fault;t;t;bored
I want to replace them with '1' using sed.
If I run sed "s/;t;/;1;/g" I get the following result:
2ABC;1;t;1;tortuga;fault;1;t;bored
As you can see, only every other one of the consecutive ';t;' occurrences has been replaced. Yes, I can replace all ';t;' with sed -e "s/;t;/;1;/g" -e "s/;t;/;1;/g", but this is boring.
How can I make the replacement by one sed command?

If there is something to replace, branch to replace again.
sed ': again; /;t;/{ s//;1;/; b again }'
Overall, parsing CSV with sed is crude. Consider awk.
awk -F';' -v OFS=';' '{ for(i=1;i<=NF;++i) if ($i=="t") $i=1 } 1'
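The single g pass misses every other field because each match of ;t; consumes the trailing semicolon that the next match would need as its leading one. A sketch of the retry loop written with t (branch only if the last s succeeded), split across -e arguments as an assumption for portability between sed implementations:

```shell
line='2ABC;t;t;t;tortuga;fault;t;t;bored'

# One global pass: matches cannot overlap, so adjacent ;t; fields are skipped.
single_pass=$(printf '%s\n' "$line" | sed 's/;t;/;1;/g')

# Loop: substitute once, and branch back to the label while a substitution succeeds.
looped=$(printf '%s\n' "$line" | sed -e ':again' -e 's/;t;/;1;/' -e 't again')

echo "$single_pass"
echo "$looped"
```

Neither form touches a t in the first or last field, since those lack a surrounding semicolon on one side; the awk field loop handles them.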

Lookarounds are helpful in such cases:
$ s='t;2ABC;t;t;t;tortuga;fault;t;t;bored;t'
$ echo "$s" | perl -lpe 's/(?<![^;])t(?![^;])/1/g'
1;2ABC;1;1;1;tortuga;fault;1;1;bored;1

echo '2ABC;t;t;t;tortuga;fault;t;t;bored' |
# gawk-specific solution
gawk -be '(ORS = RT)^!(NF = NF)' FS='^t$' OFS=1 RS=';'
# cross-awk solution
{m,g,n}awk 'gsub(FS, OFS, $!(NF = NF))^_' FS=';t;' OFS=';1;' RS=
2ABC;1;1;1;tortuga;fault;1;1;bored

Using a single sed call to split and grep

This is mostly out of curiosity: I am trying to get the same behavior as
echo -e "test1:test2:test3"| sed 's/:/\n/g' | grep 1
in a single sed command.
I already tried
echo -e "test1:test2:test3"| sed -e "s/:/\n/g" -n "/1/p"
But I get the following error:
sed: can't read /1/p: No such file or directory
Any idea on how to fix this and combine different types of commands into a single sed call?
Of course this is overly simplified compared to the real use case, and I know I can get around it by using multiple calls; again, this is just out of curiosity.
EDIT: I am mostly interested in the sed tool, I already know how to do it using other tools, or even combinations of those.
EDIT2: Here is a more realistic script, closer to what I am trying to achieve:
arch=linux64
base=https://chromedriver.storage.googleapis.com
split="<Contents>"
curl $base \
| sed -e 's/<Contents>/<Contents>\n/g' \
| grep $arch \
| sed -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
What I would like to simplify is the curl line, turning it into something like:
curl $base \
| sed 's/<Contents>/<Contents>\n/g' -n '/1/p' -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out

Here are some alternatives, awk and sed based:
sed -E "s/(.*:)?([^:]*1[^:]*).*/\2/" <<< "test1:test2:test3"
awk -v RS=":" '/1/' <<< "test1:test2:test3"
# or also
awk 'BEGIN{RS=":"} /1/' <<< "test1:test2:test3"
Or, using your logic, you would need to pipe a second sed command:
sed "s/:/\n/g" <<< "test1:test2:test3" | sed -n "/1/p"
The awk solution looks cleanest.
Details
In the sed solution, the pattern (.*:)?([^:]*1[^:]*).* matches an optional sequence of zero or more characters followed by a :, then captures into Group 2 zero or more characters other than :, a 1, and again zero or more characters other than :, and finally matches the rest of the line. The replacement keeps only the contents of Group 2.
In the awk solution, the record separator is set to : and the /1/ regex prints only the records that contain a 1.

This might work for you (GNU sed):
sed 's/:/\n/;/^[^\n]*1/P;D' file
Replace the first : with a newline and, if the first line in the pattern space contains 1, print it.
Delete that first line and repeat.
An alternative:
sed -Ez 's/:/\n/g;s/^[^1]*$//mg;s/\n+/\n/;s/^\n//' file
This slurps the whole file into memory and replaces all colons by newlines. All lines that do not contain 1 are removed and surplus newlines deleted.
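To see the P/D cycle of the first command on concrete input, here is a sketch that assumes GNU sed (it relies on GNU sed accepting \n in the replacement and inside the bracket expression) and a made-up two-line input:

```shell
# Each cycle: split off the first item with a newline, print it (P) if it
# contains a 1, then delete it (D) and restart on what is left of the line.
result=$(printf 'test1:test2:test3\nbob1:bob2:fred1\n' |
         sed 's/:/\n/;/^[^\n]*1/P;D')
echo "$result"
```

Because D ends every cycle by deleting, nothing is auto-printed, so -n is not needed here.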

An alternative to the really ugly sed is: grep -o '\w*2\w*'
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | grep -o '\w*2\w*'
test2
bob2
fred2
grep -o: only matching
Or: grep -o '[^:]*2[^:]*'

echo -e "test1:test2:test3" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D'
sed -n doesn't print unless told to
sed -E allows using parens to match (\n|$) which is newline or the end of the pattern space
P prints the pattern buffer up to the first newline.
D trims the pattern buffer up to the first newline
[^\n] is a character class that matches anything except a newline
// is sed shorthand for repeating a match
//! is then matching everything that didn't match previously
So, after you split into newlines, you want to make sure the 2 character is between the start of the pattern buffer ^ and the first newline.
And, if there is not the character you are looking for, you want to D delete up to the first newline.
At that point, it works for one line of input, with one string containing the character you're looking for.
To expand to several matches within a line, you have to ta, conditionally branch back to label :a:
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | \
sed -En ':a s/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D;ta'
test2
bob2
fred2

This is simply NOT a job for sed. With GNU awk for multi-char RS:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '/1/'
test1
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' 'NR%2'
test1
test3
test5
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '!(NR%2)'
test2
test4
test6
$ echo "foo1:bar1:foo2:bar2:foo3:bar3" | awk -v RS='[:\n]' '/foo/ || /2/'
foo1
foo2
bar2
foo3
With any awk you'd just have to strip the \n from the final record before operating on it:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS=':' '{sub(/\n$/,"")} /1/'
test1

Better way to fix mocha lcov output using sed

Due to the known problem of mocha-lcov-mocha breaking file paths, I need to fix the current output paths, which look like this:
SF:Vis/test-Guid.coffee
SF:Vis/Guid.coffee
SF:Vis/test-Vis-Edge.coffee
SF:Vis/Vis-Edge.coffee
into
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee
I'm not very good with sed, but I got it to work using:
mocha -R mocha-lcov-reporter _coverage/test --recursive | sed 's,SF:,SF:src/,' | sed s',SF.*test.*,SF:test//&,' | sed s',/SF:,,' | sed s',test/src,test,' | ./node_modules/coveralls/bin/coveralls.js
which is basically running 4 sed commands in sequence:
sed 's,SF:,SF:src/,'
sed s',SF.*test.*,SF:test//&,'
sed s',/SF:,,'
sed s',test/src,test,'
My question is whether there is a way to do this with one sed command, or with another OS X/Linux command-line tool.

Initially put "src/" after every ":" and then if "test" is found on the line replace "src" with "test":
$ sed 's,:,:src/,;/test/s,src,test,' file
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee

You could put all the sed commands in a file, one line per command, and just use "sed -f script". But if you just want it on a single command line, separate the commands with semicolons. This works for me:
sed 's,SF:,SF:src/,;s,SF.*test.*,SF:test//&,;s,/SF:,,;s,test/src,test,'
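A sketch of the script-file variant, using the four commands from the question, one per line, and hypothetical temp files for the script and the sample paths:

```shell
script=$(mktemp)
input=$(mktemp)

# The four substitutions from the question, one per line.
cat > "$script" <<'EOF'
s,SF:,SF:src/,
s,SF.*test.*,SF:test//&,
s,/SF:,,
s,test/src,test,
EOF

cat > "$input" <<'EOF'
SF:Vis/test-Guid.coffee
SF:Vis/Guid.coffee
EOF

# -f reads the sed script from a file instead of the command line.
result=$(sed -f "$script" "$input")
echo "$result"
rm -f "$script" "$input"
```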

sed command
sed '\#test#!{s#SF:Vis/#SF:src/Vis/#g};\#SF:Vis/test#{s#SF:Vis/test#SF:test/Vis/test#g};' my_file

Here is an awk version:
awk -F: '/SF/ {$0=$1FS (/test/?"test/":"src/")$2}1' file
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee
How it works:
awk -F: ' # Set field separator to ":"
/SF/{ # Does the line contain "SF"?
$0=$1FS (/test/?"test/":"src/")$2 # Rebuild the line, inserting "test/" if the line contains "test", else "src/"
}
1 # Print all lines
' file # Read the file

grep/sed: how can I search for a pattern that spans two lines?

SOLUTION
Initial solution
find . -type f -exec sed -i ':a;N;$!ba;s/\n //g' {} + | grep -l "672.15687489"
Initial post:
I was wondering how to search for a pattern in a file. The catch is that the pattern spans two lines and I don't know where the pattern is split.
Example:
The pattern: _"672.15687489"_
But, in the file could be one of these several options:
672.15\n687489
672.156\n87489
672.1568\n7489
672.15687\n489
...
I don't care how the pattern is split; the only thing I want is the name of the file that contains the pattern.

Thank you for the hilarious sed | grep "solution":
sed -i ':a;N;$!ba;s/\n //g' {} + | grep -l "672.15687489"
but in reality, just use awk. Here's a GNU awk solution that won't change your original file, doesn't require multiple commands and a pipe, and does not require a James Bond decoder ring to understand an arcane combination of letters and punctuation marks:
$ cat file
foo
672.15
687489
bar
$ gawk -v RS='\0' '{gsub(/\n/,"")} /672.15687489/{print FILENAME; exit}' file
file
All you need to know is that setting RS to the Null character tells gawk to read the whole file as a single record. Other awks may or may not support this but GNU awk does. There are other awk solutions, all of which would be clearer than the posted sed+grep solution.
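For awks that do not accept a NUL record separator, one possible alternative is to concatenate the lines yourself and test the joined text at the end. The helper name contains_joined, the directory, and the file names below are hypothetical; index() is used so the dot in the number is matched literally:

```shell
# Hypothetical helper: succeed if the file's lines, joined without newlines,
# contain the given text.
contains_joined() {
  awk -v pat="$1" '{ buf = buf $0 } END { exit (index(buf, pat) ? 0 : 1) }' "$2"
}

dir=$(mktemp -d)
printf 'foo\n672.15\n687489\nbar\n' > "$dir/split.txt"
printf 'foo\nbar\n' > "$dir/other.txt"

# Collect the names of the files whose joined contents contain the pattern.
matches=""
for f in "$dir"/*.txt; do
  if contains_joined "672.15687489" "$f"; then
    matches="$matches$(basename "$f") "
  fi
done
echo "$matches"
rm -rf "$dir"
```

As with the gawk version, the original files are left unmodified.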

In-place replacement

I have a CSV. I want to edit the 35th field of the CSV and write the change back to the 35th field. This is what I am doing in bash:
awk -F "," '{print $35}' test.csv | sed 's/^0/+91/g'
So, I am pulling the 35th field using awk and then replacing the "0" at the start of the string with "+91". This works perfectly and I get the desired output on the console.
Now I want this new value to be written back to the file. I am thinking of sed's "in-place" replacement feature, but this feature needs an input file. In the above command, I cannot provide an input file because my primary command is awk and sed is taking its input from awk.
Thanks.

You should choose one of the two tools. As for sed, it can be done as follows:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/' test.csv
Not sure about awk, but @shellter's comment might help with that.

The in-place feature of sed is misnamed, as it does not edit the file in place. Instead, it creates a new file with the same name. eg:
$ echo foo > foo
$ ln -f foo bar
$ ls -i foo bar # These are the same file
797325 bar 797325 foo
$ echo new-text > foo # Changes bar
$ cat bar
new-text
$ printf '/new/s//newer\nw\nq\n' | ed foo # Edit foo "in-place"; changes bar
9
newer-text
11
$ cat bar
newer-text
$ ls -i foo bar # Still the same file
797325 bar 797325 foo
$ sed -i s/new/newer/ foo # Does not edit in-place; creates a new file
$ ls -i foo bar
797325 bar 792722 foo
Since sed is not actually editing the file in place, but writing a new file and then renaming it to the old file, you might as well do the same.
awk ... test.csv | sed ... > test.csv.1 && mv test.csv.1 test.csv
There is the misperception that using sed -i somehow avoids the creation of the temporary file. It does not. It just hides the fact from you. Sometimes abstraction is a good thing, but other times it is unnecessary obfuscation. In the case of sed -i, it is the latter. The shell is really good at file manipulation. Use it as intended. If you do need to edit a file in place, don't use the streaming version of ed; just use ed
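Following that advice, the whole edit can also be done by awk alone, writing to a temporary file and renaming it. This is a sketch under assumptions: the sample is a small, hypothetical CSV, it edits field 3 instead of 35 to stay readable, and it assumes fields contain no quoted commas:

```shell
csv=$(mktemp)
printf '%s\n' 'a,b,09876543210,d' 'e,f,19876543210,g' > "$csv"

col=3   # would be 35 for the file in the question

# Replace a leading 0 in the chosen field with +91, then swap the new file in.
awk -v col="$col" 'BEGIN { FS = OFS = "," }
     NF >= col { sub(/^0/, "+91", $col) }
     1' "$csv" > "$csv.tmp" && mv "$csv.tmp" "$csv"

result=$(cat "$csv")
echo "$result"
rm -f "$csv"
```

The NF >= col guard leaves shorter lines untouched instead of padding them with empty fields.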

So, it turned out there are numerous ways to do it. I got it working with sed as below:
sed -i 's/0\([0-9]\{10\}\)/\+91\1/g' test.csv
But this is a little tricky, as it will edit any entry that matches the criteria. However, in my case it is working fine.
Similar implementation of above logic in perl:
perl -p -i -e 's/\b0(\d{10})\b/\+91$1/g;' test.csv
Again, same caveat as mentioned above.
A more precise way of doing it, as shown by Lev Levitsky, because it operates specifically on the 35th field:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/g' test.csv
For more complex situations, I will have to consider using any of the csv modules of perl.
Thanks everyone for your time and input. I surely know more about sed/awk after reading your replies.

This might work for you:
sed -i 's/[^,]*/+91/35' test.csv
EDIT:
To replace the leading zero in the 35th field:
sed 'h;s/[^,]*/\n&/35;/\n0/!{x;b};s//+91/' test.csv
or more simply:
sed 's/^\(\([^,]*,\)\{34\}\)0/\1+91/' test.csv

If you have moreutils installed, you can simply use the sponge tool:
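awk -F "," '{print $35}' test.csv | sed 's/^0/+91/g' | sponge test.csv
sponge soaks up the input, closes the input pipe (stdin) and, only then, opens and writes to the test.csv file. Note that this pipeline writes back only what awk printed (the 35th field), so the other columns are lost.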
As of 2015, moreutils is available in package repositories of several major Linux distributions, such as Arch Linux, Debian and Ubuntu.

Another perl solution to edit the 35th field in-place:
perl -i -F, -lane '$F[34] =~ s/^0/+91/; print join ",",@F' test.csv
These command-line options are used:
-i edit the file in-place
-n loop around every line of the input file
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode: split input lines into the @F array. Defaults to splitting on whitespace.
-e execute the perl code
-F autosplit modifier, in this case splits on ,
@F is the array of fields in each line, indexed starting with 0
$F[34] is the 35th element of the array
s/^0/+91/ does the substitution
