GREP SED how can I search for a pattern span into two lines? - sed

SOLUTION
Initial solution
find . -type f -exec sed -i ':a;N;$!ba;s/\n //g' {} + | grep -l "672.15687489"
Initial post:
I was wondering how to search for a pattern in a file. The but is that the pattern is spanned in two lines and I don't know in which part the pattern is divided.
Example:
The pattern: _"672.15687489"_
But, in the file could be one of these several options:
672.15\n687489
672.156\n87489
672.1568\n7489
672.15687\n489
...
I don't care how the pattern is splitted, the only thing I want is the name of the file that have the pattern.

Thank you for the hilarious sed | grep "solution":
sed -i ':a;N;$!ba;s/\n //g' {} + | grep -l "672.15687489"
but in reality, just use awk. Here's a GNU awk solution that won't change your original file, doesn't require multiple commands and a pipe, and does not require a James Bond decoder ring to understand an arcane combination of letters and punctuation marks:
$ cat file
foo
672.15
687489
bar
$ gawk -v RS='\0' '{gsub(/\n/,"")} /672.15687489/{print FILENAME; exit}' file
file
All you need to know is that setting RS to the Null character tells gawk to read the whole file as a single record. Other awks may or may not support this but GNU awk does. There are other awk solutions, all of which would be clearer than the posted sed+grep solution.

Related

sed remove line if neither pattern provided don't match

I am trying to create a filter command to reduce the lines from a log file, assume each line contains partition made of date,
/iamthepath01/20200301/file01.txt
/iamthepath02/20200302/file02.txt
....
/iamthepathxx/20210619/filexx.txt
then from thousands of lines I only want to keep the ones with two string in the path
/202106
/202105
and remove any other lines
I have tried following command
sed -i -e '\(/202105\|/202106\)!d' ~/log.txt
above command threw
sed: -e expression #1, char 24: unterminated address regex
You can use
sed -i '/\/20210[56]/!d' ~/log.txt
Or, if you need to use more specific alternatives and further enhance the pattern:
sed -i -E '/\/(202105|202106)/!d' ~/log.txt
Details:
-i - GNU sed option for inline file replacement
-E - option enabling POSIX ERE regex syntax
/\/20210[56]/ - regex that matches /20210 and then either 5 or 6
\/(202105|202106) - the POSIX ERE pattern that matches / and then either 202105 or 202106
!d - removes the lines not matching the pattern.
See the online demo:
#!/bin/bash
s='/iamthepath01/20200301/file01.txt
/iamthepath02/20200302/file02.txt
/iamthepathxx/20210619/filexx.txt'
sed '/\/20210[56]/!d' <<< "$s"
Output:
/iamthepathxx/20210619/filexx.txt
sed is the wrong tool for this. If you want a script that's as fragile as the sed one then use grep as it's the tool that exists solely to do a simple g/re/p (hence the name) like you're doing:
$ grep '/20210[56]' file
/iamthepathxx/20210619/filexx.txt
or if you want a more robust solution that focuses just on the part of the line you want to match and so will avoid false matches, then use awk:
$ awk -F '/' '$3 ~ /^20210[56]/' file
/iamthepathxx/20210619/filexx.txt
This might work for you (GNU sed):
sed -ni '\#/20210[56]#p' file
This uses seds -n grep-like option to turn off implicit printing and -i option to edit the file in place.
Normally sed uses the /.../ to match but other delimiters may be used if the first is escaped e.g. \#...#.
So the above solution will filter the existing file down to lines that contain either /202105 or /202106.
N.B. grep will almost certainly be faster in finding the above lines however the use of the -i option may be the ultimate reason for choosing sed (although the same outcome can be achieved by tacking on the > tmpFile && mv tmpFile file to a grep solution).

In-place replacement

I have a CSV. I want to edit the 35th field of the CSV and write the change back to the 35th field. This is what I am doing on bash:
awk -F "," '{print $35}' test.csv | sed -i 's/^0/+91/g'
so, I am pulling the 35th entry using awk and then replacing the "0" in the starting position in the string with "+91". This one works perfet and I get desired output on the console.
Now I want this new entry to get written in the file. I am thinking of sed's "in -place" replacement feature but this fetuare needs and input file. In above command, I cannot provide input file because my primary command is awk and sed is taking the input from awk.
Thanks.
You should choose one of the two tools. As for sed, it can be done as follows:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/' test.csv
Not sure about awk, but #shellter's comment might help with that.
The in-place feature of sed is misnamed, as it does not edit the file in place. Instead, it creates a new file with the same name. eg:
$ echo foo > foo
$ ln -f foo bar
$ ls -i foo bar # These are the same file
797325 bar 797325 foo
$ echo new-text > foo # Changes bar
$ cat bar
new-text
$ printf '/new/s//newer\nw\nq\n' | ed foo # Edit foo "in-place"; changes bar
9
newer-text
11
$ cat bar
newer-text
$ ls -i foo bar # Still the same file
797325 bar 797325 foo
$ sed -i s/new/newer/ foo # Does not edit in-place; creates a new file
$ ls -i foo bar
797325 bar 792722 foo
Since sed is not actually editing the file in place, but writing a new file and then renaming it to the old file, you might as well do the same.
awk ... test.csv | sed ... > test.csv.1 && mv test.csv.1 test.csv
There is the misperception that using sed -i somehow avoids the creation of the temporary file. It does not. It just hides the fact from you. Sometimes abstraction is a good thing, but other times it is unnecessary obfuscation. In the case of sed -i, it is the latter. The shell is really good at file manipulation. Use it as intended. If you do need to edit a file in place, don't use the streaming version of ed; just use ed
So, it turned out there are numerous ways to do it. I got it working with sed as below:
sed -i 's/0\([0-9]\{10\}\)/\+91\1/g' test.csv
But this is little tricky as it will edit any entry which matches the criteria. however in my case, It is working fine.
Similar implementation of above logic in perl:
perl -p -i -e 's/\b0(\d{10})\b/\+91$1/g;' test.csv
Again, same caveat as mentioned above.
More precise way of doing it as shown by Lev Levitsky because it will operate specifically on the 35th field
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/g' test.csv
For more complex situations, I will have to consider using any of the csv modules of perl.
Thanks everyone for your time and input. I surely know more about sed/awk after reading your replies.
This might work for you:
sed -i 's/[^,]*/+91/35' test.csv
EDIT:
To replace the leading zero in the 35th field:
sed 'h;s/[^,]*/\n&/35;/\n0/!{x;b};s//+91/' test.csv
or more simply:
|sed 's/^\(\([^,]*,\)\{34\}\)0/\1+91/' test.csv
If you have moreutils installed, you can simply use the sponge tool:
awk -F "," '{print $35}' test.csv | sed -i 's/^0/+91/g' | sponge test.csv
sponge soaks up the input, closes the input pipe (stdin) and, only then, opens and writes to the test.csv file.
As of 2015, moreutils is available in package repositories of several major Linux distributions, such as Arch Linux, Debian and Ubuntu.
Another perl solution to edit the 35th field in-place:
perl -i -F, -lane '$F[34] =~ s/^0/+91/; print join ",",#F' test.csv
These command-line options are used:
-i edit the file in-place
-n loop around every line of the input file
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode – split input lines into the #F array. Defaults to splitting on whitespace.
-e execute the perl code
-F autosplit modifier, in this case splits on ,
#F is the array of words in each line, indexed starting with 0
$F[34] is the 35 element of the array
s/^0/+91/ does the substitution

Filter text based in a multiline match criteria

I have the following sed command. I need to execute the below command in single line
cat File | sed -n '
/NetworkName/ {
N
/\n.*ims3/ p
}' | sed -n 1p | awk -F"=" '{print $2}'
I need to execute the above command in single line. can anyone please help.
Assume that the contents of the File is
System.DomainName=shayam
System.Addresses=Fr6
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=AS
System.DomainName=ims5.com
System.DomainName=Ram
System.Addresses=Fr9
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims7.com
System.DomainName=mani
System.Addresses=Hello
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims3.com
And after executing the command you will get only peer as the output. Can anyone please help me out?
You can use a single nawk command. And you can lost the useless cat
nawk -F"=" '/NetworkName/{n=$2;getline;if($2~/ims3/){print n} }' file
You can use sed as well as proposed by others, but i prefer less regex and less clutter.
The above save the value of the network name to "n". Then, get the next line and check the 2nd field against "ims3". If matched, then print the value of "n".
Put that code in a separate .sh file, and run it as your single-line command.
cat File | sed -n '/NetworkName/ { N; /\n.*ims3/ p }' | sed -n 1p | awk -F"=" '{print $2}'
Assuming that you want the network name for the domain ims3, this command line works without sed:
grep -B 1 ims3 File | head -n 1 | awk -F"=" '{print $2}'
So, you want the network name where the domain name on the following line includes 'ims3', and not the one where the following line includes 'ims7' (even though the network names in the example are the same).
sed -n '/NetworkName/{N;/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;};}' File
This avoids abuse of felines, too (not to mention reducing the number of commands executed).
Tested on MacOS X 10.6.4, but there's no reason to think it won't work elsewhere too.
However, empirical evidence shows that Solaris sed is different from MacOS sed. It can all be done in one sed command, but it needs three lines:
sed -n '/NetworkName/{N
/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;}
}' File
Tested on Solaris 10.
You just need to put -e pretty much everywhere you'd break the command at a newline or have a semicolon. You don't need the extra call to sed or awk or cat.
sed -n -e '/NetworkName/ {' -e 'N' -e '/\n.*ims3/ s/[^\n]*=\(.*\).*/\1/P' -e '}' File

Have sed make substitute on string but SKIP first occurrence

I have been through the sed one liners but am still having trouble with my goal. I want to substitue matching strings on all but the first occurrence of a line. My exact usage would be:
$ echo 'cd /Users/joeuser/bump bonding/initial trials' | sed <<MAGIC HAPPENS>
cd /Users/joeuser/bump\ bonding/initial\ trials
The line replaced the space in bump bonding with the slash space bump\ bonding so that I can execute this line (since when the spaces aren't escaped I wouldn't be able to cd to it).
Update: I solved this by just using single quotes and outputting
cd 'blah blah/thing/another space/'
and then using source to execute the command. But it didn't answer my question. I'm still curious though... how would you use sed to fix it?
s/ /\\ /2g
The 2 specifies that the second one should apply, and the g specifies that all the rest should apply too. (This probably only works on GNU sed. According to the Open Group Base Specification, "If both g and n are specified, the results are unspecified.")
You can avoid the problem with g and n
Replace all of them, then undo the first one:
sed -e 's/ /\\ /g' -e 's/\\ / /1'
Here's another method which uses the t branch-if-substituted command:
sed ':a;s/\([^ ]* .*[^\\]\) \(.*\)/\1\\ \2/;ta'
which has the advantage of leaving existing backslash-space sequences in the input intact.
use awk
$ echo cd 'blah blah/thing/another space/' | awk '{for(i=2;i<NF;i++) $i=$i"\\"}1'
cd blah\ blah/thing/another\ space/
$ echo 'cd /Users/joeuser/bump bonding/initial trials' | awk '{for(i=2;i<NF;i++) $i=$i"\\"}1'
cd /Users/joeuser/bump\ bonding/initial\ trials

How to find and replace all occurrences of a string recursively in a directory tree? [duplicate]

This question already has answers here:
How can I do a recursive find/replace of a string with awk or sed?
(37 answers)
Closed 1 year ago.
Using just grep and sed, how do I replace all occurrences of:
a.example.com
with
b.example.com
within a text file under the /home/user/ directory tree recursively finding and replacing all occurrences in all files in sub-directories as well.
Try this:
find /home/user/ -type f | xargs sed -i 's/a\.example\.com/b.example.com/g'
In case you want to ignore dot directories
find . \( ! -regex '.*/\..*' \) -type f | xargs sed -i 's/a\.example\.com/b.example.com/g'
Edit: escaped dots in search expression
Try this:
grep -rl 'SearchString' ./ | xargs sed -i 's/REPLACESTRING/WITHTHIS/g'
grep -rl will recursively search for the SEARCHSTRING in the directories ./ and will replace the strings using sed.
Ex:
Replacing a name TOM with JERRY using search string as SWATKATS in directory CARTOONNETWORK
grep -rl 'SWATKATS' CARTOONNETWORK/ | xargs sed -i 's/TOM/JERRY/g'
This will replace TOM with JERRY in all the files and subdirectories under CARTOONNETWORK wherever it finds the string SWATKATS.
On macOS, none of the answers worked for me. I discovered that was due to differences in how sed works on macOS and other BSD systems compared to GNU.
In particular BSD sed takes the -i option but requires a suffix for the backup (but an empty suffix is permitted)
grep version from this answer.
grep -rl 'foo' ./ | LC_ALL=C xargs sed -i '' 's/foo/bar/g'
find version from this answer.
find . \( ! -regex '.*/\..*' \) -type f | LC_ALL=C xargs sed -i '' 's/foo/bar/g'
Don't omit the Regex to ignore . folders if you're in a Git repo. I realized that the hard way!
That LC_ALL=C option is to avoid getting sed: RE error: illegal byte sequence if sed finds a byte sequence that is not a valid UTF-8 character. That's another difference between BSD and GNU. Depending on the kind of files you are dealing with, you may not need it.
For some reason that is not clear to me, the grep version found more occurrences than the find one, which is why I recommend to use grep.
I know this is a really old question, but...
#vehomzzz's answer uses find and xargs when the questions says explicitly grep and sed only.
#EmployedRussian and #BrooksMoses tried to say it was a dup of awk and sed, but it's not - again, the question explicitly says grep and sed only.
So here is my solution, assuming you are using Bash as your shell:
OLDIFS=$IFS
IFS=$'\n'
for f in `grep -rl a.example.com .` # Use -irl instead of -rl for case insensitive search
do
sed -i 's/a\.example\.com/b.example.com/g' $f # Use /gi instead of /g for case insensitive search
done
IFS=$OLDIFS
If you are using a different shell, such as Unix SHell, let me know and I will try to find a syntax adjustment.
P.S.: Here's a one-liner:
OLDIFS=$IFS;IFS=$'\n';for f in `grep -rl a.example.com .`;do sed -i 's/a\.example\.com/b.example.com/g' $f;done;IFS=$OLDIFS
Sources:
Bash: Iterating over lines in a variable
grep(1) - Linux man page
Official Grep Manual
sed(1) - Linux man page
Official sed Manual
For me works the next command:
find /path/to/dir -name "file.txt" | xargs sed -i 's/string_to_replace/new_string/g'
if string contains slash 'path/to/dir' it can be replace with another character to separate, like '#' instead '/'.
For example: 's#string/to/replace#new/string#g'
We can try using the more powerful ripgrep as
rg "BYE_BYE_TEXT" ./ --files-with-matches | xargs sed -i "s/BYE_BYE_TEXT/WELCOME_TEXT/g"
Because ripgrep is good at finding and sed is great at replacing.
it is much simpler than that.
for i in `find *` ; do sed -i -- 's/search string/target string/g' $i; done
find i => will iterate over all the files in the folder and in subfolders.
sed -i => will replace in the files the relevant string if exists.
Try this command:
/home/user/ directory - find ./ -type f \
-exec sed -i -e 's/a.example.com/b.example.com/g' {} \;
The command below will search all the files recursively whose name matches the search pattern and will replace the string:
find /path/to/searchdir/ -name "serachpatter" -type f | xargs sed -i 's/stringone/StrIngTwo/g'
Also if you want to limit the depth of recursion you can put the limits as well:
find /path/to/searchdir/ -name "serachpatter" -type f -maxdepth 4 -mindepth 2 | xargs sed -i 's/stringone/StrIngTwo/g'