I have file1 containing some text, like:
abcdef 123456 abcdef
ghijkl 789123 abcdef
mnopqr 123456 abcdef
and I have file2 containing a single line of text which I want to use as a pattern:
ghijkl 789123
How can I use the second file as a pattern to print the lines containing it to a third file using sed? Like file3:
ghijkl 789123 abcdef
I've tried to use
sed -ne "s/r file2//p" file1 > file3
But the content of file3 is blank for some reason
P.S. using Windows
If you have sed, do you also have access to grep?
grep -f file2 file1 > file3
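With the sample files in the question, file3 then contains the single matching line:
ghijkl 789123 abcdef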
This is the simplest sed solution on Linux: sed -n "/`<file2`/p" file1 > file3 (the pattern is quoted because it contains a space), but Windows does not provide backticks. So the Windows workaround would be:
set /p PATTERN=<file2
sed -n "/%PATTERN%/p" file1 > file3
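After the set /p, %PATTERN% holds ghijkl 789123, so the sed invocation effectively becomes:
sed -n "/ghijkl 789123/p" file1 > file3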
The sed solution is:
cat f2.txt | xargs -I {} sed -n "/{}/p" f1.txt > f3.txt
but, as @Cyrus correctly notes, grep is the proper tool for this solution and it's much nicer:
grep -f f2.txt f1.txt > f3.txt
Note: using these incredibly powerful *nix tools like sed, grep, cat, xargs, bash, etc. on Microsoft Windows can be frustrating. Consider spinning up a Linux environment, instead -- you'll save yourself many hours of grief dealing with subtle path and environment issues from emulators like Cygwin, etc.
Related
I am trying to insert the contents of a file1.txt into file2.txt using sed. The content of file1.txt is just a single line, which is a path.
I want it to be added as a prefix to each line in file2.txt, with another / character added after it.
$ cat file1.txt
/psot/rot8888/orce/db/tier/data/tine
$ cat file2.txt
o1_mf_users_abchwfg_.dbf
o1_mf_toptbs2_abchrq0_.dbf
o1_mf_toptbs1_abchrl2_.dbf
o1_mf_toptbs1_abchtlf_.dbf
Desired output should be like:
/psot/rot8888/orce/db/tier/data/tine/o1_mf_users_abchwfg_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs2_abchrq0_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs1_abchrl2_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs1_abchtlf_.dbf
Tried command:
$ sed '/o1/ r file1.txt' file2.txt >> test.txt
$ cat test.txt
o1_mf_users_abchwfg_.dbf
/psot/rot8888/orce/db/tier/data/tine
o1_mf_toptbs2_abchrq0_.dbf
/psot/rot8888/orce/db/tier/data/tine
o1_mf_toptbs1_abchrl2_.dbf
/psot/rot8888/orce/db/tier/data/tine
o1_mf_toptbs1_abchtlf_.dbf
/psot/rot8888/orce/db/tier/data/tine
You can use pr for this without having to worry about sed metacharacters, delimiters, etc.
$ cat ip.txt
abcd.xyz
123.txt
foo_baz.txt
$ cat f1
/a/b/c/d/
$ pr -mts"$(< f1)" /dev/null ip.txt
/a/b/c/d/abcd.xyz
/a/b/c/d/123.txt
/a/b/c/d/foo_baz.txt
Here -m merges the files in parallel and -s sets the separator placed between the merged files. /dev/null is used as a dummy for one of the files, since only the separator has to be prefixed.
If you need to add some more characters after the contents of the file containing the prefix:
$ cat ip.txt
abcd.xyz
123.txt
foo_baz.txt
$ cat f1
/a/b/c/d
$ pr -mts"$(< f1)"'/' /dev/null ip.txt
/a/b/c/d/abcd.xyz
/a/b/c/d/123.txt
/a/b/c/d/foo_baz.txt
This will work using any awk in any shell on every UNIX box (NR==FNR is true only while file1 is being read, so p captures the prefix; every line of file2 is then printed with that prefix and a / in front):
$ awk 'NR==FNR{p=$0; next} {print p "/" $0}' file1 file2
/psot/rot8888/orce/db/tier/data/tine/o1_mf_users_abchwfg_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs2_abchrq0_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs1_abchrl2_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs1_abchtlf_.dbf
This might work for you (GNU sed):
sed '1h;1d;G;s/\(.*\)\n\(.*\)/\2\/\1/' file1 file2
Copy file1 into the hold space and append it to each line in file2. Using a regexp and back references, swap the two halves into the correct order, joined by a /.
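For reference, the same GNU sed script spelled out over several lines with comments (same file names as the question):
sed '
# line 1 (the only line of file1) goes into the hold space and is not printed
1h
1d
# for every line of file2, append the held prefix after a newline
G
# swap the two halves and join them with a /
s/\(.*\)\n\(.*\)/\2\/\1/
' file1 file2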
Alternative:
sed 'x;s/.*/cat file1/e;G;s/\n/\//' file2
Swap the current line into the hold space, replace the pattern space with the contents of file1 (the e flag executes the cat command), append the held line, and turn the joining newline into a /.
A third way:
sed 'r file1' file2 | sed -E 'N;s/(.*)\n(.*)/\2\/\1/'
Here the r command appends the contents of file1 after each line of file2, and the second sed joins each pair of lines, putting the prefix first with a / between them.
Is it possible to pipe the output of a previous command to sed, and have sed use that output as its input (a pattern or string) when working on a file?
I know that if you only use sed, you can use something like
sed -i '1 i\anything' file
But can I do something like
head -1 file1 | sed -i '1 i\OutputFromPreviousCmd' file2
This way, I don't need to manually copy the output and change the sed command every time.
Update:
Added the files I meant
head -3 file1.txt
Side A,Age(us),mm:ss.ms_us_ns_ps
84 Vendor Specific, 0000000009096, 0349588242
84 Vendor Specific, 0000000011691, 0349591828
head -3 file2.txt
84 Vendor Specific, 0000000000418, 0349575322
83 Vendor Specific, 0000000002099, 0349575343
83 Vendor Specific, 0000000001628, 0349576662
I'd like to grab the first line of file1 and insert it into file2, so the result should be:
head -3 file2.txt
Side A,Age(us),mm:ss.ms_us_ns_ps
84 Vendor Specific, 0000000000418, 0349575322
83 Vendor Specific, 0000000002099, 0349575343
83 Vendor Specific, 0000000001628, 0349576662
head -1 file1 | sed '1s/^/1i /' | sed -i -f- file2
This takes your one line of output, prepends the sed 1i command, then pipes that sed command stream to sed, using -f- to read sed commands from stdin.
For example:
$ echo bob > bob.txt
$ echo alice | sed '1s/^/1i /' | sed -i -f- bob.txt
$ more bob.txt
alice
bob
This looks like pure pipes rather than commands ending in > temp ; mv temp file2, but sed is doing exactly that under the hood when -i is used.
This might work for you (GNU sed):
head -1 file1 | sed -i '1e cat /dev/stdin' file2
Insert the first line of file1 into the start of file2.
But why not use cat?
cat <(head -1 file1) file2
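With the sample data from the question, file2 (or, for the cat version, standard output) then begins:
Side A,Age(us),mm:ss.ms_us_ns_ps
84 Vendor Specific, 0000000000418, 0349575322
83 Vendor Specific, 0000000002099, 0349575343
83 Vendor Specific, 0000000001628, 0349576662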
I have a file with many lines; each line contains either the substring
whatever_blablablalsfjlsdjf;asdfjlds;f/watch?v=yPrg-JN50sw&,whatever_blabla
or
whatever_blablabla"/watch?v=yPrg-JN50sw&" class=whatever_blablablavwhate
I want to extract a substring, like the "yPrg-JN50sw" above.
The matching pattern is: the 11 characters after the string "/watch?v=".
How can I extract this substring? I hope it can be done with sed or awk in one line; if not, a short Perl script is also OK.
You can do
grep -oP '(?<=/watch\?v=).{11}'
if your grep knows Perl regex, or
sed 's/.*\/watch?v=\(.\{11\}\).*/\1/g'
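With the two sample lines from the question saved in a file (the name file is just for illustration), either command prints:
$ grep -oP '(?<=/watch\?v=).{11}' file
yPrg-JN50sw
yPrg-JN50sw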
$ cat file
/watch?v=yPrg-JN50sw&
"/watch?v=yPrg-JN50sw&" class=
$
$ awk 'match($0,/\/watch\?v=/) { print substr($0,RSTART+RLENGTH,11) }' file
yPrg-JN50sw
yPrg-JN50sw
Just with the shell's parameter expansion, extract the 11 chars after "watch?v=":
while IFS= read -r line; do
  tmp=${line##*watch?v=}    # strip everything up to and including "watch?v="
  echo "${tmp:0:11}"        # print the first 11 characters of what is left
done < filename
You could use sed to remove the extraneous information:
sed 's/[^=]\+=//; s/&.*$//' file
Or with awk and sensible field separators:
awk -F '[=&]' '{print $2}' file
Contents of file:
cat <<EOF > file
/watch?v=yPrg-JN50sw&
"/watch?v=yPrg-JN50sw&" class=
EOF
Output:
yPrg-JN50sw
yPrg-JN50sw
Edit accommodating new requirements mentioned in the comments
cat <<EOF > file
<div id="" yt-grid-box "><div class="yt-lockup-thumbnail"><a href="/watch?v=0_NfNAL3Ffc" class="ux-thumb-wrap yt-uix-sessionlink yt-uix-contextlink contains-addto result-item-thumb" data-sessionlink="ved=CAMQwBs%3D&ei=CPTsy8bhqLMCFRR0fAodowXbww%3D%3D"><span class="video-thumb ux-thumb yt-thumb-default-185 "><span class="yt-thumb-clip"><span class="yt-thumb-clip-inner"><img src="//i1.ytimg.com/vi/0_NfNAL3Ffc/mqdefault.jpg" alt="Miniature" width="185" ><span class="vertical-align"></span></span></span></span><span class="video-time">5:15</span>
EOF
Use awk with a sensible record separator:
awk -v RS='[=&"]' '/watch/ { getline; print }' file
With the record separator set to any of =, & or ", the /watch?v token lands in its own record, and getline then reads the very next record, which is the wanted ID.
Note, you should use a proper XML parser for this sort of task.
grep --perl-regexp --only-matching --regexp="(?<=/watch\\?v=)([^&]{0,11})"
Assuming your lines have exactly the format you quoted, this should work.
awk '{print substr($0,10,11)}'
Edit: From the comment in another answer, I guess your lines are much longer and more complicated than this, in which case something more comprehensive is needed:
gawk '{if(match($0, "/watch\\?v=([A-Za-z0-9_-]+)",a)) print a[1]}'
I have a CSV. I want to edit the 35th field of the CSV and write the change back to the 35th field. This is what I am doing on bash:
awk -F "," '{print $35}' test.csv | sed -i 's/^0/+91/g'
so, I am pulling the 35th entry using awk and then replacing the "0" at the start of the string with "+91". This works perfectly and I get the desired output on the console.
Now I want this new entry to be written back to the file. I am thinking of sed's in-place replacement feature, but this feature needs an input file. In the command above, I cannot provide an input file because my primary command is awk and sed takes its input from awk.
Thanks.
You should choose one of the two tools. As for sed, it can be done as follows:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/' test.csv
Here (([^,]*,){34}) matches the first 34 comma-terminated fields, the 0 right after them is the leading zero of the 35th field, and the replacement puts +91 in its place while keeping everything else.
Not sure about awk, but @shellter's comment might help with that.
The in-place feature of sed is misnamed, as it does not edit the file in place. Instead, it creates a new file with the same name. For example:
$ echo foo > foo
$ ln -f foo bar
$ ls -i foo bar # These are the same file
797325 bar 797325 foo
$ echo new-text > foo # Changes bar
$ cat bar
new-text
$ printf '/new/s//newer\nw\nq\n' | ed foo # Edit foo "in-place"; changes bar
9
newer-text
11
$ cat bar
newer-text
$ ls -i foo bar # Still the same file
797325 bar 797325 foo
$ sed -i s/new/newer/ foo # Does not edit in-place; creates a new file
$ ls -i foo bar
797325 bar 792722 foo
Since sed is not actually editing the file in place, but writing a new file and then renaming it to the old file, you might as well do the same.
awk ... test.csv | sed ... > test.csv.1 && mv test.csv.1 test.csv
There is a misperception that using sed -i somehow avoids the creation of the temporary file. It does not. It just hides the fact from you. Sometimes abstraction is a good thing, but other times it is unnecessary obfuscation. In the case of sed -i, it is the latter. The shell is really good at file manipulation. Use it as intended. If you do need to edit a file in place, don't use the streaming version of ed; just use ed.
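As a rough, untested sketch of that last suggestion applied to this question (borrowing the 35th-field regex from Lev Levitsky's answer), ed can be driven non-interactively like this:
printf '%s\n' 'g/^\(\([^,]*,\)\{34\}\)0/s//\1+91/' w q | ed -s test.csv
Here -s suppresses ed's byte counts, the g command applies the substitution to every line whose 35th field starts with 0, and the empty // in the s command reuses the g pattern, just as in the /new/s//newer example above.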
So, it turned out there are numerous ways to do it. I got it working with sed as below:
sed -i 's/0\([0-9]\{10\}\)/\+91\1/g' test.csv
But this is a little tricky, as it will edit any entry that matches the criteria; in my case, however, it works fine.
A similar implementation of the above logic in Perl:
perl -p -i -e 's/\b0(\d{10})\b/\+91$1/g;' test.csv
Again, same caveat as mentioned above.
A more precise way of doing it, as shown by Lev Levitsky, since it operates specifically on the 35th field:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/g' test.csv
For more complex situations, I will have to consider using one of the CSV modules for Perl.
Thanks everyone for your time and input. I surely know more about sed/awk after reading your replies.
This might work for you:
sed -i 's/[^,]*/+91/35' test.csv
EDIT:
To replace the leading zero in the 35th field:
sed 'h;s/[^,]*/\n&/35;/\n0/!{x;b};s//+91/' test.csv
or more simply:
sed 's/^\(\([^,]*,\)\{34\}\)0/\1+91/' test.csv
If you have moreutils installed, you can simply use the sponge tool:
awk -F "," '{print $35}' test.csv | sed -i 's/^0/+91/g' | sponge test.csv
sponge soaks up the input, closes the input pipe (stdin) and, only then, opens and writes to the test.csv file.
As of 2015, moreutils is available in package repositories of several major Linux distributions, such as Arch Linux, Debian and Ubuntu.
Another perl solution to edit the 35th field in-place:
perl -i -F, -lane '$F[34] =~ s/^0/+91/; print join ",",@F' test.csv
These command-line options are used:
-i edit the file in-place
-n loop around every line of the input file
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace.
-e execute the perl code
-F autosplit modifier, in this case splits on ,
@F is the array of words in each line, indexed starting with 0
$F[34] is the 35th element of the array
s/^0/+91/ does the substitution
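As a quick check of the idea on toy data (a hypothetical three-column file, so the index is 2 instead of 34):
$ cat toy.csv
name,city,0123456789
$ perl -i -F, -lane '$F[2] =~ s/^0/+91/; print join ",", @F' toy.csv
$ cat toy.csv
name,city,+91123456789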
My goal is to delete lines from a file only if they match a given PATH.
For example, I need to delete all lines containing the /etc/sysconfig PATH from the file /tmp/file:
more /tmp/file
/etc/sysconfig/network-scripts/ifcfg-lo file1
/etc/sysconfig/network-scripts/ifcfg-lo file2
/etc/sysconfig/network-scripts/ifcfg-lo file3
I wrote the following Perl code (the Perl code is integrated into my bash script) in order to delete lines that contain "/etc/sysconfig":
export FILE=/etc/sysconfig
perl -i -pe 's/\Q$ENV{FILE}\E// ' /tmp/file
But after I run the Perl code I get the following (instead of the lines being deleted):
/network-scripts/ifcfg-lo file1
/network-scripts/ifcfg-lo file2
/network-scripts/ifcfg-lo file3
First question:
How should the Perl syntax perl -i -pe 's/\Q$ENV{FILE}\E//' be changed in order to delete the lines that match the required PATH (/etc/sysconfig)?
Second question:
The same as the first question, but the line should be deleted only if the PATH matches the first field in the file.
Example:
/tmp/file before perl edit:
file1 /etc/sysconfig/network-scripts/ifcfg-lo
/etc/sysconfig/network-scripts/ifcfg-lo file2
/etc/sysconfig/network-scripts/ifcfg-lo file3
/tmp/file after perl edit:
file1 /etc/sysconfig/network-scripts/ifcfg-lo
Perl is a fine way to do it. Use the -n switch, not -p.
perl -i -l -n -e'print unless /\Q$ENV{FILE}/' filename
s/pattern/otherpattern/ won't delete entire lines; it will only alter substrings. You need to entirely change your program to delete entire lines. In pseudocode, it would be:
while (read in a line)
{
if (doesn't match)
{
write the line back out unaltered.
}
}
It can still be rewritten as a one-liner though, with knowledge of how continue and redo work in loops: perl -pe '$_ = <> and redo if /\Q$ENV{FILE}\E/'
mef@iwlappy:~$ cat /tmp/file
aaaa
/etc/sysconfig/network-scripts/ifcfg-lofile1
/etc/sysconfig/network-scripts/ifcfg-lofile2
/etc/sysconfig/network-scripts/ifcfg-lofile3
aaa
mef@iwlappy:~$ perl -i -pe 's/$ENV{FILE}\E.*//' /tmp/file
mef@iwlappy:~$ cat /tmp/file
aaaa
aaa
You can do a further regexp to remove empty lines with s/^$//
If I were doing this from the command line, I probably wouldn't even use Perl. I'd just use a negated grep:
$ mv old.txt old.bak; grep -v "$FILE" old.bak > old.txt
Renaming the original file and writing to a new file with the old name is the same thing that perl's -i switch does for you.
If you want to match just the first column, then I might punt to perl so I don't have to use awk or cut. perl's -a switch splits the line on whitespace and puts the results in @F:
$ perl -ai.bak -ne 'print if $F[0] !~ /^\Q$ENV{FILE}/' old.txt
When you think you have it right, you can remove the .bak training wheels that save a copy of your original file. Or not. I tend to like the safety net.
See perlrun for the details of command-line switches.