Delete everything but TWO patterns - sed

Is there a way to delete everything but TWO patterns in bash?
I know I can delete everything but one pattern with
sed '/pattern/!d'
but I'm looking for a way to delete everything but two patterns... something like this
sed '/pattern1 and pattern2/!d'
I don't know how to do this.
btw. I'm trying to delete everything but <..> and <..>:
Thanks for help.

You want to delete all lines that don't contain pattern1 or pattern2. Use regex alternation for this (\| in GNU sed's basic regular expressions).
$ cat in.txt
foo
bar
baz
qux
$ sed '/foo\|bar/!d' < in.txt
foo
bar

Better answer to what I had before. You could just use grep for this task assuming your patterns are in in.txt and your output will be in out.txt
grep -E -o '(pattern1|pattern2)' in.txt > out.txt
-E enables extended regular expressions and -o prints only the matched parts of each line.
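To make the effect of -o concrete, here is a minimal sketch. The three input lines are made up for illustration, and -o is assumed to be available (it is in GNU and BSD grep).

```shell
# Sample input; the three lines are invented for this illustration.
input=$(printf 'foo 1\nbar 2\nbaz 3\n')

# Without -o, grep prints every whole line that matches either pattern.
whole_lines=$(printf '%s\n' "$input" | grep -E 'foo|bar')

# With -o, grep prints only the matched text, one match per output line.
matches_only=$(printf '%s\n' "$input" | grep -E -o 'foo|bar')

printf '%s\n' "$whole_lines"
printf '%s\n' "$matches_only"
```

If you want to keep the complete lines, as the sed answers do, leave out -o.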

You can do so with either awk or sed.
$ cat file
foo
bar
baz
qux
$ sed -n '/foo\|bar/p' file
foo
bar
$ awk '/foo|bar/' file
foo
bar
The above will print any lines containing foo or bar. If you wish to be more specific, for instance, only print when foo or bar are at the start of the line, you can use ^, which means print only those that start with your pattern.
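As a rough illustration of the anchored variant, assuming a sed that accepts -E (GNU and BSD sed do) and some made-up input lines:

```shell
# Sample input: two lines start with foo/bar, two merely contain them.
input=$(printf 'foo\nxfoo\nbar\na bar\n')

# ^ anchors the alternation to the start of the line.
with_awk=$(printf '%s\n' "$input" | awk '/^(foo|bar)/')
with_sed=$(printf '%s\n' "$input" | sed -n -E '/^(foo|bar)/p')

printf '%s\n' "$with_awk"
printf '%s\n' "$with_sed"
```

Both commands should print only the foo and bar lines and skip xfoo and a bar.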

Instead of !d you can use -n and the p command:
sed -n '/pattern1/p; /pattern2/p'
This should also work with seds other than GNU sed...
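One thing to watch with two separate p commands is that a line matching both patterns is printed twice. A possible way around that, sketched here with made-up input, is to branch to the end of the script on either match and delete everything else:

```shell
# Sample input; the last line matches both patterns.
input=$(printf 'foo\nbar\nbaz\nfoo bar\n')

# Two p commands: "foo bar" is printed once per matching address.
twice=$(printf '%s\n' "$input" | sed -n '/foo/p; /bar/p')

# b with no label jumps to the end of the script, where the line is
# auto-printed exactly once; lines matching neither pattern reach d.
once=$(printf '%s\n' "$input" | sed -e '/foo/b' -e '/bar/b' -e d)

printf '%s\n' "$twice"
printf '%s\n' "$once"
```

The second form uses no alternation and no GNU extensions, so it should also behave the same on other seds.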

Related

How to replace only last match in a line with sed?

With sed, I can replace the first match in a line using
sed 's/pattern/replacement/'
And all matches using
sed 's/pattern/replacement/g'
How do I replace only the last match, regardless of how many matches there are before it?
Copy pasting from something I've posted elsewhere:
$ # replacing last occurrence
$ # can also use sed -E 's/:([^:]*)$/-\1/'
$ echo 'foo:123:bar:baz' | sed -E 's/(.*):/\1-/'
foo:123:bar-baz
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):/\1-/'
456:foo:123:bar:789-baz
$ echo 'foo and bar and baz land good' | sed -E 's/(.*)and/\1XYZ/'
foo and bar and baz lXYZ good
$ # use word boundaries as necessary - GNU sed
$ echo 'foo and bar and baz land good' | sed -E 's/(.*)\band\b/\1XYZ/'
foo and bar XYZ baz land good
$ # replacing last but one
$ echo 'foo:123:bar:baz' | sed -E 's/(.*):(.*:)/\1-\2/'
foo:123-bar:baz
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):(.*:)/\1-\2/'
456:foo:123:bar-789:baz
$ # replacing last but two
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):((.*:){2})/\1-\2/'
456:foo:123-bar:789:baz
$ # replacing last but three
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):((.*:){3})/\1-\2/'
456:foo-123:bar:789:baz
Further Reading:
Buggy behavior if word boundaries are used inside a group with quantifiers - for example: echo 'it line with it here sit too' | sed -E 's/with(.*\bit\b){2}/XYZ/' fails
Greedy vs. Reluctant vs. Possessive Quantifiers
Reference - What does this regex mean?
sed manual: Back-references and Subexpressions
This might work for you (GNU sed):
sed 's/\(.*\)pattern/\1replacement/' file
The greedy .* first swallows the whole pattern space; the regexp engine then steps back from the end of the line until pattern matches, so the first match it finds is the last occurrence on the line.
A fun way to do this, is to use rev to reverse the characters of each line and write your sed replacement backwards.
rev input_file | sed 's/nrettap/tnemecalper/' | rev
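A small worked example of the rev trick, assuming rev (from util-linux on most Linux systems) is installed. The sample strings are borrowed from the answer above.

```shell
# Single-character pattern: reversing it changes nothing, so only the
# line itself needs to be reversed before and after the substitution.
last_colon=$(printf 'foo:123:bar:baz\n' | rev | sed 's/:/-/' | rev)

# Multi-character pattern: both pattern (and -> dna) and replacement
# (XYZ -> ZYX) have to be written backwards.
last_and=$(printf 'foo and bar and baz\n' | rev | sed 's/dna/ZYX/' | rev)

printf '%s\n' "$last_colon"
printf '%s\n' "$last_and"
```

This only stays readable for literal patterns; a regex with groups or quantifiers is awkward to write in reverse.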

Replacing several lines in a script with a single line using sed

Say I have a script where I want to change several lines for a single line.
For example, I got a new function that can summarize several commands, so that I can replace in my script as follows:
Original
some_code
command1
command2
command3
some_more_code
Edited
some_code
foo()
some_more_code
How would you do that using sed?
sed '/some_code/,/command3/ !b
/some_code/ b
/command3/ a\
foo()
d' YourFile
Be careful about regex metacharacters (like &\\^$[]{}().) in any of the patterns (except your foo() line).
I am answering my own question here.
I couldn't figure out a way to do it in one go, so I split the problem into two parts.
Part 1: replace the first line
sed -e 's/command1/foo()/g' file1 > file2
Part 2: remove the rest of the lines
sed -e '/command2/,+1d' file2 > file3
I'd prefer a more elegant way though, where I can be flexible in the number of lines that I am replacing, possibly matching the last command in the block. Any ideas?
Just use awk:
$ awk -v RS='^$' -v ORS= '{sub(/command1\ncommand2\ncommand3/,"foo()")}1' file
some_code
foo()
some_more_code
The above uses GNU awk for multi-char RS.
This might work for you (GNU sed):
sed '/command1/,/command3/c\foo()' file
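For reference, the same range-plus-c idea can be written with the text on its own line after the backslash, which is the form other seds generally expect. This is a sketch using a temporary copy of the example script from the question; the variable names are only for illustration.

```shell
# Recreate the example script from the question in a temporary file.
script_file=$(mktemp)
printf 'some_code\ncommand1\ncommand2\ncommand3\nsome_more_code\n' > "$script_file"

# c on an address range deletes every line in the range and prints the
# replacement text once, at the end of the range.
edited=$(sed '/command1/,/command3/c\
foo()' "$script_file")

printf '%s\n' "$edited"
rm -f "$script_file"
```

Because the range ends at the first line matching /command3/, the number of lines in between does not matter.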

GREP SED how can I search for a pattern span into two lines?

SOLUTION
Initial solution
find . -type f -exec sed -i ':a;N;$!ba;s/\n //g' {} + | grep -l "672.15687489"
Initial post:
I was wondering how to search for a pattern in a file. The catch is that the pattern is split across two lines and I don't know at which point it is divided.
Example:
The pattern: _"672.15687489"_
But, in the file could be one of these several options:
672.15\n687489
672.156\n87489
672.1568\n7489
672.15687\n489
...
I don't care how the pattern is split; the only thing I want is the name of the file that has the pattern.
Thank you for the hilarious sed | grep "solution":
sed -i ':a;N;$!ba;s/\n //g' {} + | grep -l "672.15687489"
but in reality, just use awk. Here's a GNU awk solution that won't change your original file, doesn't require multiple commands and a pipe, and does not require a James Bond decoder ring to understand an arcane combination of letters and punctuation marks:
$ cat file
foo
672.15
687489
bar
$ gawk -v RS='\0' '{gsub(/\n/,"")} /672\.15687489/{print FILENAME; exit}' file
file
All you need to know is that setting RS to the Null character tells gawk to read the whole file as a single record. Other awks may or may not support this but GNU awk does. There are other awk solutions, all of which would be clearer than the posted sed+grep solution.
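If GNU awk is not available, one possible alternative is to strip the newlines with tr and search the result with a fixed-string grep. The sketch below is only an illustration: has_split_pattern and the demo files are hypothetical names, and it does not modify the files it searches.

```shell
# has_split_pattern FILE PATTERN
# Succeeds if the literal PATTERN occurs in FILE once newlines are removed.
has_split_pattern() {
    tr -d '\n' < "$1" | grep -q -F -- "$2"
}

# Hypothetical demo files: one contains the split number, one does not.
demo_dir=$(mktemp -d)
printf 'foo\n672.15\n687489\nbar\n' > "$demo_dir/split.txt"
printf 'foo\nbar\n' > "$demo_dir/other.txt"

# Print the name of every file in which the joined pattern is found.
found=$(
    for f in "$demo_dir"/*.txt; do
        if has_split_pattern "$f" '672.15687489'; then
            printf '%s\n' "${f##*/}"
        fi
    done
)

printf '%s\n' "$found"
rm -rf "$demo_dir"
```

Like the awk version, this joins all lines, so it also reports patterns split over more than two lines.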

In-place replacement

I have a CSV. I want to edit the 35th field of the CSV and write the change back to the 35th field. This is what I am doing on bash:
awk -F "," '{print $35}' test.csv | sed 's/^0/+91/g'
So I am pulling the 35th entry using awk and then replacing the "0" at the start of the string with "+91". This works perfectly and I get the desired output on the console.
Now I want this new entry to be written back to the file. I am thinking of sed's "in-place" replacement feature (-i), but this feature needs an input file. In the above command, I cannot provide an input file because my primary command is awk and sed is taking its input from awk.
Thanks.
You should choose one of the two tools. As for sed, it can be done as follows:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/' test.csv
Not sure about awk, but @shellter's comment might help with that.
The in-place feature of sed is misnamed, as it does not edit the file in place. Instead, it creates a new file with the same name. eg:
$ echo foo > foo
$ ln -f foo bar
$ ls -i foo bar # These are the same file
797325 bar 797325 foo
$ echo new-text > foo # Changes bar
$ cat bar
new-text
$ printf '/new/s//newer\nw\nq\n' | ed foo # Edit foo "in-place"; changes bar
9
newer-text
11
$ cat bar
newer-text
$ ls -i foo bar # Still the same file
797325 bar 797325 foo
$ sed -i s/new/newer/ foo # Does not edit in-place; creates a new file
$ ls -i foo bar
797325 bar 792722 foo
Since sed is not actually editing the file in place, but writing a new file and then renaming it to the old file, you might as well do the same.
awk ... test.csv | sed ... > test.csv.1 && mv test.csv.1 test.csv
There is the misperception that using sed -i somehow avoids the creation of the temporary file. It does not. It just hides the fact from you. Sometimes abstraction is a good thing, but other times it is unnecessary obfuscation. In the case of sed -i, it is the latter. The shell is really good at file manipulation. Use it as intended. If you do need to edit a file in place, don't use the streaming version of ed; just use ed
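The write-to-a-temporary-file-then-rename approach might look like the sketch below, with awk doing the field edit itself. The file contents and the col variable are made up for the demo (the question would use column 35), and the naive comma split assumes no quoted fields containing commas.

```shell
# Small stand-in for test.csv; the question's file has at least 35 fields.
csv=$(mktemp)
printf 'a,b,0123\nc,d,9123\n' > "$csv"

col=3   # hypothetical column number for this demo; the question uses 35

# Edit the chosen field, write everything to a temporary file, and
# replace the original only if awk succeeded.
tmp=$(mktemp)
awk -F, -v OFS=, -v col="$col" \
    'NF >= col { sub(/^0/, "+91", $col) } 1' "$csv" > "$tmp" && mv "$tmp" "$csv"

result=$(cat "$csv")
printf '%s\n' "$result"
rm -f "$csv"
```

Setting OFS to a comma matters because awk rebuilds the record with OFS whenever a field is modified.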
So, it turned out there are numerous ways to do it. I got it working with sed as below:
sed -i 's/0\([0-9]\{10\}\)/\+91\1/g' test.csv
But this is a little tricky, as it will edit any entry that matches the criteria. In my case, however, it works fine.
Similar implementation of above logic in perl:
perl -p -i -e 's/\b0(\d{10})\b/\+91$1/g;' test.csv
Again, same caveat as mentioned above.
A more precise way of doing it, as shown by Lev Levitsky, because it operates specifically on the 35th field:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/g' test.csv
For more complex situations, I will have to consider using any of the csv modules of perl.
Thanks everyone for your time and input. I surely know more about sed/awk after reading your replies.
This might work for you:
sed -i 's/[^,]*/+91/35' test.csv
EDIT:
To replace the leading zero in the 35th field:
sed 'h;s/[^,]*/\n&/35;/\n0/!{x;b};s//+91/' test.csv
or more simply:
sed 's/^\(\([^,]*,\)\{34\}\)0/\1+91/' test.csv
If you have moreutils installed, you can simply use the sponge tool:
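awk -F "," '{print $35}' test.csv | sed 's/^0/+91/g' | sponge test.csv
sponge soaks up all of its input, waits for the input pipe (stdin) to close and, only then, opens and writes to the test.csv file. Note that sed must be run without -i here, since it reads from the pipe rather than from a file, and that this pipeline writes only the extracted 35th field back to test.csv.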
As of 2015, moreutils is available in package repositories of several major Linux distributions, such as Arch Linux, Debian and Ubuntu.
Another perl solution to edit the 35th field in-place:
perl -i -F, -lane '$F[34] =~ s/^0/+91/; print join ",",@F' test.csv
These command-line options are used:
-i edit the file in-place
-n loop around every line of the input file
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace.
-e execute the perl code
-F autosplit modifier, in this case splits on ,
@F is the array of words in each line, indexed starting with 0
$F[34] is the 35th element of the array
s/^0/+91/ does the substitution

How do I push `sed` matches to the shell call in the replacement pattern?

I need to replace several URLs in a text file with some content dependent on the URL itself. Let's say for simplicity it's the first line of the document at the URL.
What I'm trying is this:
sed "s/^URL=\(.*\)/TITLE=$(curl -s \1 | head -n 1)/" file.txt
This doesn't work, since \1 is not set. However, the shell is getting called. Can I somehow push the sed match variables to that subprocess?
The accepted answer is just plain wrong. Proof:
Make an executable script foo.sh:
#! /bin/bash
echo $* 1>&2
Now run it:
$ echo foo | sed -e "s/\\(foo\\)/$(./foo.sh \\1)/"
\1
$
The $(...) is expanded before sed is run.
So you are trying to call an external command from inside the replacement pattern of a sed substitution. I don't think it can be done; the $... inside a pattern just allows you to use an already existing (constant) shell variable.
I'd go with Perl, see the /e option in the search-replace operator (s/.../.../e).
UPDATE: I was wrong, sed plays nicely with the shell, and it allows you to do that. But then the backslash in \1 should be escaped. Try instead:
sed "s/^URL=\(.*\)/TITLE=$(curl -s \\1 | head -n 1)/" file.txt
Try this:
sed "s/^URL=\(.*\)/\1/" file.txt | while read url; do sed "s#URL=\($url\)#TITLE=$(curl -s $url | head -n 1)#" file.txt; done
If there are duplicate URLs in the original file, then there will be n^2 of them in the output. The # as a delimiter depends on the URLs not including that character.
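A possible way to avoid the repeated passes over the file is to let the shell read it once and rewrite only the URL= lines. In this sketch fetch_title is a hypothetical stand-in for curl -s "$url" | head -n 1, so that the example runs without network access, and the input file is made up.

```shell
# Hypothetical stand-in for: curl -s "$1" | head -n 1
fetch_title() {
    printf 'title of %s\n' "$1"
}

# Made-up input file with one URL line and one ordinary line.
input_file=$(mktemp)
printf 'URL=http://example.com/a\nother line\n' > "$input_file"

# Read the file once; replace URL= lines, pass everything else through.
output=$(
    while IFS= read -r line; do
        case $line in
            (URL=*) printf 'TITLE=%s\n' "$(fetch_title "${line#URL=}")" ;;
            (*)     printf '%s\n' "$line" ;;
        esac
    done < "$input_file"
)

printf '%s\n' "$output"
rm -f "$input_file"
```

Since the URL never passes through a sed expression, it does not matter which characters it contains.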
Late reply, but making sure people don't get thrown off by the answers here -- this can be done in GNU sed using the e flag of the s command, which executes the result of the substitution as a shell command. The following, for example, decrements a number at the beginning of a line:
echo "444 foo" | sed "s/\([0-9]*\)\(.*\)/expr \1 - 1 | tr -d '\n'; echo \"\2\";/e"
will produce:
443 foo