Can somebody please help me with this?
I have this line in test.txt:
siemplog1.nw.lan / 172.31.180.22
I tried this command:
sed -Ei "s/^[a-z A-Z].*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*/\1/" test.txt
The result should be 172.31.180.22, but I got 2.31.180.22.
Thank you.
The .* matches as many characters as it can (it is "greedy"), and since [0-9]{1,3} can match just one digit, the 17 is consumed by the .* and only the 2 is left for the first [0-9]{1,3}.
You may stop the .* before any non-digit:
sed -Ei 's~.*[^0-9]([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*~\1~' test.txt
Or, before /:
sed -Ei 's~.*/ *([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*~\1~' test.txt
See online sed demo:
s='siemplog1.nw.lan / 172.31.180.22'
sed -E 's~.*/ *([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*~\1~' <<< "$s"
# => 172.31.180.22
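To see the difference on the sample line, here is a small sketch that runs the original pattern and the non-digit-anchored one side by side. It leaves out -i so no file is modified, and the variable names are only for illustration:

```shell
s='siemplog1.nw.lan / 172.31.180.22'

# Original pattern: the greedy .* leaves a single digit for the first octet.
greedy=$(printf '%s\n' "$s" | sed -E 's/^[a-z A-Z].*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*/\1/')

# Requiring a non-digit right before the group makes .* stop before the 1.
anchored=$(printf '%s\n' "$s" | sed -E 's~.*[^0-9]([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*~\1~')

printf '%s\n' "$greedy"    # 2.31.180.22
printf '%s\n' "$anchored"  # 172.31.180.22
```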
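If your string is always in this format, you can simplify the sed command to
sed -nE 's~.*/ *([0-9.]+)~\1~p'
or, if other text may follow the IP address,
sed -nE 's~.*/ *([0-9.]+).*~\1~p'
(the p flag is paired with -n here, otherwise each matching line is printed twice).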
If there is always a space before the IP:
$ echo siemplog1.nw.lan / 172.31.180.22 | sed -E "s/.* ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*/\1/"
172.31.180.22
This is mostly out of curiosity. I am trying to get the same behavior as:
echo -e "test1:test2:test3"| sed 's/:/\n/g' | grep 1
in a single sed command.
I already tried
echo -e "test1:test2:test3"| sed -e "s/:/\n/g" -n "/1/p"
But I get the following error:
sed: can't read /1/p: No such file or directory
Any idea on how to fix this and combine different types of commands into a single sed call?
Of course this is overly simplified compared to the real use case, and I know I can get around it by using multiple calls; again, this is just out of curiosity.
EDIT: I am mostly interested in the sed tool, I already know how to do it using other tools, or even combinations of those.
EDIT2: Here is a more realistic script, closer to what I am trying to achieve:
arch=linux64
base=https://chromedriver.storage.googleapis.com
split="<Contents>"
curl $base \
| sed -e 's/<Contents>/<Contents>\n/g' \
| grep $arch \
| sed -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
What I would like to simplify is the curl line, turning it into something like:
curl $base \
| sed 's/<Contents>/<Contents>\n/g' -n '/1/p' -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
Here are some alternatives, awk and sed based:
sed -E "s/(.*:)?([^:]*1[^:]*).*/\2/" <<< "test1:test2:test3"
awk -v RS=":" '/1/' <<< "test1:test2:test3"
# or also
awk 'BEGIN{RS=":"} /1/' <<< "test1:test2:test3"
Or, using your logic, you would need to pipe a second sed command:
sed "s/:/\n/g" <<< "test1:test2:test3" | sed -n "/1/p"
See this online demo. The awk solution looks cleanest.
Details
In the sed solution, the (.*:)?([^:]*1[^:]*).* pattern matches an optional sequence of zero or more chars followed by a :, then captures into Group 2 zero or more chars other than :, a 1, and again zero or more chars other than :, and finally matches the rest of the line. The replacement keeps only the Group 2 contents.
In the awk solution, the record separator is set to : and the /1/ regex then returns only the records that contain 1.
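As for the error in the question: -n is a flag that takes no argument, so sed treats the "/1/p" after it as an input file name. Here is a sketch of the usual way to combine several commands, with each one behind its own -e (separating them with ; in a single script also works). GNU sed is assumed for \n in the replacement:

```shell
# Both commands run now, but /1/p tests the whole pattern space,
# which still holds all three pieces (only separated by newlines).
out=$(printf 'test1:test2:test3\n' | sed -n -e 's/:/\n/g' -e '/1/p')
printf '%s\n' "$out"
# test1
# test2
# test3
```

The call no longer fails, yet it prints all three pieces, because sed filters per pattern space and not per embedded line. That is why a single sed call needs either a pattern that isolates the wanted piece or the P and D commands.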
This might work for you (GNU sed):
sed 's/:/\n/;/^[^\n]*1/P;D' file
Replace the first : with a newline and, if the first line in the pattern space contains 1, print it.
Delete that first line and repeat with what remains.
An alternative:
sed -Ez 's/:/\n/g;s/^[^1]*$//mg;s/\n+/\n/;s/^\n//' file
This slurps the whole file into memory and replaces all colons by newlines. All lines that do not contain 1 are removed and surplus newlines deleted.
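A quick way to check the P;D loop above on more than one input line. The second input line is made up for illustration, and GNU sed is assumed:

```shell
# s/:/\n/  splits off the first piece
# /^[^\n]*1/P prints that piece only if it contains 1
# D drops the piece and reruns the script on the rest (no autoprint)
out=$(printf 'test1:test2:test3\nfoo:bar1:baz\n' | sed 's/:/\n/;/^[^\n]*1/P;D')
printf '%s\n' "$out"
# test1
# bar1
```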
An alternative to the really ugly sed is: grep -o '\w*2\w*'
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | grep -o '\w*2\w*'
test2
bob2
fred2
grep -o prints only the matching part of each line.
Or: grep -o '[^:]*2[^:]*'
echo -e "test1:test2:test3" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D'
sed -n doesn't print unless told to
sed -E allows using parens to match (\n|$) which is newline or the end of the pattern space
P prints the pattern buffer up to the first newline.
D deletes the pattern buffer up to the first newline and restarts the script on the remainder without reading new input
[^\n] is a character class that matches anything except a newline
// is sed shorthand for reusing the most recently used regular expression
//!D then runs D only when that regular expression does not match
So, after you split into newlines, you want to make sure the 2 character is between the start of the pattern buffer ^ and the first newline.
And, if there is not the character you are looking for, you want to D delete up to the first newline.
At that point, it works for one line of input, with one string containing the character you're looking for.
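To expand to several matches within a line, make the D unconditional. D deletes the first piece and restarts the script on whatever remains, so every piece is tested in turn and no explicit branch or label is needed (any command placed after an unconditional D would never run):
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | \
sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D'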
test2
bob2
fred2
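The difference between the conditional //!D and a plain D can be checked on the second sample line alone. This is only a sketch, assuming GNU sed; the variable names are illustrative:

```shell
line='bob3:bob2:fred2'

# //!D: once a piece matches, D is skipped and the cycle ends after one P.
first_only=$(printf '%s\n' "$line" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D')

# Plain D: always drop the first piece and rerun the script on the rest.
all=$(printf '%s\n' "$line" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D')

printf '%s\n' "$first_only"   # bob2
printf '%s\n' "$all"          # bob2, then fred2
```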
This is simply NOT a job for sed. With GNU awk for multi-char RS:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '/1/'
test1
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' 'NR%2'
test1
test3
test5
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '!(NR%2)'
test2
test4
test6
$ echo "foo1:bar1:foo2:bar2:foo3:bar3" | awk -v RS='[:\n]' '/foo/ || /2/'
foo1
foo2
bar2
foo3
With any awk you'd just have to strip the \n from the final record before operating on it:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS=':' '{sub(/\n$/,"")} /1/'
test1
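Why the sub() matters with a single-character RS: the final record keeps the trailing newline of the input, so printing it produces an extra empty line. A small sketch that counts output lines with and without the cleanup (the input is shortened for illustration):

```shell
# Last record is "test6\n"; print adds another newline, giving 2 lines.
raw=$(printf 'test1:test2:test6\n' | awk -v RS=':' '/6/' | wc -l)

# With the trailing newline stripped first, only 1 line is printed.
clean=$(printf 'test1:test2:test6\n' | awk -v RS=':' '{sub(/\n$/,"")} /6/' | wc -l)

echo "without sub: $raw line(s), with sub: $clean line(s)"
```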
I'm trying to copy part of a line to append to the end:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz
becomes:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
I have tried:
sed 's/\(.*(GCA_\)\(.*\))/\1\2\2)'
$ f1=$'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz'
$ echo "$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\1\2\3\/\2\4/' <<<"$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
sed -E (or -r on some systems) enables extended regex support in sed, so you don't need to escape the group parentheses ( ).
The group (GCA_.[^.]*) means "starting at GCA_, take all chars up to, but excluding, the first dot found":
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\2/' <<<"$f1"
GCA_900169985
Similarly, (.[^_]*) means get all chars up to the first _ found (excluding the _ char). A negated character class like this is the regex way to emulate a non-greedy/lazy capture in sed (in a Perl regex this could be written with a lazy quantifier, something like .*?_).
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\3/' <<<"$f1"
.1
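To illustrate the point about greedy matching versus a negated character class, here is a minimal sketch on the file name part of the URL. The two patterns are simplified examples, not the ones from the command above:

```shell
s='GCA_900169985.1_IonXpress_024_genomic.fna.gz'

# Greedy .* runs to the LAST underscore before backing off.
greedy_part=$(printf '%s\n' "$s" | sed -E 's/^(.*)_.*/\1/')

# [^_]* cannot cross an underscore, so it stops at the FIRST one.
class_part=$(printf '%s\n' "$s" | sed -E 's/^([^_]*)_.*/\1/')

printf '%s\n' "$greedy_part"   # GCA_900169985.1_IonXpress_024
printf '%s\n' "$class_part"    # GCA
```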
Short sed approach:
s="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz"
sed -E 's/(GCA_[^._]+)\.([^_]+)/\1.\2\/\1/' <<< "$s"
The output:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
I want to extract /battle/start from the following text file:
$ cat sample
user_id=1234 /battle/start
I ran the following sed command:
$ cat sample | sed 's|.*\(/.*\)|\1|g'
/start
But the result drops /battle, so I can't extract what I want.
What is wrong with it?
You can remove all characters up to last space:
$ sed 's/.* //' <<< "user_id=1234 /battle/start"
/battle/start
or use cut:
$ cut -d' ' -f2 <<< "user_id=1234 /battle/start"
/battle/start
Sed does a greedy (maximal) match, so .* matches your whole line up to, but not including, the last /.
Try:
< sample sed 's|.* \(/.*\)|\1|g'
or
< sample sed 's|[^/]*\(/.*\)|\1|g'
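A short sketch comparing the command from the question with the [^/]* variant on the sample line, reading from a variable instead of the sample file:

```shell
line='user_id=1234 /battle/start'

# Greedy .* backs off only as far as the last slash.
greedy=$(printf '%s\n' "$line" | sed 's|.*\(/.*\)|\1|')

# [^/]* cannot consume a slash, so the group starts at the first one.
fixed=$(printf '%s\n' "$line" | sed 's|[^/]*\(/.*\)|\1|')

printf '%s\n' "$greedy"   # /start
printf '%s\n' "$fixed"    # /battle/start
```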
In your RE the .* is greedy and swallows the /battle part. You could invert the logic and delete everything in front of the first /:
cat sample | sed 's/[^/]*//'
Here [^/]* matches everything up to the first / and replaces it with nothing.
echo user_id=1234 /battle/start |grep -oP '\s\K.*'
/battle/start
echo user_id=1234 /battle/start |sed -r 's/(^.*\s)(.*)/\2/g'
/battle/start
How can I fix the following: sed -e 's/é/\\'{e}/g', so as to substitute é with \'{e}?
The issue is that the second occurrence of ' is seen as the end of the quoted script;
sed -e 's/é/\\\'{e}/g' does not work either.
With GNU sed, to replace \'{e} with é:
echo "\'{e}" | sed "s/\\\'{e}/é/"
Output:
é
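For the direction asked in the question (é to \'{e}), two quoting variants are sketched below. The input word café is just a made-up example, and a UTF-8 terminal and script encoding are assumed:

```shell
# Inside double quotes, \\\\ reaches sed as \\, which the replacement
# turns into one literal backslash; the ' needs no escaping there.
out=$(printf '%s\n' 'café' | sed "s/é/\\\\'{e}/g")
printf '%s\n' "$out"

# Same sed script with single quotes: close the quote, add an escaped ',
# then reopen the quote for the rest of the script.
out2=$(printf '%s\n' 'café' | sed 's/é/\\'\''{e}/g')
printf '%s\n' "$out2"
```

Both calls hand sed the same script, s/é/\\'{e}/g, and both should print caf\'{e}.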
I want to use sed to do this. I have 2 files:
keys.txt:
host1
host2
test.txt:
host1 abc
host2 cdf
host3 abaasdf
I want to use sed to remove any lines in test.txt that contain a keyword from keys.txt. So the result of test.txt should be
host3 abaasdf
Can somebody show me how to do that with sed?
Thanks,
I'd recommend using grep for this (especially fgrep since there are no regexps involved), so
fgrep -v -f keys.txt test.txt
does it fine. With sed, this quick approach works:
sed -i.ORIGKEYS.txt -e 's_^_/_' -e 's_$_/d_' keys.txt
sed -f keys.txt test.txt
(This modifies the original keys.txt in place - with backup - to a sourceable sed script.)
fgrep -v -f is the best solution. Here are a couple of alternatives:
A combination of comm and join
comm -13 <(join keys.txt test.txt) test.txt
or awk
awk 'NR==FNR {key[$1]; next} $1 in key {next} 1' keys.txt test.txt
This might work for you (GNU sed):
sed 's|.*|/^&\\>/d|' keys.txt | sed -f - test.txt
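The same idea can be tried end to end without touching the real files. This is only a sketch: it recreates the two sample files in a temporary directory and writes the generated script to a file named delete.sed, which is a name chosen here for illustration. GNU sed is assumed for the \> word boundary:

```shell
tmp=$(mktemp -d)
printf 'host1\nhost2\n' > "$tmp/keys.txt"
printf 'host1 abc\nhost2 cdf\nhost3 abaasdf\n' > "$tmp/test.txt"

# Turn every key into a delete command such as /^host1\>/d
sed 's|.*|/^&\\>/d|' "$tmp/keys.txt" > "$tmp/delete.sed"

# Apply the generated script; keys.txt itself stays unchanged.
result=$(sed -f "$tmp/delete.sed" "$tmp/test.txt")
printf '%s\n' "$result"   # host3 abaasdf

rm -r "$tmp"
```

Keys that contain regex metacharacters would need escaping first; for fixed strings the grep -F style approach avoids that problem.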