Remove everything after the first / including the first / for each line - perl

I have a file with lots of host names. Some have a url part after the host that I'd like to remove. In other words:
google.com
facebook.com
acme.com/news/frontpage
bbc.co.uk
abc.com/home/index
Should become
google.com
facebook.com
acme.com
bbc.co.uk
abc.com

One way:
sed 's|/.*||' file
Results:
google.com
facebook.com
acme.com
bbc.co.uk
abc.com
You may want to read more about using the slash as a delimiter. HTH.

Try doing this :
cut -d '/' -f1 file.txt
or
awk -F/ '{print $1}' file.txt
or
perl -F/ -lane 'print $F[0]' file.txt

awk -F/ '{print $1}' your_file
None of the other solutions can change the file in place. steve's sed solution needs the -i flag for that, and even then it will not work on Solaris.
The Perl solution below works on all platforms and edits the file in place:
perl -pi -e 's/\/.*//g' your_file
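Where sed has no -i flag (the Solaris case mentioned above), a temporary file gives the same effect. This is only a minimal sketch: the function name strip_paths_in_place is made up for illustration, and it assumes mktemp is available on the system.

```shell
# Hypothetical helper: edit a file "in place" without relying on `sed -i`.
strip_paths_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    # Delete everything from the first "/" to the end of each line.
    if sed 's|/.*||' "$file" > "$tmp"; then
        # Copy back with cat so the original file keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would be `strip_paths_in_place your_file`; the original is overwritten only after sed has succeeded.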

Related

Remove all the characters from string after last '/'

I have the following input file and I need to remove all the characters from the strings that appear after the last '/'. I'll also show my expected output below.
input:
/start/one/two/stopone.js
/start/one/two/three/stoptwo.js
/start/one/stopxyz.js
expected output:
/start/one/two/
/start/one/two/three/
/start/one/
I have tried to use sed but with no luck so far.
You could simply use good old grep:
grep -o '.*/' file.txt
This simple expression takes advantage of the fact that grep matches greedily: .* consumes as many characters as possible, including any /, up to the last / in the path.
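To see the greedy match on a single path, here is a small sketch. It assumes a grep that supports -o (GNU and BSD grep do), and it adds a pure-shell parameter expansion for comparison, which is not part of the answer above.

```shell
path='/start/one/two/stopone.js'

# Greedy: ".*" runs as far as it can, so the match ends at the last "/".
greedy=$(printf '%s\n' "$path" | grep -o '.*/')

# Pure-shell alternative: "%/*" removes the shortest suffix that starts
# with "/", then the trailing "/" is added back.
dir="${path%/*}/"

printf '%s\n%s\n' "$greedy" "$dir"
```

Both variables end up holding /start/one/two/ for this input.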
Original Answer:
You can use dirname:
while read line ; do
dirname "$line"
done < file.txt
or sed:
sed 's~\(.*/\).*~\1~' file.txt
perl -lne 'print $1 if(/(.*\/)/)' your_file
Try this GNU sed command,
$ sed -r 's~^(.*\/).*$~\1~g' file
/start/one/two/
/start/one/two/three/
/start/one/
Through awk,
awk -F/ '{sub(/.*/,"",$NF); print}' OFS="/" file

search string and excise beginning portion - Solaris

Another Solaris question.
This is my file.
/abc/123/gfh/hello/what/is/up <THIS WOULD BE WHERE A NEW LINE STARTS>
bhn/fda/fds/hello/the/sky/is/blue <THIS WOULD BE WHERE A NEW LINE STARTS>
...etc
I need to delete everything before "hello" include the forward slash "/" infront of it for everyline in the file...
I'm stuck -> I had used a sed -E command but Solaris doesn't recognize the "-E". sigh
I think you can grep this:
grep -o 'hello.*'
This should delete everything up to the slash before "hello":
sed -e 's|^.*/hello|hello|' <inputfile >outputfile
That should do it:
sed -e 's/.*hello\(.*\)/hello\1/'
@user4815162342: No need for "^" in the sed solution, ".*" would suffice.
Since there was an awk tag too, here's the equivalent awk solution:
awk '{sub(/.*\/hello/,"hello")}1'
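A quick way to check the substitution against the two sample lines from the question. This is a sketch only; the function name strip_before_hello is hypothetical. Because .* is greedy, a line containing /hello more than once is cut at the last occurrence.

```shell
# Hypothetical helper: drop everything up to and including the "/" before "hello".
# Plain basic regex only, so no -E or -r flag is needed.
strip_before_hello() {
    sed 's|.*/hello|hello|'
}

printf '%s\n' '/abc/123/gfh/hello/what/is/up' 'bhn/fda/fds/hello/the/sky/is/blue' |
    strip_before_hello
```

Lines that do not contain /hello pass through unchanged.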

how to use result from a pipe in the next sed command?

I want to use sed to do this. I have 2 files:
keys.txt:
host1
host2
test.txt
host1 abc
host2 cdf
host3 abaasdf
I want to use sed to remove any lines in test.txt that contains the keyword in keys.txt. So the result of test.txt should be
host3 abaasdf
Can somebody show me how to do that with sed?
Thanks,
I'd recommend using grep for this (especially fgrep since there are no regexps involved), so
fgrep -v -f keys.txt test.txt
does it fine. With sed quickly this works:
sed -i.ORIGKEYS.txt -e 's_^_/_' -e 's_$_/d_' keys.txt
sed -f keys.txt test.txt
(This modifies the original keys.txt in place - with backup - to a sourceable sed script.)
fgrep -v -f is the best solution. Here are a couple of alternatives:
A combination of comm and join
comm -13 <(join keys.txt test.txt) test.txt
or awk
awk 'NR==FNR {key[$1]; next} $1 in key {next} 1' keys.txt test.txt
This might work for you (GNU sed):
sed 's|.*|/^&\\>/d|' keys.txt | sed -f - test.txt
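If you would rather not modify keys.txt or depend on GNU sed reading a script from stdin, the generated sed script can go into a separate file. A minimal sketch, assuming mktemp -d is available and that the keys contain no "/" or regex metacharacters; the scratch directory and the file name delete.sed are made up for the example.

```shell
# Work in a scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)
printf 'host1\nhost2\n' > "$workdir/keys.txt"
printf 'host1 abc\nhost2 cdf\nhost3 abaasdf\n' > "$workdir/test.txt"

# Turn each key into a sed delete command such as "/host1/d".
sed 's|.*|/&/d|' "$workdir/keys.txt" > "$workdir/delete.sed"
filtered=$(sed -f "$workdir/delete.sed" "$workdir/test.txt")

# Cross-check with fixed-string grep (grep -F is the current spelling of fgrep).
expected=$(grep -F -v -f "$workdir/keys.txt" "$workdir/test.txt")

printf '%s\n' "$filtered"
rm -rf "$workdir"
```

Both approaches leave only the host3 line for this sample data.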

sed/awk : match a pattern and return everything between the end of the pattern and a semicolon

I have a line:
<random junk>TYPE=snp;<more random junk>
and I need to return everything between the end of TYPE= and the ; (in this case snp, but it could be any of a number of text strings).
I tried various sed / awk solutions but I can't seem to get it working. I have the feeling this is a simple problem so, sorry about that.
This seems to work:
sed 's/.*TYPE=\(.*\);.*/\1/'
EDIT:
Ah, so there can be semicolons in the random junk. Try this:
sed 's/.*TYPE=\([^;]*\);.*/\1/'
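The difference between the two expressions shows up as soon as the line has more than one semicolon after TYPE=. A small sketch with a made-up input line:

```shell
# Illustrative input; the field names other than TYPE are assumptions.
line='foo=bar;TYPE=snp;XAI=0;XAM=0'

# Greedy capture: \(.*\) stretches to the LAST ";" on the line.
greedy=$(printf '%s\n' "$line" | sed 's/.*TYPE=\(.*\);.*/\1/')

# Bounded capture: [^;]* cannot cross a ";", so it stops at the first one.
bounded=$(printf '%s\n' "$line" | sed 's/.*TYPE=\([^;]*\);.*/\1/')

printf 'greedy:  %s\nbounded: %s\n' "$greedy" "$bounded"
```

Here the greedy version returns snp;XAI=0 while the bounded version returns snp.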
requires GNU grep:
grep -Po '(?<=TYPE=)[^;]+'
meaning: preceded by "TYPE=", find some non-semicolon characters
One way using GNU sed:
sed -r 's/.*TYPE=([^;]+).*/\1/' file.txt
Since you also tagged this awk:
$ text='<random junk>TYPE=snp;<more random junk>'
$ echo "$text" | awk -FTYPE= '{sub(/;.*/,"",$2); print $2}'
snp
$ text='foo=bar;baz=fnu;TYPE=snp;XAI=0;XAM=0'
$ echo "$text" | awk -FTYPE= '{sub(/;.*/,"",$2); print $2}'
snp
(Only using the variable to keep the lines from wrapping.)
Or, to parse this as set of variable=value pairs rather than just a string of text:
$ echo "$text" | awk -vRS=";" -F= '$1=="TYPE" {print $2}'
snp
You can also do this in pure bash, if you want:
$ t="red=blue;TYPE=snp;XAI=0.0037843;XAM=0.0170293;XAS=0.013245;XRI=0;XRM=0"
$ t=${t#*TYPE=}
$ t=${t%%;*}
$ echo $t
snp
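The key=value idea generalizes to any field name. The following is a sketch rather than a definitive implementation: the function name extract_value is hypothetical, it assumes ";" only ever separates pairs, and a value that itself contains "=" would be cut at that character.

```shell
# Hypothetical helper: print the value stored under a key in "k=v;k=v;..." text.
#   $1 = key name, $2 = the semicolon-separated text
extract_value() {
    # Put one pair per line, then let awk split each pair on "=".
    printf '%s\n' "$2" | tr ';' '\n' |
        awk -F= -v key="$1" '$1 == key { print $2; exit }'
}

text='foo=bar;baz=fnu;TYPE=snp;XAI=0;XAM=0'
extract_value TYPE "$text"
```

Note that this matches the key exactly, so leading junk glued to TYPE (as in the first example line) would not match; the sed and grep answers handle that case.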

Filter text based on a multiline match criteria

I have the following sed command, which I need to execute on a single line:
cat File | sed -n '
/NetworkName/ {
N
/\n.*ims3/ p
}' | sed -n 1p | awk -F"=" '{print $2}'
I need to execute the above command on a single line. Can anyone please help?
Assume that the contents of the File is
System.DomainName=shayam
System.Addresses=Fr6
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=AS
System.DomainName=ims5.com
System.DomainName=Ram
System.Addresses=Fr9
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims7.com
System.DomainName=mani
System.Addresses=Hello
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims3.com
After executing the command, you get only Peer as the output. Can anyone please help me out?
You can use a single nawk command, and you can lose the useless cat:
nawk -F"=" '/NetworkName/{n=$2;getline;if($2~/ims3/){print n} }' file
You can use sed as well, as proposed by others, but I prefer less regex and less clutter.
The above saves the value of the network name in "n". It then gets the next line and checks the 2nd field against "ims3". If it matches, the value of "n" is printed.
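The same logic can be written without getline by remembering the network name for exactly one line. This is a sketch under assumptions, not the answer's own code: the function name network_for_domain is made up, and it uses plain awk rather than nawk.

```shell
# Hypothetical helper: print the NetworkName whose very next line is a
# DomainName containing the wanted substring.
#   $1 = domain substring (e.g. ims3), $2 = file to search
network_for_domain() {
    awk -F= -v want="$1" '
        /NetworkName/ { name = $2; next }   # remember the name, read next line
        /DomainName/ && name != "" && index($2, want) { print name; exit }
        { name = "" }                       # any other line: forget the name
    ' "$2"
}
```

Calling `network_for_domain ims3 File` on the sample data prints Peer, and only the first match is reported because of the exit.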
Put that code in a separate .sh file, and run it as your single-line command.
cat File | sed -n '/NetworkName/ { N; /\n.*ims3/ p }' | sed -n 1p | awk -F"=" '{print $2}'
Assuming that you want the network name for the domain ims3, this command line works without sed:
grep -B 1 ims3 File | head -n 1 | awk -F"=" '{print $2}'
So, you want the network name where the domain name on the following line includes 'ims3', and not the one where the following line includes 'ims7' (even though the network names in the example are the same).
sed -n '/NetworkName/{N;/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;};}' File
This avoids abuse of felines, too (not to mention reducing the number of commands executed).
Tested on MacOS X 10.6.4, but there's no reason to think it won't work elsewhere too.
However, empirical evidence shows that Solaris sed is different from MacOS sed. It can all be done in one sed command, but it needs three lines:
sed -n '/NetworkName/{N
/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;}
}' File
Tested on Solaris 10.
You just need to put -e pretty much everywhere you'd break the command at a newline or have a semicolon. You don't need the extra call to sed or awk or cat.
sed -n -e '/NetworkName/ {' -e 'N' -e '/\n.*ims3/ s/[^\n]*=\(.*\).*/\1/P' -e '}' File