Remove range of words with sed - sed

I'm trying to remove a range of words in Unix command line with sed from a file and I just can't figure it out. For example, how can I remove the words at positions 2-4?
If the file contains: "This is a file created by me." I want it to be: "This created by me."
Thanks a lot!

Try this with GNU sed (to print word 1 and word 5 to last word):
echo "This is a file created by me." | sed 'y/ /\n/' | sed -n '1p;5,$p' | sed 'N;N;N;y/\n/ /'
Output:
This created by me.

You can use also use awk for this:
echo "This is a file created by me." | awk '{for (i=1;i<=NF;i++) if (i<2||i>4) printf "%s ",$i;print ""}'
This created by me.

This might work for you (GNU sed):
sed -r 's/(\s+\S+){3}//' file

Related

How to replace consecutive symbols using only one sed command?

I have a simple .csv file with lines that holds 't' values. Here is the example:
2ABC;t;t;t;tortuga;fault;t;t;bored
I want to replace them to '1' using sed.
If I make sed "s/;t;/;1;/g" I get the next result:
2ABC;1;t;1;tortuga;fault;1;t;bored
As you can see, consecutive ';t;' have been replaced through one. Yes, I can replace all ';t;' by sed -e "s/;t;/;1;/g" -e "s/;t;/;1;/g" but this is boring.
How can I make the replacement by one sed command?
If there is something to replace, branch to replace again.
sed ': again; /;t;/{ s//;1;/; b again }'
Overall, parsing cvs with sed is crude. Consider awk.
awk -F';' -v OFS=';' '{ for(i=1;i<=NF;++i) if ($i=="t") $i=1 } 1'
Lookarounds is helpful in such cases:
$ s='t;2ABC;t;t;t;tortuga;fault;t;t;bored;t'
$ echo "$s" | perl -lpe 's/(?<![^;])t(?![^;])/1/g'
1;2ABC;1;1;1;tortuga;fault;1;1;bored;1
echo '2ABC;t;t;t;tortuga;fault;t;t;bored' |
— gawk-specific solution
gawk -be '(ORS = RT)^!(NF = NF)' FS='^t$' OFS=1 RS=';'
— cross-awk-solution
{m,g,n}awk 'gsub(FS, OFS, $!(NF = NF))^_' FS=';t;' OFS=';1;' RS=
2ABC;1;1;1;tortuga;fault;1;1;bored

Matching pattern on multiple lines

I have a file as below
NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.70.xxx.xx)
NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.71.xxx.xx)
I am trying to extract the values NAME and IP using sed:
cat file1 |
sed ':a
N
$!ba
s/\n/ /g' | sed -n 's/.*\(NAME(BOLI...)\).*\(IP(.*)\).*/\1 \2/p'
However, I'm only getting the output:
NAME(BOLIVIA) IP(192.71.xxx.xx)
What I would like is:
NAME(BOLIVIA) IP(192.70.xxx.xx)
NAME(BOLIVIA) IP(192.71.xxx.xx)
Would appreciate it if someone could give me a pointer on what I'm missing.
TIA
Your first sed commands reformats the file into one long line. You could have used tr -d "\n" for this, but that is not the problem.
The problem is in the second part, where the .* greedy eats as much as possible until finding the last match.
Your solution could be "fixed" with the ugly
# Do not use this:
sed -zn 's/[^\n]*\(NAME(BOLI...)\)[^\n]*\n[^\n]*\(IP([^)]*)\)[^\n]*/\1 \2/gp' file1
Possible solutions:
cat file1 | paste -d " " - - | sed -n 's/.*\(NAME(BOLI...)\).*\(IP(.*)\).*/\1 \2/p'
# or
grep -Eo "(NAME\(BOLI...\)|IP\(.*\))" file1 | paste -d " " - -
# or
printf "%s %s\n" $(grep -Eo "(NAME\(BOLI...\)|IP\(.*\))" file1)
In case you are ok with awk could you please try following. Written and tested in link
https://ideone.com/bJDzgf with shown samples only.
awk '
match($0,/^NAME\([^)]*/){
name=substr($0,RSTART+5,RLENGTH-5)
next
}
match($0,/IP\([^)]*/){
print name,substr($0,RSTART+3,RLENGTH-3)
name=""
}
' Input_file
This might work for you (GNU sed):
sed -n '/NAME/{N;/IP/s/\s.*\s/ /p}' file
If a line contains NAME and the following line contains IP remove everything between and print the result.
An alternative shorter awk:
awk '$1 ~ /^NAME/ {nm = $1} $2 ~ /^IP/ {print nm, $2}' file
NAME(BOLIVIA) IP(192.70.xxx.xx)
NAME(BOLIVIA) IP(192.71.xxx.xx)
The issue in your script is the use .* which matches in a greedy way
so that you have only the first NAME(BOLI...) and last IP(.*)
If you can use python :
#!/bin/bash
python -c '
import re, sys
for ar in re.findall(r"(NAME\(BOLI.*?\)).*?(IP\(.*?\))", sys.stdin.read(), re.DOTALL):
print(*ar)
' < input-file

Better way to fix mocha lcov output using sed

Due to the know prob of mocha-lcov-mocha breaking file paths, I need to fix the current output paths that looks like this:
SF:Vis/test-Guid.coffee
SF:Vis/Guid.coffee
SF:Vis/test-Vis-Edge.coffee
SF:Vis/Vis-Edge.coffee
into
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee
I'm not very good with sed, but I got it to work using:
mocha -R mocha-lcov-reporter _coverage/test --recursive | sed 's,SF:,SF:src/,' | sed s',SF.*test.*,SF:test//&,' | sed s',/SF:,,' | sed s',test/src,test,' | ./node_modules/coveralls/bin/coveralls.js
which is basically doing 4 sed commands in sequence
sed 's,SF:,SF:src/,'
sed s',SF.*test.*,SF:test//&,'
sed s',/SF:,,'
sed s',test/src,test,'
my question is if there is a way to do with this one sed command, or use another osx/linux command line tool
Initially put "src/" after every ":" and then if "test" is found on the line replace "src" with "test":
$ sed 's,:,:src/,;/test/s,src,test,' file
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee
You could put all the sed commands in a file, one line per command, and just use "sed -e script". But if you just want it on a single command-line, separate with semicolons. This works for me:
sed 's,SF:,SF:src/,;s,SF.*test.*,SF:test//&,;s,SF:,,;s,test/src/,test,'
sed command
sed '\#test#!{s#SF:Vis/#SF:src/Vis/#g};\#SF:Vis/test#{s#SF:Vis/test#SF:test/Vis/test#g};' my_file
Here is an awk version:
awk -F: '/SF/ {$0=$1FS (/test/?"test/":"src/")$2}1' file
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee
How it works:
awk -F: ' # Set field separator to ":"
/SF/{ # Does line start with "SF"?
$0=$1FS (/test/?"test/":"src/")$2 # Recreat String by adding "test" if line contains "test", else "src"
}
1 # Print all lines
' file # read the file

Delete first and last line or record from file using sed

I want to delete first and last line from the file
file1 code :
H|ACCT|XEC|1|TEMP|20130215035845|
849002|48|1208004|1
849007|28|1208004|1
T|2
After delete the output should be
849002|48|1208004|1
849007|28|1208004|1
I have tried below method but has to run it 2 times, I want one liner solution to remove both in one go!
sed '1,1d' file1.txt >> file1.out
sed '$d' file1.out >> file2
Please suggest one liner code....
You could use ;
sed '1d; $d' file
Use Command Separator
In sed, you can separate commands using a semicolon. For example:
sed '1d; $d' /path/to/file
How about:
sed '$d' < file1.txt | sed "1d"
Try sed -i '1d;$d' /path/to/file
awk 'NR>2{print v}{v=$0}'
Starting with line 3, print the previous line each time. This means the first and last lines will not be printed.

sed/awk : match a pattern and return everything between the end of the pattern and a semicolon

I have a line:
<random junk>TYPE=snp;<more random junk>
and I need to return everything between the end of TYPE= and the ; (in this case snp but it could be any of a number of text strings.
I tried various sed / awk solutions but I can't seem to get it working. I have the feeling this is a simple problem so, sorry about that.
This seems to work:
sed 's/.*TYPE=\(.*\);.*/\1/'
EDIT:
Ah, so there can be semicolons in the random junk. Try this:
sed 's/.*TYPE=\([^;]*\);.*/\1/'
requires GNU grep:
grep -Po '(?<=TYPE=)[^;]+'
meaning: preceded by "TYPE=", find some non-semicolon characters
One way using GNU sed:
sed -r 's/.*TYPE=([^;]+).*/\1/' file.txt
Since you also tagged this awk:
$ text='<random junk>TYPE=snp;<more random junk>'
$ echo "$text" | awk -FTYPE= '{sub(/;.*/,"",$2); print $2}'
snp
$ text='foo=bar;baz=fnu;TYPE=snp;XAI=0;XAM=0'
$ echo "$text" | awk -FTYPE= '{sub(/;.*/,"",$2); print $2}'
snp
(Only using the variable to keep the lines from wrapping.)
Or, to parse this as set of variable=value pairs rather than just a string of text:
$ echo "$text" | awk -vRS=";" -F= '$1=="TYPE" {print $2}'
snp
You can also do this in pure bash, if you want:
$ t="red=blue;TYPE=snp;XAI=0.0037843;XAM=0.0170293;XAS=0.013245;XRI=0;XRM=0"
$ t=${t#*TYPE=}
$ t=${t%%;*}
$ echo $t
snp