Remove range of words with sed

I'm trying to remove a range of words from a file on the Unix command line with sed, and I just can't figure it out. For example, how can I remove the words at positions 2-4?
If the file contains "This is a file created by me.", I want the result to be "This created by me."
Thanks a lot!

Try this with GNU sed (it prints word 1 and words 5 through the last; the final N;N;N joins exactly four lines, which fits this sample):
echo "This is a file created by me." | sed 'y/ /\n/' | sed -n '1p;5,$p' | sed 'N;N;N;y/\n/ /'
Output:
This created by me.

You can also use awk for this:
echo "This is a file created by me." | awk '{for (i=1;i<=NF;i++) if (i<2||i>4) printf "%s ",$i;print ""}'
This created by me.

This might work for you (GNU sed):
sed -r 's/(\s+\S+){3}//' file
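
To generalise the answers above to an arbitrary range, a small awk wrapper is one option. This is only a sketch: the function name remove_words and its two arguments are made up for illustration, and it assumes words are separated by whitespace (runs of spaces come out as single spaces).

```shell
# remove_words FIRST LAST: drop words FIRST..LAST (1-based, inclusive)
# from every line read on standard input. Hypothetical helper name.
remove_words() {
    awk -v first="$1" -v last="$2" '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if (i >= first && i <= last) continue   # skip the unwanted range
            out = (out == "" ? $i : out " " $i)      # join without a trailing space
        }
        print out
    }'
}

echo "This is a file created by me." | remove_words 2 4
```

Unlike the printf-based awk one-liner, this does not leave a trailing space at the end of the line.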

Related

How to replace consecutive symbols using only one sed command?

I have a simple .csv file with lines that hold 't' values. Here is an example:
2ABC;t;t;t;tortuga;fault;t;t;bored
I want to replace them with '1' using sed.
If I run sed "s/;t;/;1;/g" I get the following result:
2ABC;1;t;1;tortuga;fault;1;t;bored
As you can see, only every other consecutive ';t;' was replaced. Yes, I can replace all ';t;' with sed -e "s/;t;/;1;/g" -e "s/;t;/;1;/g", but this is tedious.
How can I make the replacement with a single sed command?

If there is something to replace, branch to replace again.
sed ': again; /;t;/{ s//;1;/; b again }'
Overall, parsing CSV with sed is crude. Consider awk.
awk -F';' -v OFS=';' '{ for(i=1;i<=NF;++i) if ($i=="t") $i=1 } 1'
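
The reason a single s/;t;/;1;/g skips every other field is that matches cannot overlap: the trailing ';' of one match is already consumed, so it cannot start the next one. A loop avoids that. Below is a sketch of the same idea written with the standard t command (branch only if a substitution happened); the function name replace_t is just for illustration.

```shell
# Repeat the substitution until no ";t;" is left. "t again" jumps back to the
# label only if the preceding s command changed something on this line.
replace_t() {
    sed -e ':again' -e 's/;t;/;1;/' -e 't again'
}

echo '2ABC;t;t;t;tortuga;fault;t;t;bored' | replace_t
```

Like the pattern in the question, this only touches a 't' that has a semicolon on both sides, so a 't' in the first or last field is left alone.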

Lookarounds are helpful in such cases:
$ s='t;2ABC;t;t;t;tortuga;fault;t;t;bored;t'
$ echo "$s" | perl -lpe 's/(?<![^;])t(?![^;])/1/g'
1;2ABC;1;1;1;tortuga;fault;1;1;bored;1

gawk-specific solution:
echo '2ABC;t;t;t;tortuga;fault;t;t;bored' | gawk -be '(ORS = RT)^!(NF = NF)' FS='^t$' OFS=1 RS=';'
Cross-awk solution (mawk, gawk or nawk; use whichever is installed in place of awk):
echo '2ABC;t;t;t;tortuga;fault;t;t;bored' | awk 'gsub(FS, OFS, $!(NF = NF))^_' FS=';t;' OFS=';1;' RS=
Output:
2ABC;1;1;1;tortuga;fault;1;1;bored

Matching pattern on multiple lines

I have a file as below
NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.70.xxx.xx)
NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.71.xxx.xx)
I am trying to extract the values NAME and IP using sed:
cat file1 |
sed ':a
N
$!ba
s/\n/ /g' | sed -n 's/.*\(NAME(BOLI...)\).*\(IP(.*)\).*/\1 \2/p'
However, I'm only getting the output:
NAME(BOLIVIA) IP(192.71.xxx.xx)
What I would like is:
NAME(BOLIVIA) IP(192.70.xxx.xx)
NAME(BOLIVIA) IP(192.71.xxx.xx)
Would appreciate it if someone could give me a pointer on what I'm missing.
TIA

Your first sed command reformats the file into one long line. You could have used tr "\n" " " for this, but that is not the problem.
The problem is in the second part, where the greedy .* consumes as much as possible, so only the last match is kept.
Your solution could be "fixed" with this ugly command:
# Do not use this:
sed -zn 's/[^\n]*\(NAME(BOLI...)\)[^\n]*\n[^\n]*\(IP([^)]*)\)[^\n]*/\1 \2/gp' file1
Possible solutions:
cat file1 | paste -d " " - - | sed -n 's/.*\(NAME(BOLI...)\).*\(IP(.*)\).*/\1 \2/p'
# or
grep -Eo "(NAME\(BOLI...\)|IP\(.*\))" file1 | paste -d " " - -
# or
printf "%s %s\n" $(grep -Eo "(NAME\(BOLI...\)|IP\(.*\))" file1)
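
The difference between joining the whole file and joining it two lines at a time can be checked with the sample data from the question. This is a sketch; the variable names are only for illustration.

```shell
sample='NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.70.xxx.xx)
NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.71.xxx.xx)'

# Whole file on one line: the leading greedy .* swallows the first record,
# so only the last NAME(...)/IP(...) pair survives.
greedy=$(printf '%s\n' "$sample" | paste -s -d ' ' - |
    sed -n 's/.*\(NAME(BOLI...)\).*\(IP(.*)\).*/\1 \2/p')

# Two lines per record: the same expression now runs once per NAME/IP pair.
per_record=$(printf '%s\n' "$sample" | paste -d ' ' - - |
    sed -n 's/.*\(NAME(BOLI...)\).*\(IP(.*)\).*/\1 \2/p')

printf 'joined:\n%s\nper record:\n%s\n' "$greedy" "$per_record"
```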

If you are OK with awk, you could try the following. It was written and tested at
https://ideone.com/bJDzgf with the shown samples only.
awk '
match($0,/^NAME\([^)]*/){
  name=substr($0,RSTART+5,RLENGTH-5)
  next
}
match($0,/IP\([^)]*/){
  print name,substr($0,RSTART+3,RLENGTH-3)
  name=""
}
' Input_file

This might work for you (GNU sed):
sed -n '/NAME/{N;/IP/s/\s.*\s/ /p}' file
If a line contains NAME and the following line contains IP, remove everything between them and print the result.
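
To see what N does here, the same command can be run against the sample from the question. This sketch swaps the GNU-only \s for the POSIX class [[:space:]], which is my substitution and not part of the answer above; the variable names are illustrative.

```shell
sample='NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.70.xxx.xx)
NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.71.xxx.xx)'

# N appends the next line to the pattern space, so each NAME line is handled
# together with the line after it. The substitution then removes everything
# from the first whitespace character to the last one (the embedded newline
# included), leaving only NAME(...) and IP(...).
pairs=$(printf '%s\n' "$sample" |
    sed -n '/NAME/{N;/IP/s/[[:space:]].*[[:space:]]/ /p;}')

printf '%s\n' "$pairs"
```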

An alternative shorter awk:
awk '$1 ~ /^NAME/ {nm = $1} $2 ~ /^IP/ {print nm, $2}' file
NAME(BOLIVIA) IP(192.70.xxx.xx)
NAME(BOLIVIA) IP(192.71.xxx.xx)

The issue in your script is the use of .*, which matches greedily,
so only one NAME(BOLI...) and IP(.*) pair is matched.
If you can use Python:
#!/bin/bash
python -c '
import re, sys
for ar in re.findall(r"(NAME\(BOLI.*?\)).*?(IP\(.*?\))", sys.stdin.read(), re.DOTALL):
    print(*ar)
' < input-file

Better way to fix mocha lcov output using sed

Due to the known problem of mocha-lcov-reporter breaking file paths, I need to fix the current output paths, which look like this:
SF:Vis/test-Guid.coffee
SF:Vis/Guid.coffee
SF:Vis/test-Vis-Edge.coffee
SF:Vis/Vis-Edge.coffee
into
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee
I'm not very good with sed, but I got it to work using:
mocha -R mocha-lcov-reporter _coverage/test --recursive | sed 's,SF:,SF:src/,' | sed s',SF.*test.*,SF:test//&,' | sed s',/SF:,,' | sed s',test/src,test,' | ./node_modules/coveralls/bin/coveralls.js
which basically runs four sed commands in sequence:
sed 's,SF:,SF:src/,'
sed s',SF.*test.*,SF:test//&,'
sed s',/SF:,,'
sed s',test/src,test,'
My question is whether there is a way to do this with one sed command, or with another OS X/Linux command-line tool.

Initially put "src/" after every ":" and then if "test" is found on the line replace "src" with "test":
$ sed 's,:,:src/,;/test/s,src,test,' file
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee

You could put all the sed commands in a file, one line per command, and just use "sed -f script". But if you just want it on a single command line, separate the commands with semicolons. This works for me:
sed 's,SF:,SF:src/,;s,SF.*test.*,SF:test//&,;s,/SF:,,;s,test/src,test,'
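
For the script-file variant, sed reads its commands from a file given with -f. A rough sketch using the four substitutions from the question; the temporary file name and the variable names are placeholders I picked for illustration.

```shell
# Write the four substitutions to a temporary script file, one per line.
script=$(mktemp "${TMPDIR:-/tmp}/fix-lcov.XXXXXX")
cat > "$script" <<'EOF'
s,SF:,SF:src/,
s,SF.*test.*,SF:test//&,
s,/SF:,,
s,test/src,test,
EOF

# Run two of the sample paths from the question through the script.
fixed=$(printf '%s\n' 'SF:Vis/test-Guid.coffee' 'SF:Vis/Guid.coffee' |
    sed -f "$script")
printf '%s\n' "$fixed"

rm -f "$script"
```

In the real pipeline, the sed -f call would sit between mocha and coveralls.js in place of the four chained sed processes.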

sed command
sed '\#test#!{s#SF:Vis/#SF:src/Vis/#g};\#SF:Vis/test#{s#SF:Vis/test#SF:test/Vis/test#g};' my_file

Here is an awk version:
awk -F: '/SF/ {$0=$1FS (/test/?"test/":"src/")$2}1' file
SF:test/Vis/test-Guid.coffee
SF:src/Vis/Guid.coffee
SF:test/Vis/test-Vis-Edge.coffee
SF:src/Vis/Vis-Edge.coffee
How it works:
awk -F: ' # Set the field separator to ":"
/SF/ { # Does the line contain "SF"?
  $0=$1FS (/test/?"test/":"src/")$2 # Rebuild the line, inserting "test/" if it contains "test", else "src/"
}
1 # Print all lines
' file # Read the file

Delete first and last line or record from file using sed

I want to delete the first and last lines from a file.
Contents of file1:
H|ACCT|XEC|1|TEMP|20130215035845|
849002|48|1208004|1
849007|28|1208004|1
T|2
After the deletion, the output should be:
849002|48|1208004|1
849007|28|1208004|1
I have tried the method below, but it has to run twice. I want a one-liner that removes both in one go!
sed '1,1d' file1.txt >> file1.out
sed '$d' file1.out >> file2
Please suggest a one-liner.

You could use a semicolon:
sed '1d; $d' file

Use Command Separator
In sed, you can separate commands using a semicolon. For example:
sed '1d; $d' /path/to/file

How about:
sed '$d' < file1.txt | sed "1d"

Try sed -i '1d;$d' /path/to/file

awk 'NR>2{print v}{v=$0}'
Starting with line 3, print the previous line each time. This means the first and last lines will not be printed.
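
A quick way to convince yourself that the sed and awk versions agree is to run both on the sample from the question. A sketch, with illustrative variable names:

```shell
sample='H|ACCT|XEC|1|TEMP|20130215035845|
849002|48|1208004|1
849007|28|1208004|1
T|2'

# sed: delete line 1 and the last line ($) in a single pass.
with_sed=$(printf '%s\n' "$sample" | sed '1d;$d')

# awk: remember each line in v and print it one line late, starting at
# line 3, so the header is skipped and the trailer is never printed.
with_awk=$(printf '%s\n' "$sample" | awk 'NR>2{print v}{v=$0}')

printf '%s\n' "$with_sed"
```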

sed/awk : match a pattern and return everything between the end of the pattern and a semicolon

I have a line:
<random junk>TYPE=snp;<more random junk>
and I need to return everything between the end of TYPE= and the ; (in this case snp, but it could be any of a number of text strings).
I tried various sed / awk solutions but I can't seem to get it working. I have the feeling this is a simple problem so, sorry about that.

This seems to work:
sed 's/.*TYPE=\(.*\);.*/\1/'
EDIT:
Ah, so there can be semicolons in the random junk. Try this:
sed 's/.*TYPE=\([^;]*\);.*/\1/'
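
The two expressions only differ when there are more semicolons after the TYPE= value. A small sketch showing the difference on a made-up line (the variable names are illustrative):

```shell
text='foo=bar;baz=fnu;TYPE=snp;XAI=0;XAM=0'

# \(.*\); is greedy: the capture runs up to the LAST semicolon on the line.
greedy=$(printf '%s\n' "$text" | sed 's/.*TYPE=\(.*\);.*/\1/')

# \([^;]*\) cannot cross a semicolon, so it stops at the first one after TYPE=.
bounded=$(printf '%s\n' "$text" | sed 's/.*TYPE=\([^;]*\);.*/\1/')

printf 'greedy:  %s\nbounded: %s\n' "$greedy" "$bounded"
```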

requires GNU grep:
grep -Po '(?<=TYPE=)[^;]+'
meaning: preceded by "TYPE=", find some non-semicolon characters

One way using GNU sed:
sed -r 's/.*TYPE=([^;]+).*/\1/' file.txt

Since you also tagged this awk:
$ text='<random junk>TYPE=snp;<more random junk>'
$ echo "$text" | awk -FTYPE= '{sub(/;.*/,"",$2); print $2}'
snp
$ text='foo=bar;baz=fnu;TYPE=snp;XAI=0;XAM=0'
$ echo "$text" | awk -FTYPE= '{sub(/;.*/,"",$2); print $2}'
snp
(Only using the variable to keep the lines from wrapping.)
Or, to parse this as a set of variable=value pairs rather than just a string of text:
$ echo "$text" | awk -vRS=";" -F= '$1=="TYPE" {print $2}'
snp

You can also do this in pure bash, if you want:
$ t="red=blue;TYPE=snp;XAI=0.0037843;XAM=0.0170293;XAS=0.013245;XRI=0;XRM=0"
$ t=${t#*TYPE=}
$ t=${t%%;*}
$ echo $t
snp
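
One thing to watch with the parameter-expansion approach: if the string has no TYPE= at all, ${t#*TYPE=} leaves it unchanged and you get the first field back instead of nothing. A sketch of a guarded version; the function name get_type is hypothetical.

```shell
# get_type STRING: print the value that follows TYPE=. Prints nothing and
# returns 1 if STRING contains no TYPE= key. Hypothetical helper name.
get_type() {
    case $1 in
        *TYPE=*) ;;          # key present, carry on
        *) return 1 ;;       # key missing, report failure
    esac
    value=${1#*TYPE=}        # drop everything up to and including the first TYPE=
    value=${value%%;*}       # drop the first semicolon and everything after it
    printf '%s\n' "$value"
}

get_type "red=blue;TYPE=snp;XAI=0.0037843;XAM=0.0170293"
```

This only uses POSIX shell features, so it should behave the same in bash and sh.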
