Get a column using sed and modify it

I need to modify columns 5 to 9 directly in each line of a file.
Currently I'm doing this in a while loop, extracting each column line by line.
For example a line looks like:
echo "m.mustermann#muster.com;surnanme;givenname;displayname;1111;2222;3333;44(#44;(5555"
line_9=$(echo $line | awk -F "[;]" '{print $9}' | sed 's/[^0-9+*,]*//g')
Is it possible to do that with "sed -i" instead of awk?
Thanks for any help

I'm not sure it can be done generally in sed, but you could definitely do it in awk:
… | awk -F";" '{ gsub("[^0-9]*","",$9); print $9 }'
If you really want to do it with sed, the expression will look something like:
… | sed -e 's,\(^[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;[0-9]*\)[^0-9]*\([0-9]*\)[^0-9]*\([0-9]*\)[^0-9]*\([0-9]*\)[^0-9]*\([0-9]*\)[^0-9]*\([0-9]*\)[^0-9]*\([0-9]*\)[^0-9]*\([0-9]*\)[^0-9]*\([0-9]*\)\(.*\),\1\2\3\4\5\6\7\8\9,'

A version using POSIX sed only:
line_9="$(echo $line | sed 'H;x;s/^\(.\)\(\([^;]*;\)\{8\}\)\([^;]*\)/\2\1\4\1/;h;s/\(\n\).*\1/\1/;x;s/.*\(\n\)\(.*\)\1.*/\2/;s/[^0-9+*,]*//g;G;s/\(.*\)\(\n\)\(.*\)\2/\3\1/;h;s/.*//;x' )"
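
If the goal is to clean columns 5 to 9 while keeping the rest of each line, awk can rewrite the fields in place without a shell loop. A minimal sketch, assuming ';' never appears inside a field and reusing the character class from the question; the function name clean_cols is only illustrative and not part of the original answers:

```shell
# clean_cols: hypothetical helper, not from the original answers.
# Strips every character except digits, '+', '*' and ',' from fields 5..9
# and prints the whole line with the other fields untouched.
clean_cols() {
  awk 'BEGIN { FS = OFS = ";" }
       {
         for (i = 5; i <= 9 && i <= NF; i++)
           gsub(/[^0-9+*,]/, "", $i)
         print
       }' "$@"
}

# Reads stdin, or the files given as arguments.
echo "m.mustermann#muster.com;surnanme;givenname;displayname;1111;2222;3333;44(#44;(5555" | clean_cols
```

POSIX awk has no in-place option, so to update a file you would write the output to a temporary file and move it over the original.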

Related

How to replace consecutive symbols using only one sed command?

I have a simple .csv file with lines that hold 't' values. Here is an example:
2ABC;t;t;t;tortuga;fault;t;t;bored
I want to replace them with '1' using sed.
If I run sed "s/;t;/;1;/g" I get the following result:
2ABC;1;t;1;tortuga;fault;1;t;bored
As you can see, only every other consecutive ';t;' has been replaced. Yes, I can replace all ';t;' with sed -e "s/;t;/;1;/g" -e "s/;t;/;1;/g" but this is tedious.
How can I make the replacement with a single sed command?

If there is something to replace, branch to replace again.
sed ': again; /;t;/{ s//;1;/; b again }'
Overall, parsing CSV with sed is crude. Consider awk.
awk -F';' -v OFS=';' '{ for(i=1;i<=NF;++i) if ($i=="t") $i=1 } 1'
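
The single s///g pass skips every other field because matches cannot overlap: once ';t;' has been consumed, the next 't;' no longer has a leading ';' to match. A loop built on sed's t command (branch if a substitution succeeded) repeats until nothing is left to replace. A minimal sketch, with the script split over -e options so the label is parsed the same way by GNU and BSD sed; the function name is illustrative:

```shell
# replace_t_fields: hypothetical helper name.
# Replace one ';t;' at a time and loop while a substitution was made.
replace_t_fields() {
  sed -e ':again' -e 's/;t;/;1;/' -e 't again' "$@"
}

echo '2ABC;t;t;t;tortuga;fault;t;t;bored' | replace_t_fields
```

Like the original command, this leaves a 't' in the first or last field untouched, because those fields are not enclosed by two semicolons.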

Lookarounds are helpful in such cases:
$ s='t;2ABC;t;t;t;tortuga;fault;t;t;bored;t'
$ echo "$s" | perl -lpe 's/(?<![^;])t(?![^;])/1/g'
1;2ABC;1;1;1;tortuga;fault;1;1;bored;1

echo '2ABC;t;t;t;tortuga;fault;t;t;bored' |
# gawk-specific solution
gawk -be '(ORS = RT)^!(NF = NF)' FS='^t$' OFS=1 RS=';'
# cross-awk solution
{m,g,n}awk 'gsub(FS, OFS, $!(NF = NF))^_' FS=';t;' OFS=';1;' RS=
2ABC;1;1;1;tortuga;fault;1;1;bored

Matching pattern on multiple lines

I have a file as below
NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.70.xxx.xx)
NAME(BOLIVIA) TYPE(SA)
APPLIC(Java) IP(192.71.xxx.xx)
I am trying to extract the values NAME and IP using sed:
cat file1 |
sed ':a
N
$!ba
s/\n/ /g' | sed -n 's/.*\(NAME(BOLI...)\).*\(IP(.*)\).*/\1 \2/p'
However, I'm only getting the output:
NAME(BOLIVIA) IP(192.71.xxx.xx)
What I would like is:
NAME(BOLIVIA) IP(192.70.xxx.xx)
NAME(BOLIVIA) IP(192.71.xxx.xx)
Would appreciate it if someone could give me a pointer on what I'm missing.
TIA

Your first sed command reformats the file into one long line. You could have used tr "\n" " " for this, but that is not the problem.
The problem is in the second part, where the greedy .* consumes as much as possible, so only the last match on the joined line is printed.
Your solution could be "fixed" with the ugly
# Do not use this:
sed -zn 's/[^\n]*\(NAME(BOLI...)\)[^\n]*\n[^\n]*\(IP([^)]*)\)[^\n]*/\1 \2/gp' file1
Possible solutions:
cat file1 | paste -d " " - - | sed -n 's/.*\(NAME(BOLI...)\).*\(IP(.*)\).*/\1 \2/p'
# or
grep -Eo "(NAME\(BOLI...\)|IP\(.*\))" file1 | paste -d " " - -
# or
printf "%s %s\n" $(grep -Eo "(NAME\(BOLI...\)|IP\(.*\))" file1)

If you are OK with awk, you could try the following. It was written and tested at
https://ideone.com/bJDzgf with the shown samples only.
awk '
match($0,/^NAME\([^)]*/){
  name=substr($0,RSTART+5,RLENGTH-5)
  next
}
match($0,/IP\([^)]*/){
  print name,substr($0,RSTART+3,RLENGTH-3)
  name=""
}
' Input_file

This might work for you (GNU sed):
sed -n '/NAME/{N;/IP/s/\s.*\s/ /p}' file
If a line contains NAME and the following line contains IP remove everything between and print the result.
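
The same pairing idea can be written without GNU extensions such as \s. A sketch assuming, as in the sample, that every NAME line is immediately followed by the line carrying its IP; the function name is illustrative:

```shell
# name_ip_pairs: hypothetical helper name.
# On each line starting with NAME(, append the next line (N), then keep
# only the NAME(...) and IP(...) groups. [^)]* stops at the first ')',
# so neither group can run past its own closing parenthesis.
name_ip_pairs() {
  sed -n '/^NAME(/{
    N
    s/^\(NAME([^)]*)\).*\(IP([^)]*)\).*/\1 \2/p
  }' "$@"
}

printf '%s\n' 'NAME(BOLIVIA) TYPE(SA)' 'APPLIC(Java) IP(192.70.xxx.xx)' \
              'NAME(BOLIVIA) TYPE(SA)' 'APPLIC(Java) IP(192.71.xxx.xx)' | name_ip_pairs
```

Because each pair is handled in its own two-line pattern space, the greedy .* never sees more than one NAME and one IP at a time.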

An alternative shorter awk:
awk '$1 ~ /^NAME/ {nm = $1} $2 ~ /^IP/ {print nm, $2}' file
NAME(BOLIVIA) IP(192.70.xxx.xx)
NAME(BOLIVIA) IP(192.71.xxx.xx)

The issue in your script is the use of .*, which matches greedily,
so that only one NAME(BOLI...) and the last IP(.*) are captured.
If you can use Python:
#!/bin/bash
python -c '
import re, sys
for ar in re.findall(r"(NAME\(BOLI.*?\)).*?(IP\(.*?\))", sys.stdin.read(), re.DOTALL):
    print(*ar)
' < input-file

tcsh & sed: no output

I'm trying to replace the 3rd column of a file with itself plus the value of column 2 (without any space). I get the proper values for variables c and a, but then sed doesn't give any output. Any clue?
#!/bin/tcsh
setenv c `cat lig_mod.pdb | awk '{print $3}'`
echo $c
setenv a `cat lig_mod.pdb | awk '{print $3=$3$2}'`
echo $a
sed -i "" 's/^'"${c}"'$/^'"${a}"'$/g' lig_mod.pdb

Even though awk is usually better for parsing columns, this sed one-liner (GNU sed, for \w and for -i without a suffix argument) can work for you as well:
sed -i 's/ \(\w*\) \(\w*\) / \1 \2\1 /1' lig_mod.pdb
The '/1' at the end denotes which match on the line to change; for the 2nd and 3rd columns that is the first one, but the same pattern can be used for any pair of adjacent columns.
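
For comparison, the awk form of the same edit is a single field assignment. A sketch assuming whitespace-separated columns; note that assigning to a field makes awk rebuild the line with single spaces, which may matter for fixed-width formats such as PDB. The function name is illustrative:

```shell
# append_col2_to_col3: hypothetical helper name.
# Sets column 3 to its own value followed by the value of column 2.
append_col2_to_col3() {
  awk '{ $3 = $3 $2; print }' "$@"
}

printf 'ATOM 1 C 10.0\n' | append_col2_to_col3
```

To change the file itself, redirect the output to a temporary file and then move it over lig_mod.pdb.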

sed/awk : match a pattern and return everything between the end of the pattern and a semicolon

I have a line:
<random junk>TYPE=snp;<more random junk>
and I need to return everything between the end of TYPE= and the ; (in this case snp, but it could be any of a number of text strings).
I tried various sed / awk solutions but I can't seem to get it working. I have the feeling this is a simple problem, so sorry about that.

This seems to work:
sed 's/.*TYPE=\(.*\);.*/\1/'
EDIT:
Ah, so there can be semicolons in the random junk. Try this:
sed 's/.*TYPE=\([^;]*\);.*/\1/'
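
One caveat with both commands: a line that contains no TYPE= field does not match and is printed unchanged. A sketch that prints only the extracted value, and also accepts a TYPE= field at the end of the line with no trailing ';'; the function name is illustrative:

```shell
# extract_type: hypothetical helper name.
# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded, so lines without TYPE= produce nothing.
extract_type() {
  sed -n 's/.*TYPE=\([^;]*\).*/\1/p' "$@"
}

echo '<random junk>TYPE=snp;<more random junk>' | extract_type
```

The leading .* is greedy, so if TYPE= occurs more than once the last occurrence wins, and a key that merely ends in TYPE (for example SUBTYPE=) would match too.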

This requires GNU grep (for the -P option):
grep -Po '(?<=TYPE=)[^;]+'
meaning: preceded by "TYPE=", find some non-semicolon characters

One way using GNU sed:
sed -r 's/.*TYPE=([^;]+).*/\1/' file.txt

Since you also tagged this awk:
$ text='<random junk>TYPE=snp;<more random junk>'
$ echo "$text" | awk -FTYPE= '{sub(/;.*/,"",$2); print $2}'
snp
$ text='foo=bar;baz=fnu;TYPE=snp;XAI=0;XAM=0'
$ echo "$text" | awk -FTYPE= '{sub(/;.*/,"",$2); print $2}'
snp
(Only using the variable to keep the lines from wrapping.)
Or, to parse this as a set of variable=value pairs rather than just a string of text:
$ echo "$text" | awk -vRS=";" -F= '$1=="TYPE" {print $2}'
snp

You can also do this in pure bash, if you want:
$ t="red=blue;TYPE=snp;XAI=0.0037843;XAM=0.0170293;XAS=0.013245;XRI=0;XRM=0"
$ t=${t#*TYPE=}
$ t=${t%%;*}
$ echo $t
snp

AWK/SED. How to remove parentheses in simple text file

I have a text file looking like this:
(-9.1744438E-02,7.6282293E-02) (-9.1744438E-02,7.6282293E-02) ... and so on.
I would like to modify the file by removing all the parentheses and starting a new line for each pair,
so that it looks like this:
-9.1744438E-02,7.6282293E-02
-9.1744438E-02,7.6282293E-02
...
A simple way to do that?
Any help is appreciated,
Fred

I would use tr for this job:
cat in_file | tr -d '()' > out_file
With the -d switch it just deletes any characters in the given set.
To add new lines you could pipe it through two trs:
cat in_file | tr -d '(' | tr ')' '\n' > out_file
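
As written, the two-tr pipeline leaves the space between pairs at the start of each following line, and the original newline becomes a blank line at the end. A sketch that also deletes spaces and newlines before splitting, assuming every pair is closed by ')'; the function name is illustrative:

```shell
# split_pairs: hypothetical helper name.
# Delete '(' plus all spaces and newlines, then turn each ')' into a newline,
# so every "(x,y)" pair ends up on its own line.
split_pairs() {
  tr -d '( \n' | tr ')' '\n'
}

printf '%s\n' '(-9.1744438E-02,7.6282293E-02) (-9.1744438E-02,7.6282293E-02)' | split_pairs
```

Since tr reads only standard input, a file is processed with split_pairs < in_file > out_file.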

As was said, almost:
sed 's/[()]//g' inputfile > outputfile
or in awk:
awk '{gsub(/[()]/,""); print;}' inputfile > outputfile

This would work -
awk -v FS="[()]" '{for (i=2;i<=NF;i+=2) print $i }' inputfile > outputfile
Test:
[jaypal:~/Temp] cat file
(-9.1744438E-02,7.6282293E-02) (-9.1744438E-02,7.6282293E-02)
[jaypal:~/Temp] awk -v FS="[()]" '{for (i=2;i<=NF;i+=2) print $i }' file
-9.1744438E-02,7.6282293E-02
-9.1744438E-02,7.6282293E-02

This might work for you (GNU sed, which accepts \n in the replacement):
echo "(-9.1744438E-02,7.6282293E-02) (-9.1744438E-02,7.6282293E-02)" |
sed 's/) (/\n/g;s/[()]//g'
-9.1744438E-02,7.6282293E-02
-9.1744438E-02,7.6282293E-02

We probably all know this, but just to emphasize:
Simpler utilities usually run faster than awk or sed doing the same job. For instance, try not to use sed/awk where grep can suffice.
In this particular case, I created a 100000-line file, with each line containing both "(" and ")" characters. Then I ran
$ /usr/bin/time -f%E -o log cat file | tr -d "()"
and again,
$ /usr/bin/time -f%E -ao log sed 's/[()]//g' file
And the results were:
05.44 sec : Using tr
05.57 sec : Using sed

cat in_file | sed 's/[()]//g' > out_file
Due to formatting issues, it is not entirely clear from your question whether you also need to insert newlines.
