sed/awk - removing text between delimiters - sed

How would I remove all text between certain delimiters?
Example:
hello;you;are;nice
Returns:
hello;you;nice
In sed, I know how to remove text before the first delimiter and after the last, but I'm not sure how to handle the fields in between.
Thanks as always to everyone.

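What about using cut? Quote the delimiter so the shell does not treat ; as a command separator, and list the fields to keep:
cut -d';' -f1,2,4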

It is quite straightforward with sed:
sed "s/\w*;//3"

awk -F\; -v OFS=";" '{print $1,$2,$4}' file
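As a quick sanity check, the three approaches can be run side by side on the sample line. This is only a sketch: the variable name `line` is made up for illustration, and the sed form uses the bracket expression `[^;]*` in place of `\w*`, on the assumption that you may not be running GNU sed.

```shell
line='hello;you;are;nice'

# cut: keep fields 1, 2 and 4 (the delimiter must be quoted)
printf '%s\n' "$line" | cut -d';' -f1,2,4

# sed: delete the third "field;" occurrence on the line
printf '%s\n' "$line" | sed 's/[^;]*;//3'

# awk: print the wanted fields, rejoined with ';'
printf '%s\n' "$line" | awk -F';' -v OFS=';' '{print $1, $2, $4}'
```

Each command should print hello;you;nice. The cut and awk forms select fields by number, which tends to be easier to adapt than counting matches in sed.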

Related

How to replace a fixed position character of a string?

Suppose I have a file containing the string AKASHMANDAL.
I want to replace the character in the 7th position (whatever it may be) with "D".
The output should look like
AKASHMDNDAL
I tried the following command, which only adds the character after the 7th position:
sed -E 's/^(.{7})/\1D/' file
This gives me AKASHMADNDAL
How can I replace the character instead of just adding?
Substitute any character in the 7th position using sed
$ sed 's/./D/7' input_file
AKASHMDNDAL
You can simply match one character outside of the capture group:
sed -E 's/^(.{6})./\1D/'
(Notice the dot outside the parentheses.)
If you can consider an awk solution, awk can handle this without a regex and gives you more control over positions:
awk '{print substr($0,1,6) "D" substr($0,8)}' file
AKASHMDNDAL
With your shown samples only, please try the following awk code, written and tested in GNU awk.
awk -v RS='^.{7}' '
RT{
sub(/.$/,"",RT)
ORS=RT"D"
print
}
END{
ORS=""
print
}
' Input_file
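The substr approach generalizes to any position. Below is a minimal sketch that wraps it in a shell function; `replace_at`, `pos` and `ch` are hypothetical names chosen for this example, and lines shorter than the requested position are assumed to pass through unchanged.

```shell
# replace_at POS CHAR: replace the character at 1-based position POS on each stdin line
replace_at() {
    awk -v pos="$1" -v ch="$2" '{
        # Leave lines alone when they are too short to have that position
        if (length($0) >= pos)
            $0 = substr($0, 1, pos - 1) ch substr($0, pos + 1)
        print
    }'
}

printf 'AKASHMANDAL\n' | replace_at 7 D
```

With the sample string this prints AKASHMDNDAL. Passing the position as an awk variable avoids editing the regex or the substr offsets by hand each time.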

Remove string between dash (-) and the first dot (.)

I have many web addresses which are including some special interface names, which I would like to remove. Examples:
aaaaaaa-INT1.aaaa.aaaa.com
bbbbbbb-INT2.bbbb.bbbb.com
ccccccc-INT.cccc.cccc.com
So my expected result after sed should be:
aaaaaaa.aaaa.aaaa.com
bbbbbbb.bbbb.bbbb.com
ccccccc.cccc.cccc.com
I have tried this, but it doesn't work:
sed 's/-.*^.//'
Any suggestion please?
To remove the first dash and everything after it up to (but not including) the first period:
$ sed 's/-[^.]*//' file
aaaaaaa.aaaa.aaaa.com
bbbbbbb.bbbb.bbbb.com
ccccccc.cccc.cccc.com
Solution 1: The following sed may also help.
sed 's/\([^-]*\)-\([^.]*\)\(.*\)/\1\3/' Input_file
Solution 2: With awk.
awk -F"." '{sub(/-.*/,"",$1)} 1' OFS="." Input_file
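It may help to see why a negated bracket expression is used here. In the attempted s/-.*^.//, the ^ sits in the middle of the pattern, where sed typically treats it as a literal caret rather than as "not", so nothing in the sample matches. The sketch below contrasts a greedy `.*` with `[^.]*`; the variable name `host` is only for illustration.

```shell
host='aaaaaaa-INT1.aaaa.aaaa.com'

# Greedy: '.*' runs on to the LAST dot, so too much is removed
printf '%s\n' "$host" | sed 's/-.*\././'
# prints: aaaaaaa.com

# Negated class: '[^.]*' stops right before the FIRST dot
printf '%s\n' "$host" | sed 's/-[^.]*//'
# prints: aaaaaaa.aaaa.aaaa.com
```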

Remove a hyphen from a specific line in a file

I have a data file that needs to have several unique identifiers stripped of hyphens.
So I have:
(Special_Section "data-values")
and I want to have it replaced with:
(Special_Section "datavalues")
I wanted to use a simple sed find/replace, but the data and values are different each time. Preferably, I'd run this in-place since the file has a lot of other information I want to keep intact.
Does sed or awk have a way to remove the hyphen from the matched portion only?
Currently I can match with: sed -i 's/Special_Section "[a-zA-Z0-9]*-[a-zA-Z0-9]*"/&/g' *myfiles*
But I would like to then run s/-// on & if that's possible.
You seem to be using GNU sed, so something like this might work:
sed -ri '
s/(Special_Section [^-]*)-([^)]*)/\1\2/g
' <your_filename_glob>
Does this work?
sed -i '/(Special_Section ".*-.*")/{s/-//}' yourFile
Close - scan for the lines and then substitute on those that match:
sed -i '/Special_Section "[a-zA-Z0-9]*-[a-zA-Z0-9]*"/s/\( "[a-zA-Z0-9]*\)-\([a-zA-Z0-9]*"\)/\1\2/' *myfiles*
You can split that over several lines to avoid the scroll bar in SO:
sed -i '/Special_Section "[a-zA-Z0-9]*-[a-zA-Z0-9]*"/{
s/\( "[a-zA-Z0-9]*\)-\([a-zA-Z0-9]*"\)/\1\2/
}' *myfiles*
And on further thoughts, you can also do:
sed -i 's/\(Special_Section "[a-zA-Z0-9]*\)-\([a-zA-Z0-9]*"\)/\1\2/' *myfiles*
This is more compact. You can add the g qualifier if you need it. Both solutions use the special \(...\) notation to capture parts of the regular expression.
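Before running any of these with -i, it can be worth trying the substitution on a few lines through a pipe. The following is a sketch under assumptions: the two extra sample lines are invented here to check that hyphens outside a Special_Section entry are left alone, and `sample` is just an illustrative variable name.

```shell
# Invented sample: only the first line should change
sample='(Special_Section "data-values")
(Other_Section "keep-this")
plain text with a - hyphen'

# The closing quote sits inside the second group so it is preserved
printf '%s\n' "$sample" |
    sed 's/\(Special_Section "[a-zA-Z0-9]*\)-\([a-zA-Z0-9]*"\)/\1\2/'
```

Only the first line is rewritten, to (Special_Section "datavalues"); the other two come through unchanged because the pattern requires the Special_Section prefix.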

sed remove multiple characters surrounded by digits

I have a file with following contents:
EMAIL|TESTNUMBER|DATE
somemail#address.com|123456789|2011-02-08T16:36:02Z
How do I remove the capital letter T between the date and the time, and the Z at the end of the line, using sed?
Thanks!
If the format is fixed and each line always matches T\d\d:\d\d:\d\dZ, then you could try the simple:
$ sed 's/T\(..:..:..\)Z$/ \1/'
(Untested)
Perhaps there's a fancier way, but the following script works for me:
s/\(....-..-..\)T\(.*\)/\1 \2/
s/Z$//
Example...in-bound file:
somemail#address.com|123456789|2011-02-08A16:36:02X
somemail#address.com|123456789|2011-02-08T16:36:02Z
somemail#address.com|123456789|2011-02-08B16:36:02Y
Output:
D:\>sed -f sedscr testfile
somemail#address.com|123456789|2011-02-08A16:36:02X
somemail#address.com|123456789|2011-02-08 16:36:02
somemail#address.com|123456789|2011-02-08B16:36:02Y
Pipe it through:
sed -E 's/([0-9]+)T([0-9]+)/\1\2/' | sed 's/Z$//'
Edit
Oh my! I've just realized (thanks @Fredrik) that for a long time I have been wasting processes! Shame on me! I'm now a Church of The One Process convert. Here is the blessed version of the abominable one-liner above:
sed -E 's/([0-9]+)T([0-9]+)/\1\2/; s/Z$//' the_file.txt
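For comparison, here is a sketch that requires a digit on each side of the T, so the T characters in the TESTNUMBER header are never touched. It replaces the T with a space rather than deleting it, which is an assumption about the desired output; drop the space in the replacement if the two parts should be joined. The variable name `rows` is illustrative.

```shell
rows='EMAIL|TESTNUMBER|DATE
somemail#address.com|123456789|2011-02-08T16:36:02Z'

# One sed process, two expressions: digit-T-digit becomes digit-space-digit,
# then a trailing Z is removed
printf '%s\n' "$rows" | sed -e 's/\([0-9]\)T\([0-9]\)/\1 \2/' -e 's/Z$//'
```

The header line passes through unchanged, and the data line ends in 2011-02-08 16:36:02.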

Using AWK to treat files

I want to create a batch file with awk, grep or sed that keeps all lines beginning with 'INSERT' and deletes the other lines.
After this, I want to replace a string "change)" by "servicechange)" when the 3rd word in the treated line is "donextsit".
Can someone explain how to do this?
awk '/INSERT/{
if ($3=="donextsit"){
gsub("change","servicechange");
print
}
}' file
Since this is homework, something is still not working; you should find out what for yourself.
sed '
/^INSERT/ ! d;
/^ *[^ ]\+ *[^ ]\+ *donextsit / s/change)/servicechange)/g;
' -i file
Edit: Incorporated Jonathan Leffler's suggestions.
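For readers who prefer awk, the same two steps (keep only lines beginning with INSERT, then rewrite "change)" when the third word is "donextsit") might be sketched as below. `filter_inserts` is a hypothetical helper name and the three input lines are invented sample data, not taken from the question.

```shell
# filter_inserts: reads lines on stdin (hypothetical helper for illustration)
filter_inserts() {
    awk '/^INSERT/ {
        # Rewrite only when the third whitespace-separated word matches
        if ($3 == "donextsit")
            gsub(/change\)/, "servicechange)")
        print    # every INSERT line is kept, rewritten or not
    }'
}

filter_inserts <<'EOF'
INSERT INTO donextsit (change) VALUES (1);
INSERT INTO other (change) VALUES (2);
DELETE FROM donextsit (change);
EOF
```

On this sample the DELETE line is dropped, the first INSERT line gets (servicechange), and the second INSERT line is printed as is. Note that a line already containing "servicechange)" would be rewritten again, so this assumes the input has not been processed before.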