sed remove multiple characters surrounded by digits

I have a file with the following contents:
EMAIL|TESTNUMBER|DATE
somemail#address.com|123456789|2011-02-08T16:36:02Z
How do I remove the capital letter T between the date and the time, and the Z at the end of the line, using sed?
Thanks!

If the format is fixed and each line always matches T\d\d:\d\d:\d\dZ, then you could try the simple:
$ sed 's/T\(..:..:..\)Z$/ \1/'
(Untested)
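A quick way to check the untested command above is to feed it the sample line from the question. This is only a sketch; the variable names line and result are made up for illustration.

```shell
# Sample data line from the question.
line='somemail#address.com|123456789|2011-02-08T16:36:02Z'

# Replace the T in front of hh:mm:ss with a space and drop the trailing Z.
result=$(printf '%s\n' "$line" | sed 's/T\(..:..:..\)Z$/ \1/')

printf '%s\n' "$result"
```

Because the pattern is anchored with Z$, lines that do not end in a time followed by Z pass through unchanged.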

Perhaps there's a fancier way, but the following script works for me:
s/\(....-..-..\)T\(.*\)/\1 \2/
s/Z$//
Example input file:
somemail#address.com|123456789|2011-02-08A16:36:02X
somemail#address.com|123456789|2011-02-08T16:36:02Z
somemail#address.com|123456789|2011-02-08B16:36:02Y
Output:
D:\>sed -f sedscr testfile
somemail#address.com|123456789|2011-02-08A16:36:02X
somemail#address.com|123456789|2011-02-08 16:36:02
somemail#address.com|123456789|2011-02-08B16:36:02Y

Cat it through:
sed 's/\([0-9]\)T\([0-9]\)/\1\2/' | sed 's/Z$//'
Edit
Oh my! I've just realized (thanks @Fredrik) that for a long time I wasted processes! Shame on me! Now I'm a Church of The One Process convert. Here is the blessed version of the above abominated one-liner:
sed 's/\([0-9]\)T\([0-9]\)/\1\2/; s/Z$//' the_file.txt
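For what it's worth, here is a sketch of a one-process variant run against the header line and the sample line from the question. It assumes you want a space in place of the T, as in the other answers; use \1\2 instead of \1 \2 to delete it outright. The variable names are illustrative only.

```shell
header='EMAIL|TESTNUMBER|DATE'
line='somemail#address.com|123456789|2011-02-08T16:36:02Z'

# Two substitutions in a single sed process.
# Requiring a digit on both sides of T leaves the T's in the header alone.
result=$(printf '%s\n' "$header" "$line" |
  sed -e 's/\([0-9]\)T\([0-9]\)/\1 \2/' -e 's/Z$//')

printf '%s\n' "$result"
```

Matching a single digit on each side is enough here; basic regular expressions have no portable + operator, so [0-9]+ would look for a literal plus sign.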

Related

Substring file name in Unix using sed command

I want to extract a substring of a file name in Unix using the sed command.
File name : Test_Test1_Test2_10082019_030013.csv.20191008-075740
I need to print the characters after the 3rd underscore (that is, everything after Test2_).
Can this be done using the sed command?
I have tried this command:
sed 's/^.*_\([^_]*\)$/\1/' <<< 'Test_Test1_Test2_10082019_030013.csv.20191008-075740'
but it gives 030013.csv.20191008-075740 as the result.
I need everything from 10082019 onward: 10082019_030013.csv.20191008-075740
Thanks
Neha
To remove everything from the beginning up to and including the 3rd underscore, you can use
sed 's/^\([^_]*_\)\{3\}//' <<< 'Test_Test1_Test2_10082019_030013.csv.20191008-075740'
This removes the initial part that consists of 3 groups of (any number of non-underscore characters followed by an underscore). The result is
10082019_030013.csv.20191008-075740
If you use GNU sed you can switch it to extended regular expressions and omit the backslashes.
sed -r 's/^([^_]*_){3}//' <<< 'Test_Test1_Test2_10082019_030013.csv.20191008-075740'
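As a sanity check, the sketch below runs the basic-regex form on the file name from the question and compares it with cut, which is a different tool swapped in here only for comparison. The variable names are made up.

```shell
name='Test_Test1_Test2_10082019_030013.csv.20191008-075740'

# sed: strip three leading groups of "non-underscores followed by an underscore".
with_sed=$(printf '%s\n' "$name" | sed 's/^\([^_]*_\)\{3\}//')

# cut: print field 4 and everything after it, with _ as the delimiter.
with_cut=$(printf '%s\n' "$name" | cut -d_ -f4-)

printf '%s\n' "$with_sed" "$with_cut"
```

Both commands should print the same string. Piping through printf instead of using <<< keeps the sketch usable in shells that lack here-strings.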
Could you please try the following:
sed 's/\([^_]*\)_\([^_]*\)_\([^_]*\)_\(.*\)/\4/' Input_file
Or as per Bodo's nice suggestion:
sed 's/[^_]*_[^_]*_[^_]*_\(.*\)/\1/' Input_file
This might work for you (GNU sed):
sed 's/_/\n/3;s/.*\n//;t;s/Test2/\n/;s/.*\n//;t;d' file
Replace the third _ with a newline and then remove everything up to and including the first newline. If this succeeds, bail out and print the result. Otherwise, try the same method with Test2, and if this fails, delete the entire line.

How to replace strings in all files using sed?

I want to replace the first line below with the second in all the files. What sed pattern should I use for this? I have tried a lot but haven't figured it out.
checkToken($token['token'])
checkToken($token)
This is what I have tried
sed -i -- 's/checkToken\(\$token\['token'\]\)/checkToken\(\$token\)/g' get_officers_v2.php
You just need to get your escape characters (\) in the right place, like:
sed -i "s/\(checkToken(\$token\)\['token'\])/\1)/" get_officers_v2.php
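To see which characters really need escaping, here is a small sketch that runs the substitution on a single line from standard input instead of editing a file. It relies on two facts: in a basic regular expression, plain ( and ) are literal while \( and \) form a group, and a single quote cannot appear inside a single-quoted shell string, which is why the script is double-quoted. The variable names are illustrative.

```shell
# The PHP line from the question; \$ keeps the shell from expanding $token.
line="checkToken(\$token['token'])"

# Only [ needs a backslash in the pattern. ( ) and ] are literal as written,
# and $ is literal because it is not at the end of the pattern.
result=$(printf '%s\n' "$line" |
  sed "s/checkToken(\$token\['token'])/checkToken(\$token)/g")

printf '%s\n' "$result"
```

Once the output looks right, the same script can be used with -i on the real files.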

Remove string between dash (-) and the first dot (.)

I have many web addresses that include special interface names, which I would like to remove. Examples:
aaaaaaa-INT1.aaaa.aaaa.com
bbbbbbb-INT2.bbbb.bbbb.com
ccccccc-INT.cccc.cccc.com
So my expected result after sed should be:
aaaaaaa.aaaa.aaaa.com
bbbbbbb.bbbb.bbbb.com
ccccccc.cccc.cccc.com
I have tried this, but it doesn't work:
sed 's/-.*^.//'
Any suggestions, please?
To remove everything from the first dash up to, but not including, the first period:
$ sed 's/-[^.]*//' file
aaaaaaa.aaaa.aaaa.com
bbbbbbb.bbbb.bbbb.com
ccccccc.cccc.cccc.com
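The reason for [^.]* rather than .* is greediness. The sketch below contrasts the two on one of the sample addresses; the variable names are made up for illustration.

```shell
host='aaaaaaa-INT1.aaaa.aaaa.com'

# Greedy: .* runs to the LAST dot, so far too much is removed.
greedy=$(printf '%s\n' "$host" | sed 's/-.*\././')

# Negated class: [^.]* cannot cross a dot, so it stops before the FIRST one.
bounded=$(printf '%s\n' "$host" | sed 's/-[^.]*//')

printf '%s\n' "$greedy" "$bounded"
```

The first command prints aaaaaaa.com and the second prints aaaaaaa.aaaa.aaaa.com.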
Solution 1: The following sed may help too.
sed 's/\([^-]*\)-\([^.]*\)\(.*\)/\1\3/' Input_file
Solution 2: With awk.
awk -F"." '{sub(/-.*/,"",$1)} 1' OFS="." Input_file

Append text to a line on multiple conditions

I am very new to sed so please bear with me... I have a file with contents like
a=1
b=2,3,4
c=3
d=8
.
.
I want to append 'x' to a line which starts with 'c=' and does not contain an 'x'. What I am using right now is
sed -i '/^c=/ s/$/x/'
but this does not cover the second part of my explanation, the 'x' should only be appended if the line did not have it already and hence if I run the command twice it makes the line "c=3xx" which I do not want.
Any help here would be highly appreciated and I know there are a lot of sharp heads around here :) I understand that this can be handled pretty easily through bash but using sed here is a hard requirement.
You can do something like this:
sed -i '/^c=/ {/x/b; s/$/x/}'
Curly brackets are used for grouping. The b command branches to the end of the script (stops the processing of the current line).
b label
Branch to label; if label is omitted, branch to end of script.
Edit: as William Pursell suggests in the comment, a shorter version would be
sed -i '/^c=/ { /x/ !s/$/x/ }'
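A sketch to confirm that the shorter version is safe to run repeatedly. It works on standard input rather than with -i, and the helper name append_x is made up for this example. The script is written with a ; before the closing brace so that it should also be accepted by non-GNU seds.

```shell
input=$(printf '%s\n' 'a=1' 'b=2,3,4' 'c=3' 'd=8')

# Append x to lines starting with c= unless they already contain an x.
append_x() { sed '/^c=/{/x/!s/$/x/;}'; }

once=$(printf '%s\n' "$input" | append_x)
twice=$(printf '%s\n' "$once" | append_x)

printf '%s\n' "$twice"
```

The second pass leaves c=3x alone because the /x/ test now matches that line.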
awk is probably a better choice here as you can easily combine regular expression matches with logical operators. Given the input:
$ cat file
a=1
b=2,3,4
c=3
c=x
c=3
d=8
The command would be:
$ awk '/^c=/ && !/x/ {$0=$0"x"} 1' file
a=1
b=2,3,4
c=3x
c=x
c=3x
d=8
Where $0 is the awk variable that contains the current line being read.
This might work for you (GNU sed):
sed -i '/^c=[^x]*$/s/$/x/' file
or:
sed -i 's/^c=[^x]*$/&x/' file

One-liners to remove lines in which a specific character appears more than x times

I think the title says it all, I'm looking for a one-liner to remove lines of a file in which a specific character, let's say /, appears more than x times - 5, for instance.
Start:
/Bo/byl/apointe
S/ta/ck/ov/er/flo/w
M/oon/
Expected result:
/Bo/byl/apointe
M/oon/
Thank you for your suggestions!
You can use awk's gsub function. gsub returns the number of substitutions made, so replacing each / with itself counts how many times it occurs on the line without changing the line. The condition <5 keeps only lines with fewer than five slashes.
awk 'gsub(/\//,"&")<5' file
Updated based on Ed Morton's suggestion.
This might work for you (GNU sed):
sed 's|/|&|5;T;d' file
All you need is:
awk -F/ 'NF<6' file
Look:
$ cat file
/Bo/byl/apointe
S/ta/ck/ov/er/flo/w
M/oon/
$ awk -F/ 'NF<6' file
/Bo/byl/apointe
M/oon/
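If the limit should be adjustable, the field-count idea can take it as a variable. This is a sketch, assuming "more than 5 times" is read literally, so a line with exactly five slashes is kept (NF<6 would drop it). The variable name max and the extra line a/b/c/d/e/f are additions for illustration.

```shell
input=$(printf '%s\n' '/Bo/byl/apointe' 'S/ta/ck/ov/er/flo/w' 'M/oon/' 'a/b/c/d/e/f')

# With / as the field separator, a line containing n slashes has n+1 fields,
# so NF - 1 is the slash count. Keep lines with at most max slashes.
kept=$(printf '%s\n' "$input" | awk -F/ -v max=5 'NF - 1 <= max')

printf '%s\n' "$kept"
```

Only S/ta/ck/ov/er/flo/w, which has six slashes, is removed.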
I believe sed would be sufficient here. You'll want to look into //d and supply the correct condition. I'm going to try something and update when I have better ideas; you should too :)
Once you find it, sed -i /{blah}/d will be enough to change the file in place, but you might want to run it without the -i and pipe it through less first to confirm it's doing what you think it's doing.
This would do:
sed -r '/(\/.*){5}\//d' file
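The same idea can be written without escaping every slash by choosing a different address delimiter. This is a sketch, assuming a sed that accepts -E (GNU and BSD sed both do). Like the command above, it deletes lines that contain at least six slashes.

```shell
input=$(printf '%s\n' '/Bo/byl/apointe' 'S/ta/ck/ov/er/flo/w' 'M/oon/')

# \%...% makes % the address delimiter, so / needs no backslash.
# (/[^/]*){6} matches six slashes, each followed by a run of non-slashes.
kept_sed=$(printf '%s\n' "$input" | sed -E '\%(/[^/]*){6}%d')

printf '%s\n' "$kept_sed"
```

Change the 6 to one more than the largest number of slashes you want to keep.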