How do I remove spaces from a processed target string using sed? - sed

I search in .cpp files and remove everything after the “delimiter string” up to and including the first “,” using this sed command.
sed -re 's/(GetValue[(])[^,]*,/\1/' *.cpp
Thus…
abc.GetValue(SomeString, SecondParam);
Becomes…
abc.GetValue( SecondParam);
Question:
But how do I remove these spaces in the strings I find without removing all other spaces in the file?
This removes spaces from the whole file e.g. sed -re 's/(GetValue[(])[^,]*,/\1/;s/ //g ' *.cpp

You just need to match the spaces in your pattern. You are already matching everything else.
Just append " *" (a space followed by *), \s*, or whatever match is appropriate after the comma in your pattern.
sed -re 's/(GetValue[(])[^,]*, */\1/' *.cpp
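To see the difference the extra " *" makes, here is a minimal sketch that runs both substitutions on the sample line from the question. It uses -E, the more portable spelling of -r, and the variable names are only illustrative.

```shell
# Sample line taken from the question.
line='abc.GetValue(SomeString, SecondParam);'

# Original pattern: the space after the comma is not matched, so it survives.
before=$(printf '%s\n' "$line" | sed -E 's/(GetValue[(])[^,]*,/\1/')

# With " *" the spaces after the comma are part of the match and are removed.
after=$(printf '%s\n' "$line" | sed -E 's/(GetValue[(])[^,]*, */\1/')

printf '%s\n%s\n' "$before" "$after"
```

The first line printed still has the space after the parenthesis; the second does not. Spaces elsewhere in the file are untouched because only text inside the match is replaced.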

Related

Substring file name in Unix using sed command

I want to substring the File name in unix using sed command.
File name : Test_Test1_Test2_10082019_030013.csv.20191008-075740
I want the characters after the 3rd underscore (that is, everything after Test2_) to be printed.
Can this be done using sed command?
I have tried this command
sed 's/^.*_\([^_]*\)$/\1/' <<< 'Test_Test1_Test2_10082019_030013.csv.20191008-075740'
but this is giving result as 030013.csv.20191008-075740
I need it from 10082019_030013.csv.20191008-075740
Thanks
Neha
To remove from the beginning up to and including the 3rd underscore you can use
sed 's/^\([^_]*_\)\{3\}//' <<< 'Test_Test1_Test2_10082019_030013.csv.20191008-075740'
This removes the initial part that consists of 3 groups of (any number of non-underscore characters followed by an underscore). The result is
10082019_030013.csv.20191008-075740
If you use GNU sed you can switch it to extended regular expressions and omit the backslashes.
sed -r 's/^([^_]*_){3}//' <<< 'Test_Test1_Test2_10082019_030013.csv.20191008-075740'
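As a quick check, the following sketch runs the basic and extended forms side by side on the file name from the question. It pipes the name in with printf instead of a here-string so it also works in plain sh, and it uses -E, which GNU sed accepts as a synonym for -r.

```shell
name='Test_Test1_Test2_10082019_030013.csv.20191008-075740'

# Basic regular expressions: grouping and the interval need backslashes.
bre=$(printf '%s\n' "$name" | sed 's/^\([^_]*_\)\{3\}//')

# Extended regular expressions: same pattern without the backslashes.
ere=$(printf '%s\n' "$name" | sed -E 's/^([^_]*_){3}//')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands print 10082019_030013.csv.20191008-075740.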
Could you please try the following.
sed 's/\([^_]*\)_\([^_]*\)_\([^_]*\)_\(.*\)/\4/' Input_file
Or as per Bodo's nice suggestion:
sed 's/[^_]*_[^_]*_[^_]*_\(.*\)/\1/' Input_file
This might work for you (GNU sed):
sed 's/_/\n/3;s/.*\n//;t;s/Test2/\n/;s/.*\n//;t;d' file
Replace the third _ by a newline and then remove everything up to and including the first newline. If this succeeds, bail out and print the result. Otherwise, try the same method with Test2, and if this fails too, delete the entire line.
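The three outcomes of that command can be seen with a small sketch. It assumes GNU sed (for \n in the replacement and t without a label), and the second and third input lines are made up for illustration.

```shell
# Line 1 has a third underscore, line 2 only contains Test2, line 3 has neither.
input=$(printf '%s\n' 'Test_Test1_Test2_10082019_030013.csv' 'xTest2tail' 'nothing')

# t jumps to the end of the script (printing the line) once a substitution
# has succeeded; lines that reach the final d are deleted.
out=$(printf '%s\n' "$input" | sed 's/_/\n/3;s/.*\n//;t;s/Test2/\n/;s/.*\n//;t;d')

printf '%s\n' "$out"
```

This prints 10082019_030013.csv and tail; the third line is dropped.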

Sed command to remove all lines not containing punctuation

I'm struggling to find a sed command to remove all lines in a text file that do not contain punctuation (of any kind) without doing each manually.
For example:
111.222.123.234
222.11.34.54
word # To remove
www.facebook.com
www.stackoverflow.com
another # To remove
random#email.com
Does such a command exist?
You can use the [:punct:] character class, which corresponds to
[!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~]
and negate it:
$ sed '/[[:punct:]]/!d' infile
111.222.123.234
222.11.34.54
www.facebook.com
www.stackoverflow.com
random#email.com
Or, instead of negating the match, negate the character class and anchor it, so that only lines consisting entirely of non-punctuation characters are deleted:
sed '/^[^[:punct:]]*$/d' infile
Or don't print anything unless a line does contain a punctuation character:
sed -n '/[[:punct:]]/p'
Or use grep instead of sed:
grep '[[:punct:]]' infile
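The variants can be compared on a few of the sample lines from the question, as in the sketch below. For the negated character class it uses the anchored form /^[^[:punct:]]*$/d, since an unanchored /[^[:punct:]]/d would delete every line that contains any letter or digit.

```shell
# A subset of the sample lines from the question.
input=$(printf '%s\n' '111.222.123.234' 'word' 'www.facebook.com' 'another')

a=$(printf '%s\n' "$input" | sed '/[[:punct:]]/!d')       # delete lines without a match
b=$(printf '%s\n' "$input" | sed '/^[^[:punct:]]*$/d')    # delete all-non-punctuation lines
c=$(printf '%s\n' "$input" | sed -n '/[[:punct:]]/p')     # print only matching lines
d=$(printf '%s\n' "$input" | grep '[[:punct:]]')          # same filter with grep

printf '%s\n' "$a"
```

All four variables end up holding the same two lines, 111.222.123.234 and www.facebook.com.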

Matching strings even if they start with white spaces in SED

I'm having issues matching strings even if they start with any number of white spaces. It's been very little time since I started using regular expressions, so I need some help
Here is an example. I have a file (file.txt) that contains two lines
#String1='Test One'
String1='Test Two'
I'm trying to change the value for the second line, without affecting line 1, so I used this
sed -i "s|String1=.*$|String1='Test Three'|g"
This changes the values for both lines. How can I make sed change only the value of the second string?
Thank you
With GNU sed, you match spaces using \s, while other sed implementations usually work with the [[:space:]] character class. So, pick one of these:
sed 's/^\s*AWord/AnotherWord/'
sed 's/^[[:space:]]*AWord/AnotherWord/'
Since you're using -i, I assume GNU sed. Either way, you probably shouldn't retype your word, as that introduces the chance of a typo. I'd go with:
sed -i "s/^\(\s*String1=\).*/\1'New Value'/" file
Move the \s* outside of the parens if you don't want to preserve the leading whitespace.
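A minimal sketch of the capture-group version on the two lines from the question plus one indented line (the indented line is an assumption added to show the whitespace being preserved). It uses [[:space:]] instead of \s so it does not depend on GNU sed.

```shell
# Commented-out line, active line, and a hypothetical indented line.
input=$(printf '%s\n' "#String1='Test One'" "String1='Test Two'" "  String1='Test Two'")

# ^ anchors at the start of the line, so the line beginning with # cannot match;
# \1 puts the captured "<whitespace>String1=" prefix back unchanged.
out=$(printf '%s\n' "$input" | sed "s/^\([[:space:]]*String1=\).*/\1'New Value'/")

printf '%s\n' "$out"
```

The commented line is left alone, and both other lines get the new value with their original indentation.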
There are a couple of solutions you could use to go about your problem
If you want to ignore lines that begin with a comment character such as '#' you could use something like this:
sed -i "/^\s*#/! s|String1=.*$|String1='Test Three'|g" file.txt
which only operates on lines that do not match (the trailing !) the regular expression /.../, which is anchored at the beginning of the line (^) and matches optional whitespace (\s*) followed by an octothorpe (#).
The other option is to include the characters before 'String' as part of the substitution. Doing it this way means you'll need to capture \(...\) the group to include it in the output with \1
sed -i "s|^\(\s*\)String1=.*$|\1String1='Test Four'|g" file.txt
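The address-negation approach can be tried without -i on sample input, as sketched here. The indented comment line is a made-up addition to show that optional leading whitespace before # is handled, and [[:space:]] stands in for \s for portability.

```shell
# Two comment lines (one indented) and the active assignment.
input=$(printf '%s\n' "#String1='Test One'" "  # String1='Old'" "String1='Test Two'")

# /regex/! applies the s command only to lines that do NOT match the regex,
# i.e. lines that are not comments.
out=$(printf '%s\n' "$input" | sed "/^[[:space:]]*#/! s|String1=.*$|String1='Test Three'|")

printf '%s\n' "$out"
```

Only the last line changes, to String1='Test Three'.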
With GNU sed, try:
sed -i "s|^\s*String1=.*$|String1='Test Three'|" file
or
sed -i "/^\s*String1=/s/=.*/='Test Three'/" file
Using awk you could do:
awk '/String1/ && f++ {$2="Test Three"}1' FS=\' OFS=\' file
#String1='Test One'
String1='Test Three'
It skips the first line matching String1 because f is still 0 (false) there; f++ then makes it true for every later match.

Sed to delete lines containing n characters

I have many lines in a file which contain only '--', and I want to remove them. But there are many other lines in the file that contain 'SOMETEXT--SOMETEXT'.
sed -i "/--/d" removes every line containing '--', but I only want to remove the lines that contain nothing but '--'.
You can use ^ and $ to indicate beginning and end of line
sed -i '/^--$/d' file
A line containing only -- would match the regex ^--$
If you want to include lines with leading/trailing whitespaces, it could be extended to
^\s*--\s*$
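The two patterns can be compared on a small made-up input, as sketched below. The example reads from a pipe rather than editing a file with -i, and uses [[:space:]] in place of \s so that it does not rely on GNU sed.

```shell
# '--' alone, '--' inside other text, '--' surrounded by spaces, and an ordinary line.
input=$(printf '%s\n' '--' 'SOMETEXT--SOMETEXT' '  --  ' 'keep')

# Strict: only lines that are exactly '--' are deleted.
strict=$(printf '%s\n' "$input" | sed '/^--$/d')

# Loose: leading and trailing whitespace around '--' is allowed.
loose=$(printf '%s\n' "$input" | sed '/^[[:space:]]*--[[:space:]]*$/d')

printf '%s\n---\n%s\n' "$strict" "$loose"
```

The strict pattern keeps the padded '  --  ' line; the loose one removes it as well. SOMETEXT--SOMETEXT survives in both cases.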
sed -i '/^--$/d' file
The ^ and $ chars "anchor" the search to the beginning and end of the line, respectively.
OR if there can be spaces at the front or back AND assuming a modernish sed
sed -i '/^[[:space:]]*--[[:space:]]*$/d' file
where [:space:] will find space chars and tabs.
ELSE a total retro sed should handle
sed '/^[ ]*--[ ]*$/d' file > newFile && mv newFile file
and if there could be tabs, then just include a tab char along with the space char, i.e.
[<Space><TAB>]
but not spelled out; just typing a literal space char and a tab char will do it.
IHTH

Insert newline after pattern with changing number in sed

I want to insert a newline after the following pattern
lcl|NC_005966.1_gene_750
The last number (750 in this case) changes; the numbers are in a range of 1-3407.
How can I tell sed to keep this pattern together and not split it after the first digit?
So far I found
sed 's/lcl|NC_005966.1_gene_[[:digit:]]/&\n/g' file
But this breaks off after the first digit.
Try:
sed 's/lcl|NC_005966.1_gene_[[:digit:]]*/&\n/g' file
(note the *)
Alternatively, you could say:
sed '/lcl|NC_005966.1_gene_[[:digit:]]/G' file
which appends an empty line after every line that contains the pattern (at the end of the line, not directly after the match).
sed 's/lcl|NC_005966\.1_gene_[[:digit:]][[:digit:]]*/&\
/g' file
You need to escape . as it's an RE metacharacter, and you need [[:digit:]][[:digit:]]* to represent 1-or-more digits and you need to use \ followed by a literal newline for portability across seds.
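A small sketch of the portable form, run on a made-up input where sequence data follows the identifier directly on the same line (the ATGC suffix is an assumption for illustration):

```shell
# The whole number 750 must stay attached to the identifier.
out=$(printf '%s\n' 'lcl|NC_005966.1_gene_750ATGC' |
  sed 's/lcl|NC_005966\.1_gene_[[:digit:]][[:digit:]]*/&\
/g')

printf '%s\n' "$out"
```

This prints lcl|NC_005966.1_gene_750 on one line and ATGC on the next, because [[:digit:]][[:digit:]]* consumes all the digits before the newline is inserted.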