sed error - unterminated substitute pattern

I am in a directory with files consisting of many lines like this:
98.684807 :(float)
52.244898 :(float)
46.439909 :(float)
and then a line that terminates:
[chuck]: cleaning up...
I am trying to eliminate :(float) from every file (but leave the number) and also remove that cleaning up... line.
I can get:
sed -ie 's/ :(float)//g' *
to work, but that leaves copies of the old files behind. Removing the -e flag results in an unterminated substitute pattern error.
Same deal with:
sed -ie 's/[chuck]: cleaning up...//g' *
Thoughts?

sed -i '' -e 's/:(float)//' -e '/^.chuck/d' *
This tells sed not to save a backup copy (a zero-length backup extension passed to -i) and gives each sed command its own -e.

sed -ie expression [files...]
is equivalent to:
sed -ie -e expression [files...]
and both mean apply expression to files, overwriting the files, but saving the old files with an "e" as the backup suffix.
I think you want:
sed -i -e expression [files...]
Now if you're getting an error from that there must be something wrong with your expression.
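One way to sidestep the question of how -i takes its suffix is to attach a suffix directly (no space) and delete the backups afterwards; that form should be parsed the same way by GNU and BSD sed. A minimal sketch, assuming a throwaway directory and a hypothetical file name sample1.txt:

```shell
# Hypothetical sandbox: a temporary directory with one sample file.
workdir=$(mktemp -d)
printf '%s\n' '98.684807 :(float)' '52.244898 :(float)' '[chuck]: cleaning up...' > "$workdir/sample1.txt"

# -i.bak (suffix attached to -i) edits in place and keeps sample1.txt.bak as a backup.
# The second expression deletes the whole "[chuck]: cleaning up..." line.
sed -i.bak -e 's/ :(float)//' -e '/^\[chuck\]: cleaning up/d' -- "$workdir"/*.txt

# Remove the backups once the result has been checked.
rm -- "$workdir"/*.bak

result=$(cat "$workdir/sample1.txt")
printf '%s\n' "$result"
rm -r -- "$workdir"
```

The backup suffix .bak is an arbitrary choice here; any suffix that does not collide with the glob used for the input files would do.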

Your numbers are separated from (float) by the : character, so you can use awk or cut to extract them. It's simpler than a regex:
$ head -n -1 file | awk -F":" '{print $1}'
98.684807
52.244898
46.439909
$ head -n -1 file | cut -d":" -f1
98.684807
52.244898
46.439909

Solution :
sed -i '' 's/ :(float)//g' *
sed -i '' 's/\[chuck\]: cleaning up\.\.\.//g' *
Explanation :
I can get:
sed -ie 's/ :(float)//g' *
to work, but that creates files that keeps the old files.
That's because sed's -i flag is designed to work that way. From the sed man page:
-i extension
Edit files in-place, saving backups with the specified extension. If a zero-length extension is given, no backup will be saved.
In this case e is being interpreted as the extension you want to save your backups with. So all your original files will be backed up with an e appended to their names.
In order to provide a zero-length extension, you need to use -i ''.
Note: Unlike -i<your extension>, -i'' won't work. You need to have a space character between -i and '' in order for it to work.
Removing the -e flag results in an unterminated substitute pattern error.
When you remove the e immediately following -i, i.e.
sed -i 's/ :(float)//g' *
s/ :(float)//g will now be interpreted as the extension argument to the -i flag, and the first file in the list produced by shell expansion of * is interpreted as the sed script (most probably as an s/regular expression/replacement/flags function, which is why the error mentions a substitute pattern). You can verify this by checking the output of
sedfn=$(echo * | cut -d' ' -f1); [[ ${sedfn:0:1} == "s" ]]; echo $?
If the output of the above chain of commands is 0, our assumption is validated.
Also in this case, if somehow the first filename qualifies as a valid s/regular expression/replacement/flags sed function, the other filenames will be interpreted as regular files for sed to operate on.
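A separate detail worth checking is the pattern for the cleaning-up line. In a sed regex, [chuck] is a bracket expression that matches a single character (c, h, u or k), not the literal text in brackets, and each . matches any character. The sketch below runs on an inline sample line rather than on real files and compares the unescaped and escaped forms:

```shell
line='[chuck]: cleaning up...'

# Unescaped: [chuck] means "one of c, h, u, k", so nothing matches and the line survives.
unescaped=$(printf '%s\n' "$line" | sed 's/[chuck]: cleaning up...//')

# Escaped brackets and dots match the literal text; the substitution leaves an empty line.
escaped=$(printf '%s\n' "$line" | sed 's/\[chuck\]: cleaning up\.\.\.//')

# Deleting the whole line with d avoids the leftover empty line.
deleted=$(printf '%s\n' '98.684807 :(float)' "$line" | sed -e 's/ :(float)//' -e '/^\[chuck\]: cleaning up/d')

printf 'unescaped=<%s>\nescaped=<%s>\ndeleted=<%s>\n' "$unescaped" "$escaped" "$deleted"
```

If the goal is to drop the line entirely, the d form is probably the better fit, since the s form only empties it.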

sed -i -e 's/ :(float)//g' *

Check to see if you have any odd filenames in the directory.
Here is one way to duplicate your error:
$ touch -- "-e s:x:"
$ ls
-e s:x:
$ sed -i 's/ :(float)//g' *
sed: -e expression #1, char 5: unterminated `s' command
One way to protect against this is to use a double dash to terminate the options to sed when you use a wild card:
$ sed -i 's/ :(float)//g' -- *
You can do the same thing to remove the file:
$ rm "-e s:x:"
rm: invalid option -- 'e'
$ rm -- "-e s:x:"


Using a single sed call to split and grep

This is mostly by curiosity, I am trying to have the same behavior as:
echo -e "test1:test2:test3"| sed 's/:/\n/g' | grep 1
in a single sed command.
I already tried
echo -e "test1:test2:test3"| sed -e "s/:/\n/g" -n "/1/p"
But I get the following error:
sed: can't read /1/p: No such file or directory
Any idea on how to fix this and combine different types of commands into a single sed call?
Of course this is overly simplified compared to the real usecase, and I know I can get around by using multiple calls, again this is just out of curiosity.
EDIT: I am mostly interested in the sed tool, I already know how to do it using other tools, or even combinations of those.
EDIT2: Here is a more realistic script, closer to what I am trying to achieve:
arch=linux64
base=https://chromedriver.storage.googleapis.com
split="<Contents>"
curl $base \
| sed -e 's/<Contents>/<Contents>\n/g' \
| grep $arch \
| sed -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
What I would like to simplify is the curl line, turning it into something like:
curl $base \
| sed 's/<Contents>/<Contents>\n/g' -n '/1/p' -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
Here are some alternatives, awk and sed based:
sed -E "s/(.*:)?([^:]*1[^:]*).*/\2/" <<< "test1:test2:test3"
awk -v RS=":" '/1/' <<< "test1:test2:test3"
# or also
awk 'BEGIN{RS=":"} /1/' <<< "test1:test2:test3"
Or, using your logic, you would need to pipe a second sed command:
sed "s/:/\n/g" <<< "test1:test2:test3" | sed -n "/1/p"
See this online demo. The awk solution looks cleanest.
Details
In the sed solution, the (.*:)?([^:]*1[^:]*).* pattern matches an optional sequence of any 0+ chars followed by a :, then captures into Group 2 any 0 or more chars other than :, then 1, then again 0 or more chars other than :, and finally matches the rest of the line. The replacement just keeps the Group 2 contents.
In the awk solution, the record separator is set to : and then the /1/ regex is used to return only the records that contain 1.
This might work for you (GNU sed):
sed 's/:/\n/;/^[^\n]*1/P;D' file
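Replace the first : with a newline and, if the first line in the pattern space contains 1, print it. D then deletes that first line and restarts the cycle on what is left of the pattern space.
Repeat.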
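To see the P;D loop above at work without a file, it can be fed two sample lines on stdin. This is a sketch that assumes GNU sed, since \n in the replacement and inside [^\n] is not understood by every sed; the second input line is invented for illustration:

```shell
# Assumes GNU sed. Each cycle: split off the first field, print it if it
# contains 1 (P), then drop it and restart on the remainder (D).
result=$(printf '%s\n' 'test1:test2:test3' 'bob3:bob1:fred1' | sed 's/:/\n/;/^[^\n]*1/P;D')
printf '%s\n' "$result"
# prints:
# test1
# bob1
# fred1
```

No -n is needed because D never reaches the end of the script, so the automatic print at the end of a cycle does not happen.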
An alternative:
sed -Ez 's/:/\n/g;s/^[^1]*$//mg;s/\n+/\n/;s/^\n//' file
This slurps the whole file into memory and replaces all colons by newlines. All lines that do not contain 1 are removed and surplus newlines deleted.
An alternative to the really ugly sed is: grep -o '\w*2\w*'
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | grep -o '\w*2\w*'
test2
bob2
fred2
grep -o: only matching
Or: grep -o '[^:]*2[^:]*'
echo -e "test1:test2:test3" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D'
sed -n doesn't print unless told to
sed -E allows using parens to match (\n|$) which is newline or the end of the pattern space
P prints the pattern buffer up to the first newline.
D trims the pattern buffer up to the first newline
[^\n] is a character class that matches anything except a newline
// is sed shorthand for repeating a match
//! is then matching everything that didn't match previously
So, after you split into newlines, you want to make sure the 2 character is between the start of the pattern buffer ^ and the first newline.
And, if there is not the character you are looking for, you want to D delete up to the first newline.
At that point, it works for one line of input, with one string containing the character you're looking for.
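To expand to several matches within a line, run D unconditionally: it deletes up to the first newline and restarts the cycle on the rest of the pattern space. The command below still carries a label :a and a trailing ta, but ta is never reached, because D always starts a new cycle: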
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | \
sed -En ':a s/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D;ta'
test2
bob2
fred2
This is simply NOT a job for sed. With GNU awk for multi-char RS:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '/1/'
test1
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' 'NR%2'
test1
test3
test5
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '!(NR%2)'
test2
test4
test6
$ echo "foo1:bar1:foo2:bar2:foo3:bar3" | awk -v RS='[:\n]' '/foo/ || /2/'
foo1
foo2
bar2
foo3
With any awk you'd just have to strip the \n from the final record before operating on it:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS=':' '{sub(/\n$/,"")} /1/'
test1

SED inplace file change inside make - How?

sed inplace change on a file is not working inside Make object.
I want to replace a line in a file with sed called in a make object. But it does not seem to be working. How can I fix this?
change_generics:
ifeq ($(run_TESTNAME), diagnostics)
ifeq ($(run_TESTCASE), 1)
	sed -i -e "s/SIM_MULTI\==[a-z,A-Z]*/SIM_MULTI=TRUE/" ./generics.f
else ifeq ($(TESTCASE), 2)
	sed -i -e "s/SIM_MISSED\==[a-z,A-Z]*/SIM_MISSED=TRUE/" ./generics.f
endif
endif
I would like the generics.f file changed with that one line change. But it remains the same as the original. The sed command works outside make.
I can't reproduce this using GNU sed 4.2.2 and GNU make 3.82, or at least, I can't reproduce any scenario where the same sed command works from the command line but not in a Makefile.
Simpler Makefile:
all:
# Contrived just so I can test your 2 sed commands.
	sed -i -e "s/SIM_MULTI\==[a-z,A-Z]*/SIM_MULTI=TRUE/" ./generics.f
	sed -i -e "s/SIM_MISSED\==[a-z,A-Z]*/SIM_MISSED=TRUE/" ./generics.f
Sample file content in generics.f:
SIM_MULTI=foo
SIM_MISSED=bar
Testing:
$ make all
sed -i -e "s/SIM_MULTI\==[a-z,A-Z]*/SIM_MULTI=TRUE/" ./generics.f
sed -i -e "s/SIM_MISSED\==[a-z,A-Z]*/SIM_MISSED=TRUE/" ./generics.f
Confirmed that both sed commands fail to edit a file with this content.
To fix:
Probably, you need to simply remove the \= from your regular expression. The backslash there has no effect, and causes your regex to simply match two equals signs ==. Thus this works:
all:
	sed -i 's/SIM_MULTI=[a-zA-Z]*/SIM_MULTI=TRUE/' ./generics.f
	sed -i 's/SIM_MISSED=[a-zA-Z]*/SIM_MISSED=TRUE/' ./generics.f
Testing:
$ make all
sed -i 's/SIM_MULTI=[a-zA-Z]*/SIM_MULTI=TRUE/' ./generics.f
sed -i 's/SIM_MISSED=[a-zA-Z]*/SIM_MISSED=TRUE/' ./generics.f
$ cat generics.f
SIM_MULTI=TRUE
SIM_MISSED=TRUE
Further explanation:
There is no need to specify -e there.
There is no need to enclose the script in double quotes, which is riskier because it allows the contents to be modified by the shell.
The bug appears to be \= and I deleted those characters, as mentioned above.
Note that I removed the comma , as well in [a-z,A-Z]. I think that probably isn't what you meant, and it would cause a class of characters including a-z, A-Z and a comma , to be matched by the regex. (And if it is what you mean, you might consider writing it as [a-zA-Z,] as that would be less confusing.)
If this has not resolved your issue, I would need to know things like:
Which version of sed you are using.
What the contents of generics.f are.
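The difference between the two patterns can be checked outside make, on a single line piped through sed. This is a rough sketch that assumes, as reported above for GNU sed 4.2.2, that \= is treated as a plain equals sign; the sample value foo comes from the test file shown earlier:

```shell
line='SIM_MULTI=foo'

# Original pattern: \== requires two equals signs, so a single "=" is not matched
# and the line comes back unchanged.
original=$(printf '%s\n' "$line" | sed 's/SIM_MULTI\==[a-z,A-Z]*/SIM_MULTI=TRUE/')

# Corrected pattern: one "=" followed by letters.
fixed=$(printf '%s\n' "$line" | sed 's/SIM_MULTI=[a-zA-Z]*/SIM_MULTI=TRUE/')

printf 'original=<%s>\nfixed=<%s>\n' "$original" "$fixed"
```

Dropping -i while experimenting like this keeps generics.f untouched until the pattern is known to match.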
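sed also has a c ("change") command that replaces the whole matching line. GNU sed accepts it in this one-line form (POSIX sed expects a newline after c\):
sed -i '/SIM_MULTI=/c\SIM_MULTI=TRUE' ./generics.f
sed -i '/SIM_MISSED=/c\SIM_MISSED=TRUE' ./generics.f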

sed to copy part of line to end

I'm trying to copy part of a line to append to the end:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz
becomes:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
I have tried:
sed 's/\(.*(GCA_\)\(.*\))/\1\2\2)'
$ f1=$'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz'
$ echo "$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\1\2\3\/\2\4/' <<<"$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
sed -E (or -r on some systems) enables extended regex support in sed, so you don't need to escape the group parentheses ( ).
The group (GCA_.[^.]*) means "starting at GCA_, capture all chars up to, but excluding, the first dot":
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\2/' <<<"$f1"
GCA_900169985
Similarly, (.[^_]*) means capture all chars up to the first _ (excluding the _ char). A negated character class like this is the usual way to emulate a non-greedy/lazy capture in sed (in Perl regex it could be written with the lazy quantifier, something like .*? followed by _).
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\3/' <<<"$f1"
.1
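The effect of the negated class compared with a greedy .* can be seen on a shortened version of the string. A small sketch, where the sample value is just the tail of the file name from the question:

```shell
s='900169985.1_IonXpress_024_genomic'

# Greedy: .* grabs as much as it can, so the group runs to the LAST underscore.
greedy=$(printf '%s\n' "$s" | sed -E 's/^(.*)_.*$/\1/')

# Negated class: [^_]* cannot cross an underscore, so the group stops at the FIRST one.
first_only=$(printf '%s\n' "$s" | sed -E 's/^([^_]*)_.*$/\1/')

printf 'greedy=%s\nfirst_only=%s\n' "$greedy" "$first_only"
# prints:
# greedy=900169985.1_IonXpress_024
# first_only=900169985.1
```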
Short sed approach:
s="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz"
sed -E 's/(GCA_[^._]+)\.([^_]+)/\1.\2\/\1/' <<< "$s"
The output:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz

Sed remove matching lines script

I'm requesting help with a very simple script...
#!/usr/bin/sed -f
sed '/11,yahoo/d'
sed '/2506,stackover flow/d'
sed '/2536,reddit/d'
Just need it to remove three matches that account for 18408 in my file, data.csv
% sed -f remove.sed < data.csv
sed: 3: remove.sed: unterminated substitute pattern
Doing these same lines individually is no problem at all, so what am I doing wrong with this?
Using freeBSD 10.1 and its implementation of sed, if that matters.
Because this is a sed script, its lines should not start with "sed" or wrap the commands in quotes.
Either change it to:
#!/usr/bin/sed -f
/11,yahoo/d
/2506,stackover flow/d
/2536,reddit/d
Or to
#!/bin/sh
sed -e '/11,yahoo/d' \
    -e '/2506,stackover flow/d' \
    -e '/2536,reddit/d'
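The script-file variant can be tried end to end in a scratch directory. This is a sketch only: the rows written to data.csv are invented stand-ins for the real file:

```shell
workdir=$(mktemp -d)

# A sed script file holds bare sed commands, one per line: no "sed" prefix, no quotes.
cat > "$workdir/remove.sed" <<'EOF'
/11,yahoo/d
/2506,stackover flow/d
/2536,reddit/d
EOF

# Hypothetical sample rows standing in for data.csv.
printf '%s\n' '11,yahoo' '12,example' '2506,stackover flow' '2536,reddit' '3000,keep me' > "$workdir/data.csv"

result=$(sed -f "$workdir/remove.sed" < "$workdir/data.csv")
printf '%s\n' "$result"
rm -r -- "$workdir"
```

Note that the patterns are unanchored, so /11,yahoo/ would also delete a row such as 111,yahoo; adding ^ at the start would restrict the match if that matters for the data.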

how to find replace value with whitespace using sed in a bash script

I have values in a file like this ' value-to-remove '(without the ' characters). I want to use sed to run through the file and replace the values including the space before and after. I am running this via a bash script.
How can I do this?
The sed command I'm using at the moment replaces the values but leaves behind the two spaces.
sed -i 's/ '$value' / /g' test.conf
In script I have
sed -i -e 's/\s'$DOMAIN'-'$SITE'\s/\s/g' gitosis.conf
echoed as
sed -i -e s/\sffff.com-eeee\s/\s/g test.conf
Not working though.
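IMHO your sed does not know '\s', so use the POSIX class [[:space:]] instead. Also put the whole script in double quotes so the variables expand inside a single argument, and use a plain space in the replacement (\s has no special meaning there), e.g.:
sed -i -e "s/[[:space:]]${DOMAIN}-${SITE}[[:space:]]/ /g" gitosis.conf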
Let me know if this is what you need
echo 'Some values to remove value-to-remove and more' | sed -e 's/\svalue-to-remove\s/CHANGED/g'
output: Some values to removeCHANGEDand more
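Putting the pieces together with shell variables, here is a minimal sketch that works on a sample line instead of gitosis.conf. The values of DOMAIN and SITE are taken from the echoed command in the question; the surrounding words are made up:

```shell
DOMAIN='ffff.com'
SITE='eeee'
line='before ffff.com-eeee after'

# Double quotes let the shell expand ${DOMAIN} and ${SITE} inside one sed argument.
# [[:space:]] is the POSIX character class for whitespace; the replacement is one space.
result=$(printf '%s\n' "$line" | sed "s/[[:space:]]${DOMAIN}-${SITE}[[:space:]]/ /g")
printf '%s\n' "$result"
# prints: before after
```

One caveat: the . in ffff.com is a regex metacharacter here and matches any character, so values containing ., / or other special characters may need escaping before they are interpolated.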