Why can't sed match the content between # and the end of the line?

$ echo "haha#nihao" | sed "s/#.+$/end/"
haha#nihao
I want to match everything between the # character and the end of the line. Why doesn't this work?
:%!sed "s/#.\+$/end/"
E194:No alternate file name to substitute for '#'
Problem 1:
Why can't I use it with sed from inside Vim?
Problem 2:
How can I look up error E194?

Problem 1: why can't I use it with sed in Vim?
Because by default sed uses BRE (basic regular expressions), where an unescaped + is an ordinary character:
/.+/ matches any character followed by a literal "+"
/.\+/ matches one or more occurrences of any character (\+ is a GNU extension to BRE)
You can tell sed to use extended regular expressions with the -r flag in GNU implementations and the -E flag in BSD implementations (recent GNU versions accept -E as well):
$ echo "haha#nihao" | sed -r "s/#.+$/end/"
hahaend
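To make the difference concrete, here is a minimal sketch that runs the same substitution three ways. The variable names are only for illustration; the `..*` form is included as an assumption-free fallback for sed implementations that support neither `\+` nor `-E`.

```shell
# BRE: an unescaped + is a literal plus sign, so the pattern does not match
bre_literal=$(echo "haha#nihao" | sed 's/#.+$/end/')

# ERE (-E): + means "one or more of the preceding item"
ere=$(echo "haha#nihao" | sed -E 's/#.+$/end/')

# Plain BRE spelling of "one or more": one character followed by .*
bre_portable=$(echo "haha#nihao" | sed 's/#..*$/end/')

echo "$bre_literal"   # input is left unchanged
echo "$ere"
echo "$bre_portable"
```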
Problem 2: how to look up error E194?
You get this error because # has a special meaning in Vim when you run commands with !: a # on the command line is replaced with the name of the alternate file. It should work if you escape the #:
:%!sed "s/\#.\+$/end/"
You can read about this error with the :help E194 command, and about the alternate file with :help alternate-file.

Use sed -r:
echo "haha#nihao" | sed -r "s/#.+$/end/"
hahaend
From man sed:
-r, --regexp-extended
    use extended regular expressions in the script.

Related

sed: remove lines that match neither of two patterns

I am trying to create a filter command to reduce the lines in a log file. Assume each line contains a path with a date component:
/iamthepath01/20200301/file01.txt
/iamthepath02/20200302/file02.txt
....
/iamthepathxx/20210619/filexx.txt
Then, from thousands of lines, I only want to keep the ones that contain one of two strings in the path:
/202106
/202105
and remove all other lines.
I have tried the following command:
sed -i -e '\(/202105\|/202106\)!d' ~/log.txt
The above command threw:
sed: -e expression #1, char 24: unterminated address regex
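The address regex must be enclosed in delimiters. Because your expression starts with \(, sed takes ( as a custom regex delimiter and never finds the closing one, hence the "unterminated address regex" error. You can use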
sed -i '/\/20210[56]/!d' ~/log.txt
Or, if you need to use more specific alternatives and further enhance the pattern:
sed -i -E '/\/(202105|202106)/!d' ~/log.txt
Details:
-i - GNU sed option for in-place file editing
-E - option enabling POSIX ERE regex syntax
/\/20210[56]/ - regex that matches /20210 and then either 5 or 6
\/(202105|202106) - the POSIX ERE pattern that matches / and then either 202105 or 202106
!d - removes the lines not matching the pattern.
See the online demo:
#!/bin/bash
s='/iamthepath01/20200301/file01.txt
/iamthepath02/20200302/file02.txt
/iamthepathxx/20210619/filexx.txt'
sed '/\/20210[56]/!d' <<< "$s"
Output:
/iamthepathxx/20210619/filexx.txt
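Since -i overwrites the file, it can be worth keeping a backup while testing the filter. The following is a sketch under stated assumptions: the temporary file and its three sample lines are hypothetical (the second date was changed so that both alternatives match), and the .bak suffix is an arbitrary choice.

```shell
# Hypothetical sample log in a temporary file
log=$(mktemp)
printf '%s\n' \
  '/iamthepath01/20200301/file01.txt' \
  '/iamthepath02/20210502/file02.txt' \
  '/iamthepathxx/20210619/filexx.txt' > "$log"

# Edit in place, keeping the original as "$log.bak"
sed -i.bak -E '/\/(202105|202106)/!d' "$log"

kept=$(cat "$log")                        # lines that survived the filter
backup_lines=$(grep -c '' "$log.bak")     # line count of the untouched backup
echo "$kept"

rm -f "$log" "$log.bak"
```

Writing the suffix directly after -i (with no space) is the form accepted by both GNU and BSD sed.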
sed is the wrong tool for this. If you want a script that's as fragile as the sed one then use grep as it's the tool that exists solely to do a simple g/re/p (hence the name) like you're doing:
$ grep '/20210[56]' file
/iamthepathxx/20210619/filexx.txt
or if you want a more robust solution that focuses just on the part of the line you want to match and so will avoid false matches, then use awk:
$ awk -F '/' '$3 ~ /^20210[56]/' file
/iamthepathxx/20210619/filexx.txt
This might work for you (GNU sed):
sed -ni '\#/20210[56]#p' file
This uses sed's grep-like -n option to turn off implicit printing and the -i option to edit the file in place.
Normally sed uses /.../ to delimit the regex, but other delimiters may be used if the first one is escaped, e.g. \#...#.
So the above solution will filter the existing file down to lines that contain either /202105 or /202106.
N.B. grep will almost certainly be faster at finding the above lines; however, the -i option may be the ultimate reason for choosing sed (although the same outcome can be achieved by appending > tmpFile && mv tmpFile file to a grep solution).
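The grep-plus-temporary-file variant mentioned above could look roughly like this. It is only a sketch: the file names come from mktemp and the sample lines are the ones from the question.

```shell
# Sample input in a temporary file (stands in for the real log)
file=$(mktemp)
printf '%s\n' \
  '/iamthepath01/20200301/file01.txt' \
  '/iamthepath02/20200302/file02.txt' \
  '/iamthepathxx/20210619/filexx.txt' > "$file"

# Filter into a second temporary file, then replace the original
tmp=$(mktemp)
if grep '/20210[56]' "$file" > "$tmp"; then
  mv "$tmp" "$file"
else
  # grep exits non-zero when nothing matches; keep the original untouched
  rm -f "$tmp"
fi

filtered=$(cat "$file")
echo "$filtered"
rm -f "$file"
```

Note that with the && form the original file is left as it was when no line matches, whereas sed -i would leave it empty in that case.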

Finding it difficult to extract digits from string using sed

I am trying to extract the version information from a string using sed as follows:
echo "A10.1.1-Vers8" | sed -n "s/^A\([0-9]+\)\.\([0-9]\)\.[0-9]+-.*/\1/p"
I want to extract '10' after 'A', but the above expression doesn't give the expected result. Could someone please explain why this statement doesn't work?
I tried the above command with different sed options, but nothing works. I think this is some syntax error.
echo "A10.1.1-Vers10" | sed -n "s/^X\([0-9]+\)\.\([0-9]\)\.[0-9]+-.*/\1/p"
The expected result is '10'.
The actual result is empty output.
$ echo "A10.1.1-Vers8" | sed -r 's/^A([[:digit:]]+)\.(.*)$/\1/g'
10
Search for a string starting with A (^A), followed by one or more digits (I am using the POSIX character class [[:digit:]] with +), which are captured in a group (), followed by a literal dot \., followed by everything else (.*)$.
Finally, replace the whole thing with the content of the captured group, \1.
In GNU sed, -r (called --regexp-extended in the man page) enables extended regular expressions, so +, ( and ) work without backslashes.
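The original command fails because, without -r or -E, sed uses basic regular expressions, where a bare + is a literal plus sign. As a minimal sketch, the asker's pattern can be kept almost as written either by spelling "one or more" as \{1,\} in BRE or by switching to ERE with -E; the variable names here are only illustrative.

```shell
input='A10.1.1-Vers8'

# BRE: groups are \( \) and "one or more" is the interval \{1,\}
bre_major=$(echo "$input" | sed -n 's/^A\([0-9]\{1,\}\)\.\([0-9]\)\.[0-9]\{1,\}-.*/\1/p')

# ERE: groups are ( ) and + works unescaped
ere_major=$(echo "$input" | sed -nE 's/^A([0-9]+)\.([0-9])\.[0-9]+-.*/\1/p')

echo "$bre_major"
echo "$ere_major"
```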
GNU grep is an alternative to sed:
$ echo "A10.1.1-Vers10" | grep -oP '(?<=^A)[0-9]+'
10
The -o option tells grep to print only the matched characters.
The -P option tells grep to match Perl regular expressions, which enables the (?<= lookbehind zero-length assertion.
The lookbehind assertion (?<=^A) ensures there is an A at the beginning of the line, but doesn't include it as part of the match for output.
If you need to match more of the version string, you can use a lookahead assertion:
$ echo "A10.1.1-Vers10" | grep -oP '(?<=^A)[0-9]+(?=\.[0-9]+\.[0-9]+-.*)'
10

Escape backslash character in sed

I need to modify some Windows paths.
For instance,
D:\usr
to
D:\first\usr
So, I have created a variable.
$path = "first\usr"
then used the following command:
sed -i -e 's!\\usr!${path}/g;' test.txt
However, this ends up with the following:
D:\firstSr
How do I escape \u in sed?
Assuming your path variable was assigned properly (without spaces in the assignment: path='first\usr'), fixing step by step for an input file test.txt with one example path:
$ cat test.txt
D:\usr
Your original command
$ sed 's!\\usr!${path}/g;' test.txt
sed: -e expression #1, char 18: unterminated `s' command
doesn't do much, as you've mixed ! and / as the delimiter.
Fixing delimiters:
$ sed 's!\\usr!${path}!g;' test.txt
D:${path}
Now no interpolation happens at all because of the single quotes. I suspect these are just copy-paste mistakes, as you obviously got some output.
Double quotes:
$ sed "s!\\usr!${path}!g" test.txt
bash: !\\usr!${path}!g: event not found
Now this clashes with history expansion. We could escape the !, or use a different delimiter.
/ as delimiter:
$ sed "s/\\usr/${path}/g" test.txt
D:\firstSr
Now we're where the question actually started. ${path} expands to first\usr, but \u has a special meaning in GNU sed in the replacement string: it uppercases the following character, hence the S.
Even without the special meaning, \u would most likely just expand to u and the backslash would be gone.
Escaping the backslash:
$ path='first\\usr'
$ sed "s/\\usr/${path}/g" test.txt
D:\first\usr
This works.
Depending on which shell you are using, you may be able to use parameter expansion to double \ in your substitution string and prevent the \u interpretation:
path="first\usr"
sed -e "s/\\usr/${path//\\/\\\\}/g" <<< "D:\usr"
The syntax for replacing a pattern with the shell parameter expansion is ${parameter/pattern/string} (one replacement) or ${parameter//pattern/string} (replace all matches).
This substitution is not specified by POSIX, but is available in Bash.
Where it is not available, you may need to filter $path through a process:
path=$(echo "$path" | sed 's/[][\\*.%$]/\\&/g')
(N.B. I have also escaped other sed metacharacters in this filter.)
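For the replacement side specifically, the characters that matter are the backslash, & and the delimiter. Here is a sketch, assuming / is the delimiter of the final command, that escapes exactly those and keeps the search pattern in single quotes so that sed receives \\ and matches the literal backslash. Because the backslash is then part of the match, it is put back at the start of the replacement. The variable names are hypothetical.

```shell
path='first\usr'

# Escape \, & and / so the value is safe inside an s/.../.../ replacement.
# | is used as the delimiter here so that / can sit in the bracket expression.
escaped=$(printf '%s\n' "$path" | sed 's|[\\&/]|\\&|g')

# Pattern \\usr matches a literal backslash followed by usr;
# the replacement starts with \\ to re-insert that backslash.
result=$(printf '%s\n' 'D:\usr' | sed 's/\\usr/\\'"$escaped"'/g')

echo "$escaped"
echo "$result"
```

Using printf '%s\n' instead of echo avoids shells whose echo interprets backslash sequences in the data.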

How to use sed in order to search ^A and replace it

I would like to use (GNU) sed to do a simple search and replace. The issue is that I'm searching for a special character and it might be the reason it failed for me.
The input is:
^A9=139^A35=V^A34=9^A49=xxxx^A52=20140527-06:18:43.759^A5
and I want to replace the ^A with ;. I used:
sed -i '/s/^A/;/g' file.log
but I didn't get anything.
Your command should be:
sed -i 's/\^A/;/g' file
The command you tried:
sed -i '/s/^A/;/g' file.log
        |  |
        |  |__ You have to escape this special character, because in a regex it normally means the start of the line.
        |
        |__ [No need to use `/` before s]
Example:
$ sed 's/\^A/;/g' file
;9=139;35=V;34=9;49=xxxx;52=20140527-06:18:43.759;5
^ has a special meaning in regular expressions: it anchors the match to the start of the line. Use \^ to match a literal caret. Inside single quotes the shell passes the backslash through unchanged, so \\^ is not needed.
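The answers above assume the file contains a literal caret followed by an A. If the ^A is instead how your viewer displays the single control character Ctrl-A (byte 0x01), which is an assumption worth checking with cat -v, then \^A will not match. A minimal sketch for that case, with a made-up two-field sample line:

```shell
# The actual Ctrl-A byte (octal 001), produced portably with printf
ctrl_a=$(printf '\001')

# Hypothetical sample: fields separated by real Ctrl-A bytes
converted=$(printf '\0019=139\00135=V\001\n' | sed "s/${ctrl_a}/;/g")

echo "$converted"
```

GNU sed also understands \x01 inside the pattern, but embedding the byte through a shell variable avoids depending on that extension.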

What is the purpose of the "-" in sh script line: ext="$(echo $ext | sed 's/\./\\./' -)"

I am porting a sh script that was apparently written for the GNU implementation of sed to the BSD implementation of sed. The exact lines in the script, with the original comment, are:
# escape dot in file extension to grep it
ext="$(echo $ext | sed 's/\./\\./' -)"
I am able to reproduce the results with the following (obviously I am not exhausting all possible values for ext):
ext=.h; ext="$(echo $ext | sed 's/\./\\./' -)"; echo [$ext]
Using GNU's implementation of sed the following is returned:
[\.h]
Using BSD's implementation of sed the following is returned:
sed: -: No such file or directory
[]
Executing ext=.h; ext="$(echo $ext | sed 's/\./\\./')"; echo [$ext] returns [\.h] for both implementation of sed.
I have looked at both the GNU and BSD sed man pages and have not found anything about the trailing "-". Googling for sed with a "-" is not very fruitful either.
Is the "-" a typo?
Is the "-" needed for some unexpected value of $ext?
Is the issue not with sed, but rather with sh?
Can someone direct me to what I should be looking at, or even better, explain what the purpose of the "-" is?
On my system, that syntax isn't documented in the man page, but it is in the 'info' page:
sed OPTIONS... [SCRIPT] [INPUTFILE...]
If you do not specify INPUTFILE, or if INPUTFILE is `-', `sed' filters the contents of the standard input.
Given that particular usage, I think you could leave off the '-' and it should still work.
You got your specific question answered BUT your script is all wrong. Take a look at this:
# escape dot in file extension to grep it
ext="$(echo $ext | sed 's/\./\\./')"
The main problems with that are:
You're not quoting your variable ($ext), so it will go through file name expansion and, if it contains spaces, will be passed to echo as multiple arguments instead of one. Do this instead:
ext="$(echo "$ext" | sed 's/\./\\./')"
You're using an external command (sed) and a pipe to do something the shell can do trivially itself. Do this instead:
ext="${ext/./\.}"
Worst of all: You're escaping the RE meta-character (.) in your variable so you can pass it to grep to do an RE search on it as if it were a string - that doesn't make any sense and becomes intractable in the general case where your variable could contain any combination of RE metacharacters. Just do a string search instead of an RE search and you don't need to escape anything. Don't do either of the above substitution commands and then do either of these instead of grep "$ext" file:
grep -F "$ext" file
fgrep "$ext" file
awk -v ext="$ext" 'index($0,ext)' file
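To illustrate why the string search is the safer choice, here is a small sketch comparing a regex search with a fixed-string search for the same unescaped extension. The two sample names are made up for the demonstration.

```shell
ext=.h

# Two hypothetical names: only the first really contains ".h"
sample=$(printf '%s\n' 'main.h' 'mathh')

# As a regex, . matches any character, so "th" in mathh also counts
regex_hits=$(printf '%s\n' "$sample" | grep -c -- "$ext")

# With -F the pattern is a plain string, so only main.h matches
fixed_hits=$(printf '%s\n' "$sample" | grep -c -F -- "$ext")

echo "$regex_hits"
echo "$fixed_hits"
```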