Find string tag and replace with a hyperlink using SED - sed

New to SED and trying to use it to find a custom string tag and replace with an html hyperlink, but can't get the following SED format to work correctly. Thanks for your help.
Summary:
Find DEV-XXXX in a string and replace it with an HTML hyperlink. The DEV- prefix always stays the same, but the XXXX digits vary from string to string.
"This is a test of DEV-1212"
"This is a test of DEV-1213 more text"
Expected results:
"This is a test of <a href="https://devtest.net/DEV-1212">DEV-1212</a>"
"This is a test of <a href="https://devtest.net/DEV-1213">DEV-1213</a> more text"
This is the SED syntax I've been working with, but haven't been able to make it work correctly.
$ echo "This is a test DEV-1212" | sed -r 's/DEV-^[^0-9]*([0-9]+).*/<a href=\"https://devtest.net/&\">&</a>/'
Produces the following error:
sed: -e expression #1, char 43: unknown option to `s'

Apart from not escaping the / characters in the replacement while using / as the delimiter, your pattern cannot match: the caret ^ asserts the start of the line, and that can never occur right after DEV-.
DEV-^[^0-9]*([0-9]+).*
----^
If there can only be digits after DEV- you could write the pattern as:
echo "This is a test of DEV-1212" | sed -r 's~DEV-[0-9]+~<a href="https://devtest.net/&">&</a>~'
Or else keep matching non digits and the rest of the line:
echo "This is a test of DEV-1212" | sed -r 's~DEV-[^0-9]*[0-9].*~<a href="https://devtest.net/&">&</a>~'
Output
This is a test of <a href="https://devtest.net/DEV-1212">DEV-1212</a>
Edit
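If you want the match to stop after the digits, so that the rest of the line stays outside the link:
echo "This is a test of DEV-1212 with more data" | sed -r 's~DEV-[^0-9]*[0-9]+~<a href="https://devtest.net/&">&</a>~'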
Output
This is a test of <a href="https://devtest.net/DEV-1212">DEV-1212</a> with more data
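As a rough sketch, the digits-only substitution can be wrapped in a small helper. The function name linkify is made up for illustration, the base URL is the one from the question, and the g flag is an addition so that several tags on one line are all converted. -E is used instead of -r because both GNU and BSD sed accept it.

```shell
#!/bin/sh
# linkify: hypothetical helper that wraps every DEV-<digits> tag read from
# stdin in an HTML anchor. '~' is the delimiter, so the slashes in the URL
# need no escaping; '&' in the replacement stands for the matched text.
linkify() {
    sed -E 's~DEV-[0-9]+~<a href="https://devtest.net/&">&</a>~g'
}

printf '%s\n' 'This is a test of DEV-1212' 'See DEV-1213 and DEV-7 for more' | linkify
```

Because the pattern stops after the digits, any text following the tag is left outside the anchor.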

You did not escape the / characters in the replacement. And why escape the "? Inside single quotes it needs no backslash.
echo "This is a test DEV-1212" | sed -r 's/DEV-^[^0-9]*([0-9]+).*/<a href="https:\/\/devtest.net\/&">&<\/a>/'
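Better, use a different delimiter (this only fixes the syntax error; the stray ^ in the pattern still has to go before anything matches):
echo "This is a test DEV-1212" | sed -r 's|DEV-^[^0-9]*([0-9]+).*|<a href="https://devtest.net/&">&</a>|'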

Related

extract substring with sed report error message

I would like to extract substring with sed as below:
#!/bin/bash
txt="[audio.sys.offload.pstimeout.secs]: [3]"
echo $txt|sed -r -e 's/\[[a-zA-Z0-9_.]+\].*/\1/'
expected output is:
audio.sys.offload.pstimeout.secs
Error message:
sed: -e expression #1, char 26: invalid reference \1 on `s' command's RHS
#!/bin/bash
txt="[audio.sys.offload.pstimeout.secs]: [3]"
echo "$txt" | sed -r -e 's/^\[(.*)\]:.*/\1/'
Your expression has no capture group, so \1 has nothing to refer to. Here we grab all the characters from the first [ up to the last ]: and put them in a capture group.
Would you like the regex to remain mostly like yours?
By the way, with lazy matching (which sed does not support),
the regex could be cleaner, simply ^\[(.*?)\]
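Since sed has no lazy quantifier, one common workaround is a negated bracket expression that stops at the first closing bracket. The following is a minimal sketch under that assumption; the variable name key is only illustrative.

```shell
#!/bin/sh
txt='[audio.sys.offload.pstimeout.secs]: [3]'

# [^]]* matches any run of characters that are not ']', so the group ends
# at the first closing bracket even though sed cannot match lazily.
key=$(printf '%s\n' "$txt" | sed -E 's/^\[([^]]*)\].*/\1/')

printf '%s\n' "$key"
```

Quoting "$txt" also keeps the shell from treating the square brackets as a glob pattern.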

Finding it difficult to extract digits from string using sed

I am trying to extract the version information from a string using sed as follows:
echo "A10.1.1-Vers8" | sed -n "s/^A\([0-9]+\)\.\([0-9]\)\.[0-9]+-.*/\1/p"
I want to extract '10' after 'A', but the above expression doesn't give the expected result. Could someone please explain why this statement doesn't work?
I tried the above command and changed the options of sed, but nothing works. I think this is some syntax error.
echo "A10.1.1-Vers10" | sed -n "s/^X\([0-9]+\)\.\([0-9]\)\.[0-9]+-.*/\1/p"
Expected result is '10'
Actual result: no output
$ echo "A10.1.1-Vers8" | sed -r 's/^A([[:digit:]]+)\.(.*)$/\1/g'
10
Search for string starting with A (^A), followed by multiple digits (I am using POSIX character class [[:digit:]]+) which is captured in a group (), followed by a literal dot \., followed by everything else (.*)$.
Finally, replace the whole thing with the Captured Group content \1.
In GNU sed, -r (called --regexp-extended in the man page) switches to extended regular expressions, so +, ( and ) work as operators without backslashes.
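The reason the original command prints nothing is that, without -r or -E, sed uses basic regular expressions, where an unescaped + is a literal plus sign. The sketch below shows one portable basic-syntax spelling and the extended-syntax equivalent; the variable names are illustrative only.

```shell
#!/bin/sh
ver='A10.1.1-Vers8'

# Basic syntax: '+' would be a literal plus sign, so "one or more digits"
# is spelled [0-9][0-9]* instead.
major_bre=$(printf '%s\n' "$ver" | sed -n 's/^A\([0-9][0-9]*\)\.[0-9][0-9]*\.[0-9][0-9]*-.*/\1/p')

# Extended syntax (-E): '+' is a quantifier and groups need no backslashes.
major_ere=$(printf '%s\n' "$ver" | sed -nE 's/^A([0-9]+)\.[0-9]+\.[0-9]+-.*/\1/p')

printf '%s %s\n' "$major_bre" "$major_ere"
```

GNU sed also accepts \+ in basic syntax, but that is an extension and other sed variants may not support it.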
GNU grep is an alternative to sed:
$ echo "A10.1.1-Vers10" | grep -oP '(?<=^A)[0-9]+'
10
The -o option tells grep to print only the matched characters.
The -P option tells grep to match Perl regular expressions, which enables the (?<= lookbehind zero-length assertion.
The lookbehind assertion (?<=^A) ensures there is an A at the beginning of the line, but doesn't include it as part of the match for output.
If you need to match more of the version string, you can use a lookahead assertion:
$ echo "A10.1.1-Vers10" | grep -oP '(?<=^A)[0-9]+(?=\.[0-9]+\.[0-9]+-.*)'
10

sed with multiple word string initialized with read command

I've got a file named file.conf containing:
this is the configuration text and this is the WORD to change.
Running:
sed -i 's/WORD/"ONE TWO"/g' file.conf
I will have file.conf modified:
this is the configuration text and this is the "ONE TWO" to change.
Now, if I make a script that uses read:
read -p 'word to change' TEXT -> "ONE TWO"
echo $TEXT -> "ONE TWO"
sed -i 's/WORD/'$TEXT'/g' file.conf
it fails with this error message:
sed: -e expression #1, char 11: unterminated `s' command
file.conf is not modified in this case.
But it works if $TEXT holds a single word without spaces, "ONE" for instance.
Thanx folks.
Double-quote the variable so the shell does not split its value into separate arguments:
sed -i 's/WORD/'"$TEXT"'/g' file.conf
Optionally with braces, which only delimit the variable name and do not change the quoting:
sed -i 's/WORD/'"${TEXT}"'/g' file.conf
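Quoting fixes the word splitting, but the value is still pasted into the sed script, so a / or & in it would change the command. A minimal sketch of one way to guard against that follows; escape_repl is a hypothetical helper name, and it assumes the value is a single line.

```shell
#!/bin/sh
# escape_repl: hypothetical helper that backslash-escapes the characters
# that are special in an s/// replacement: '\', '/' (the delimiter used
# below) and '&'. Assumes the value contains no newline.
escape_repl() {
    printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

TEXT='"ONE TWO" & a/b'
line='this is the WORD to change.'

result=$(printf '%s\n' "$line" | sed "s/WORD/$(escape_repl "$TEXT")/g")
printf '%s\n' "$result"
```

The helper uses | as its own delimiter so that the / inside the bracket expression is not mistaken for the end of the pattern.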

Escape line beginning and end in bracket expressions in sed

How do you escape line beginning and line end in bracket expressions in sed?
For example, let's say I want to replace every comma, the line beginning, and the line end in each line with a pipe:
echo "a,b,c" | sed 's/,/|/g'
# a|b|c
echo "a,b,c" | sed 's/^/|/g'
# |a,b,c
echo "a,b,c" | sed 's/$/|/g'
# a,b,c|
echo "a,b,c" | sed 's/[,^$]/|/g'
# a|b|c
I would expect the last command to produce |a|b|c|. I also tried escaping the line beginning and line end via backslash, with no change.
With GNU sed with extended regular expressions, you can do:
$ echo "a,b,c" | /opt/gnu/bin/sed -E 's/^|,|$/|/g'
|a|b|c|
$
The -E option enables the extended regular expressions, as does -r, but -E is also used by other sed variants for the same purpose, unlike -r.
However, for reasons which elude me, the BSD (macOS) variant of sed produces:
$ echo "a,b,c" | sed -E 's/^|,|$/|/g'
|a|b|c
$
I can't think why.
If this variability is unacceptable, go with the three-substitution solution:
$ echo "a,b,c" | sed -e "s/^/|/" -e "s/$/|/" -e "s/,/|/g"
|a|b|c|
$
which should work with any variant of sed. However, note that echo "" | sed …3 subs… produces || whereas the -E variant produces |. I'm not sure if there's an easy fix for that.
You tried this, but it didn't do what you wanted:
$ echo "a,b,c" | sed 's/[,^$]/|/g'
a|b|c
$
This is what should be expected. Inside character classes, most special characters lose their special-ness. There is nothing special about $ (or , but it isn't a metacharacter anyway) in a character class; ^ is only special at the start of the class and it negates the character class. That means that what follows shows the correct, expected behaviour from this permutation of the contents of your character class:
$ echo "a,b\$\$b,c" | sed 's/[^,$]/|/g'
|,|$$|,|
$
It mapped all the non-comma, non-dollar characters to pipes. I should be using single quotes around the echo; then the backslashes wouldn't be necessary. I just followed the question's code quietly.
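To make the bracket-expression rule concrete, here is a small sketch with an input that actually contains the characters in question; the sample string is made up for illustration.

```shell
#!/bin/sh
# Inside [...], '$' is always literal and '^' is literal unless it comes
# first, so this replaces literal commas, carets and dollar signs only.
out=$(printf '%s\n' 'a^b$c,d' | sed 's/[,^$]/|/g')
printf '%s\n' "$out"
```

No pipe appears at the start or end of the line, because nothing in the class acts as an anchor.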
The following sed command may help:
echo "a,b,c" | sed 's/^/|/;s/,/|/g;s/$/|/'
Output will be as follows.
|a|b|c|

How do I push `sed` matches to the shell call in the replacement pattern?

I need to replace several URLs in a text file with some content dependent on the URL itself. Let's say for simplicity it's the first line of the document at the URL.
What I'm trying is this:
sed "s/^URL=\(.*\)/TITLE=$(curl -s \1 | head -n 1)/" file.txt
This doesn't work, since \1 is not set. However, the shell is getting called. Can I somehow push the sed match variables to that subprocess?
The accepted answer is just plain wrong. Proof:
Make an executable script foo.sh:
#! /bin/bash
echo $* 1>&2
Now run it:
$ echo foo | sed -e "s/\\(foo\\)/$(./foo.sh \\1)/"
\1
$
The $(...) is expanded before sed is run.
So you are trying to call an external command from inside the replacement pattern of a sed substitution. I don't think it can be done; the $... inside a pattern just lets you use an already existing (constant) shell variable.
I'd go with Perl, see the /e option in the search-replace operator (s/.../.../e).
UPDATE: I was wrong, sed plays nicely with the shell, and it allows you to do that. But then the backslash in \1 should be escaped. Try instead:
sed "s/^URL=\(.*\)/TITLE=$(curl -s \\1 | head -n 1)/" file.txt
Try this:
sed "s/^URL=\(.*\)/\1/" file.txt | while read url; do sed "s#URL=\($url\)#TITLE=$(curl -s $url | head -n 1)#" file.txt; done
If there are duplicate URLs in the original file, then there will be n^2 of them in the output. The # as a delimiter depends on the URLs not including that character.
Late reply, but to make sure people don't get thrown off by the answers here: this can be done in GNU sed using the e flag of the s command, which executes the result of the substitution as a shell command. The following, for example, decrements a number at the beginning of a line:
echo "444 foo" | sed "s/\([0-9]*\)\(.*\)/expr \1 - 1 | tr -d '\n'; echo \"\2\";/e"
will produce:
443 foo
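Where the GNU-only e flag is not available, a plain shell loop can do the per-line work instead. This is a sketch under assumptions: title_for and retitle are hypothetical names, and title_for is stubbed so the example runs offline; in real use its body would be the fetch from the question, curl -s "$1" | head -n 1.

```shell
#!/bin/sh
# title_for: hypothetical stand-in for `curl -s "$1" | head -n 1`,
# stubbed here so the sketch runs without network access.
title_for() {
    printf 'Title of %s\n' "$1"
}

# retitle: copies stdin to stdout, replacing each URL=... line with a
# TITLE=... line computed from that line's own URL.
retitle() {
    while IFS= read -r line; do
        case $line in
            URL=*) printf 'TITLE=%s\n' "$(title_for "${line#URL=}")" ;;
            *)     printf '%s\n' "$line" ;;
        esac
    done
}

printf '%s\n' 'name=demo' 'URL=http://example.com/a' | retitle
```

Here the command substitution runs once per matching line, inside the loop, rather than once before sed starts.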