sed find replace (inplace replace) using regex

I need to find and replace certain text in many files. I am trying to use sed to do the replacement. Here is what I am trying to do:
Find:
<font size="4" face="verdana, arial,geneva"><b>([^<]*)</b></font>
replace with:
<font size="4" face="verdana, arial,geneva"><b><title>$1</title></b></font>
Essentially I want to add a <title></title> tag around whatever I find.
e.g. if the text is like:
<font size="4" face="verdana, arial,geneva"><b>THIS IS MY TITLE</b></font>
I want to replace it with:
<font size="4" face="verdana, arial,geneva"><b><title>THIS IS MY TITLE</title></b></font>
I have tried various commands, but none of them seems to work. Here are the commands that I have tried so far:
sed -e 's/<font size="4" face="verdana, arial,geneva"><b>\([^<]*\)<\/b><\/font>/<font size="4" face="verdana, arial,geneva"><b><title>\1<\/title><\/b><\/font>/g'
sed -r 's/<font size="4" face="verdana, arial,geneva"><b>([^<]*)<\/b><\/font>/<font size="4" face="verdana, arial,geneva"><b><title>\1<\/title><\/b><\/font>/g'
sed -E 's/<font size="4" face="verdana, arial,geneva"><b>([^<]*)<\/b><\/font>/<font size="4" face="verdana, arial,geneva"><b><title>\1<\/title><\/b><\/font>/g'

For me this works:
sed '/font *size *= *"4" *face/s|<b>\([^<]*\)</b>|<b><title>\1</title></b>|g'
My idea is to avoid as many escapes as possible and to split matching and substitution into two steps: the address selects the lines, and the s command rewrites them.
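
Since the question is about many files, here is a minimal sketch of the same address-plus-substitution idea applied in place. The temporary directory, the file names a.html and b.html, and the .bak suffix are assumptions made for illustration; the attached form -i.bak is used because it is accepted by both GNU and BSD sed.

```shell
# Work in a throwaway directory so the sketch cannot touch real files.
tmpdir=$(mktemp -d)
printf '%s\n' '<font size="4" face="verdana, arial,geneva"><b>THIS IS MY TITLE</b></font>' > "$tmpdir/a.html"
printf '%s\n' '<p><b>not a title</b></p>' > "$tmpdir/b.html"

# -i.bak edits each file in place and keeps the original as FILE.bak.
# The address /font size="4" face/ selects the lines to touch;
# the s command then wraps the bold text in <title>...</title>.
find "$tmpdir" -name '*.html' -exec \
  sed -i.bak '/font size="4" face/s|<b>\([^<]*\)</b>|<b><title>\1</title></b>|g' {} +

result_a=$(cat "$tmpdir/a.html")
result_b=$(cat "$tmpdir/b.html")
printf '%s\n%s\n' "$result_a" "$result_b"
```

Only the line carrying the font tag is rewritten; the bold text in b.html is left alone because the address does not match it.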

This sed line was basically built from copy and paste. Please try it:
$ echo '<font size="4" face="verdana, arial,geneva"><b>THIS IS MY TITLE</b></font>' | sed -r 's#(<font size="4" face="verdana, arial,geneva"><b>)([^<]*)(</b></font>)#\1<title>\2</title>\3#'
<font size="4" face="verdana, arial,geneva"><b><title>THIS IS MY TITLE</title></b></font>

Related

remove everything between two characters with sed

I'd like to remove everything between two characters, including the characters themselves, for example:
<img src=\"/wp-content/uploads/9e580e68ed249dec8fc0e668da78d170.jpg\" / hspace=\"5\" vspace=\"0\" align=\"left\">
I was trying
sed -i -e 's/<img src.*align=\\"left\\">//g' file

You do not say what version of sed you are using, or what shell.
With GNU sed and bash, your attempt was almost there. Try:
sed -i 's/<img src[^>]*align=\\"left\\">//g' file
Explanation:
s/<img src[^>]*align=\\"left\\">/ searches for <img src_STUFF_align=\"left\">, where _STUFF_ cannot contain any >
// replaces it with nothing
/g repeats the substitution for every match on the line
-i modifies the file in place
I believe this should work with most versions of sed (except for the -i option, whose syntax differs between implementations).
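
To see why [^>]* matters, here is a small sketch comparing the greedy .* from the question with the bounded class from the answer. The sample line with two image tags is made up for illustration.

```shell
# A made-up line with two tags and some text that should survive between them.
line='<img src=\"a.jpg\" align=\"left\">keep me<img src=\"b.jpg\" align=\"left\">'

# Greedy: .* runs from the first <img src to the LAST align=\"left\">,
# so the text between the two tags is deleted as well.
greedy=$(printf '%s\n' "$line" | sed 's/<img src.*align=\\"left\\">//g')

# Bounded: [^>]* cannot cross a >, so each tag is matched on its own.
bounded=$(printf '%s\n' "$line" | sed 's/<img src[^>]*align=\\"left\\">//g')

printf 'greedy:  [%s]\nbounded: [%s]\n' "$greedy" "$bounded"
```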

Can I prepend a line without creating a new line?

If I have a text file containing:
This is a line
Using sed, how can I do this:
<p>This is a line</p>
I have tried the following script:
i\<p>
a\</p>
but this gives me
<p>
This is a line
</p>
How can I achieve this?

Use s/// not append or insert.
$ echo 'This is a line' | sed 's~.*~<p>&</p>~'
<p>This is a line</p>
& in the replacement refers to the whole match.
OR
You could also do it like this:
$ echo 'This is a line' | sed 's~^~<p>~;s~$~</p>~'
<p>This is a line</p>
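
One thing to watch with a real file is that .* also matches empty lines, which would turn them into <p></p>. A possible variation, assuming blank lines should stay blank, is to restrict the substitution with the address /./ (lines containing at least one character). The three-line input is invented for the example.

```shell
# Sample input: two text lines separated by an empty line.
wrapped=$(printf 'first\n\nsecond\n' | sed '/./s~.*~<p>&</p>~')

# Non-empty lines are wrapped; the empty line passes through unchanged.
printf '%s\n' "$wrapped"
```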

You can also use awk:
echo 'This is a line' | awk '$0="<p>"$0"</p>"'
<p>This is a line</p>
Or more robust:
echo 'This is a line' | awk '{$0="<p>"$0"</p>"}1'
<p>This is a line</p>

Retrieve information Text/Word from HTML code using awk/sed

awk/sed newbie here. I have an HTML file, and I would like to retrieve a word from each line of it.
<font face=arial size=-1><li><a href=/value_for_clients/Tokyo/abc_process.txt>abc</a> NDK Version: 4.0 </li>
<font face=arial size=-1><li><a href=/value_for_clients/Tokyo/abc01_process.txt>abc01</a> NDK Version: 4.0 </li>
<font face=arial size=-1><li><a href=/value_for_clients/Tokyo/abc045_process.txt>abc045</a> NDK Version: 4.0 </li>
<font face=arial size=-1><li><a href=/value_for_clients/Tokyo/cdf_process.txt>cdf</a> NDK Version: 4.0 </li>
<font face=arial size=-1><li><a href=/value_for_clients/Tokyo/Manhattan_process.txt>Manhattan</a> NDK Version: 4.0 </li>
For example, from the 1st line I would like to retrieve abc, which sits between .txt> and </a>.
I have used the following command, but as you can see the number of letters in the word keeps changing: abc, abc01, abc045, cdf, Manhattan.
awk -F\/ '{print substr($4,0,3)}' list.html
So this command only gets the output right for the 3-letter words. However, I want to extract the same information (abc01, abc045, cdf, Manhattan) from all the lines in the HTML code. Please help.

Using awk:
awk -F'[<>]' '{print $7}' urls
abc
abc01
abc045
cdf
Manhattan
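
To see where field 7 comes from, it can help to print every field with its index. This is only an illustrative sketch using the first sample line from the question; with -F'[<>]' every < and > is a separator, so the empty strings between adjacent tags count as fields too.

```shell
line='<font face=arial size=-1><li><a href=/value_for_clients/Tokyo/abc_process.txt>abc</a> NDK Version: 4.0 </li>'

# Number each field so the position of the link text is visible.
printf '%s\n' "$line" | awk -F'[<>]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'

# The link text is the 7th field.
name=$(printf '%s\n' "$line" | awk -F'[<>]' '{ print $7 }')
printf '%s\n' "$name"
```

The field number depends on how many tags precede the link, so this approach only holds while every line has the same layout.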

You could try:
perl -nE '/<a href.*?>(.*?)<\/a>/; say $1' file
Output:
abc
abc01
abc045
cdf
Manhattan

$ sed -n 's/.*txt>\([[:alnum:]]\+\)<.*/\1/p' list.html
abc
abc01
abc045
cdf
Manhattan
Or:
$ awk -F'(txt>|</a)' '{print $2}' list.html
abc
abc01
abc045
cdf
Manhattan

I use sed or awk to extract it. Here, I saved the original data to the file /tmp/html.txt.
Both of them use a regular expression with a back-reference.
Via sed:
$ sed -r -n 's#.*<a [^>]*>(.*)</a>.*#\1#p' /tmp/html.txt
abc
abc01
abc045
cdf
Manhattan
Via awk, using the gensub function (GNU awk only):
$ awk '{print gensub(/.*<a [^>]*>(.*)<\/a>.*/,"\\1",1,$0)}' /tmp/html.txt
abc
abc01
abc045
cdf
Manhattan

Using GNU grep:
grep -Po "<a href.*?>\K[^<]*" file

How to add new line using sed on MacOS?

I want to add a newline between </a> and <a><a>, turning
</a><a><a>
into
</a>
<a><a>
I did this:
sed 's#</a><a><a>#</a>\n<a><a>#g' filename
but it didn't work.

On a Mac there are two ways to do it, with the stock sed or with GNU sed (gsed):
echo foo | sed 's/f/f\'$'\n/'
echo foo | gsed 's/f/f\n/g'

Some seds, notably the Mac / BSD one, don't interpret \n as a newline in the replacement. You need to use an actual newline, preceded by a backslash:
$ echo foo | sed 's/f/f\n/'
fnoo
$ echo foo | sed 's/f/f\
> /'
f
oo
$
Or you can use:
echo foo | sed $'s/f/f\\\n/'
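
Applied to the tags from the question, a sketch that avoids the bash-only $'...' quoting is to keep the newline in a variable. The variable name nl is just an illustrative choice; the backslash-newline form in the replacement is accepted by both BSD and GNU sed.

```shell
# A variable holding a single literal newline.
nl='
'

# Inside double quotes, \\ becomes one backslash and ${nl} expands to the newline,
# so sed receives a backslash followed by a real newline in the replacement.
out=$(printf '%s\n' '</a><a><a>' | sed "s#</a><a><a>#</a>\\${nl}<a><a>#g")

printf '%s\n' "$out"
```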

...or you just pound on it! This worked for me for an insert on Mac / OS X:
sed "2 i \\\n${TEXT}\n\n" -i ${FILE_PATH_NAME}
sed "2 i \\\nSomeText\n\n" -i textfile.txt

Find and replace string with sed

I need to do a multi-file find and replace with nothing (delete) using sed. I need to replace the line:
<meta name="keywords" content="there could be anything here">
With '' (nothing) in all files in and under the current dir.
I have got this so far:
sed -e 's/<meta name="keywords" content=".*>//g' myfile.html
But I know this is only going to remove the < or > tags. How can I match against
<meta name="keywords" content="
and delete everything from that to the next
>
I also need to do it for all files in and under (recursively) the current directory.
Thanks in advance!

sed has a delete command (d). Try using:
sed -e '/<meta name="keywords"/d' myfile.html
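
The command above prints the result for a single file. For the recursive, in-place part of the question, one possible sketch combines it with find. The temporary directory, the sub/page.html layout, and the .bak backup suffix are assumptions for the example. Note that d removes the whole line, so this suits files where the meta tag sits on a line of its own.

```shell
# Build a small throwaway tree to run against.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/sub"
printf '%s\n' '<head>' \
  '<meta name="keywords" content="there could be anything here">' \
  '</head>' > "$tmpdir/sub/page.html"

# Recurse from the top directory and delete matching lines in place,
# keeping the original of each file as FILE.bak.
find "$tmpdir" -type f -name '*.html' -exec \
  sed -i.bak '/<meta name="keywords"/d' {} +

remaining=$(cat "$tmpdir/sub/page.html")
printf '%s\n' "$remaining"
```

If the tag shares a line with other markup, a substitution such as s/<meta name="keywords"[^>]*>// removes only the tag instead of the line.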
