Hi, I'm trying to append a few lines of code to a couple thousand HTML files in a directory (and its sub-directories). What I'm trying to do is add a few lines of code to all HTML files, right after the </h1> tag. I've tried sed, but I'm having trouble with the / character inside the search pattern and with passing several lines of code to the sed command.
I'm thinking of putting the lines I want to add in a txt file and using sed to place the content of that file after the tag.
Any help is much appreciated.
Say sample.html contains this:
<html>
<head>
</head>
<h1>Title</h1>
<body>
etc
I want to add this after the </h1> element:
<script>
etc.
</script>
<iframe>
</iframe>
to produce this:
<html>
<head>
</head>
<h1>Title</h1>
<script>
etc.
</script>
<iframe>
</iframe>
<body>
etc
Assuming you want to place the text after the line containing the H1 end tag, and that the text to insert is stored in new_text.html:
sed -i '/<\/h1>/r new_text.html' sample.html
Another solution:
Content of script.sed
/<\/h1>/ {
a\
<script>\
etc.\
</script>\
<iframe>\
</iframe>
}
Run it like:
sed -i -f script.sed sample.html
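Neither answer covers the "couple thousand files in sub-directories" part of the question. A minimal sketch of one way to do it, combining find with the r command shown above. It assumes GNU sed (for -i); the directory layout and the names demo, site and snippet.txt are made up for the demonstration.

```shell
#!/bin/sh
# Throwaway demo tree (hypothetical names: demo, site, snippet.txt).
demo=$(mktemp -d)
mkdir -p "$demo/site/sub"
printf '%s\n' '<html>' '<h1>Title</h1>' '<body>' > "$demo/site/a.html"
printf '%s\n' '<h1>Other</h1>' '<p>text</p>' > "$demo/site/sub/b.html"

# The lines to insert live in a text file outside the tree being edited.
printf '%s\n' '<script>' 'etc.' '</script>' '<iframe>' '</iframe>' > "$demo/snippet.txt"

# Visit every .html file below site/ and append the snippet after each
# line containing </h1>. -i (in-place editing) is a GNU sed extension.
find "$demo/site" -type f -name '*.html' \
    -exec sed -i "/<\/h1>/r $demo/snippet.txt" {} +

cat "$demo/site/sub/b.html"
```

Keeping the snippet file outside the searched tree (or giving it a non-.html name) avoids find handing it to sed as well. Running the command on a copy of the tree first is a sensible precaution, since -i overwrites the files.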
Related
I want to add a new line to an HTML file using sed.
The line I want to add is
<link href="https://newvalue.css" rel="test1" id="test2">
After
<link href="test.css" rel="test1" id="test2">
in an HTML file.
Can anyone help?
Use sed with the a (append) command, like so:
sed -i '/<link href="test.css" rel="test1" id="test2">/a<link href="https://newvalue.css" rel="test1" id="test2">' file
Search for the line with /.../, then use a (append) followed by the text to add. Writing the text on the same line as a is a GNU sed extension; other seds expect a\ followed by a newline.
I want to remove script calls from the HTML with the following script.
var=$(sed -e '/^<script.*</script>$/d' -e '/.js/!d' testFile.html)
sed -i -e "/$var/d" testFile.html
Sample input file:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>JavaScript</title>
<script type="text/javascript" src="script.js" language="javascript">
</script>
<script>
// script code
</script>
</head>
<body>
</body>
</html>
Sample output file:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>JavaScript</title>
</script>
<script>
// script code
</script>
</head>
<body>
</body>
</html>
But it gives the following error:
sed: -e expression #1, char 23: unterminated `s' command
Thanks in advance
Try this:
root#isadora:~/temp# sed -e '/^<script/,/<\/script>/d' aaaa.html
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>JavaScript</title>
</script>
</head>
<body>
</body>
</html>
root#isadora:~/temp#
Regards.
It is unclear why you break this up into two separate scripts or what you hope for the variable to contain. This can be performed trivially with a single script.
The immediate problem is that you cannot use a literal unescaped slash in a regex if you use slash as the regex separator. Either use a different separator, or backslash-escape any literal slashes.
sed -i -e '\#^<script.*</script>$#d' -e '/\.js/!d' testFile.html
Notice also the backslash before the dot (an unescaped dot in a regex matches any character, so /.js/ matches e.g. the string notjs.)
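Both points can be checked on tiny inputs. This is only an illustration of the regex behaviour described above, not a complete script remover; the variable names loose, strict and kept are made up for the demo.

```shell
#!/bin/sh
# /.js/ : the unescaped dot matches any character, so "notjs" matches too.
loose=$(printf '%s\n' 'notjs' 'script.js' | sed -n '/.js/p')

# /\.js/ : only a literal ".js" matches.
strict=$(printf '%s\n' 'notjs' 'script.js' | sed -n '/\.js/p')

# \#...# : with # as the regex delimiter, the slash in </script>
# needs no escaping.
kept=$(printf '%s\n' '<script src="a.js"></script>' '<p>keep</p>' |
    sed '\#^<script.*</script>$#d')

printf 'loose:\n%s\nstrict:\n%s\nkept:\n%s\n' "$loose" "$strict" "$kept"
```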
Assume the following code snippet:
<head>
<script>....</script>
<script>....</script>
</head>
<body>
<script>
some stuff
a change
more stuff
more changes
more stuff
}
}
}
}
final changes
</script>
</body>
I need to add something right before the last </script>, at the spot marked final changes above. How can I tell sed to match that one? The text final changes doesn't exist in the real file; the last lines of the script are just four or five } lines, so I'd need to match across multiple lines.
All the other changes were made by matching a line, then replacing it with the line plus the changes. But I don't know how to match across lines to replace </script></body> with final changes </script></body>.
I tried the same tactic I use for replacing with multiple lines, but it didn't work; sed keeps reporting an unterminated substitute pattern.
sed 's|</script>\
</body>|lalalalala\
</script>\
</body>|' file.html
I've read this question Sed regexp multiline - replace HTML but it doesn't suit my particular case because it matches everything between the search options. I need to match something, then add something before the first search operator.
sed, grep, awk etc. are NOT for XML/HTML processing.
Use a proper XML/HTML parsers.
xmlstarlet is one of them.
Sample file.html:
<html>
<head>
<script>....</script>
<script>....</script>
</head>
<body>
<script>
var data = [0, 1, 2];
console.log(data);
</script>
</body>
</html>
The command:
xmlstarlet ed -O -P -u '//body/script' -v 'alert("success")' file.html
The output:
<html>
<head>
<script>....</script>
<script>....</script>
</head>
<body>
<script>alert("success")</script>
</body>
</html>
http://xmlstar.sourceforge.net/doc/UG/xmlstarlet-ug.html
Finally got this working by following xara's answer in https://unix.stackexchange.com/questions/26284/how-can-i-use-sed-to-replace-a-multi-line-string
In summary, instead of trying to do magic with sed, replace the newlines with a character which sed understands (like \r), do the replace and then replace the character with newline again.
For example;
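A minimal sketch of that newline-swapping approach, applied to the </script></body> case from the question. It assumes GNU sed (which understands \r in both the pattern and the replacement), a file with Unix line endings and no carriage returns of its own, and </body> directly on the line after </script>. The names page and result are made up for the demo.

```shell
#!/bin/sh
# Small stand-in for the real file (hypothetical content).
page=$(mktemp)
printf '%s\n' '<head>' '<script>....</script>' '</head>' \
    '<body>' '<script>' 'some stuff' '}' '}' '</script>' '</body>' > "$page"

# 1. Turn every newline into \r so sed sees the file as one long line.
# 2. Insert the new text before the </script> that is followed by </body>.
# 3. Turn every \r back into a newline.
result=$(tr '\n' '\r' < "$page" |
    sed 's|</script>\r</body>|final changes\r</script>\r</body>|' |
    tr '\r' '\n')

printf '%s\n' "$result"
```

The </script> tags in the head are left alone because they are not followed by </body>. To write the result back, redirect it to a new file and move that over the original, rather than redirecting onto the file being read.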
I'd like to replace the /test src path, but only inside the <img> tag.
The <p>test</p> line should not be touched.
$ cat test.html
<img src="/test" width="18" alt="" /><br>
<p>test</p>
For now I can run something like:
sed -i 's|/test|/hoge|g' test.html
However it changes the word globally.
sed '/<img/s|/test|/hoge|g' test.html works as long as each <img tag sits on a single line.
Sed allows the s///g replacement to be prefixed with another /PATTERN/ to restrict the replacement to lines matching PATTERN.
But you should really use an xml parser to be safe.
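A quick way to see the /PATTERN/ prefix at work. The <a href="/test"> line is not in the original test.html; it is added here only so that there is a second line an unrestricted s|/test|/hoge|g would also change. The names sample and restricted are made up for the demo.

```shell
#!/bin/sh
sample=$(mktemp)
printf '%s\n' \
    '<img src="/test" width="18" alt="" /><br>' \
    '<a href="/test">test</a>' \
    '<p>test</p>' > "$sample"

# Only lines matching /<img/ are handed to the s command;
# every other line is printed unchanged.
restricted=$(sed '/<img/s|/test|/hoge|g' "$sample")

printf '%s\n' "$restricted"
```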
Another approach with sed:
sed -i 's|\(<img *src="/\)test|\1hoge|' test.html
<img *src="/ is captured and backreferenced using \1 in substitution string.
The string that follows it (test) is replaced with hoge.
I have two web pages, one page has been created by hand, the other has been published with Visual Studio 2010 (.aspx). I want to modify the content of these files, replacing a bunch of script tags by a single script tag. To achieve this goal, I simply run some Perl code from a batch file. Here is the Perl code and the HTML before and after substitution:
Perl in a batch file:
perl -pi.backup -e "s/<!--\s*<pack>\s*-->.*?<!--\s*<\/pack>\s*-->/<script src=\"pack.js\"><\/script>/s" file.aspx
HTML input:
<!-- <pack> -->
<script src="file1.js" type="text/javascript"></script>
<script src="file2.js" type="text/javascript"></script>
<!-- </pack> -->
HTML output:
<script src="pack.js"></script>
Everything works fine for the hand created file, while the generated file is not updated unless all lines are gathered into one. I guess the issue comes from linebreaks but I can't figure out why it does work only for the first file since the code is exactly the same.
Your problem is that running Perl with the -p switch causes it to execute the code for each line and print the result. Thus the regex is only seeing one line of the file at a time, and is never able to match the entire pattern.
You could do something like this:
perl -i.backup -e "undef $/; $_=<>; s/<!--\s*<pack>\s*-->.*?<!--\s*<\/pack>\s*-->/<script src=\"pack.js\"><\/script>/s; print" file.aspx
It slurps the whole file into $_, then performs your substitution and prints the result to the same file.