Minify HTML files in text/html templates - sed

I use Mustache/Handlebars templates.
e.g.:
<script id="contact-detail-template" type="text/html">
<div>... content to be compressed </div>
</script>
I am looking to compress/minify my HTML files in the templates for the best compression.
YUI Compressor and Closure do not work: they treat the template content as script and give me script errors.
HTMLCompressor does not touch the templates at all, because it also treats them as script.
How do I minify the content in the script tags with type text/html?
Can I use a library?
If not, is sed or egrep a preferable way? Do you have sed/egrep syntax to remove empty lines (with just spaces or tabs), remove all tabs, trim extra spaces?
Thanks.

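sed -e 's/^[ \t]*//' -e '/^$/d' yourfile
This removes all leading spaces and tabs from each line and deletes the lines that are left empty.
sed -e ':a;N;$!ba' -e 's/\n[ \t]*//g' -e 's/^[ \t]*//' yourfile
This removes all leading spaces and tabs from each line and joins all of your code onto a single line. The lines are collected first, because a substitution placed before the loop would only ever see the first line.
Use single quotes: inside double quotes the shell expands $! (and interactive bash treats ! as history expansion) before sed sees the script. The \t escape inside brackets is a GNU sed extension.
Sorry if I missed something.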

Use sed ':a;N;$!ba;s/>\s*</></g' file to remove whitespace and newlines where they are not needed. Unlike ghaschel's example, this does not strip whitespace at the beginning of a line, so text inside <pre> and <p> tags is preserved.
It removes the whitespace between > and <, which is a common source of bloat in an HTML file. The same command also works for XML files such as Atom and RSS feeds.
I personally use this as a pipe step in my site generator. It reduces the size of a typical file and can be used in conjunction with gzip.
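If the whitespace should only be removed inside the text/html template blocks, rather than across the whole file, the same > ... < substitution can be scoped to those blocks. The following is a minimal Python sketch of that idea, not a full HTML minifier. The function names are made up for illustration, and it assumes the type attribute is written exactly as type="text/html" with double quotes and that templates are not nested.

```python
import re

# Matches <script ... type="text/html" ...>BODY</script> and captures the
# opening tag, the template body, and the closing tag separately.
TEMPLATE_RE = re.compile(
    r'(<script\b[^>]*\btype="text/html"[^>]*>)(.*?)(</script>)',
    re.DOTALL | re.IGNORECASE,
)


def minify_template_body(body: str) -> str:
    """Remove whitespace between adjacent tags and trim both ends of the body."""
    return re.sub(r">\s+<", "><", body).strip()


def minify_templates(html: str) -> str:
    """Minify only the bodies of text/html script templates; leave the rest alone."""
    return TEMPLATE_RE.sub(
        lambda m: m.group(1) + minify_template_body(m.group(2)) + m.group(3),
        html,
    )


if __name__ == "__main__":
    page = (
        '<script id="contact-detail-template" type="text/html">\n'
        "  <div>\n"
        "    <span>{{name}}</span>\n"
        "  </div>\n"
        "</script>\n"
    )
    print(minify_templates(page))
```

Whitespace inside text nodes (for example between two words) is left untouched, which matches the behaviour of the sed command above.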

Try using Pretty Diff to minify this kind of code. It will only assume the stuff inside script tags is JavaScript if there is no mime type or if the type is one of the various JavaScript types. It is also intelligent enough to know which white space is okay to remove without corrupting the output of content or the recursive beautification of code later.

Related

Prevent newline in (.md) files

How do I prevent newlines in the readme.md files (GitHub)?
We can always write the whole thing on one line to prevent it. But is there a dedicated tag or option for this, particularly for elements that force a new line (such as headings), similar to span in HTML?
Doesn't a space followed by a backslash do the concatenation you want? It does for me. That way I can break a paragraph into one sentence per line.

Is it possible to break a line on gist?

When writing in a gist in Markdown mode, we need to enter a blank line between two lines if we want to break a line (add a new line).
Is it possible to break a line without that blank line between them?
As Ryan said, in markdown the most conventional way of creating a linebreak is by adding two spaces at the end of a line. Though GFM also supports the use of basic HTML blocks, so <br> can also be used to create linebreaks, which can be helpful where multiple linebreaks are needed.

How to add empty spaces into MD markdown readme on GitHub?

I'm struggling to add empty spaces before the string starts to make my GitHub README.md look something like this:
Right now it looks like this:
I tried adding a <br /> tag to start the new line, and that works, but I don't understand how to add spaces before the string starts without changing everything to &nbsp;. Maybe there's a more elegant way to format it?
You can use <pre> to display all spaces & blanks you have typed. E.g.:
<pre>
hello, this is
just an example
....
</pre>
Markdown really does convert everything to HTML, and HTML collapses spaces, so you can't do anything about it directly. You have to use &nbsp; for it. A funny example: I'm writing this in Markdown and I'll use a couple of &nbsp; here.
Above there are some &nbsp; without backticks.
Instead of using HTML entities like &nbsp; and &emsp; (as others have suggested), you can use the Unicode em space (U+2003, decimal code point 8195) directly. Try copy-pasting the following into your README.md. The spaces at the start of the lines are em spaces.
The action of every agent <br />
  into the world <br />
starts <br />
  from their physical selves. <br />
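Because em spaces are invisible and easy to lose when copy-pasting, it can help to generate such lines instead of typing them. This is a small Python sketch of that approach; the function name and the depth list are made up for illustration.

```python
EM_SPACE = "\u2003"  # U+2003 EM SPACE, decimal code point 8195


def indent_with_em_spaces(lines, depths):
    """Prefix each line with `depth` em spaces and end it with a <br /> tag."""
    return "\n".join(
        f"{EM_SPACE * depth}{text} <br />" for text, depth in zip(lines, depths)
    )


if __name__ == "__main__":
    text = indent_with_em_spaces(
        ["The action of every agent", "into the world", "starts",
         "from their physical selves."],
        [0, 2, 0, 2],
    )
    print(text)
    # In UTF-8 the em space is stored as the three bytes E2 80 83.
    print(EM_SPACE.encode("utf-8"))
```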
I'm surprised no one mentioned the HTML entities &ensp; and &emsp;, which produce horizontal white space equivalent to the width of the characters n and m, respectively. If you want to accumulate horizontal white space quickly, those are more efficient than &nbsp;.
no space
&nbsp;
&ensp;
&emsp;
Along with <space> and &thinsp;, these are the five entities HTML provides for horizontal white space.
Note that except for &nbsp;, all of these entities allow breaking: whatever text surrounds them will wrap to a new line if it would otherwise extend beyond the container boundary. With &nbsp;, the joined text would wrap to a new line as a block even if the text before it could fit on the previous line.
Depending on your use case, that may be desired or undesired. For me, unless I'm dealing with things like names (John Doe), addresses or references (see eq. 5), breaking as a block is usually undesired.
Markdown gets converted into HTML/XHTML.
John Gruber created the Markdown language in 2004 in collaboration with Aaron Swartz on the syntax, with the goal of enabling people to write using an easy-to-read, easy-to-write plain text format, and optionally convert it to structurally valid HTML (or XHTML).
In HTML, &nbsp; is the way to add extra spaces if no external JavaScript or CSS is defined or used for the elements.
Markdown is a lightweight markup language with plain text formatting syntax. It is designed so that it can be converted to HTML and many other formats using a tool by the same name.
If you want to use »
only one space » either use &nbsp; or just hit the Spacebar (the second is the better choice in this case)
more than one space » use &nbsp;+space (for 2 consecutive spaces)
e.g. if you want to add 10 contiguous spaces, you should use
&nbsp;space&nbsp;space&nbsp;space&nbsp;space&nbsp;space
instead of writing 10 &nbsp; entities one after another.
For more details check
Adding multiple spaces between text in Markdown,
How to create extra space in HTML or web page.
After several attempts, I ended up with this solution, since most Markdown interpreters support a math environment.
The following adds one white space:
$~$
And here ten:
$~~~~~~~~~~~$
As a workaround, you can use a code block to render the code literally. Just surround your text with triple backticks ```. It will look like this:
2018-07-20 Wrote this answer
Can format it without &nbsp;
Also don't need <br /> for new line
Note that with <pre> and <code> you get slightly different behaviour: &nbsp; and <br /> will be parsed rather than inserted literally.
<pre>:
2018-07-20 Wrote this answer
Can format it without
Also don't need for new line
<code>:
2018-07-20 Wrote this answer
Can format it without
Also don't need for new line
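You can also use spaces from the known list:
'hair space'             &#8202;  &hairsp;
'6-per-em space'         &#8198;
'narrow no-break space'  &#8239;
'thin space'             &#8201;  &thinsp;
'4-per-em space'         &#8197;  &emsp14;
'no breaking space'      &#160;   &nbsp;
'punctuation space'      &#8200;  &puncsp;
'3-per-em space'         &#8196;  &emsp13;
'en space'               &#8194;  &ensp;
'figure space'           &#8199;  &numsp;
'em space'               &#8195;  &emsp;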
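To check which character a named entity actually stands for, the entity can be decoded and its code point printed. This is a quick sketch using Python's standard html module; the helper name is made up, and the list only covers the named entities from the answer above.

```python
import html

NAMED_SPACES = [
    "&hairsp;", "&thinsp;", "&emsp14;", "&nbsp;", "&puncsp;",
    "&emsp13;", "&ensp;", "&numsp;", "&emsp;",
]


def code_points(entities):
    """Map each entity string to the decimal code point it decodes to."""
    return {entity: ord(html.unescape(entity)) for entity in entities}


if __name__ == "__main__":
    for entity, cp in code_points(NAMED_SPACES).items():
        print(f"{entity:<10} U+{cp:04X} (decimal {cp})")
```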
I have tried many methods on GitHub Markdown.
Only starting the line with </br>, followed by a normal empty line underneath, works for me.
(So two lines in total: one with just </br> and one empty.)
A single </br> line produces the line break. The empty line underneath keeps it from breaking the formatting of the content that follows.

Silverstripe CMS wysiwyg incorrectly decoding/encoding quotes

I've encountered an issue that I've never come across before with Silverstripe when saving content in the CMS.
When saving in the Content wysiwyg (or any other fields I've added), it is escaping the quotes and apostrophes. For example, applying a style to a piece of content through the style dropdown produces this underlying HTML:
<p class="my-style">lorem ipsum</p>
When I press save, the page reloads and the style is not shown. When inspecting the HTML put back into the wysiwyg, I am getting:
<p class="\"my-style\"">lorem ipsum</p>
Initially, my thoughts were that the content field was maybe set to Text rather than HTMLText but I've checked and found this not to be the case.
Anyone have any ideas? I've built numerous sites in Silverstripe previously and this is the first time I've encountered this behaviour.
I'm using 3.1.0
Cheers
As I've mentioned, I think this is a PHP issue with the escaping of double/single quotes. It's a symptom of magic quotes.
Magic Quotes was an "(annoying) security feature" that was enabled by default in older PHP versions, deprecated in PHP 5.3.0, and removed in PHP 5.4.0.
In a nutshell, here's what magic quotes does (taken from the PHP website):
When on, all ' (single-quote), " (double quote), \ (backslash) and NULL characters are escaped with a backslash automatically. This is identical to what addslashes() does.
This may be what you are experiencing.
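To make the effect concrete, here is a rough Python emulation of the escaping described in the quote, applied to the markup from the question. It is an illustration only, not PHP's actual implementation; in particular, rendering the NUL character as a backslash followed by 0 is an assumption about addslashes() output.

```python
def addslashes(value: str) -> str:
    """Backslash-escape single quotes, double quotes, backslashes and NUL."""
    out = []
    for ch in value:
        if ch == "\0":
            out.append("\\0")  # assumption: NUL is written as backslash + "0"
        elif ch in ("'", '"', "\\"):
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    submitted = '<p class="my-style">lorem ipsum</p>'
    # With magic_quotes_gpc on, the POSTed value would reach the CMS escaped:
    print(addslashes(submitted))
```

The escaped attribute value is no longer valid markup, which would explain why the editor re-serialises it into the broken class attribute shown in the question.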
Disabling Magic Quotes
On to the solution.
If you have access to your main php.ini, just turn it off like so:
; Magic quotes
;
; Magic quotes for incoming GET/POST/Cookie data.
magic_quotes_gpc = Off
; Magic quotes for runtime-generated data, e.g. data from SQL, from exec(), etc.
magic_quotes_runtime = Off
; Use Sybase-style magic quotes (escape ' with '' instead of \').
magic_quotes_sybase = Off
If you don't have access to the main php.ini:
add this line to the root .htaccess file (if you're using Apache with mod_php)
php_flag magic_quotes_gpc Off
or if you're running PHP as a CGI, create a php.ini file on your document root and put the previously mentioned snippet for php.ini.
Hope this helps!
Very peculiar...
I read the symptoms as a double double quote; if you parse
<p class="\"my-style\"">lorem ipsum</p>
it's going to appear as
<p class=""my-style"">lorem ipsum</p>
I usually define my styles in the typography.css file and it automatically appears in the "Styles" drop-down in the WYSIWYG editor.
Can you try this out and let me know if it helps?
Thanks!

regex_replace to replace certain html tags

Is there a way to convert BR tags and/or DIV tags to new lines so it will format correctly when I use an in a mailto? I was thinking I should look for any P, DIV, and BR tags and replace them with a new line character. So anywhere there is a closing tag put the new line character and remove the opening tag. After I do the above I will remove the rest of the html with remove_html="1" but I want to keep the paragraph format.
I thought it can be done using regex_replace but I'm not sure how to write it. Anyone know?
Do not parse HTML with regex. Use an HTML parser module (HTML::TreeBuilder or something similar that can make in-place changes) or, even better in this case, XSLT transformations.
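As an illustration of the parser-based approach, here is a minimal sketch in Python using the standard library's html.parser rather than the Perl module named above. It emits a newline for each BR tag and after each closing P or DIV tag, and drops all other markup. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class TextWithBreaks(HTMLParser):
    """Collect text, adding a newline for <br> and after </p> or </div>."""

    BLOCK_TAGS = {"p", "div"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <br> and <br /> both arrive here; opening <p>/<div> tags are dropped.
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup: str) -> str:
    """Return the text of `markup` with paragraph and line breaks as newlines."""
    parser = TextWithBreaks()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    print(html_to_text("<div>Hello<br>World</div><p>Second <b>para</b></p>"))
```

For a mailto link, the resulting newlines would still need to be percent-encoded before being placed in the body parameter.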