dealing with itextsharp XMLWorkerHelper.ParseXHTML strict behavior - itext

While trying to use XMLWorkerHelper.GetInstance().ParseXHTML() I find that it is really strict. Any misordered or unclosed tag causes it to throw an exception.
I am converting HTML that I have no control over.
Are there any flags to make it less strict? An input callback interface to handle funny markup? Anything in the itextsharp.tools.xml.html? Or an entirely new library compatible with itextsharp.text.IElement?

The names of the class and that method pretty much sum it up: you can't. The entire pipeline is based on the assumption that a valid XML document will be passed in; anything else will throw an exception. You can customize the pipeline and add your own handlers for things like link resolution, custom CSS properties and new HTML tags, but the core document processor still needs valid XHTML.
I would recommend looking into running your HTML through a library that can convert it to XHTML.
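As a rough illustration of what such a repair step does, here is a minimal sketch in Python (not C#) using only the standard library. The names XhtmlRepairer, to_xhtml and is_well_formed are made up for this example, and the recovery rules (close dangling tags, drop stray closing tags, self-close void elements) are assumptions. A real tidy library covers far more cases, for instance duplicate attributes, which this sketch does not handle.

```python
from html import escape
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take a closing tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class XhtmlRepairer(HTMLParser):
    """Re-emit tag soup as well-formed XML (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments
        self.stack = []  # elements that are currently open

    def _attrs(self, attrs):
        # Boolean attributes (value None) get their own name as the value.
        return "".join(
            ' {}="{}"'.format(k, escape(v if v is not None else k, quote=True))
            for k, v in attrs
        )

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            self.out.append("<{}{} />".format(tag, self._attrs(attrs)))
        else:
            self.out.append("<{}{}>".format(tag, self._attrs(attrs)))
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        self.out.append("<{}{} />".format(tag, self._attrs(attrs)))

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        # Close everything opened after the matching tag, then the tag itself.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</{}>".format(open_tag))
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        self.close()  # flush any buffered text
        while self.stack:
            self.out.append("</{}>".format(self.stack.pop()))
        return "".join(self.out)


def to_xhtml(html):
    parser = XhtmlRepairer()
    parser.feed(html)
    return parser.result()


def is_well_formed(xml):
    """True if a strict XML parser accepts the string."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False
```

Calling to_xhtml on a fragment with an unclosed br and overlapping b and i tags yields markup that a strict XML parser accepts, which is the property the XML Worker pipeline relies on.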
EDIT
Also check out wkhtmltopdf. It uses WebKit to render HTML and (apparently) does a pretty good job.
How to use wkhtmltopdf.exe in ASP.net
wkhtmltopdf.exe System.Security.SecurityException on cloud web server. How can i override server security policy
C# html to pdf converter using wkhtmltopdf or any other free tools

Related

How to load html head tags from one source

Okay, for static pages: is there a way to load everything between the head tags (CSS, JavaScript, etc.) from one source so we don't have to load it in every HTML file? I know this may be a stupid question but I couldn't find one on here and if there was already a post about it, I guess I was stupid to miss it.
If your environment permits, you can use Server Side Includes http://httpd.apache.org/docs/current/howto/ssi.html which don't really involve traditional dynamic scripting languages or servlet technology. In any case, the HTML standard also allows you to reference external CSS and JS files; they don't have to be inline. If every page references them at the same URL, the browser will normally cache them and load them only once.

Using custom type attribute in <script> tags such as jQuery's text/x-jquery-tmpl

I noticed that jQuery's beta template plugin is using the type attribute "text/x-jquery-tmpl"
e.g
<script type="text/x-jquery-tmpl">
I've not seen custom use of the type attribute in the past. Has anyone seen current examples of this in use or perhaps ways mere mortal developers such as I can use this in our own code?
I presume that it's a sort of MIME type, however I would have thought that MIME type support was up to the browser. So I would have assumed that custom MIME types would be unsupported?
The type actually does indicate what sort of script is there. If the browser doesn't understand it, it should ignore it. In this case, it's a convenient and semantic sort of way to include the source of the template without displaying it on the screen.
Usually with jQuery templates, you'll give the script element an id and refer to it that way in your $(id).tmpl call.
script def here:
http://www.w3.org/TR/html401/interact/scripts.html#idx-scripting_language
examples of tmpl here:
http://api.jquery.com/tmpl/
No, MIME types are provided by the server to identify resources. The browser then acts on the types it recognizes.
Yes, in the HTTP connection the browser lists the types it can recognize so the server can choose the types that fit best (an example here would be HTML 5 and video, where you have some codec options and the browser may support only a subset).
In this case, the specific MIME type serves as a warning to the browser: "This is not normal JavaScript, don't act on it as if it were."

How to begin using HTML DOM

I have trouble understanding how some things are related.
For a Wordpress plugin, I would like to use HTML DOM on content from wp_remote_open to find a string.
In order to use DOM, does it have to be enabled by my webhost? Or do I include a DOM parsing script with the plugin?
I was thinking that if it needs to be enabled by the web hosting company, I would rather use a regular expression to find the string because then it would be compatible with everyone's installation.
DOM has nothing to do with your hosting provider or infrastructure. It is merely a model representing your HTML document. Most modern browsers support DOM. See more at the XML DOM introduction.
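To make the "parse it rather than regex it" idea concrete, here is a small sketch. The question is about PHP, so treat this Python version purely as an illustration of the technique; TextCollector and page_contains are hypothetical names, and skipping script and style content is an assumption about what "find a string" should mean.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the visible text of a page, ignoring script and style blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def page_contains(html, needle):
    """True if needle occurs in the page text after tags are stripped,
    entities are decoded and runs of whitespace are collapsed."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    text = " ".join("".join(collector.parts).split())
    return needle in text
```

A plain regular expression over the raw markup would miss a phrase that is interrupted by inline tags or written with entities such as &amp;amp;, while the parsed text matches it directly.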

Node/Express/Mongo: How do I render HTML attributes from dynamic content?

I have made a simple blog using Node/Express/Mongo/Jade (and/or HAML.js). I used (and slightly updated) the blog app from this tutorial, which is itself an update of one from howtonode.org
I can render attributes such as links, etc., with the template engine just fine, but when I pass data from the db, none of the html renders. I get plain text print-outs of the HTML. I figure I need some other node packages/modules to render the 'dynamic' content, but I don't know where to start.
In Jade, when you're outputting content you DON'T want to be escaped, be sure you write it with != instead of =
BE EXTREMELY CAREFUL THOUGH! If you don't sanitize that content yourself, you could leave your website wide open to attacks such as cross-site scripting.
You can read some more jade documentation here
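The difference between the two operators can be sketched outside Jade. The Python below is only an illustration of escaped versus raw output, not how Jade is implemented; render_escaped and render_raw are hypothetical names standing in for = and != respectively.

```python
from html import escape


def render_escaped(value):
    # Comparable to "=": markup in the value is shown as literal text.
    return "<div>{}</div>".format(escape(value))


def render_raw(value):
    # Comparable to "!=": the value is inserted as-is, so any markup in it
    # (including a <script> tag stored in the database) becomes live HTML.
    return "<div>{}</div>".format(value)
```

With a stored value of `<b>hi</b>`, the escaped version produces the plain-text printout described in the question, and the raw version produces bold text.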

Why do we use HTML helper in ASP.NET MVC?

Are there any benefits or best practices that come from using HTML helpers in an ASP.NET MVC project?
When I try to use them, I find that I lose the speed I have with plain HTML, and I run into many difficulties whenever I use an HTML helper.
Other [non-techie] people can't understand what I write with helpers. If I want to show them the code, or they want to change something, they need to spend more time on it, even if they have a working knowledge of HTML.
With plain HTML I just type without having to think about it. With helpers, the code is harder to write and harder to understand.
So what do we gain by using HTML helpers? It seems to me that I gain nothing: I lose speed, and others can't understand or customize the code.
Why do we use HTML helpers?
You use HTML helpers to encapsulate some small HTML fragments which are repeated all over your pages. And to avoid writing those HTML snippets all over again you use helpers.
They are very useful, especially when dealing with things like URLs, because instead of hardcoding your links, helpers take advantage of the routing definitions on your server. By simply changing those routes, every URL on the site changes without touching a single HTML page.
Another scenario where HTML helpers are useful is generating form input fields. In this case they can automatically handle values when posting back and show the associated validation messages. Can you imagine the spaghetti code you would have to write in your views if there weren't HTML helpers?
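The form-field case can be sketched in a few lines. This is a Python illustration of the idea, not ASP.NET MVC's actual helper; text_box is a hypothetical function, and the field-validation-error class name is just an example.

```python
from html import escape


def text_box(name, value="", error=None):
    """Build a text input, re-filling the posted value and appending a
    validation message when there is one (hypothetical helper)."""
    parts = ['<input type="text" name="{0}" id="{0}" value="{1}" />'.format(
        escape(name, quote=True), escape(value, quote=True))]
    if error:
        parts.append(
            '<span class="field-validation-error">{}</span>'.format(escape(error)))
    return "".join(parts)
```

Every view that needs such a field calls the one function, so escaping, ids and the error markup stay consistent and can be changed in one place.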
The biggest advantage I find is with the editor and display templates.
If your editor for a field is more than just a simple input box, you can put that into a template and replace the several tags with a call to
<%:Html.EditorFor(m=>m.Property)%>
This means that your page is a lot easier to edit as you aren't wading through a lot of fluff HTML to find what you want.