HTML 4.01 Strict + ARIA doctype for validation via the W3C Validator

Is there any <!doctype> for "HTML 4.01 Strict Markup + ARIA"?
I am currently using the HTML 4.01 Strict <!DOCTYPE>, and I obviously get an error when validating my page with the W3C Validator.
Is there any solution to this problem?
Also, when I use the Accessibility Toolbar, I get the following error pointing to the line after the closing HTML tag:
Line 256, Column 1: character data is not allowed here
Content-Disposition: form-data; name="charset"
You have used character data somewhere it is not permitted to appear.
Mistakes that can cause this error include:
putting text directly in the body of the document without wrapping it in a container element (such as a <p>aragraph</p>), or
forgetting to quote an attribute value (where characters such as "%" and "/" are common, but cannot appear without surrounding quotes), or
using XHTML-style self-closing tags (such as <meta ... />) in HTML 4.01 or earlier. To fix, remove the extra slash ('/') character. For more information about the reasons for this, see Empty elements in SGML, HTML, XML, and XHTML.

There is no published DTD for HTML 4.01 + ARIA. It would be possible to write one, if only there were a stable, exact document that specifies the allowed ARIA attributes and the HTML 4.01 elements on which they may be used. Using WAI-ARIA in HTML is still only a Working Draft, and it calls itself “a practical guide”, so I’m not sure how it should be interpreted. I guess the simplest, and possibly the only realistic, approach would be to write a DTD that allows all ARIA attributes on all elements, with the same set of values for all elements, even though this would violate many of the recommendations.
The other question seems to be unrelated, and should probably be asked as a separate question. And you should probably explain what accessibility toolbar you are referring to and what your HTML document contains.


Rendering telephone links in HTL based on input from a Rich Text widget

I have a component using the Rich Text Edit widget (xtype="richtext") in my project that's used across the entire site as the default text component.
The users would like to be able to insert phone links using the tel URI scheme into the text entered using this component.
The dialog allows them to do so but when the contents of the Rich Text Edit are rendered in Sightly/HTL later on, the html context is used:
${text @ context='html'}
Once this is done, the value of my attribute is ignored.
The HTML stored in the repository is:
Call us!
And what's actually rendered on the page on the author instance is:
<a>Call us!</a>
On the publish instance, the tag gets removed altogether because of the link checker.
Changing the context to unsafe causes the href to render but it's not a solution I'm willing to accept. The component is used in a lot of places and I want to be sure the XSS protection is sufficient.
Is there a way I can affect the way the html context in HTL treats telephone links?
I tried adding an extra regular expression to the overlay of apps/cq/xssprotection/config.xml:
<regexp name="onsiteURL" value="([\p{L}\p{N}\\\.\#@\$%\+&amp;;\-_~,\?=/!]+|\#(\w)+)"/>
<regexp name="offsiteURL" value="(\s)*((ht|f)tp(s?)://|mailto:)[\p{L}\p{N}]+[\p{L}\p{N}\p{Zs}\.\#@\$%\+&amp;;:\-_~,\?=/!]*(\s)*"/>
<regexp name="telephoneLink" value="tel:\+?[0-9]+"/>
and further on:
<attribute name="href">
<regexp-list>
<regexp name="onsiteURL"/>
<regexp name="offsiteURL"/>
<regexp name="telephoneLink"/>
</regexp-list>
<!-- Skipped for brevity -->
</attribute>
but that doesn't seem to affect the way Sightly/HTL escapes strings in the html context.
I've also tried overlaying the Sling XSS rules located in /libs/sling/xss/config.xml, but had no luck either.
How can it be done?
There are two XSS protection config files:
/libs/cq/xssprotection/config.xml
/libs/sling/xss/config.xml
Sightly uses the second one, which means that you need to overlay it at the path /apps/sling/xss/config.xml
It is worth mentioning that the new configuration seems to be applied only after a restart of your AEM instance.
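Before putting the telephoneLink expression from the question into the overlay, it can be sanity-checked on its own. The following is only a sketch: the class name and the sample numbers are made up for illustration, and it assumes the expression has to match the whole href value.

```java
import java.util.regex.Pattern;

final class TelLinkCheck {
    // Same expression as the telephoneLink entry in the question.
    static final Pattern TELEPHONE_LINK = Pattern.compile("tel:\\+?[0-9]+");

    // Assumption: the whole attribute value has to match the expression.
    static boolean isTelLink(String href) {
        return TELEPHONE_LINK.matcher(href).matches();
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not taken from the original post.
        String[] samples = {
            "tel:+123456789",
            "tel:123456789",
            "tel:+1-234-567",
            "javascript:alert(1)"
        };
        for (String href : samples) {
            System.out.println(href + " -> " + isTelLink(href));
        }
    }
}
```

Note that the expression as written accepts digits only, so numbers entered with hyphens or spaces would still be rejected.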

Why setHTML("<table><tr>..</tr></table>"); but then getHTML(); return "<table><tbody><tr>..</tr></tbody></table>" (Gwt)?

I don't understand how GWT's setHTML and getHTML work. They don't seem to be consistent.
Let's look at this example:
myInlineHtml.setHTML(SafeHtmlUtils.fromSafeConstant("<table><tr><td>Test</td></tr></table>"));
System.out.println(myInlineHtml.getHTML());
Output: "<table><tbody><tr><td>Test</td></tr></tbody></table>"
Clearly, when we set the HTML for myInlineHtml there is no <tbody></tbody>, but when we call getHTML on myInlineHtml, GWT includes <tbody></tbody>.
Why does that happen? It can be confusing when you read the HTML back and expect the same value you set, but get something different.
Does this happen independently of the browser, or does it depend on the browser? That would be serious.
This is how HTML is parsed (how browsers are expected to parse it).
In HTML 4, TABLE was defined (in terms of SGML) as requiring a TBODY child element, and that TBODY is defined with both the start and end tags being optional.
In HTML5 (which codifies how browsers actually parse HTML), this is the same: when building a table, if the browser finds a tr, then it inserts a tbody element before parsing the tr as if there were a tbody initially.
Browsers try to repair HTML even if you omit certain tags or attributes. Most modern browsers will accept almost anything you pass them without complaining much, but instead of inserting exactly what you wrote, they interpret what you meant and insert valid HTML.
Therefore, it is perfectly valid to create a table without specifying a tbody node, but the browser will supply it for you. Once you call getHTML(), you are accessing the parsed, well-formed markup.

How do you use &nbsp; in GWT UiBinder XML? Can you escape it?

In my mark-up I want to add a non-breaking space (&nbsp;) between elements without always having to use CSS to do so. If I put &nbsp; in my markup, GWT throws errors. Is there a way around it?
For example:
<g:Label>One&nbsp;</g:Label><g:Label>Two</g:Label>
Should show:
One Two
And not:
OneTwo
As documented here, you just have to add this to the top of your XML file and it will work!
<!DOCTYPE ui:UiBinder SYSTEM "http://dl.google.com/gwt/DTD/xhtml.ent">
Note that the GWT compiler won't actually visit this URL to fetch the file, because a copy of it is baked into the compiler. However, your IDE may fetch it.
Rather than use a Label, which to me shouldn't allow character entities at all, I use an HTML widget. In order to set the content, though, I find I have to do it as the HTML attribute, not the body content (note that the uppercase HTML is important here, since the set method is setHTML, not setHtml):
<g:HTML HTML="One&nbsp;" />

confused about xhtml5: no more `<?xml?>` and now mandatory `meta`?

I've been a longtime user of XHTML 1.0 Strict, and I'm now trying to switch to XHTML5 in my new projects.
I'm confused that <?xml version='1.0' encoding='utf-8'?> is no longer considered valid, for HTML5, by http://validator.w3.org/. Why is that? Isn't that what all xml documents are supposed to start with?
And when I remove the standard <?xml…, my document still doesn't validate: now it's missing the encoding. I don't like those meta tags, but are they now effectively mandatory, to specify the encoding, in order to be valid (X)HTML5?
An XML declaration is valid and validates in XHTML serialization of HTML5. The following rather minimal document validates:
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title></title></head>
<body></body>
</html>
However, this only applies to XHTML serialization (XHTML syntax) of HTML5. In HTML serialization, it is not allowed. If you write the above document in a file and store it in a server that will send it with Content-Type: text/html (which normally happens if the filename ends with “.html”), then you get an error message:
Saw <?. Probable cause: Attempt to use an XML processing instruction in HTML.
(XML processing instructions are not supported in HTML.)
Here “HTML” means HTML serialization only.
Browsers do not care about an XML declaration in either syntax. In HTML syntax, it is just ignored, as a recoverable syntax error. In XHTML syntax, it does not matter, except for the encoding part.
Although XML 1.0 specification recommends (but does not require) an XML declaration, it would in practice matter (apart from the significance of encoding) only to software that is capable of processing different versions of XML. Browsers aren’t. And in addition to XML 1.0, there’s just XML 1.1, which is not used much. Besides, HTML5 is defined so that the XML version used in XHTML syntax is XML 1.0.
The encoding part may matter, but utf-8 is the default anyway for XML. If you use another encoding for some reason, then an XML declaration may be useful to prevent any conflicts. HTML5 CR says this in its discussion of encodings: “In XHTML, the XML declaration should be used for inline character encoding information, if necessary.” A meta tag cannot really help in XHTML when served with an XML content type, since the encoding has already been decided (by defaulting to UTF-8 or otherwise) when the tag is seen.
For HTML syntax, the <meta charset=...> tag may be used, but it is not needed for validity, and the encoding can be specified in HTTP headers (which override any meta tags). Using a meta tag may however be helpful, since a page might be saved locally, and then there won’t be any HTTP headers available when it is opened.
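To illustrate the point about the XML declaration carrying the encoding in XHTML syntax, here is a small sketch using the JDK's built-in XML parser rather than a browser. The class and method names are made up for this example, and the document is a cut-down variant of the minimal one above (the DOCTYPE is left out for brevity). The bytes are deliberately written in a non-default encoding, and the parser has only the declaration to go by.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XmlDeclEncoding {
    // Encodes a tiny XHTML document in the given charset, declares that
    // charset in the XML declaration, and returns the parsed <title> text.
    static String parseTitle(String declaredEncoding) {
        String xml = "<?xml version='1.0' encoding='" + declaredEncoding + "'?>\n"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>\u010D</title></head>" // U+010D is the letter č
                + "<body></body></html>";
        byte[] bytes = xml.getBytes(Charset.forName(declaredEncoding));
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getElementsByTagName("title").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample document", e);
        }
    }

    public static void main(String[] args) {
        // No HTTP header is involved here; the parser reads the encoding
        // from the XML declaration.
        System.out.println("\u010D".equals(parseTitle("ISO-8859-2")));
    }
}
```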

Facelets charset problem

In my earlier post there was a problem with JSF charset handling, but also the other part of the problem was MySQL connection parameters for inserting data into db. The problem was solved.
But I migrated the same application from JSP to Facelets and the same problem happened again. Characters from input fields are replaced when inserting into the database (č is replaced with Ä), but data inserted into the db from SQL scripts with the proper charset are displayed correctly. I'm still using the registered filter, and the page templates use a head meta tag as follows:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-2">
If I insert into h:form tag the following attribute:
acceptcharset="iso-8859-2"
I get correct characters in Firefox, but not in IE7.
Is there anything else I should do to make it work?
Thanks in advance.
Add the following line to the filter:
response.setContentType("text/html;charset=ISO-8859-2");
Don't use the acceptcharset attribute. IE has serious bugs with it.
Also, when you're using a <?xml?> declaration at the top of a Facelets XHTML page, ensure that it's using the desired charset, or just remove the whole declaration; it's not strictly required.
<?xml version="1.0" encoding="ISO-8859-2"?>
I think you can look at the implementation of org.springframework.web.filter.CharacterEncodingFilter,
and you can start your Tomcat with -Dfile.encoding=ISO-8859-2 added to the JVM arguments.
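The č to Ä replacement described in the question is what you would see if the UTF-8 bytes of č were decoded as ISO-8859-2. Whether that is exactly what happens in the asker's setup is an assumption. The sketch below (class and method names are hypothetical) only shows the effect of such a mismatch and the round trip when both sides agree.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {
    // Encodes text with one charset and decodes the bytes with another,
    // the way a mismatched client/server pair would.
    static String decodeAs(String text, Charset sentAs, Charset readAs) {
        return new String(text.getBytes(sentAs), readAs);
    }

    public static void main(String[] args) {
        Charset latin2 = Charset.forName("ISO-8859-2");
        String original = "\u010D"; // U+010D is the letter č

        // Mismatch: UTF-8 bytes (0xC4 0x8D) read as ISO-8859-2.
        String garbled = decodeAs(original, StandardCharsets.UTF_8, latin2);
        // Agreement: both sides use ISO-8859-2.
        String intact = decodeAs(original, latin2, latin2);

        System.out.println(garbled.charAt(0) == '\u00C4'); // first char is Ä
        System.out.println(intact.equals(original));
    }
}
```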