Sling Mapping Rewrite Rules do not rewrite paths in meta tags - aem

I have sling mappings setup that rewrite outgoing paths to the external URL. An example of this rewrite:
/content/www-sitename/home.html would be rewritten to http://www.sitename.com/home.html
I have also configured the LinkCheckerTransformerFactory: linkcheckertransformer.rewriteElements=["a:href","area:href","form:action","link:href","meta:content"]
Some HTML on a page component:
<head>
<link rel="canonical" href="/content/www-sitename/home.html" />
<meta name="canonical" content="/content/www-sitename/home.html" />
</head>
When visited, only the link:href has been rewritten, meta:content is unchanged:
<head>
<link rel="canonical" href="http://www.sitename.com/home.html" />
<meta name="canonical" content="/content/www-sitename/home.html" />
</head>
Worth noting is that the link:href was not rewritten prior to configuring the linkcheckertransformer.rewriteElements to include it. Why did this change work for link:href, but not meta:content. Aside from creating a custom rewrite filter, what can be done to get links in meta:content attributes to be rewritten?

nerd answer is correct, by default the internal Sling mechanism responsible for parsing HTML (htmlparser) supports only following tags: a, area, form, base, link, script, body, so even if you add meta:content to the LinkChecker configuration, CQ won't recognize the <meta> as a tag which needs processing.
In order to reconfigure htmlparser, create a node named generator-htmlparser under /libs/cq/config/rewriter/default with following properties:
jcr:primaryType = nt:unstructured
includeTags = [A, AREA, FORM, BASE, LINK, SCRIPT, BODY, META]
The includeTags property should be multivalued, so you can add other tags in the future.
If you don't want to override the content under /libs, create your own rewriter configuration:
Copy /libs/cq/config/rewriter/default and its children to /apps/YOURAPP/config/rewriter/my-rewriter.
Set order property on the my-rewriter to 1.
Create generator-htmlparser under the my-rewriter as above.

I think you have to add meta tag to the htmplparser generator.
see my question and answer: How to add additional element to htmlparser generator

Related

XForms: Possible to modify HTML element attribute ? (I'm using XSLTForms)

Does XForms have a mechanism for manipulating attributes of the resultant HTML?
I guess I mean emitting HTML dynamically and setting the attributes as part of that.
I know that using a xf:repeat - you can effectively emit HTML elements, but I can't work out if this would stretch to attributes?
I'm using XSLTForms as the implementation - so maybe this support hooks for Javascript to do this if there isn't a built-in way?
The reason to ask specifically - I would like to work with the audio element (and some other HTML5 elements).
Yes, it is named AVT for Attribute Value Template. As in XSLT, just wrap XPath expressions into curly braces like in <div class="proto{$myclass}">.
Thanks to the help from Alain Couthures - I was able to put together the following. Sharing in case others find it interesting.
<?xml-stylesheet href="xsltforms/xsltforms.xsl" type="text/xsl"?>
<html
xmlns="http://www.w3.org/1999/xhtml"
xmlns:xf="http://www.w3.org/2002/xforms">
<head>
<title>Podcast Player</title>
<xf:model>
<xf:instance xmlns="">
<data>
<url/>
</data>
</xf:instance>
<xf:instance id="feed" src="https://podcasts.files.bbci.co.uk/b05qqhqp.rss"/>
</xf:model>
<style><![CDATA[
* { font-family: arial; background-color:black; color: white }
]]></style>
</head>
<body>
<h1><xf:output ref="instance('feed')/channel/title"/></h1>
<blockquote><xf:output ref="instance('feed')/channel/description"/></blockquote>
<xf:select1 ref="url" appearance="full">
<xf:itemset nodeset="instance('feed')/channel/item">
<xf:label ref="title"/>
<xf:value ref="enclosure/#url"/>
</xf:itemset>
</xf:select1>
<audio src="{url}" controls="true"/>
</body>
</html>
The relevant bit to this post is the "audio" tag and in particular the "{url}" attribute template.
Here's a screenshot:
For those that wish to try this example, you'll need XSLTForms : https://en.wikibooks.org/wiki/XSLTForms , other XForms implementations are available.
Note: save the file with the extension '.xhtml' and place behind a webserver of your choice.
For instance using test HTTP servers: php, python etc.

German Novel with DkPro

I tried German Novel with DkPro. My Sample input file is an XHTML file. How can I get my PosTagger output based on the XHTML index.
Script:
PACKAGE com.github.uima.ruta.novel;
ENGINE utils.HtmlAnnotator;
ENGINE utils.HtmlConverter;
ENGINE utils.ViewWriter;
TYPESYSTEM utils.HtmlTypeSystem;
TYPESYSTEM utils.TypeSystem;
IMPORT PACKAGE de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos FROM desc.type.POS;
IMPORT de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Lemma FROM desc.type.LexicalUnits;
UIMAFIT org.dkpro.core.opennlp.OpenNlpSegmenter;
UIMAFIT org.dkpro.core.stanfordnlp.StanfordPosTagger;
CONFIGURE(HtmlAnnotator, "onlyContent" = false);
Document{-> EXEC(HtmlAnnotator)};
Document { -> CONFIGURE(HtmlConverter, "inputView" = "_InitialView","outputView" = "plain"),
EXEC(HtmlConverter,{TAG})};
"<\\?xml version=\"1.0\" encoding=\"UTF-8\"\\?>"->MARKUP;
uima.tcas.DocumentAnnotation{-CONTAINS(POS)} -> {
uima.tcas.DocumentAnnotation{-> SETFEATURE("language", "de")};
EXEC(OpenNlpSegmenter);
EXEC(StanfordPosTagger, {POS});
};
Sample Input
<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml"><head xmlns="http://www.w3.org/1999/xhtml"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><meta name="viewport" content="width=device-width, initial-scale=1.0" /><style></style><title></title></head><link xmlns="http://www.w3.org/1999/xhtml" src="./ckeditor.css" /><body xmlns="http://www.w3.org/1999/xhtml"><div class="WordSection1"><p class="Normal" data-name="Normal"><span data-bkmark="para10000"></span><span style="font-size:9pt">Der Idiot</span><span data-bkmark="para10000"></span></p>
<p class="Normal" data-name="Normal"><span data-bkmark="para10001"></span><span style="font-size:9pt">Ein Roman in vier Teilen.</span><span data-bkmark="para10001"></span></p>
</div>
<hr align="left" size="1" width="33%" /></body>
</html>
In the sample script, uima.tcas.DocumentAnnotation is sent to PosTagger Process. The MARKUP in this annotation affecting the accuracy. What I need to do to get the accuracy.
The HtmlAnnotator can be used to hide additional MARKUP so that rules are not affected by them.
The HtmlConverter is able to create a new document text without html/xml markup, but only in a new CAS view as the initial text in a CAS is static and cannot be changed.
The EXEC action is able to apply an external analysis engine on the current CAS object, and it can be configured to be applied on a different CAS view. However, the external analysis engine is applied on the complete CAS including the markup. No new CAS is created on the fly.
There are several options what you could do.
You could apply the pos tagger on the ‘plain’ view, but you cannot access these annotation with rules as the annotation will be present in a different view
You setup a multi view setting, e.g, by a two stage process. First convert the text to plain text without markup, and then apply the pos tagger on the new text
Depending on the external analysis engine, you maybe can also solve this by redefining what a token is.

How can i add an attribute to a tag that is created in runtime

So I need to add the attribute media="all" to these two link tags:
<link rel="stylesheet" href="/etc.clientlibs/farmers/clientlibs/clientlib-libraries.css" type="text/css">
<link rel="stylesheet" href="/etc.clientlibs/farmers/clientlibs/clientlib-base.css" type="text/css">
but my local HTML file is configured as:
<sly data-sly-use.clientlib="/libs/granite/sightly/templates/clientlib.html">
<sly data-sly-call="${clientlib.css # categories=['farmers.new.libraries','farmers.new.base']}" /> </sly>
It is a language called HTL, HTML Template Language. There's a way to add attributes via HTL but you need to create a whole java class in the back end and call it, it's a headache.
I want to know if I can add some javascript to append the attribute media="all" to the link tags to these specific CSS file path.
I was thinking of putting both paths inside a div and then with javascript find that div and append an attribute to each link tag inside that div.
var head = document.getElementsByTagName('head');
var element = document.createElement('link');
element.rel = 'stylesheet';
element.type = 'text/css';
element.href = '/etc.clientlibs/farmers/clientlibs/clientlib-libraries.css';
// Here's the magic
element.media = 'all';
head.appendChild(element, head.firstChild);
setTimeout(function () {
element.media = 'all';
});
A script tag is being created and I want to add async="" to this:
<!--/* Include Context Hub */-->
<sly data-sly-call="${clientlib.js # categories='granite.utils'}" />
<sly data-sly-resource="${'contexthub' # resourceType='granite/contexthub/components/contexthub'}" />
Yes, it looks fine. But your code has a mistake. You should get the first element from the collection to access <head></head> element.
var head = document.getElementsByTagName('head')[0];
Why would you do that? Just do not use the clientlib mechanism. Most frontend developers will use their IDEs to build and minify CSS and JS anyway so there is not much to be lost if you import those artifacts at buildtime and add the tags in your code directly in the way you need them.

Is the inputmode attribute valid (in HTML5 forms) or not?

I am getting validation errors with the inputmode attribute on text areas and text fields. The validator tells me Attribute inputmode not allowed on element input at this point but the HTML5 spec indicates that it is allowed.
Is there actually something wrong with this code, or is the validator at fault?
Here is a bare bones case which will produce exactly this kind of validation error (twice), in one case on an email input, and on the other on a textarea.
<!DOCTYPE HTML>
<html lang="en">
<head>
<meta charset="utf-8">
</head>
<body>
<form method="post" action="contactme.php">
<label class='pas block'>
Your E-Mail:<br/>
<input type='email' name='email' required inputmode='latin' placeholder='your e-mail here' />
</label>
<label class='pas block'>
Your Message:<br/>
<textarea name='message' required inputmode='latin' placeholder='and your message here!'></textarea>
</label>
</form>
</body>
</html>
Also, see the chart about which attributes apply to the different input types here:
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-input-element.html#attr-input-type
The "inputmode" attribute applies only to "text" and "search".
UPDATE 2019-09-04: "inputmode" is now a global attribute (per WHATWG) and can be specified on any HTML element: https://html.spec.whatwg.org/multipage/dom.html#global-attributes
Another reference page for "inputmode":
https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/inputmode
On another note, "inputmode" is not a W3C HTML5 attribute, but it is a W3C HTML 5.1 attribute (at least at the time I'm writing this). UPDATE 2019-09-04: "inputmode" has been removed from HTML 5.2 and HTML 5.3.
The HTML5 spec says
The following content attributes must not be specified and do not apply to the element: accept, alt, checked, dirname, formaction, formenctype, formmethod, formnovalidate, formtarget, height, inputmode, max, min, src, step, and width.
It's under bookkeeping details at https://html.spec.whatwg.org/multipage/input.html#e-mail-state-(type=email)
Five years after the question was asked, some may wonder why some of the properties listed by #dsas doesn't trigger such errors, like enctype
The answer is simple support, while enctype for instance gained a wide support
inputmethod is supported only as of IE11 and Edge 14, for more infos click here

Adding items to a resource restfully using OpenRasta

I'm using OpenRasta to create a Survey application.
I have a SurveyResource that is accessible at /surveys/{id} and editable at /surveys/{id}/edit
I'd now like to add questions to the survey, as that is the point of a survey, but I'm not sure what the most restful way of doing this is and how to set it up in OR.
I'm thinking I should have a QuestionResource (that has details of the question type, question text, etc) and it should be posted to /surveys/{id}/questions and handled by a question handler, but I can't work out how to configure OR.
I've pushed my project onto github at https://github.com/oharab/OpenSurvey/tree/add_question_to_survey
Can anyone help me?
Ben
it depends on the way you want to model your resources. It's perfectly possible that you'd never explicitly provide access to a single question, and would modify the entire survey document, like so:
PUT /surveys/123
<survey>
<link rel="update" href="/surveys/123" method="PUT"
type="application/vnd.mycorp.survey+xml" />
<question id="age">
<label>How old are you?</label>
<select>
<option>0 - 5</option>
<option>6 - 10</option>
<option>10 - 13</option>
</select>
</question>
</survey>
If you go this route, you could even use HTML, or HTML 5 for your content so it's easy to consume by clients. Now you're just modifying the entire survey document at once.
Alternatively, you might want to separately address each question, giving them an individual URI, which I think is what you're talking about, like so:
GET /survey/123
<survey>
<link rel="add-question" href="/survey/123/questions"
type="application/vnd.mycorp.surveyquestion+xml" method="POST" />
<question>
<link rel="delete" href="/questions/123-age" method="DELETE" />
<link rel="update" href="/questions/123-age" type="application/vnd.mycorp.surveyquestion+xml" method="PUT" />
<label>How old are you?</label>
<select>
<option>0 - 5</option>
<option>6 - 10</option>
<option>10 - 13</option>
</select>
</question>
</survey>
Neither of these is more RESTful than the other, the difference is only in granularity of call. If you need the granularity of the latter, then configure yourself a separate handler per resource as in
using(OpenRastaConfiguration.Manual)
{
ResourceSpace.Has.ResourcesOfType<Survey>().AtUri("/survey/{id}").HandledBy<SurveyHandler>();
ResourceSpace.Has.ResourcesOfType<Question>().AtUri("/questions/{id}").HandleBy<QuestionHandler>();
}