HTML special characters to HTML entities to prevent XSS vulnerabilities - eclipse

To minimize XSS vulnerabilities in my application, and since there are no user inputs at all, I'm HTML-entity escaping my output as shown below. However, my HTML breaks and displays nothing, and if I replace it with <script> the whole code appears in the output as is.
document.getElementById("dis").innerHTML = "JAVA";
document.getElementById("dis").innerHTML = "JAVA";
If this is not the right way, please suggest the steps for using the public method below for HTML escaping to minimize XSS vulnerabilities.
public static String escapeHtml (CharSequence text)
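For illustration only, here is a minimal sketch of what a method with that signature typically does. The class name HtmlEscapeSketch and the body are assumptions, not the implementation of whichever library the signature above comes from.

```java
// Hypothetical helper; only the method signature is taken from the question.
final class HtmlEscapeSketch {
    private HtmlEscapeSketch() {}

    /** Replaces the five HTML-significant characters with entities. */
    public static String escapeHtml(CharSequence text) {
        if (text == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

When a string escaped this way is assigned to innerHTML, the browser shows the markup as literal text instead of interpreting it, which is the intended effect of escaping. If the value is plain text anyway, assigning it to textContent instead of innerHTML avoids the need for escaping.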

Related

Returning Array in USER_INT userFunc leads to <!--INT_SCRIPT output

I have a userFunc which I call via
lib.random = USER_INT
lib.random {
userFunc = My\Plugin\UserFunc\Functions->random
}
When I return an array and try to access it, it fails.
<v:variable.set name="random" value="{f:cObject(typoscriptObjectPath: 'lib.random')}" />
{random.max}
When I try to debug it, I get a string starting with <!--INT_SCRIPT.
Does anyone know the problem and a solution?
Edit:
I would like to make the problem a little clearer by describing the scenario.
I have a plugin with a login form. When the user logs in, I set a JWT with various basic information (name, email).
This information has to be displayed in various places around the website, not only on one page (for example the profile page). Some cases are prefilled forms or just silly "Hello, Paul" stuff.
So when I first log in (fresh browser, no cache) I read "Hello, Paul". After I log out and log in with another account (let's call it "Peter"), it still says "Hello, Paul", not "Hello, Peter". When I clear my browser cache, everything is fine.
Maybe this helps to solve my dilemma. :)
TL;DR: uncached parts in TYPO3 are replaced in the generated page output string using markers and cannot communicate in the direction intended here. Selectively caching, disabling cache or detaching the data from the main request (with XHR or other) are the only possible methods.
It should be clear that USER_INT achieves its functionality by string replacement in the generated page body. This means, among other things, that:
You can never pass the output of a USER_INT to anything in Fluid, not even if the entire page is uncached. You will effectively be passing a string containing <!--INT_SCRIPT... (the entire marker).
You can however generate USER_INT from Fluid, which ends up in the generated page, which is then replaced with the rendered object (use f:cObject to render a USER_INT or COA_INT).
Then there are the use case context considerations. First of all, a cookie (in practice) changes the request parameters and should be part of the cache identifier that your page uses (it is not this way by default). Second, if said cookie changes the way the page renders (and it does, given your use case) this will cause trouble when the page is cached. Third, the page output changing based on a cookie indicates perhaps sensitive information or at the very least user-specific information.
Taking the above into account your use case should do one of the following things:
Either render the entire chunk of output that changes based on cookie, as USER_INT. That means wrapping the entire Fluid output and rendering it all without caching. Note that template compiling still happens (and you can use f:cache.static to hard-cache certain parts if they never change based on request parameters).
Or add the cookie value to the cHash (page hash value) so that having the cookie set means you request a specific cached version that matches the cookie. This is the preferred way if your cookie's value is generally the same for many users (e.g. it contains a selected contact person from a limited list and stores that in a cookie).
Or, in the case that your output contains actually sensitive information, require that the content element or page is only available when logged in with a specific group. This has two purposes: first, it protects the page from being viewed without authentication - but second, it also makes the page content not cache or be cached with the frontend user group ID as part of the cache identity.
Refactor to XHR request and make whichever endpoint it uses, a USER_INT or manually disabled cache context, then load the data. Or set the actual data in the cookie, then use JS to insert the values where needed.
Hopefully that clarifies the different contexts and why they can't communicate in the direction you're attempting; even if they had been exchanging strings instead of arrays.
See also: .cache sub-object in TypoScript which is where you would be able to craft a unique cache identifier for use case 2 described above.
USER_INT objects are not cached, so their values are replaced after the cache has been built.
I think f:cObject is the wrong way. Implementing your own ViewHelper to get the same data should be a better way.
<?php
namespace My\Plugin\ViewHelpers;
use TYPO3Fluid\Fluid\Core\Rendering\RenderingContextInterface;
use TYPO3Fluid\Fluid\Core\ViewHelper\AbstractViewHelper;
use TYPO3Fluid\Fluid\Core\ViewHelper\Traits\CompileWithRenderStatic;
class RandomViewHelper extends AbstractViewHelper
{
    use CompileWithRenderStatic;
    /**
     * @var boolean
     */
    protected $escapeOutput = false;
    /**
     * @param array $arguments
     * @param \Closure $renderChildrenClosure
     * @param RenderingContextInterface $renderingContext
     * @return string
     */
    public static function renderStatic(
        array $arguments,
        \Closure $renderChildrenClosure,
        RenderingContextInterface $renderingContext
    ) {
        return rand();
    }
}
Now you can use it as follows:
{my:random()} or <my:random />

Idiomatic way to overcome difficulties with `@extension = 'html'` in AEM 6.3 Sightly/HTL?

AEM's HTL (f.k.a. Sightly) has a special idiom for reformatting URL properties, e.g.
<a href="${properties.path @ extension = 'html'}">
The purpose of this idiom is two-fold
to append .html to internal links authored via a pathbrowser field
to apply resource mappings which have been configured to strip /content/projectname
Unfortunately this well-intentioned feature has several problems:
It won't work with resource links, e.g. a PDF file in the DAM.
It won't work with external links that don't end in .html.
It escapes the "&" in URLs containing query string params, breaking the link.
My team is now tasked with fixing dozens of defects caused by over-use of this extension = 'html' trick, and we would like to fix them all consistently and quickly with a minimum risk of regressions.
Is there a quick fix, preferably something that can be repeated via mindless search/replace of every occurrence of extension = 'html'?
I can suggest a combination of the uri context and adding the .html extension to resource URLs on the server side.
It won't work with resource links, e.g. a PDF file in the DAM.
Use @ context = 'uri', which is the default context for href and src attributes and does not explicitly add the .html extension.
Input -
Resource Link uses default uri context.
Output -
Resource Link
On any other HTML attribute, use an attribute context: @ context='attribute'
Input -
<div data-link="${'/content/dam/repl/en.pdf' @ context='attribute'}"/>
Output -
<div data-link="/content/dam/repl/en.pdf"/>
It won't work with external links that don't end in .html.
It escapes the "&" in URLs containing query string params, breaking the link.
Again, use @ context = 'uri': it does not escape & in URLs and works fine with selectors and # params as well, with the added advantage of XSS protection.
Input -
URI context
Output -
URI context
To append .html to internal resource URLs
You cannot use @ extension and @ context together in the same element.
You could append .html directly in the template, or, better, address this at the Sling model level with a util method like this:
public static String getUrl(String link, String extension, ResourceResolver resourceResolver) {
    String updatedLink = "";
    if (link != null) {
        Resource pathResource = resourceResolver.getResource(link);
        // check if resource exists
        if (pathResource != null) {
            // append .html
            updatedLink = resourceResolver.map(link) + extension;
        }
    }
    return updatedLink;
}
Side note: avoid @ context='unsafe' for obvious reasons: it completely disables XSS protection.
Check this for the available context expression options.
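As a rough sketch of the server-side idea, a helper could decide per link whether .html should be appended at all, so that DAM assets, external URLs, and query strings are left intact. The class name LinkSuffixSketch, the method name, and the /content/ and /content/dam/ prefixes are assumptions for illustration; it uses plain string checks rather than a ResourceResolver and does not apply resource mappings.

```java
// Hypothetical helper; the prefixes below are assumptions about the content tree.
final class LinkSuffixSketch {
    private LinkSuffixSketch() {}

    /** Appends .html only to internal page paths that have no extension yet. */
    static String withHtmlExtension(String link) {
        if (link == null || link.isEmpty()) {
            return "";
        }
        // External links and DAM assets are returned untouched.
        if (!link.startsWith("/content/") || link.startsWith("/content/dam/")) {
            return link;
        }
        // Keep any query string or fragment so it can be re-attached unescaped.
        int cut = firstIndexOf(link, '?', '#');
        String path = cut >= 0 ? link.substring(0, cut) : link;
        String rest = cut >= 0 ? link.substring(cut) : "";
        String lastSegment = path.substring(path.lastIndexOf('/') + 1);
        // Skip paths that end in a slash or already carry an extension.
        if (lastSegment.isEmpty() || lastSegment.contains(".")) {
            return link;
        }
        return path + ".html" + rest;
    }

    /** Index of whichever of the two characters occurs first, or -1. */
    private static int firstIndexOf(String s, char a, char b) {
        int ia = s.indexOf(a);
        int ib = s.indexOf(b);
        if (ia < 0) {
            return ib;
        }
        if (ib < 0) {
            return ia;
        }
        return Math.min(ia, ib);
    }
}
```

The returned value could then be rendered with the default uri context, which keeps the XSS protection described above.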

isValid() method in owasp html sanitizer

I have a page in my application where the user can enter HTML input. To avoid XSS attacks, I am using the OWASP HTML Sanitizer to sanitize the user input. If the user input is not valid according to the policy, I just want to throw the user out.
Is there a way to simply check whether the input HTML is valid against the policy without sanitizing?
Something like:
public static boolean isValid(String input, Policy policy);
You can define the isValid method yourself, but I'm not sure you can do it without calling the sanitize method.
// Define the policy factory
PolicyFactory polFac = new HtmlPolicyBuilder()
    .allowElements("a", "p")
    .allowAttributes("href").onElements("a")
    .toFactory();
boolean isValid(String input, PolicyFactory polFac){
    return input.equals(polFac.sanitize(input));
}
You can obtain a more robust version of isValid using the second version of the sanitize method (in the PolicyFactory class), which reports the names of rejected elements and attributes.

zend framework urls and get method

I am developing a website using Zend Framework.
I have a search form that uses the GET method. When the user clicks the submit button, the query string appears in the URL after the ? mark, but I want it to be a Zend-like URL.
Is it possible?
As well as the JS approach you can do a redirect back to the preferred URL you want. I.e. let the form submit via GET, then redirect to the ZF routing style.
This is, however, overkill unless you have a really good reason to want to create neat URLs for your search queries. Generally speaking a search form should send a GET query that can be bookmarked. And there's nothing wrong with ?param=val style parameters in a URL :-)
ZF URLs are a little odd in that they force URL parameters to be part of the main URL. I.e. domain.com/controller/action/param/val/param2/val rather than domain.com/controller/action?param=val&param2=val
This isn't always what you want, but seems to be the way frameworks are going with URL parameters
There is no obvious solution. The form generated by zf will be a standard html one. When submitted from the browser using GET it will result in a request like
/action/specified/in/form?var1=val1&var2=val2
Only solution to get a "zendlike url" (one with / instead of ? or &), would be to hack the form submission using javascript. For example you can listen for onSubmit, abort the submission and instead redirect browser to a translated url. I personally don't believe this solution is worth the added complexity, but it should perform what you're looking for.
After raging against this for a day-and-a-half, and doing my best to figure out the right way to do this fairly simple thing, I gave up and did the following. I still can't believe there's not a better way.
The use case that necessitates this is a simple record listing, with a form up top for adding some filters (via GET), maybe some column sorting, and Zend_Paginate thrown in for good measure. I ran into issues using the Url view helper in my pagination partial, but I suspect with even just sorting and a filter-form, Zend_View_Helper_Url would still fall down.
But I digress. My solution was to add a method to my base controller class that merges any raw query-string parameters with the existing zend-style slashy-params, and redirects (but only if necessary). The method can be called in any action that doesn't have to handle POSTs.
Hopefully someone will find this useful. Or even better, find a better way:
/**
 * Translate standard URL parameters (?foo=bar&baz=bork) to zend-style
 * param (foo/bar/baz/bork). Query-string style
 * values override existing route-params.
 */
public function mergeQueryString(){
    if ($this->getRequest()->isPost()){
        throw new Exception("mergeQueryString only works on GET requests.");
    }
    $q = $this->getRequest()->getQuery();
    $p = $this->getRequest()->getParams();
    if (empty($q)) {
        //there's nothing to do.
        return;
    }
    $action = $p['action'];
    $controller = $p['controller'];
    $module = $p['module'];
    unset($p['action'],$p['controller'],$p['module']);
    $params = array_merge($p,$q);
    $this->_helper->getHelper('Redirector')
        ->setCode(301)
        ->gotoSimple(
            $action,
            $controller,
            $module,
            $params);
}

How to filter and validate user input in Zend Framework

On my website I have a comment section. I want to filter and validate the input before I store it in my database. If there are any invalid characters in the input, the user gets a notice that the input is invalid.
My question: which characters are not allowed? E.g. I want to avoid SQL injections.
Tags are not allowed. How do I check that?
If you are using Zend_Db and parameterised queries (i.e.: $adapter->insert($tableName, array('param' => 'value'))) then it will automagically escape everything for you.
If however you want to further validate the user input, have a look at Zend_Validate and Zend_Filter
Also, if by "tags" you mean HTML tags, I wouldn't do anything to those on input but do make sure you properly escape / strip them on output (have a look at htmlspecialchars())
If you want to display an error message if the input contains HTML tags, and assuming $comment is the comment body, you could try:
if (strip_tags($comment) !== $comment) {
    // It seems it contained some html tags
}