Ensure that all string variables placed into HTML are either properly contextually encoded manually - OWASP

Can anyone explain this OWASP recommendation -
Ensure that all string variables placed into HTML or other web client code is either properly contextually encoded manually, or utilize templates that automatically encode contextually to ensure the application is not susceptible to reflected, stored and DOM Cross-Site Scripting (XSS) attacks.
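In practice, "contextually encoded" means the escaping rules change with where the string lands (HTML body, HTML attribute, JavaScript, URL). As a rough sketch of manual contextual encoding, here is one way to do it with the OWASP Java Encoder library; the library choice and variable names are assumptions for illustration, not something the recommendation itself names:

import org.owasp.encoder.Encode;

public class ContextualEncodingExample {
    public static void main(String[] args) {
        String untrusted = "\"><script>alert(1)</script>";

        // Same untrusted value, different encoding depending on where it is placed:
        String inHtmlBody      = Encode.forHtml(untrusted);          // between tags
        String inHtmlAttribute = Encode.forHtmlAttribute(untrusted); // inside a quoted attribute value
        String inJavaScript    = Encode.forJavaScript(untrusted);    // inside a JS string literal
        String inUrlParam      = Encode.forUriComponent(untrusted);  // inside a URL query parameter

        System.out.println("<p>" + inHtmlBody + "</p>");
        System.out.println("<input value=\"" + inHtmlAttribute + "\">");
        System.out.println("<script>var q = '" + inJavaScript + "';</script>");
        System.out.println("<a href=\"/search?q=" + inUrlParam + "\">search</a>");
    }
}

The "templates that automatically encode contextually" part of the recommendation refers to templating engines that pick the right escaping for each of these contexts for you, so you don't have to make the choice by hand at every output point.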

Is it okay to add custom keys to the object in a progressive web app `manifest.json` file?

I want to add a custom key to the manifest.json file for a progressive web app.
The MDN page doesn't mention custom keys:
Web App Manifest | MDN
The spec:
Web App Manifest
includes this text in the section "3.1 Media type registration" under a sub-heading "Security and privacy considerations":
As the manifest format is JSON and will commonly be encoded using [UNICODE], the security considerations described in [ECMA-404] and [UNICODE-SECURITY] apply. In addition, because there is no way to prevent developers from including custom/unrestrained data in a manifest, implementors need to impose their own implementation-specific limits on the values of otherwise unconstrained member types, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
Are there known limitations or restrictions on the use of custom keys in manifest.json files?
According to the standard (see SO-59512547):
Browsers shall ignore any values starting with X-, which is a common abbreviation for custom headers in HTTP and e-mail, as they are to be used exclusively by developers.
My use case is sending boot-up data early with HTTP 2.0, things that would normally be in headers or env variables, but that I would nonetheless want to keep dynamic... such as socket endpoints, custom loading UI and console logging level. Loading manifest.json is extremely common, so it can serve as a standard boot config file better than a custom-named JSON file that we would have to tell the server we want prefetched along with any request for index.html.
Australians, Mooners, Martians will thank me.
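To make that use case concrete, here is a minimal sketch of such a manifest; the x-boot member and its contents are invented for illustration, while the other members are standard:

{
  "name": "Example App",
  "short_name": "Example",
  "start_url": "/index.html",
  "display": "standalone",
  "x-boot": {
    "socketEndpoint": "wss://example.com/live",
    "logLevel": "warn",
    "loadingTheme": "dark"
  }
}

Browsers that follow the manifest processing algorithm are expected to ignore members they don't recognise, so a custom member like this is only meaningful to your own bootstrap code.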

Need help in identifying the difference between ESAPI.validator() and ESAPI.encoder()

We are implementing application security in our website. It's a REST-based application, so I will have to validate the whole request payload rather than each attribute. This payload needs to be validated against all types of attacks (SQL injection, XSS, etc.). While browsing I found that people are using ESAPI for web security.
I am confused between the ESAPI.validator().getValidXXX and ESAPI.encoder() Java APIs of the ESAPI library. What is the difference between these two, and when should I use which API? I would also like to know in what cases we might use both APIs.
As per my understanding, I could encode an input to form valid HTML using either API, e.g.:
ESAPI.encoder().encodeForHTML(input);
ESAPI.validator().getValidSafeHTML(context, input, maxLength, allowNull).
For XSS attacks, I have made code changes to strip off HTML tags using Java's Pattern and Matcher, but I would like to achieve the same using ESAPI. Can someone help me with how to achieve it?
Or
Are there any newer Java libraries for web security, similar to ESAPI, that I have not come across? I have found https://jsoup.org/, but it addresses only XSS; I am looking for a library that provides APIs against several classes of attack (SQL injection, XSS).
ESAPI.encoder().encodeForHTML(input);
You use this when you're sending input to a browser, so that the data you're sending gets escaped for HTML. This can get tricky, because you have to know if that exact data is, for example, being passed to JavaScript before it is rendered into HTML, or if it's being used as part of an HTML attribute.
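To make that concrete, ESAPI's Encoder has a separate method per output context. A minimal sketch (it assumes a working ESAPI.properties configuration on the classpath; the input value is illustrative):

import org.owasp.esapi.ESAPI;

public class EncoderContextExample {
    public static void main(String[] args) {
        String userInput = "<script>alert('xss')</script>";

        // Pick the method that matches where the data actually ends up:
        String forBody      = ESAPI.encoder().encodeForHTML(userInput);          // HTML element content
        String forAttribute = ESAPI.encoder().encodeForHTMLAttribute(userInput); // quoted attribute value
        String forScript    = ESAPI.encoder().encodeForJavaScript(userInput);    // inside a JS string
        String forCss       = ESAPI.encoder().encodeForCSS(userInput);           // inside a style value

        System.out.println(forBody);
        System.out.println(forAttribute);
        System.out.println(forScript);
        System.out.println(forCss);
    }
}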
We use:
ESAPI.validator().getValidSafeHTML(context, input, maxLength, allowNull).
when we want to get "safe" HTML from a client, backed by an AntiSamy policy file that describes exactly which HTML tags and HTML attributes we will accept from the user. The default is deny, so you have to explicitly tell the policy file what you will accept, e.g. a link such as:
<a href="...">text</a>
You need to specify that you want the "a" tag, and that you will allow an "href" attribute, and you can even specify further rules against the content within the text fields and tag attributes.
You only need "getValidSafeHTML" if your application needs to accept HTML content from the user... which is usually specious in most corporate applications. (Myspace used to allow this, and the result was the Samy worm.)
Generally, you use the validator API when content is coming into your application, and the encoder API when you direct content back to a user or a backend interpreter. AntiSamy isn't supported anymore, so if you need a "safe HTML" solution, use OWASP's HTML Sanitizer.
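For the "safe HTML" case, a minimal sketch with the OWASP Java HTML Sanitizer (assumes the owasp-java-html-sanitizer dependency; the allowed tags and attributes here are just an example policy, not a recommendation):

import org.owasp.html.HtmlPolicyBuilder;
import org.owasp.html.PolicyFactory;

public class SafeHtmlExample {
    public static void main(String[] args) {
        // Whitelist policy: only <a href> and a few formatting tags survive.
        PolicyFactory policy = new HtmlPolicyBuilder()
                .allowElements("a", "b", "i", "p")
                .allowAttributes("href").onElements("a")
                .allowStandardUrlProtocols()
                .toFactory();

        String untrusted = "<a href='http://example.com' onclick='steal()'>link</a><script>alert(1)</script>";
        String safe = policy.sanitize(untrusted);
        // Roughly: <a href="http://example.com">link</a>, with onclick and <script> stripped.
        System.out.println(safe);
    }
}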
Are there any newer Java libraries for web security, similar to ESAPI,
that I have not come across? I have found https://jsoup.org/,
but it addresses only XSS; I am looking for a library that
provides APIs against several classes of attack (SQL injection, XSS).
The only other one that attempts a similar amount of security is HDIV. Here is an answer that compares HDIV to ESAPI by an HDIV developer.
*DISCLAIMER: I am an ESAPI developer, and OWASP member.
Sidenote: I discourage the use of Jsoup, because by default it mutates incoming data, constructing "best guess" (invalid) parse trees, and doesn't allow you fine-grained control of that behavior... meaning, if there's an instance where you want to override and mandate a particular kind of policy, Jsoup asserts that it is always smarter than you are... and that's simply not the case.

RESTful resource representation - human web vs programmable web

I'm designing RESTful resources for accessing media. Media could be a live stream or an archived stream. I'm using O'Reilly's text "RESTful Web Services" as a guide, but I'm struggling with the representation of resources relative to the 'programmable web' versus the 'human web'. For human web requests I'd like to return an HTML representation. For programmable web requests I'd like to return XML. That being said, consider:
GET http://localhost:8080/stream - returns a list of streams
GET http://localhost:8080/search?stream=abc - returns a specific stream
How do I differentiate between a request from the 'human web' versus the 'programmable web' such that I could return the right representation?
O'Reilly's text seems to suggest designing two separate resources. From page 24 of the PDF, the authors state:
I’d use the same tools to fetch and process a web page.
These two URIs:
1) http://api.search.yahoo.com/WebSearchService/V1/webSearch?appid=restbook&query=jellyfish
2) http://search.yahoo.com/search?p=jellyfish
point to different forms of the same thing: “a list of search results for the query ‘jellyfish.’”
One URI serves HTML and is intended for use by web browsers; the other serves
XML and is intended for use by automated clients.
Are two separate resources for dealing with the human web versus the programmable web the norm, or are there alternatives? Thoughts welcomed.
I'd say the official "Fielding-compliant" answer is to use content-type negotiation using the Accept header. Lots of good stuff at http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
If the client requests text/html, feed it the human-readable HTML. If the client requests text/xml, feed it XML. The trick here is that pragmatically this isn't always well supported by clients, so you'll often need a bunch of fallbacks using query strings or resource name mangling, as in the example you posted.
Personally, I try to follow the ideology as long as I can, and then start adding fallbacks pragmatically as necessary. I wouldn't create a separate resource for programmatic or human consumption until you run into a client that can't properly handle sending an Accept header.
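A rough sketch of that approach in a plain servlet (assuming the javax.servlet API; newer containers use the jakarta namespace; the ?format= fallback parameter and the response bodies are illustrative, not from the question):

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class StreamListServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // Pragmatic fallback first: an explicit ?format= overrides the Accept header.
        String format = req.getParameter("format");
        String accept = req.getHeader("Accept");

        boolean wantsXml = "application/xml".equals(format)
                || (format == null && accept != null && accept.contains("xml"));

        if (wantsXml) {
            resp.setContentType("application/xml");
            resp.getWriter().write("<streams><stream href=\"/stream/abc\">ABC</stream></streams>");
        } else {
            resp.setContentType("text/html");
            resp.getWriter().write("<ul><li><a href=\"/stream/abc\">ABC</a></li></ul>");
        }
    }
}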
Your example doesn't match the question, so I will answer both.
In the example you give, you have two different resources: a list of streams, and an individual stream. As such they should be allocated separate URIs, and I would strongly recommend against using the query string for that where there is a clean and obvious alternative.
In this case it is classic REST. /stream/ is the Resource consisting of the list of available streams; this list should be presented as either a human- or computer-readable (or preferably both) list of URIs, so (as text/html):
<ul>
<li><a href="/stream/abc">ABC</a></li>
...
</ul>
This leads to your next question, how to identify the different Representations of the stream-list Resource. There are three techniques I have used: content negotiation, format query param, and RDFa.
RDFa is my preferred alternative; in this case you only have one representation that encodes both human- and machine-readable content. In the case of a simple list this is a trivial change to your HTML:
<ul>
<li><a rev="rdfs:member" href="/stream/abc">ABC</a></li>
...
</ul>
If you have one or more pure machine serializations of your data then there are two alternatives I have used; generally, both at the same time.
Content negotiation is the purest, and most convenient. Just have one text/html, and another application/xml or application/json, and let the client choose.
This isn't as convenient when testing the machine version from a browser, command line (curl/wget/etc.), or a script. So I like to also support a format query parameter. For convenience's sake, have it take a mime type.
I prefer to have my resource handled by the same controller/servlet/etc, have it fetch the information from the filesystem/database/whatever, and dispatch it to the appropriate view based on the mime-type (content neg or format param) for display. Either way you are dealing with different representations of the same resource, so it is a good idea to ensure they are available from the same Base URI, whatever alternative approaches you decide to support.
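One way to wire that up so both representations hang off the same base URI is JAX-RS content negotiation via @Produces. A sketch (assumes a JAX-RS javax.ws.rs runtime; the response bodies are placeholders for whatever your views produce):

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/stream")
public class StreamResource {

    // Same resource, same URI; the container picks the method whose
    // @Produces best matches the request's Accept header.

    @GET
    @Produces(MediaType.APPLICATION_XML)
    public String listAsXml() {
        return "<streams><stream href=\"/stream/abc\">ABC</stream></streams>";
    }

    @GET
    @Produces(MediaType.TEXT_HTML)
    public String listAsHtml() {
        return "<ul><li><a href=\"/stream/abc\">ABC</a></li></ul>";
    }
}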

How can the client side encoding be bypassed in XSS

I hear everyone saying output encoding has to be done client-side instead of server-side. My question is: doesn't it vary with context?
Are there cases where client-side output encoding is good enough and can't be bypassed?
If I use a client-side JS function like encodeURIComponent to encode a URL causing XSS, how can an attacker bypass this and still cause XSS?
Phishing can also happen due to XSS. If I at least do output encoding, can phishing be prevented?
The short answer is that XSS encoding needs to happen where data is put into HTML or JavaScript, be it server-side and/or client-side. I could easily imagine data put into a script tag on the server side being properly encoded, but then JavaScript on the client side using that value in an insecure way, creating an XSS vulnerability.
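A sketch of that failure mode (the element id and data attribute are made up): the server correctly HTML-encodes the value into an attribute, but client-side script then feeds it to innerHTML, which undoes the protection.

<!-- Server-side output encoding was done correctly for the attribute context -->
<div id="greeting" data-name="&lt;img src=x onerror=alert(1)&gt;"></div>
<script>
  var el = document.getElementById('greeting');
  // getAttribute returns the decoded value, so innerHTML re-parses the
  // attacker's markup: DOM-based XSS despite correct server-side encoding.
  el.innerHTML = 'Hello ' + el.getAttribute('data-name');
  // Safe alternative: el.textContent = 'Hello ' + el.getAttribute('data-name');
</script>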
So when putting untrusted data into a web page (be it in an HTML tag, inside <script> tags, in CSS, etc. - see the OWASP XSS Prevention Cheat Sheet) we need to encode. Then when we come to the client side, we also need to make sure our JavaScript does not introduce XSS problems. This could, for instance, be DOM-based XSS, or the example mentioned above.
So my answer is, you need to do encoding both on the server AND client side.
I don't understand how the 3rd question is related. Phishing could happen in so many different ways, e.g. on a completely different domain just mimicking the original page.
Edit: One more thing. If untrusted data is put into the page server-side without encoding, there is very little the client side can do to fix that. It's most likely already too late.
Erlend's answer is beautiful. I want to share my findings regarding output encoding.
Output encoding done on the server side is better than on the client side.
You can get more knowledge regarding output encoding from the OWASP XSS Prevention cheat sheet,
and you can do this client-side too. If you are going to use untrusted (user-given) data in an HTML context, please use the DOM's native innerText property (IE docs; textContent for Mozilla) or encode the characters (<, >, ', ", /) into HTML entities.

Can you please correct me if I am wrong about REST?

REST is used for communication between any two systems.
So if you want to get info from one machine, we have to use the GET method, and to add info to a system we need to use the POST method. Likewise PUT and DELETE.
When a machine GETs the resource, it will ask for the machine-readable one. When a browser GETs a resource for a human, it will ask for the human-readable one.
So when you are sending a request from machine 1, it will go to some machine X. Machine X will send a machine-readable format to machine 1. Now the browser changes it to a user-readable format.
So JSON is a machine-readable format and HTML is a client-readable format... Correct me if I am wrong?
REST is an architectural style, not a technology. That being said, the only technology that most people know that is intended to align with the REST architectural style is HTTP. If you want to understand the REST architectural style, I recommend the following two resources:
Roy Fielding's presentation "The Rest of REST" (http://roy.gbiv.com/talks/200709_fielding_rest.pdf)
The book "RESTful Web Services"
When you send a GET request for a resource, it is up to the server to determine what representation (format, e.g. html vs. json) it wishes to send back. The client can send along an Accept header that specifies a set of preferred formats, but it's ultimately up to the server to decide what it wants to send. To learn more about this interaction, Google on "HTTP content negotiation".
The reason browsers tend to get back HTML is that they send an Accept header with "text/html". If you somehow configured your browser to always send an Accept header of only "application/json", you would sometimes get JSON back (if the server supported JSON representations), sometimes HTML (if the server ignored your Accept header) and sometimes an error saying that the server could not support the representation you requested.
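For instance, a non-browser client can ask for JSON explicitly. A sketch with Java's built-in HttpClient (Java 11+); the URL is a placeholder:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AcceptHeaderExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/stream"))
                .header("Accept", "application/json")   // ask for the machine-readable representation
                .GET()
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // The server decides: it may honour the Accept header, ignore it, or return an error.
        System.out.println(response.headers().firstValue("Content-Type").orElse("unknown"));
        System.out.println(response.body());
    }
}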
A computer can parse either JSON or HTML if you have the right libraries. JSON content tends to be structured data (optimized for parsing) and HTML tends to be optimized for presentation, so JSON is generally much easier for a program to parse.
Sounds about right to me. HTML is definitely for end-user consumption (despite all the nasty screen-scraping code out there) and there's no way that I'd want to deliver JSON (or XML or YAML) to end-user clients for direct display.
You probably want to ensure that you only deliver HTML that matches up with the same basic data model that you're delivering to mechanical clients; producing XHTML on demand by applying an XSLT stylesheet to an XML version of your standard responses is probably the easiest way to do it since it's likely you can do that with a layer that's independent of your basic application.
To be pedantic, both HTML and JSON are machine readable formats. The difference is that HTML has a specification that describes some semantics that web browsers know how to interpret in order to visually render it. The JSON spec has really no semantics other than defining how to serialize arrays, objects and properties.
Don't forget JSON and HTML are just two of the hundreds of potentially useful media type formats that a RESTful system can use.