Is it okay to add custom keys to the object in a progressive web app `manifest.json` file? - progressive-web-apps

I want to add a custom key to the manifest.json file for a progressive web app.
The MDN page doesn't mention custom keys:
Web App Manifest | MDN
The spec:
Web App Manifest
includes this text in the section "3.1 Media type registration" under a sub-heading "Security and privacy considerations":
As the manifest format is JSON and will commonly be encoded using [UNICODE], the security considerations described in [ECMA-404] and [UNICODE-SECURITY] apply. In addition, because there is no way to prevent developers from including custom/unrestrained data in a manifest, implementors need to impose their own implementation-specific limits on the values of otherwise unconstrained member types, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
Are there known limitations or restrictions on the use of custom keys in manifest.json files?

According to the standard (SO-59512547):
Browsers shall ignore any values starting with X-, a common abbreviation for custom headers in HTTP and e-mail, as they are to be used exclusively by developers.
My use case is sending boot-up data early over HTTP/2, things that would normally live in headers or environment variables but that I would nonetheless want to keep dynamic, such as socket endpoints, custom loading UI, and console logging level. Loading manifest.json is extremely common, so it can serve as a standard boot-config file better than a custom-named JSON file that we would have to tell the server to prefetch alongside any request for index.html.
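As a sketch (following the X- convention quoted above; every X-boot field here is made up for illustration), such a manifest could look like:

{
  "name": "My App",
  "start_url": "/",
  "display": "standalone",
  "X-boot": {
    "socketEndpoint": "wss://example.com/rt",
    "logLevel": "warn",
    "loaderTheme": "dark"
  }
}

Browsers process the members they know and ignore the rest, so a member like this should not affect installability; your own bootstrap code can fetch the manifest and read X-boot itself.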
Australians, Mooners, Martians will thank me.

Related

REST API Localization - Headers vs Payload

We have one POST API live in production. Now we have a requirement to accept Localization information and proceed with execution accordingly.
e.g. if distanceUnit is "KM" then process all incoming data in Kilometers.
There are three options I could think of to accept localization information.
As an HTTP header, i.e. localization: {"distanceUnit": "km"}
As a part of the payload itself.
As a request parameter.
I like the 1st option as
it doesn't change the API contract.
It's easier for other APIs to send this info in case they need to be localized in the future.
Localization is a part of content negotiation, so I don't think it should be part of the payload/query parameters.
Any opinions here would be helpful to zero in on the first or second option.
Thanks.
While Accept-Language, as indicated by the link Kit posted, may be tempting, it only supports registered languages, maintained by IANA, the standardization body of the Web, and not arbitrary generic configuration options out of the box. It may be tempting to default to miles for, say, Accept-Language: en-US and use km elsewhere, but American scientists may have issues with your application if they want to use km instead of miles. If that is not a concern, this could clearly be an option to consider. In regards to custom HTTP headers, I wouldn't recommend using those, as the general problem with custom HTTP headers is that arbitrary generic HTTP clients do not support them, which somewhat contradicts the reason one should use a REST architecture in the first place.
Let us transfer your problem to the Web domain for a second and see how we usually solve that task there. As REST is basically just a generalized approach to the common way we humans interact with the Web, any concepts used on the Web also apply to a REST architecture. Thus, designing the whole interaction flow as if your application were interacting with a typical Web page is just common practice (or at least should be).
On the Web, a so-called Web form is used to "teach" a Web client (a.k.a. browser) what data the server expects as input. It not only teaches the client which properties the server expects or supports for a certain resource, but also which HTTP method to use, the target URI to send the request to, and the media type to use, which is often implicitly application/x-www-form-urlencoded but may also be multipart/form-data.
The usage of forms and links falls under the HATEOAS constraint, where these concepts allow clients to progress through their task, i.e. buying an item in a Web shop or administering users in a system, without ever having to consult external documentation. Applications basically just use the built-in hypermedia capabilities to progress through their tasks. Clients usually follow some kind of predefined process where the server instructs them on what to do in order to add an item to the shopping cart, or on how to add or edit a user, while still just operating on a generic HTML document that by itself isn't tailored to the task at hand. This approach allows Web clients to render all kinds of pages and users to interact with those generic pages. If something in the page representation changes, your browser will automatically adapt and render the new version on the next request. Hence, the system is able to evolve over time and adapt to changes easily. This is probably one of the core reasons why anyone wants to use a REST architecture in the first place.
So, back to the topic. On the Web, a server would advertise to a client that it supports various localization options with the above-mentioned forms. A user might be presented with a choice or dropdown where s/he can select the appropriate option. The user usually does not care how this input is transferred to the server, or about the internals of the server at all. All s/he cares about is that the data is available after the request is submitted (in the case of adding or updating a resource). This also holds true for applications in a REST architecture.
You might see a pattern here. REST and the browsable Web are basically the same thing. The latter focuses on human interaction, while the former should allow applications to "surf the Web" and follow along the processes outlined by the server (semi-)automatically. As such, it should be clear by now that the same concepts that apply to the browsable Web also apply to REST and the applications in that REST architecture.
I like the 1st option as ... it doesn't change the API contract
Clients shouldn't bind to a particular API, as this creates coupling, which REST tries to avoid at all costs. Instead of directly binding to an API, the Web, and as such also REST, should use contracts built on hypermedia types that define the admissible syntax and semantics of the messages exchanged. By abstracting the contract away from the API itself and into the media type, a client can support various contracts simultaneously. The generality of a media type furthermore makes it possible to express various different things with the same media type, which increases the likelihood of reuse and thus better integration support in application layers.
Supporting various media types is similar to speaking different languages. By being able to speak various languages you increase the likelihood that you will be able to communicate with other people (services) out of the box, without having to learn those languages first. A client can tell a server via the Accept header which media types it is able to "speak" (a.k.a. process), and the server will either respond in one of them or reply with 406 Not Acceptable. That error response is, as Jim Webber put it, coordination data that at all times tells you whether everything went well or, in case of failure, gives you feedback on what went wrong.
In order to stay future-proof, I would therefore suggest designing the configuration around hypertext-enabled media types that support forms, e.g. HTML forms, application/hal-forms+json or application/ion+json. If in the future you need to add further configuration options, adding them is a trivial task. Whether that configuration is exposed as its own resource which you just link to, embedded within the resource, or not returned to the client at all is also a choice you have. If the same configuration may be used by multiple resources, it would be beneficial to expose it as its own resource and reference it from the resources that use it, but as mentioned, these are design decisions you have to make.
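As a rough sketch of that idea, a HAL-FORMS document advertising the localization choice might look like this (the resource path and property names are illustrative; only the _links/_templates structure comes from the media type):

{
  "_links": { "self": { "href": "/trips" } },
  "_templates": {
    "default": {
      "method": "POST",
      "contentType": "application/json",
      "properties": [
        { "name": "distanceUnit", "required": true, "options": { "inline": ["km", "mi"] } },
        { "name": "distance", "required": true, "regex": "^[0-9]+$" }
      ]
    }
  }
}

A client that understands application/hal-forms+json can render this as a dropdown, much like a browser renders an HTML select element.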
If the POST request body is the only place where this is used, and you never have to do GET requests and automatically apply any conversion, my preference would probably go to adding it to the body.
It's nice to have a full document that contains all the information to describe itself, without requiring external out-of-band data to fully interpret its meaning.
You might like to define your schema to always include the unit in relevant parts of the document, for example:
distance: [5, 'km']
or, as you said, do it once at the top of the doc.
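For instance, a body that declares the unit once at the top could look like this (field names are illustrative):

{
  "distanceUnit": "km",
  "legs": [
    { "from": "Berlin", "to": "Hamburg", "distance": 289 },
    { "from": "Hamburg", "to": "Bremen", "distance": 126 }
  ]
}

This keeps the document self-describing: anyone reading it later can interpret the numbers without consulting out-of-band headers.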

What is the use of the suffix in Sling URLs

Sling provides functionality to ease resource resolution. Its ability to resolve to the exact resource representation we need is very useful in content-based applications.
However, one thing I am not able to understand is the use of the suffix.
Example:
http://localhost:4502/content/app/mycomponent.large.html/something.html
Here, "something.html" is the suffix. I want to know under what circumstances would I go for a suffix ? What advantages do we get when compared to passing the information as a selector ?
Pretty hard question, but I will try to clear it up a bit.
According to best practices, selectors should not be treated as input parameters to functions. This means you should use selectors only for registering servlets (or in JSP file names), and a selector should tell Sling which operation you want to perform on the given resource or the way it should be displayed.
For example, let's imagine that you have a page /page/a.html and some special representation of it for mobile devices. Then accessing it as /page/a.mobile.html will open this page in a mobile-friendly way.
On the other hand, the suffix is usually used to provide additional information to your servlet/JSP page. Just check the editor interface in Touch UI: the URL looks like
localhost:4502/editor.html/content/pageYouEdit.html
So you always stay on the same page, /editor.html, but the suffix tells the edit interface which page to edit.
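As a minimal sketch, assuming a hypothetical servlet (registration details omitted), both parts of the URL can be read through Sling's RequestPathInfo API:

import java.io.IOException;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.request.RequestPathInfo;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;

public class MyComponentServlet extends SlingSafeMethodsServlet {

    @Override
    protected void doGet(SlingHttpServletRequest request, SlingHttpServletResponse response)
            throws IOException {
        RequestPathInfo pathInfo = request.getRequestPathInfo();
        // For /content/app/mycomponent.large.html/something.html:
        String[] selectors = pathInfo.getSelectors(); // ["large"] - how to render this resource
        String suffix = pathInfo.getSuffix();         // "/something.html" - extra input for the logic
        response.setContentType("text/plain");
        response.getWriter().println("selectors=" + String.join(".", selectors) + ", suffix=" + suffix);
    }
}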
Also another example:
there is a nice library for including content dynamically - https://github.com/Cognifide/Sling-Dynamic-Include.
When it's configured for some component, then after the page is loaded your component will be included with an AJAX call, like this:
publish/pathToThePage/_jcr_content/pathToTheComponentNode.nocache.html//apps/pathToTheRenderer
In this example you can see that both a selector and a suffix are used. The selector tells what is special about the representation of this component we need, and the suffix tells which component should render the requested data.
The suffix is used to provide different versions of a resource that remain cacheable. This plays nicely with the Apache HTTP Server module known as "Dispatcher", which Adobe architects will recommend in any AEM implementation.
http://me.com/page.html/todays_promotion <-- cacheable
http://me.com/page.html?todays_promotion <-- not cacheable
The second example there, with a request parameter, should be treated as a variable resource that could produce different results upon each request.

Need help in identifying the difference between ESAPI.validator() and ESAPI.encoder()

We are implementing application security in our website. It's a REST-based application, so I will have to validate the whole request payload rather than each attribute. This payload needs to be validated against all types of attacks (SQL injection, XSS, etc.). While browsing I found that people are using ESAPI for web security.
I am confused between the ESAPI.validator().getValidXXX and ESAPI.encoder() Java APIs of the ESAPI library. What is the difference between the two, and when should each be used? I would also like to know in what cases we might use both APIs.
As per my understanding, I could encode an input to form valid HTML using both APIs.
E.g.:
ESAPI.encoder().encodeForHTML(input);
ESAPI.validator().getValidSafeHTML(context, input, maxLength, allowNull).
For XSS attacks, I have made code changes to strip off HTML tags using Java's Pattern and Matcher, but I would like to achieve the same using ESAPI. Can someone help me with how to achieve it?
Or
Are there any new Java plugins developed for web security similar to ESAPI which I did not come across? I have found https://jsoup.org/, but it solves only XSS attacks; I am looking for a library which provides APIs for several attacks (SQL injection/XSS).
ESAPI.encoder().encodeForHTML(input);
You use this when you're sending input to a browser, so that the data you're sending gets escaped for HTML. This can get tricky, because you have to know whether that exact data is, for example, being passed to JavaScript before it is rendered into HTML, or whether it's being used as part of an HTML attribute.
We use:
ESAPI.validator().getValidSafeHTML(context, input, maxLength, allowNull).
when we want to get "safe" HTML from a client, backed by an AntiSamy policy file that describes exactly what kinds of HTML tags and HTML attributes we will accept from the user. The default is deny, so you have to explicitly tell the policy file what you will accept. If, for example, you want to accept links like
<a href="http://example.com">link text</a>
you need to specify that you want the "a" tag and that you will allow an "href" attribute, and you can even specify further rules against the content within the text fields and tag attributes.
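For illustration, the corresponding tag rule in an AntiSamy policy file might look roughly like this (the onsiteURL regexp name follows the shipped example policies; treat the exact shape as a sketch):

<tag name="a" action="validate">
    <attribute name="href" onInvalid="filterTag">
        <regexp-list>
            <regexp name="onsiteURL"/>
        </regexp-list>
    </attribute>
</tag>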
You only need "getValidSafeHTML" if your application needs to accept HTML content from the user... which is usually specious in most corporate applications. (Myspace used to allow this, and the result was the Samy worm.)
Generally, you use the validator API when content is coming into your application, and the encoder API when you direct content back to a user or a backend interpreter. AntiSamy isn't supported anymore, so if you need a "safe HTML" solution, use OWASP's HTML Sanitizer.
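A minimal sketch of the two call sites, assuming illustrative variable names and a configured ESAPI.properties / AntiSamy policy on the classpath:

import org.owasp.esapi.ESAPI;
import org.owasp.esapi.errors.ValidationException;

public class EsapiExamples {

    // Encoder: output encoding, applied when untrusted data is written into an HTML page.
    static String renderComment(String userComment) {
        return "<p>" + ESAPI.encoder().encodeForHTML(userComment) + "</p>";
    }

    // Validator: input validation, applied when constrained HTML is accepted from the user.
    static String acceptBio(String userBio) {
        try {
            // "bio" is a label for error messages; 10000 is the max length; false = null not allowed.
            return ESAPI.validator().getValidSafeHTML("bio", userBio, 10000, false);
        } catch (ValidationException e) {
            throw new IllegalArgumentException("Rejected unsafe HTML input", e);
        }
    }
}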
Are there any new Java plugins developed for web security similar to
ESAPI which I did not come across? I have found https://jsoup.org/,
but it solves only XSS attacks; I am looking for a library which
provides APIs for several attacks (SQL injection/XSS).
The only other one that attempts a similar amount of security is HDIV. Here is an answer that compares HDIV to ESAPI by an HDIV developer.
*DISCLAIMER: I am an ESAPI developer, and OWASP member.
Sidenote: I discourage the use of Jsoup, because by default it mutates incoming data, constructing "best guess" (invalid) parse trees, and doesn't allow you fine-grained control of that behavior... meaning, if there's an instance where you want to override and mandate a particular kind of policy, Jsoup asserts that it is always smarter than you are... and that's simply not the case.

Scraping WebObjects website & REST

I need to programmatically interact with a WebObjects website and extract data from the responses. The particular WebObjects site I am scraping uses component actions and stores sessions in cookies (not URLs). This means that all URLs look something like this:
http://example.com/WOApp/WebObjects/WOApp.woa/wo/7.0.0.0.29.1.1.1
My first questions are:
Do URLs like this not completely destroy local and shared caching opportunities (the cacheable constraint in REST)? I imagine the only effective caching with such URLs happens in the WebObjects server itself.
Isn't addressability broken as well? Each resource does have a unique endpoint, but it changes constantly. Furthermore, I think WebObjects also invalidates URLs that are too old, since they "time out" after a period of time. I'm not sure whether this applies only to URLs with sessions, though.
Regarding the scraping, I am not sure whether it's possible to extract any meaningful endpoints from the website. For example, with a normal website I would look through the HTML and extract the POST URLs, then use them in my scraper by posting directly to them instead of going through the normal request-response cycle.
In this case I obviously cannot use any URLs extracted from the HTML since they are dynamically generated on each request, but I read something about being able to access WebObjects components directly if the security settings have not been set to disallow this (see https://developer.apple.com/legacy/library/documentation/LegacyTechnologies/WebObjects/WebObjects_3.5/PDF/WebObjectsDevGuide.pdf, p. 53 "Limitations on Direct requests"). I don't understand exactly how to do this, though, or whether it's even possible.
If it's not possible, what would be a good approach then? The only options I can think of are:
Using a full-blown browser client to interact with the website (e.g. Watir or Selenium) and extracting & processing the HTML from its responses
Manually extracting the dynamic endpoints by first requesting the page they appear on and finding where they're located in the HTML, then using them afterwards as if they were "static".
I am interested in opinions on how to approach this scenario since I don't believe any of the solutions above are particularly good.
You've asked a number of questions, and I'll see if I can cover each in turn.
Do URLs like this not completely destroy local and shared caching
opportunities (the cacheable constraint in REST)? I imagine the only
effective caching with such URLs happens in the WebObjects server itself.
There is, indeed, a page cache within the WebObjects application server, and you're right to observe that these component action URLs probably thwart any other kind of caching. Additionally, even though the session ID is not present in the URL, you'd need the session ID in the cookie to re-create the same page, so having just that URL would get you a session restoration error from the application server.
Isn't addressability broken as well? Each resource does have a unique
endpoint, but it changes constantly.
Well, yes, on the face of it this is true. You've given a component action URL as an example, and they're tied to the session.
Furthermore, I think
WebObjects also invalidates URLs that are too old, since they "time
out" after a period of time. I'm not sure whether this applies only to
URLs with sessions, though.
Again, all true. Component action URLs generate sessions, and sessions time out.
At this point, let me take a quick diversion. I'm assuming you're not the owner of the WebObjects application—you're talking about having to scrape a WebObjects app, and you've identified some ways in which this particular app doesn't conform to REST principles. You're completely right—a fully component-action-based WebObjects application won't be RESTful. WebObjects pre-dates REST by a few years. Having said that, there are ways in which a WebObjects application can be completely RESTful:
Using session-less direct actions gives a degree of REST-like behaviour, and would certainly solve the problems you identify with caching, addressability and expiry (see the example URL after this list).
Using the ERRest framework to create a 100% RESTful application.
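For instance, a session-less direct-action URL is stable across requests; a hypothetical one might look like:
http://example.com/WOApp/WebObjects/WOApp.woa/wa/ProductPage/show?productId=42
Unlike the component-action URLs under /wo/, the /wa/ path maps to a named action, so the URL can be bookmarked, cached and revisited.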
Of course, none of this will help you if you're just trying to scrape a legacy application.
Regarding the scraping, I am not sure whether it's possible to extract
any meaningful endpoints from the website. For example, with a normal
website I would look through the HTML and extract the POST URLs, then
use them in my scraper by posting directly to them instead of going
through the normal request-response cycle.
Again, if it's a fully component action-based application, you're right—all those URLs will be dynamically generated and useless to you.
In this case I obviously cannot use any URLs extracted from the HTML
since they are dynamically generated on each request, but I read
something about being able to access WebObjects components directly if
the security settings have not been set to disallow this…
That's talking about getting a component to render directly from its template with some restrictions:
As you note, the application can easily prevent it from happening at all.
As mentioned on p.53, the user input and action-invocation phases of rendering the component are skipped, which probably means this approach would be limited to rendering a component that didn't have any dynamic content anyway. This might be of some very limited use to you, though you'd need to know the component names you were interested in, and they wouldn't normally be exposed anywhere.
I'm not sure you're going to find anything better than the types of high-level functional approaches you've already suggested above, such as automating at the browser level with Selenium. If what you need is REST-style direct addressability of resources within the application, you're not going to get that unless you can re-write the application to use direct actions or ERRest where you need them.
A little late, but could help.
I use Apache's mod_ext_filter (slightly modified) to pre-/post-filter the requests and responses of our WebObjects application. The filter calls PHP scripts and can read the dynamic hyperlinks and other things from the HTML pages. The scripts can also modify the HTTP requests, so we can programmatically add/remove parameters from the request to implement new workflows in front of the legacy app and clean up the requests before they reach WebObjects. It is also possible to maintain an additional database within the scripts and store some things across multiple requests.
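As a sketch, the stock mod_ext_filter wiring for such a setup could look like this (the filter name and script path are illustrative; our modified module differs in details):

# Run every HTML response from the WebObjects app through a PHP rewrite script
ExtFilterDefine woRewrite mode=output intype=text/html \
    cmd="/usr/bin/php /opt/filters/rewrite-response.php"

<Location "/WOApp">
    SetOutputFilter woRewrite
</Location>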
So you can get the dynamically created links (maybe a button's name or an HTML form destination) and recognize these names within the request.
It is also possible to "remote control" such applications with little scripts like "click on the third button on the page". The only thing you need is a DOM parser to get the structure of the HTML pages and then rebuild the actions the browser would perform (i.e. create the HTTP request manually and send it as a POST to the extracted form destination href). The only problem is the JavaScript code, which we analyze and reprogram within PHP (i.e. enabling/disabling input elements so they will not be transmitted within the requests).
There were some problems with the WebObjects adapter module for Apache. It still uses Content-Length within the HTTP header, which you cannot change in mod_ext_filter. If you change the HTML or the parameters within the request, the length of the content will no longer match. But it is possible to change that.
Theoretically it could also be possible to control such a closed-source legacy application from a new UI on a tablet or smartphone, which delegates the user interaction to the backend WebObjects app.
The scripts depend on the page structure, so if your WebObjects app changes, you have to correct some things in the scripts (i.e. the third button could now be the fourth button).
It should also be possible to put a RESTful interface in front of the application and have the filter scripts query the data from the legacy app.

Adding custom metadata in Alfresco

I added custom metadata through the XML configuration specified in their wiki ... I can see the aspect I added in the /share application under Manage Aspects, but it is not listed in the /alfresco app, and when I upload a document using the REST API it says it is unable to find the field I added.
Share and the old Alfresco Web Client have different configurations.
Check these resources out for more information:
http://wiki.alfresco.com/wiki/Web_Client_Customisation_Guide
http://wiki.alfresco.com/wiki/Displaying_Custom_Metadata
Please read this tutorial which covers creating custom content types and aspects and exposing those to both the Share (/share) and Explorer (/alfresco) web clients.
It sounds like you may have multiple problems, though, beyond configuration, because the REST API should be able to see your custom model, if it is defined correctly, regardless of whether or not it is configured in either of the two web clients.
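For reference, a minimal content-model sketch defining one aspect looks roughly like this (the namespace, prefix, aspect and property names are illustrative; the surrounding dictionary grammar is Alfresco's):

<model name="my:customModel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
    <imports>
        <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d"/>
    </imports>
    <namespaces>
        <namespace uri="http://example.com/model/content/1.0" prefix="my"/>
    </namespaces>
    <aspects>
        <aspect name="my:clientDetails">
            <title>Client Details</title>
            <properties>
                <property name="my:clientName">
                    <type>d:text</type>
                </property>
            </properties>
        </aspect>
    </aspects>
</model>

Remember that this model only defines the data; Share and the old Explorer client each need their own UI configuration before the new fields show up in their forms.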