Portal URL from Portlet

Is it possible to get the Portal base URL (like http://www.thisismyportal.com) from a Portlet using the Portlet 2.0 API?
Right now I'm planning to build it manually by concatenating PortletRequest.getServerName(), PortletRequest.getServerPort() and PortletRequest.getContextPath(), but it seems kind of clumsy (and there's no PortletRequest.getProtocol()).

While it is clumsy, it is the safest way to construct the URL; and while there is no PortletRequest.getProtocol() method, you can infer the protocol using the PortletRequest.isSecure() method.
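For illustration, here is a minimal sketch of that construction, assuming request is the current PortletRequest; the default-port check is just an extra nicety to keep the generated URL clean:

    // Infer the scheme from isSecure(), since there is no getProtocol()
    String scheme = request.isSecure() ? "https" : "http";
    int port = request.getServerPort();
    StringBuilder url = new StringBuilder(scheme).append("://")
            .append(request.getServerName());
    // Omit the port when it is the default for the scheme
    if (!("http".equals(scheme) && port == 80)
            && !("https".equals(scheme) && port == 443)) {
        url.append(':').append(port);
    }
    url.append(request.getContextPath());
    String baseUrl = url.toString(); // e.g. http://www.thisismyportal.com/portal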
I would advise against using an external configuration for the base URL, for a couple of reasons.
First, it would be yet another configuration item for you to maintain across environments (test, integration, production and so forth). There's very little justification to hold, in configuration, something that is fully reproducible using the current request.
Second, under certain circumstances, it might be impossible to designate a particular URL as a "base URL" for the portal. An example would be the case in which the portal server is associated with multiple hosts, or multiple host aliases.

We had those configuration properties in a Resource Environment Provider for the purpose of generating external URLs to send in emails. It was a specific solution, and it wasn't a problem for us since we had other properties stored there as well, so we knew it would be available at runtime. I don't know if that suits your needs; it depends on your scenario.
Also, we used https only during login, so we always generated http URLs.
Hope this helps.

Datasnap method name

I've written a REST server in Delphi XE (using the wizard) and I want to change the URLs a bit so that instead of having
http://192.168.1.84:8080/datasnap/rest/TServerMethods1/GetListings
I get something that looks more like http://192.168.1.84:8080/GetListings
Is there a nice easy way of doing this?
The naming convention is (Delphi XE3):
http://my.site.com/datasnap/rest/URIClassName/URIMethodName[/inputParameter]
You can easily change the "datasnap" and "rest" parts of the URL in the TDSHTTPWebDispatcher component properties, and you can change the class-name and method-name parts by simply renaming your class and method. However, you still have to have four components in the URL, so for example it could be:
http://my.site.com/api/v1/People/Listing
See here:
http://docwiki.embarcadero.com/RADStudio/XE3/en/REST#Customizing_the_URL_for_REST_requests
You could put IIS or Apache in between to accomplish this, and indeed rewrite the URL to point to your service the way you like.
That provides some additional advantages anyway (mostly security and scalability). For example, you can create a fail-safe setup with redundant servers, or run your service on multiple machines and have the web server do the load balancing.
You'll also get extra logging capabilities, and if you want to serve other web content as well, it's handy to have a full-fledged web server in front anyway.
URL rewriting is usually done in the web server configuration; in Apache, for example, using entries in an .htaccess file.
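As a hedged sketch, a rewrite for the DataSnap example above might look like this in Apache (assuming the service listens locally on port 8080 and that mod_rewrite and mod_proxy are enabled; the class and method names are taken from the question):

    RewriteEngine On
    # Map the short URL onto the DataSnap naming scheme and proxy it through
    RewriteRule ^GetListings$ http://localhost:8080/datasnap/rest/TServerMethods1/GetListings [P]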

Best practices for REST API that access multiple environments

When writing a RESTful API that needs to access different environments, such as a lab/test database and a production database, what are the best practices around setting up the API?
Should there be a @PathParam?:
/employee/{emp_id}/{environment}
/{environment}/employee/{emp_id}/
Should there be a @QueryParam?:
/employee/{emp_id}/?environment="test"
/employee/{emp_id}/?environment="prod"
Should there be a field in the payload?:
{"emp_id":"123","environment":"test"}
{"emp_id":"123","environment":"production"}
In fact, I see two ways to handle this. The choice between them comes down to which is more convenient to implement in your RESTful application.
Using a path parameter
With this approach, the environment should be a path parameter at the very beginning of the resource path, so the URL would look like this: /{environment}/employee/{emp_id}. Such an approach is convenient if you have several applications deployed under different root paths. For example:
/test: application packaged with the configuration for the test environment
/prod: application packaged with the configuration for the production environment
In this case, applications for each environment are isolated.
Using a custom header
You could also use a custom header to specify which environment to route to. GitHub uses something like that to select the version of the API to use; see this link: https://developer.github.com/v3/#current-version. It's not exactly the same thing, but you could have something like this:
GET /employee/{emp_id}
x-env: test
A reverse proxy could handle this header and route the request to the right environment.
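For illustration, here is a minimal JAX-RS sketch of a resource reading that header (the class, the data-source fields and the default-to-production choice are all assumptions, not part of the original answer):

    import javax.sql.DataSource;
    import javax.ws.rs.GET;
    import javax.ws.rs.HeaderParam;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.core.Response;

    @Path("/employee")
    public class EmployeeResource {
        private DataSource testDataSource; // wiring omitted; assumed injected
        private DataSource prodDataSource;

        @GET
        @Path("/{emp_id}")
        public Response getEmployee(@PathParam("emp_id") String id,
                                    @HeaderParam("x-env") String env) {
            // Route to the environment named in the header; default to production
            DataSource ds = "test".equals(env) ? testDataSource : prodDataSource;
            // ... look up the employee via ds and build the representation ...
            return Response.ok().build();
        }
    }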
I'm not convinced by the payload approach, since a field environment isn't actually part of the representation of the employee resource. The query parameter approach is similar, since such parameters apply to the request on the resource.
Hope it helps you.

Scraping WebObjects website & REST

I need to programmatically interact with a WebObjects website and extract data from the responses. The particular WebObjects site I am scraping uses component actions and stores sessions in cookies (not URLs). This means that all URLs look something like this:
http://example.com/WOApp/WebObjects/WOApp.woa/wo/7.0.0.0.29.1.1.1
My first questions are:
Don't URLs like this completely destroy local and shared caching opportunities (the cacheable constraint in REST)? I imagine the only effective caching with such URLs happens on the WebObjects server itself.
Isn't addressability broken as well? Each resource does have a unique endpoint, but it changes constantly. Furthermore, I think WebObjects also invalidates URLs that are too old, since they "time out" after a period of time. I'm not sure whether this applies only to URLs with sessions, though.
Regarding the scraping, I am not sure whether it's possible to extract any meaningful endpoints from the website. For example, with a normal website I would look through the HTML and extract the POST URLs, then use them in my scraper by posting directly to them instead of going through the normal request-response cycle.
In this case I obviously cannot use any URLs extracted from the HTML since they are dynamically generated on each request, but I read something about being able to access WebObjects components directly if the security settings have not been set to disallow this (see https://developer.apple.com/legacy/library/documentation/LegacyTechnologies/WebObjects/WebObjects_3.5/PDF/WebObjectsDevGuide.pdf, p. 53 "Limitations on Direct requests"). I don't understand exactly how to do this though or if it's even possible.
If it's not possible, what would be a good approach then? The only options I can think of are:
Using a full-blown browser client to interact with the website (e.g. Watir or Selenium) and extracting & processing the HTML from its responses
Manually extracting the dynamic endpoints by first requesting the page they appear on, then finding the place in the HTML where they're located, and then using them as if they were "static".
I am interested in opinions on how to approach this scenario since I don't believe any of the solutions above are particularly good.
You've asked a number of questions, and I'll see if I can cover each in turn.
Don't URLs like this completely destroy local and shared caching opportunities (the cacheable constraint in REST)? I imagine the only effective caching with such URLs happens on the WebObjects server itself.
There is, indeed, a page cache within the WebObjects application server, and you're right to observe that these component action URLs probably thwart any other kind of caching. Additionally, even though the session ID is not present in the URL, you'd need the session ID in the cookie to re-create the same page, so having just that URL would get you a session restoration error from the application server.
Isn't addressability broken as well? Each resource does have a unique endpoint, but it changes constantly.
Well, yes, on the face of it this is true. You've given a component action URL as an example, and they're tied to the session.
Furthermore, I think WebObjects also invalidates URLs that are too old, since they "time out" after a period of time. I'm not sure whether this applies only to URLs with sessions, though.
Again, all true. Component action URLs generate sessions, and sessions time out.
At this point, let me take a quick diversion. I'm assuming you're not the owner of the WebObjects application—you're talking about having to scrape a WebObjects app, and you've identified some ways in which this particular app doesn't conform to REST principles. You're completely right—a fully component-action-based WebObjects application won't be RESTful. WebObjects pre-dates REST by a few years. Having said that, there are ways in which a WebObjects application can be completely RESTful:
Using session-less direct actions gives a degree of REST-like behaviour, and would certainly solve the problems you identify with caching, addressability and expiry.
Using the ERRest framework to create a 100% RESTful application.
Of course, none of this will help you if you're just trying to scrape a legacy application.
Regarding the scraping, I am not sure whether it's possible to extract any meaningful endpoints from the website. For example, with a normal website I would look through the HTML and extract the POST URLs, then use them in my scraper by posting directly to them instead of going through the normal request-response cycle.
Again, if it's a fully component-action-based application, you're right: all those URLs will be dynamically generated and useless to you.
In this case I obviously cannot use any URLs extracted from the HTML since they are dynamically generated on each request, but I read something about being able to access WebObjects components directly if the security settings have not been set to disallow this…
That's talking about getting a component to render directly from its template, with some restrictions:
As you note, the application can easily prevent it from happening at all.
As mentioned on p.53, the user input and action-invocation phases of rendering the component are skipped, which probably means this approach would be limited to rendering a component that didn't have any dynamic content anyway. This might be of some very limited use to you, though you'd need to know the component names you were interested in, and they wouldn't normally be exposed anywhere.
I'm not sure you're going to find anything better than the types of high-level functional approaches you've already suggested above, such as automating at the browser level with Selenium. If what you need is REST-style direct addressability of resources within the application, you're not going to get that unless you can re-write the application to use direct actions or ERRest where you need them.
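If you do go the browser-automation route, a minimal Selenium sketch in Java could look like this (the URL is the example from the question; the element name is hypothetical):

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.firefox.FirefoxDriver;

    // Drive a real browser through the component-action flow and harvest the HTML
    WebDriver driver = new FirefoxDriver();
    driver.get("http://example.com/WOApp/WebObjects/WOApp.woa");
    driver.findElement(By.name("search")).click(); // hypothetical element name
    String html = driver.getPageSource(); // hand this to your extraction code
    driver.quit();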
A little late, but this could help.
I use the Apache's mod_ext_filter (little modified) to pre/post filter the requests/responses from our WebObjects application. The filter calls PHP scripts and can read the dynamical hyperrefs and other things from the HTML pages. The scripts can also modify the HTTP requests, so we can programatically add/remove parameters from the request to implement new workflows in front of the legacy app and cleanup the requests before they will reach WebObjects. It is also possible to handle an additional database within the scripts and store some things over multiple requests.
So you can get the dynamically created links (maybe a button's name or HTML form destination) and can recognize these names within the request.
It is also possible to "remote control" such applications with little scripts like "click on the third button on the page". The only thing you need is a DOM parser to get the structure of the HTML pages and then rebuild the actions which the browser would do (i.e. create the HTTP request manually and send it as POST to the extracted form destination href). The only problem is the Javascript code, which we analyze and reprogram within PHP (i.e. enable/disable input elements, so they will not be transmitted within the requests)
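As a sketch of the extraction step in Java rather than PHP (using the jsoup HTML parser; the URL is the example from the question and the selector is an assumption):

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    // Fetch the page and pull out the dynamically generated form target
    Document doc = Jsoup.connect("http://example.com/WOApp/WebObjects/WOApp.woa").get();
    // The action attribute holds the component-action URL minted for this request
    String action = doc.select("form").first().attr("action");
    // Replay the form as a manual POST to that action, carrying the session cookie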
There were some problems with the WebObjects adapter module for Apache. It still sets Content-Length in the HTTP header, which you cannot change in mod_ext_filter; if you change the HTML or the parameters within the request, the length of the content will no longer match. But it is possible to change that.
Theoretically it could also be possible to control such a closed-source legacy application from a new UI on a tablet or smartphone, which delegates the user interaction to the backend WebObjects app.
The scripts depend on the page structure, so if your WebObjects app changes, you have to correct some things in the scripts (e.g. the third button could now be the fourth button).
It should also be possible to put a RESTful interface in front of the application and query the data from the legacy app via the filter scripts.

How to tackle changing properties on the client side in GWT applications?

While developing GWT apps we ran into lots of problems with project configuration. Let me explain... As usual in development, we have a few environments for our application: local, demo, preview and live. Of course they run on different machines, and some use SSL while others don't. But most importantly, all of them have different URLs.
Now, in a few places in our application we need to specify some URLs. Usually we would use *.properties files stored on the server, and tools like Spring taglibs and its <spring:message /> tag. But since GWT does not have such tools, we ended up leaving hard-coded URLs and performing code replacement on different SVN branches. As you can imagine, this is the worst possible scenario, and it caused us many problems.
So, my question is:
how can one build a proper, flexible mechanism for storing config properties shared by both the client and server side of a GWT application? These properties have to be available to server-side handlers, the client app (compiled JavaScript), UiBinder, and other code running on the server (workers, Spring, etc.).
The preferred way would be to avoid a gwtc build when we change the value of some property, but I guess that will be hard to achieve, so I will accept any reasonable alternative.
How about using relative URI references (e.g. absolute paths, without scheme or authority; i.e. /path/to/foo instead of http://example.com/path/to/foo)?
And in the few places where you absolutely need a URI (with scheme and authority), use another property to store the "prefix" (e.g. http://example.com), and concatenate it with the above path.
Those places where you need a full URI should all be on the server, which means you don't have to recompile your GWT project when you change the "prefix", so everything is only runtime configuration and you can deploy the same artifacts in all environments.
That being said, if you ever need something configurable at runtime in GWT, then use a dynamic host page and JSNI (or a com.google.gwt.i18n.client.Dictionary); see http://code.google.com/webtoolkit/articles/dynamic_host_page.html
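For example, here is a minimal sketch of the Dictionary approach (the appConfig name and its key are made up for illustration). The host page declares a plain JavaScript object, which the server can fill in per environment:

    <script>
      // Emitted into the host page by the server, per environment
      var appConfig = { "baseUrl": "https://demo.example.com" };
    </script>

and the compiled GWT code reads it at runtime, so no gwtc rebuild is needed when the value changes:

    import com.google.gwt.i18n.client.Dictionary;

    // Read the per-environment value injected via the host page
    Dictionary config = Dictionary.getDictionary("appConfig");
    String baseUrl = config.get("baseUrl");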

What is the benefit of global resource URIs (i.e. addressability)?

What is the benefit of referencing resources using globally-unique URIs (as REST does) versus using a proprietary id format?
For example:
http://host.com/student/5
http://host.com/student?id=5
In the first approach the entire URL is the ID. In the second approach only the 5 is the ID. What is the practical benefit of the first approach over the second?
Why does REST (seem to) go out of its way to advocate the first approach?
-- EDIT:
My question was confusing because it really asked two separate questions:
What is the benefit of addressability?
What is the difference between the two URI forms seen above?
I've answered both questions below using my own post.
The main thing when I see URIs like that is that a normal user would be able to remember the URI.
We geeks are fine with question marks and GET variables, but if someone remembers http://www.host.com/users/john instead of http://www.host.com/?view=users&name=john, that is a huge benefit.
I will answer my own question:
1) Why are URIs important?
I'll quote from RESTful Web Services by Leonard Richardson and Sam Ruby (ISBN: 978-0-596-52926-0):
Consider a real URI that names a resource in the genre “directory of resources about jellyfish”: http://www.google.com/search?q=jellyfish. That jellyfish search is just as much a real URI as http://www.google.com. If HTTP wasn’t addressable, or if the Google search engine wasn’t an addressable web application, I wouldn’t be able to publish that URI in a book. I’d have to tell you: “Open a web connection to google.com, type ‘jellyfish’ in the search box, and click the ‘Google Search’ button.”
This isn’t an academic worry. Until the mid-1990s, when ftp:// URIs became popular for describing files on FTP sites, people had to write things like: “Start an anonymous FTP session on ftp.example.com. Then change to directory pub/files/ and download file file.txt.” URIs made FTP as addressable as HTTP. Now people just write: “Download ftp://ftp.example.com/pub/files/file.txt.” The steps are the same, but now they can be carried out by machine.
[...]
Addressability is one of the best things about web applications. It makes it easy for clients to use web sites in ways the original designers never imagined.
2) What is the benefit of addressability?
It is far easier to follow server-provided URIs than construct them yourself. This is especially true as resource relationships become too complex to be expressed in simple rules. It's easier to code the logic once in the server than re-implement it in numerous clients.
The relationship between resources may change even though the individual resource URIs remain unchanged. For example, if Google Maps were to change the scale of their map tiles, clients that calculate relative tile positions would break.
3) What is the benefit of URIs over custom IDs?
Custom IDs identify a resource uniquely. URIs go a step further by telling you where to find it. This simplifies the client logic.
Search engine optimization mostly.
It also makes them easier to remember, and cleaner, more professional looking in my opinion.
The first is more aesthetically pleasing.
Technically there is no difference, but use the former when you can.
As Ólafur mentioned, the clarity of the former URL is one benefit.
Another is implementation flexibility.
Let's say that student 5 changes infrequently. If you use the REST-style URL, you have the option of serving a static file instead of running code. In Rails it is common for the first request to students/5 to create a cached HTML file under your web root; that file is then used to serve subsequent requests without touching the backend. Naturally, there's nothing Rails-specific about that approach.
The latter URL wouldn't allow for this: you can't have URL variables (?, =) in the names of static files.
Both URIs are valid from a REST perspective; however, just realize that web caches treat query string parameters very differently.
If you want to use caching to your advantage then I suggest that you do not use a query string parameter to identify your resource.
I think it comes down to how closely you want to adhere to the principles of feng shui.