What is the benefit of global resource URIs (i.e. addressability)?

What is the benefit of referencing resources using globally-unique URIs (as REST does) versus using a proprietary id format?
For example:
http://host.com/student/5
http://host.com/student?id=5
In the first approach the entire URL is the ID. In the second approach only the 5 is the ID. What is the practical benefit of the first approach over the second?
Why does REST (seem to) go out of its way to advocate the first approach?
-- EDIT:
My question was confusing because it really asked two separate questions:
What is the benefit of addressability?
What is the difference between the two URI forms seen above?
I've answered both questions below using my own post.

The main thing I notice about URIs like that is that a normal user would be able to remember them.
We geeks are fine with question marks and GET variables, but if someone can remember http://www.host.com/users/john instead of http://www.host.com/?view=users&name=john, that is a huge benefit.

I will answer my own question:
1) Why are URIs important?
I'll quote from RESTful Web Services by Leonard Richardson and Sam Ruby (ISBN: 978-0-596-52926-0):
Consider a real URI that names a resource in the genre “directory of resources about
jellyfish”: http://www.google.com/search?q=jellyfish. That jellyfish search is just as
much a real URI as http://www.google.com. If HTTP wasn’t addressable, or if the Google
search engine wasn’t an addressable web application, I wouldn’t be able to publish that
URI in a book. I’d have to tell you: “Open a web connection to google.com, type ‘jellyfish’
in the search box, and click the ‘Google Search’ button.”
This isn’t an academic worry. Until the mid-1990s, when ftp:// URIs
became popular for describing files on FTP sites, people had to write
things like: “Start an anonymous FTP session on ftp.example.com. Then
change to directory pub/files/ and download file file.txt.” URIs made
FTP as addressable as HTTP. Now people just write: “Download
ftp://ftp.example.com/pub/files/file.txt.” The steps are the same, but now they
can be carried out by machine.
[...]
Addressability is one of the best things about web applications. It makes it easy for
clients to use web sites in ways the original designers never imagined.
2) What is the benefit of addressability?
It is far easier to follow server-provided URIs than construct them yourself. This is especially true as resource relationships become too complex to be expressed in simple rules. It's easier to code the logic once in the server than re-implement it in numerous clients.
The relationship between resources may change even though the individual resource URIs remain unchanged. For example, if Google Maps were to change the scale of their map tiles, clients that calculate relative tile positions would break.
3) What is the benefit of URIs over custom IDs?
Custom IDs identify a resource uniquely. URIs go a step further by telling you where to find it. This simplifies the client logic.
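To make point 3 concrete, here is a minimal sketch. The two representations and the `student_ids` / `students` field names are my own assumptions for illustration; only the http://host.com/student/5 URL comes from the question.

```python
from urllib.parse import urljoin

# Hypothetical representations of the same course, as a server might return it.
# Variant A exposes only custom IDs; variant B exposes full URIs.
course_with_ids = {"name": "Algebra", "student_ids": [5, 7]}
course_with_uris = {
    "name": "Algebra",
    "students": ["http://host.com/student/5", "http://host.com/student/7"],
}


def student_urls_from_ids(course, base="http://host.com/"):
    """The client must know the server's URL scheme to locate each student."""
    return [urljoin(base, f"student/{sid}") for sid in course["student_ids"]]


def student_urls_from_uris(course):
    """The client simply follows what the server sent; no URL rule is needed."""
    return list(course["students"])
```

Both functions end up with the same list today, but only the first one breaks if the server ever moves students to a different path.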

Search engine optimization mostly.
It also makes them easier to remember, and cleaner, more professional looking in my opinion.

The first is more aesthetically pleasing.
Technically there is no difference, but use the former when you can.

As Ólafur mentioned, the clarity of the former URL is one benefit.
Another is implementation flexibility.
Let's say that student 5 changes infrequently. If you use the REST-style URL you have the option of serving a static file instead of running code. In Rails it is common for the first request to students/5 to create a cached HTML file under your web root. That file is then used to serve subsequent requests without touching the backend. Naturally, there's nothing Rails-specific about that approach.
The latter URL wouldn't allow for this: you can't have URL variables (?, =) in the names of static pages.
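As a rough illustration of that idea (a simplified model, not Rails' actual page-caching code), the mapping from a path-style URL to a cached file could look like this. The function name and the `/var/www/public` web root are assumptions, and the sketch does no sanitising of `..` segments.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def page_cache_path(url, web_root="/var/www/public"):
    """Return the static file that could serve `url`, or None when the URL
    carries a query string and therefore has to go through application code."""
    parts = urlsplit(url)
    if parts.query:
        return None
    relative = parts.path.strip("/") or "index"
    return str(PurePosixPath(web_root) / f"{relative}.html")
```

With this policy, `page_cache_path("http://host.com/student/5")` names a file under the web root, while `page_cache_path("http://host.com/student?id=5")` falls through to the backend.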

Both URIs are valid from a REST perspective; just be aware that web caches treat query-string parameters very differently.
If you want to use caching to your advantage then I suggest that you do not use a query string parameter to identify your resource.

I think it comes down to how closely you want to adhere to the principles of feng shui.

Related

Complex domain URL structure with RESTful

I'm building a RESTful wrapper around quite a complex underlying domain. I built a domain model using UML packages and classes and derived REST resources from the meaningful classes.
When it came to the endpoint URL design, I decided to map the package/class structure directly to URLs in order to streamline the process and have neat traceability between my logical model (domain) and physical implementation (REST API).
Let's say I have a following domain extract:
Admin and Work are packages, while User, Permission and Task are classes (REST Resources).
I derived the following endpoint URLs from this domain:
mydomain/admin/user -> Users collection
mydomain/admin/user/id -> User instance with id
mydomain/admin/user/id/permissions -> All permissions of the User with id
mydomain/work/task, and so on...
A colleague of mine noticed that the URLs are not optimal, mainly because these "package" parts do not map to concrete resources. For example, "admin" is not a resource, yet it is part of the URL. As the domain structure grows, there will be even more of these intermediary non-resource segments in the URL, and he finds that wrong.
I find it nice, as URL structure itself tells something about the resource and is backed up with a complete, well documented domain model.
Is this a violation of RESTful standard?
Is this a violation of RESTful standard?
No. REST has no opinion on how a URL should look. You could use a URL like
/foo/bar/baz?qux=12
without violating any REST principle. Such a URL has no meaning for a human reader, but that doesn't matter.
It is not necessary that every part of a URL like
/foo
/foo/bar
maps to a resource. In fact it is a common misconception that RESTful URLs must follow some pattern or construction rule. That is not the case.
Of course there are best practices in common use. One such practice would be to have collection resources like
/mydomain/admin/user
and single resources like
/mydomain/admin/user/42
But, again, that is not required by REST.
The REST architecture doesn't dictate what your URLs should look like. So, from that perspective, you're not violating any rules.
Instead, an important aspect of REST is that one should be able to use hyperlinks in order to navigate from one URL to another (something we are all used to when browsing HTML websites, but is not as common in REST APIs). Using links, consumers of your web application (whether they are humans using a web browser, or other applications that are using your API) can discover the available URLs, and the actual structure of your URLs doesn't really matter. Your URLs can even change without breaking other applications, because they will simply follow the link to the new URL.
So from a REST perspective, you could use whatever URL structure you like. As long as you provide a fixed entry point that provides a link to your users collection, the actual URL could be anything.
However, it is obviously still helpful when URLs are easy to understand and somewhat predictable. For instance, right now you have collections with a singular URL (yourdomain/admin/user) and collections with a plural URL (yourdomain/admin/user/3/permissions), which isn't very consistent. I'd suggest using only plural names, so yourdomain/admin/user becomes yourdomain/admin/users.
As for your actual question, as I mentioned this doesn't matter from a REST perspective. More important is that the URL makes clear what it represents. Something I'd take into consideration is the number of different endpoints you're going to have. If you are building a small application with a small number of endpoints, I'd keep it simple. But if you are creating a very large application with a lot of domain models, prefixing them with some kind of category sounds like a good idea.
Conclusion
URLs in a REST API should be discoverable by hyperlinks, and therefore no hard rules exist. Just make sure your URLs make sense to anybody who has to dig into them.
Some tips for useful URLs can be found in REST-ful URI design.
You aren't following the strict definition of being RESTful; but it's not due to your colleague's concerns.
Your main challenge is that your proposal has the endpoints baked in to the client applications. A client looking for a user goes straight to /mydomain/admin/user.
But if (and when) you re-locate that endpoint, say /mydomain/admin/department/user, your clients are now broken.
The solution:
REST defines that you have one (or very few) endpoints, termed "Cool URIs", which are fixed enough that they never change.
Instead of your client going to the endpoint of /mydomain/admin/user they will take this approach:
Retrieve a root object from /mydomain (the Cool URI). => "mydomain" service object.
"mydomain" object contains a URI which has the identifier of "admin". Follow that URI.
"admin" object contains a URI which has the identifier of "user". Follow that URI.
"user" object returned (more likely a list of users).
In this respect, there is no reliance upon the client application having to know the URI format; therefore when you change the endpoints, you just change the URIs that your service embeds in the returned REST objects.
This approach to REST, where each returned object contains the URI to the next, is part of the HATEOAS principle.
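The steps above can be sketched roughly as follows. The dictionary stands in for the server, and the `links` field name and every URI in it are hypothetical; a real client would issue HTTP requests where `get` does a lookup.

```python
# In-memory stand-in for the server: URI -> representation with named links.
# The "user" link already points at the relocated endpoint.
FAKE_SERVER = {
    "/mydomain": {"links": {"admin": "/mydomain/admin"}},
    "/mydomain/admin": {"links": {"user": "/mydomain/admin/department/user"}},
    "/mydomain/admin/department/user": {"items": ["alice", "bob"], "links": {}},
}


def get(uri):
    """Pretend HTTP GET; a real client would send a request here."""
    return FAKE_SERVER[uri]


def follow(entry_point, *rels):
    """Start at the fixed entry point and follow one link per relation name."""
    resource = get(entry_point)
    for rel in rels:
        resource = get(resource["links"][rel])
    return resource


users = follow("/mydomain", "admin", "user")
```

The client code only mentions `/mydomain`, `admin` and `user`, so moving the user collection is purely a server-side change.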
If you're worried about how a client will pull back a given ID number (e.g. user 42), then that's catered for too: you can implement something like OData. So again, the URI is provided by the parent object. Instead of the client using a pre-baked endpoint like this:
/mydomain/admin/users/42
...it instead does this:
/mydomain (Cool URI)
Follow the link for "admin".
Follow the link for "users", appending ?$filter=UserId eq 42
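A small sketch of that last step, assuming the client already holds the link it discovered for "users". The helper name `with_filter` is made up; the `$filter` option and the `UserId eq 42` expression are the ones from the list above.

```python
from urllib.parse import quote, urlencode


def with_filter(link, expression):
    """Append an OData-style $filter query option to a server-provided link."""
    # Keep "$" literal and percent-encode spaces as %20 rather than "+".
    query = urlencode({"$filter": expression}, quote_via=quote, safe="$")
    separator = "&" if "?" in link else "?"
    return f"{link}{separator}{query}"
```

The link itself is treated as opaque; the client only appends the query option to whatever URI the server handed out.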
There are also some really good concepts around solving object versioning issues, which again are difficult if you take the approach of hard-coding the endpoints upfront.
If it's any consolation, I started off defining my REST architecture around fixed endpoints (as you are), and then discovered that using a true RESTful approach is actually simpler in the long run.
Best of luck!

How do you avoid fixed resource

Roy Fielding writes
A REST API must not define fixed
resource names or hierarchies (an
obvious coupling of client and
server). Servers must have the freedom
to control their own namespace.
Instead, allow servers to instruct
clients on how to construct
appropriate URIs, such as is done in
HTML forms and URI templates, by
defining those instructions within
media types and link relations.
How do you do this for system-to-system interfaces? Say the client wants to create an order in the server at http://my.server.org. How is it supposed to learn that for creating an order it is supposed to use the URL http://my.server.org/newOrder and not http://my.server.org/nO or anything else?
For a human interface (i.e. a browser), I guess the server would provide some form of link (possibly in a form element), and the text around and in that link would tell a user which of the forms on that page is the correct one for creating an order (as opposed to creating a user or navigating to some search result).
What are the mechanics used for implementing this on the client side? And also: are they actually used or does the majority of people just hardwire the urls into the client?
How do you do this for
system-to-system interfaces? Say the
client wants to create an order in the
server at http://my.server.org How is
it supposed to learn that for creating
an order it is supposed to use the url
http://my.server.org/newOrder and not
http://my.server.org/nO or anything
else?
It doesn't learn. Machine clients, generally, can't "learn". Not yet at least, we're still pre-Skynet. You have to "teach" them.
But the key is that you don't teach them URLs. You teach them relations.
Consider, in HTML...
<a rel="order" href="http://my.server.org/newOrder"/>
and
<a rel="order" href="http://my.server.org/nO"/>
You'll notice that the rel is the same, "order", but the URL is not.
In a "perfect" world, your system will have a single entry point, say, http://my.server.org/ and from there the client can find all of the rels that it needs to know about.
In practice, many systems have several "well known" and defined entry points from which the client can start, just as an expediency so the client does not always have to start at the root of the system. These well-known entry points carry an implied commitment from the provider that these URLs won't be changing any time soon: they're long-lived, and the server will support them very well.
But once past the entry point, any URL you find likely does not have such a promise behind it. The URL can be a one-use-only URL. It could be directed to different machines for, say, load balancing. Who knows. But as a consumer of the service, you really don't care what the URL is, you only care about the relation. The relation tells you the detail of the URL to use.
The documentation of your hypermedia API explains how to apply the uniform interface to each of the rels that your client will encounter. The client can't "intuit" that either, it has to be taught.
Basically, teaching the client how to navigate the relations that it will or MAY find in the payloads it processes is how the client comes to manipulate the hypermedia API. The payloads contain signposts to show the way, but the server dictates where those signposts point.
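Here is a minimal sketch of a client that has been taught the "order" relation rather than a URL. It uses Python's standard `html.parser`; the class and function names are my own, and the two pages reuse the URLs from the question.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect href values of <a> elements, keyed by their rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if "rel" in attributes and "href" in attributes:
            self.links[attributes["rel"]] = attributes["href"]


def find_link(html, rel):
    """Return the URL the server currently advertises for `rel`, or None."""
    parser = RelLinkParser()
    parser.feed(html)
    return parser.links.get(rel)


old_page = '<a rel="order" href="http://my.server.org/newOrder">New order</a>'
new_page = '<a rel="order" href="http://my.server.org/nO">New order</a>'
```

`find_link(page, "order")` works against either page, because the only thing the client hard-codes is the relation name.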
As for how often it is used, in the machine to machine world, likely not very much. Most systems aren't large enough where the URLs change enough to matter, and the clients are so few that changing the clients is not a significant burden. So most just hard code away.
But then, in the end, you just have bad clients. Nothing a REST system can do with a bad client. It can't tell them apart at runtime anyway.
No matter how you publish an API (to be consumed by machines), it is possible to make breaking changes.
When wrapping your API behind a UI (such as HTML forms), you have the freedom to change the URI without breaking the user, but that is because the user is consuming an abstraction you provided. Change the URL schema without changing your form, and you'll still break the client.
A couple ways to avoid breaking machine clients (basically, supporting backward-compatibility):
Build in some sort of URL versioning
Do redirection from old URL schemas to your new schema
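The redirection option could be sketched like this. The `MOVED` table and the `/v2/orders` target are invented for illustration; a real server would wire this into its routing layer.

```python
from http import HTTPStatus

# Hypothetical mapping from retired paths to their current locations.
MOVED = {"/newOrder": "/v2/orders"}


def respond(path):
    """Return (status, headers) for a request path."""
    if path in MOVED:
        # 308 preserves the request method, so a POST stays a POST.
        return HTTPStatus.PERMANENT_REDIRECT, {"Location": MOVED[path]}
    return HTTPStatus.OK, {}
```

Whether old clients survive this depends on them actually following redirects, which is worth checking for the HTTP library they use.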
We've quite successfully approached it the following way: expose a WADL file at the very root URL of the application describing the media types as well as where to find links in them and their semantics. I know WADL is viewed critically by some in the REST community, but the only thing that ever put me off was its strong URL focus. Beyond all the religious debates, we liked having a well-defined way of documenting representations. There is a way to get around the URL focus of WADL: point out where links can be found in the representation and document that instead. See that blog post (currently down because of maintenance so you might want to look at it in the Google cache) for details on the approach.
This results in only a single URL needing to be known by the client, as it can find out everything else by accessing the WADL, and from then on just learn about the representation and where to find links, which HTTP method needs which parameters when invoked, and so on.

What is the advantage of using REST instead of non-REST HTTP?

Apparently, REST is just a set of conventions about how to use HTTP. I wonder which advantage these conventions provide. Does anyone know?
I don't think you will get a good answer to this, partly because nobody really agrees on what REST is. The Wikipedia page is heavy on buzzwords and light on explanation. The discussion page is worth a skim just to see how much people disagree on this. As far as I can tell, however, REST means this:
Instead of having randomly named setter and getter URLs and using GET for all the getters and POST for all the setters, we try to have the URLs identify resources, and then use the HTTP actions GET, POST, PUT and DELETE to do stuff to them. So instead of
GET /get_article?id=1
POST /delete_article id=1
You would do
GET /articles/1/
DELETE /articles/1/
And then POST and PUT correspond to "create" and "update" operations (but nobody agrees which way round).
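As a rough sketch of the second style, a handler might dispatch on the HTTP method while the path only names the resource. The in-memory `ARTICLES` store and the `handle` function are assumptions for illustration, not any framework's API.

```python
import re

ARTICLES = {1: "Hello"}  # hypothetical in-memory store


def handle(method, path):
    """Dispatch on the HTTP method; the path only identifies the article."""
    match = re.fullmatch(r"/articles/(\d+)/", path)
    if not match:
        return 404, None
    article_id = int(match.group(1))
    if article_id not in ARTICLES:
        return 404, None
    if method == "GET":
        return 200, ARTICLES[article_id]
    if method == "DELETE":
        del ARTICLES[article_id]
        return 204, None
    # The resource exists but this sketch does not support the method.
    return 405, None
```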
I think the caching arguments are wrong, because query strings are generally cached, and besides you don't really need to use them. For example, Django makes something like this very easy, and I wouldn't say it was REST:
GET /get_article/1/
POST /delete_article/ id=1
Or even just include the verb in the URL:
GET /read/article/1/
POST /delete/article/1/
POST /update/article/1/
POST /create/article/
In that case GET means something without side-effects, and POST means something that changes data on the server. I think this is perhaps a bit clearer and easier, especially as you can avoid the whole PUT-vs-POST thing. Plus you can add more verbs if you want to, so you aren't artificially bound to what HTTP offers. For example:
POST /hide/article/1/
POST /show/article/1/
(Or whatever, it's hard to think of examples until they happen!)
So in conclusion, there are only two advantages I can see:
Your web API may be cleaner and easier to understand / discover.
When synchronising data with a website, it is probably easier to use REST because you can just say synchronize("/articles/1/") or whatever. This depends heavily on your code.
However I think there are some pretty big disadvantages:
Not all actions easily map to CRUD (create, read/retrieve, update, delete). You may not even be dealing with object type resources.
It's extra effort for dubious benefits.
Confusion as to which way round PUT and POST are. In English they mean similar things ("I'm going to put/post a notice on the wall.").
So in conclusion I would say: unless you really want to go to the extra effort, or if your service maps really well to CRUD operations, save REST for the second version of your API.
I just came across another problem with REST: It's not easy to do more than one thing in one request or specify which parts of a compound object you want to get. This is especially important on mobile where round-trip-time can be significant and connections are unreliable. For example, suppose you are getting posts on a facebook timeline. The "pure" REST way would be something like
GET /timeline_posts // Returns a list of post IDs.
GET /timeline_posts/1/ // Returns a list of message IDs in the post.
GET /timeline_posts/2/
GET /timeline_posts/3/
GET /message/10/
GET /message/11/
....
Which is kind of ridiculous. Facebook's API is pretty great IMO, so let's see what they do:
By default, most object properties are returned when you make a query.
You can choose the fields (or connections) you want returned with the
"fields" query parameter. For example, this URL will only return the
id, name, and picture of Ben:
https://graph.facebook.com/bgolub?fields=id,name,picture
I have no idea how you'd do something like that with REST, and if you did whether it would still count as REST. I would certainly ignore anyone who tries to tell you that you shouldn't do that though (especially if the reason is "because it isn't REST")!
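For what it's worth, the server side of such a `fields` parameter is not much code. This is only a sketch of the idea, not how Facebook implements it; the `PROFILE` data and the `select_fields` name are made up.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical stored resource; field names mirror the quoted example.
PROFILE = {"id": "1", "name": "Ben", "picture": "ben.jpg", "locale": "en_US"}


def select_fields(resource, url):
    """Return only the fields named in the URL's `fields` parameter, if any."""
    params = parse_qs(urlsplit(url).query)
    if "fields" not in params:
        return dict(resource)
    wanted = params["fields"][0].split(",")
    return {key: resource[key] for key in wanted if key in resource}
```

The URL still identifies one resource; the query parameter just narrows the representation that comes back.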
Simply put, REST means using HTTP the way it's meant to be.
Have a look at Roy Fielding's dissertation about REST. I think that every person that is doing web development should read it.
As a note, Roy Fielding is one of the key drivers behind the HTTP protocol, as well.
To name some of the advantages:
Simple.
You can make good use of HTTP cache and proxy server to help you handle high load.
It helps you organize even a very complex application into simple resources.
It makes it easy for new clients to use your application, even if you haven't designed it specifically for them (probably, because they weren't around when you created your app).
Simply put: NONE.
Feel free to downvote, but I still think there are no real benefits over non-REST HTTP. All current answers are invalid. Arguments from the currently most voted answer:
Simple.
You can make good use of HTTP cache and proxy server to help you handle high load.
It helps you organize even a very complex application into simple resources.
It makes it easy for new clients to use your application, even if you haven't designed it specifically for them (probably, because they weren't around when you created your app).
1. Simple
With REST you need an additional communication layer for your server-side and client-side scripts => it's actually more complicated than using non-REST HTTP.
2. Caching
Caching can be controlled by HTTP headers sent by server. REST does not add any features missing in non-REST.
3. Organization
REST does not help you organize things. It forces you to use API supported by server-side library you are using. You can organize your application the same way (or better) when you are using non-REST approach. E.g. see Model-View-Controller or MVC routing.
4. Easy to use/implement
Not true at all. It all depends on how well you organize and document your application. REST will not magically make your application better.
IMHO the biggest advantage that REST enables is that of reducing client/server coupling. It is much easier to evolve a REST interface over time without breaking existing clients.
Discoverability
Each resource has references to other resources, either in hierarchy or links, so it's easy to browse around. This is an advantage to the human developing the client, saving them from constantly consulting the docs, and offering suggestions. It also means the server can change resource names unilaterally (as long as the client software doesn't hardcode the URLs).
Compatibility with other tools
You can CURL your way into any part of the API or use the web browser to navigate resources. Makes debugging and testing integration much easier.
Standardized Verb Names
Allows you to specify actions without having to hunt the correct wording. Imagine if OOP getters and setters weren't standardized, and some people used retrieve and define instead. You would have to memorize the correct verb for each individual access point. Knowing there's only a handful of verbs available counters that problem.
Standardized Status
If you GET a resource that doesn't exist, you can be sure to get a 404 error in a RESTful API. Contrast it with a non-RESTful API, which may return {error: "Not found"} wrapped in God knows how many layers. If you need the extra space to write a message to the developer on the other side, you can always use the body of the response.
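A tiny sketch of what that looks like on the server side, using the standard `http.HTTPStatus` constants. The `PRODUCTS` data and the handler name are hypothetical.

```python
from http import HTTPStatus

PRODUCTS = {1052: {"name": "Widget"}}  # hypothetical data


def get_product(product_id):
    """Return (status, body); the status itself signals a missing resource."""
    product = PRODUCTS.get(product_id)
    if product is None:
        # The body is free for a human-readable hint to the developer.
        return HTTPStatus.NOT_FOUND, {"message": f"No product {product_id}"}
    return HTTPStatus.OK, product
```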
Example
Imagine two APIs with the same functionality, one following REST and the other not. Now imagine the following clients for those APIs:
RESTful:
GET /products/1052/reviews
POST /products/1052/reviews "5 stars"
DELETE /products/1052/reviews/10
GET /products/1052/reviews/10
HTTP:
GET /reviews?product_id=1052
POST /post_review?product_id=1052 "5 stars"
POST /remove_review?product_id=1052&review_id=10
GET /reviews?product_id=1052&review=10
Now think of the following questions:
If the first call of each client worked, how sure can you be the rest will work too?
There was a major update to the API that may or may not have changed those access points. How much of the docs will you have to re-read?
Can you predict the return of the last query?
You have to edit the review posted (before deleting it). Can you do so without checking the docs?
I recommend taking a look at Ryan Tomayko's How I Explained REST to My Wife
Third party edit
Excerpt from the Wayback Machine link:
How about an example. You’re a teacher and want to manage students:
what classes they’re in,
what grades they’re getting,
emergency contacts,
information about the books you teach out of, etc.
If the systems are web-based, then there’s probably a URL for each of the nouns involved here: student, teacher, class, book, room, etc. ... If there were a machine readable representation for each URL, then it would be trivial to latch new tools onto the system because all of that information would be consumable in a standard way. ... you could build a country-wide system that was able to talk to each of the individual school systems to collect testing scores.
Each of the systems would get information from each other using a simple HTTP GET. If one system needs to add something to another system, it would use an HTTP POST. If a system wants to update something in another system, it uses an HTTP PUT. The only thing left to figure out is what the data should look like.
I would suggest everybody, who is looking for an answer to this question, go through this "slideshow".
I couldn't understand what REST is and why it is so cool, its pros and cons, differences from SOAP - but this slideshow was so brilliant and easy to understand, so it is much more clear to me now, than before.
Caching.
There are other more in depth benefits of REST which revolve around evolve-ability via loose coupling and hypertext, but caching mechanisms are the main reason you should care about RESTful HTTP.
It's written down in the Fielding dissertation. But if you don't want to read a lot:
increased scalability (due to stateless, cache and layered system constraints)
decoupled client and server (due to stateless and uniform interface constraints)
reusable clients (client can use general REST browsers and RDF semantics to decide which link to follow and how to display the results)
non breaking clients (clients break only by application specific semantics changes, because they use the semantics instead of some API specific knowledge)
Give every “resource” an ID
Link things together
Use standard methods
Resources with multiple representations
Communicate statelessly
Is it possible to do everything with just POST and GET? Yes. Is it the best approach? No. Why? Because we have standard methods. If you think about it, it would be possible to do everything using just GET, so why should we even bother to use POST? Because of the standards!
For example, thinking about an MVC model today, you can limit your application to respond only to specific verbs such as POST, GET, PUT and DELETE. Even if under the hood everything is emulated with POST and GET, doesn't it make sense to have different verbs for different actions?
Discovery is far easier in REST. We have WADL documents (similar to WSDL in traditional webservices) that will help you to advertise your service to the world. You can use UDDI discoveries as well. With traditional HTTP POST and GET people may not know your message request and response schemas to call you.
One advantage is that, we can non-sequentially process XML documents and unmarshal XML data from different sources like InputStream object, a URL, a DOM node...
@Timmmm, about your edit:
GET /timeline_posts // could return the N first posts, with links to fetch the next/previous N posts
This would dramatically reduce the number of calls
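A minimal sketch of such a paged collection, assuming `offset`/`limit` query parameters and a `links` field; all of these names are my own choices, nothing in REST mandates them.

```python
POSTS = [f"post-{n}" for n in range(1, 8)]  # hypothetical timeline data


def timeline_page(offset=0, limit=3):
    """Return one page of posts plus links to the neighbouring pages."""
    page = {"items": POSTS[offset:offset + limit], "links": {}}
    if offset + limit < len(POSTS):
        page["links"]["next"] = (
            f"/timeline_posts?offset={offset + limit}&limit={limit}"
        )
    if offset > 0:
        previous = max(offset - limit, 0)
        page["links"]["previous"] = (
            f"/timeline_posts?offset={previous}&limit={limit}"
        )
    return page
```

The client never computes offsets itself; it just follows `next` until the link is absent.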
And nothing prevents you from designing a server that accepts HTTP parameters to denote the field values your clients may want...
But this is a detail.
Much more important is the fact that you did not mention huge advantages of the REST architectural style (much better scalability, due to server statelessness; much better availability, due to server statelessness also; much better use of the standard services, such as caching for instance, when using a REST architectural style; much lower coupling between client and server, due to the use of a uniform interface; etc. etc.)
As for your remark
"Not all actions easily map to CRUD (create, read/retrieve, update,
delete)."
: an RDBMS uses a CRUD approach, too (SELECT/INSERT/DELETE/UPDATE), and there is always a way to represent and act upon a data model.
Regarding your sentence
"You may not even be dealing with object type resources"
: a RESTful design is, in essence, a simple design, but this does NOT mean that designing it is simple. Do you see the difference? You'll have to think a lot about the concepts your application will represent and handle, and what it must do, in order to represent all of this by means of resources. But if you do so, you will end up with a simpler and more efficient design.
Query-strings can be ignored by search engines.

What is the difference between category/category_id/item_id and category?category_id={}&item_id={} in REST?

I just began looking at REST and was wondering what the basic difference between the two representations was. The first one looks pretty nice to me and the second one has to pass some attribute values but the underlying logic seems to be boiling to almost the same thing (I could be mistaken though)
http://url/category/category_id/item_id
AND
http://url/category?category_id={12}&item_id={12334}
I think you are labouring under some fundamental misconceptions about what REST is about.
The URL used to access a resource really is a detail and actually should not matter to the client. URLs should really be "discovered" by clients anyway if they follow the HATEOAS principle that is one of the tenets of REST.
Essentially you are right though: either URL could represent the resource you are exposing in the end, but as I say, this really is a detail, and in many cases the URL at which you expose something comes down to preference. The point of HATEOAS is to allow you to change the URLs that are used to access resources at will without affecting clients that work against your existing services.
The following URLs might help you understand some of the properties that make services truly RESTful:
How to GET a cup of coffee
Describing RESTful Applications
[disclaimer: just because HATEOAS is a principle of REST does not make it easy to do. You will find most of the services on the web do not follow this principle strictly at all, as evidenced by their documentation, which is full of URL templates; that is not the way services should be documented in an ideal world. I'm struggling myself to find good examples of truly RESTful services and clients...]
It should be possible for agents to reason about the resource structure:
based on the URL, and
based on links returned by requests for resources.
The problem with the second representation is that it can be considered a set of unordered keys and values, with no real structure or hierarchy.
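A quick way to see the difference is to parse both forms with Python's `urllib.parse`. The helper names are mine; the URLs follow the ones in the question, with concrete values filled in for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def path_segments(url):
    """Ordered list of path segments: position carries meaning."""
    return [segment for segment in urlsplit(url).path.split("/") if segment]


def query_pairs(url):
    """Mapping of query keys to values: order carries no meaning."""
    return parse_qs(urlsplit(url).query)
```

Swapping the two query parameters yields the same mapping, whereas swapping two path segments yields a different list, so only the path form expresses "item within category".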
If you click the button from your tag restful-url you get a good link from this site explaining the difference between those two styles:
How to obtain REST resource with different finder "methods"?

REST: Way to map URLs to services/files?

In your humble opinion: what would be a best-practice approach to map REST URLs to services/files within one's architecture (let's assume the MVC pattern here)?
In addition to Darrel's answer:
Make use of the hierarchical nature of HTTP URIs; think of every path segment as drilling down into the overall space of managed items (e.g. orders, customers). If at some point you need to 'index' into a collection along multiple dimensions (e.g. query) then use query string parameters:
/service/products/cars/japanese-cars/toyota/corola/?priceMin=2000&priceMax=5000
Note that (as Darrel said) the structure should be opaque to the client. That means that the client needs to discover the parameters at run time (that is what forms or URI templates are for). Of course client and server need shared knowledge about the meaning of e.g. priceMin. That knowledge should be in some design-time specification, for example the specification of a link relation. Maybe look at http://www.opensearch.org for a detailed use case.
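On the server side, generating such URIs could be sketched like this: path segments carry the hierarchy and a query string carries the extra dimensions. The `build_uri` helper is hypothetical; clients would still receive the result as a link or form rather than assemble it themselves.

```python
from urllib.parse import quote, urlencode


def build_uri(segments, filters=None):
    """Join path segments for the hierarchy and add filters as a query string."""
    path = "/" + "/".join(quote(segment, safe="") for segment in segments)
    if filters:
        return f"{path}?{urlencode(filters)}"
    return path
```

For example, `build_uri(["service", "products", "cars"], {"priceMin": 2000})` keeps the drill-down in the path and the price range in the query.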
Also interesting is the host part of the URIs. If you might at some stage need to move parts of your services to another machine, design your URIs so that the relevant information is in the domain part. Then you can use simple DNS to route requests to different machines.
HTH,
Jan
The best way to map URLs to resources is dependent on what web framework you use to provide the REST service. Pick whatever url structure is easiest to manage with the tools you have.
The url structure should be completely opaque to the clients of your service so they should not care what they look like.
The most important thing in my opinion is that when you are looking at a URL it should be relatively easy to guess which controller on the server is going to respond to that URL. That will make development and debugging much easier.