In REST, how should a GET request to a findAll operation be handled when the Resources are paged? - rest

In a RESTful Service, Resources that cannot all be retrieved at once are paginated. For example:
GET /foo?page=1
The question is, how should I handle a getAll request such as:
GET /foo
Taking discoverability/HATEOAS into consideration, I see a few options:
return a 405 Method Not Allowed and include a Link header to the first page:
Link=<http://localhost:8080/rest/foo?page=0>; rel=”first“
return a 400 Bad Request and include the Link header (same as above)
return a 303 See Other to the first paginated page
return a 200 OK but actually return only the first page (and include the URI of the next page into the Link):
Link=<http://localhost:8080/rest/foo?page=1>; rel=”next“
note: I would rather not do this, having learned not to manage anything for the client by default, if they haven't explicitly asked for it.
These are of course only a few options. I'm leaning towards the first, but I'm not sure if there is a best practice on this that I am not aware of.
Any feedback is appreciated.
Thanks.

Lets start with the fact that REST is not a set-in-stone protocol like SOAP, it's simply a means of structuring a service, similar to how languages are described as being Object-Oriented.
That all being said, I'd recommend the handling this as follows.
Treat a RESTful call like a function declaration.
GET /foo
foo()
Some functions require parameters.
GET /foo?start=??&count=??
foo(start, count)
Some languages support default parameters, others don't; you get to decide for yourself how you want to handle parameters.
With default parameters, you could assume that the function was defined as
foo(start = 0, count = 10)
so that a call to GET /foo would actually be equivalent to GET /foo?start=0&count=10, whereas a call to GET /foo?start=100 would be equivalent to GET /foo?start=100&count=10.
If you don't want default parameters, you could force the user of the API to explicitly set start and count:
foo(start, count)
so that a call to GET /foo would return a 400 Bad Request status code, but a call to GET /foo?start=0&count=10 would return a 200 OK status code along with the content contained by the specified range.
In either case you'll have to decide how you'll handle errors, such as
GET /foo?start=-10&count=99999999
If parameters have maximums and minimums, you'll need to decide whether to normalize the parameters, or simply return errors. The previous example might return a 400 Bad Request status code, but it could also be constrained to turn into:
GET /foo?start=0&count=1000
In the end it's up to you to decide what makes the most sense in the context of your application.

From a RESTful point of view, I think it perfectly alright to handle both representations the same. Consider a software with several versions you want to download, the latest one being 3.8. So if you want to get the latest version, you could address it with both GET /software/version/latest.zip and GET /software/version/3.8.zip until there comes a newer version. So two different links point to the same resource.
I like to imagine pagination pretty much the same. On the first page there are always the latest articles. So if no page-parameter is provided, you could simply imply it's 1.
The approach with the rel attribute goes in a slightly different direction. It's a creation of Google to better handle the problem with duplicate content and is primarily considered to be used in order to distinguish between a "main" page and pagination-pages. Here's how to use it:
//first page:
<link rel="next" href="http://www.foo.com/foo?page=2" />
//second page:
<link rel="prev" href="http://www.foo.com/foo?page=1" />
<link rel="next" href="http://www.foo.com/foo?page=3" />
//third and last page:
<link rel="prev" href="http://www.foo.com/foo?page=2" />
So from a SEO point of view it's a good idea (and recommended by Google) to use those elements. They also go perfectly with the resource-orientated idea of REST and the hypermedia representation of the resources.
Choosing one of your suggestions, I think the 303 See Other is the right way to go. It was intended to be used for this kind of purposes and is a good way to canonicalize your resources. You can make them available through many URIs, but have one "real" URI for a representation (like the software with different versions).
According to the specification, the response should look something like this:
303 See Other
Location: http:www.foo.com/foo?page=1
http:www.foo.com/foo?page=1
So you provide a Location-header with the "real" representation, and the body should contain a hypertext document linking to the new URI. Note that according to the specification the client is expected to send a GET request to the value of Location, but it doesn't have to.
//EDIT as answer to your comment (yep, it's really bad practice to claim something without proving it :-) - my bad!):
Google presented the rel="next" and rel="prev" attributes in September 2011 on the Official Webmaster Central Blog. They can be used additionally to (or in some cases instead of) the rel="canonical" tag.
Under those links you can find the differences between them explained:
rel="next" and rel="prev" link elements are "to indicate the relationship between component URLs in a paginated series"
the rel="canonical" "allows you to publicly specify your preferred version of a URL"
So there is a slight difference between them. So you can break down your problem to a canonical issue: There are several URLs pointing to the same resource (/foo and foo?page=1 but you have a preferred version of the URL (foo?page=1). So now there are a few options for a RESTful approach:
If there is no page-parameter given in the query, use a default value (e.g. 1) when processing it. I think in this specific case it is OK to use a default value even though you point it out as bad practice.
Respond with 303 See Other providing the preferred URL in the Location-header (as described above). I think a 3xx-response is the best (and most likely RESTfully intended) way to deal with duplicate/canonical content.
Respond with 400 Bad Request in case you want to force the client to provide a page-parameter (as explained by zzzzBov in his answer). Note that this response does not have something like a Location header (as assumed in your question), so the explanation why the request failed and/or the correct URL (if given) must go to the entity-body of the response. Also, note that according to the specification this response is commonly used when the client submits a bad/malformed representation (! not URL !) along with a PUT or POST request. So keep in mind that this also might be a little ambiguous for the client.
Personally, I don't think your suggestion to respond with 405 Method Not Allowed is a good idea. According to the specification, you must provide an Allow-header listing the allowed methods. But what methods could be allowed on this resource? I can only think of POST. But if you do not want the client to POST to it either, you could also respond with 403 Forbidden with an explanation why it is forbidden, or 404 Not Found if you do not want to tell why it is forbidden. So it might be a little ambiguous, too (in my opinion).
Using link-elements with the mentioned rel-attributes as you propose in your question is not essentially 'RESTful' because it's only hypermedia which is settled in the representation of the resource. But your problem (as far as I understand it) is that you want to decide how to respond to a specific request and which representation to serve. But still it's not absolutely pointless:
You can consider the whole SEO issue as a side effect of using rel="next/prev/canonical", but keep in mind that they also create connectedness (as the quality of having links) which is one of the characteristics of REST (see Roy Fielding's dissertation).
If you want to dive into RESTful Web Services (which is totally worth it) I recommend reading the book RESTful Web Services by Leonard Richardson and Sam Ruby.

In some cases not implicitly managing anything for the client can lead to a overlay complex interface, examples would be where the consumer isn't technical or isn't intending on building on top of interface, for example in a web page. In such cases even a 200 may be appropriate.
In other cases I would agree implicit management would be a bad idea as the where a consumer would want to be able to predict the response correctly and where a simple specification may be required. In such cases 405, 400 and 303.
It's a matter of context.

Related

REST convention for a move operation

Is there a REST convention for a call to move a resource? I want to move an item (say item1) from location loc1 to location loc2. I could do a delete/insert, but that would require two separate calls. How can I do it with just one call? I've thought about several possibilities, like:
http://someapi.com/location/loc1/items/item1/location/loc2/move-item
but that looks awkward to me. In that URI it looks like the second location is a property of the item. Switching loc2 and item would then make it look as if loc2 is a property of loc1, again an awkward thing. As you can see I'm thoroughly confused, any suggestions?
As you can see I'm thoroughly confused, any suggestions?
In REST, identifiers are identifiers. They don't have any domain semantics associated with them. You can use any spelling you want for a resource.
Your resources are usually going to be affordances - documents about things, or documents that enable protocols that do things. Think web pages with links and forms.
POST /4e52bfd1-87d0-4cf1-8b45-175fb13294f3
POST is the right method token to use for request messages that aren't worth standardizing.
The target URI will normally identify some resource that we expect to be changed by the request. The reason for this choice is that it gives us invalidation of client/intermediate caches "for free".
Most of the semantic information you need to describe this "not worth standardizing" method belongs in the payload, not in the URI.
POST /4e52bfd1-87d0-4cf1-8b45-175fb13294f3
Content-Type: text/plain
Please move /4e52bfd1-87d0-4cf1-8b45-175fb13294f3 to /fc32bf42-b757-4ae2-b164-2cb61e3ca8fd
It is normal that a single resource might accept several different kinds of requests that all use the POST method. The server can discriminate based on the payload (consider, for example, multiple web forms that all POST to the same resource - it works just fine, but you do need to include enough information in the form that the server can figure out what the request means.)
For "move" semantics specifically -- the WebDAV specification does define a MOVE method token. If those standardized semantics match what you want to do, then it could be appropriate to support that. (I tend to avoid WebDAV myself - I'm suspicious of the constraints on "WebDAV-compliant resources", and therefore somewhat concerned about what assumptions general purpose clients are entitled to make about resources when one of the WebDAV method tokens is used. It might be fine? it might not be fine? but what I'm looking for is "obviously better than using POST" and I don't see that.)
I have to admit I'm still confused about this REST thing,
That's not your fault: LOTS of people are confused by this REST thing. There is much more misleading noise than there is signal.
Three talks that might help:
Jim Webber, 2011 - REST: DDD In the Large
Stefan Tilkov, 2014 - REST: I don't Think it Means What You Think it Does
Jon Moore, 2010 - Hypermedia APIs

HATEOAS Content-Type : Custom mime-type

I've been trying to implement a RESTFul architecture, but I've gotten thoroughly confused as to whether custom media-types are good or bad.
Currently my application conveys "links", using the Http Link: header. This is great, I use it with a title attribute, allowing the server to describe what on earth this 'action' actually is, especially when presented to a user.
Where I've gotten confused is whether I should specify a custom mime-type or not. For instance I have the concept of a user. It may be associated with the current resource. I'm going to make up an example and say I have an item on an auction. We may have a user "watching" it. So I would include the link
<http://someuserrelation> rel="http://myapp/watching";title="Joe Blogg", methods="GET"
In the header. If you had the ability to remove that user from watching you would get.
<http://someuserrelation> rel="http://myapp/watching";title="Joe Blogg", methods="GET,DELETE"
I'm pretty happy with this, if the client has the correct role he can remove the relationship. So I'm defining what to do with a relationship. The neat thing is say we call GET on that 'relation' resource, I redirect the client to the user resource.
What's confusing me is whether or not to use a custom mime-type. There's arguments both ways on the internet, and in my head in regards to this.
I've done a sample in which I call HEAD on an unknown url, and the server returns Content-Type: application/vnd.myapp.user. My client then decides whether it can understand this mime-type (it maintains mappings of resources it understands to views), and will either follow it, or explain that it's unable to figure out what is at the end of that link.
Is this bad?. I have to maintain special mime-types. What's particularly odd is I'm more than happy to use a standard application/user format, but can't find one specified anywhere.
I'm beginning to think I should be attempting to completely guess at rendering what in any HTTP response, almost to the point that maybe my RESTFul api should just be rendering html instead of attempting to do anything with json/xml.
I've tried searching (even Roy Fieldings blog), but can't find anything that describes how the client should deal with this sort of situation.
EDIT: the argument I have with including the custom type, is that it may not necessarily be a 'user' watching the item, it could be something with application/vnd.myapp.group. By getting the response a client knows the body has something different, and so change to a view that displays groups. But is this coupling of mime-type to view bad?.
I'd say, you definitely want to have a specific media-type for all representations. If you can find a standard one (html, jpeg, atom, etc.) use that, but if not, you should define one (or multiple ones).
The reason is: representations should be self-contained. That means your client gets a link from somewhere it should know what to do with it. How to display it, how to proceed from there, etc. For example a browser knows how to display text/html. You client should know how to display/handle application/vnd.company.user.
Also, I think you've got content negotiation backwards. You don't need to call HEAD to determine what representations the server supports. You can tell the server what your client supports in the GET/POST/etc requests using the "Accepts" header. Indeed this would be the standard way to do it. The server then responds with the 'best' representation it can give you for your accepted mime-types. You don't need more round-trips.
So, although the links you are providing can contain contextual information, usually given in the 'rel' attribute, like if the link points to a 'next page', 'previous page', 'subscribed user' or 'owner user', etc., the client can not assume any representation under those links. It knows it is semantically a 'user', so it can fill the 'Accepts' header with all supported representations for a user (application/vnd.company.user). If the representation only says text/xml, there is nothing the client can assume either of the content, or semantics of Links that it may receive.
In practice you can of course code any client to just assume what representations are under what links/urls, and you don't have to conform to REST all the time, but you do get a lot of benefits (described in Roy Fielding's paper) if you do.
Another minor point: The links do not need to contain which methods are available for a given resource, that's what OPTIONS is for. Admittedly, it is rarely implemented.
You don't have to send HTML to serve hypermedia. There are many different hypermedia formats which are much easier to parse than HTML. https://sookocheff.com/post/api/on-choosing-a-hypermedia-format/ https://stackoverflow.com/a/13063069/607033
You don't have to use a domain specific MIME type, in my opinion it is better to use a domain specific vocabulary with a general hypermedia type e.g. microformats or schema.org with JSON-LD + Hydra or ATOM/XML + microdata / RDFa, etc... There are many alternatives depending on your taste. https://en.wikipedia.org/wiki/Microdata_(HTML) http://microformats.org/ http://schema.org/ http://www.hydra-cg.com/
I am not sure whether adding the same relation to multiple methods is a good choice. Sending multiple links with different in link headers with d if you want: https://stackoverflow.com/a/25416118/607033

Hypermedia Restful API using Link Header and Range

I'm trying to develop a RESTful api using the hypertext as an engine of application state principals.
I've settled on using the Link header (RFC5988) for all 'state transitions', it seems natural to place links there, and doesn't make my response types specific on an implementation (eg XML/json/etc all just work).
What I'm struggling with at the moment is the pagination of a collection of resources. In the past I've used the Range header to control this, so the client can send "Range: MyObjects=0-20" and get the first 20 back. It seems natural to want to include a "next" relation to indicate the next 20 items (but maybe it isn't), but I'm unsure how to do it.
Many examples include the range information as part of the URI. eg it would be
Link: <http://test.com/myitems?start=20&end=40>;rel="next"
Using the range header would I do something like the following?
Link: <http://test.com/myitems;rel="next";range="myitems=20-40"
The concern here is that the link feels non-standard. A client would have to be checking the range header to get the next items.
The other thing is, would I just leave this all as somewhat out-of-band information?. Never showing next/previous ranges (as that sort of depends on what the client is doing), and expect the client to just serialize down what it needs when it needs it?. I can use the "accepts-ranges" link hints in the initial link to the collection to let the client know its 'pageable'
eg something like
OPTIONS http://test.com
-> Link:"<http://test.com/myitems";rel="http://test.com/rels/businessconcept";accepts-ranges="myitem""
Then the client would go, oh it accepts ranges, I will only pull down the first X, and then go for the next range as necessary (this sort of feels like implicit states though).
I can seem to figure out what is really in the best spirit of HATEOAS here.
The link header spec doesn't do exactly what you want AFAIK. It has no concept of identifying client supplied parameters. The rel doc's for your link can and should specify how to apply your range implementation.
The link-template header spec https://datatracker.ietf.org/doc/html/draft-nottingham-link-template-00#page-3 does a lot of what you want, but that expired. This specified how to use https://www.rfc-editor.org/rfc/rfc6570 style templates in headers, and if you want to use headers that the direction i'd suggest.
Personally, although I agree having the links in the header abstracts you from the body's content type, i find the link header to be difficult to develop against as you can't just paste a url in a browser and see links immediately.

Does HATEOAS imply that query strings are not RESTful?

Does the HATEOAS (hypermedia as the engine of app state) recommendation imply that query strings are not RESTful?
Edit: It was suggested below that query strings may not have much to do with state and that therefore the question is puzzling. I would suggest that it doesn't make sense for the URI to have a query string unless the client were filling in arguments. If the client is filling arguments then it is adulterating the server-supplied URI and I wonder if this violates the RESTful principle.
Edit 2: I realize that the query string seems harmless if the client treats it as opaque (and the query string might be a legacy and therefore convenient). However, in one of the answers below Roy Fielding is quoted as saying that the URI should be taken to be transparent. If it is transparent then I believe adulterating is encouraged and that seems to dilute the HATEOAS principle. Is such dilution still consistent with HATEOAS? This raises the question of whether REST is calling for the tight coupling that URI building seems to be.
Update At this REST tutorial http://rest.elkstein.org/ it is suggested that URI building is bad design and is not RESTful. It also iterates what was said by #zoul in the accepted answer.
For example, a "product list" request could return an ID per product, and the specification says that you should use http://www.acme.com/product/PRODUCT_ID to get additional details. That's bad design. Rather, the response should include the actual URL with each item: http://www.acme.com/product/001263, etc. Yes, this means that the output is larger. But it also means that you can easily direct clients to new URLs as needed
If a human is looking at this list and does not want what he/she can see, there might be a "previous 10 items" and a "next 10 items" button, however, if there is no human, but rather a client program, this aspect of REST seems a little weird because of all the "http://www" that the client program may have no use for.
In Roy Fielding's own words (4th bullet point in the article):
A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations.
In other words, as long as the clients don't get the pieces they need to generate URIs in out-of-band information, HATEAOS is not violated.
Note that URI templates can be used in URIs without query strings as well:
http://example.com/dictionary/{term}
So the question is more about whether it is RESTful to allow the client to construct URLs than whether it is RESTful to use query strings.
Notice how the amount of information served to the client in the example above is exactly equivalent to serving an exhaustive list of all the possible terms, but is a lot more efficient from a bandwidth point of view.
It also allows the client to search the dictionary while respecting HATEAOS, which would be impossible without the in-band instructions. I am quite confident that Roy Fielding is not promoting a Web without any search feature...
About your third comment, I think Roy Fielding is encouraging API designers to have "transparent" URIs as an extra feature on top of HATEAOS. I don't interpret his quote in zoul's answer as a statement that clients would use "common sense" to navigate an API with clear URIs. They would still use in-band instructions (like forms and URI templates). But this does not mean that transparent URIs are not way better than dark, surprising, opaque URIs.
In fact, transparent URIs provide added value to an API (debugging is one use case I can think of where having transparent URIs is invaluable).
For more info about URI templates, you can have a look at RFC6570.
My take on it is that REST itself says nothing about whether URI are opaque or transparent but that REST app should not depend on the client to construct URI that the server hasn't already told it about. There are a variety of ways for the server to do this: for example, a collection which may include links to its members, or an HTML form with the GET method will obviously cause a URI with params to be created client-side and fetched, or there are at least a couple of proposed standards for URI templates. The important thing for REST is that the description of valid URI should be defined somehow by the server in the responses it gives to clients, and not in some out-of-band API documentation
URI transparency is a good thing in the same way as transparency anywhere is a good thing - it promotes and permits novel and unplanned uses for resources beyond what the designer had originally imagined - but (at least in my understanding) nice URIs are not required to describe an interface as RESTful
I would suggest that it doesn't make
sense for the URI to have a query
string unless the client were filling
in arguments.
That does not seem true to me. If you ask server for a handful of photos, it’s perfectly valid for the server to return something like this:
<photos>
<photo url="http://somewhere/photo?id=1"/>
<photo url="http://somewhere/photo?id=2"/>
</photos>
You could use /photo/id/xx path instead, but that’s not the point. These URLs are usable even without the client changing them. As for your second point:
If the client is filling arguments
then it is adulterating the
server-supplied URI and I wonder if
this violates the RESTful principle.
I guess this is the heart of your question. And I don’t think you have to treat URLs as opaque identifiers, see this quote by Roy Fielding himself:
REST does not require that a URI be
opaque. The only place where the word
opaque occurs in my dissertation is
where I complain about the opaqueness
of cookies. In fact, RESTful
applications are, at all times,
encouraged to use human-meaningful,
hierarchical identifiers in order to
maximize the serendipitous use of the
information beyond what is anticipated
by the original application.
I don’t see what query strings have to do with state tracking. The point of the HATEOAS principle is to refrain from tracking the state on the client, from “cheating” and going to “known” URLs for data. Whether those URLs have query strings or not seems irrelevant to me.
Oh. Maybe you’re interested in something like search URLs where a certain part of the URL has to change according to search criteria? Because such URLs would seemingly have to be known beforehand, thus representing the out-of-band information that we seek to eliminate with REST? I think that this can be solved using URL templates. Example:
client -> server
GET /items
server -> client
/* …whatever, an item index… */
<search by="color">http://somewhere/items/colored/{#color_id}</search>
This way you don’t need no a priori URL knowledge to search and you should be true to the hypermedia state tracking principle. But my grasp of REST is very weak, I’m answering mainly to sort things in my head and to get feedback. Surely there’s a better answer.
No HATEOAS does not mean query strings are not RESTful. In fact the exact opposite can be the case.
Consider the common login scenario where the user tries to access a secured resource and they are sent to a login screen. The URL to the login screen often contains a query string parameter named redirectUrl, which tells the login screen where to return to after a successful login. This is an example of using URIs to maintain client state.
Here is another example of storing client state in the URL: http://yuml.me/diagram/scruffy/class/[Company]<>-1>[Location], [Location]+->[Point]
To follow-on from what Darrel has said, you do not need to alter the URL to include a second URL.
For cookie-based authentication you could return the login form within the body of the 401 response, with an empty form action, and use unique field names that can be POSTed to and processed by every resource. That way you avoid the need for a redirect entirely. If you can't have every resource process log-in requests, you can make the 401 form action point to a log-in action resource and put the redirect URL in a hidden field. Either way, you avoid having an ugly URL-in-a-URL, and the first way avoids the need for both an RPC log-in action and a redirect, keeping all the interaction focused on the resource.

Is this a bad REST URL?

I've just been reading about REST URLs and seen the following example:
/API/User/GetUser
Now if this is accessed over HTTP with a verb GET isn't this a bad URL becuase it describes the action (GET) in the URL?
It's more of a convention, than a hard rule, but I would rather see something like /API/User/7123. The GET/POST/etc describes the action verb, so also putting it in the url makes it redundant.
And in this situation there's no reason not to follow good proven practices.
Here's some good stuff: Understanding REST: Verbs, error codes, and authentication
There is no such thing as a REST URL. In fact, the word REST URL is pretty much an oxymoron. The Hypermedia As The Engine Of Application State Constraint guarantees that URLs are irrelevant: you are only ever following links presented to you by the server, anyway. You never see, read or type a URI anywhere. (Just like browsing the web: you don't look at the URL of a link, read it, memorize it and then type it into the address bar; you just click on it and don't care what it actually says.)
The term REST URL implies that you care about your URLs in your REST architecture. However, if you care about your URLs in your REST architecture, you are not RESTful. Therefore, REST URL is an oxymoron.
[Note: Proper URI design is very important for the URI-ness of a URI, especially the I part. Also, there's plenty of good usability reasons for pretty URLs. But both of these have nothing whatsoever to do with REST.]
Better way would be to have /API/User/7123 and use GET/POST method to signify operations
This is not necessarily bad... it has more to do with the framework you are using to generate your rest URLs. The link #Infinity posted is a good resource, but don't limit yourself to a set theory because it can cause an excessive amount of work in certain frameworks.
For example, there is no reason why you wouldn't want to run a GET on /API/Users/{id}/Delete to display an "are you sure" type of message before using the DELETE method.
/API/User/GetUser is not RESTful. Using a verb to identify a resource is not good thing. The example url is still valid but that doesn't make it right either. It is as wrong as the following declaration
String phoneNumber = "jhon#gmail.com";