Recently, I've designing a RESTful API, and I want to use the Link header field to implement HATEOAS.
This is all fine and works without any real problem, but I want to make it easier for the clients of the API.
The Link header for example might look like this:
Link: <https://api.domain.com/orders/{id}>; rel="https://docs.domain.com/order"
In this case the client would have to find the link by searching for the https://docs.domain.com/order value in the rel attribute.
This works, but since the URI for the docs could change it's fragile, and because of the length it makes it a bit impractical as a key to search for.
So I want to try and use a CURIE to make it something like this instead:
Link: <https://api.domain.com/orders/{id}>; rel="[rc:order]"
That way the problem of a URI changing is mitigated for the most part, and it's much more compact allowing for easier search by the client.
My problem is, that since I'm using a Link header field to implement HATEOAS I feel it would be most consistent to also include the CURIE as an HTTP header field, rather than introducing meta data in the response body.
Since the entire API uses standard HTTP header fields and status codes for all of it's meta data (pagination, versioning etc), I would rather not introduce meta data into the response body just to define a CURIE.
But if I use HTTP header fields, which field should I use for a CURIE?
Is there even a standard way to do this with HTTP header fields?
If not, should I just use a custom header field, like X-Curie: <https://docs.domain.com>; name="rc", and just document it somewhere for clients?
I've looked around and most resources are either in reference to XML or the HAL standard, so any information on this in relation to HTTP header fields would be appreciated.
No, you can't do that. The link relation is a string - nothing more.
The question that you should ask yourself is why you are using an unstable link relation name in the first place...
Even if you don't use the Link header, CURIE's would not solve the problem you present. Because the CURIE's standard state that a shortened URI must be "unwrapped" before any comparison is performed. This would also apply to comparison agains the link relation in question.
A more pragmatic approach would be to coin your own URI like foo:order. Then you can use some custom url shortening method of resolving the documentation url for the relation in question. This method is used by hypermedia formats like HAL+JSON (the HAL formats use of curies is actually misleading, it should only be used as a method for resolving URL's to documentation).
CURIEs in HTTP Link header's rel properties would not get expanded, because all rel properties are equated with simple string matches, none are treated as URIs.
My main concern would be the phrase "since the URI for the docs could change it's fragile" — then choose a URI which won't change. Even if you use a URL that redirects to some location deep in the docs, the URI you choose for the link relation needs to be permanent if you want client devs to be able to resolve it.
Related
We are creating a two REST APIs
To do a template processing
To convert the HTML to PDF.
The template in 1, and the HTML in 2 can be provided along with the request body as well as present in a persistent location.
What is the best way to design the REST contract? Are these good ways to design it? Or are there better suggestions for this?
For Scenario 1
PUT - /template/process; include the Template markup in the body
PUT - /templates/{id}/process; there is a template with that {id} in a persistent location
For Scenario 2
PUT /html2pdf/convert with the HTML markup mentioned in the body
PUT html2pdf/{id}/convert with the resource HTML with the {id} available in a persistent location.
Is it a good idea to have the action too (process, convert in this case) in the URL? Or is it possible to give a meaning to it using the HTTP verb itself?
The use of PUT is suspicious. Remember, part of the point of REST is that we have a uniform interface - everybody understands HTTP messages the same way.
In particular, PUT is the method token that we use if we are uploading a web page to a server. The semantics of the request are analogous to "save" or "upsert" - you are telling the server that you want to make the server's copy of the target resource look like your copy of the target resource.
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.” -- Fielding, 2009
What we want in a lot of cases similar to yours is an effectively read only request with a payload. As of early 2021, the registry of HTTP methods doesn't have a good match for that case - the closest options are SEARCH and REPORT, but both of those carry WebDAV baggage.
But in late 2020, the HTTP WG adopted the draft-snell-search-method spec:
the scope of work is to define a method that has the behavior of a GET with a body -- Tommy Pauly
So when that work is done, we'll have another standard method to consider for this type of problem.
Is it a good idea to have the action too in the URL?
It's a neutral idea.
URL/URI are identifiers; they are (from the point of view of the protocol) semantically opaque. So long as your spelling conventions are consistent with the production rules defined in RFC 3986, you can do as you like.
Any HTTP compliant component should understand that the semantics of the message are defined my the method token, not by the target-uri.
Resources are generalization of documents. The URI is, effectively, the "name" of the document. You can use any name for your documents that makes sense to your human beings (operators looking at access logs, tech writers trying to document your app, other developers trying to read those documents, etc).
I was wondering if it is technically RESTful to use an identifier as a grouping and not be a particular resource (no corresponding id).
For example:
get /location/address
get /location/coverage
get /location/region
These things are all a location hence having them behind the location identifier. Is this correct?
Or is it better to rethink the structure of these endpoints or break them into just /address - /coverage - /region?
First and foremost, a URI in a REST architecture is just a URI, a reference to an other resource. The actual spelling used inside the URI is furthermore not of relevance as the URI as a whole is a pointer that shouldn't get segmented or analyzed. While a set of URIs may span a whole tree of available paths to use, there is no requirement actually therefore.
An URI itself also does not necessarily tell anything about the content of the resource it points to. Fielding even claimed that
A REST API should never have “typed” resources that are significant to the client. Specification authors may use resource types for describing server implementation behind the interface, but those types must be irrelevant and invisible to the client. The only types that are significant to a client are the current representation’s media type and standardized relation names. (Source)
This blog post further describes what Fielding meant with this statement which basically just states that proper content type negotiation should be used instead of assuming that a resource has a certain type. On the Web you might retrieve a data page of your preferred car or sports team though you will most-likely receive a page in HTML format that is generic enough and therefore widely spread that plenty of clients (browsers) support it. Through the affordance of each of the defined elements your client (browser) knows how to present these to you for more convenient access of the content.
I'm not sure about you but if I have the choice between reading some line summarizing the content of a link and having to decipher the URI itself, I definitely prefer the former one. Such metadata is usually attached to a link in some way. In HTML the anchor (<a>) tag not only ships with meta data such as href or target but also rel which basically allows to annotate a URI with a set of keywords that tell a client about the purpose of the link. Such a keywords may be something like first, last, prev or next for paging through a sub-set of a collection, or preload for telling a client that it can load the content of the referenced URI early to speed up load times as the client will most likely be interested in the content of that link next. Such keywords should be standardized to gain wider acceptance or at least base off of an extension mechanism defined in Web Linking, as Dublin Core does it i.e.
Looking at your respective URIs they already seem to express what link relation are there for. As such they could make up for good candidates for defining them as link relations such as:
http://www.acme.com/rel/address
http://www.acme.com/rel/coverage
http://www.acme.com/rel/region
A link can basically have multiple relation names assigned simultaneously, depending on the media-type the payload is exchanged for. A client that does not know what a certain link-relation means will simply ignore it, clients that however have the knowledge will act accordingly upon finding URIs with such annotations. I admit that an arbitrary client will not be able to make use of all of the link-relations out of the box, especially the extension ones, but such link-relation names might be enforced by media-types as well or support for such relation names may be added through updates or plug-ins later on.
Media types, after all, are just a human-readable description on how a payload received for such a form will look like and how it has to be processed by some automata. Hence, a generic application/json is usually a bad media-type in a REST API as it just defines that the content should be embedded between curly braces and primarily represents a key-value structure with the addition of objects and arrays, but it lacks to hint a client on what a link is or on the semantics of certain elements found within the payload. application/hal+json is an extension of the basic JSON notation and adds some processing rules and semantic definitions of certain elements such as _links or _embedded, which a client can use. Here, curies could be used a link-relations as well I guess as in the end they also end up with unique URIs as requested by the Web linking extension definition.
Certain media-types also allow to pass along further processing hints in the media-type itself through the use of profiles. As such a server could hint a client i.e. that the collection expressed by the requested resource contains entries that follow a certain logic, i.e. that the collection contains a set of orders, where a processing entity can apply additional checks upon, i.e. that certain fields must be specified or that certain inputs have to be in bound between two values and stuff.
As writing a whole new media-type is quite some effort, investigating into already defined ones is for sure a good idea. If you really think there is no media type available yet that really is applicable to your domain, you should start to write one, probably in a community effort. Keep in mind though that this media-type should be as generic as possible to allow adoption by third parties otherwise hardly any client will really support it and thus limit the number of potential peers your applications can interact with. It is further a good idea to take reference at the HTML spec to see how elements got defined and how they maintained backwards compatibility as you don't want to register a new media-type on each change.
If you just want to implement some filter mechanism to only show locations that represent addresses rather than regions you may take reference on how it is done on the Web. Here usually a server will provide you a form to select a choice among the given set of choices and upon clicking the submit button (or enter key) a request is issued to the server which will return the subset of entries matching the query. Instead of using a form a server may already provide a client with a link that is annotated, as mentioned above, with some hints a client knows that they refer to addresses, regions or whatever options you have available and can chose based on the annotated link relations the URI which is of interest to the client. Again, the actual form of the URI is not of relevance here either as a client should just use the URIs that were provided by the server.
You might ask yourself why all that is needed?! The basic answer to this question is simply to avoid breaking things when stuff changes over time. Think of a case where a client wants to retrieve details on a previously sent order. If it had the knowledge of the URI hardcoded into it and the server changes the URI style the client might not be able to retrieve this data. On the other side if it had the knowledge to look for any URI annotated with http://www.acme/rel/order and just use that URI it couldn't care less if the URI changed as it just uses the given information to send it to the mentioned endpoint. By relying on well-defined media-types and the semantics of the defined elements, any peer supporting this media-type will also be able to process it. Almost every client is able to handle HTML in some way, though hardly any generic HTTP client can really act upon a custom JSON payload format, i.e. present you a clickable link or render a nice form you can update the data of an existing resource. Custom payload makes it also difficult to reuse the same client for different endpoints. I.e. you probably wont be able to use the same client to shop on two different web-shops that expose Web-RPC-based HTTP endpoints.
So, to sum up my post, as mentioned, the form of the URI is not really of relevance in a REST architecture as a server should always provide it to the client anyway and a client shouldn't deduct the meaning from it. Instead link-relations, media-type support and content-type negotiation should be used consistently.
REST doesn't care what spellings you use for your URI, beyond the fact that they should be consistent with the production rules defined in RFC 3986.
GET /location/region
GET /region
Those are both fine.
I've been trying to implement a RESTFul architecture, but I've gotten thoroughly confused as to whether custom media-types are good or bad.
Currently my application conveys "links", using the Http Link: header. This is great, I use it with a title attribute, allowing the server to describe what on earth this 'action' actually is, especially when presented to a user.
Where I've gotten confused is whether I should specify a custom mime-type or not. For instance I have the concept of a user. It may be associated with the current resource. I'm going to make up an example and say I have an item on an auction. We may have a user "watching" it. So I would include the link
<http://someuserrelation> rel="http://myapp/watching";title="Joe Blogg", methods="GET"
In the header. If you had the ability to remove that user from watching you would get.
<http://someuserrelation> rel="http://myapp/watching";title="Joe Blogg", methods="GET,DELETE"
I'm pretty happy with this, if the client has the correct role he can remove the relationship. So I'm defining what to do with a relationship. The neat thing is say we call GET on that 'relation' resource, I redirect the client to the user resource.
What's confusing me is whether or not to use a custom mime-type. There's arguments both ways on the internet, and in my head in regards to this.
I've done a sample in which I call HEAD on an unknown url, and the server returns Content-Type: application/vnd.myapp.user. My client then decides whether it can understand this mime-type (it maintains mappings of resources it understands to views), and will either follow it, or explain that it's unable to figure out what is at the end of that link.
Is this bad?. I have to maintain special mime-types. What's particularly odd is I'm more than happy to use a standard application/user format, but can't find one specified anywhere.
I'm beginning to think I should be attempting to completely guess at rendering what in any HTTP response, almost to the point that maybe my RESTFul api should just be rendering html instead of attempting to do anything with json/xml.
I've tried searching (even Roy Fieldings blog), but can't find anything that describes how the client should deal with this sort of situation.
EDIT: the argument I have with including the custom type, is that it may not necessarily be a 'user' watching the item, it could be something with application/vnd.myapp.group. By getting the response a client knows the body has something different, and so change to a view that displays groups. But is this coupling of mime-type to view bad?.
I'd say, you definitely want to have a specific media-type for all representations. If you can find a standard one (html, jpeg, atom, etc.) use that, but if not, you should define one (or multiple ones).
The reason is: representations should be self-contained. That means your client gets a link from somewhere it should know what to do with it. How to display it, how to proceed from there, etc. For example a browser knows how to display text/html. You client should know how to display/handle application/vnd.company.user.
Also, I think you've got content negotiation backwards. You don't need to call HEAD to determine what representations the server supports. You can tell the server what your client supports in the GET/POST/etc requests using the "Accepts" header. Indeed this would be the standard way to do it. The server then responds with the 'best' representation it can give you for your accepted mime-types. You don't need more round-trips.
So, although the links you are providing can contain contextual information, usually given in the 'rel' attribute, like if the link points to a 'next page', 'previous page', 'subscribed user' or 'owner user', etc., the client can not assume any representation under those links. It knows it is semantically a 'user', so it can fill the 'Accepts' header with all supported representations for a user (application/vnd.company.user). If the representation only says text/xml, there is nothing the client can assume either of the content, or semantics of Links that it may receive.
In practice you can of course code any client to just assume what representations are under what links/urls, and you don't have to conform to REST all the time, but you do get a lot of benefits (described in Roy Fielding's paper) if you do.
Another minor point: The links do not need to contain which methods are available for a given resource, that's what OPTIONS is for. Admittedly, it is rarely implemented.
You don't have to send HTML to serve hypermedia. There are many different hypermedia formats which are much easier to parse than HTML. https://sookocheff.com/post/api/on-choosing-a-hypermedia-format/ https://stackoverflow.com/a/13063069/607033
You don't have to use a domain specific MIME type, in my opinion it is better to use a domain specific vocabulary with a general hypermedia type e.g. microformats or schema.org with JSON-LD + Hydra or ATOM/XML + microdata / RDFa, etc... There are many alternatives depending on your taste. https://en.wikipedia.org/wiki/Microdata_(HTML) http://microformats.org/ http://schema.org/ http://www.hydra-cg.com/
I am not sure whether adding the same relation to multiple methods is a good choice. Sending multiple links with different in link headers with d if you want: https://stackoverflow.com/a/25416118/607033
I'm trying to develop a RESTful api using the hypertext as an engine of application state principals.
I've settled on using the Link header (RFC5988) for all 'state transitions', it seems natural to place links there, and doesn't make my response types specific on an implementation (eg XML/json/etc all just work).
What I'm struggling with at the moment is the pagination of a collection of resources. In the past I've used the Range header to control this, so the client can send "Range: MyObjects=0-20" and get the first 20 back. It seems natural to want to include a "next" relation to indicate the next 20 items (but maybe it isn't), but I'm unsure how to do it.
Many examples include the range information as part of the URI. eg it would be
Link: <http://test.com/myitems?start=20&end=40>;rel="next"
Using the range header would I do something like the following?
Link: <http://test.com/myitems;rel="next";range="myitems=20-40"
The concern here is that the link feels non-standard. A client would have to be checking the range header to get the next items.
The other thing is, would I just leave this all as somewhat out-of-band information?. Never showing next/previous ranges (as that sort of depends on what the client is doing), and expect the client to just serialize down what it needs when it needs it?. I can use the "accepts-ranges" link hints in the initial link to the collection to let the client know its 'pageable'
eg something like
OPTIONS http://test.com
-> Link:"<http://test.com/myitems";rel="http://test.com/rels/businessconcept";accepts-ranges="myitem""
Then the client would go, oh it accepts ranges, I will only pull down the first X, and then go for the next range as necessary (this sort of feels like implicit states though).
I can seem to figure out what is really in the best spirit of HATEOAS here.
The link header spec doesn't do exactly what you want AFAIK. It has no concept of identifying client supplied parameters. The rel doc's for your link can and should specify how to apply your range implementation.
The link-template header spec https://datatracker.ietf.org/doc/html/draft-nottingham-link-template-00#page-3 does a lot of what you want, but that expired. This specified how to use https://www.rfc-editor.org/rfc/rfc6570 style templates in headers, and if you want to use headers that the direction i'd suggest.
Personally, although I agree having the links in the header abstracts you from the body's content type, i find the link header to be difficult to develop against as you can't just paste a url in a browser and see links immediately.
Does the HATEOAS (hypermedia as the engine of app state) recommendation imply that query strings are not RESTful?
Edit: It was suggested below that query strings may not have much to do with state and that therefore the question is puzzling. I would suggest that it doesn't make sense for the URI to have a query string unless the client were filling in arguments. If the client is filling arguments then it is adulterating the server-supplied URI and I wonder if this violates the RESTful principle.
Edit 2: I realize that the query string seems harmless if the client treats it as opaque (and the query string might be a legacy and therefore convenient). However, in one of the answers below Roy Fielding is quoted as saying that the URI should be taken to be transparent. If it is transparent then I believe adulterating is encouraged and that seems to dilute the HATEOAS principle. Is such dilution still consistent with HATEOAS? This raises the question of whether REST is calling for the tight coupling that URI building seems to be.
Update At this REST tutorial http://rest.elkstein.org/ it is suggested that URI building is bad design and is not RESTful. It also iterates what was said by #zoul in the accepted answer.
For example, a "product list" request could return an ID per product, and the specification says that you should use http://www.acme.com/product/PRODUCT_ID to get additional details. That's bad design. Rather, the response should include the actual URL with each item: http://www.acme.com/product/001263, etc. Yes, this means that the output is larger. But it also means that you can easily direct clients to new URLs as needed
If a human is looking at this list and does not want what he/she can see, there might be a "previous 10 items" and a "next 10 items" button, however, if there is no human, but rather a client program, this aspect of REST seems a little weird because of all the "http://www" that the client program may have no use for.
In Roy Fielding's own words (4th bullet point in the article):
A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations.
In other words, as long as the clients don't get the pieces they need to generate URIs in out-of-band information, HATEAOS is not violated.
Note that URI templates can be used in URIs without query strings as well:
http://example.com/dictionary/{term}
So the question is more about whether it is RESTful to allow the client to construct URLs than whether it is RESTful to use query strings.
Notice how the amount of information served to the client in the example above is exactly equivalent to serving an exhaustive list of all the possible terms, but is a lot more efficient from a bandwidth point of view.
It also allows the client to search the dictionary while respecting HATEAOS, which would be impossible without the in-band instructions. I am quite confident that Roy Fielding is not promoting a Web without any search feature...
About your third comment, I think Roy Fielding is encouraging API designers to have "transparent" URIs as an extra feature on top of HATEAOS. I don't interpret his quote in zoul's answer as a statement that clients would use "common sense" to navigate an API with clear URIs. They would still use in-band instructions (like forms and URI templates). But this does not mean that transparent URIs are not way better than dark, surprising, opaque URIs.
In fact, transparent URIs provide added value to an API (debugging is one use case I can think of where having transparent URIs is invaluable).
For more info about URI templates, you can have a look at RFC6570.
My take on it is that REST itself says nothing about whether URI are opaque or transparent but that REST app should not depend on the client to construct URI that the server hasn't already told it about. There are a variety of ways for the server to do this: for example, a collection which may include links to its members, or an HTML form with the GET method will obviously cause a URI with params to be created client-side and fetched, or there are at least a couple of proposed standards for URI templates. The important thing for REST is that the description of valid URI should be defined somehow by the server in the responses it gives to clients, and not in some out-of-band API documentation
URI transparency is a good thing in the same way as transparency anywhere is a good thing - it promotes and permits novel and unplanned uses for resources beyond what the designer had originally imagined - but (at least in my understanding) nice URIs are not required to describe an interface as RESTful
I would suggest that it doesn't make
sense for the URI to have a query
string unless the client were filling
in arguments.
That does not seem true to me. If you ask server for a handful of photos, it’s perfectly valid for the server to return something like this:
<photos>
<photo url="http://somewhere/photo?id=1"/>
<photo url="http://somewhere/photo?id=2"/>
</photos>
You could use /photo/id/xx path instead, but that’s not the point. These URLs are usable even without the client changing them. As for your second point:
If the client is filling arguments
then it is adulterating the
server-supplied URI and I wonder if
this violates the RESTful principle.
I guess this is the heart of your question. And I don’t think you have to treat URLs as opaque identifiers, see this quote by Roy Fielding himself:
REST does not require that a URI be
opaque. The only place where the word
opaque occurs in my dissertation is
where I complain about the opaqueness
of cookies. In fact, RESTful
applications are, at all times,
encouraged to use human-meaningful,
hierarchical identifiers in order to
maximize the serendipitous use of the
information beyond what is anticipated
by the original application.
I don’t see what query strings have to do with state tracking. The point of the HATEOAS principle is to refrain from tracking the state on the client, from “cheating” and going to “known” URLs for data. Whether those URLs have query strings or not seems irrelevant to me.
Oh. Maybe you’re interested in something like search URLs where a certain part of the URL has to change according to search criteria? Because such URLs would seemingly have to be known beforehand, thus representing the out-of-band information that we seek to eliminate with REST? I think that this can be solved using URL templates. Example:
client -> server
GET /items
server -> client
/* …whatever, an item index… */
<search by="color">http://somewhere/items/colored/{#color_id}</search>
This way you don’t need no a priori URL knowledge to search and you should be true to the hypermedia state tracking principle. But my grasp of REST is very weak, I’m answering mainly to sort things in my head and to get feedback. Surely there’s a better answer.
No HATEOAS does not mean query strings are not RESTful. In fact the exact opposite can be the case.
Consider the common login scenario where the user tries to access a secured resource and they are sent to a login screen. The URL to the login screen often contains a query string parameter named redirectUrl, which tells the login screen where to return to after a successful login. This is an example of using URIs to maintain client state.
Here is another example of storing client state in the URL: http://yuml.me/diagram/scruffy/class/[Company]<>-1>[Location], [Location]+->[Point]
To follow-on from what Darrel has said, you do not need to alter the URL to include a second URL.
For cookie-based authentication you could return the login form within the body of the 401 response, with an empty form action, and use unique field names that can be POSTed to and processed by every resource. That way you avoid the need for a redirect entirely. If you can't have every resource process log-in requests, you can make the 401 form action point to a log-in action resource and put the redirect URL in a hidden field. Either way, you avoid having an ugly URL-in-a-URL, and the first way avoids the need for both an RPC log-in action and a redirect, keeping all the interaction focused on the resource.