Writing a client for a RESTful (hypermedia) API

I've been reading up on 'real' RESTful APIs for a few days now, and I think I'm close to grokking what it's all about.
But one of the things that I stumble on is that I can't even begin to imagine how one would write a client for a 'real' hypermedia API:
Most of the examples I've read talk about browsers and spiders, but that's not especially helpful: one is human-directed and 'intelligent', the other is dumb and 'random'. As it stands, I kind of get the impression that you'd need to learn AI to get a client working.
One thing that isn't clear to me is how the client knows which verb to use on any given link. Is that implicit in the 'rel' type of the URI? The alternative (reading here) seems to be using XHTML and having a client which can parse and post forms.
How likely is it that a link will change, but not the route to the link?
In most examples you see around, the route and the link are the same:
eg. if I want to set up a client which will bring me back the list of cakes from Toni's Cake Shop:
http://tonis.com
{ "link": { "type": "cakes", "uri": "http://tonis.com/cakes" } }
What happens when Toni's becomes Toni's Food Shop, and the link becomes http://tonis.com/desserts/cakes?
Do we keep the initial cakes link at the root, for reverse-compatibility? And if not, how do we do a 'redirect' for the poor little agent who has been told "go to root, look for cakes"?
What am I missing?

Ok, I'm not a REST expert either; I've been reading a lot of related material lately, so what I'm going to write is not my own experience or opinion but rather a summary of what I've read, especially the REST in Practice book.
First of all, you can't escape having some initial agreement between client and server. The goal of REST is to make them agree on the bare minimum of things relevant to both of them, and to let each party take care of its own concerns. E.g., the client should not care about link layout or how data is stored on the server, and the server should not care about the client's state. What they agree on in advance (i.e. before the interaction begins) is what the aforementioned book's authors call the "Domain Application Protocol" (DAP).
The important thing about a DAP is that it's stateful, even though HTTP itself is not (any client-service interaction has state, at the very least a beginning and an end). This state can be described in terms of what a client can/may/is expected to do next: "I've started using the service, what now? Ok, I can search items. I've searched for this item, what's next? Ok, I can order this and that... etc."
What defines a hypermedia content type is that it can carry both the data being exchanged and the interaction state. As I already mentioned, state is described in terms of possible actions, and, as the "Resource" in REST suggests, all actions are described in terms of accessible resources. You have probably seen the acronym HATEOAS (Hypermedia as the engine of application state); that's what it apparently means.
So, to interact with the service, a client uses a hypermedia format they both understand, which can be standard, homegrown, or a mixture of the two (e.g. XML/XHTML-based). In addition, they must share the protocol, which is most likely HTTP; but since some details are left open by the standard, there must be idioms for its usage, like "use POST to create a resource and PUT to update one". Such a protocol would also include the entry points of the service (again, in terms of accessible resources).
Those three aspects fully define the domain protocol. In particular, a client is not supposed to know any internal links before it starts using the service, or to remember them after the interaction completes. As a result, any change in the internal navigation, like renaming /cakes to /f5d96b5c, will not affect the client as long as it adheres to the initial agreement and enters the shop through the front door.
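As a concrete illustration, here is a minimal Java sketch of such a client for the OP's Toni's Cake Shop example (class name and the regex-based link extraction are mine; a real client would parse whatever media type the agreement actually specifies):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CakeShopClient {
    // The only URI agreed on in advance: the entry point ("the front door").
    private static final URI ENTRY_POINT = URI.create("http://tonis.com/");

    private final HttpClient http = HttpClient.newHttpClient();

    public String listCakes() throws Exception {
        // 1. Enter through the front door.
        String root = get(ENTRY_POINT);
        // 2. Pick the link by its agreed *meaning* (its type/rel), wherever it points today.
        URI cakes = linkWithType(root, "cakes");
        // 3. Follow it. Renaming /cakes to /f5d96b5c on the server changes nothing here.
        return get(cakes);
    }

    private String get(URI uri) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // A regex stands in for proper media-type parsing, just to keep the sketch self-contained.
    private URI linkWithType(String body, String type) {
        Matcher m = Pattern
                .compile("\"type\"\\s*:\\s*\"" + Pattern.quote(type) + "\"[^}]*\"uri\"\\s*:\\s*\"([^\"]+)\"")
                .matcher(body);
        if (!m.find()) {
            throw new IllegalStateException("service no longer advertises a '" + type + "' link");
        }
        return URI.create(m.group(1));
    }
}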

@Benjol:
You must avoid programming clients against particular URIs. When you describe a link, what matters most is its meaning, not the URI itself. You should be able to change the URI at any time without breaking your client.
I'd change your example this way:
{"link": {
"rel": "collection http://relations.your-service.com/cakes",
"href": "http://tonis.com/cakes",
"title": "List of cakes",
"type": "application/vnd.yourformat+json"
}}
If there is a client which consumes your service, it needs to understand:
- the link structure itself
- the link relations (in this case "collection", which is a registered relation from the RFC, and "http://relations.your-service.com/cakes", which is your domain-specific link relation)
In this case the client can just dereference the address given by the "href" attribute and display the list of cakes. Later, if you change the URI of the cake-list provider, the client will continue to work, provided it still understands the semantics of your media type.
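Under that agreement, the consuming code never inspects the URI itself. A small Java sketch (the class and record names are mine, purely illustrative): select the link by its rel, treat href as opaque, and ask for the advertised media type.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LinkFollower {
    // Mirrors the link object above: rel carries the meaning, href is opaque.
    public record Link(String rel, URI href, String title, String type) {}

    private final HttpClient http = HttpClient.newHttpClient();

    public String fetch(Link link) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(link.href())
                .header("Accept", link.type()) // e.g. application/vnd.yourformat+json
                .GET()
                .build();
        HttpResponse<String> response =
                http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("unexpected status " + response.statusCode());
        }
        return response.body(); // the list of cakes, in the agreed format
    }
}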
P.S.
See registered link relation attributes:
http://www.iana.org/assignments/link-relations/link-relations.xml
Web Linking RFC: https://www.rfc-editor.org/rfc/rfc5988

Related

How to manage HATEOAS links when the server is the client?

I'm learning about HATEOAS. The backend server I'm working on will use a third-party REST API that uses HATEOAS. That API has an endpoint that returns the URL for each resource, and it also returns the related resource links with regular requests.
But I'm wondering what's a good way to manage these links on the server to avoid hardcoding them. For example, if the third party changes the URL of a resource, how will the server detect that change? Are there any standard practices for managing HATEOAS resource links?
Possible ways I can think of:
1. When the server starts, get all the resource URLs and cache them. Whenever the third-party API needs to be called, reuse these cached URLs. Whenever there is a 404 or related error, update the cached URL; or update the URLs periodically at intervals.
2. Get the resource URL each time before calling the endpoint. This is the simplest, but it essentially doubles the number of requests.
Neither sounds like a robust approach, though.
While discovery is generally a good thing and should allow a HATEOAS system to introduce changes in ways that hardcoded URLs don't, if URLs start breaking arbitrarily I would still consider that a major issue.
You should be able to store URLs / links on your side and have some expectation that they keep working.
There are some mechanisms that deal with changes, though:
- The server should return 301 / 308 redirects if a resource moved. If that happens, you should update your references.
- The server can emit Sunset or Deprecation headers. See: https://www.rfc-editor.org/rfc/rfc8594
Those are more general answers, but ultimately the existence of best practices does not mean that vendors will abide by them. With that in mind, I think your best bet is to find out what your vendor's deprecation policy is and see what they recommend.
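Putting both mechanisms together, a hedged Java sketch of a link store (names are mine; error handling trimmed): redirects are handled manually so the stored reference can be updated, and a Sunset header is surfaced as a warning.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class LinkStore {
    // Cached resource URIs, keyed by link relation (persistence omitted in this sketch).
    private final Map<String, URI> links = new ConcurrentHashMap<>();
    // NEVER: we want to see the 301/308 ourselves so we can update the stored link.
    private final HttpClient http = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.NEVER)
            .build();

    public void remember(String rel, URI uri) {
        links.put(rel, uri);
    }

    public HttpResponse<String> get(String rel) throws Exception {
        URI uri = Objects.requireNonNull(links.get(rel), "unknown rel: " + rel);
        for (int hop = 0; hop < 3; hop++) { // guard against redirect loops
            HttpRequest req = HttpRequest.newBuilder(uri).GET().build();
            HttpResponse<String> resp = http.send(req, HttpResponse.BodyHandlers.ofString());
            int status = resp.statusCode();
            if (status == 301 || status == 308) {
                // Permanent move: update the stored reference, then follow it.
                uri = uri.resolve(resp.headers().firstValue("Location").orElseThrow());
                links.put(rel, uri);
                continue;
            }
            // RFC 8594: the server may announce when this resource will stop working.
            resp.headers().firstValue("Sunset")
                    .ifPresent(date -> System.err.println("rel=" + rel + " sunsets at " + date));
            return resp;
        }
        throw new IllegalStateException("too many redirects for rel=" + rel);
    }
}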
Use a cached resource if it is valid, request a refresh when you don't have a local valid copy.
RFC 7234 defines the caching semantics of HTTP.
Ideally, you don't implement the caching rules yourself, but instead you use a general purpose cache.
In its ideal form, your bespoke implementation is talking to a headless browser, and the headless browser worries about the caching rules for you.
In theory, you need the initial URL to start the process, and everything else comes from that.
Each resource you get from the server should include links to other edges on the graph of service for that resource.
So, once you get the initial resource, all of the rest come automatically.
That said, it's not untoward to have "well known" entry points that are, ideally, unchanging URLs. But in the end, those are just "bookmarks", and not necessarily guaranteed end points.
Consider a shopping site such as Amazon. Outside of amazon.com, you don't know any of their URLs. They're all provided on the various forms and pages, and the human simply navigates the site. Those URLs can be changing all the time, and no one would know. With HATEOAS, it's up to the machine to follow the links, rather than a human. But the process of navigation is the same.
As others have mentioned, the idea of caching the root resource has merit. You then rely on the caching headers to tell you how often you have to refresh the links.
But that said, operationally there's no difference between following a normal link and following a cached link. Underneath, the cached resource loads faster, but you still need to "follow the link", because that's where the caching behavior kicks in. This is different from assuming the link is good, i.e. assuming you already know the result of a resource lookup. Your application follows the link. Always. The underlying infrastructure is responsible for making it efficient.
So your code should not, say, load up a root resource, stuff a map full of links, and then assume they're good. Rather, the code should request the root resource, perhaps as a Map of links (datatypes for the win), and let the next layer handle the details. It all depends on the type of caching involved. Some responses have declared lifetimes during which no follow-up is necessary. For others, you make the request anyway and the server tier responds with "nothing changed", so you can use your local copy, but you're still required to ask in the first place.
Those are implementation details that the SERVER mandates (not the client). It's a server contract. If they want you pinging them each and every time, so be it. That's the contract they're presenting to you, and if you want to be a Good Citizen, you should honor that contract.
Ideally, the server makes good decisions on these kinds of issues for the sake of efficiency, but in the end it's really up to them.
The client has to go along. The client in a HATEOAS system cedes a lot to the server. They're simply not decisions for the client to make.
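A minimal Java sketch of that "next layer" (assuming the server supports ETag validation; a production client would also honor RFC 7234 freshness lifetimes instead of revalidating every time): application code always calls follow(uri), and this layer quietly turns repeat fetches into conditional GETs.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachingFetcher {
    private record Cached(String etag, String body) {}

    private final Map<URI, Cached> cache = new ConcurrentHashMap<>();
    private final HttpClient http = HttpClient.newHttpClient();

    // The application "follows the link" every single time; efficiency is this layer's job.
    public String follow(URI uri) throws Exception {
        Cached prior = cache.get(uri);
        HttpRequest.Builder request = HttpRequest.newBuilder(uri).GET();
        if (prior != null) {
            request.header("If-None-Match", prior.etag()); // still asking, just cheaply
        }
        HttpResponse<String> resp =
                http.send(request.build(), HttpResponse.BodyHandlers.ofString());
        if (resp.statusCode() == 304) {
            return prior.body(); // server says "nothing changed": reuse the local copy
        }
        resp.headers().firstValue("ETag")
                .ifPresent(etag -> cache.put(uri, new Cached(etag, resp.body())));
        return resp.body();
    }
}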

REST API design for action that is not REST [closed]

I'm designing REST API for devices management system.
Endpoints:
http://api.example.com/users
http://api.example.com/devices
I want there to be an endpoint that performs some action on a selected device. Something like this:
http://api.example.com/devices/1/send_signal
I know that this is not REST-compatible, and I'm looking for suggestions on how to do it the right way.
I was thinking about adding another endpoint like:
http://api.example.com/calls
So when a user sends a POST request with, let's say, a deviceId parameter to that endpoint, a new entry is made in the database (to keep a history of all calls and who called the function), and at the same time the function is called on the specified device.
Would that be good architecture, and REST-compatible?
You are right about your hunch. It is not proper REST. Sometimes that is OK, but most often, it is a sign that something in your Domain needs redesigning.
Quite often, there is a domain model waiting to be discovered. Most often, things like "send_signal" are telling you that you've modeled your API too close to some library, backend service or database. An API, after all, is the interface you provide.
As I've written before: the R in REST stands for resource (which isn't literally true, since REST is "Representational State Transfer", but thinking in resources is the point).
Think in resources. Not in procedures, calls, internal tools, backend services, or the architecture of your system. That is your stuff. The API user should only be bothered with (clean) abstractions that make sense to the API user.
Both /calls and /.../send_signal are concerned far too much with procedures and internals.
What do you want to do with a device? You want to turn its camera on? That would be an update to the Camera model on the Device with ID 1337:
PUT /device/1337/camera { power: "on" }
You want a device to bzip up some log files and send them to a debug-server? You're creating a DebugSession model:
POST /device/1337/debug_session { delivery_bucket: 42, compress: "bzip" }
You want a device to send a message to some server? Create a Message on a Device:
POST /device/1337/messages { to: john, body: "Hello World" }
And so on.
That is REST. In REST you model your domain carefully. Lots of REST servers are really poor because they are very thin wrappers around a relational database and suffer from far too many leaky abstractions. Lots of other REST servers are really poor because they are written far too close to backend services, jobs being run, or other internal details.
If I want to start a new server, I want to say:
POST /server/ { region: eu-1, size: xl, disk: 1MB }
And not:
POST /resources/blockdisks/create { size: 10GB } => 1337 is created
GET /resources/blockdisks/1337?include_attrs=mountpoint,format
GET /servers/available/map_for/eu-1?xl => DB-Zfaa-dd-12
POST /servers/reserve { id: DB-Zfaa-dd-12, attach: { id: 1337, mountpoint: /dev/sdb2, format: zfs } }
(I'm not making this up; I've had to deal with such APIs. They are a pain to use, and quite certainly an even bigger pain to maintain.)
The lesson here: the first exposes a domain model of a Server, with only the few attributes that interest the user of the API. The second is modeled far too closely around all sorts of internal tooling and systems.
Edit: and all this completely ignores the even more important REST part: discovery. Links, headers, redirects, etc. But you were explicitly asking about naming resources, so that's what my answer is about. Once you have your resources, your domain models, architected, go back to the whiteboard and do it all over, now including links, headers, and other metadata, so that your API clients can discover what they can do and where they can do it.
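To sketch what that discovery could look like for this device API (the rel names and JSON shape are invented for illustration, not taken from any registry), a representation can embed its available transitions as links:

GET /device/1337
{ "id": 1337,
  "links": [
    { "rel": "camera",        "href": "/device/1337/camera" },
    { "rel": "debug-session", "href": "/device/1337/debug_session" },
    { "rel": "messages",      "href": "/device/1337/messages" }
  ]
}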
http://api.example.com/devices/1/send_signal
I know that is not REST compatible and I'm looking for suggestions to make it the right way.
It is REST compatible. REST doesn't care what spellings you use for your resource identifiers.
Would it be great architecture and REST compatible?
If this is what you want, think about how you would achieve the same result with a website. The client would open a bookmark in a browser, maybe follow some links, fill in a form, and submit it. That would all work because the (generic) browser understands the processing rules for HTML, and how to manage cache meta data in HTTP, and so on. There's no point in the chain where the client needs to compose an identifier -- it either uses one provided by the server as is, or it calculates one using the generic processing rules for HTML forms.
OP clearly uses a verb, to communicate intent and procedures, which is not RESTful.
No; this is a common misunderstanding. There's nothing in the REST architectural style that says anything about human readable semantics in identifiers. URL shorteners work.
It's analogous to saying that a variable name in a program should never be a verb, because that doesn't correctly communicate intent -- the compiler/interpreter doesn't care.
The use of spelling conventions does not make a URI more or less RESTful. See Tilkov. It might even be preferable to avoid using predictable identifiers, as that ensures that consumers read the identifiers provided in the hypermedia representations, which is the point.
I think you are on the right track. Based on what you said about the system, I would probably start with this way of doing it and go from there.
POST http://api.example.com/devices/123/calls
This would send the details of the call to the API, which would in turn save the call to a data store and send off an event to the appropriate subsystem or internal library to call out to the device.
Then you could have the following endpoints to get call details.
GET http://api.example.com/devices/123/calls/456
GET http://api.example.com/devices/123/calls (this would also accept query parameters to limit the results in some way, probably by date)
If you want to get the calls from all devices, you could do this with some query parameters restricting the result set, again maybe by date.
GET http://api.example.com/devices/calls
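To sketch the wire exchange for that design (the request fields and IDs here are invented for illustration): the client POSTs the call, and the server answers with the location of the new call resource.

POST http://api.example.com/devices/123/calls
{ "signal": "reboot", "calledBy": "user-42" }

HTTP/1.1 201 Created
Location: http://api.example.com/devices/123/calls/456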
Just as a side note: if this is an internal API used only by your applications, RPC style may be appropriate. But by following HTTP/REST you will make your software more malleable, so you can use it in more ways without making it specific to any one function.
This is a good article on REST vs RPC if you'd like to learn more. https://cloud.google.com/blog/products/application-development/rest-vs-rpc-what-problems-are-you-trying-to-solve-with-your-apis

Rest Services conventions

In REST services, we normally use GET requests when we want to retrieve data from the server; however, we can also retrieve data using a POST request.
We use POST to create, PUT to update, and DELETE to delete; however, we could even create new data using a DELETE request.
So I was just wondering what is the real reason behind for that, why these conventions are used?
So I was just wondering what is the real reason behind for that, why these conventions are used?
So the world doesn't fall apart!
No but seriously, why are any protocols or standards created? Take this historical scenario. Back in the early days of Google, many developers (relative to nowadays) weren't too savvy about the HTTP protocol. What you might have found was a bunch of sites that just made use of the well-known (maybe only known) GET method. So there would be links that were GET requests but performed operations that were meant to be POST requests, operations that changed the state of the server (sometimes very important changes of state). Enter Google, who spends its days crawling the web. So now you have all these links that Google is crawling, all these GET requests, all changing the state of the server. All these companies are getting a bunch of hits on their servers changing state. They all think they're being attacked! But Google isn't doing anything wrong. HTTP semantics state that GET requests should not have state-changing behavior; GET should be a "read only" method. So finally these companies smartened up and started following HTTP semantics. True story.
The moral of the story: follow protocols, that's what they're there for - to follow.
You seem to be looking at it from the perspective of the server implementation. Yeah, you can implement your server to accept a DELETE request to "get" something; that's not really the matter at hand. When implementing the server, you need to think about what the client expects. Ultimately, you are creating an API. Look at it from a code API perspective:
public class Foo {
    public Bar bar;

    public Bar deleteBar() {
        return bar; // Really?!
    }

    public void getBar() {
        bar = null; // What the..??!
    }
}
I don't know how long a developer would last in the game writing code like that. Any caller expecting to "get" a Bar (simply from the naming semantics) has another thing coming. The same goes for your REST services. It is ultimately a WEB API, and it should follow the semantics of the protocol (namely HTTP) on which it is built. Those who understand the protocol will have an idea of what the API does (at least in the CRUD sense) simply based on the type of request they make.
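For contrast, a version where the names tell the truth, which is all that the HTTP method conventions ask of a web API:

public class Foo {
    private Bar bar;

    public Bar getBar() {     // "get" reads, nothing more
        return bar;
    }

    public void deleteBar() { // "delete" destroys, as the name promises
        bar = null;
    }
}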
My suggestion to you or anyone trying to learn REST, is to get a good handle on HTTP. I would keep the following document handy. Read it once, then keep it as a reference
Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
GET responses can be cached by proxies; POST and DELETE responses are not. Yes, you can create data with a GET, but then you have to invalidate that cached response. Why do the extra work?
Also, the maximum header sizes accepted differ, because of each method's intended usage.
I recommend reading the spec, which clearly states how each HTTP method should be used.
why these conventions are used?
They are conventions, that is, best practices that have been adopted as a standard. You do not have to adhere to the standard, but most consumers of a REST service will assume you do. That way it is easier to understand the implementation / interface.

Connectedness & HATEOAS

It is said that in a well-defined RESTful system, clients only need to know the root URI or a few well-known URIs, and they should discover all other links through these initial URIs. I do understand the benefit of this approach (decoupled clients), but the downside for me is that the client needs to discover the links each time it tries to access something, i.e. given the following hierarchy of resources:
collection1
|- sub1
|  |- sub1sub1
|  |  |- sub1sub1sub1
|  |  |  |- sub1sub1sub1sub1
|  |- sub1sub2
|- sub2
|  |- sub2sub1
|  |- sub2sub2
|- sub3
|  |- sub3sub1
|  |- sub3sub2
If we follow the "client only needs to know the root URI" approach, then a client should only be aware of the root URI, i.e. /collection1 above, and the rest of the URIs should be discovered through hypermedia links. I find this cumbersome: each time a client needs to do a GET on, say, sub1sub1sub1sub1, should it first do a GET on /collection1, follow the link in the returned representation, and then do several more GETs on sub-resources to reach the desired resource? Or is my understanding of connectedness completely wrong?
Best regards,
Suresh
You will run into this mismatch when you try and build a REST api that does not match the flow of the user agent that is consuming the API.
Consider when you run a client application, the user is always presented with some initial screen. If you match the content and options on this screen with the root representation then the available links and desired transitions will match nicely. As the user selects options on the screen, you can transition to other representations and the client UI should be updated to reflect the new representation.
If you try and model your REST API as some kind of linked data repository and your client UI as an independent set of transitions then you will find HATEOAS quite painful.
Yes, it's right that the client application should traverse the links, but once it has discovered a resource, there's nothing wrong with keeping a reference to that resource and using it for longer than one request. If your client has the possibility of remembering things permanently, it can do so.
Consider how a web browser keeps its bookmarks. You probably have ten or a hundred bookmarks in the browser, and you probably found some of them deep in a hierarchy of pages, but the browser dutifully remembers them without needing to remember the path it took to find them.
A richer client application could remember the URI of sub1sub1sub1sub1 and reuse it if it still works. It's likely that it still represents the same thing (it ought to). If it no longer exists or fails for any other client reason (4xx), you can retrace your steps to see if you can find a suitable replacement.
And of course what Darrel Miller said :-)
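A sketch of that idea in Java (the rediscovery step is media-type specific, so it is injected here as a callback; names are mine): keep the remembered URI, and only retrace your steps from the root when it stops working.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.function.Supplier;

public class Bookmark {
    private final HttpClient http = HttpClient.newHttpClient();
    private final Supplier<URI> rediscover; // walks the rels from the root; media-type specific
    private URI uri;                        // the remembered link, e.g. to sub1sub1sub1sub1

    public Bookmark(URI initial, Supplier<URI> rediscover) {
        this.uri = initial;
        this.rediscover = rediscover;
    }

    public String fetch() throws Exception {
        HttpResponse<String> resp = get(uri);
        if (resp.statusCode() >= 400 && resp.statusCode() < 500) {
            uri = rediscover.get(); // bookmark went stale: retrace our steps
            resp = get(uri);
        }
        return resp.body();
    }

    private HttpResponse<String> get(URI target) throws Exception {
        return http.send(HttpRequest.newBuilder(target).GET().build(),
                         HttpResponse.BodyHandlers.ofString());
    }
}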
I don't think that's a strict requirement. As I understand it, it is legal for a client to access resources directly and start from there. The important thing is not to do this for state transitions, i.e. do not automatically proceed with /foo2 after operating on /foo1, and so forth. Retrieving /products/1234 directly in order to edit it seems perfectly fine. The server can always return, say, a redirect to /shop/products/1234 to remain backwards-compatible (which is desirable for search engines, bookmarks, and external links as well).

RESTful Web Services: method names, input parameters, and return values?

I'm trying to develop a simple REST API. I'm still trying to understand the basic architectural paradigms for it. I need some help with the following:
"Resources" should be nouns, right? So, I should have "user", not "getUser", right?
I've seen this approach in some APIs: www.domain.com/users/ (returns a list), www.domain.com/users/user (does something specific to a user). Is this approach good?
In most examples I've seen, the input and output values are usually just name/value pairs (e.g. color='red'). What if I wanted to send or return something more complex than that? Am I forced to deal with XML only?
Assume a PUT to /user/ to add a new user to the system. What would be a good format for the input (assume the only fields needed are 'username' and 'password')? What would be a good response if the user is created successfully? What if the creation fails (and I want to return a descriptive error message)?
What is a good and simple approach to authentication and authorization? I'd like to restrict most of the methods to users who have "logged in" successfully. Is passing username/password on each call OK? Is passing a token considered more secure (if so, how should it be implemented in terms of expiration, etc.)?
For point 1, yes. Nouns are expected.
For point 2, I'd expect /users to give me a list of users. I'd expect /users/123 to give me a particular user.
For point 3, you can return anything. Your client can specify what it wants. e.g. text/xml, application/json etc. by using an HTTP request header, and you should comply as much as you can with that request (although you may only handle, say, text/xml - that would be reasonable in a lot of situations).
For point 4, I'd expect POST to create a new user. PUT would update an existing object. For reporting success or errors, you should be using the existing HTTP success/error codes. e.g. 200 OK. See this SO answer for more info.
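To make points 3 and 4 concrete, here is one possible exchange (the JSON field names and the error shape are illustrative, not a standard): create with POST, report the new resource's location on success, and use a 4xx status plus a descriptive body on failure.

POST /users
Content-Type: application/json
{ "username": "alice", "password": "s3cret" }

HTTP/1.1 201 Created
Location: /users/123

or, on failure:

HTTP/1.1 400 Bad Request
Content-Type: application/json
{ "error": "username is already taken" }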
The most important constraint of REST is the hypermedia constraint ("hypertext as the engine of application state"). Think of your Web application as a state machine where each state can be requested by the client (e.g. GET /user/1). Once the client has one such state (think: a user looking at a Web page) it sees a bunch of links that it can follow to go to the next state in the application. For example, there might be a link from the 'user state' that the client can follow to go to the details state.
This way, the server presents the client with the application's state machine, one state at a time, at runtime. The clever thing: since the state machine is discovered at runtime, one state at a time, the server can dynamically change the state machine at runtime.
Having said that...
on 1. The resources essentially represent the application states you want to present to the client. They will often closely match domain objects (e.g. user), but make sure you understand that the representations you provide for them are not simply serialized domain objects but states of your Web application.
Thinking in terms of GET /users/123 is fine. Do NOT place any action inside a URI. Although not harmful (it is just an opaque string) it is confusing to say the least.
on 2. As Brian said. You might want to take a look at the Atom Publishing Protocol RFC (5023) because it explains create/read/update cycles pretty well.
on 3. Focus on document-oriented messages. Media types are an essential part of REST because they provide the application semantics (completely). Do not use generic types such as application/xml or application/json, as you'll couple your clients and servers through an often implicit schema. If nothing fits your needs, just make up your own type.
Maybe you are interested in an example I am hacking together using UBL: http://www.nordsc.com/blog/?cat=13
on 4. Normally, use POST /users/ for creation. Have a look at RFC 5023 - this will clarify that. It is an easy to understand spec.
on 5. Since you cannot use sessions (a stateful server) and still be RESTful, you have to send credentials in every request. The various HTTP auth schemes handle that already. This is also important with regard to caching, because the HTTP Authorization header has specific semantics for caches (no public caching). If you stuff your credentials into a cookie, you lose that important piece.
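A small Java sketch of that per-request pattern (HTTP Basic over HTTPS is assumed here; bearer tokens work the same way, just with a different header value): the credentials ride along on every call instead of living in a server-side session.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class PerRequestAuth {
    private final HttpClient http = HttpClient.newHttpClient();
    private final String authorization;

    public PerRequestAuth(String username, String password) {
        this.authorization = "Basic " + Base64.getEncoder()
                .encodeToString((username + ":" + password).getBytes(StandardCharsets.UTF_8));
    }

    public HttpResponse<String> get(URI uri) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Authorization", authorization) // sent on EVERY request
                .GET()
                .build();
        return http.send(request, HttpResponse.BodyHandlers.ofString());
    }
}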
All HTTP status codes have a certain application semantic. Use them, do not tunnel your own error semantics through HTTP.
You can come visit #rest IRC or join rest-discuss on Yahoo for detailed discussions.
Jan