REST - What exactly is meant by Uniform Interface?

Wikipedia has:
Uniform interface
The uniform interface constraint is fundamental to the design of any REST service.[14] The uniform interface simplifies and decouples the architecture, which enables each part to evolve independently. The four guiding principles of this interface are:
Identification of resources
Individual resources are identified in requests, for example using URIs in web-based REST systems. The resources themselves are conceptually separate from the representations that are returned to the client. For example, the server may send data from its database as HTML, XML or JSON, none of which are the server's internal representation, and it is the same one resource regardless.
Manipulation of resources through these representations
When a client holds a representation of a resource, including any metadata attached, it has enough information to modify or delete the resource.
Self-descriptive messages
Each message includes enough information to describe how to process the message. For example, which parser to invoke may be specified by an Internet media type (previously known as a MIME type). Responses also explicitly indicate their cacheability.
Hypermedia as the engine of application state (A.K.A. HATEOAS)
Clients make state transitions only through actions that are dynamically identified within hypermedia by the server (e.g., by hyperlinks within hypertext). Except for simple fixed entry points to the application, a client does not assume that any particular action is available for any particular resources beyond those described in representations previously received from the server.
I'm listening to a lecture on the subject and the lecturer has said:
"When someone comes up to our API, if you are able to get a customer object and you know there are order objects, you should be able to get the order objects with the same pattern that you got the customer objects from. These URI's are going to look like each other."
This strikes me as wrong. It's not so much about what the URIs look like, or whether they are consistent, as it is about the way the URIs are used (identifying resources, manipulating resources through representations, self-descriptive messages, and HATEOAS).
I don't think that's what Uniform Interface means at all. What exactly does it mean?

Using interfaces to decouple classes from the implementation of their dependencies is a pretty old concept. In REST you use the same concept to decouple the client from the implementation of the REST service. In order to define such an interface (a contract between the client and the service), you have to use standards. This is because if you want an internet size network of REST services, you have to enforce global concepts, like standards to make them understand each other.
Identification of resources - You use the URI (IRI) standard to identify a resource. In this case, a resource is a web document.
Manipulation of resources through these representations - You use the HTTP standard to describe communication. So for example GET means that you want to retrieve data about the URI-identified resource. You can describe an operation with an HTTP method and a URI.
Self-descriptive messages - You use standard MIME types and (standard) RDF vocabs to make messages self-descriptive. So the client can find the data by checking the semantics, and it doesn't have to know the application-specific data structure the service uses.
Hypermedia as the engine of application state (a.k.a. HATEOAS) - You use hyperlinks and possibly URI templates to decouple the client from the application-specific URI structure. You can annotate these hyperlinks with semantics e.g. IANA link relations, so the client will understand what they mean.

The Uniform Interface constraint, that any ReSTful architecture should comply with, actually means that, along with the data, server responses should also announce available actions and resources.
In chapter 5 ("Representational State Transfer") of his dissertation, Roy Fielding states that the aim of using uniform interfaces is to:
ease and improve global architecture and the visibility of interactions
In other words, querying resources should allow the client to request other actions and resources without knowing them in advance.
The JSON:API spec (jsonapi.org) offers a good example in the form of a JSON response to a (hypothetical) HTTP GET request on http://example.com/articles :
{
"links": {
"self": "http://example.com/articles",
"next": "http://example.com/articles?page[offset]=2",
"last": "http://example.com/articles?page[offset]=10"
},
"data": [{
"type": "articles",
"id": "1",
"attributes": {
"title": "JSON API paints my bikeshed!"
},
"relationships": {
"author": {
"links": {
"self": "http://example.com/articles/1/relationships/author",
"related": "http://example.com/articles/1/author"
}
},
"comments": {
"links": {
"self": "http://example.com/articles/1/relationships/comments",
"related": "http://example.com/articles/1/comments"
}
}
},
"links": {
"self": "http://example.com/articles/1"
}
}]
}
Just by analysing this single response, a client knows:
What entities were queried ("articles" in this example);
How these entities are structured (articles have fields: id, title, author, comments);
How to retrieve related entities (i.e. the author and the comments);
That there are more entities of type "articles" (10, based on current response length and pagination links).
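To make that concrete, here is a minimal sketch of a client that discovers the related URLs from such a response instead of building them itself. The document below is an abbreviated copy of the example above, and the helper name related_links is made up for illustration:

```python
import json

# Abbreviated copy of the response above; field names follow that example.
response_text = """
{
  "links": {
    "self": "http://example.com/articles",
    "next": "http://example.com/articles?page[offset]=2"
  },
  "data": [{
    "type": "articles",
    "id": "1",
    "relationships": {
      "author": {"links": {"related": "http://example.com/articles/1/author"}},
      "comments": {"links": {"related": "http://example.com/articles/1/comments"}}
    },
    "links": {"self": "http://example.com/articles/1"}
  }]
}
"""

def related_links(document):
    """Map (type, id, relationship name) to the 'related' URL the server advertised."""
    found = {}
    for resource in document.get("data", []):
        for name, rel in resource.get("relationships", {}).items():
            url = rel.get("links", {}).get("related")
            if url is not None:
                found[(resource["type"], resource["id"], name)] = url
    return found

document = json.loads(response_text)
links = related_links(document)
next_page = document["links"].get("next")  # None once the server stops offering one
```

The client only needs to know the relationship names ("author", "comments") and the "next" link name; the URLs themselves come from the server.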
For those passionate about the topic, I strongly recommend reading Roy Thomas Fielding's dissertation!

Your question is somewhat broad; you seem to be asking for a restatement of the definitions you already have. Are you looking for examples, or is there something specific in them that you do not understand?
I agree that the line:
These URI's are going to look like each other
is fundamentally wrong. URIs needn't look anything like each other for the Uniform interface constraint to be met. What needs to be present is a uniform way to discover the URIs that identify the resources. This uniform way is unique to each message type, and there must be some agreed upon format. For example in HTML one document resource links to another via a simple tag:
<a href="URI">fallback relationship</a>
HTTP servers return html as a text/html resource type which browsers have an agreed upon way of parsing. The anchor tag is the hypermedia control (HATEOAS) that has the unique identifier for the related resource.
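As a rough sketch of what "an agreed upon way of parsing" buys you, the snippet below uses Python's standard html.parser to pull the links out of a text/html document. The page content, URLs and rel values are hypothetical:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (rel, href) pairs from every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            if "href" in attributes:
                self.links.append((attributes.get("rel"), attributes["href"]))

# Hypothetical page; the URLs and rel values are made up for illustration.
page = '<p><a href="/orders/7" rel="order">Order 7</a> <a href="/help">Help</a></p>'

collector = LinkCollector()
collector.feed(page)
```

Nothing in the collector knows what an order URI looks like; it only relies on the media type's rule that an anchor tag carries a link.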
The only point that wasn't covered was manipulation. HTML has another awesome example of this, the form tag:
<form action="URI" method="verb">
<input name=""></input>
</form>
Again, browsers know how to interpret this meta information to define a representation of the resource acted upon at the URI. Unfortunately, HTML only lets you use GET and POST as verbs...
More commonly, in a JSON-based service, when you retrieve a Person resource it's easy to manipulate that representation and then PUT or PATCH it right back to its canonical URL. No pre-existing knowledge of the resource is needed to modify it. Now, when we write client code we get all wrapped up in the idea that we do in fact need to know the shape before we consume it, but that really is just to make our parsers efficient and easy. We could make parsers that analyze the semantic meaning of each part of a resource and modify it by interpreting the intent of the modification. For example, a command to make the person 10 years older would parse the resource looking for the age, identify it, add 10 years to that value, and then send the resource back to the server. Is it easier to have code that expects the age to be at a JSON path of $.age? Absolutely, but it's not strictly necessary.
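A minimal sketch of that round trip, assuming a hypothetical Person resource at http://example.com/people/42 whose representation has already been fetched. The request is only built, not sent, so the example stays offline:

```python
import json
import urllib.request

def build_update(canonical_url, representation, changes):
    """Return a PUT request carrying the received representation plus changes."""
    updated = {**representation, **changes}
    body = json.dumps(updated).encode("utf-8")
    return urllib.request.Request(
        canonical_url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Hypothetical representation, as if it had just been fetched with GET.
person_url = "http://example.com/people/42"
person = {"name": "Ada", "age": 36}

request = build_update(person_url, person, {"age": person["age"] + 10})
# urllib.request.urlopen(request) would send it; omitted so the sketch stays offline.
```

The client sends back what it received with one field changed, to the same URL it got the representation from.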

Ok I think I understand what it means.
From Fielding's dissertation:
The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components (Figure 5-6). By applying the software engineering principle of generality to the component interface, the overall system architecture is simplified and the visibility of interactions is improved.
He's saying that the interface between components must be the same, i.e. between the client, the server and any intermediaries, all of which are components.

Related

RESTful API design - using a resource URI vs an ID

this is my first post, so please bear with me.
I am designing a new RESTful API and I have two design choices in how my clients interact with resources that they create.
As an example, I have a resource: "book", which is a simple, singleton resource.
Creating a new book is very simple:
POST https://api.mydomain.com/book
I know I can also use PUT if I want the operation to be idempotent.
This question is solely about the 200 OK response options, returning either:
an anonymous resource identifier (UUID) of the created "book":
{
"book_id": "12345-67890",
"title": "a fantastic story"
}
a full FQDN URI to the created "book":
{
"book_uri": "https://mylibrary.mydomain.com/upstairs/book/12345-67890",
"title": "a fantastic story"
}
This of course significantly affects the subsequent manipulation of the "book" by the client.
To get the title of the above book, the client API calls would be either:
GET https://api.mydomain.com/book/{book-id}
Example: GET https://api.mydomain.com/book/12345-67890
Notes: The client will always use the same endpoint as the POST call, with the book-id simply appended.
GET {book-uri}
Example: GET https://mylibrary.mydomain.com/upstairs/book/12345-67890
Notes: The client will use the {book-uri} object variable directly from the POST response. Importantly, the returned {book-uri} may be a completely different URI to that of the POST used to create the "book".
So my questions (please) are:
Q1) which is the better model for the client to use and why?
Q2) can you see any issues with using Option 2 in a high volume, commercial system?
Thanks for any help and answers in advance.
can you see any issues with using Option 2 in a high volume, commercial system?
So, Option 2, where the HTTP response includes a URI for the newly created resource, is how the web itself works, and the web seems to be doing pretty well as a high volume commercial system.
Note also that option #2 allows the server to control its URIs. For instance, if you later decide that you want to revise the resource model, and use different spellings for the resource identifiers, then you can do that without needing to make any changes to the client.
You can also introduce, for example, a URI shortening component, because again you've got an identifier with standardized rules for how it works.
You don't necessarily need to use a full URI - we've also got standardized rules (RFC 3986) for how a relative reference can be resolved to a URI in a given context, so you'll likely have options like
{
"book_uri": "/upstairs/book/12345-67890",
"title": "a fantastic story"
}
... depending on whether or not the book resource is staged on the same host as the resource that handles the POST request.
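As a small illustration of those resolution rules, Python's urllib.parse.urljoin resolves a reference against a base URL. The sketch assumes the base is the URL the client POSTed to; in practice the base is whatever the response's context establishes:

```python
from urllib.parse import urljoin

# URL the client POSTed to (from the question's example).
post_url = "https://api.mydomain.com/book"

# A relative reference is resolved against the base URL, so it stays on the same host.
same_host = urljoin(post_url, "/upstairs/book/12345-67890")

# An absolute URI is used as-is, so the server may point at another host.
other_host = urljoin(post_url, "https://mylibrary.mydomain.com/upstairs/book/12345-67890")
```

Either way the client runs the same code; it never needs to know how the server lays out its URIs.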
Is this better? That's going to depend on what tradeoffs you need to make, and how much you value each of the benefits versus the costs.
The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction. -- Fielding, 2000

Elasticsearch truly RESTful?

I am designing an API that will need to accept a large amount of data in order to return resources. I thought about using a POST request instead of a GET so I can pass a body with the request. That has been largely frowned upon in the REST community:
Switching to a POST request simply because there's too much data to fit in a GET request makes little sense
https://stackoverflow.com/a/812935/7489702
Another:
Switching to POST discards a number of very useful features though. POST is defined as a non-safe, non-idempotent method. This means that if a POST request fails, an intermediate (such as a proxy) cannot just assume they can make the same request again. https://evertpot.com/dropbox-post-api/
Another: HTTP GET with request body
But contrary to this, Elasticsearch uses POST methods to get around the issue of queries being too long to put in a url.
Both HTTP GET and HTTP POST can be used to execute search with body. Since not all clients support GET with body, POST is allowed as well. https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html
So, is Elasticsearch not truly restful? Or, does the difference between POST and GET not matter as much in modern browsers?
Elasticsearch's intent is not to be RESTful but to provide a (pragmatic) Web API to clients in order to index documents and offer services like full-text search or aggregations to help the client in its needs.
Not everything that is exposed via HTTP is automatically RESTful. I claim that most of the so-called RESTful services aren't as RESTful as they think they are. In order to be RESTful, a service has to adhere to a couple of constraints, which Fielding, the inventor of REST, clarified further in a blog post.
Basically, RESTful services should adhere to and not violate the underlying protocol, and put a strong focus on resources and their presentation via media types. Although REST is used via HTTP most of the time, it is not restricted to this protocol.
Clients, on the other hand, should not have initial knowledge of or assumptions about the available resources or their returned state ("typed" resources) in an API, but learn them on the fly via issued requests and analyzed responses. This gives the server the opportunity to move around or rename resources easily without breaking a client implementation.
HATEOAS (Hypermedia as the engine of application state) enriches a resource state with links a client can use to trigger further requests in order to update its knowledge base or perform some state changes. Here a client should determine the semantics of a URI by the given relation name rather than by parsing the URI, as the relation name should not change if the server moves a resource around for whatever reason.
The client furthermore should use the relation name to determine what content type a resource may have. A relation name like news could force the client to request the resource as application/atom+xml representation while a contact relation might lead to a representation request of media-type text/vcard, vcard+json or vcard+xml.
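A rough sketch of that idea follows. The relation names and media types come from the paragraph above; the link documents, URIs and the plan_request helper are hypothetical:

```python
# Hypothetical mapping from relation name to the media types this client can parse.
ACCEPT_BY_RELATION = {
    "news": "application/atom+xml",
    "contact": "text/vcard, application/vcard+json, application/vcard+xml",
}

def plan_request(links, relation):
    """Pick the link by relation name and attach a matching Accept header."""
    href = links[relation]  # the URI is opaque: never parsed, only followed
    return {"url": href, "headers": {"Accept": ACCEPT_BY_RELATION[relation]}}

# Two versions of the same hypothetical resource: the server moved its URIs
# but kept the relation names stable.
links_v1 = {"news": "/feed", "contact": "/people/7/card"}
links_v2 = {"news": "/updates/latest", "contact": "/contacts/7"}

before = plan_request(links_v1, "contact")
after = plan_request(links_v2, "contact")
```

The client code is unchanged between the two versions; only the URL it ends up following differs.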
If you look at an ElasticSearch sample I took from dzone you will see that ES does not support HATEOAS at all:
Request
GET /bookdb_index/book/_search?q=guide
Response:
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 0.28168046,
"_source": {
"title": "Elasticsearch: The Definitive Guide",
"authors": [
"clinton gormley",
"zachary tong"
],
"summary": "A distibuted real-time search and analytics engine",
"publish_date": "2015-02-07",
"num_reviews": 20,
"publisher": "manning"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 0.24144039,
"_source": {
"title": "Solr in Action",
"authors": [
"trey grainger",
"timothy potter"
],
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"publish_date": "2014-04-05",
"num_reviews": 23,
"publisher": "manning"
}
}
]
The problem here is that the response contains Elasticsearch-specific metadata for the returned results. While this could be handled via special media types that teach a client what each field's semantics are, the actual payload kept in the _source element is still generic. Here you'd need further custom media-type extensions for each possible type.
If ES changes the response format in the future, clients which assume that _type will determine the type of a resource and _source will define the current state of some object of that type may break and hence stop working. Instead, a client should ask the server to return a resource in a format it understands. If the server does not support any of the requested representation formats, it will notify the client accordingly. If it supports at least one, it will transform the state of the requested resource into a representation the client understands.
Long story short, ElasticSearch is by no means RESTful and it also does not try to be. Instead your "RESTful" service should use it and use the results to generate a response in accordance with the requested representation by the client.
So, is Elasticsearch not truly restful? Or, does the difference between POST and GET not matter as much in modern browsers?
I think ES is not truly RESTful, because its queries are more complex than those of a normal web application.
REST proponents tend to favor URLs, such as
http://myserver.com/catalog/item/1729
but the REST architecture does not require these “pretty URLs”. A GET request with a parameter
http://myserver.com/catalog?item=1729 (Elasticsearch does this)
is just as RESTful.
The difference between POST and GET still matters to modern developers.
GET requests should be idempotent. That is, issuing a request twice should be no different from issuing it once. That’s what makes the requests cacheable. An “add to cart” request is not idempotent—issuing it twice adds two copies of the item to the cart. A POST request is clearly appropriate in this context. Thus, even a RESTful web application needs its share of POST requests.
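The difference can be simulated without any HTTP at all. In this sketch the Cart class is a hypothetical in-memory stand-in for the server-side resource, with one method per HTTP verb:

```python
class Cart:
    """In-memory stand-in for a server-side cart resource (hypothetical)."""

    def __init__(self):
        self.items = []

    def get(self):
        # Safe and idempotent: reading never changes server state.
        return list(self.items)

    def post(self, item):
        # Not idempotent: every call appends another copy.
        self.items.append(item)
        return len(self.items)

cart = Cart()
first_read = cart.get()
second_read = cart.get()      # same result as the first read
cart.post("item-1729")
cart.post("item-1729")        # a retry adds a second copy
```

This is why an intermediary can repeat or cache the read but must not blindly replay the "add to cart" request.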
Reference: What exactly is RESTful programming?

Is it really practical to use URLs instead of ids in a REST API

The proper design of REST APIs seems to be a controversial topic. As far as I understand it, the purist approach with regard to ids would be that the URL is the only identifier of a resource for the outside world, so neither does the client have to interpret the URL in any way (e.g. knowing that the last segment is the id) nor does the id have to be included explicitly in the representation returned for a simple GET request.
At first sight this seems to be a good rule because the client does not have to care about generating URLs based on ids, it's just the same thing. The id tells you how to retrieve the resource. However, I doubt that this is really applicable in practice. Some concerns that come to my mind:
What if the URL changes because of a new API version (given that it is part of the URL)
or the protocol changes from http to https.
or the application even moves to another domain for whatever reason
Short Ids are handy for referencing resources in parameters. This would not be possible: /books?author=short.author.id
It just puts too much information into an id that does not really belong there, because the id should not be interpreted by any consumer in such a way.
Is this really done in practice? Are there examples of popular public APIs applying this pattern? Or maybe I don't understand it correctly and this is not what REST purists advocate?
Have a look at Hypermedia Driven RESTful APIs. In HATEOAS, URIs are discoverable (and not documented) so that they can be changed. That is, unless they are the very entry points into your system (Cool URIs, the only ones that can be hard-coded by clients) - and you shouldn't have too many of those if you want the ability to evolve the rest of your system's URI structure in the future. This is in fact one of the most useful features of REST.
For the remaining non-Cool URIs, they can be changed over time, and your API documentation should spell out the fact that they should be discovered at runtime through hypermedia traversal.
Looking at the Richardson Maturity Model (level 3), this would be where links come into play. For example, from the top level, say /api/version(/1), you would discover there's a link to the groups. Here's how this could look in a tool like HAL Browser:
Root:
{
"_links": {
"self": {
"href": "/api/root"
},
"api:group-add": {
"href": "http://apiname:port/api/group"
},
"api:group-search": {
"href": "http://apiname:port/api/group?pageNumber={pageNumber}&pageSize={pageSize}&sort={sort}"
},
"api:group-by-id": {
"href": "http://apiname:port/api/group/{id}"
}
}
}
(The api:group-by-id href could equally be "http://apiname:port/api/group?id={id}".)
The advantage here would be that the client would only need to know the relationship (link) name (well obviously besides the resource structure/properties), while the server would be mostly free to alter the relationship (and resource) url.
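A minimal sketch of such a client, using a trimmed copy of the root document above. The resolve helper is hypothetical and only fills in plain {name} variables; it is not a full RFC 6570 implementation:

```python
import re
from urllib.parse import quote

# Trimmed copy of the root document above.
root = {
    "_links": {
        "self": {"href": "/api/root"},
        "api:group-add": {"href": "http://apiname:port/api/group"},
        "api:group-by-id": {"href": "http://apiname:port/api/group/{id}"},
    }
}

def resolve(document, relation, **variables):
    """Look up a link by relation name and fill in simple {name} variables.

    Only plain {name} expressions are handled; this is not a full
    RFC 6570 implementation.
    """
    href = document["_links"][relation]["href"]
    return re.sub(
        r"\{(\w+)\}",
        lambda match: quote(str(variables[match.group(1)]), safe=""),
        href,
    )

group_url = resolve(root, "api:group-by-id", id=17)
add_url = resolve(root, "api:group-add")
```

If the server later switches api:group-by-id to the query-string form, the same call keeps working because the client never hard-coded the path.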

Handling RESTful representation structure difference between POST and GET

I'm designing a REST API and despite trawling a number of best practice guides I can't find much relating to the best practice of handling the disparity between representation structure needed for a POST vs the same representation structure returned from a GET.
GET for a dummy user representation might look like this:
{
"id": 1234,
"created": "2012-04-23T18:25:43.511Z",
"username": "johndoe@example.com",
"name": "John Doe"
}
However, POST for the same dummy user representation cannot specify certain properties (namely the id and created):
{
"username": "johndoe@example.com",
"name": "John Doe"
}
Obviously this is an overly simplified example but given that the user cannot specify certain fields (and it might not always be obvious which ones are pertinent to the applied method) is it best practice to create separate representations for each or to expect the most complete version and handle the data disparity transparently on the server?
Despite the apparent ease of having a single representation and handling the disparity server side I am worried that this would be a bad experience for a user if it wasn't clear which values can be specified (or altered using PUT for example). If the tendency is to create separate representations is there a naming convention to apply to the representation definition?
e.g. i_user for incoming user and o_user for outgoing user. Or user_full and user_min or user and .user etc.
Update: My overly simplified example perhaps didn't properly illustrate the issue. Imagine a representation that has 50 properties (for example a server representation with all its monitoring attributes - cpu, ram, temp, storage_drive_a, storage_drive_b, file_permission etc.). Of these 50 properties, 30 are read-only and the other 20 are values that can be set.
First of all, the final semantics of the POST method are determined by the targeted resource, not by the HTTP protocol, as with the other methods, so your POST method can do anything you want, as long as you document it properly, and you are not replicating functionality already standardized by other methods.
So, in short, there's nothing wrong with having a different representation for POST and GET method.
However, asking for a best-practice in this case is pointless, because what defines the representation format is the media-type being used, not the method, but most of the so-called REST APIs around the internet use generic media-types for everything and clients rely on URI semantics to know which resource they are dealing with, which is not RESTful at all. Basically, you are asking for the best-practice for a problem that doesn't really exist in REST when things are done properly.
So, to answer your question, you can have different representations with different media-types -- like your complete user representation might have a media-type application/vnd.mycompany.user.full.v1+json, and a simplified user representation might have a media-type application/vnd.mycompany.user.min.v1+json -- or you can have a single representation like application/vnd.mycompany.user.v1+json and your documentation for this media-type might detail how some properties might exist or not, or might have default values if not provided. Your POST method will require one media-type to work, and will respond with 415 Unsupported Media Type if clients send anything else in the Content-Type header. In the same way, a client may choose the representation it wants with the Accept header.
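Here is a rough sketch of the two-media-type variant on the server side. The media type names are the ones suggested above; the handle_post function, the fixed id and the placeholder timestamp are assumptions for illustration, not part of any framework:

```python
# Hypothetical media types, taken from the names suggested in the answer.
USER_MIN = "application/vnd.mycompany.user.min.v1+json"
USER_FULL = "application/vnd.mycompany.user.full.v1+json"

def handle_post(content_type, payload, next_id=1234):
    """Return (status, headers, body) for a POST that creates a user."""
    if content_type != USER_MIN:
        # The request body is in a format this resource does not accept.
        return 415, {}, None
    # Server-assigned fields come last so a client cannot override them.
    # The timestamp is a fixed placeholder to keep the sketch deterministic.
    created = {**payload, "id": next_id, "created": "2012-04-23T18:25:43.511Z"}
    return 201, {"Content-Type": USER_FULL}, created

ok = handle_post(USER_MIN, {"username": "johndoe", "name": "John Doe"})
rejected = handle_post("application/json", {"username": "johndoe"})
```

The media type, not the HTTP method, tells each side which properties to expect, which is the point the answer is making.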
As you can see, what you are asking isn't a problem when you are really doing REST, and not merely using it as a buzzword for an HTTP API.

HATEOAS - Discovery and URI Templating

I'm designing a HATEOAS API for internal data at my company, but have been having troubles with the discovery of links. Consider the following set of steps for someone to retrieve information about a specific employee in this system:
User sends GET to http://coredata/ to get all available resources, returns a number of links including one tagged as rel = "http://coredata/rels/employees"
User follows HREF on the rel from the first request, performing a GET at (for example) http://coredata/employees
The data returned from this last call is my conundrum and a situation where I've heard mixed suggestions. Here are some of them:
That GET will return all employees (with perhaps truncated data), and the client would be responsible for picking the one it wants from that list.
That GET would return a number of URI templated links describing how to query / get one employee / get all employees. Something like:
"_links": {
"http://coredata/rels/employees#RetrieveOne": {
"href": "http://coredata/employees/{id}"
},
"http://coredata/rels/employees#Query": {
"href": "http://coredata/employees{?login,firstName,lastName}"
},
"http://coredata/rels/employees#All": {
"href": "http://coredata/employees/all"
}
}
I'm a little stuck here with what remains closest to HATEOAS. For option 1, I really do not want to make my clients retrieve all employees every time for the sake of navigation, but I can see how using URI templating in example two introduces some out-of-band knowledge.
My other thought was to use the RetrieveOne, Query, and All operations as my cool URLs, but that seems to violate the concept that you should be able to navigate to the resources you want from one base URI.
Has anyone else managed to come up with a good way to handle this? Navigation is dead simple once you've retrieved one resource or a set of resources, but it seems very difficult to use for discovery.
Option 2 is not too bad as you're using RFC 6570 to characterize the URI patterns; while HATEOAS is usually stated in terms of not having clients synthesize URIs, if a server is prepared to make guarantees on the URI template and to tell it to clients explicitly in a standard format, it's acceptable. (I would be tempted to have the “list all employees” URL be without the all suffix, so as to distinguish it from the employee with that ID; the client should not — in principle — know what an employee ID looks like.)
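To show what the client has to do with such a template, here is a minimal sketch that expands only the {?a,b,c} form-style query expression from RFC 6570. The expand_query helper is hypothetical and ignores every other operator the RFC defines:

```python
import re
from urllib.parse import quote

def expand_query(template, **values):
    """Expand a {?a,b,c} expression (RFC 6570 form-style query).

    Undefined variables are skipped, as the RFC specifies. Other operators
    and value modifiers are not handled in this sketch.
    """
    def replace(match):
        names = match.group(1).split(",")
        pairs = [
            f"{name}={quote(str(values[name]), safe='')}"
            for name in names
            if values.get(name) is not None
        ]
        return "?" + "&".join(pairs) if pairs else ""

    return re.sub(r"\{\?([^}]+)\}", replace, template)

template = "http://coredata/employees{?login,firstName,lastName}"
by_name = expand_query(template, firstName="Ada", lastName="Lovelace")
everyone = expand_query(template)
```

The client supplies values for variable names it was told about in the response; it still never invents the path or the parameter names itself.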
In fact, the main problem is actually that clients have to understand what those tag URIs mean; there's just no real way to guess that “http://coredata/rels/employees#All” means “list all employees”. That's where you get into embedding knowledge in clients, semantic labeling, etc. and HATEOAS doesn't really address those things.
TL;DR: Use OPTIONS method to return programmatically consumable documentation and always implement pagination.
We create a number of internal REST services at my work. We have standardized on the use of the OPTIONS method to return the metadata of a resource. The metadata we return acts as parsable documentation of that resource. It indicates URL templates, various options such as PAGE and PAGESIZE, and the different methods that the resource supports. We also return rel links so top-level resource discovery can occur with the use of OPTIONS without pulling any actual data.
We also implement pagination specifically to prevent issues around returning large amounts of data unnecessarily.
My HATEOAS API returns HTML as well as HAL+JSON, as you are using, and they both use the same URIs, so my JSON responses simply return what a human web user would see (minus all the pretty colours). e.g.
GET /
{"_links": {
"http://coredata/companies": { "href": "/companies?page=1" }
...
}}
GET /companies?page=1
{"_links": {
"next": { "href": "?page=2" }
...
}}
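A client for this kind of API can page through results by following "next" links only. In the sketch below the PAGES dictionary and fetch function are hypothetical stand-ins for the server, and the "items" field is made up; relative hrefs such as "?page=2" are resolved against the URL of the page that contained them:

```python
from urllib.parse import urljoin

# Hypothetical canned responses keyed by absolute URL, standing in for the server.
PAGES = {
    "http://coredata/companies?page=1": {"items": ["A", "B"], "_links": {"next": {"href": "?page=2"}}},
    "http://coredata/companies?page=2": {"items": ["C"], "_links": {}},
}

def fetch(url):
    """Stand-in for an HTTP GET; a real client would issue the request here."""
    return PAGES[url]

def collect(start_url):
    """Follow 'next' links until the server stops offering one."""
    items, url = [], start_url
    while url is not None:
        page = fetch(url)
        items.extend(page["items"])
        next_link = page["_links"].get("next")
        # Relative hrefs such as "?page=2" resolve against the current URL.
        url = urljoin(url, next_link["href"]) if next_link else None
    return items

all_items = collect("http://coredata/companies?page=1")
```

The loop never computes a page number; it stops when the server no longer advertises a "next" link.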