RESTful APIs when multiple actions on the same URI

So far as I know, four kinds of methods are used in RESTful APIs:
GET for getting the resource.
POST for updating the resource.
PUT for creating or substituting the resource.
DELETE for deleting the resource.
Assume we have a resource named apple, and we can 'update' it in several ways. For example, pare it, slice it, or make it apple juice.
Each of these three different updating actions takes different arguments, and of their APIs, the common part will be:
POST /apple HTTP/1.1
Host: www.example.com
<different combination of arguments>
In this situation, the three APIs share the same URI and the same request method; the only difference between them is the arguments. I think this forces the backend to accept the union of those arguments, and to work out which action is actually requested, the backend needs to inspect the combination of arguments. That is complicated and not graceful.
So my question is:
In this apple case, how can I work out an elegant set of RESTful APIs that the backend can handle easily?

First of all, try to avoid conflating HTTP methods with CRUD operations. I believe that's the main source of confusion in REST. HTTP methods don't translate to CRUD operations cleanly like that. I have a detailed answer here:
S3 REST API and POST method
In short.
POST is the method used for any operation that isn't standardized by HTTP, and subjects the payload to the target URI.
PUT is used to completely replace the resource at the present URI, and subjects the payload to the service itself.
PATCH is for partial updates, expressed as a diff between the current and the desired state. It is not inherently idempotent, although a patch document can be written so that it is.
DELETE is used to delete the resource.
GET is used to retrieve the resource.
Now, on the backend side, try to think of REST resources more like a state machine where you can use the methods to force a transition rather than an object with methods. That way you focus the implementation on the resource itself, not on the interaction with the protocol. For instance, you may change an object's attributes straightforwardly from the method's payload, and then have a method that's called to detect what transition is needed.
For instance, you may think of an apple as having four states: whole, pared, sliced and juiced. You transition between states by using the standardized behavior of the methods.
For instance:
GET /apple
{"state": "whole",
"self": "/apple"}
Then you want to slice it. You may do something like:
PUT /apple
{"state": "sliced"}
Or you may do something like:
PATCH /apple
{"from_state": "whole", "to_state": "sliced"}
Or even something like:
POST /apple
{"transition": "slice"}
The idea is that the implementations can be generic enough that you don't have to worry too much about coupling the resource to the HTTP methods.
The PUT version is idempotent, so your clients can choose to use it when they need idempotence.
The PATCH version guarantees the client knows the current state and is trying a valid transition.
The POST version is the most flexible: you can do anything you want, but it needs to be documented in detail. You can't simply assume your clients will know how the method works.
As long as your implementation of the resource understands that when apple.state is changed to something else it should detect what change occurred and perform the adequate transition, you are completely decoupled from the protocol. It doesn't matter what method was used.
I believe this is the most elegant solution, and makes everything easier to handle from the backend side. You can implement your objects without worrying too much about the protocol. As long as the objects can be transitioned between states, they can be used by any protocol that can effect those transitions.
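A minimal sketch of that idea in Python. The `Apple` class, the `ALLOWED` and `TRANSITIONS` tables, and the `handle` function are all hypothetical names made up for illustration; the point is only that the three request styles end up calling the same `transition_to` method, so the resource logic never sees the HTTP method.

```python
# Hypothetical transition table: which states may follow which.
ALLOWED = {
    "whole": {"pared", "sliced", "juiced"},
    "pared": {"sliced", "juiced"},
    "sliced": {"juiced"},
    "juiced": set(),
}

# Hypothetical mapping from POST "transition" names to target states.
TRANSITIONS = {"pare": "pared", "slice": "sliced", "juice": "juiced"}


class Apple:
    def __init__(self):
        self.state = "whole"

    def transition_to(self, new_state):
        """Single entry point for state changes, independent of HTTP."""
        if new_state == self.state:
            return  # already in that state; repeating the request changes nothing
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state


def handle(apple, method, payload):
    """Translate each request style into the same transition call."""
    if method == "PUT":
        apple.transition_to(payload["state"])
    elif method == "PATCH":
        # The client must prove it knows the current state.
        if payload["from_state"] != apple.state:
            raise ValueError("client's view of the state is stale")
        apple.transition_to(payload["to_state"])
    elif method == "POST":
        apple.transition_to(TRANSITIONS[payload["transition"]])
    else:
        raise ValueError(f"unsupported method {method}")
    return {"state": apple.state, "self": "/apple"}
```

For example, `handle(Apple(), "PUT", {"state": "sliced"})` and `handle(Apple(), "POST", {"transition": "slice"})` both leave the apple in the `sliced` state, and a real server would map the `ValueError` cases to suitable error responses.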

My RESTful HTTP API is rather different from yours. I have:
GET for getting a resource.
POST for appending a new resource to a collection.
PUT for substituting a resource (including truncating collections).
DELETE for deleting a resource.
PATCH for updating a resource.
LINK for indicating a relationship between two resources.
UNLINK for removing a relationship between two resources.
A ‘leaf’ resource can be thought of as a collection too.
For example, say you have /fruits and you POST an apple to that collection resource, that returns
201 Created
Location: /fruits/apple
In the same way, you can treat /fruits/apple as a collection of its properties, so:
GET /fruits/apple
->
colour=red&diameter=47mm
GET /fruits/apple/colour
->
red
GET /fruits/apple/diameter
->
47mm
and therefore:
PUT /fruits/apple/slices
"12"
->
201 Created
GET /fruits/apple
->
colour=red&diameter=47mm&slices=12
So in summary, I would recommend representing your actions as nouns, and locate those nouns as sub-resources of the resource you want to apply the action to.
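As a rough illustration of the sub-resource idea, here is a small in-memory sketch in Python. The `store` dictionary and the `get` and `put` helpers are hypothetical, and the choice of 204 for a replacement is an assumption rather than something stated above.

```python
from urllib.parse import urlencode

# Hypothetical in-memory store: resource path -> dict of properties.
store = {"/fruits/apple": {"colour": "red", "diameter": "47mm"}}


def get(path):
    """Return (status, body) for a resource or for one of its properties."""
    if path in store:
        return 200, urlencode(store[path])
    parent, _, prop = path.rpartition("/")
    if parent in store and prop in store[parent]:
        return 200, store[parent][prop]
    return 404, ""


def put(path, value):
    """Create or replace a property sub-resource such as /fruits/apple/slices."""
    parent, _, prop = path.rpartition("/")
    if parent not in store:
        return 404
    created = prop not in store[parent]
    store[parent][prop] = value
    return 201 if created else 204
```

With this sketch, `put("/fruits/apple/slices", "12")` returns 201 the first time, and a following `get("/fruits/apple")` includes `slices=12` in its body.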

Think in terms of resources. Here Apple is a resource.
To add one or more apples to the list "/apples", use POST. REST style allows posting an array.
POST /apples HTTP/1.1
Host: www.example.com
Now suppose you have an apple with ID 123. You can get details using method GET on "/apples/123".
GET /apples/123 HTTP/1.1
Host: www.example.com
To make any change to apple 123, just PUT to it directly.
PUT /apples/123 HTTP/1.1
Host: www.example.com
Pare it, slice it, or make it apple juice - all these are basically changing some attributes of apple 123. As you were saying (rightly), PUT different combinations of attributes.

I think this is up to the implementor to decide, but I see two approaches. Strictly from a single responsibility perspective it may make sense to provide separate services for these distinct operations.
However, if you insist on a single service for all of them, I guess you can pass an object with an action type qualifier to make it easy to delegate the request to different code in the service. The single object can then have other optional parameters to support the data needs of each operation.
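The single-service approach might look something like the following Python sketch. The handler names, the `action` field, and the optional parameters (`pieces`, `sugar_grams`) are all assumptions invented for the example.

```python
# Hypothetical handlers; each accepts only the optional parameters it needs
# and ignores any others via **_.
def pare(apple, **_):
    return {**apple, "state": "pared"}


def slice_(apple, pieces=8, **_):
    return {**apple, "state": "sliced", "pieces": pieces}


def juice(apple, sugar_grams=0, **_):
    return {**apple, "state": "juiced", "sugar_grams": sugar_grams}


# The action type qualifier selects the code path.
HANDLERS = {"pare": pare, "slice": slice_, "juice": juice}


def handle_post(apple, body):
    """Dispatch POST /apple on the 'action' field of the request body."""
    params = dict(body)  # copy so the caller's body is not modified
    action = params.pop("action", None)
    handler = HANDLERS.get(action)
    if handler is None:
        return 400, {"error": f"unknown action: {action!r}"}
    return 200, handler(apple, **params)
```

A request body such as `{"action": "slice", "pieces": 12}` is routed to `slice_`, so the backend never has to guess the action from the combination of arguments.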

Related

How to address ambiguous 404s when designing a RESTful API

I've come across this curious scenario while writing tests + documentation for a REST API I am developing. According to this REST tutorial, a key abstraction to exploit in a RESTful API is the concept of a resource, and a common pattern is to have resources which themselves contain resources of their own. Additionally, returning 404 for an ID'd resource that does not actually exist is just as much of a common pattern.
My question comes from the fact that a 404 response code can be ambiguous considering the hierarchical nature of a REST API.
For example, assume the data layer our REST API interacts with has the following data:
{
"users": {
"foo": {
"notes": {
"hello": "world"
}
}
}
}
Calls to our REST API that return 200 imply that all resources in the path exist:
GET /users/foo returns 200 because the user foo exists.
GET /users/foo/notes returns 200 for the same reason.
GET /users/foo/notes/hello returns 200 because both the user foo and a note named hello belonging to foo exist.
There are even expected 404 response codes for particular paths:
GET /users/bar returns 404. That is nonambiguous since the 404 only refers to one resource.
GET /users/bar/notes returns 404. This is just as unambiguous (assuming the API does not return 404 for nonexistent paths).
But consider that the following return 404 for different and ambiguous reasons:
GET /users/bar/notes/baz returns 404 because the user bar does not exist.
GET /users/foo/notes/baz returns 404 because the existing user foo does not have a baz note.
In short, the 404s returned do not inform the client what exactly failed to be found: the user or the note. So my question is as follows:
Is it the responsibility of the server to be nonambiguous with 404 response codes? And if so, how should it differentiate to the client the nonexistence of a user versus the nonexistence of a user's note?
Is it the responsibility of the server to be nonambiguous with 404 response codes? And if so, how should it differentiate to the client the nonexistence of a user versus the nonexistence of a user's note?
By providing "a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition" as described in RFC 7231.
In other words, put the explanatory details into the document that you include in the HTTP response.
It may help to think more carefully about how all this works with web pages.
The status code is metadata in the transfer of documents over a network domain. The intended audience for that information is the web browser (and other general purpose components - spiders, caches, and so on). It's provided so that your browser (and other general purpose components) can correctly interpret the semantics of the response.
The audience for the "representation of the error" is the human being using the web browser. That's the place where one would provide, for example, information about what specifically has gone wrong, or what corrective actions might be taken.
In modern days, it is often the case that we are expecting bespoke machine clients, rather than humans, to be looking at the "web browser". Free form text or free form text marked up with hypermedia controls aren't likely to be useful. So we probably want to use problem details - a standardized schema for reporting problems.
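A sketch of what that could look like in Python, using the data from the question. The `type` URIs under api.example.com and the helper names are hypothetical; the member names (`type`, `title`, `status`, `detail`, `instance`) and the `application/problem+json` media type come from the problem details specification.

```python
import json

# The example data layer from the question.
DATA = {"users": {"foo": {"notes": {"hello": "world"}}}}


def _not_found(path, kind, title, detail):
    """Build a 404 response whose body says what exactly was missing."""
    problem = {
        "type": f"https://api.example.com/problems/{kind}",  # hypothetical URI
        "title": title,
        "status": 404,
        "detail": detail,
        "instance": path,
    }
    return 404, "application/problem+json", json.dumps(problem)


def get_note(user_id, note_id):
    """Return (status, content_type, body) for GET /users/{user}/notes/{note}."""
    path = f"/users/{user_id}/notes/{note_id}"
    user = DATA["users"].get(user_id)
    if user is None:
        return _not_found(path, "user-not-found", "User not found",
                          f"No user named {user_id!r}.")
    if note_id not in user["notes"]:
        return _not_found(path, "note-not-found", "Note not found",
                          f"User {user_id!r} has no note named {note_id!r}.")
    return 200, "application/json", json.dumps(user["notes"][note_id])
```

Both failures still carry status 404, so general purpose components behave the same way, while a bespoke client can branch on the `type` member to tell a missing user from a missing note.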
One difficulty you may be having (not your fault; the literature sucks) is recognizing that identifiers are semantically opaque. /users/foo/notes/baz does not, generally, have any dependency on /users/foo/notes or any of the other prefixes. Nor does the identifier mean that /users/foo/notes/baz has four different parts that need to be satisfied.
Identifiers should be understood like keys into a map/dictionary - 200 means that the key exists in the map, 404 means the key doesn't exist in the map. But that doesn't actually tell you anything about the presence or absence of other keys with similar spellings!
Is your API, which conventionally organizes its resource model into a hierarchy, and chooses identifiers that are closely aligned with that hierarchy, "better" than an API that uses an unconventional resource model and arbitrary identifiers? Probably.
But good resource models and good identifier spelling conventions are not a REST constraint, and the HTTP and URI specifications also support designs that don't follow the current conventions (among other things, backwards compatibility is really important to REST and the web; REST and the web predate these spelling conventions by quite a bit).
(Analogy: we have coding conventions that describe "best practices" around ideas like variable naming and function naming because we use languages that don't restrict us to using "good" names. The machines don't care.)

REST API path using route parameters without identifiers

I'll use the Express.js term "route parameters" to describe my problem; I also see people call them path parameters. The "proper" URL would be
Route path: /users/:userId/books/:bookId
But I am currently taking over a project that designs the API like this:
/:userId/:bookId
/:groupId/:userId/some_resource
...
The obvious problem is that when I look at the url from browser I will feel confused with what those parameters mean. But the project has been running for more than a year, so I need to know whether it is worth the effort to rewrite it.
So are there other problems with URLs like these?
So are there other problems with URLs like these?
They might be making extra work for your operators when reading the access logs?
REST doesn't care about URI spelling conventions - until you get to the origin server, a URI is effectively an opaque string; only the origin server has the authority to decompose the URI into its semantic parts.
Which is to say, general purpose components don't care that there are identifiers encoded into the path, or that the semantics of those identifiers changes depending on other path elements.
In particular, they don't care at all that unrelated identifiers have common elements:
/1/2
/1/2/some_resource
As far as a general purpose component is concerned, the resources identified here have no special relationship to one another. (For example, if you DELETE /1/2, that's not expected to impact /1/2/some_resource in any way).
when I look at the url from browser I will feel confused with what those parameters mean
Yup - this is your primary argument: that the current URI design doesn't consider human affordances.
Unless you can make a case that those human focused considerations (users, operators, tech writers) offset the costs of change, you are probably stuck with it.

REST verb for state change - can we agree on POST?

How to best extend REST with FSM state changes?
No one can know if a state change is idempotent or not, so the wisest thing may be to assume it is not and, as a general rule, use POST, OK?
To me and my findings, POST makes more sense than PUT or PATCH.
POST /coffeemachines/{id}/start
or maybe more verbose?
POST /coffeemachines/{id}/state/start
Although start looks like a verb (breaking REST-practices), I think it's not:
The main verb is a poking POST, we want a state-change.
start is just the attribute-value for the requested state-change.
I guess I'm not the first man on the moon here, thankful for any references or thoughts.
You can send a partial update request with HTTP PATCH that contains only the new state.
PATCH /coffeemachines/{id}
{
"status": "active"
}
According to Wikipedia:
The PATCH method is a request method supported by the HTTP protocol for making partial changes to an existing resource. The PATCH method provides an entity containing a list of changes to be applied to the resource requested using the HTTP URI. The list of changes are supplied in the form of a PATCH document.
It is also more readable to separate words in the path with hyphens. For example:
PATCH /coffee-machines/{id}
{
"status": "active"
}
REST verb for state change - can we agree on POST?
The reference implementation of REST is the World Wide Web, which was catastrophically successful even though HTML (the dominant media type) only specified support for GET and POST.
Using POST for unsafe operations is fine.
Although start looks like a verb (breaking REST-practices)
No -- REST doesn't care about the spellings of URI. That's part of the point: the server can change the URI in links any time it likes because the clients just follow the links.
That said, there is an issue with your proposed identifiers, which you may want to consider
/coffeemachines/{id}
/coffeemachines/{id}/start
As far as REST is concerned, these are different resources. That means that your locally cached copy of /coffeemachines/{id} is not invalidated when you POST a request to /coffeemachines/{id}/start.
If you care to take advantage of the caching support that is already built into the domain agnostic components that are available, then you want the target of the POST to match the target of the GET: /coffeemachines/{id}
/coffeemachines/{id}/start, in this design, isn't the target of the POST, but is instead the identifier of the form resource that submits start messages to /coffeemachines/{id}. Likewise, /coffeemachines/{id}/stop would identify the form resource that submits stop messages.
The representation of the coffee machine would include links to these forms when the transitions are permitted; for instance, when the coffee machine is off, then the representation of the coffee machine returned by GET would include a link to the start form, but not a link to the stop form.
/coffeemachines/{id}/start and /coffeemachines/{id}/stop are different resources from /coffeemachines/{id}, and therefore might have their own caching policies.
Of course, it isn't required that the forms be separate resources -- the mechanism would also work if the forms were part of the representation of the /coffeemachines/{id} resource itself.
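To make the link-driven design above a little more concrete, here is a hedged Python sketch. The JSON layout (a `links` object, a `form` description with `method`, `target` and `fields`) is an assumption made up for illustration, not an established media type.

```python
def represent(machine_id, status):
    """Representation returned by GET /coffeemachines/{id}.

    Only the transitions that are currently permitted are advertised.
    """
    base = f"/coffeemachines/{machine_id}"
    doc = {"status": status, "links": {"self": base}}
    if status == "off":
        doc["links"]["start"] = f"{base}/start"
    elif status == "on":
        doc["links"]["stop"] = f"{base}/stop"
    return doc


def form(machine_id, message):
    """Representation of the form resource at /coffeemachines/{id}/{message}.

    The form submits to the coffee machine itself, so the POST target
    matches the GET target and cached copies can be invalidated.
    """
    return {
        "method": "POST",
        "target": f"/coffeemachines/{machine_id}",
        "fields": {"message": message},
    }
```

A client that finds a `start` link follows it, reads the form, and POSTs `{"message": "start"}` to `/coffeemachines/42`; it never needs to construct a URI on its own.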
Can I ask you to elaborate around POST vs PATCH
I found that this observation by Roy Fielding helped me:
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property
PATCH has stricter semantics than POST; that means that clients (and generic components) can make stronger assumptions about what is going on.
So in the following examples:
PATCH /foo HTTP/1.1
Content-Type: application/json-patch+json
POST /foo HTTP/1.1
Content-Type: application/json-patch+json
The server can handle these messages in exactly the same way. Clients that recognize the PATCH method will recognize that the unsafe changes on the server are supposed to be all-or-nothing ("The server MUST apply the entire set of changes atomically...") and can leverage that as they like, but with POST, that additional constraint is missing and cannot be assumed.
The PATCH spec notes:
A comparison to POST is even more difficult, because POST is used in widely varying ways and can encompass PUT and PATCH-like operations if the server chooses. If the operation does not modify the resource identified by the Request-URI in a predictable way, POST should be considered instead of PATCH or PUT.
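One way a server might honor the all-or-nothing requirement quoted above is to apply the operations to a copy and only keep the copy if every operation succeeds. The Python sketch below is deliberately simplified: it handles only the `add`, `replace`, `remove` and `test` operations of JSON Patch, and only on top-level members, so it is not a complete implementation. `PatchError` and `apply_patch` are hypothetical names.

```python
import copy


class PatchError(Exception):
    """Raised when any single operation cannot be applied."""


def apply_patch(document, operations):
    """Apply every operation or none of them.

    Works on a deep copy, so the original document is untouched
    if any operation fails part-way through.
    """
    working = copy.deepcopy(document)
    for op in operations:
        key = op["path"].lstrip("/")  # top-level members only in this sketch
        kind = op["op"]
        if kind == "add":
            working[key] = op["value"]
        elif kind == "replace":
            if key not in working:
                raise PatchError(f"cannot replace missing member {key!r}")
            working[key] = op["value"]
        elif kind == "remove":
            if key not in working:
                raise PatchError(f"cannot remove missing member {key!r}")
            del working[key]
        elif kind == "test":
            if key not in working or working[key] != op["value"]:
                raise PatchError(f"test failed for member {key!r}")
        else:
            raise PatchError(f"unsupported operation {kind!r}")
    return working
```

The caller replaces the stored document with the return value only when no `PatchError` is raised. The same function could sit behind a POST handler, but a client would then have no grounds to assume the atomic behavior.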

Triggering an action POST or PUT

I've read PUT vs. POST in REST in full and also w3 docs but still not sure of what's the right approach.
If I want to turn on the heater:
POST /house/123/
{"appliance" : "heater" , "action" : "on"}
or
PUT /house/123/
{"appliance" : "heater" , "action" : "on"}
or should I be using some other method? I think neither of them solves the question at hand since there is no object creation happening here...
EDIT:
What if it's not just turning on/off? Rather, it's a reboot. Think of it as something that needs to happen. It doesn't necessarily need to be a change of state.
/house/123/
{"action" : "reboot-heater"}
Both methods are appropriate, in various circumstances.
POST is the universal method - you can use it for anything, although there are often better choices (ex: GET, when the operation is safe). HTML only supports GET and POST, and it is the lingua franca of the web. So you can infer that POST is fine.
PUT also works; it's analogous to "save", "replace", "upsert". The issue with PUT is that the semantics are that the request payload is a replacement for the current state of the target resource.
In effect, that means that
PUT /house/123/
{"appliance" : "heater" , "action" : "on"}
should be a complete replacement for the state of /house/123. That's probably not what you want, assuming the state of the house includes descriptions of other appliances, rooms, occupants, location, and so on.
You could PATCH the house, describing the change to the heater with a patch document. But that relaxes the idempotent semantics that are an important benefit of PUT.
You could also PUT to a different target resource - but that may not give you suitable caching behavior.
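The difference between PUT on the house and PUT on a narrower target can be sketched in a few lines of Python. The flat `store` dictionary, the `/house/123/heater` resource and the `put` helper are assumptions for illustration only.

```python
def put(store, path, representation):
    """PUT semantics: the payload becomes the complete new state of the target."""
    created = path not in store
    store[path] = representation
    return 201 if created else 204


# Hypothetical resources: the house and, separately, its heater.
store = {
    "/house/123": {"address": "1 Example St", "occupants": 2},
    "/house/123/heater": {"power": "off"},
}

# Replacing only the heater resource leaves the house untouched,
# and repeating the request leaves the same state (idempotent).
put(store, "/house/123/heater", {"power": "on"})
put(store, "/house/123/heater", {"power": "on"})

# PUT on the house with the same kind of payload would discard
# everything else that was part of the house's state.
house_after_put = dict(store)
put(house_after_put, "/house/123", {"appliance": "heater", "action": "on"})
```

After the last call, `house_after_put["/house/123"]` no longer contains the address or occupants, which is the replacement problem described above. Note that a cached copy of `/house/123` is not invalidated by a PUT to `/house/123/heater`.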
It's important, if you want to get the framing right in your head, to think about the fact that the resources are part of your integration domain. Your REST api is a disguise that your server wears, pretending to be a dumb HTTP key value store.
A good read, if you have time, is RESTful Casuistry, where a bunch of folks discuss what a RESTful protocol for requesting a server shutdown should be.

Is using a verb in URL fundamentally incompatible with REST?

So let's say we have something that does not seem best represented as a resource (status of process that we want to pause, stateless calculation we want to perform on the server, etc).
If in API design we use either process/123/pause or calculations/fibonacci -- is that fundamentally incompatible with REST? So far from what I read it does not seem to be, as long as these URLs are discoverable using HATEOAS and media types are standardized.
Or should I prefer to put action in the message as answered here?
Note 1:
I do understand that it is possible to rephrase some of my examples in terms of nouns. However I feel that for specific cases nouns do not work as well as verbs do. So I am trying to understand if having those verbs would be immediately unRESTful. And if it is, then why the recommendation is so strict and what benefits I may miss by not following it in those cases.
Note 2:
Answer "REST does not have any constraints on that" would be a valid answer (which would mean that this approach is RESTful). Answers "it depends on who you ask" or "it is a best practice" do not really answer the question. The question assumes that the concept of REST exists as a well-defined common term two people can use to refer to the same set of constraints. If the assumption itself is incorrect and formal discussion of REST is meaningless, please do say so.
This article has some nice tips: http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
Quoting from the article:
What about actions that don't fit into the world of CRUD operations?
This is where things can get fuzzy. There are a number of approaches:
Restructure the action to appear like a field of a resource. This works if the action doesn't take parameters. For example an activate action could be mapped to a boolean activated field and updated via a PATCH to the resource.
Treat it like a sub-resource with RESTful principles. For example, GitHub's API lets you star a gist with PUT /gists/:id/star and unstar with DELETE /gists/:id/star.
Sometimes you really have no way to map the action to a sensible RESTful structure. For example, a multi-resource search doesn't really make sense to be applied to a specific resource's endpoint. In this case, /search would make the most sense even though it isn't a noun. This is OK - just do what's right from the perspective of the API consumer and make sure it's documented clearly to avoid confusion.
I personally like suggestion #2. If you need to pause something, what are you pausing? If it's a process with a name, then try this:
/process/{processName}/pause
It's not strictly about nouns vs. verbs; it's about whether you are:
identifying resources
manipulating resources through representations
What's a resource? Fielding defines it thusly:
The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on. In other words, any concept that might be the target of an author's hypertext reference must fit within the definition of a resource. A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time.
Now, to your question. You can't just look at a URL and say, "Is such-and-such a URL fundamentally incompatible with REST?" because URLs in a REST system aren't really the important bit. It's more important that the URLs process/123/pause and calculations/fibonacci identify resources by the above definition. If they do, there isn't a REST constraint violation. If they don't, you're violating the uniform interface constraint of REST. Your example leads me to believe it does not fit the resource definition and therefore would violate this constraint.
To illustrate what a resource might be in this system, you could change the status of a process by POSTing it to the paused-processes resource collection. Though that is perhaps an unusual way of working with processes, it's not fundamentally incompatible with the REST architecture style.
In the case of calculations, the calculations themselves might be the resource and that resource might look like this:
Request:
GET /calculations/5
Response:
{
"fibonacci": 5,
"prime-number": true,
"square-root": 2.23607
}
Though again, that's a somewhat unusual concept of a resource. I suppose a slightly more typical use might look like this:
Request:
GET /stored-calculations/12381728 (note that URL is a random identifier)
Response:
{
"number": 5,
"fibonacci": 5,
"prime-number": true,
"square-root": 2.23607
}
though presumably you'd want to store additional information about that resource other than a sheer calculation that anyone can do with a calculator...
Response:
{
"number": 5,
"fibonacci": 5,
"prime-number": true,
"square-root": 2.23607,
"last-accessed-date": "2013-10-28T00:00:00Z",
"number-of-retrievals-of-this-resource": 183
}
It's considered bad practice to use verbs in your REST API.
There's some material on SO and elsewhere on why and how to avoid using verbs. That being said, there are plenty of "REST" APIs that use verbs.
For your process API, I would make the resource Process have a state field, which can be modified with a PUT.
Suppose GET /process/$id currently returns:
{
"state": "PAUSED"
}
Then you PUT this to /process/$id:
{
"state": "RUNNING"
}
which makes the process change state.
In the case of Fibonacci, just have a resource named fibonacci, and use POST with parameters (say n for the first n fibonacci numbers) in the body, or perhaps even GET with a query in the URL.
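The GET variant could be sketched as follows in Python. The `/fibonacci` path, the `n` query parameter, and the response shape are assumptions for the example; only the standard library is used.

```python
from urllib.parse import urlparse, parse_qs


def first_fibonacci(n):
    """Return the first n Fibonacci numbers, starting from 0."""
    numbers, a, b = [], 0, 1
    for _ in range(n):
        numbers.append(a)
        a, b = b, a + b
    return numbers


def handle_get(url):
    """Return (status, body) for a request like GET /fibonacci?n=5."""
    parsed = urlparse(url)
    if parsed.path != "/fibonacci":
        return 404, None
    values = parse_qs(parsed.query).get("n", [])
    if len(values) != 1:
        return 400, None
    try:
        n = int(values[0])
    except ValueError:
        return 400, None
    if n < 0:
        return 400, None
    return 200, {"n": n, "numbers": first_fibonacci(n)}
```

Because the calculation is safe and depends only on the URL, GET lets caches reuse the result for the same `n`; the URL names the noun (the fibonacci resource) and the method supplies the verb.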
The HTTP method is the verb: GET, PUT, POST, et cetera, while the URL should always refer to the noun (recipient of the action). Think of it like this: Would two verbs in a sentence make sense? "GET calculate" is nonsense, where "GET state" is good and "GET process" is better ("state" being metadata for a process).