Is using a verb in URL fundamentally incompatible with REST?

So let's say we have something that does not seem best represented as a resource (status of process that we want to pause, stateless calculation we want to perform on the server, etc).
If in API design we use either process/123/pause or calculations/fibonacci -- is that fundamentally incompatible with REST? So far from what I read it does not seem to be, as long as these URLs are discoverable using HATEOAS and media types are standardized.
Or should I prefer to put action in the message as answered here?
Note 1:
I do understand that it is possible to rephrase some of my examples in terms of nouns. However I feel that for specific cases nouns do not work as well as verbs do. So I am trying to understand if having those verbs would be immediately unRESTful. And if it is, then why is the recommendation so strict, and what benefits might I miss by not following it in those cases?
Note 2:
Answer "REST does not have any constraints on that" would be a valid answer (which would mean that this approach is RESTful). Answers "it depends on who you ask" or "it is a best practice" are not really answering the question. The question assumes the concept of REST exists as a well-defined common term two people can use to refer to the same set of constraints. If the assumption itself is incorrect and formal discussion of REST is meaningless, please do say so.

This article has some nice tips: http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
Quoting from the article:
What about actions that don't fit into the world of CRUD operations?
This is where things can get fuzzy. There are a number of approaches:
Restructure the action to appear like a field of a resource. This works if the action doesn't take parameters. For example an activate action could be mapped to a boolean activated field and updated via a PATCH to the resource.
Treat it like a sub-resource with RESTful principles. For example, GitHub's API lets you star a gist with PUT /gists/:id/star and unstar with DELETE /gists/:id/star.
Sometimes you really have no way to map the action to a sensible RESTful structure. For example, a multi-resource search doesn't really make sense to be applied to a specific resource's endpoint. In this case, /search would make the most sense even though it isn't a noun. This is OK - just do what's right from the perspective of the API consumer and make sure it's documented clearly to avoid confusion.
I personally like suggestion #2. If you need to pause something, what are you pausing? If it's a process with a name, then try this:
/process/{processName}/pause

It's not strictly about nouns vs. verbs; it's about whether you are:
identifying resources
manipulating resources through representations
What's a resource? Fielding defines it thusly:
The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on. In other words, any concept that might be the target of an author's hypertext reference must fit within the definition of a resource. A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time.
Now, to your question. You can't just look at a URL and say, "Is such-and-such a URL fundamentally incompatible with REST?" because URLs in a REST system aren't really the important bit. It's more important that the URLs process/123/pause and calculations/fibonacci identify resources by the above definition. If they do, there isn't a REST constraint violation. If they don't, you're violating the uniform interface constraint of REST. Your example leads me to believe it does not fit the resource definition and therefore would violate this constraint.
To illustrate what a resource might be in this system, you could change the status of a process by POSTing it to the paused-processes resource collection. Though that is perhaps an unusual way of working with processes, it's not fundamentally incompatible with the REST architecture style.
In the case of calculations, the calculations themselves might be the resource and that resource might look like this:
Request:
GET /calculations/5
Response:
{
"fibonacci": 5,
"prime-number": true,
"square-root": 2.23607
}
Though again, that's a somewhat unusual concept of a resource. I suppose a slightly more typical use might look like this:
Request:
GET /stored-calculations/12381728 (note that URL is a random identifier)
Response:
{
"number": 5,
"fibonacci": 5,
"prime-number": true,
"square-root": 2.23607
}
though presumably you'd want to store additional information about that resource other than a sheer calculation that anyone can do with a calculator...
Response:
{
"number": 5,
"fibonacci": 5,
"prime-number": true,
"square-root": 2.23607,
"last-accessed-date": "2013-10-28T00:00:00Z",
"number-of-retrievals-of-this-resource": 183
}
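As a rough sketch of how a server might build the GET /calculations/5 representation shown above: the function names are made up for illustration, and it assumes that "fibonacci" means the n-th Fibonacci number with F(0) = 0 and F(1) = 1, which is consistent with the value 5 in the example.

```python
import math


def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number, taking F(0) = 0 and F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small inputs of an illustration."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def calculation_representation(n: int) -> dict:
    """Build the body that GET /calculations/{n} might return (hypothetical)."""
    return {
        "fibonacci": fibonacci(n),
        "prime-number": is_prime(n),
        "square-root": round(math.sqrt(n), 5),
    }
```

The point is that the URL names the number, and the representation is whatever the server chooses to say about it; no verb is needed in the identifier.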

It's considered bad practice to use verbs in your REST API.
There's some material on SO and elsewhere on why and how to avoid using verbs. That being said, there are plenty of "REST" APIs that use verbs.
For your process API, I would make the resource Process have a state field, which can be modified with a PUT.
Suppose GET /process/$id currently returns:
{
"state": "PAUSED"
}
Then you PUT this to /process/$id:
{
"state": "RUNNING"
}
which makes the process change state.
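A minimal sketch of what the server side of that PUT could look like. Everything here is an assumption for illustration: the in-memory `processes` store, the `put_process` name, the set of allowed states, and the choice of status codes.

```python
ALLOWED_STATES = {"RUNNING", "PAUSED"}

# In-memory stand-in for wherever the processes really live (assumption).
processes = {"123": {"state": "PAUSED"}}


def put_process(process_id: str, body: dict) -> tuple[int, dict]:
    """Handle PUT /process/{id} by replacing the stored representation."""
    if process_id not in processes:
        return 404, {"error": "no such process"}
    if body.get("state") not in ALLOWED_STATES:
        return 400, {"error": "unknown state"}
    # A real server would pause or resume the process here.
    processes[process_id] = {"state": body["state"]}
    return 200, processes[process_id]
```

Sending the same PUT twice leaves the process in the same state, which is the idempotence that PUT is expected to have.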
In the case of Fibonacci, just have a resource named fibonacci, and use POST with parameters (say n for the first n fibonacci numbers) in the body, or perhaps even GET with a query in the URL.

The HTTP method is the verb: GET, PUT, POST, et cetera, while the URL should always refer to the noun (recipient of the action). Think of it like this: Would two verbs in a sentence make sense? "GET calculate" is nonsense, whereas "GET state" is good and "GET process" is better ("state" being metadata for a process).

Related

Endpoint with two path parameters

I'm learning REST and I have a question.
Is there a scenario where the endpoint person/pathParm1/PathParam2 is legitimate?
For example:
person/ben/stiller
people/2/4
As far as I understand REST, query parameters should be used for searches:
person?firstName=ben&secondName=stiller
or
person/2/order4
REST doesn't care what spelling conventions you use for your resource identifiers.
So if you want to have a URI template with multiple variables to expand, and more than one of those variables are expanded as path segments, that's fine.
For example, you'll notice that your browser has no trouble with this resource identifier:
https://stackoverflow.com/questions/74969638/endpoint-with-two-path-parameters
which might reasonably be produced by expanding variables into a template like
https://stackoverflow.com/questions/{id}/{hint}
As far as I understand REST, query parameters should be used for searches:
That's not a REST constraint, although for the special case of the web it turned out that way. This is primarily a historical accident: we didn't have standards for URI templates when the web was young, which meant that searches came about from the standardized implementation of HTML form submissions (application/x-www-form-urlencoded key value parameters replacing the query part of the form action)
REST does say that we use resource identifiers to... identify resources; and that we all use the same general purpose resource identifiers (ie: conforming to the production rules defined in RFC 3986), but without constraints on the spelling or semantics of those identifiers.
Example: URL shorteners work.
(Note: your misunderstanding is a common one, and not at all your fault; the literature sucks. FWIW, I was once where you are; Stefan Tilkov's 2014 talk was the one that really got my own thinking straightened out.)
That said, you might find a "query parameters should be used for searches" constraint coming from somewhere else; a local style guide, for example.
this means I could also make a restful endpoint like this: api/person/{firstName}/{lastName} instead api/person?firstName=ben&lastName=stiller ?
Yes; you can use either of those spellings for your resource identifiers, and all of the general purpose components out there will still "just work" -- because they are treating the resource identifier as semantically opaque.
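To make the two spellings concrete, here is a small sketch that produces both from the same variables using only the Python standard library. The function names are invented for this example; the paths are the ones from the question.

```python
from urllib.parse import quote, urlencode


def path_style(first_name: str, last_name: str) -> str:
    """Expand the variables as path segments."""
    # safe='' makes quote() escape '/' as well, so a value stays one segment.
    return f"api/person/{quote(first_name, safe='')}/{quote(last_name, safe='')}"


def query_style(first_name: str, last_name: str) -> str:
    """Expand the variables as key=value pairs in the query part."""
    return "api/person?" + urlencode({"firstName": first_name, "lastName": last_name})


print(path_style("ben", "stiller"))   # api/person/ben/stiller
print(query_style("ben", "stiller"))  # api/person?firstName=ben&lastName=stiller
```

A general purpose client treats either result as an opaque string; only the server that minted it needs to know how to take it apart again.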

How to address ambiguous 404s when designing a RESTful API

I've come across this curious scenario while writing tests + documentation for a REST API I am developing. According to this REST tutorial, a key abstraction to exploit in a RESTful API is the concept of a resource, and a common pattern is to have resources which themselves contain resources of their own. Additionally, returning 404 for an ID'd resource that does not actually exist is just as much of a common pattern.
My question comes from the fact that a 404 response code can be ambiguous considering the hierarchical nature of a REST API.
For example, assume the data layer our REST API interacts with has the following data:
{
"users": {
"foo": {
"notes": {
"hello": "world"
}
}
}
}
Calls to our REST API that return 200 imply that all resources in the path exist:
GET /users/foo returns 200 because the user foo exists.
GET /users/foo/notes returns 200 for the same reason.
GET /users/foo/notes/hello returns 200 because both the user foo and a note named hello belonging to foo exist.
There are even expected 404 response codes for particular paths:
GET /users/bar returns 404. That is nonambiguous since the 404 only refers to one resource.
GET /users/bar/notes returns 404. This is just as unambiguous (assuming the API does not return 404 for nonexistent paths).
But consider that the following return 404 for different and ambiguous reasons:
GET /users/bar/notes/baz returns 404 because the user bar does not exist.
GET /users/foo/notes/baz returns 404 because the existing user foo does not have a baz note.
In short, the 404s returned do not inform the client what exactly failed to be found: the user or the note. So my question is as follows:
Is it the responsibility of the server to be nonambiguous with 404 response codes? And if so, how should it differentiate to the client the nonexistence of a user versus the nonexistence of a user's note?
Is it the responsibility of the server to be nonambiguous with 404 response codes? And if so, how should it differentiate to the client the nonexistence of a user versus the nonexistence of a user's note?
By providing "a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition" as described in RFC 7231.
In other words, put the explanatory details into the document that you include in the HTTP response.
It may help to think more carefully about how all this works with web pages.
The status code is metadata in the transfer of documents over a network domain. The intended audience for that information is the web browser (and other general purpose components - spiders, caches, and so on). It's provided so that your browser (and other general purpose components) can correctly interpret the semantics of the response.
The audience for the "representation of the error" is the human being using the web browser. That's the place where one would provide, for example, information about what specifically has gone wrong, or what corrective actions might be taken.
In modern days, it is often the case that we are expecting bespoke machine clients, rather than humans, to be looking at the "web browser". Free form text or free form text marked up with hypermedia controls aren't likely to be useful. So we probably want to use problem details - a standardized schema for reporting problems.
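As a hedged sketch of what that could look like for the users/notes example from the question: the status code stays 404 in both cases, and the problem details body says which lookup failed. The member names (type, title, status, detail, instance) and the application/problem+json media type come from the problem details specification; the example.com type URIs, the `get_note` function, and the in-memory `DATA` are assumptions made up for this illustration.

```python
DATA = {"users": {"foo": {"notes": {"hello": "world"}}}}

PROBLEM_HEADERS = {"Content-Type": "application/problem+json"}


def problem(slug: str, title: str, detail: str, instance: str) -> dict:
    """Build a problem details body; the type URI is a hypothetical one."""
    return {
        "type": f"https://example.com/problems/{slug}",
        "title": title,
        "status": 404,
        "detail": detail,
        "instance": instance,
    }


def get_note(user_id: str, note_id: str) -> tuple[int, dict, dict]:
    """Return (status, headers, body) for GET /users/{user_id}/notes/{note_id}."""
    path = f"/users/{user_id}/notes/{note_id}"
    user = DATA["users"].get(user_id)
    if user is None:
        body = problem("user-not-found", "User not found",
                       f"No user named '{user_id}' exists.", path)
        return 404, PROBLEM_HEADERS, body
    if note_id not in user["notes"]:
        body = problem("note-not-found", "Note not found",
                       f"User '{user_id}' has no note named '{note_id}'.", path)
        return 404, PROBLEM_HEADERS, body
    return 200, {"Content-Type": "application/json"}, {note_id: user["notes"][note_id]}
```

Generic components only ever look at the 404; a bespoke client that cares about the difference can branch on the type or title in the body.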
One difficulty you may be having (not your fault; the literature sucks) is recognizing that identifiers are semantically opaque. /users/foo/notes/baz does not, generally, have any dependency on /users/foo/notes or any of the other prefixes. Nor does the identifier mean that /users/foo/notes/baz has four different parts that need to be satisfied.
Identifiers should be understood like keys into a map/dictionary - 200 means that the key exists in the map, 404 means the key doesn't exist in the map. But that doesn't actually tell you anything about the presence or absence of other keys with similar spellings!
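The map analogy can be taken quite literally. In this toy sketch (the `resources` dict and `status_for` are invented for illustration), the server deliberately has no entry for /users/foo/notes, and that has no bearing on whether the longer identifier resolves.

```python
# Each full identifier is simply a key; prefixes carry no special meaning.
resources = {
    "/users/foo": {"name": "foo"},
    "/users/foo/notes/hello": "world",
}


def status_for(identifier: str) -> int:
    """200 if the key is in the map, 404 otherwise."""
    return 200 if identifier in resources else 404


print(status_for("/users/foo/notes/hello"))  # 200
print(status_for("/users/foo/notes"))        # 404
```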
Is your API, which conventionally organizes its resource model into a hierarchy, and chooses identifiers that are closely aligned with that hierarchy, "better" than an API that uses an unconventional resource model and arbitrary identifiers? Probably.
But good resource models and good identifier spelling conventions are not a REST constraint, and the HTTP and URI specifications also support designs that don't follow the current conventions (among other things, backwards compatibility is really important to REST and the web; REST and the web predate these spelling conventions by quite a bit).
(Analogy: we have coding conventions that describe "best practices" around ideas like variable naming and function naming because we use languages that don't restrict us to using "good" names. The machines don't care.)

Need feedback on the quality of REST URL

For getting the latest valid address (of the logged in user), how RESTful is the following URL?
GET /addresses/valid/latest
Probably
GET /addresses?valid=true&limit=1
is the best, but it should then return a list. And, I'd like to return an object rather than a list.
Any other suggestions?
Your url structure doesn't have much to do with how RESTful something is.
So let's ask instead which one is the 'best'. That's also a bit hard to say, and pretty subjective.
I would generally avoid a pattern like /addresses/valid/latest. This kinda suggests that there is a 'latest resource' in the 'valid collection', in the 'addresses collection'.
So I like your other suggestion a bit better, because it suggests that you're using an 'addresses' collection, filtering by valid items and only showing 1.
If you don't want all kinds of parameters, I would be more inclined to find a url pattern that's not literally 'addresses, but only the valid, but only the latest', but think about what the purpose is of the endpoint. Maybe something that's easier to remember like /fresh-address =)
how RESTful is the following URL?
Any identifier that satisfies the production rules described by RFC 3986 is RESTful.
General purpose components are not supposed to derive semantics from identifiers, they are opaque. Which means that the server is free to encode information into those identifiers at its own discretion.
Consider Google search: does your browser care what URI is used as the target of the search form? Does your browser care about the href provided by Google with each search result? In both cases, the browser just does what it is told, which is to say it creates an HTTP request based on the representation of application state that was provided by the server.
URI are in the same broad category as variable names in a programming language - the machines don't care so long as the spellings are consistent with some simple constraints. People care, so there are some benefits to having a locally consistent and logical scheme.
But there are contexts in which easily guessed URI are not what you want. See Mark Seemann 2013.
Since the semantic content of the URI is reserved for use by the server only, it follows that the server can choose to encode that information into path segments or the query part. Or both.
Spellings that can be described by a URI Template can be very powerful. The most familiar URI template is probably an HTML form using the GET method, which encodes key value pairs onto the query part of the URI; so you should think about whether that's a use case you want to support.
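To illustrate the form case: a GET form submission takes the form's action and appends the fields as application/x-www-form-urlencoded key value pairs, and the server reads them back out of the query part. A small sketch with the Python standard library, using the /addresses example from the question; `form_target` is a made-up helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def form_target(action: str, fields: dict) -> str:
    """Build the URI a GET form submission would request."""
    return f"{action}?{urlencode(fields)}"


uri = form_target("/addresses", {"valid": "true", "limit": 1})
print(uri)  # /addresses?valid=true&limit=1

# On the server side, the same pairs can be recovered from the query part.
print(parse_qs(urlsplit(uri).query))  # {'valid': ['true'], 'limit': ['1']}
```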

REST verb for state change - can we agree on POST?

How to best extend REST with FSM state changes?
No one can know if a state change is idempotent or not, so the wisest thing may be to assume they're not, and as a general rule use POST, ok?
To me and my findings, POST makes more sense than PUT or PATCH.
POST /coffeemachines/{id}/start
or maybe more verbose?
POST /coffeemachines/{id}/state/start
Although start looks like a verb (breaking REST-practices), I think it's not:
The main verb is a poking POST, we want a state-change.
start is just the attribute-value for the requested state-change.
I guess I'm not the first man on the moon here, thankful for any references or thoughts.
You can send a partial update request with HTTP PATCH that contains only the new state.
PATCH /coffeemachines/{id}
{
"status": "active"
}
According to Wikipedia:
The PATCH method is a request method supported by the HTTP protocol for making partial changes to an existing resource. The PATCH method provides an entity containing a list of changes to be applied to the resource requested using the HTTP URI. The list of changes are supplied in the form of a PATCH document.
It is also more readable to separate words in the path with hyphens. For example:
PATCH /coffee-machines/{id}
{
"status": "active"
}
REST verb for state change - can we agree on POST?
The reference implementation of REST is the World Wide Web, which was catastrophically successful even though HTML (the dominant media type) only specified support for GET and POST.
Using POST for unsafe operations is fine.
Although start looks like a verb (breaking REST-practices)
No -- REST doesn't care about the spellings of URI. That's part of the point: the server can change the URI in links any time it likes because the clients just follow the links.
That said, there is an issue with your proposed identifiers, which you may want to consider:
/coffeemachines/{id}
/coffeemachines/{id}/start
As far as REST is concerned, these are different resources. That means that your locally cached copy of /coffeemachines/{id} is not invalidated when you POST a request to /coffeemachines/{id}/start.
If you care to take advantage of the caching support that is already built into the domain agnostic components that are available, then you want the target of the POST to match the target of the GET: /coffeemachines/{id}
/coffeemachines/{id}/start, in this design, isn't the target of the POST, but is instead the identifier of the form resource that submits start messages to /coffeemachines/{id}. Likewise, /coffeemachines/{id}/stop would identify the form resource that submits stop messages.
The representation of the coffee machine would include links to these forms when the transitions are permitted; for instance, when the coffee machine is off, then the representation of the coffee machine returned by GET would include a link to the start form, but not a link to the stop form.
/coffeemachines/{id}/start and /coffeemachines/{id}/stop are different resources from /coffeemachines/{id}, and therefore might have their own caching policies.
Of course, it isn't required that the forms be separate resources -- the mechanism would also work if the forms were part of the representation of the /coffeemachines/{id} resource itself.
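A rough sketch of a representation that only advertises the transitions currently permitted. The "on"/"off" state names, the shape of the "links" member, and the `represent` function are assumptions for illustration; the identifiers are the ones discussed above.

```python
def represent(machine_id: str, state: str) -> dict:
    """Build the body for GET /coffeemachines/{id}, linking to permitted forms."""
    base = f"/coffeemachines/{machine_id}"
    links = {"self": base}
    if state == "off":
        links["start"] = f"{base}/start"  # form that submits start messages
    elif state == "on":
        links["stop"] = f"{base}/stop"    # form that submits stop messages
    return {"state": state, "links": links}


print(represent("42", "off"))
# {'state': 'off', 'links': {'self': '/coffeemachines/42', 'start': '/coffeemachines/42/start'}}
```

The client never constructs /start or /stop itself; it follows whichever link is present, so the server stays free to respell them.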
Can I ask you to elaborate around POST vs PATCH
I found that this observation by Roy Fielding helped me:
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property
PATCH has stricter semantics than POST; that means that clients (and generic components) can make stronger assumptions about what is going on.
So in the following examples:
PATCH /foo HTTP/1.1
Content-Type: application/json-patch+json
POST /foo HTTP/1.1
Content-Type: application/json-patch+json
The server can handle these messages in exactly the same way. Clients that recognize the PATCH method will recognize that the unsafe changes on the server are supposed to be all-or-nothing ("The server MUST apply the entire set of changes atomically...") and can leverage that as they like, but with POST, that additional constraint is missing and cannot be assumed.
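To show what "all-or-nothing" means in practice, here is a deliberately simplified sketch: it understands only the "test" and "replace" operations of JSON Patch, and only on top-level keys, so it is an illustration of atomic application rather than a conforming implementation. `apply_patch` and `PatchError` are names invented for this example.

```python
import copy


class PatchError(Exception):
    """Raised when any operation in the patch cannot be applied."""


def apply_patch(document: dict, operations: list[dict]) -> dict:
    """Apply every operation to a copy; the original is never modified."""
    working = copy.deepcopy(document)
    for op in operations:
        key = op["path"].lstrip("/")  # simplification: top-level keys only
        if op["op"] == "test":
            if working.get(key) != op["value"]:
                raise PatchError(f"test failed at {op['path']}")
        elif op["op"] == "replace":
            if key not in working:
                raise PatchError(f"nothing to replace at {op['path']}")
            working[key] = op["value"]
        else:
            raise PatchError(f"unsupported op: {op['op']}")
    return working  # caller swaps this in only when the whole patch succeeded
```

Because the work happens on a copy, a failure halfway through leaves the stored document untouched. A handler for POST could call the same function; the difference is that with PATCH the client is entitled to rely on that behaviour.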
The PATCH spec notes:
A comparison to POST is even more difficult, because POST is used in widely varying ways and can encompass PUT and PATCH-like operations if the server chooses. If the operation does not modify the resource identified by the Request-URI in a predictable way, POST should be considered instead of PATCH or PUT.

RESTful APIs when multiple actions on the same URI

So far as I know, four kind of methods are used in RESTful APIs:
GET for getting the resource.
POST for updating the resource.
PUT for creating or substituting the resource.
DELETE for deleting the resource.
Assume we have a resource named apple, and we can 'update' it in several ways. For example, pare it, slice it, or make it apple juice.
Each of these three updating actions takes different arguments, and the part their APIs have in common would be:
POST /apple HTTP/1.1
Host: www.example.com
<different combination of arguments>
In this situation, three APIs share the same URI and the same request method; the only difference between them is the arguments. I think this forces the backend to accept the union of those arguments and, to work out which action is actually requested, to inspect the combination of arguments it received. That is complicated and inelegant.
So my question is:
In this apple case, how can I work out an elegant set of RESTful APIs that the backend can handle easily?
First of all, try to avoid conflating HTTP methods with CRUD operations. I believe that's the main source of confusion in REST. HTTP methods don't translate to CRUD operations cleanly like that. I have a detailed answer here:
S3 REST API and POST method
In short.
POST is the method used for any operation that isn't standardized by HTTP, and subjects the payload to the target URI.
PUT is used to completely replace the resource at the present URI, and subjects the payload to the service itself.
PATCH is for partial updates, with a diff between the current and the desired state. (HTTP does not require PATCH to be idempotent, although a patch document can be written so that it is.)
DELETE is used to delete the resource.
GET is used to retrieve the resource.
Now, on the backend side, try to think of REST resources more like a state machine where you can use the methods to force a transition rather than an object with methods. That way you focus the implementation on the resource itself, not on the interaction with the protocol. For instance, you may change an object's attributes straightforwardly from the method's payload, and then have a method that's called to detect what transition is needed.
For instance, you may think of an apple as having four states: whole, pared, sliced and juiced. You transition between states by using the standardized behavior of the methods.
For instance:
GET /apple
{"state": "whole",
"self": "/apple"}
Then you want to slice it. You may do something like:
PUT /apple
{"state": "sliced"}
Or you may do something like:
PATCH /apple
{"from_state": "whole", "to_state": "sliced"}
Or even something like:
POST /apple
{"transition": "slice"}
The idea is that the implementations can be generic enough that you don't have to worry too much about coupling the resource to the HTTP methods.
The PUT version is idempotent, so your clients can choose to use it when they need idempotence.
The PATCH version guarantees the client knows the current state and is trying a valid transition.
The POST version is the most flexible, you can do anything you want, but it needs to be documented in detail. You can't simply assume your clients will know how the method works.
As long as your implementation of the resource understands that when apple.state is changed to something else it should detect what change occurred and perform the adequate transition, you are completely decoupled from the protocol. It doesn't matter what method was used.
I believe this is the most elegant solution, and makes everything easier to handle from the backend side. You can implement your objects without worrying too much about the protocol. As long as the objects can be transitioned between states, they can be used by any protocol that can effect those transitions.
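A minimal sketch of that decoupling, under stated assumptions: the transition table, the `Apple` class, and the three handler functions are all invented here, and which transitions are legal is a guess made for the sake of the example. The three handlers accept the PUT, PATCH and POST payloads shown above and all end up in the same `move_to` call.

```python
# (current state, desired state) -> action to perform; an assumed table.
TRANSITIONS = {
    ("whole", "pared"): "pare",
    ("whole", "sliced"): "slice",
    ("pared", "sliced"): "slice",
    ("whole", "juiced"): "juice",
    ("pared", "juiced"): "juice",
    ("sliced", "juiced"): "juice",
}


class Apple:
    def __init__(self, state: str = "whole"):
        self.state = state
        self.log = []  # actions actually performed, for illustration

    def move_to(self, new_state: str) -> None:
        """Detect which transition is needed and perform it."""
        if new_state == self.state:
            return  # already there, so a repeated request changes nothing
        action = TRANSITIONS.get((self.state, new_state))
        if action is None:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.log.append(action)
        self.state = new_state


def handle_put(apple: Apple, body: dict) -> None:
    apple.move_to(body["state"])


def handle_patch(apple: Apple, body: dict) -> None:
    if body["from_state"] != apple.state:
        raise ValueError("client's view of the apple is stale")
    apple.move_to(body["to_state"])


def handle_post(apple: Apple, body: dict) -> None:
    targets = {"pare": "pared", "slice": "sliced", "juice": "juiced"}
    apple.move_to(targets[body["transition"]])
```

The resource logic lives entirely in `Apple.move_to`; the handlers only translate each payload shape into a desired state.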
My RESTful HTTP API is rather different from yours. I have:
GET for getting a resource.
POST for appending a new resource to a collection.
PUT for substituting a resource (including truncating collections).
DELETE for deleting a resource.
PATCH for updating a resource.
LINK for indicating a relationship between two resources.
UNLINK for removing a relationship between two resources.
A ‘leaf’ resource can be thought of as a collection too.
For example, say you have /fruits and you POST an apple to that collection resource, that returns
201 Created
Location: /fruits/apple
In the same way, you can treat /fruits/apple as a collection of its properties, so:
GET /fruits/apple
->
colour=red&diameter=47mm
GET /fruits/apple/colour
->
red
GET /fruits/apple/diameter
->
47mm
and therefore:
PUT /fruits/apple/slices
"12"
->
201 Created
GET /fruits/apple
->
colour=red&diameter=47mm&slices=12
So in summary, I would recommend representing your actions as nouns, and locate those nouns as sub-resources of the resource you want to apply the action to.
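A toy sketch of the "leaf resource as a collection of its properties" idea, using an in-memory dict. The `fruits` store and both function names are assumptions for illustration, as is the choice of 200 for replacing an existing property.

```python
from urllib.parse import urlencode

# In-memory stand-in for the /fruits collection (assumption).
fruits = {"apple": {"colour": "red", "diameter": "47mm"}}


def put_property(fruit: str, name: str, value: str) -> int:
    """Handle PUT /fruits/{fruit}/{name}; 201 if the property is new."""
    properties = fruits[fruit]
    status = 200 if name in properties else 201
    properties[name] = value
    return status


def get_fruit(fruit: str) -> str:
    """Handle GET /fruits/{fruit}, returning the properties form-encoded."""
    return urlencode(fruits[fruit])


print(put_property("apple", "slices", "12"))  # 201
print(get_fruit("apple"))  # colour=red&diameter=47mm&slices=12
```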
Think in terms of resources. Here Apple is a resource.
To add one or more apples to the list "/apples", use POST. REST style allows posting an array.
POST /apples HTTP/1.1
Host: www.example.com
Now suppose you have an apple with ID 123. You can get details using method GET on "/apples/123".
GET /apples/123 HTTP/1.1
Host: www.example.com
To make any change to apple 123, just PUT to it directly.
PUT /apples/123 HTTP/1.1
Host: www.example.com
Pare it, slice it, or make it apple juice: all of these basically change some attributes of apple 123. As you (rightly) said, PUT a different combination of attributes.
I think this is up to the implementor to decide, but I see two approaches. Strictly from a single responsibility perspective it may make sense to provide separate services for these distinct operations.
However, if you insist on a single service for all of them, I guess you can pass an object with an action type qualifier to make it easy to delegate the request to different code in the service. The single object can then have other optional parameters to support the data needs of each operation.
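For the single-service variant, a sketch of dispatching on an action qualifier might look like the following. The "action" field name, the per-action parameters (depth_mm, pieces), and all function names are hypothetical.

```python
def pare(apple: dict, depth_mm: float = 1.0) -> dict:
    return {**apple, "state": "pared", "depth_mm": depth_mm}


def slice_(apple: dict, pieces: int = 8) -> dict:
    return {**apple, "state": "sliced", "pieces": pieces}


def juice(apple: dict) -> dict:
    return {**apple, "state": "juiced"}


# The action qualifier selects the code path; no guessing from argument sets.
HANDLERS = {"pare": pare, "slice": slice_, "juice": juice}


def handle_post(apple: dict, body: dict) -> dict:
    """Handle POST /apple by delegating on body['action']."""
    params = dict(body)
    action = params.pop("action")
    handler = HANDLERS.get(action)
    if handler is None:
        raise ValueError(f"unknown action: {action}")
    # Remaining members are the optional parameters for that action.
    return handler(apple, **params)


print(handle_post({"state": "whole"}, {"action": "slice", "pieces": 12}))
# {'state': 'sliced', 'pieces': 12}
```

Each handler declares only the parameters it needs, so the backend does not have to accept the union of all arguments.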