REST - get a random number: GET or POST?

How should a random number generator properly be implemented in REST?
GET RANDOM/
or..
POST RANDOM/
The server returns a different random number each time.
I can see arguments for both ways.

I'd say this is the same as a page that returns the current time, and many of those are served using GET. Abstractly, fetching a random number (or the time) doesn't change the server's state: both can be described as an observation of an external event. E.g. http://random.org uses atmospheric noise.
GET seems most appropriate, although caching will need to be disabled via appropriate headers, e.g.
Expires: <Current Time>
Last-Modified: <Current Time>
Cache-Control: no-cache, must-revalidate
Pragma: no-cache
If you want to ensure that the served content is already expired:
To mark a response as "already
expired," an origin server sends an
Expires date that is equal to the Date
header value. (See the rules for
expiration calculations in section
13.2.4.)
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
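As a rough illustration of the headers above, here is a minimal Python sketch that builds such a response. The function name random_number_response, the upper_bound parameter, and the plain-text body are assumptions made for the example, not part of any particular framework.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import secrets


def random_number_response(upper_bound=100):
    """Build (status, headers, body) for a hypothetical GET /random endpoint."""
    # HTTP dates are expressed in GMT; usegmt=True emits the "... GMT" form.
    now = format_datetime(datetime.now(timezone.utc), usegmt=True)
    headers = {
        "Date": now,
        # Expires equal to Date marks the response as already expired.
        "Expires": now,
        "Last-Modified": now,
        "Cache-Control": "no-cache, must-revalidate",
        "Pragma": "no-cache",  # only meaningful to HTTP/1.0 caches
        "Content-Type": "text/plain",
    }
    body = str(secrets.randbelow(upper_bound))
    return 200, headers, body
```

Whatever server framework is used would then copy these headers onto the real response.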

Definitely GET. Even though it might modify server-side state (if it uses a pseudo-RNG), that's just an implementation detail the client shouldn't care about.

By the definition of a REST call with GET, the result has to be the same each time -> not GET.
By the definition of a REST call with PUT, the call can be repeated and the server should have no problem with it -> use PUT.
POST is the weakest method and can be used if the others are not suitable.
Why not GET: the result of a GET call can be cached (cache headers, ETag, or transparent proxies), so you will not get random results ...

Related

REST API - what to return when query for a GET does not find a result

I'm calling a backend REST endpoint that takes in a query param and searches for a matching result /people?name=joe, and I'm wondering what status code and return data I should be returning when no object is found in the DB matching name=joe.
Things I've considered:
If I was directly hitting an endpoint /people/joe and it was not found, then I would definitely return 404.
If I was hitting an endpoint that returned a list of results for a query like if /people?name=joe was supposed to return ALL people named joe, then I would just return 200 with an empty list as the body. But in my case, I can only have one object for each name, so I'm not returning a list, so this doesn't apply here.
So this is a different case where I'm hitting an endpoint and passing in a query param to "search" for some data. And it is expected that in many cases, the data won't exist yet.
This seems pretty similar to the first bullet point above, but I don't like returning a 404 here since this is not necessarily an error.
Should I return a 200 but with an empty object {} as the body, and then my frontend should check if body == {} then take that to mean no data found?
Or should I still return a 404 here? Again, this is not really an error in my case which is why I don't want to use a 404, but if that makes most sense, then I could.
Easy parts first - status codes are metadata of the transfer-of-documents-over-a-network domain (Webber, 2011). In the context of a GET request (which asks for the current selected representation of a resource), a 200 response indicates that the response content is a representation of the resource (as opposed, for example, to being a representation of an error).
Furthermore, URI are opaque: general purpose HTTP components do not make assumptions about the semantics of resources based on the spelling of their identifiers. In other words, the "rules" are exactly the same for both
/people/joe
/people?joe
/people?name=joe
...
So at the HTTP level, the answers to your question are easy: if there's a current representation, then you reply to GET requests with a 200 status and copy the current representation into the response content.
The hard parts are deciding when there is a current representation, and what it looks like. REST and HTTP don't have anything to say about that, really. It's a resource design concern.
For example, this interaction "follows all the rules":
GET /people?name=dave HTTP/1.1
HTTP/1.1 200 OK
Content-Location: /people?name=dave
Content-Type: text/plain
Dave's not here, man
HTTP is a general purpose mechanism for asking for documents/transmitting documents over a network, but it is agnostic about what documents look like and what keys we use to identify documents in the store.
If you are dealing with representations that describe zero or exactly one thing, it can still be reasonable to use a list which is either empty or contains exactly one element (if you are familiar with Option/Optional/Maybe: same idea, we're presenting something with the semantics of an iterable collection).
HTTP/1.1 200 OK
Content-Location: /people?name=dave
Content-Type: application/json
[]
HTTP/1.1 200 OK
Content-Location: /people?name=bob
Content-Type: application/json
[{
...
}]
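A minimal Python sketch of that "empty list or one-element list" idea might look like this. The function name search_people_response and the in-memory people_by_name dictionary are assumptions standing in for the real handler and database.

```python
import json
from urllib.parse import urlencode


def search_people_response(name, people_by_name):
    """Return (status, headers, body) for a hypothetical GET /people?name=<name>."""
    person = people_by_name.get(name)
    # Zero or one match: an empty list or a single-element list, always 200.
    matches = [] if person is None else [person]
    headers = {
        "Content-Type": "application/json",
        "Content-Location": "/people?" + urlencode({"name": name}),
    }
    return 200, headers, json.dumps(matches)
```

The client then iterates over the list instead of special-casing an empty object.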
I agree that 200 and empty collection is better than 404 in your scenario. I don't like the idea of looking for {}, it is not explicit enough.
Possible ways of doing this:
200 ok
{
items:[]
}
200 ok
{
size:0//,
//items:undefined
}
200 ok
[]
206 Partial Content
Accept-Ranges: items
Content-Range: items 0-0/0
// []

Best REST practice for response status of a "GET" method?

I didn't find useful information about which status code is correct for an object that is absent from the database.
For example, I have deleted the user with id = 1, but someone tries to get its information through a GET request with the query param id=1.
Which status is right: 404, 204, 400, 406 or 410?
I didn't find useful information about which status code is correct for an object that is absent from the database.
Yup, that's right - HTTP status codes don't tell you anything about rows in a database, what they tell you about are documents ( "resources" ) in a document store.
More precisely, the HTTP status code is metadata that tells general purpose components (like a web browser, or a cache) what's in the message-body of the response.
Depending on what document you put into the message-body, the appropriate status code could be any of:
200
404
410
200 announces that the message-body is a document (more broadly, a current representation of the resource). 404 and 410 (and all 4xx and 5xx status codes) announce that the message-body is a representation of the explanation of the error.
404 indicates that the document identified by the effective target uri of the request doesn't exist right now, but it might exist later; you can attach caching metadata to communicate when the client might check again.
410 indicates that the document identified by the effective target uri of the request doesn't exist right now, and that condition is likely permanent. That permanence implies that clients can delete bookmarks, and remote links to the document should be removed, and so on.
If you recycle ids, or if deleted ids can be restored, then 410 isn't an appropriate choice.
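The choice between those three codes can be sketched in Python as below. The RecordState enum and the status_for function are hypothetical names for this illustration; how an application decides which state applies is up to its own resource design.

```python
from enum import Enum


class RecordState(Enum):
    PRESENT = "present"              # a current representation exists
    MISSING = "missing"              # not there now, but might exist later
    GONE_FOR_GOOD = "gone_for_good"  # deleted; the id is never reused or restored


def status_for(state):
    """Map a lookup outcome to the status code discussed above."""
    return {
        RecordState.PRESENT: 200,
        RecordState.MISSING: 404,
        RecordState.GONE_FOR_GOOD: 410,
    }[state]
```

If ids can be recycled or deletes undone, the application would never report GONE_FOR_GOOD and would fall back to 404.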
In some APIs, resources have current representations even when there is no matching information in the database.
In other words, the current representation of the resource might be an empty document
200 OK HTTP/2
Content-Type: text/plain
Content-Length: 0
or it could be a null object
200 OK HTTP/2
Content-Type: application/json
Content-Length: 4
null
or it could be an empty list
200 OK HTTP/2
Content-Type: application/json
Content-Length: 2
[]
or an empty object
200 OK HTTP/2
Content-Type: application/json
Content-Length: 2
{}
or a meme
200 OK HTTP/2
Content-Type: text/plain
Content-Length: 36
This space intentionally left blank.
The status code to use follows from the decision to use a sort of "default" representation of our document when there is no specific information available.
The more common decision, of course, is to choose not to provide default representations, but instead announce that the client has made a mistake (in which case the 4xx class of status code is the correct starting point).
Isn't it right to return a 204 (NO CONTENT) status or something similar? 'Cause I think 200 is not a fully informative status
Maybe - there's some ambiguity in the HTTP standard, and because of that ambiguity I tend to be biased against 204 (today; if you look up some of my older answers, I was much more likely to try 204 in the past).
RFC 7231, Section 6.3.1
Aside from responses to CONNECT, a 200 response always has a payload, though an origin server MAY generate a payload body of zero length. If no payload is desired, an origin server ought to send 204 (No Content) instead.
So we have two different ways to send zero bytes of data back to the client; either 200 with Content-Length set to zero, or 204.
Are those two things the same?
The answer seems to be "not quite"; there's a subtle difference documented in section 6.3.5
The 204 response allows a server to indicate that the action has been successfully applied to the target resource, while implying that the user agent does not need to traverse away from its current "document view" (if any).
Now, think about that in the context of a web browser. If I click a link that points to an empty file, a 200 response would mean that the browser would traverse away from the current "document view" to show me the empty file. But the language of 204 suggests that instead the browser should stay put, and just indicate that the empty file was successfully downloaded.
Note: I haven't done any experiments to see if browsers do act that way; my only claim is that staying in place is the specified behavior.
My reading of the specification is that 204 is designed to support a use case that only arises in the context of unsafe actions, like PUT. You can see hints of that as far back as HTTP/1.0
This response is primarily intended to allow input for scripts or other actions to take place without causing a change to the user agent's active document view. The response may include new metainformation in the form of entity headers, which should apply to the document currently in the user agent's active view.
In short, responding with a 204 to a GET request is placing a bet that the authors of general purpose components have interpreted an ambiguous part of the specification the same way that you do -- and I don't like that bet at all. Much more reliable to use the well understood 200 response, and avoid the unnecessary ambiguity.

ReST low latency - how should I reply to a GET while an upload is pending?

I am designing a ReST API which follows the basic CRUD pattern.
My API can receive a request to update a resource which may take a short time to process. Ideally I would like to inform clients that a new version is about to be available and that there is some uncertainty over when the version I have cached actually expires.
So the process I intend to use is something like this (improvements welcome):
client: GET /some/item
myapi: 200 OK
last-modified: time-stamp-of-v1
etag: some-hash-relating-to-v1-of-my-item-in-this-format
content: json or whatever
data/for/some/item/v1...
client: PUT /some/item
if-match: some-hash-relating-to-v1-of-my-item-in-this-format
content: json or whatever
data/for/some/item/v2...
myapi: 202 ACCEPTED,
content: json or whatever
time-accepted: time-stamp-after-v1-but-before-v2
your item will be at /some/item
here is a URI /some/taskid to track progress
while upload is pending:
client: GET /some/item
myapi: 200 OK
some/item ...
last-modified: time-stamp-of-v1
etag: some-hash-relating-to-v1-of-my-item-in-this-format
>>>> expires: time-stamp-after-v1-but-before-v2 <<<
>>>> warning: 110 Response is stale <<<<
content: json or whatever
data/for/some/item/v1...
client: GET /some/task/id
myapi: 200 OK
content: json or whatever
time-accepted: time-stamp-after-v1-but-before-v2
your item will be at /some/item
status/of/upload/v2...
after task completed:
client: GET /some/item
myapi: 200 OK
some/item/v2 ...
last-modified: time-stamp-of-v2
etag: some-hash-relating-to-v2-of-my-item-in-this-format
content: json or whatever
data/for/some/item/v2...
client: GET /some/task/id
myapi: 303 SEE OTHER
look-here: /some/item
If you are a proxy and know your content is stale you can put "warning: 110 - response is stale" in the header.
However, in this case the data is not actually invalid yet.
I would like to say that I can guarantee it is valid up until the time I received and passed on the upload request (time-stamp-after-v1-but-before-v2 or later as if I am in contact with the upload server). It hasn't really expired at the time I receive the upload request; I just expect it's going to.
(In fact if the request fails it might not be updated at all).
Now the default choice is just to serve the old content and let the client catch up on its own. This has high latency. If possible, I would like to do better.
For example, if the client knows the document is about to expire it could poll more often or it could try to upgrade the connection to a web-socket and get sent an update the moment I get it (would that still count as ReST?)
There is another case where using expired data must be avoided at all costs. For that scenario I think I want to tell the client that the resource is temporarily unavailable. Using the warning and expires fields as I have above seems correct there. Though it might be better to send a 503 with a suitable retry-after header.
So the question is: how should I reply to a GET while the upload of a new version is pending?
In anticipation of answers along the lines of "use a messaging framework like AMQP or zeroMQ instead for low latency", I should point out this API is acting as an AMQP gateway/proxy for clients unwilling to use AMQP directly. Information on using webhooks or websockets would still be interesting.
Some related useful content is:
How to proper design a restful API to invalidate a cache?
https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
HTTP status code for temporarily unavailable pages
http://www.albertoleal.me/posts/how-to-prevent-race-conditions-in-restful-apis.html
(the etag prevents races from simultaneous uploads)
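As a sketch of how the ETag check on the PUT guards against simultaneous uploads, here is a minimal Python version. The helper names make_etag and precondition_status are made up for this example, the ETag is assumed to be a SHA-256 hash of the stored bytes, and If-Match is assumed to carry a single tag (the real header may carry a list).

```python
import hashlib
from typing import Optional


def make_etag(representation: bytes) -> str:
    # Strong ETag derived from the content, quoted per entity-tag syntax.
    return '"' + hashlib.sha256(representation).hexdigest() + '"'


def precondition_status(current: bytes, if_match: Optional[str]) -> int:
    """Decide how to answer a conditional PUT against the stored representation."""
    if if_match is None:
        return 428  # Precondition Required: refuse unconditional updates
    if if_match != "*" and if_match != make_etag(current):
        return 412  # Precondition Failed: someone else updated the item first
    return 202  # Accepted: the update can be queued for processing
```

A client that loses the race gets 412, fetches the item again, and retries with the new ETag.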
TL;DR
While upload is pending send:
client: GET /some/item
myapi: 200 OK
some/item ...
last-modified: time-stamp-of-v1
etag: some-hash-relating-to-v1-of-my-item-in-this-format
expires: time-stamp-after-v1-but-before-v2
cache-control: stale-while-revalidate=100
warning: 110 Response is stale
content: json or whatever
data/for/some/item/v1...
At first sight it looks like using Warning is not correct. See https://www.rfc-editor.org/rfc/rfc7234#section-5.5.0
In this case the server is acting as a proxy (though not an HTTP proxy).
It is not disconnected from AMQP and "A proxy MUST NOT send stale responses" unless it is disconnected.
This is annoying as it looked like the right thing to do here.
4.2.4. Serving Stale Responses
A "stale" response is one that either has explicit expiry
information or is allowed to have heuristic expiry calculated, but
is not fresh according to the calculations in Section 4.2.
A cache MUST NOT generate a stale response if it is prohibited by
an explicit in-protocol directive (e.g., by a "no-store" or
"no-cache" cache directive, a "must-revalidate"
cache-response-directive, or an applicable "s-maxage" or
"proxy-revalidate" cache-response-directive; see Section 5.2.2).
**> A cache MUST NOT send stale responses unless it is disconnected
(i.e., it cannot contact the origin server or otherwise find a
forward path) or doing so is explicitly allowed (e.g., by the
max-stale request directive; see Section 5.2.1).**
A cache SHOULD generate a Warning header field with the 110
warn-code (see Section 5.5.1) in stale responses. Likewise, a
cache SHOULD generate a 112 warn-code (see Section 5.5.3) in stale
responses if the cache is disconnected.
A cache SHOULD NOT generate a new Warning header field when
forwarding a response that does not have an Age header field, even if
the response is already stale. A cache need not validate a response
that merely became stale in transit.
Also
4.4. Invalidation
Because unsafe request methods (Section 4.2.1 of [RFC7231]) such as
PUT, POST or DELETE have the potential for changing state on the
origin server, intervening caches can use them to keep their contents
up to date.
**> A cache MUST invalidate the effective Request URI (Section 5.5 of
[RFC7230]) as well as the URI(s) in the Location and Content-Location
response header fields (if present) when a non-error status code is
received in response to an unsafe request method.**
However a warning is required if stale-while-revalidate is used (see https://www.rfc-editor.org/rfc/rfc5861)
The stale-while-revalidate Cache-Control Extension
When present in an HTTP response, the stale-while-revalidate Cache-
Control extension indicates that caches MAY serve the response in
which it appears after it becomes stale, up to the indicated number
of seconds.
stale-while-revalidate = "stale-while-revalidate" "=" delta-seconds
If a cached response is served stale due to the presence of this
extension, the cache SHOULD attempt to revalidate it while still
serving stale responses (i.e., without blocking).
I thought this was unclear, so I submitted an erratum. It was rejected (though at the time of writing it's still showing as reported) on the grounds that the cache control extensions in rfc5861 override the MUST NOT in rfc7234 ("doing so is explicitly allowed", see above).
It is okay to use expires, but it's not very helpful as it doesn't imply anything.
5.3. Expires
The "Expires" header field gives the date/time after which the
response is considered stale. See Section 4.2 for further discussion
of the freshness model.
**> The presence of an Expires field does not imply that the original
resource will change or cease to exist at, before, or after that
time.**

REST service for stateless computation

I need to create a method in my REST API that will be used to perform some computation. For sake of simplicity, assume that I need to implement a method that for a given list of objects will return its length.
It should only compute the length and return to the client, so no resource will be modified server side. As it does not modify any resources, one would expect that it should be a GET request. However, as the list may be large and the objects may be complex, it looks like I need to make it as a POST request. This would however violate the standard that POST is used in REST to create resources.
What would be your solution to this problem?
First solution
According to RESTful Web Services Cookbook you can treat your computation as a resource and just GET it. For example:
-- Request
GET /computations?param1=2&param2=2 HTTP/1.1
-- Response
HTTP/1.1 200 OK
Content-Type: application/json
{
"result": 4
}
This works well for fast computations with a small number of input parameters. If your computation isn't like that, you can use the second solution.
Second solution
Treat both computation and result as resources. POST computation and GET result. For example:
First you create a computation
-- Request
POST /computations HTTP/1.1
Content-Type: application/json
{
"param1": 2,
"param2": 2
}
-- Response
HTTP/1.1 201 Created
Location: /computations/1
Then you can get this computation
-- Request
GET /computations/1 HTTP/1.1
-- Response
HTTP/1.1 200 OK
Content-Type: application/json
{
"param1": 2,
"param2": 2
}
You can also get the result of this computation (/computations/1/result)
-- Request
GET /computations/1/result HTTP/1.1
-- Response
HTTP/1.1 204 No Content
Cache-Control: max-age=3600,must-revalidate
But oh no! There is no result yet. The server tells us to come back in an hour (Cache-Control: max-age=3600,must-revalidate) and try again. The second solution lets you make the computation asynchronous (when it takes a long time), or you can compute it once, store the result in a DB, and serve it quickly the next time it is requested.
-- Request
GET /computations/1/result HTTP/1.1
-- Response
HTTP/1.1 200 OK
Content-Type: application/json
{
"result": 4
}
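The second solution can be sketched with a small in-memory Python model. ComputationStore and its method names are hypothetical, the run method stands in for whatever background worker does the real work, and adding the two parameters is only an assumption chosen to match the "result": 4 in the example above.

```python
import itertools


class ComputationStore:
    """In-memory stand-in for the /computations resources (hypothetical)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._params = {}
        self._results = {}

    def post_computation(self, params):
        # POST /computations: create the computation resource.
        comp_id = next(self._ids)
        self._params[comp_id] = params
        return 201, {"Location": f"/computations/{comp_id}"}, None

    def get_computation(self, comp_id):
        # GET /computations/<id>: return the stored parameters.
        if comp_id not in self._params:
            return 404, {}, None
        return 200, {"Content-Type": "application/json"}, self._params[comp_id]

    def get_result(self, comp_id):
        # GET /computations/<id>/result: 204 until the result is ready.
        if comp_id not in self._params:
            return 404, {}, None
        if comp_id not in self._results:
            return 204, {"Cache-Control": "max-age=3600,must-revalidate"}, None
        return 200, {"Content-Type": "application/json"}, {"result": self._results[comp_id]}

    def run(self, comp_id):
        # Stand-in for the asynchronous worker; the operation is an assumption.
        params = self._params[comp_id]
        self._results[comp_id] = params["param1"] + params["param2"]
```

Calling post_computation, then get_result before and after run, reproduces the 201, 204, and 200 exchange shown above.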
Pragmatic answer: use POST.
Weasely answer: use POST. Your request contains a resource (or set of resources) that you want the server to temporarily create (the list of objects). If the server happens to delete that resource immediately after the POST has been successfully dealt with, what of it?
The moral of the story here is that with REST you always assume a resource, even if in reality one is not needed and likely not created. In the examples above, what if I want to store neither the computation resource (a JSON object that consists of parameters and some sort of operation) nor the computation result resource? I cannot. I can either
Treat this as a fake resource with GET (pretend that it exists) or
Create two resources with a single POST - one for the computation and one for its result - and then use the corresponding GET to retrieve either of those.
A sneaky assumption here also is that there is NO POST for /computations/<computation_id>/result because results are only created via computations.

HTTP Status 202 - how to provide information about async request completion?

What is the appropriate way of giving an estimate for request completion when the server returns a 202 - Accepted status code for asynchronous requests?
From the HTTP spec (italics added by me):
202 Accepted
The request has been accepted for processing, but the processing has not been completed. [...]
The entity returned with this response SHOULD include an indication of the request's current status and either a pointer to a status monitor or some estimate of when the user can expect the request to be fulfilled.
Here are some of my thoughts:
I have glanced at the max-age directive, but using it would be abusing Cache-Control?
Return the expected wait time in the response body?
Add an application-specific X- response header, but X- headers were deprecated in RFC 6648?
Add a (non X-) specific response header? If so, how should it be named? The SO question Custom HTTP headers : naming conventions gave some ideas, but after the deprecation it only answers how HTTP headers are formatted, not how they should be named.
Other suggestions?
Definitely do not abuse existing HTTP headers for this. Since it's your own server, you get to define what the response looks like. You can (and should) pick whatever response works best for the intended recipient of this information and return the actual information in the response body.
For example, if you are only interested in displaying a human-readable message then you could return text/plain saying "Your request is likely to be processed in the next 30 minutes.".
At the other end of the spectrum, you might want to go all the REST way and return application/json, perhaps formatted like this (I totally made this up on the spot):
{
  "status": "pending",
  "completion": {
    "estimate": "Thu Sep 08 2011 12:00:00 GMT-0400",
    "rejected-after": "Fri Sep 09 2011 12:00:00 GMT-0400"
  },
  "tracking": {
    "url": "http://server/status?id=myUniqueId"
  }
}
You can use the Location header to specify the URL of the status monitor. Things like current status and estimate can either go in custom headers (which no one but your own software would use), or in the response body (which a web browser would display to a user, at least).
Although not explicitly mentioned specifically for the 202 - Accepted response code, the Retry-After header seems to be a suitable option. From the documentation:
The Retry-After response-header field can be used [...] to indicate how long the service is expected to be unavailable to the requesting client.
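Putting the suggestions from these answers together, a 202 response could be assembled roughly as in the Python sketch below. The accepted_response function, the /status/<task_id> URL layout, and the JSON field names are all assumptions for illustration; Retry-After is given here as a number of seconds.

```python
import json


def accepted_response(task_id, retry_after_seconds):
    """Build (status, headers, body) for a hypothetical asynchronous request."""
    status_url = f"/status/{task_id}"
    headers = {
        "Location": status_url,                   # pointer to the status monitor
        "Retry-After": str(retry_after_seconds),  # hint for when to check back
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "status": "pending",
        "tracking": {"url": status_url},
    })
    return 202, headers, body
```

A client can poll the Location URL after the suggested delay, while the body keeps the same information readable for anything that ignores the headers.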