Idempotentency of GET verb in an RESTful API - rest

As it was mentioned here https://restfulapi.net/http-methods/ (and in other places as well):
GET APIs should be idempotent, which means that making multiple
identical requests must produce same result everytime until another
API (POST or PUT) has changed the state of resource on server.
How to make this true in an API that return time for example? or that return data that is affected by time.
In other words, each time I use GET http://ip:port/get-time-now/, it is going to return a different response. However, I did not send any POST or PUT between two sequenced GET's
Does this make the previous statement wrong? Did I misunderstand something?

Idempotency is a promise to clients/intermediaries that the request can be reissued in case of network failures or the like without any further considerations and not so much that the data will never change.
If you take a POST request for example, in case of a network failure you do not know if the previous request reached the server but the response got lost midway or if the initial request didn't even reach the server at all. If you re-issue the request you might create a further resource actually, hence POST is not idempotent. PUT on the other side has the contract that it replaces the current representation with the one contained in the request. If you send the same request twice the content of the resource should be the same after any of the two PUT requests was processed. Note that the actual result can still differ as the service is free to modify the received entity to a corresponding representation. Also, between sending the data via PUT and retrieving it via GET a further client could have updated the state in between, so there is no guarantee that you will actually receive the exact representation you've sent to the service.
Safetiness is an other promise that only GET, HEAD and OPTIONS supports. It promises the invoker that it wont modify any state at all hence clients/intermediaries are safe on issuing such request without having to fear that it will modify any state. In practice this is an important promise to crawlers which blindly invoke any URLs in order to learn their content. In case of violating such promises, i.e. by deleting data while processing a GET request the only one to blame is the service implementor but not the invoker. If a crawler invokes such URLs and hence removes some data it is not the crawlers fault actually but only the service implementor.
As you have a dynamic value in your response, you might want to prevent caching of responses though as otherwise intermediaries might return an old state for your resource

The main basic concept of idempotent and safe methods of HTTP:-
Idempotent Method:- The method can called multiple times with same input and it produce same result.
Safe Method:- The method can called multiple times with same input and it doesn't modify the resource onto the server side.
Http methods are categorized into following 3 groups-
GET,HEAD,OPTIONS are safe and idempotent
PUT,DELETE are not safe but idempotent
POST,PATCH are neither safe & nor idempotent

Related

How to handle PUT endpoint with immutable resource

A microservice I develop exposes a single endpoint, PUT /deployments/{uuid} . The endpoint is used to initiate a potentially expensive deployment operation, so we only ever want it to happen once, which is why we chose PUT + UUID over POST (for uniqueness). The deployment is immutable, so it can never be updated, so we currently raise an exception if the PUT is called more than once with the same uuid.
As a person who loves bikeshedding and therefore cares deeply about restfulness, this grinds my gears. PUT is supposed to be idempotent, so raising an exception after making the same request multiple times is an antipattern. However, we have a requirement to not allow sequential identical requests to generate new deployments, so the usual POST is out.
While the best solution is one that works, I'd like ours to be a little more elegant, if possible. I've posited a POST with the UUID in the payload, but my team seems to think that's worse than the current solution. I'm considering just returning a 200 OK from a PUT to the same UUID rather than a 201 CREATED, but I'm not sure if that has the same problem as non-idempotent-put in not semantically conveying the information I want.
Is there a "best solution" here? Or am I doomed to be "that guy" on my team if I pursue this further (joke's on you i'm already that guy).
tl;dr What is the correct RESTful API signature for a /deployments endpoint that is immutable, and required to not allow the same request to be processed twice?
Idempotent does not mean "2 identical requests should yield the same response". It means: "The server state after 2 identical requests should be the same as when only 1 is made".
A similar example, if you call DELETE on a resource and get a 204 No Content back, and call DELETE again and get a 404, this doesn't violate the idempotency requirement. After the second delete the resource is still removed, just like it was after the first.
So multiple identical idempotent requests are allowed to give different responses.
That said though, I think it might be nicer to the second identical request to also get a 2xx status back. It doesn't have to be the same as the first.
The use-case is if a client sent a HTTP request but got disconnected before it got a response. The client should retry and if the server detects the request is the same as the first, the server can just give the client a success response (but don't do anything).
This is generally a good idea, because if the client got an error back for the second request, it might be harder to know if the request failed because it succeeded earlier, or for other reasons.
That all being said though, there is also a way here to have your cake and eat it too.
A client could send the following header along with the PUT request:
If-None-Match: *
If the client omits the header, you can always return 424 Precondition Required.
If the resource does not yet exist, it's a success response. If the resource was created earlier, you can return 412 Precondition Failed.
Using this mechanism a client has a standard way to figure out that the request failed because a successful one was made earlier.
Based on the docs here, PUT is the best method to use. The response should be 201 when it triggers the deployment, and either 200 or 204 when nothing is changed. It shouldn't be POST because calling a POST endpoint twice should trigger the effect both times.

Which HTTP Verb should I use to claim and lock an item in a job queue?

I plan on using an HTTP REST interface to connect to a Job Control service.
One key operation is to request a computational Job.
The caller does not know the ID of the Job; that is what it will be told.
The job will be marked in the database as locked by the service.
The data needed for processing of the job will be returned to the caller.
Later on, when the caller is done processing the job, it will send the results back via another REST call.
Now it knows the ID of the record to be updated.
The second REST call will update the Job record with the results.
and change the Job's status and release the lock.
Only the Success/Fail status needs to be returned.
I am leaning towards using PUT for each operation because no new record is being created; it is being updated in both cases.
Is this proper? Can the first PUT return a large JSON payload with the Job data or does it just return an HTTP status? Should I use a POST instead, even though I am not creating a record, just updating it?
I would have used a GET for the first operation, but a GET is not supposed to change any objects on the service, and I am locking it, which is a change. Is locking a record acceptable in a GET request?
Which HTTP Verb should I use to claim and lock an item in a job queue?
Key idea: a REST API is a facade - your application/service pretends to be an HTTP compliant document store. All of the interesting things that happen are side effects triggered by modifying documents. See Jim Webber, 2011.
With that in mind...
POST is fine. It's okay to use POST.
PUT/PATCH are a good for remote authoring; the client fetches your representation of a resource, makes edits to his local copy, and sends you a copy of the representation (PUT) or a patch document describing the changes (PATCH). The server can then apply those edits to its copy, or not.
So for your specific example, I would expect the client to GET a representation of your resource, change the information in that representation from unlocked to locked, and then to PUT the changed representation back to your server. You server would be expected to update your copy of the representation to match.
It may remind you of a declarative style - the client tells the server what the representation should look like, and it's up to the server to figure out how to do that.
Included for Completeness, NOT Recommened:
The HTTP method registry also includes a method LOCK, with a corresponding UNLOCK. The semantics for these method tokens are defined by the WebDAV specification. If your meaning of LOCK matches that of WebDAV, then using that might be an answer. Note that the specification includes comments like
Any resource that supports the LOCK method MUST, at minimum, support the XML request and response formats defined herein.
Unless you are already in a space where people are expecting to be able to use general-purpose WebDAV clients to interact with your API, that's probably not a good fit.
The HTTP method registry is extendable. So you could define the semantics of your own method token, then push to have it adopted as a standard.

How to handle network connectivity loss in the middle of REST POST request?

REST POST is used to create resources.
Let's say we have resource url
"http://example.com/cars"
We want to create a new car.
We POST to "http://example.com/cars" with JSON payload containing car properties (color, weight, model, etc).
Server receives the request, creates a new car, sends a response over the network.
At this point network fails (let's say router stops working properly and ignores every packet).
Client fails with TCP timeout (like 90 seconds).
Client has no idea whether car was created or not.
Also client haven't received car resource id, so it can't GET it to check if it was created.
Now what?
How do you handle this?
You can't simply retry creating, because retrying will just create a duplicate (which is bad).
REST POST is used to create resources.
HTTP POST is used for lots of things. REST doesn't particularly care; it just wants resources that support a uniform interface, and hypermedia.
At this point network fails
Bummer!
Now what? How do you handle this? You can't simply retry creating, because retrying will just create a duplicate (which is bad).
This is a general messaging concern, not directly related to REST. The most common solution is to use the Idempotent Receiver pattern. In short, you
need to define your messages so that the receiver has enough information to recognize the request as something that has already been done.
Ideally, this is being supported at the business level.
Idempotent collections of values are often straight forward; we just need to be thinking sets, rather than lists.
Idempotent collections of entities are trickier; if the request includes an identifier for the new entity, or if we can compute one from the data provided, then we can think of our collection as a hash.
If none of those approaches fits, then there's another possibility. Instead of performing an idempotent mutation of the collection, we make the mutation of the collection itself idempotent. Think "compare and swap" - we encode into the request information that identifies the current state of the collection; is that state is still current when the request arrives, then the mutation is applied. If the condition does not hold, then the request becomes a no-op.
Translating this into HTTP, we make a small modification to the protocol for updating the collection resource. First, we GET the current representation; and in the meta data the server provides validators that can be used in subsquent requests. Having obtained the validator, the client evaluates the current representation of the resource to determine if it needs to be changed. If the client decides to make a change, then submits the change with an If-Match or an If-Unmodified-Since header including the validator. The server, before processing the requests, then considers the validator, immediately abandoning the request with 412 Precondition Failed.
Thus, if a conditional state-changing request is lost, the client can at its own discretion repeat the request without concern that server will misunderstand the client's intent.
Retry it a limited number of times, with increasing delays between the attempts, and make sure the transaction concerned is idempotent.
because retrying will just create a duplicate (which is bad).
It is indeed, and it needs fixing, see above. It should be impossible in your system to create two entries with the same attributes. This is easily accomplished at the database level. You can attain idempotence by having the transaction return the same thing whether the entry already existed or was newly created. Or else just have it return EXISTS if the entry already exists, and adjust your client accordingly.

Batch processing via REST service

Are there any best practices for performing BATCH operations via REST for POST, PUT, PATCH verbs?
The current paradigm I am following is that the JSON payload is specified in the body for all 3 operations:
a) POST to return the location of the created resource
b) PUT / PATCH return a 201 if the update is successful
For a batch operation, I intend to accept a collection of JSON objects in the payload body but am trying to figure what to return to the client.
While processing the batch, the operation may succeed for some of the items but might fail for others.
Taking this into account, my take is that the best thing to do is to return a collection of objects indicating the Success/Failure status of each item from the payload.
But this deviates from my paradigm outlined in (a) and (b) above.
Instead, does it make sense to return an identifier representing an ID of the Batch operation itself to the client?
The client would then issue a subsequent GET to get the result of the operation it requested.
Does this approach sound reasonable? If so, does it make sense to block the client on the subsequent GET if the operation hasn't completed OR does it make sense to always return the most current state i.e. a collection of responses for each of the items that the client requested to process.
Ideas/Thoughts/Suggestions?
Since REST is an architectural style with no necessarily clear "guidelines" and no
mandate on how the actions for HTTP verbs should be implements, clearly there is no right or wrong answer here.
I am looking for a solution that is elegant, natural and intuitive.
REST operations are supposed to be atomic as seen from the outside. That is, if one part of the request fails, then the whole state of the server should revert to the pre-request state and a 4xx or 5xx response returned (so, for example, the request could be repeated in whole without ill effects if it failed the first time). This however has nothing to do with batch operations per seā€”such a request could be any kind of request.
Batch operations violate a different REST constraint, that of the uniform interface (defined by HTTP's methods and their operation upon a resource at the specified URL).
If you want to do batch operations, give up on trying to call your API RESTful, because you have already lost out on the benefits that REST imparts, and are just lying to yourself.
If you want to retain those benefits, give up on batch operations.
REST is an architectural style.
I implement APIs to always return the result of what was updated. So in the case of a POST it will return the created entity, with PATCH and PUT it will return the updated entity.
Depending on the batch size I would either return an array of what was processed, or alternatively an array of identifiers of what was processed.
If the batch operation is long running return an identifier for the batch, but make it painfully clear that the endpoint is different from other endpoints
ex. if you create a user by posting to
http://somesite.com/users
send batch requests to
http://somesite.com/batch/users
On a get return the status of the batch operation while it is still running, on complete return an array of record that were updated.
The most important thing is consistency, whatever you choose, always follow the same approach with batch operation throughout the system.

What's the correct way to view idempotency in terms of HTTP DELETE?

I have spent a lot of time recently reading the HTTP 1.1 specification and relating it to REST. I have found that there are two interpretations of the HTTP DELETE method in regards to its "idempotency" and safety. Here are the two camps:
If you delete a resource with HTTP DELETE, and it succeeds (200 OK), and then you try to delete that resource N number of times, you should get back a success message (200 OK) for each and every one of those delete calls. This is its "idempotencyness".
If you delete a resource with HTTP DELETE, and it succeeds (200 OK), and then you try to delete that resource again, you should get back an error message (410 Gone) because the resource was deleted.
The specification says DELETE is idempotent, sure, but it also says that sequences of idempotent events can still produce side effects. I really feel like the second camp is correct, and the first is misleading. What "safety" have we introduced by allowing clients to think they were the cause for deleting a resource previously deleted?
There are a LOT of people in the first camp, including several authors on the subject, so I wanted to check if there was some compelling reason other than emotions that lead people into the first camp.
Being idempotent does not mean that a request is not allowed to have side-effects (that's what the 'safe' property describes). It just mean that issuing the same request multiple times will not result in different or additional side-effects.
In my opinion, the subsequent DELETE request should return an error - it's still idempotent because the state of the server is that same as if only one DELETE request were made. Then again returning the 200 OK status should be OK as well - I don't think being idempotent requires the returning of an error code for the subsequent DELETE requests - it's just that returning the error status seems to make more sense to me.
#MichaelBurr is correct about idempotency and side-effects.
My opinion is that there are 2 states involved in a given REST request, the client's state and the server's state. REST is all about transferring these states between the server and the client, such that the client's state maps to a subset of the server's state, in other words, the subset stays consistent with the server. Because of that idempotency should mean that subsequent idempotent requests will not result in either state being different than it would be from only making the request once. With the first DELETE you would imagine that the server deletes the resource and lets the client know it can delete the resource as well (as the resource "doesn't exist anymore"). Now both states should be identical to before with minus the item that was deleted. For the client to do anything different when it tries to delete the item after it has already been deleted, then the state that is transfered from the server to the client must contain different information. The server can do things slightly differently with the information that the resource was already deleted, but once it responds with something different idempotency of the methods is essentially broken.
For idempotent function:
delete(client_state) -> client_state - {item}
delete(delete(client_state)) -> client_state - {item}
delete(client_state) = delete(delete(client_state))
The best way to guarantee this idempotency is if the server's response is identical, that means the only way for the client's state to break the idempotency is for there to be non-determinacy or side effects in the client's handling of the response (which probably points to an incorrect implementation of handling the response).
If there is an agreement between the client and server that the status codes exist outside of the representation of the state being transferred (REST), then it is possible to inform the client that the item "doesn't exists anymore" (as it would in the first request) with the extra comment that it had previously been deleted. What the client does with this information is unclear, but it shouldn't effect the resulting client state. But then the status code can't be used to communicate state, or rather if it does also communicate state in other situations (like maybe "you don't have permission to delete this item" or "item was not deleted"), then there's some introduced ambiguity or confusion. So, you at least need a pretty good reason for introducing more confusion into the communication if you want to say that DELETE is idempotent and still have the server's response depend on previous DELETE requests that are identical.
HTTP requests involve remove methods, so the function might resemble
delete(client_state) = send_delete(client_state) -> receive_delete(client_state)
-> respond_to_delete(informative_state)
-> handle_response(informative_state)
-> client_state - {item}
Wikipedia defines Idempotence as an operation that:
can be applied multiple times without changing the result beyond the initial application.
Notice that they talk about the result of the operation. To me, this includes both the server state and the response code.
The HTTP specification is a bit more vague on the matter. It defines it specifies that HTTP methods are Idempotent:
if the intended effect of multiple identical requests is the same as for a single request.
If you interpret effect as result in the Wikipedia definition then they mean the same. In any case, I question the practical benefit of telling clients that the resource as already been deleted.
Final point: Idempotence is defined in terms of a single client. Once you start introducing concurrent requests by other clients, all bets are off. You are supposed to use conditional-update headers (such as If-Match-ETag) to deal with such cases.
To reiterate: you should return the same return code, whether the resource just got deleted, was deleted by a previous request, or never existed at all.