http verb to invoke services / methods - rest

what is the best practice in defining web service that represent a non REST command invocation?
For REST, basically we use POST to create new record(s), GET to retrieve record(s), PUT to update record(s) and DELETE to remove record(s). Which http verb should I use if I just want to invoke some other non resource function, for example - to flush a system cache?

Which http verb should I use if I just want to invoke some other non resource function, for example - to flush a system cache?
HTTP request methods should be selected based on their alignment with their defined semantics.
The most important of these is to determine whether or not the semantics are safe
Request methods are considered "safe" if their defined semantics are essentially read-only; i.e., the client does not request, and does not expect, any state change on the origin server as a result of applying a safe method to a target resource. Likewise, reasonable use of a safe method is not expected to cause any harm, loss of property, or unusual burden on the origin server.
Advertising a safe link invites consumers to pre-fetch a link, or to crawl and index the representation found there.
If having Google and a billion of her closest friends flushing your system cache sounds expensive, then you probably don't want a safe method.
PUT and PATCH are unsafe methods with semantics of manipulating representations. So if you had a schema that described a system cache, a client might PUT a representation of an empty cache in the entity body, and send that to you, whereupon you could flush the cache. You could achieve a similar things with PATCH, sending a list of the edits needed to make the change.
Both of these rely on the illusion that your resources are just documents. I GET a representation of your resource, I load that into my generic editor, make changes, send my edited representation back to you, and then it's up to you to manifest those changes (or not).
But they aren't required -- if you want to simply document that
PUT /df1645af-f960-4cc4-ad7a-d0ddd29903f8
Content-Length: 0
has the side effect of flushing the system cache, the REST Police aren't going to come after you just because you've introduced a bit of RPC into the mix.
Of course, if you were doing this with HTML, then your only choice would be POST.
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.
Which is to say, POST is always an option.
It's easy enough to imagine the flow -- you load up some bookmark, follow a system cache link, find a form with a flush cache button, and submit. The browser would create the request as described in the form elements and submit it.
So that's going to be fine too. And the REST police won't bother you for that, because that protocol is actually RESTful.
If those answers are unsatisfying, or if you are just surveying the space to know what options are available, you can review the HTTP Method Registry. To be honest, I've never found anything there I've wanted to use. But if WebDAV is your jam....

Related

Which HTTP method to use to build a REST API to perform following operation?

I am looking for a REST API to do following
Search based on parameters sent, if results found, return the results.
If no results found, create a record based on search parameters sent.
Can this be accomplished by creating one single API or 2 separate APIs are required?
I would expect this to be handled by a single request to a single resource.
Which HTTP method to use
This depends on the semantics of what is going on - we care about what the messages mean, rather than how the message handlers are implemented.
The key idea is the uniform interface constraint it REST; because we have a common understanding of what HTTP methods mean, general purpose connectors in the HTTP application can do useful work (for example, returning cached responses to a request without forwarding them to the origin server).
Thus, when trying to choose which HTTP method is appropriate, we can consider the implications the choice has on general purpose components (like web caches, browsers, crawlers, and so on).
GET announces that the meaning of the request is effectively read only; because of this, general purpose components know that they can dispatch this request at any time (for instance, a user agent might dispatch a GET request before the user decides to follow the link, to make the experience faster).
That's fine when you intend the request to provide the client with a copy of your search results, and the fact that you might end up making changes to server local state is just an implementation detail.
On the other hand, if the client is trying to edit the results of a particular search (but sometimes the server doesn't need to change anything), then GET isn't appropriate, and you should use POST.
A way to think about the difference is to consider what action you want to be taken when an intermediate cache holds a response from an earlier copy of "the same" request. If you want the cache to reuse the response, GET is the best; on the other hand, if you want the cache to throw away the old response (and possibly store the new one), then you should be using POST.

Modelling a endpoint for a REST API that need save data for every request

A few time ago I participate from a interview where had a question about REST modelling, and how the best way to implement it. The question was:
You have an REST API where you expose a method to consult the distance between two point, although you must save each request to this method to expose the request history.
And I was questioned about which HTTP method should be used on this case, for me the logic answer in that moment was the GET method (to execute the both actions). After this the interviewer asked me why, because since we are also storing the request, this endpoint is not idempotent anymore, after that I wasn't able to reply it. Since this stills on my mind, so I decided to verify here and see others opinions about which method should be used for this case (or how many, GET and POST for example).
You have an REST API where you expose a method to consult the distance between two point, although you must save each request to this method to expose the request history.
How would you do this on the web? You'd probably have a web page with a form, and that form would have input controls to collect the start and end point. When you submit the form, the browser would use the data in the controls, as well as the form metadata and standard HTML processing rules to create a request that would be sent to the server.
Technically, you could use POST as the method of the form. It's completely legal to do that. BUT, as the semantics of the request are "effectively read only", a better choice would be to use GET.
More precisely, this would mean having a family of similar resources, the representation of which includes information about the two points described in the query string.
That family of similar resources would probably be implemented on your origin server as a single operation/route, with a parser extracting the two points from the query string and passing them along to the function as arguments.
the interviewer asked me why, because since we are also storing the request, this endpoint is not idempotent anymore
This is probably the wrong objection - the semantics of GET requests are safe (effectively read only). So the interview might argue that saving the request history is not read only. However, this objection is invalid, because the semantic constraints apply to the request message, not the implementation.
For instance, you may have noticed that HTTP servers commonly add an entry to their access log for each request. Clearly that's not "read only" - but it is merely an implementation detail; the client's request did not say "and also log this".
GET is still fine here, even though the server is writing things down.
One possible objection would be that, if we use GET, then sometimes a cache will return an previous response rather than passing the request all the way through to the origin server to get logged. Which is GREAT - caches are a big part of the reason that the web can be web scale.
But if you don't want caching, the correct way to handle that is to add metadata to the response to inhibit caching, not to change the HTTP method.
Another possibility, which is more consistent with the interviewer's "idempotent" remark, is that they wanted this "request history" to be a resource that the client could edit, and that looking up distances would be a side effect of that editing process.
For instance, we might have some sort of an "itinerary" resource with one or more legs provided by the client. Each time the client modifies the itinerary (for example, by adding another leg), the distance lookup method is called automatically.
In this kind of a problem, where the client is (logically) editing a resource, the requests are no longer "effectively read only". So GET is off the table as an option, and we have to look into the other possibilities.
The TL;DR version is that POST would always be acceptable (and this is how we would do it on the web), but you might prefer an API style where the client edits the representation of the resource locally, in which case you would let the client choose between PUT and PATCH.

RESTful GET that can change system state?

I am building a service that caches short lived data objects. The object creation process is expensive, so this service will cache them and other downstream applications can use them without managing their lifecycle.
The plan is that downstream apps will make a GET call to this service to fetch object. If the object is expired, the service will fetch a new object, cache it, and return it to the caller.
And Here is my dilemma - This way the GET operation changes system state, by fetching new object. I am sure that I am violating REST principles here, or is there a valid justification for this? Should I just change the method to POST?
This way the GET operation changes system state, by fetching new object. I am sure that I am violating REST principles here, or is there a valid justification for this? Should I just change the method to POST?
The short version: this is fine.
Longer version: REST says that our resources have common "uniform" semantics - the meaning of messages doesn't depend on which resource you reference.
In the case of HTTP, the primary discriminator for requests is the method. For the GET method, the semantics are (currently) described by RFC 7231. GET is explicitly identified as being safe
Request methods are considered "safe" if their defined semantics are essentially read-only; i.e., the client does not request, and does not expect, any state change on the origin server as a result of applying a safe method to a target resource.
If you, the server, need to change a bunch of your private information stores to compute the current representation of the resource, that's an implementation detail hidden behind the HTTP facade. You can do what you like.
Fundamentally, what safe means is that anybody who knows the identifier can ask for the current representation of the resource at any time. This allows browser to retry requests when the network is flaky, or for spiders to crawl around indexing the net, knowing that their requests do no harm (or more precisely, that the fault of any harms inflicted by those requests is properly assigned to the server).
If that's OK, then GET is a perfectly "RESTful" method to use for these requests.

Should this be a GET or a PATCH request?

if i do a request to get some data from a database without sending any updates, however i'm marking the record in the database to say the data has been fetched, does that make it a PATCH request or a GET?
Short answer: No, it is still a GET.
RFC 7231 defines safe
Request methods are considered "safe" if their defined semantics are essentially read-only....
This definition of safe methods does not prevent an implementation
from including behavior that is potentially harmful, that is not
entirely read-only, or that causes side effects while invoking a safe
method. What is important, however, is that the client did not
request that additional behavior and cannot be held accountable for
it. For example, most servers append request information to access
log files at the completion of every response, regardless of the
method, and that is considered safe even though the log storage might
become full and crash the server. Likewise, a safe request initiated
by selecting an advertisement on the Web will often have the side
effect of charging an advertising account.
So if the client is trying to retrieve a current representation of the resource,
the fact that your implementation happens to do a bit of bookkeeping on the side doesn't change the semantics of the request.
Part of the point of an HTTP front end is that clients are completely insulated from the underlying implementation details of the server -- everything looks like a dumb web site from the outside.
The HTTP spec is rather clear on that if you read through the definition of safe:
Request methods are considered "safe" if their defined semantics are essentially read-only; i.e., the client does not request, and does not expect, any state change on the origin server as a result of applying a safe method to a target resource. Likewise, reasonable use of a safe method is not expected to cause any harm, loss of property, or unusual burden on the origin server.
This definition of safe methods does not prevent an implementation from including behavior that is potentially harmful, that is not entirely read-only, or that causes side effects while invoking a safe method. What is important, however, is that the client did not request that additional behavior and cannot be held accountable for it. For example, most servers append request information to access log files at the completion of every response, regardless of the method, and that is considered safe even though the log storage might become full and crash the server. Likewise, a safe request initiated by selecting an advertisement on the Web will often have the side effect of charging an advertising account.
...
So a state change through a GET triggered download is fine as long as the client is not aware of that state change.
In certain situations though, exposing a state change via GET may be risky. Just think of a crawler that invokes a couple of URIs that order some Pizza or the like. According to the spec this is fine and the crawler must not made accountable for that order. This is simply telling you that it was your fault.
With that being said, you can always use POST if you feel uncomfortable with certain HTTP operations as POST literally allows you to process the request according to the resources own semantics.
Which leads me to the next point of re-thinking your design. Returning some document that includes it own state is somehow strange in my opinion. Usually such information is meta-data about a document but not the resource itself. Here you could either use HTTP headers to communicate such information to the client or design the state of that resource as yet a further resource you can hint a client about providing it a link to look it up if it is interested.
Anyway, while not elegant performing a state change on retrieving a resource via GET is not forbidden. I would though invest a couple more thoughts on whether you want to include the state within the resource itself or expose it via its own resource.

PUT POST being idempotent (REST)

I don't quite get how the HTTP verbs are defined as idempotent. All I've read is GET and PUT is idempotent. POST is not idempotent. But you could create a REST API using POST that doesn't change anything (in the database for example), or create a REST API for PUT that changes every time it is called.
Sure, that probably is the wrong way to do things but if it can be done, why is PUT labeled as idempotent (or POST as not) when it is up to the implementation? I'm not challenging this idea, I'm probably missing something and I ask to clear my understanding.
EDIT:
I guess one way to put my question is: What would be the problem if I used PUT to make a non-idempotent call and POST to do so?
You are right in pointing out there is nothing inherent within the HTTP protocol that enforces the idempotent attribute of methods/verbs like PUT and DELETE. HTTP, being a stateless protocol, retains no information or status of each request that the user makes; every single request is treated as independent.
To quote Wikipedia on the idempotent attribute of HTTP methods (emphasis mine):
Note that whether a method is idempotent is not enforced by the
protocol or web server. It is perfectly possible to write a web
application in which (for example) a database insert or other
non-idempotent action is triggered by a GET or other request. Ignoring
this recommendation, however, may result in undesirable consequences,
if a user agent assumes that repeating the same request is safe when
it isn't.
So yes, it is possible to deviate from conventional implementation, and rollout things like non-changing POST implementation, non-idempotent PUT etc. probably with no significant, life-threatening technical problems. But you might risk upsetting other programmers consuming your web services, thinking that you don't know what you're doing.
Here's an important quote from RFC2616 on the HTTP methods being safe (emphasis mine):
Implementors should be aware that the software represents the user in
their interactions over the Internet, and should be careful to allow
the user to be aware of any actions they might take which may have an
unexpected significance to themselves or others.
In particular, the convention has been established that the GET and
HEAD methods SHOULD NOT have the significance of taking an action
other than retrieval. These methods ought to be considered "safe".
This allows user agents to represent other methods, such as POST, PUT
and DELETE, in a special way, so that the user is made aware of the
fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not
generate side-effects as a result of performing a GET request; in
fact, some dynamic resources consider that a feature. The important
distinction here is that the user did not request the side-effects, so
therefore cannot be held accountable for them.
UPDATE: As pointed out by Julian, RFC 2616 has been replaced by RFC 7231. Here's the corresponding section.
So when you publish a web service as a PUT method, and I submit a request that looks like:
PUT /users/<new_id> HTTP/1.1
Host: example.com
I will expect a new user resource to be created. Likewise, if my request looks like:
PUT /users/<existing_id> HTTP/1.1
Host: example.com
I will expect the corresponding existing user to be updated. If I repeat the same request by submitting the form multiple times, please don't pop up a warning dialog (because I like the established convention).
Conversely, as a consumer of a POST web service, I will expect requests like:
POST /users/<existing_id> HTTP/1.1
Host: example.com
to update the corresponding existing user, while a request that looks like:
POST /users/<new_id> HTTP/1.1
Host: example.com
to raise an error because the URL doesn't exist yet.
Indeed, an implementation can do anything it wants. However, if that is incorrect according to the protocol spec, surprising things might happen (such as as library or intermediary repeating a PUT if this first attempt failed).
hope the link helps to you:HTTP Method idempotency
Be careful when dealing with safe methods as well: if a seemingly safe method like GET will change a resource, it might be possible that any middleware client proxy systems between you and the server, will cache this response. Another client who wants to change this resource through the same URL(like: http://example.org/api/article/1234/delete), will not call the server, but return the information directly from the cache. Non-safe (and non-idempotent) methods will never be cached by any middleware proxies.