REST services conventions

In REST services, we normally use 'GET' requests when we want to retrieve data from the server; however, we could also retrieve data using a 'POST' request.
We use 'POST' to create, 'PUT' to update, and 'DELETE' to delete; however, we could just as well create new data with a 'DELETE' request.
So I was just wondering: what is the real reason behind this? Why are these conventions used?

So I was just wondering: what is the real reason behind this? Why are these conventions used?
So the world doesn't fall apart!
No, but seriously, why are any protocols or standards created? Take this historical scenario. Back in the early days of Google, many developers (relative to nowadays) weren't too savvy about the HTTP protocol. What you might have come across was a bunch of sites that just made use of the well-known (maybe only known) GET method. So there would be links that were GET requests but performed operations that were meant to be POST requests, operations that changed the state of the server (sometimes very important changes of state). Enter Google, which spends its days crawling the web. So now you have all these links that Google is crawling, all of them GET requests, but changing the state of the server. All these companies are getting a bunch of hits on their servers that change state, and they all think they're being attacked! But Google isn't doing anything wrong. HTTP semantics state that GET requests should not have state-changing behavior; GET should be a "read only" method. So finally these companies smartened up and started following HTTP semantics. True story.
The moral of the story: follow protocols, that's what they're there for - to follow.
You seem to be looking at it from the perspective of the server implementation. Yeah, you can implement your server to accept a DELETE request to "get" something. That's not really the matter at hand. When implementing the server, you need to think about what the client expects. Ultimately, you are creating an API. Look at it from a code API perspective:
public class Foo {
    public Bar bar;

    public Bar deleteBar() {
        return bar; // Really?!
    }

    public void getBar() {
        bar = null; // What the..??!
    }
}
I don't know how long a developer would last in the game writing code like that. Any caller expecting to "get" Bar (simply going by naming semantics) has another thing coming. The same goes for your REST services. It is ultimately a web API, and it should follow the semantics of the protocol (namely HTTP) on which it is built. Those who understand the protocol will have an idea of what the API does (at least in the CRUD sense) simply based on the type of request they make.
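For contrast, here is a rough sketch of what callers would naturally expect those names to mean (keeping the same made-up Foo/Bar types from the example above):
public class Foo {
    private Bar bar;

    public Bar getBar() {
        return bar;   // "get" only reads
    }

    public void deleteBar() {
        bar = null;   // "delete" actually removes
    }
}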
My suggestion to you, or anyone trying to learn REST, is to get a good handle on HTTP. I would keep the following document handy. Read it once, then keep it as a reference:
Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content (RFC 7231)

GET is cached by proxies; POST and DELETE are not!
Yes, you could create data with a GET request, but then you would have to invalidate whatever has been cached. Why do the extra work?
Also, the maximum accepted request sizes differ (a GET carries its data in the URL and headers, which servers limit in size), because the methods are intended for different kinds of use.

I recommend reading the spec, which clearly states how each HTTP method should be used.
Why are these conventions used?
They are conventions, that is, best practices that have been adopted as a standard. You do not have to adhere to the standard, but most consumers of a REST service assume you do. That way it is easier to understand the implementation / interface.

Related

Modelling an endpoint for a REST API that needs to save data for every request

A while ago I took part in an interview that included a question about REST modelling and the best way to implement it. The question was:
You have a REST API where you expose a method to look up the distance between two points; however, you must save each request to this method in order to expose a request history.
I was asked which HTTP method should be used in this case. To me, the logical answer at that moment was GET (to perform both actions). The interviewer then asked me why, pointing out that, because we are also storing the request, this endpoint is no longer idempotent, and I wasn't able to reply to that. This has stayed on my mind, so I decided to ask here and see other opinions about which method should be used in this case (or which methods, e.g. GET and POST).
You have a REST API where you expose a method to look up the distance between two points; however, you must save each request to this method in order to expose a request history.
How would you do this on the web? You'd probably have a web page with a form, and that form would have input controls to collect the start and end point. When you submit the form, the browser would use the data in the controls, as well as the form metadata and standard HTML processing rules to create a request that would be sent to the server.
Technically, you could use POST as the method of the form. It's completely legal to do that. BUT, as the semantics of the request are "effectively read only", a better choice would be to use GET.
More precisely, this would mean having a family of similar resources, the representation of which includes information about the two points described in the query string.
That family of similar resources would probably be implemented on your origin server as a single operation/route, with a parser extracting the two points from the query string and passing them along to the function as arguments.
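As a rough illustration only (the route, parameter names, and helper functions below are all made up), such an origin server could look something like this, using the JDK's built-in com.sun.net.httpserver so it stays self-contained. A client would call it with something like GET /distance?from=Berlin&to=Prague:
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class DistanceServer {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/distance", exchange -> {
            // Parse the two points out of the query string and hand them to the function.
            Map<String, String> params = parseQuery(exchange.getRequestURI().getRawQuery());
            String from = params.get("from");
            String to = params.get("to");

            logRequest(from, to); // the "request history" is an implementation detail
            String body = "{\"distanceKm\": " + distanceBetween(from, to) + "}";

            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(bytes);
            }
        });
        server.start();
    }

    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new HashMap<>();
        if (query == null) return params;
        for (String pair : query.split("&")) {
            String[] kv = pair.split("=", 2);
            params.put(URLDecoder.decode(kv[0], StandardCharsets.UTF_8),
                    kv.length > 1 ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8) : "");
        }
        return params;
    }

    // Stubs standing in for the real distance lookup and history store.
    static double distanceBetween(String from, String to) { return 42.0; }
    static void logRequest(String from, String to) { /* append to the request history */ }
}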
the interviewer asked me why, pointing out that, because we are also storing the request, this endpoint is no longer idempotent
This is probably the wrong objection: the semantics of GET requests are safe (effectively read only), so the interviewer might argue that saving the request history is not read only. However, this objection is invalid, because the semantic constraints apply to the request message, not the implementation.
For instance, you may have noticed that HTTP servers commonly add an entry to their access log for each request. Clearly that's not "read only" - but it is merely an implementation detail; the client's request did not say "and also log this".
GET is still fine here, even though the server is writing things down.
One possible objection would be that, if we use GET, then sometimes a cache will return a previous response rather than passing the request all the way through to the origin server to get logged. Which is GREAT - caches are a big part of the reason that the web can be web scale.
But if you don't want caching, the correct way to handle that is to add metadata to the response to inhibit caching, not to change the HTTP method.
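As a small, hypothetical sketch (the helper name is invented), the origin server would set that metadata on its GET responses rather than switching methods:
import com.sun.net.httpserver.HttpExchange;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class NoCacheResponse {
    // Mark the response as non-cacheable so every lookup still reaches the
    // origin server (and gets recorded), while the request stays a plain GET.
    static void send(HttpExchange exchange, String body) throws IOException {
        exchange.getResponseHeaders().set("Cache-Control", "no-store");
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(200, bytes.length);
        try (OutputStream os = exchange.getResponseBody()) {
            os.write(bytes);
        }
    }
}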
Another possibility, which is more consistent with the interviewer's "idempotent" remark, is that they wanted this "request history" to be a resource that the client could edit, and that looking up distances would be a side effect of that editing process.
For instance, we might have some sort of an "itinerary" resource with one or more legs provided by the client. Each time the client modifies the itinerary (for example, by adding another leg), the distance lookup method is called automatically.
In this kind of a problem, where the client is (logically) editing a resource, the requests are no longer "effectively read only". So GET is off the table as an option, and we have to look into the other possibilities.
The TL;DR version is that POST would always be acceptable (and this is how we would do it on the web), but you might prefer an API style where the client edits the representation of the resource locally, in which case you would let the client choose between PUT and PATCH.
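To make that concrete, here is a rough client-side sketch of the two styles (the URLs, payloads, and the itinerary resource itself are all hypothetical):
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ItineraryExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Style 1: plain POST, the way a web form would do it ("add this leg").
        HttpRequest post = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/itineraries/42/legs"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"from\":\"Berlin\",\"to\":\"Prague\"}"))
                .build();

        // Style 2: the client edits its copy of the itinerary and sends the change
        // back; PATCH for a partial edit (PUT would send the whole representation).
        HttpRequest patch = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/itineraries/42"))
                .header("Content-Type", "application/json-patch+json")
                .method("PATCH", HttpRequest.BodyPublishers.ofString(
                        "[{\"op\":\"add\",\"path\":\"/legs/-\","
                        + "\"value\":{\"from\":\"Berlin\",\"to\":\"Prague\"}}]"))
                .build();

        client.send(post, HttpResponse.BodyHandlers.discarding());
        client.send(patch, HttpResponse.BodyHandlers.discarding());
    }
}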

Using the GET verb to update in a REST API?

I know the use of HTTP verbs is based on the standard specification. But my question is: if I use GET for update operations and write the code logic to update, does it create issues in any scenario? Apart from the standard, what else could be the reason to use these verbs for a specific purpose only?
my question is: if I use GET for update operations and write the code logic to update, does it create issues in any scenario?
Yes.
A simple example - suppose the network between the client and the server is unreliable; specifically, for a time, HTTP responses are being lost. A general purpose component (like a web proxy) might time out, and then, noticing that the method token of the request is GET, resend the request a second/third/fourth time, with your server performing its update on every GET request.
Let us further assume that these multiple update operations lead to an undesirable outcome; where do we properly affix blame?
Second example: you send someone a copy of the link to the update operation, so that they can send you a request at the appropriate time. But suppose you send that link to them in an email, and the email client recognizes the uri and (as a performance optimization) pre-fetches the link, triggering your update operation too early. Where do we properly affix the blame?
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property -- Fielding, 2002
In these and other examples, blame is correctly affixed to your server, because GET has a standardized meaning which includes the constraint that the semantics of the request are safe.
That's not to say that you can't have side effects when handling a GET request; "hit counters" are almost as old as the web itself. You have a lot of freedom in your implementation; so long as you respect the uniform interface, there won't be too much trouble.
Experience report: one of our internal tools uses GET requests to trigger scheduling; in our carefully controlled context (which is not web scale), we get away with it, and have for a very long time.
To borrow your language, there are certainly scenarios that would give us problems; but given our controls we manage to avoid them.
I wouldn't like our chances, though, if requests started coming in from outside of our carefully controlled context.
I think it's a decent question. You're asking a hypothetical: is there any value to doing the right thing other than the fact that we agree to use GET for fetching? That is, is there value beyond it being 'semantically nice'? A similar question in HTML might be: "Is it OK to use a <div> with an onclick instead of a <button>?" (the answer is no).
There certainly is. Clients, servers and intermediaries all change their behavior depending on which method is used. Even if your server can process GET for updates, and you build a client that uses this, your browser might still get confused.
If you are interested in this subject, don't ask on a forum; read the spec. The HTTP specification tells you what clients, servers and proxies should do when they encounter certain methods, statuses and headers.
Start at RFC 7231.

Are the DELETE/PUT HTTP verbs needed?

I don't see a huge benefit of designing a REST API service that uses PUT and DELETE to update and delete data. These two verbs can easily be replaced by using POST and a unique URL.
The only benefit I see of using PUT and DELETE is that it cuts down the number of URLs the REST API service uses, which isn't much. Are there other benefits that I'm overlooking?
This actually has not so much to do with REST, but more with the Hypertext Transfer Protocol (HTTP).
The basic idea of HTTP is that you perform a certain action (one of the HTTP methods, such as GET, POST, PUT and DELETE) on a certain resource (identified by its URL). So if you start deviating from that by adding the action into the URL (such as http://example.com/api/books/123/delete) and using a different HTTP method, you are violating the HTTP protocol.
The downsides of this might not be instantly visible (because it will still work), and might be limited if you are only using the API by yourself. However, if other programmers are using the API as well, you are creating certain expectations by stating that you have a RESTful HTTP API while you are actually not adhering to the protocol.
For instance, if calling GET /api/books/123 returns a representation of the book resource, a developer expects to be able to call PUT on that same URL to update the information on that book, and DELETE to remove the book altogether. (Of course, if they do not have permission to do that, you don't actually remove the book but return a 403 'Forbidden' or 405 'Method Not Allowed' status code, which are also part of the HTTP specification.)
However, if you deviate from the protocol (basically inventing your own protocol), you will have to describe somewhere that instead of calling DELETE /api/books/123 developers have to call POST /api/books/123/remove/the/book. They will have to be aware of all your API's custom rules because you are not following the standard.
Another good point is made by Darrel Miller in his answer. The HTTP methods all have a certain meaning and also certain characteristics. For instance, GET is supposed to be a safe method, used to retrieve information without making any changes on the server side. And PUT and DELETE are idempotent, which means that (even though they are not safe methods like GET) you can make as many requests as you like without any unwanted side effects (deleting a book one or ten times has the same effect: the book is gone). POST however is not idempotent: if you do ten POST requests to create a new book, you will most likely end up with 10 duplicate book entries.
Because of these characteristics, developers and tools are able to make certain assumptions. For instance, a search indexer such as Googlebot will only make GET requests, in order not to break anything on the server. However, if you violate the HTTP specification by having a URL http://example.com/api/books/123/delete in order to delete a book, you might notice one day that your database is completely empty because Google has been indexing your site. That wouldn't be Google's fault but yours, because you didn't follow the specification.
So to make a long story short: using a protocol or standard (such as HTTP) sets certain expectations, and if you then deviate from the protocol you will create unexpected behavior and possible unwanted side effects.
It is not required to use PUT and DELETE to get the benefits of a REST based system.
Using the more precise methods can be advantageous in certain cases. Usually it is intermediary components that take advantage of the semantics. PUT and DELETE are idempotent methods, so if some generic component receives a 503, in theory it could retry a PUT/DELETE until it gets a successful response. With a POST method, the intermediary can't do that.
With a DELETE method a client knows not to send a body. With a PUT, it knows to send a complete representation. With a POST method, you need to communicate to the client how to make the request in some other way, like a link relation.
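As a rough sketch of what a generic client can assume from the method alone (the URL and JSON body below are made up):
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BookClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // PUT: send a complete representation of the resource; idempotent,
        // so an intermediary may safely retry it.
        HttpRequest put = HttpRequest.newBuilder()
                .uri(URI.create("http://example.com/api/books/123"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(
                        "{\"id\":123,\"title\":\"REST in Practice\"}"))
                .build();

        // DELETE: no body needed; also idempotent and safe to retry.
        HttpRequest delete = HttpRequest.newBuilder()
                .uri(URI.create("http://example.com/api/books/123"))
                .DELETE()
                .build();

        client.send(put, HttpResponse.BodyHandlers.discarding());
        client.send(delete, HttpResponse.BodyHandlers.discarding());
    }
}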
You can live without PUT and DELETE.
This article tells you why: http://www.artima.com/lejava/articles/why_put_and_delete.html
..BUT (quote from Fielding): "A REST API should not contain any changes to the communication protocols aside from filling-out or fixing the details of underspecified bits of standard protocols, such as HTTP’s PATCH method or Link header field. Workarounds for broken implementations (such as those browsers stupid enough to believe that HTML defines HTTP’s method set) should be defined separately, or at least in appendices, with an expectation that the workaround will eventually be obsolete. [Failure here implies that the resource interfaces are object-specific, not generic.] "
(http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven)

Writing a client for a RESTful (hypermedia) API

I've been reading up on 'real' RESTful APIs for a few days now, and I think I'm close to grokking what it's about.
But one of the things that I stumble on is that I can't even begin to imagine how one would write a client for a 'real' hypermedia API:
Most of the examples I've read talk about browsers and spiders, but that's not especially helpful: one is human-directed and 'intelligent', the other is dumb and 'random'. As it stands, I kind of get the impression that you'd need to learn AI to get a client working.
One thing that isn't clear to me is how the client knows which verb to use on any given link? Is that implicit in the 'rel' type of the uri? The alternative (reading here) seems to be using xhtml and having a client which can parse and post forms.
How likely is it that a link will change, but not the route to the link?
In most examples you see around, the route and the link are the same:
eg. if I want to set up a client which will bring me back the list of cakes from Toni's Cake Shop:
http://tonis.com
{ "link": { "type": "cakes", "uri": "http://tonis.com/cakes" } }
What happens when Toni's becomes Toni's Food Shop, and the link becomes http://tonis.com/desserts/cakes?
Do we keep the initial cakes link at the root, for reverse-compatibility? And if not, how do we do a 'redirect' for the poor little agent who has been told "go to root, look for cakes"?
What am I missing?
OK, I'm not a REST expert either; I've been reading a lot of related material lately, so what I'm going to write is not my experience or opinion but rather a summary of what I've read, especially the REST in Practice book.
First of all, you can't escape having some initial agreement between client and server. The goal of REST is to make them agree on the very minimum of things that are relevant to both of them, and let each party take care of its own concerns itself. E.g., the client should not care about link layout or how the data is stored on the server, and the server should not care about a client's state. What they agree on in advance (i.e. before the interaction begins) is what the aforementioned book's authors call the "Domain Application Protocol" (DAP).
The important thing about the DAP is that it's stateful, even though HTTP itself is not (since any client-service interaction has state, at least a beginning and an end). This state can be described in terms of what a client can/may/is expected to do next: "I've started using the service, what now? OK, I can search items. Search for this item, what's next? OK, I can order this and that... etc."
What defines a hypermedia content type is its ability to handle both the data exchange and the interaction state. As I already mentioned, state is described in terms of possible actions and, as the "Resource" in REST suggests, all actions are described in terms of accessible resources. I guess you have seen the acronym HATEOAS (Hypermedia As The Engine Of Application State); that's what it apparently means.
So, to interact with the service, a client uses a hypermedia format they both understand, which can be standard, homegrown or a mixture of those (e.g. XML/XHTML-based). In addition to that, they must also share the protocol, which is most likely HTTP, but since some details are omitted from the standard, there must be some idioms of its usage, like "Use POST to create a resource and PUT to update". Also, such protocol would include the entry points of the service (again, in terms of accessible resources).
Those three aspects fully define the domain protocol. In particular, a client is not supposed to know any internal links before it starts using the service, or to remember them after the interaction completes. As a result, any change in the internal navigation, like renaming /cakes to /f5d96b5c, will not affect the client as long as it adheres to the initial agreement and enters the shop through the front door.
@Benjol:
You should avoid programming clients against particular URIs. When you describe a link, what matters most is its meaning, not the URI itself. You should be able to change the URI at any time without breaking your client.
I'd change your example this way:
{"link": {
"rel": "collection http://relations.your-service.com/cakes",
"href": "http://tonis.com/cakes",
"title": "List of cakes",
"type": "application/vnd.yourformat+json"
}}
If there is a client which consumes your service, it needs to understand:
- the link structure itself
- the link relations (in this case "collection", which is a standard registered relation, and "http://relations.your-service.com/cakes", which is your domain-specific link relation)
In this case, the client can just dereference the address specified by the "href" attribute and display the list of cakes. Later, if you change the cake-list URI, the client will continue to work, as long as it still understands the semantics of your media type.
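To sketch what such a client could look like (this assumes the Jackson library for JSON parsing; the entry point and relation name come from the example above, everything else is made up), the client hard-codes only the entry point and the link relation it understands, never the /cakes path itself:
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CakeShopClient {
    private static final String CAKES_REL = "http://relations.your-service.com/cakes";

    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        ObjectMapper mapper = new ObjectMapper(); // Jackson, assumed dependency

        // 1. Enter through the front door: the only URI the client is told about.
        String entryBody = http.send(
                HttpRequest.newBuilder(URI.create("http://tonis.com")).GET().build(),
                HttpResponse.BodyHandlers.ofString()).body();

        // 2. Find the link whose relation we understand; its href is the server's business.
        JsonNode link = mapper.readTree(entryBody).path("link");
        if (link.path("rel").asText().contains(CAKES_REL)) {
            String cakesUri = link.path("href").asText();

            // 3. Dereference it. If the server later moves /cakes to /desserts/cakes,
            //    this still works because the path was never hard-coded here.
            String cakes = http.send(
                    HttpRequest.newBuilder(URI.create(cakesUri)).GET().build(),
                    HttpResponse.BodyHandlers.ofString()).body();
            System.out.println(cakes);
        }
    }
}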
P.S.
See registered link relation attributes:
http://www.iana.org/assignments/link-relations/link-relations.xml
Web Linking RFC: https://www.rfc-editor.org/rfc/rfc5988

RESTful Web Services: method names, input parameters, and return values?

I'm trying to develop a simple REST API. I'm still trying to understand the basic architectural paradigms for it. I need some help with the following:
"Resources" should be nouns, right? So, I should have "user", not "getUser", right?
I've seen this approach in some APIs: www.domain.com/users/ (returns list), www.domain.com/users/user (do something specific to a user). Is this approach good?
In most examples I've seen, the input and output values are usually just name/value pairs (e.g. color='red'). What if I wanted to send or return something more complex than that? Am I forced to deal with XML only?
Assume a PUT to /user/ method to add a new user to the system. What would be a good format for input parameter (assume the only fields needed are 'username' and 'password')? What would be a good response if the user is successful? What if the user has failed (and I want to return a descriptive error message)?
What is a good and simple approach to authentication and authorization? I'd like to restrict most of the methods to users who have "logged in" successfully. Is passing the username/password on each call OK? Is passing a token considered more secure (and if so, how should this be implemented in terms of expiration, etc.)?
For point 1, yes. Nouns are expected.
For point 2, I'd expect /users to give me a list of users. I'd expect /users/123 to give me a particular user.
For point 3, you can return anything. Your client can specify what it wants (e.g. text/xml, application/json, etc.) using an HTTP request header (Accept), and you should comply with that request as much as you can (although you may only handle, say, text/xml; that would be reasonable in a lot of situations).
For point 4, I'd expect POST to create a new user. PUT would update an existing object. For reporting success or errors, you should be using the existing HTTP success/error codes. e.g. 200 OK. See this SO answer for more info.
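As a rough sketch of point 4 (the URL and field names are hypothetical), the client POSTs the new user and reads the outcome from the standard status codes:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateUserExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/users"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"username\":\"alice\",\"password\":\"s3cret\"}"))
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // Success and failure are reported through standard HTTP status codes.
        switch (response.statusCode()) {
            case 201 -> System.out.println("Created: " + response.headers()
                    .firstValue("Location").orElse("(no Location header)"));
            case 400 -> System.out.println("Bad request: " + response.body());
            default  -> System.out.println("Unexpected status: " + response.statusCode());
        }
    }
}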
The most important constraint of REST is the hypermedia constraint ("hypertext as the engine of application state"). Think of your Web application as a state machine where each state can be requested by the client (e.g. GET /user/1). Once the client has one such state (think: a user looking at a Web page), it sees a bunch of links that it can follow to go to the next state in the application. For example, there might be a link from the 'user' state that the client can follow to go to a 'details' state.
This way, the server presents the client with the application's state machine one state at a time, at runtime. The clever thing: since the state machine is discovered at runtime, one state at a time, the server can dynamically change the state machine at runtime.
Having said that...
on 1. The resources essentially represent the application states you want to present to the client. They will often closely match domain objects (e.g. user), but make sure you understand that the representations you provide for them are not simply serialized domain objects but states of your Web application.
Thinking in terms of GET /users/123 is fine. Do NOT place any action inside a URI. Although not harmful (it is just an opaque string) it is confusing to say the least.
on 2. As Brian said. You might want to take a look at the Atom Publishing Protocol RFC (5023) because it explains create/read/update cycles pretty well.
on 3. Focus on document oriented messages. Media types are an essential part of REST because they provide the application semantics (completely). Do not use generic types such as application/xml or application/json as you'll couple your clients and servers around the often implicit schema. If nothing fits your needs, just make up your own type.
Maybe you are interested in an example I am hacking together using UBL: http://www.nordsc.com/blog/?cat=13
on 4. Normally, use POST /users/ for creation. Have a look at RFC 5023 - this will clarify that. It is an easy to understand spec.
on 5. Since you cannot use sessions (a stateful server) and still be RESTful, you have to send credentials in every request. Various HTTP auth schemes handle that already. This is also important with regard to caching, because the HTTP Authorization header has special, specified semantics for caches (no public caching). If you stuff your credentials into a cookie, you lose that important piece.
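For instance, a minimal sketch of sending credentials on every request with HTTP Basic authentication (the URI and credentials are made up):
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Base64;

public class BasicAuthExample {
    public static void main(String[] args) {
        // Basic auth: base64("username:password") in the Authorization header,
        // attached to each request instead of relying on a server-side session.
        String credentials = Base64.getEncoder()
                .encodeToString("alice:s3cret".getBytes());
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/users/123"))
                .header("Authorization", "Basic " + credentials)
                .GET()
                .build();
        System.out.println(request.headers().map());
    }
}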
All HTTP status codes have a certain application semantic. Use them, do not tunnel your own error semantics through HTTP.
You can come visit #rest IRC or join rest-discuss on Yahoo for detailed discussions.
Jan