Violate REST specs in favour of consistency

I am trying to develop a RESTful web service that will be used for entities like Users, Products, and the like.
To create a new user I want to use
[POST] site/user
as the REST specs say.
However, I also want to search for users. According to REST specs that would be
[GET] site/user?name=Shuaib&city=Dhaka
So far so good. But what if I want to send large JSON data as part of the search parameters? If I use GET in that case:
-> my URL will look clumsy
-> since there is a limit on GET request URL size, large JSON data might exceed it
Because of these problems, I want to use POST to search for users.
[POST] site/user
Is this a good development practice?

No. This is not a good development approach, especially if you are concerned with other developers using your API. If you can search with URI parameters, a developer might be surprised that they can submit a huge URI with lots of parameters and it still works, but there's no inconsistency or barrier to understanding in that.
On the other hand, if you route an operation that is standardized as GET through POST just because you don't want those huge URIs, then you have to document that, developers will have to be familiar with your decision, and this will be a barrier to understanding.
Keep in mind that the HTTP standards don't establish any limits on URI size, so URIs being huge shouldn't affect your API design decisions. Sure, almost 100% of the clients and servers are broken in some way and have some limit on URI size. If you actually hit that limit, the RESTful way to solve the problem is to use some workaround that's loosely coupled to your service and is explicitly documented as a workaround for a broken implementation. For instance, a pre-processor that rewrites POST requests with an X-HTTP-Method-Override: GET header into GET requests, as is done by the Google Translate API.
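A minimal sketch of such a pre-processor, written here as a Python WSGI middleware. The names MethodOverrideMiddleware, search_app and simulate are made up for illustration, and the sketch assumes the POST body is already form-encoded so that it can be moved into the query string unchanged. It only shows the idea of keeping the workaround outside the service itself.

```python
import io

ALLOWED_OVERRIDES = {"GET"}  # assumption: only tunnel safe GET requests


class MethodOverrideMiddleware:
    """Rewrite POST + X-HTTP-Method-Override: GET into a plain GET."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        override = environ.get("HTTP_X_HTTP_METHOD_OVERRIDE", "").upper()
        if environ.get("REQUEST_METHOD") == "POST" and override in ALLOWED_OVERRIDES:
            # Move the (assumed form-encoded) body into the query string.
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length).decode("utf-8")
            environ["REQUEST_METHOD"] = override
            environ["QUERY_STRING"] = body
            environ["CONTENT_LENGTH"] = "0"
            environ["wsgi.input"] = io.BytesIO(b"")
        return self.app(environ, start_response)


def search_app(environ, start_response):
    # Hypothetical downstream service: it just echoes what it received.
    start_response("200 OK", [("Content-Type", "text/plain")])
    text = f"{environ['REQUEST_METHOD']} {environ['PATH_INFO']}?{environ['QUERY_STRING']}"
    return [text.encode("utf-8")]


def simulate(app, method, path, body=b"", headers=None):
    # Build a bare-bones WSGI environ by hand and call the app directly.
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "QUERY_STRING": "",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    environ.update(headers or {})
    chunks = app(environ, lambda status, response_headers: None)
    return b"".join(chunks).decode("utf-8")


wrapped = MethodOverrideMiddleware(search_app)
print(simulate(wrapped, "POST", "/user", b"name=Shuaib&city=Dhaka",
               {"HTTP_X_HTTP_METHOD_OVERRIDE": "GET"}))
```

The service behind the middleware only ever sees a GET, so its own interface stays the standard one and the workaround can be removed without touching it.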

Related

Can GET, PUT and PATCH be replaced with POST HTTP method?

POST, PUT, PATCH and GET are all different, with idempotency and safety being the key differences.
While writing RESTful APIs, I encountered guidelines on when and where to use each of the HTTP methods. Since I am using Java for the back-end implementation, I can control the behavior of the HTTP methods on the persistent data.
For example, GET v1/book/{id} can be replaced with POST v1/book (with "id" in the body); with that id I can then query the DB and fetch that particular book (assuming a book with that id already exists).
Similarly, I can achieve the workings of PATCH and PUT with POST itself.
Now, coming to the question: why don't we just use POST instead of GET, PUT and PATCH almost every time, ALMOST, when we can control the idempotency and safety behavior in the back-end?
Or is it just a guideline mentioned in RESTful docs somewhere, or stated by Roy Fielding, that we are all blindly following? Even if these are only guidelines, what is the major idea behind them?
https://restfulapi.net/rest-put-vs-post/
https://restful-api-design.readthedocs.io/en/latest/methods.html
https://www.keycdn.com/support/put-vs-post
The above resources just mention either what each of the methods does or their differences. The articles describe the workings as if they were guidelines; none of the docs online speak about the reasons behind them.
None of them says what the side effects would be if I used POST instead of PUT, PATCH and GET (as I can control their behaviors in the back-end).
HTTP methods are designed so that each method holds some responsibility. I would say that the REST standards are conventions, not obligations. The conventions don't force us to follow the rules, but they are designed to make our code better. You can tweak things and use them your own way, but that would be a bad idea. In this case, if you perform all three actions with one method, it would create great confusion in the code (as the simple definition of POST is to create an object, and that is what everyone understands it to mean) and also degrade our coding standards.
I strongly discourage replacing three methods with one.
If you do that, you can't say you are "writing RESTFul APIs".
Whoever knows the RESTFul standard, will be confused about the behaviour of your apis.
If you fit the standard, then you will have an easier life.
After all, you have no real benefit in your approach.
HTTP is a transfer protocol which, as its name suggests, is responsible for transferring data such as files or DB entries across the wire to or from a remote system. In version 0.9 you basically only had the GET operation at your disposal; HTTP 1.0 added HEAD and POST, and HTTP 1.1 standardized most of the other operations in use today.
Each of these methods fulfills its own purpose. POST, for example, processes the payload according to the server's own semantics, whatever they may be. In theory it could therefore be used for retrieving, updating or removing content. To a client, though, it is basically unclear what a server actually does with the payload. There is no guarantee whether invoking a URI with that method is safe (that is, leaves the remote resource unaltered) or not. Think of a crawler that just invokes any URI it finds, where one of the links is an order link or a link that starts a payment process. Do you really want a crawler to trigger one of your processes? The spec is rather clear that if something like that happens, the client must not be made accountable for it. So, if a crawler ordered 10k products because following one of your links with a safe method such as GET triggered such a process, you can't claim a refund from the crawler's maintainer.
In addition to that, a response to a GET operation is cacheable by default. So if you invoke the same resource twice within a certain amount of time, chances are that the resource does not need to be fetched a second (third, ...) time, as it can be reused from the cache. This can reduce the load on the server quite significantly if used properly.
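To make the caching point concrete, here is a small sketch of a client-side cache that only ever reuses GET responses. CachingClient and the fetch callable are hypothetical names, and the fixed max_age is a simplifying assumption; a real HTTP cache would follow the Cache-Control and validation rules of the HTTP spec instead.

```python
import time


class CachingClient:
    """Reuses responses to GET requests; everything else goes to the server."""

    def __init__(self, fetch, max_age=60.0, clock=time.monotonic):
        self.fetch = fetch        # hypothetical callable: (method, url) -> body
        self.max_age = max_age    # assumption: one fixed freshness lifetime
        self.clock = clock
        self._cache = {}          # url -> (stored_at, body)

    def request(self, method, url):
        if method != "GET":
            # A non-GET request may change the resource, so drop any cached copy.
            self._cache.pop(url, None)
            return self.fetch(method, url)
        entry = self._cache.get(url)
        if entry is not None and self.clock() - entry[0] < self.max_age:
            return entry[1]       # fresh enough: no round trip to the server
        body = self.fetch(method, url)
        self._cache[url] = (self.clock(), body)
        return body
```

If the same lookup were tunnelled through POST, this client could not tell that the response is reusable and every call would hit the server.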
As you've mentioned Fielding and REST: REST is an architectural style which you should use if you have plenty of different clients connecting to your services that are furthermore not under your control. Plenty of so-called REST APIs don't adhere to REST, as they follow a simpler and more pragmatic RPC approach with external documentation such as Swagger and similar. REST's main focus is on decoupling clients from servers, which allows the latter to evolve freely without fear of breaking clients. Clients, on the other hand, become more robust to changes.
Fielding only added a few constraints a REST architecture has to adhere to. One of them is support for caching. Fielding later wrote a well-cited blog post in which he explains what API designers have to consider before calling their API REST. True decoupling can only occur if all of the constraints are followed strictly. A client that violates these premises won't benefit from REST at all.
The main premise in REST is (and should always be): the server teaches clients what they need, and clients only use what they are served. In the browsable Web, the big cousin of REST, a server teaches a client, for example, what data it expects via HTML Web forms, and links are annotated with link-relation names to give the browser some hints on when to invoke that URI. On a Web page a trash bin icon may indicate a deletion while a pencil icon may indicate an edit link. Such visual hints are also called affordances. Visual hints may not be applicable in machine-to-machine communication, though affordances may hint at other things there. Think of a stylesheet that is annotated with preload. In HTTP 2, for example, the server could attempt to push such a resource, or in HTTP 1.1 the browser could already load that stylesheet while the page is still being parsed to speed things up. In order to gain widespread knowledge of those meanings, such values should be standardized at IANA. Through custom extensions or certain microformats such as Dublin Core or the like you may add new relation names that are too specific for common cases but are common to the domain itself.
The same holds true for the media types client and server negotiate. A generally applicable media type will probably reach wider acceptance than a tailor-made one that is only usable by a single company. The main aim here is to reach a point where the same media type can be reused for various areas and APIs. REST's vision is to have a minimal number of clients that are able to interact with a plethora of servers or APIs, similar to a browser that is able to interact with almost all Web sites.
Your ultimate goal in REST is that a user follows an interaction protocol you've set up, which could be something like an order process or a text game or whatnot. By being given choices, a client progresses through a certain process which can easily be depicted as a state machine. It follows a kind of application-driven protocol by following URIs that caught the client's attention and by returning data that was taught through a form-like representation. As, more or less, only standardized representation formats should be used, no out-of-band information on how to interact with the API is necessary.
In reality though, plenty of enterprises don't really care about long-lasting APIs that are free to evolve over the years, but about short-term success stories. They usually also don't care that much whether they use the proper HTTP operations at all or stay within the bounds of the HTTP spec (e.g. sending payloads with HTTP GET requests). Their primary intent is to get the job done. Therefore pragmatism usually wins over design, and as such plenty of developers follow the path of short-term success and have to adapt their work later on, which is often cumbersome as the API is now the driving factor of their business and therefore can't be changed easily without revamping the whole design.
... why don't we just use POST instead of GET, PUT and PATCH almost every time, ALMOST, when we can control the idempotency and safety behavior in the back-end?
A server may know that a request is idempotent, but the client does not. Properties such as safety and idempotency are promises to the client. Whether the server satisfies these or not is a different story. How should a client know whether a sent payment request reached the server and the response just got lost, or the initial request didn't make it to the server at all because of a temporary connection issue? A PUT request does guarantee idempotency. For example, you don't want to order the same things twice if you resubmit the same request after a network issue. While the same request could also be sent via POST, with the server being smart enough not to process it again, the client doesn't know the server's behavior unless it is externally documented somewhere, which again violates REST principles. So, to state it differently, such properties are promises to the client more than to the server.
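The retry scenario can be sketched with a toy in-memory server. OrderStore and retry_after_lost_response are hypothetical names, and the POST handler shown is the naive one that creates a new order on every call; a real server could deduplicate, but the client would have no way of knowing that from the method alone.

```python
import itertools


class OrderStore:
    """Toy in-memory stand-in for a server that stores orders."""

    def __init__(self):
        self.orders = {}
        self._ids = itertools.count(1)

    def post(self, payload):
        # POST /orders: the server picks the id, so each call creates a new order.
        order_id = str(next(self._ids))
        self.orders[order_id] = dict(payload)
        return order_id

    def put(self, order_id, payload):
        # PUT /orders/{id}: the client names the resource, so a repeat
        # just replaces the same order with the same content.
        self.orders[order_id] = dict(payload)
        return order_id


def retry_after_lost_response(send):
    """Simulate a client that never saw the first response and resends."""
    send()         # the request reaches the server, but the response is lost
    return send()  # the client retries the identical request


post_store = OrderStore()
retry_after_lost_response(lambda: post_store.post({"item": "book"}))

put_store = OrderStore()
retry_after_lost_response(lambda: put_store.put("client-chosen-id", {"item": "book"}))

print(len(post_store.orders), len(put_store.orders))  # prints "2 1"
```

With PUT the client can retry blindly because the method itself carries the promise; with POST it would first need out-of-band knowledge about how this particular server behaves.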

Does REST only cover CRUD?

I'm writing an AngularJS application that's communicating with an API, and right now that API is following the REST architecture.
I know the basics of REST, but I've still not understood if REST only covers the CRUD operations? For example, if I'm building a community website and I want to make it possible for people to add each other as friends, is this covered by REST in any way? What about search queries? If not, is there any other architecture that's recommended to follow, or should I roll my own?
Also, should I even be using REST for a community website? There are a lot of cases where it seems like it's not the optimal design, but when I google around I only get results saying that REST is the best practice. For example PUT /api/user/:id wouldn't be very useful, since the only user you're able to update (unless you're an admin) is yourself.
It all depends. REST is just an architectural style and (in many forms, unfortunately) is used all over the world. I also follow REST rules in all types of applications but try to stay at the second level of Richardson's Maturity Model. Why? Because I consider HAL, HATEOAS and all the API discoverability to be unnecessary buzz - unfortunately, documentation is still very important.
What you need to consider while designing an API is whether it's going to be public or not. If it's not, you can probably do whatever you want/need (of course this is not a good idea). If it is going to be public, consistency starts to play a great role - the API needs to be designed in such a way that it is both intuitive and easy to use. E.g. it is not a good idea to introduce a new endpoint every time you need a new operation - thus following CRUD REST rules seems to be a reasonable option. When it comes to going beyond CRUD - yes, I've created APIs with verbs in endpoints - but it was almost always the last resort, and to be honest I don't feel guilty.
I think the question is a bit too broad, but I'll try to answer.
REST only covers the CRUD operations?
No, it covers other operations as well. You have to transform your operation into an HTTP method and a resource. The resource can have identifiers: URIs. A URI together with an HTTP method composes a hyperlink. This hyperlink can be followed by the client. You can attach the operation name, etc. to the hyperlink as metadata, so the client can use it to recognize the operation. At least that's how it should work.
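As a rough illustration of an operation expressed as a hyperlink, the sketch below models a user representation that carries its own links. The JSON layout, the "add-friend" relation name and the URIs are all invented for this example; they are not taken from any standard or from a particular API.

```python
# Hypothetical representation returned by the server for one user.
user_representation = {
    "name": "alice",
    "links": [
        {"rel": "self", "method": "GET", "href": "/api/users/17"},
        {"rel": "add-friend", "method": "POST", "href": "/api/users/17/friends"},
    ],
}


def find_link(representation, rel):
    """Return the first link whose relation name matches, or None."""
    for link in representation.get("links", []):
        if link["rel"] == rel:
            return link
    return None


link = find_link(user_representation, "add-friend")
if link is not None:
    # The client learns both the method and the URI from the response itself.
    print(link["method"], link["href"])
```

The client only needs to know the relation name; the method and the URI behind it can change on the server without breaking the client.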
What about search queries?
General queries are not currently supported, because there is no standard RDF vocab which could be used to describe a general query. There are non-standard workarounds; you can use them or, for example, a SPARQL endpoint. More fixed queries can be used with URI templates.
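For the fixed-query case, a URI template lets the server tell the client which parameters a search accepts. The function below is a deliberately tiny sketch that handles only two expression forms from the URI Template syntax, {var} and {?a,b}; the template strings and the function name expand are assumptions made for this example, and a real client would use a complete implementation.

```python
import re
from urllib.parse import quote


def expand(template, variables):
    """Expand {var} and {?a,b} expressions; undefined variables are skipped."""

    def replace(match):
        expression = match.group(1)
        if expression.startswith("?"):
            # Form-style query expansion: {?name,city} -> ?name=..&city=..
            names = expression[1:].split(",")
            pairs = [
                f"{name}={quote(str(variables[name]), safe='')}"
                for name in names
                if name in variables
            ]
            return "?" + "&".join(pairs) if pairs else ""
        value = variables.get(expression)
        return "" if value is None else quote(str(value), safe="")

    return re.sub(r"\{([^}]+)\}", replace, template)


print(expand("/user{?name,city}", {"name": "Shuaib", "city": "Dhaka"}))
```

The client fills in values but never has to know how the server lays out its query strings, since the template arrives as part of a response.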
Also, should I even be using REST for a community website?
As far as I know, Facebook uses it for 3rd party clients, so you can develop a Facebook application using their REST API. Another advantage is that it scales better than SOAP. If you don't need these features currently, then you can use something else you are more familiar with.

Can I have a REST element URI without a collection URI?

A basic REST question. I am designing a REST API and would like to be able to get a list of book recommendations based on a book id (i.e. the client sends book id=w to the server and the server replies with a list of recommended books, id=x,y,z).
I see two ways to do this:
1. /recommendation?bookId=thetitle
2. /recommendation/thetitle
Option 2 seems a bit cleaner to me but I'm not sure if it would be considered good REST design? Because /recommendation/thetitle looks like an element URI, not a collection URI (although in this case it would return a collection). Also, the first part of the resource (/recommendation) would not make any sense by itself.
Thankful for any advice.
URL patterns of this kind have nothing to do with REST. None of the defining properties of REST requires readable URLs.
At the same time, one of the core principles (HATEOAS), if followed properly, allows API clients (applications, not people!) to browse the API and obtain every link required to perform a desired transition of application state or resource state based on a well known message format.
If you feel your API must have readable URLs, it's a good sign that its design probably isn't RESTful at all. This implies the need for a developer to understand the URL structure and hardcode it somewhere in a client application. Something that REST is supposed to avoid by principle.
To quote Roy Fielding's blog post on the subject:
A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC’s functional coupling].
Obviously, nothing stops you from actually making URLs meaningful regardless of how RESTful your API actually is. Even if it's for a purpose not dictated by REST itself (viewing the logs left by a client of a properly RESTful API could be easier for a human if they're readable, off the top of my head).
Finally, if you're fine with developing a Web API that's not completely RESTful and you expect developers of clients to read this kind of docs and care about path building, you might actually benefit from comprehensible URLs. This can be very useful in APIs of the so-called levels 0-3, according to Richardson's maturity model.
What's important in terms of REST is how you're leveraging the underlying protocol (HTTP in this case) and what it allows you to do. If we consider your examples from this perspective, /recommendation/thetitle seems preferable. This is because the use of query parameters may prevent responses from being cached by browsers (important if you're writing a JS client) or proxies, making it harder to reuse existing tools and infrastructure.

Should a Netflix or Twitter-style web service use REST or SOAP? [closed]

I've implemented two REST services: Twitter and Netflix. Both times, I struggled to find the use and logic involved in the decision to expose these services as REST instead of SOAP. I hope somebody can clue me in to what I'm missing and explain why REST was used as the service implementation for services such as these.
Implementing a REST service takes infinitely longer than implementing a SOAP service. Tools exist for all modern languages/frameworks/platforms to read in a WSDL and output proxy classes and clients. Implementing a REST service is done by hand and - get this - by reading documentation. Furthermore, while implementing these two services, you have to make "guesses" as to what will come back across the pipe as there is no real schema or reference document.
Why write a REST service that returns XML anyway? The only difference is that with REST you don't know the types each element/attribute represents - you are on your own to implement it and hope that one day a string doesn't come across in a field you thought was always an int. SOAP defines the data structure using the WSDL so this is a no-brainer.
I've heard the complaint that with SOAP you have the "overhead" of the SOAP Envelope. In this day and age, do we really need to worry about a handful of bytes?
I've heard the argument that with REST you can just pop the URL into the browser and see the data. Sure, if your REST service is using simple or no authentication. The Netflix service, for instance, uses OAuth which requires you to sign things and encode things before you can even submit your request.
Why do we need a "readable" URL for each resource? If we were using a tool to implement the service, do we really care about the actual URL?
A canary in a coal mine.
I have been waiting for a question like this for close to a year now. It was inevitable that this day would come and I am sure we are going to see many more questions like this in the coming months.
The warning signs
You are absolutely correct, it does take longer to build RESTful clients than SOAP clients. The SOAP toolkits take away lots of boilerplate code and make client proxy objects available with almost no effort. With a tool like Visual Studio and a server URL I can be accessing remote objects of arbitrary complexity, locally in under five minutes.
Services that return application/xml and application/json are so annoying for client developers. What are we supposed to do with that blob of data?
Fortunately, lots of sites that provide REST services also provide a bunch of client libraries so that we can use those libraries to get access to a bunch of strongly typed objects. Seems kind of dumb though. If they had used SOAP we could have code-gen’d those proxy classes ourselves.
SOAP overhead, ha. It’s latency that kills. If people are really concerned about the number of excess bytes going across the wire then maybe HTTP is not the right choice. Have you seen how many bytes are used by the user-agent header?
Yeah, have you ever tried using a web browser as a debugging tool for anything other than HTML and JavaScript? Trust me it sucks. You can only use two of the verbs, the caching is constantly getting in the way, the error handling swallows so much information, it’s constantly looking for a goddamn favicon.ico. Just shoot me.
Readable URL. Only nouns, no verbs. Yeah, that’s easy as long as we are only doing CRUD operations and we only need to access a hierarchy of objects in one way. Unfortunately most applications need a wee bit more functionality than that.
The impending disaster
There are a metric boatload of developers currently developing applications that integrate with REST services who are in the process of coming to the same set of conclusions that you have. They were promised simplicity, flexibility, scalability, evolvability and the holy grail of serendipitous reuse. The characteristics of the web itself, how can things go wrong?
However, they are finding that versioning is just as much of a problem, but the compiler doesn’t help detect issues. The hand written client code is a pain to maintain as the data structures evolve and URLs get refactored. Designing APIs around just nouns and four verbs can be really hard, especially with RESTful Url zealots telling you when you can and cannot use query strings.
Developers are going to start asking why we are wasting our effort on supporting both JSON and XML formats. Why not just focus our efforts on one and do it well?
How did things go so wrong
I’ll tell you what went wrong. We as developers let the marketing departments take advantage of our primary weakness. Our eternal search for the silver bullet blinded us to the reality of what REST really is. On the surface REST seems so easy and simple. Name your resources with Urls and use GET, PUT, POST and DELETE. Hell, us devs already know how to do that, we have been dealing with databases for years that have tables and columns and SQL statements that have SELECT, INSERT, UPDATE and DELETE. It should have been a piece of cake.
There are other parts of REST that some people discuss, such as self-descriptiveness and the hypermedia constraint, but these constraints are not as simple as resource identification and the uniform interface. They seem to add complexity where the desired goal is simplicity.
This watered down version of REST became validated in developer culture in many ways. Server frameworks were created that encouraged Resource Identification and the uniform interface, but did nothing to support the other constraints. Terms started to float around differentiating the approaches, (HI-REST vs LO-REST, Corporate REST vs Academic REST, REST vs RESTful).
A few people scream out that if you don’t apply all of the constraints it’s not REST. You will not get the benefits. There is no half REST. But those voices were labelled as religious zealots who were upset that their precious term had been stolen from obscurity and made mainstream. Jealous people who try to make REST sound more difficult than it is.
REST, the term, has definitely become mainstream. Almost every major web property that has an API supports "REST". Twitter and Netflix are two very high profile ones. The scary thing is that I can only think of one public API that is self-descriptive and there are a handful that truly implement the hypermedia constraint. Sure some sites like StackOverflow and Gowalla support links in their responses, but there are huge gaping holes in their links. The StackOverflow API has no root page. Imagine how successful the web site would have been if there was no home page for the web site!
You were misled I’m afraid
If you have made it this far, the short answer to your question is those APIs (Netflix and Twitter) do not conform to all of the constraints and therefore you will not get the benefits that REST apis are supposed to bring.
REST clients do take longer to build than SOAP clients but they are not tied to one specific service, so you should be able to re-use them across services. Take the classic example, of a web browser. How many services can a web browser access? What about a Feed Reader? Now how many different services can the average Twitter client access? Yes, just one.
REST clients are not supposed to be built to interface with a single service, they are supposed to be built to handle specific media types that could be served by any service. The obvious question to that is, how can you build a REST client for a service that delivers application/json or application/xml? Well you can’t. That’s because those formats are completely useless to a REST client. You said it yourself,
you have to make "guesses" as to what will come back across the pipe as there is no real schema or reference document
You are absolutely correct for services like Twitter. However, the self-descriptive constraint in REST says that the HTTP content type header should describe exactly the content that is being transmitted across the wire. Delivering application/json and application/xml tells you nothing about the content.
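The difference a specific media type makes can be sketched as a client that picks its parsing logic from the Content-Type header alone. The media type application/vnd.example.tweet+json, the handler registry and the field names are all hypothetical; the point is only that a generic application/json label gives the client nothing to dispatch on.

```python
import json

HANDLERS = {}  # media type -> function that understands that format


def handles(media_type):
    """Register a handler for one specific media type."""
    def register(func):
        HANDLERS[media_type] = func
        return func
    return register


@handles("application/vnd.example.tweet+json")  # hypothetical media type
def read_tweet(body):
    doc = json.loads(body)
    return f"{doc['author']}: {doc['text']}"


def process(content_type, body):
    # Strip parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(";")[0].strip().lower()
    handler = HANDLERS.get(media_type)
    if handler is None:
        # Generic types say how to parse the bytes, not what they mean.
        raise LookupError(f"no handler for {media_type}")
    return handler(body)
```

A handler written for one media type works against any service that serves it, which is the reuse the answer is describing.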
When it comes to considering the performance of REST based systems it is necessary to look at the bigger picture. Talking about envelope bytes is like talking about loop unwinding when comparing a quick-sort to a shell-sort. There are scenarios where SOAP can perform better, and there are scenarios where REST can perform better. Context is everything.
REST gains much of its performance advantage by being very flexible about what media types it supports and by having sophisticated support for caching. For caching to work well though nearly all of the constraints must be adhered to.
Your last point about readable urls is by far the most ironic. If you truly commit to the hypermedia constraint, then every URL could be a GUID and the client developer would lose nothing in readability.
The fact that URIs should be opaque to the client is one of the most key things when developing REST systems. Readable URLs are convenient for the server developer and well structured URLs make it easier for the server framework to dispatch requests, but those are implementation details that should have no impact on the developers consuming the API.
The Twitter API is not even close to being RESTful and that is why you are unable to see any benefit to using it over SOAP. The Netflix API is much closer but its use of generic media types demonstrates that failing to adhere to even a single constraint can have a profound impact on the benefits derived from the service.
It may not be all their fault
I’ve done a whole lot of dumping on the service providers, but it takes two to dance RESTfully. A service may follow all of the constraints religiously and a client can still easily undo all of the benefits.
If a client hard codes URLs to access certain types of resources then it is preventing the server from changing those URLs. Any kind of URL construction based on implicit knowledge of how the service structures its URLs is a violation.
Making assumptions about what type of representation will be returned from a link can lead to problems. Making assumptions about the content of the representation based on knowledge that is not explicitly stated in the HTTP headers is definitely going to create coupling that will cause pain in the future.
Should they have used SOAP?
Personally, I don’t think so. REST done right allows a distributed system to evolve over the long term. If you are building distributed systems that have components that are developed by different people and need to last for many years, then REST is a pretty good option.
SOAP is an object-oriented, remote procedure call technology stack. It works by building a new abstraction on top of an existing protocol (HTTP).
REST is a document oriented approach, that simply uses the features of an existing protocol (HTTP). "REST" is just a buzzword -- the concept is this: Just use the web the way it was designed to work!
In response to edits to question:
"Implementing a REST service takes infinitely longer than implementing a SOAP service."
Um, no, it can't be infinitely longer. And in cases where what you are trying to retrieve is already a document or file, it's actually much faster. For example, the OGC spec for WMS (Web Mapping Service) defines both a SOAP and REST version of the protocol, and there's a reason why almost nobody implements the SOAP version -- it's because if you're trying to get a map, it's a lot easier to just build a URL and fetch image bytes from that URL than it is to bother with encapsulating it into a SOAP message. But yes, I will agree that if the point of the web service is to transfer some strongly-typed object in a domain object model, SOAP is better suited for that use.
"Why write a REST service that returns XML anyway?"
Well, yes, that can be silly. But it depends on what the XML is. If there's a clearly defined schema for it somewhere, then there's no ambiguity. For example, you can think of WSDL URLs as being a kind of RESTful web service for retrieving information about a web service. In this case, adding the overhead of another SOAP request would be pointless.
In general, REST wins when the content that is being transferred can be thought of as a file, as a single unit. SOAP wins when the content needs to be treated as an object with members.
"I've heard the complaint that with SOAP you have the "overhead" of the SOAP Envelope. In this day and age, do we really need to worry about a handful of bytes?"
Yes. Not in every circumstance, but there are sites with a great deal of traffic where it makes a difference. Is it enough of a difference to outweigh the semantic differences of using SOAP instead of REST? I doubt it. If you're doing an object remoting protocol and the number of bytes is making a difference, SOAP is probably not the tool for you anyway -- maybe you should be using CORBA or DCOM instead.
"I've heard the argument that with REST you can just pop the URL into the browser and see the data."
Yes, and this is a large argument in favor of REST if it makes sense to view the data in a browser. For example, with image data, it's an easy way to debug the service -- just paste the URL into your browser's address bar and see what the image looks like. Or if the data returned is in XML, and you have a referenced XML stylesheet that renders into readable HTML in the browser, then you get the benefit of semantic markup and easy visualization all in one package. But you are correct, this benefit mostly evaporates when working with more complex authentication schemes. If you can't encode all your authentication information into each HTTP request, then I would argue that it doesn't count as REST at all.
"Why do we need a "readable" URL for each resource? If we were using a tool to implement the service, do we really care about the actual URL?"
Well, it depends. Why do we need readable URLs for any resource on the web? You can read Tim Berners-Lee's essay Cool URIs Don't Change for the rationale, but basically, as long as the resource may still be useful in the future, the URI for that resource should stay the same.
Obviously, for transient resources (like the "today's Money" link in the essay) there is no need for it, since the need to reference the resource goes away if the corresponding resource goes away. But for more permanent resources (like StackOverflow questions, for example, or movies on IMDB), you want to have a URL that will work forever. When you're designing a web service, you need to decide if the resources themselves could outlive your service, and if so, then REST is probably the right way to go.
For the record, yes, I've been developing web pages since well before Netflix or Twitter existed. And no, I've not yet had any need or opportunity to implement a client to either Netflix's or Twitter's services. But even if their services are atrociously difficult to work with, that doesn't mean the technology they implemented their services on top of is bad -- only that those two implementations are bad.
To make a long story short: REST and SOAP are just tools. They each have strengths and weaknesses. If the only tool you have is a hammer, then every problem looks like a nail. So get to know both tools, and learn how to use them correctly, and then choose the right tool for each job.
An honest question deserves an honest answer. But first, why did you use the text of this question as an answer to another question if you did not think it was rhetorical in nature?
Anyway:
"Tools exist for all modern languages/frameworks/platforms to read in a WSDL and output proxy classes and clients. Implementing a REST service is done by hand by reading documentation."
Just like browser vendors have read and re-read the HTML 4.01 specification up and down to try to implement a consistent browsing experience. Have you reflected on the fact that browsers were invented long before internet banking and Stack Overflow, and yet you can use a browser to do just those things? This is possible only because everybody agrees to use HTML (and related formats like CSS, JS, JPEG etc.).
Blogging is actually not that new, and someone came up with AtomPub, which allows any blogging software to access and update posts in a blog, much like any web browser can access any web page. That's pretty neat, and works because of the RESTful constraints imposed by the protocol.
But for Twitter and Netflix, there is no universal agreement that "all microblogs in existence shall use the media type application/tweet", mainly because microblogging is so new. Maybe in a few years' time a few microblogging services will settle on the same API so that Twitter, Facebook and Identica can interoperate. None of their existing APIs are anywhere near RESTful, however much they claim to be, so I don't expect it to happen real soon.
"Furthermore, while implementing these two services, you have to make "guesses" as to what will come back across the pipe as there is no real schema or reference document."
You've hit the nail on the head. REST is all about distributed and hypermedia, and that pretty much sums it up. A browser looks at what it gets from a request and shows it to the user. An HTML page usually spawns a lot more GET requests, for example for CSS, scripts and images. An image is typically only rendered to the screen, JavaScript is executed, and so on. Each time, the browser does what it does because it found the link in an <img> or <link> tag and the response media type was image/jpeg or text/css.
If Twitter makes a hypermedia based API, it will probably always return an application/tweet every time you follow a link to a tweet, but the client should never assume it, and always check what it gets before acting on it.
"Why write a REST service that returns XML anyway?"
This all boils down to media types. With HTML, if you see an element whose meaning you don't know, the spec instructs you to ignore it and process the "body" of the tag if it has one. Likewise, the Atom spec instructs you to ignore unknown elements and foreign markup (from different namespaces) and not process the body (IIRC).
Designing media types for generic problem domains (as in the HTML media type for the rich text problem domain) is very hard. Making media types for very narrow problem domains is probably a lot easier (like a tweet). But it's always a good idea to design for extensibility and specify how clients (and servers) are supposed to react when they see elements or data items that don't match the spec. JPEG, for example has an Application-specific record type (e.g. APP1) which is used to contain all sorts of meta data.
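To illustrate the "ignore what you don't understand" rule, here is a minimal sketch of a tolerant reader. The `<tweet>` format, its element names, and the `read_tweet` helper are all made up for the example; no real media type is implied.

```python
import xml.etree.ElementTree as ET

# Elements this (hypothetical) client knows how to handle.
KNOWN_ELEMENTS = {"author", "text"}

def read_tweet(xml_text):
    """Parse a made-up <tweet> document, silently skipping unknown elements."""
    root = ET.fromstring(xml_text)
    tweet = {}
    for child in root:
        if child.tag in KNOWN_ELEMENTS:
            tweet[child.tag] = (child.text or "").strip()
        # Anything else is ignored instead of being treated as an error,
        # so the format can grow without breaking older clients.
    return tweet

doc = "<tweet><author>alice</author><text>hello</text><mood>happy</mood></tweet>"
print(read_tweet(doc))
```

An older client written this way keeps working when the server starts sending `<mood>`, which is the kind of extensibility the spec has to promise up front.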
"I've heard the complaint that with SOAP you have the "overhead" of the SOAP Envelope. In this day and age, do we really need to worry about a handful of bytes?"
No, we don't. REST is absolutely not about being efficient over the wire; it actually trades wire efficiency away. REST's efficiency comes from the caching made possible by all the other constraints. Fielding's dissertation notes: "The trade-off, though, is that a uniform interface degrades efficiency, since information is transferred in a standardized form rather than one which is specific to an application's needs. The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction." I don't think that the SOAP Envelope byte count overhead is a valid concern.
"I've heard the argument that with REST you can just pop the URL into the browser and see the data."
Yes, that's also an invalid argument. It doesn't work that way. Even if it did work, most narrow REST APIs out there use media types that browsers have no idea about and it still won't work.
But there are a lot more possibilities than a browser to test an HTTP-based API, like command line utilities or browser extensions that allow you to control almost any aspect of an HTTP request, inspect response headers and discover links for you to follow. But even so, this is nowhere near as easy as generating WSDL stubs and making a three-line program to call the function anyway.
"Why do we need a "readable" URL for each resource? If we were using a tool to implement the service, do we really care about the actual URL?"
If you look at how the web works, I'm pretty sure that humans are by and large glad that the URI for a wikipedia page looks like this, http://en.wikipedia.org/wiki/Stack_overflow instead of http://en.wikipedia.org/wiki/?oldid=376349090. But it actually is not important to REST. The important thing to try to get right is to choose to place relevant data in the URI that is not likely to change. You might think that the database ID will never change, but what happens when two data sets need to be merged? All your primary keys change. The page title (Stack_overflow) will not change.
Sorry for the long response, but I believe this question is valid, and hasn't been addressed before here on SO. I'm sure Darrel Miller will add his answer once he's back too.
Martin Fowler has a post on the Richardson Maturity Model which does a great job explaining the difference between SOAP and REST.
WSDL and other document level protocols are redundant. The HTTP protocol supports a much richer set of operations besides just serving documents and submitting forms.
Supporters of REST are uncomfortable with that redundancy.

What is the advantage of using REST instead of non-REST HTTP?

Apparently, REST is just a set of conventions about how to use HTTP. I wonder which advantage these conventions provide. Does anyone know?
I don't think you will get a good answer to this, partly because nobody really agrees on what REST is. The wikipedia page is heavy on buzzwords and light on explanation. The discussion page is worth a skim just to see how much people disagree on this. As far as I can tell however, REST means this:
Instead of having randomly named setter and getter URLs and using GET for all the getters and POST for all the setters, we try to have the URLs identify resources, and then use the HTTP actions GET, POST, PUT and DELETE to do stuff to them. So instead of
GET /get_article?id=1
POST /delete_article id=1
You would do
GET /articles/1/
DELETE /articles/1/
And then POST and PUT correspond to "create" and "update" operations (but nobody agrees which way round).
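As a rough sketch of what "URLs identify resources, verbs act on them" looks like in code, here is a tiny in-memory dispatcher. Everything in it (the `articles` store, the `handle` function, the status codes chosen) is an assumption for illustration, and it arbitrarily picks POST for create and PUT for update.

```python
# In-memory stand-in for a data store; the contents are made up.
articles = {1: {"title": "Hello"}}

def handle(method, path, body=None):
    """Dispatch on (HTTP method, resource path); returns (status, payload)."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "articles" or len(parts) > 2:
        return 404, None

    if len(parts) == 1:                      # the collection: /articles/
        if method == "GET":
            return 200, sorted(articles)     # list of article IDs
        if method == "POST":                 # assumption: POST creates
            new_id = max(articles, default=0) + 1
            articles[new_id] = body
            return 201, {"id": new_id}
        return 405, None

    try:                                     # a single item: /articles/<id>/
        article_id = int(parts[1])
    except ValueError:
        return 404, None
    if article_id not in articles:
        return 404, None
    if method == "GET":
        return 200, articles[article_id]
    if method == "PUT":                      # assumption: PUT updates
        articles[article_id] = body
        return 200, body
    if method == "DELETE":
        del articles[article_id]
        return 204, None
    return 405, None

print(handle("GET", "/articles/1/"))
```

The same URL serves reading, updating and deleting; only the verb changes.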
I think the caching arguments are wrong, because query strings are generally cached, and besides you don't really need to use them. For example, Django makes something like this very easy, and I wouldn't say it was REST:
GET /get_article/1/
POST /delete_article/ id=1
Or even just include the verb in the URL:
GET /read/article/1/
POST /delete/article/1/
POST /update/article/1/
POST /create/article/
In that case GET means something without side-effects, and POST means something that changes data on the server. I think this is perhaps a bit clearer and easier, especially as you can avoid the whole PUT-vs-POST thing. Plus you can add more verbs if you want to, so you aren't artificially bound to what HTTP offers. For example:
POST /hide/article/1/
POST /show/article/1/
(Or whatever, it's hard to think of examples until they happen!)
So in conclusion, there are only two advantages I can see:
Your web API may be cleaner and easier to understand / discover.
When synchronising data with a website, it is probably easier to use REST because you can just say synchronize("/articles/1/") or whatever. This depends heavily on your code.
However I think there are some pretty big disadvantages:
Not all actions easily map to CRUD (create, read/retrieve, update, delete). You may not even be dealing with object type resources.
It's extra effort for dubious benefits.
Confusion as to which way round PUT and POST are. In English they mean similar things ("I'm going to put/post a notice on the wall.").
So in conclusion I would say: unless you really want to go to the extra effort, or your service maps really well to CRUD operations, save REST for the second version of your API.
I just came across another problem with REST: it's not easy to do more than one thing in one request or to specify which parts of a compound object you want to get. This is especially important on mobile, where round-trip time can be significant and connections are unreliable. For example, suppose you are getting posts on a Facebook timeline. The "pure" REST way would be something like
GET /timeline_posts // Returns a list of post IDs.
GET /timeline_posts/1/ // Returns a list of message IDs in the post.
GET /timeline_posts/2/
GET /timeline_posts/3/
GET /message/10/
GET /message/11/
....
Which is kind of ridiculous. Facebook's API is pretty great IMO, so let's see what they do:
By default, most object properties are returned when you make a query.
You can choose the fields (or connections) you want returned with the
"fields" query parameter. For example, this URL will only return the
id, name, and picture of Ben:
https://graph.facebook.com/bgolub?fields=id,name,picture
I have no idea how you'd do something like that with REST, and if you did whether it would still count as REST. I would certainly ignore anyone who tries to tell you that you shouldn't do that though (especially if the reason is "because it isn't REST")!
Simply put, REST means using HTTP the way it's meant to be.
Have a look at Roy Fielding's dissertation about REST. I think that every person that is doing web development should read it.
As a note, Roy Fielding is one of the key drivers behind the HTTP protocol, as well.
To name some of the advantages:
Simple.
You can make good use of HTTP cache and proxy server to help you handle high load.
It helps you organize even a very complex application into simple resources.
It makes it easy for new clients to use your application, even if you haven't designed it specifically for them (probably, because they weren't around when you created your app).
Simply put: NONE.
Feel free to downvote, but I still think there are no real benefits over non-REST HTTP. All current answers are invalid. Arguments from the currently most voted answer:
Simple.
You can make good use of HTTP cache and proxy server to help you handle high load.
It helps you organize even a very complex application into simple resources.
It makes it easy for new clients to use your application, even if you haven't designed it specifically for them (probably, because they weren't around when you created your app).
1. Simple
With REST you need an additional communication layer for your server-side and client-side scripts => it's actually more complicated than using non-REST HTTP.
2. Caching
Caching can be controlled by HTTP headers sent by server. REST does not add any features missing in non-REST.
3. Organization
REST does not help you organize things. It forces you to use API supported by server-side library you are using. You can organize your application the same way (or better) when you are using non-REST approach. E.g. see Model-View-Controller or MVC routing.
4. Easy to use/implement
Not true at all. It all depends on how well you organize and document your application. REST will not magically make your application better.
IMHO the biggest advantage that REST enables is that of reducing client/server coupling. It is much easier to evolve a REST interface over time without breaking existing clients.
Discoverability
Each resource has references to other resources, either in hierarchy or links, so it's easy to browse around. This is an advantage to the human developing the client, saving them from constantly consulting the docs, and offering suggestions. It also means the server can change resource names unilaterally (as long as the client software doesn't hardcode the URLs).
Compatibility with other tools
You can curl your way into any part of the API or use the web browser to navigate resources. This makes debugging and testing integration much easier.
Standardized Verb Names
Allows you to specify actions without having to hunt the correct wording. Imagine if OOP getters and setters weren't standardized, and some people used retrieve and define instead. You would have to memorize the correct verb for each individual access point. Knowing there's only a handful of verbs available counters that problem.
Standardized Status
If you GET a resource that doesn't exist, you can be sure to get a 404 error in a RESTful API. Contrast it with a non-RESTful API, which may return {error: "Not found"} wrapped in God knows how many layers. If you need the extra space to write a message to the developer on the other side, you can always use the body of the response.
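A small sketch of why this matters to client code, under made-up names (`REVIEWS`, `get_review` and `fetch_or_none` are hypothetical): the client branches on the status code alone and only looks at the body for the optional human-readable message.

```python
from http import HTTPStatus

# Made-up data standing in for the server's storage.
REVIEWS = {10: "5 stars"}

def get_review(review_id):
    """Server side: the status code carries the outcome, the body the details."""
    if review_id not in REVIEWS:
        return HTTPStatus.NOT_FOUND, {"message": "No review with that ID"}
    return HTTPStatus.OK, {"review": REVIEWS[review_id]}

def fetch_or_none(review_id):
    """Client side: no need to dig through the body to detect a missing resource."""
    status, body = get_review(review_id)
    if status == HTTPStatus.NOT_FOUND:
        return None
    return body["review"]

print(fetch_or_none(10))
print(fetch_or_none(99))
```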
Example
Imagine two APIs with the same functionality, one following REST and the other not. Now imagine the following clients for those APIs:
RESTful:
GET /products/1052/reviews
POST /products/1052/reviews "5 stars"
DELETE /products/1052/reviews/10
GET /products/1052/reviews/10
HTTP:
GET /reviews?product_id=1052
POST /post_review?product_id=1052 "5 stars"
POST /remove_review?product_id=1052&review_id=10
GET /reviews?product_id=1052&review=10
Now think of the following questions:
If the first call of each client worked, how sure can you be the rest will work too?
There was a major update to the API that may or may not have changed those access points. How much of the docs will you have to re-read?
Can you predict the return of the last query?
You have to edit the review posted (before deleting it). Can you do so without checking the docs?
I recommend taking a look at Ryan Tomayko's How I Explained REST to My Wife
Third party edit
Excerpt from the Wayback Machine link:
How about an example. You’re a teacher and want to manage students:
what classes they’re in,
what grades they’re getting,
emergency contacts,
information about the books you teach out of, etc.
If the systems are web-based, then there’s probably a URL for each of the nouns involved here: student, teacher, class, book, room, etc. ... If there were a machine readable representation for each URL, then it would be trivial to latch new tools onto the system because all of that information would be consumable in a standard way. ... you could build a country-wide system that was able to talk to each of the individual school systems to collect testing scores.
Each of the systems would get information from each other using a simple HTTP GET. If one system needs to add something to another system, it would use an HTTP POST. If a system wants to update something in another system, it uses an HTTP PUT. The only thing left to figure out is what the data should look like.
I would suggest that everybody looking for an answer to this question go through this "slideshow".
I couldn't understand what REST is, why it is so cool, its pros and cons, or how it differs from SOAP, but this slideshow was so brilliant and easy to understand that it is much clearer to me now.
Caching.
There are other, more in-depth benefits of REST which revolve around evolvability via loose coupling and hypertext, but caching mechanisms are the main reason you should care about RESTful HTTP.
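To make the caching point concrete, here is a minimal sketch of a conditional GET using an ETag. The helper names and the `max-age` value are assumptions for the example, and real `If-None-Match` handling (lists of tags, weak validators) is more involved than this exact-match comparison.

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Derive a validator from the representation; ETags are quoted strings."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET on a cacheable resource."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        # The client's cached copy is still valid: send no body at all.
        return 304, headers, b""
    return 200, headers, body

status, headers, payload = conditional_get(b"<article>v1</article>")
print(status, headers["ETag"])
print(conditional_get(b"<article>v1</article>", headers["ETag"])[0])
```

Because GET is safe and the response describes its own freshness, any intermediary cache can do this revalidation without knowing anything about the application.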
It's written down in the Fielding dissertation. But if you don't want to read a lot:
increased scalability (due to stateless, cache and layered system constraints)
decoupled client and server (due to stateless and uniform interface constraints)
reusable clients (client can use general REST browsers and RDF semantics to decide which link to follow and how to display the results)
non breaking clients (clients break only by application specific semantics changes, because they use the semantics instead of some API specific knowledge)
Give every “resource” an ID
Link things together
Use standard methods
Resources with multiple representations
Communicate statelessly
Is it possible to do everything with just POST and GET? Yes. Is it the best approach? No. Why? Because we have standard methods. If you think about it, it would be possible to do everything using just GET, so why should we even bother to use POST? Because of the standards!
For example, thinking about an MVC model today, you can limit your application to respond only to specific verbs like POST, GET, PUT and DELETE. Even if under the hood everything is emulated with POST and GET, doesn't it make sense to have different verbs for different actions?
Discovery is far easier in REST. We have WADL documents (similar to WSDL in traditional webservices) that will help you to advertise your service to the world. You can use UDDI discoveries as well. With traditional HTTP POST and GET people may not know your message request and response schemas to call you.
One advantage is that we can non-sequentially process XML documents and unmarshal XML data from different sources like an InputStream object, a URL, a DOM node...
@Timmmm, about your edit:
GET /timeline_posts // could return the N first posts, with links to fetch the next/previous N posts
This would dramatically reduce the number of calls
And nothing prevents you from designing a server that accepts HTTP parameters to denote the field values your clients may want...
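Here is a rough sketch of both ideas together: one call returns a page of posts, links to the neighbouring pages, and only the requested fields. The data, the `offset`/`limit`/`fields` parameter names and the `timeline_page` function are assumptions made up for the example.

```python
from urllib.parse import urlencode

# Made-up data standing in for a timeline.
POSTS = [{"id": i, "author": f"user{i % 3}", "text": f"post {i}"} for i in range(1, 8)]

def timeline_page(offset=0, limit=3, fields=None):
    """Return one page of posts plus links to the previous/next pages."""
    page = POSTS[offset:offset + limit]
    if fields:
        # Keep only the fields the client asked for.
        page = [{k: post[k] for k in fields if k in post} for post in page]

    def link(new_offset):
        params = {"offset": new_offset, "limit": limit}
        if fields:
            params["fields"] = ",".join(fields)
        return "/timeline_posts?" + urlencode(params)

    links = {}
    if offset + limit < len(POSTS):
        links["next"] = link(offset + limit)
    if offset > 0:
        links["previous"] = link(max(offset - limit, 0))
    return {"posts": page, "links": links}

print(timeline_page(0, 3, ["id"]))
```

The client follows `links["next"]` instead of building URLs itself, so the server stays free to change its paging scheme.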
But this is a detail.
Much more important is the fact that you did not mention huge advantages of the REST architectural style (much better scalability, due to server statelessness; much better availability, due to server statelessness also; much better use of the standard services, such as caching for instance, when using a REST architectural style; much lower coupling between client and server, due to the use of a uniform interface; etc. etc.)
As for your remark
"Not all actions easily map to CRUD (create, read/retrieve, update, delete)."
: an RDBMS uses a CRUD approach, too (SELECT/INSERT/DELETE/UPDATE), and there is always a way to represent and act upon a data model.
Regarding your sentence
"You may not even be dealing with object type resources"
: a RESTful design is, in essence, a simple design - but this does NOT mean that designing it is simple. Do you see the difference? You'll have to think a lot about the concepts your application will represent and handle (what it must do, if you prefer) in order to represent them by means of resources. But if you do so, you will end up with a simpler and more efficient design.
Query-strings can be ignored by search engines.