Why do we need anything more than HTTP GET, PUT, POST? - rest

What is the practical benefit of using HTTP GET, PUT, DELETE, POST, HEAD? Why not focus on their behavioral benefits (safety and idempotency), forgetting their names, and use GET, PUT or POST depending on which behavior we want?
Why shouldn't we only use GET, PUT and POST (and drop HEAD, DELETE)?

The [REST][1] approach uses POST, GET, PUT and DELETE to implement the CRUD rules for a web resource. It's a simple and tidy way to expose objects to requests on the web. It's web services without the overheads.
Just to clarify the semantic differences. Each operation is rather different. The point is to have nice HTTP methods that have clear, distinct meanings.
POST creates new objects. The URI has no key; it accepts a message body that defines the object. SQL Insert. [Edit While there's no technical reason for POST to have no key, the REST folks suggest strongly that for POST to have distinct meaning as CREATE, it should not have a key.]
GET retrieves existing objects. The URI may have a key, depends on whether you are doing singleton GET or list GET. SQL Select
PUT updates an existing object. The URI has a key; It accepts a message body that updates an object. SQL Update.
DELETE deletes an existing object. The URI has a key. SQL Delete.
Can you update a record with POST instead of PUT? Not without introducing some ambiguity. Verbs should have unambiguous effects. Further, POST URI's have no key, where PUT must have a key.
When I POST, I expect a 201 CREATED. If I don't get that, something's wrong. Similarly, when I PUT, I expect a 200 OK. If I don't get that, something's wrong.
I suppose you could insist on some ambiguity where POST does either POST or PUT. The URI has to be different; also the associated message could be different. Generally, the REST folks take their cue from SQL where INSERT and UPDATE are different verbs.
You could make the case that UPDATE should insert if the record doesn't exist or update if the record does exist. However, it's simpler if UPDATE means UPDATE and failure to update means something's wrong. A secret fall-back to INSERT makes the operation ambiguous.
If you're not building a RESTful interface, then it's typical to only use GET and POST for retrieve and create/update. It's common to have URI differences or message content differences to distinguish between POST and PUT when a person is clicking submit on a form. It, however, isn't very clean because your code has to determine if you're in the POST=create case or POST=update case.

POST has no guarantees of safety or idempotency. That's one reason for PUT and DELETE—both PUT and DELETE are idempotent (i.e., 1+N identical requests have the same end result as just 1 request).
PUT is used for setting the state of a resource at a given URI. When you send a POST request to a resource at a particular URI, that resource should not be replaced by the content. At most, it should be appended to. This is why POST isn't idempotent—in the case of appending POSTS, every request will add to the resource (e.g., post a new message to a discussion forum each time).
DELETE is used for making sure that a resource at a given URI is removed from the server. POST shouldn't normally be used for deleting except for the case of submitting a request to delete. Again, the URI of the resource you would POST to in that case shouldn't be the URI for the resource you want to delete. Any resource for which you POST to is a resource that accepts the POSTed data to append to itself, add to a collection, or to process in some other way.
HEAD is used if all you care about is the headers of a GET request and you don't want to waste bandwidth on the actual content. This is nice to have.

Why do we need more than POST? It allows data to flow both ways, so why would GET be needed? The answer is basically the same as for your question. By standardizing the basic expectations of the various methods other processes can better know what to do.
For example, intervening caching proxies can have a better chance of doing the correct thing.
Think about HEAD for instance. If the proxy server knows what HEAD means then it can process the result from a previous GET request to provide the proper answer to a HEAD request. And it can know that POST, PUT and DELETE should not be cached.

No one posted the kind of answer I was looking for so I will try to summarize the points myself.
"RESTful Web Services" chapter 8 section "Overloading POST" reads: "If you want to do without PUT and DELETE altogether, it’s entirely RESTful to expose safe operations on resources through GET, and all other operations through overloaded POST. Doing this violates my Resource-Oriented Architecture, but it conforms to the less restrictive rules of REST."
In short, replacing PUT/DELETE in favor of POST makes the API harder to read and PUT/DELETE calls are no longer idempotent.

In a word:
idempotency
In a few more words:
GET = safe + idempotent
PUT = idempotent
DELETE = idempotent
POST = neither safe or idempotent
'Idempotent' just means you can do it over and over again and it will always do exactly the same thing.
You can reissue a PUT (update) or DELETE request as many times as you want and it will have the same effect every time, however the desired effect will modify a resource so it is not considered 'safe'.
A POST request should create a new resource with every request, meaning the effect will be different every time. Therefore POST is not considered safe or idempotent.
Methods like GET and HEAD are just read operations and are therefore considered 'safe' aswell as idempotent.
This is actually a pretty important concept because it provides a standard/consistent way to interpret HTTP transactions; this is particularly useful in a security context.

Not all hosters don't support PUT, DELETE.
I asked this question, in an ideal world we'd have all the verbs but....:
RESTful web services and HTTP verbs

HEAD is really useful for determining what a given server's clock is set to (accurate to within the 1 second or the network round-trip time, whichever is greater). It's also great for getting Futurama quotes from Slashdot:
~$ curl -I slashdot.org
HTTP/1.1 200 OK
Date: Wed, 29 Oct 2008 05:35:13 GMT
Server: Apache/1.3.41 (Unix) mod_perl/1.31-rc4
SLASH_LOG_DATA: shtml
X-Powered-By: Slash 2.005001227
X-Fry: That's a chick show. I prefer programs of the genre: World's Blankiest Blank.
Cache-Control: private
Pragma: private
Connection: close
Content-Type: text/html; charset=iso-8859-1
For cURL, -I is the option for performing a HEAD request. To get the current date and time of a given server, just do
curl -I $server | grep ^Date

To limit ambiguity which will allow for better/easier reuse of our simple REST apis.

You could use only GET and POST but then you are losing out on some of the precision and clarity that PUT and DELETE bring. POST is a wildcard operation that could mean anything.
PUT and DELETE's behaviour is very well defined.
If you think of a resource management API then GET, PUT and DELETE probably cover 80%-90% of the required functionality. If you limit yourself to GET and POST then 40%-60% of your api is accessed using the poorly specified POST.

Web applications using GET and POST allow users to create, view, modify and delete their data, but do so at a layer above the HTTP commands originally created for these purposes. One of the ideas behind REST is a return to the original intent of the design of the Web, whereby there are specific HTTP operations for each CRUD verb.
Also, the HEAD command can be used to improve the user experience for (potentially large) file downloads. You call HEAD to find out how large the response is going to be and then call GET to actually retrieve the content.

See the following link for an illustrative example. It also suggests one way to use the OPTIONS http method, which hasn't yet been discussed here.

There are http extensions like WebDAV that require additional functionally.
http://en.wikipedia.org/wiki/WebDAV

The web server war from the earlier days probably caused it.
In HTTP 1.0 written in 1996, there were only GET, HEAD, and POST. But as you can see in Appendix D, vendors started to add their own things. So, to keep HTTP compatible, they were forced to make HTTP 1.1 in 1999.
However, HTTP/1.0 does not sufficiently take into consideration
the effects of hierarchical proxies, caching, the need for
persistent connections, or virtual hosts. In addition, the proliferation
of incompletely-implemented applications calling themselves
"HTTP/1.0" has necessitated a protocol version change in order for
two communicating applications to determine each other's true capabilities.
This specification defines the protocol referred to as "HTTP/1.1". This protocol includes more stringent requirements than HTTP/1.0 in order
to ensure reliable implementation of its features.

GET, PUT, DELETE and POST are holdovers from an era when sophomores thought that a web page could be reduced to a few hoighty-toity principles.
Nowadays, most web pages are composite entities, which contain some or all of these primitive operations. For instance, a page could have forms for viewing or updating customer information, which perhaps spans a number of tables.
I usually use $_REQUEST[] in php, not really caring how the information arrived. I would choose to use GET or PUT methods based on efficiency, not the underlying (multiple) paradigms.

Related

REST collections of collections, promoting and demoting

We have a resource called tasks. With the following endpoints, we can list all tasks, and create tasks:
GET: /tasks
POST: /tasks
The problem is that tasks can contain sub-tasks, and we wish to embed the functionality to both support promoting sub-tasks to their own tasks and demoting tasks to sub-tasks of another task.
The naïve approach should be to delete the sub-task and create it again as a task and vice-versa, but we find this a tad too naive.
The second option we have come up with is to support endpoints such as the following, where {id} is the ID of the task, and {sid} is the ID of the sub-task:
POST: /tasks/{id}/add/{sid}
POST: /tasks/{id}/upgrade/{sid}
The first endpoint should add a task to another task, thereby deleting the first task and adding a copy as a sub-task to the second task. The second endpoint should upgrade a sub-task to a task from another task, thereby deleting the sub-task and adding a copy as a task.
So our question is if this would be considered good practice, or if there are some other options which we have not considered?
It's important to highlight that REST doesn't care about the URI spelling at all. However, once the central piece of the REST architectural style is the resource, it makes sense that the URIs uses nouns instead of verbs.
The REST architectural style is protocol independent, but it's commonly implemented on the top of the HTTP protocol. In this approach, the HTTP method is meant to express the operation that will be performed over the resource identified by a given URI.
Your question doesn't state what your domain looks like. But let's consider you have something like:
+-------------------+
| Task |
+-------------------+
|- id: Long |
|- title: String |
|- parent: Task |
|- children: Task[] |
+-------------------+
Creating a task could be expressed with a POST request:
POST /tasks
Host: example.org
Content-Type: application/json
{
"title": "Prepare dinner"
}
HTTP/1.1 201 Created
Location: /tasks/1
Creating a subtask could expressed with a POST request indicating the parentId in the payload:
POST /tasks
Host: example.org
Content-Type: application/json
{
"parentId": 1
"title": "Order a pizza"
}
HTTP/1.1 201 Created
Location: /tasks/2
Promoting a subtask to a task could be achieved by setting the parentId to null using a PATCH request:
PATCH /tasks/2
Host: example.org
Content-Type: application/json-patch+json
[
{
"op": "replace", "path": "/parentId", "value": null
}
]
HTTP/1.1 204 No Content
Updating a task to become a subtask could be achieved by setting the parentId using a PATCH request:
PATCH /tasks/2
Host: example.org
Content-Type: application/json-patch+json
[
{
"op": "replace", "path": "/parentId", "value": 1
}
]
HTTP/1.1 204 No Content
The above examples use JSON Patch (application/json-patch+json) as payload for the PATCH. Alternatively, you could consider JSON Merge Patch (application/merge-patch+json). I won't go through the differences of such formats once it will make this answer overly long. But you can click the links above and check them by yourself.
For handling errors, refer to the error handling section of the RFC 5789, the document that defines the PATCH method.
I also appreciate that some APIs avoid using PATCH for a number of reasons that are out of the scope of this answer. If that's the case, you could consider PUT instead. In this approach, the state of the resource will be replaced with the state defined in the representation sent in the request payload.
Alternatively you could have an endpoint such as /tasks/{id}/parent, supporting PUT. In this approach, the client just have to send a representation of the parent (such as an id) instead of a full representation of the task.
Pick the approach that suits you best.
REST is an abbrevation for the transfer of a resource' current state in a representation format supported by the client. As REST is a generalization of the World Wide Web, the same concepts you use for the Web also apply to applications following the REST architecture model. So the basic question resolves around: How would you design your system to work on Web pages and apply the same steps to your design.
As Cassio already mentioned that the spelling of URIs is not of importance to clients, as a URI remains a URI and you can't deduce from a URI whether the system is "RESTful" or not. Actually, there is no such thing as "RESTful" or "RESTless" URIs as a URI, as stated above, remains a URI. It is probably better to think of a URI as a key used for caching responses in a local or intermediary cache. Fielding made support of caching even a constraint and not just an option.
As REST is not a protocol but just an architectural style, you are basically not obligated to implement it stringent, though you will certainly miss out on the promised benefits, such as the decoupling of clients from APIs, the freedom to evolve the server side without breaking clients and making clients in general more robust against changes. Fielding even stated that applications that violate the constraints he put on REST shouldn't be termed REST at all, to avoid confusions.
I don't agree with user991710 in one of his comments that REST can't be used to represent processes, I agree however that REST shouldn't attempt to create new verbs either. As mentioned before, REST is about transfering a resources current state in a supported representation format. If a task can be represented as a stream of data then it can be presented to a client as well. The self descriptiveness of messages, i.e. by using a media type that defines the rules on how to process the data payload, guarantees that a client will be able to make sense of the data. I.e. your browser is able to render images, play videos, show text and similar stuff as it knows how to interpret the data stream accordingly. Support for such fields can be added via special addons or plugins, that may be loaded dynamically during runtime without even having to restart the application.
If I had to design tasks for Web pages, I'd initially return a pagable view of existing tasks, probably rendered in a table, with a link to a page that is able to create new tasks or a link in each row to update or delete an existing task. The create-new and update pages may use the same HTML form to enter or update a tasks information. If a task should be assigned as sub-task to an other task, you may be able to either select a parent task from a given set of tasks, i.e. in a drop-down text field, or enter the URI of the parent in a dedicated field. On submitting the task, HTTP method POST would be used that will perform the operation based on the servers own semantic, so wheter a new resource is created or one or multiple ones are updated is up to the server. The quintesence here is, that everything a client may be able to do, is taught by the server. The form to add new or update existing tasks just informs the client which fields are supported (or expected) by the server. There is actually no external, out-of-band knowledge needed by a client in order to perform a request. A server has still the option to reject incomplete payloads or payloads that violate certain constraints and a client will know by receiving an appropriate error message.
As clients shouldn't parse or interpret URIs, they will use certain text describing what the URIs do. In a browser a picture of a dustbin may be used as symbol for deletion, while a pencil may be used as symbol for updating an existing entry (or the like). As humans we quickly realize what these links are intended for whithout having to read the characters of the actual URI.
The last two paragraphs just summed up how the interaction model on a common Web page may look like. The same concepts should be used in a REST architecture as well. The content you exchange may vary more, ideally with standardized representation formats, compared to the big brother, the Web, though still the concepts of links are used to reference from one resource to other resources and server teaches clients what it needs apply here. Compared to HTML, however, you can utilize more HTTP methods than just POST and GET. A deletion of a resource's representation would probably make more sense with a DELETE method than with a POST method. Also, for updates either PUT or PATCH may make more sense, depending on the situation. In contrast of using sensible pictures for hinting users on what this link might be good for, link-relation names should be used that hint clients about the purpose of the link. Such link-relation names should be standardized, or at least express common sense such as expressed though special ontologies, or use absolute URIs as extension mechanism.
You may add dedicated links to show a collection of all tasks based on certain filters, i.e. all tasks or only parent tasks and what not, so a client can iterate through the task it is interested in. On selecting a task you may add links to all sub-tasks a client can invoke to learn what these subtasks are, if interested. Here the URIs of the tasks may remain unchanged, which supports caching by default. Note that I didn't mention anything in regards to how you should store the data in your backend as this is just some implementation details a client is usually not interested in. This is just a perfect case where a domain model does not necessarily have to be similar to a resources state representation.
In regards to which HTTP operation to perform a task promotion or demotion is basically some design choice that also may depend on the representation format the payload is exchanged for. As HTTP only supports POST and GET, such a change could be requested via POST, other media types might support other HTTP methods, such as PUT, which, according to its specification (last paragraph page 27), is allowed to have side-effects, or PATCH, which needs to be applied atomically - either fully or not at all. PATCH actually is similar to patching software, where a set of instructions should be applied on some target code. William Durand summarized this concept in a quite cited blog-post. However, he later on updated his blog post to mention that via application/merge-patch+json a more natural and intuitive way of "updating" resources could be used. In regards to form support, there exist a couple of drafts such as hal-forms, halo-json (halform), Ion or hydra that offer such definition but lack currently wider library support or a final standard. So a bit of work needs yet to put into an accepted standard.
To wrap this post up, you should design your API like you'd design your system to work for Web pages, apply the same concepts you use for interaction on typical Web pages, such as links and forms, and used them in your responses. Which HTTP operation you perform the actual promotion or demotion with, may be dependent on the actual media type and what HTTP operations it supports. POST should work in all cases, PUT is probably more intuitive to typical updates done in Web forms while patching would require your client to actually calculate the steps needed to transform the current resource' representation to the desired one (if you use application/json-patch+json in particular). application/merge-patch+json may be applicable as well, which would simplify the patching notably, as the current data contained in the form could just be sent to the server and default rules would decide whether a field got removed, added or updated.
you can divide your route with :
POST : /tasks => for create or add tasks
PUT/PATCH : /tasks => for upgrading your tasks
or even be more specific about it you can pass :id to params. I think the best approach is whenever you need to change delete update or get a specific object you must use :id in your params
PUT/PATCH/GET : /tasks/:id

REST delete multiple items in the batch

I need to delete multiple items by id in the batch however HTTP DELETE does not support a body payload.
Work around options:
1. #DELETE /path/abc?itemId=1&itemId=2&itemId=3 on the server side it will be parsed as List of ids and DELETE operation will be performed on each item.
2. #POST /path/abc including JSON payload containing all ids. { ids: [1, 2, 3] }
How bad this is and which option is preferable? Any alternatives?
Update: Please note that performance is a key here, it is not an option execute delete operation for each individual id.
Along the years, many people fell in doubt about it, as we can see in the related questions here aside. It seems that the accepted answers ranges from "for sure do it" to "its clearly mistreating the protocol". Since many questions was sent years ago, let's dig into the HTTP 1.1 specification from June 2014 (RFC 7231), for better understanding of what's clearly discouraged or not.
The first proposed workaround:
First, about resources and the URI itself on Section 2:
The target of an HTTP request is called a "resource". HTTP does not limit the nature of a resource; it merely defines an interface that might be used to interact with resources. Each resource is identified by a Uniform Resource Identifier (URI).
Based on it, some may argue that since HTTP does not limite the nature of a resource, a URI containing more than one id would be possible. I personally believe it's a matter of interpretation here.
About your first proposed workaround (DELETE '/path/abc?itemId=1&itemId=2&itemId=3') we can conclude that it's something discouraged if you think about a resource as a single document in your entity collection while being good to go if you think about a resource as the entity collection itself.
The second proposed workaround:
About your second proposed workaround (POST '/path/abc' with body: { ids: [1, 2, 3] }), using POST method for deletion could be misleading. The section Section 4.3.3 says about POST:
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others): Providing a block of data, such as the fields entered into an HTML form, to a data-handling process; Posting a message to a bulletin board, newsgroup, mailing list, blog, or similar group of articles; Creating a new resource that has yet to be identified by the origin server; and Appending data to a resource's existing representation(s).
While there's some space for interpretation about "among others" functions for POST, it clearly conflicts with the fact that we have the method DELETE for resources removal, as we can see in Section 4.1:
The DELETE method removes all current representations of the target resource.
So I personally strongly discourage the use of POST to delete resources.
An alternative workaround:
Inspired on your second workaround, we'd suggest one more:
DELETE '/path/abc' with body: { ids: [1, 2, 3] }
It's almost the same as proposed in the workaround two but instead using the correct HTTP method for deletion. Here, we arrive to the confusion about using an entity body in a DELETE request. There are many people out there stating that it isn't valid, but let's stick with the Section 4.3.5 of the specification:
A payload within a DELETE request message has no defined semantics; sending a payload body on a DELETE request might cause some existing implementations to reject the request.
So, we can conclude that the specification doesn't prevent DELETE from having a body payload. Unfortunately some existing implementations could reject the request... But how is this affecting us today?
It's hard to be 100% sure, but a modern request made with fetch just doesn't allow body for GET and HEAD. It's what the Fetch Standard states at Section 5.3 on Item 34:
If either body exists and is non-null or inputBody is non-null, and request’s method is GET or HEAD, then throw a TypeError.
And we can confirm it's implemented in the same way for the fetch pollyfill at line 342.
Final thoughts:
Since the alternative workaround with DELETE and a body payload is let viable by the HTTP specification and is supported by all modern browsers with fetch and since IE10 with the polyfill, I recommend this way to do batch deletes in a valid and full working way.
It's important to understand that the HTTP methods operate in the domain of "transferring documents across a network", and not in your own custom domain.
Your resource model is not your domain model is not your data model.
Alternative spelling: the REST API is a facade to make your domain look like a web site.
Behind the facade, the implementation can do what it likes, subject to the consideration that if the implementation does not comply with the semantics described by the messages, then it (and not the client) are responsible for any damages caused by the discrepancy.
DELETE /path/abc?itemId=1&itemId=2&itemId=3
So that HTTP request says specifically "Apply the delete semantics to the document described by /path/abc?itemId=1&itemId=2&itemId=3". The fact that this document is a composite of three different items in your durable store, that each need to be removed independently, is an implementation details. Part of the point of REST is that clients are insulated from precisely this sort of knowledge.
However, and I feel like this is where many people get lost, the metadata returned by the response to that delete request tells the client nothing about resources with different identifiers.
As far as the client is concerned, /path/abc is a distinct identifier from /path/abc?itemId=1&itemId=2&itemId=3. So if the client did a GET of /path/abc, and received a representation that includes itemIds 1, 2, 3; and then submits the delete you describe, it will still have within its own cache the representation that includes /path/abc after the delete succeeds.
This may, or may not, be what you want. If you are doing REST (via HTTP), it's the sort of thing you ought to be thinking about in your design.
POST /path/abc
some-useful-payload
This method tells the client that we are making some (possibly unsafe) change to /path/abc, and if it succeeds then the previous representation needs to be invalidated. The client should repeat its earlier GET /path/abc request to refresh its prior representation rather than using any earlier invalidated copy.
But as before, it doesn't affect the cached copies of other resources
/path/abc/1
/path/abc/2
/path/abc/3
All of these are still going to be sitting there in the cache, even though they have been "deleted".
To be completely fair, a lot of people don't care, because they aren't thinking about clients caching the data they get from the web server. And you can add metadata to the responses sent by the web server to communicate to the client (and intermediate components) that the representations don't support caching, or that the results can be cached but they must be revalidated with each use.
Again: Your resource model is not your domain model is not your data model. A REST API is a different way of thinking about what's going on, and the REST architectural style is tuned to solve a particular problem, and therefore may not be a good fit for the simpler problem you are trying to solve.
That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style. -- Fielding, 2008

REST API: How to deal with processing logic

I read (among others) the following blog about API design: https://www.thoughtworks.com/insights/blog/rest-api-design-resource-modeling. It helped me to better understand a lot of aspects, but I have one question remaining:
How do I deal with functionality that processes some data and gives a response directly. Think, verbs like translate, calculate or enrich. Which noun should they have and should they be called by GET, PUT or POST?
P.S. If it should be GET, how to deal with the maximum length of a GET request
This is really a discussion about naming more so than functionality. Its very much possible to have processed logic in your API, you just need to be careful about naming it.
Imaginary API time. Its got this resource: /v1/probe/{ID} and it responds to GET, POST, and DELETE.
Let's say we want to launch our probes out, and then want the probe to give us back the calculated flux variation of something its observing (totally made up thing). While it isn't a real thing, let's say that this has to be calculated on the fly. One of my intrepid teammates decides to plunk the calculation at GET /v1/1324/calculateflux.
If we're following real REST-ful practices... Oops. Suddenly we're not dealing with a noun, are we? If we have GET /v1/probe/1324/calculateflux we've broken RESTful practices because we're now asking for a verb - calculateflux.
So, how do we deal with this?
You'll want to reconsider the name calculateflux. That's no good - it doesn't name a resource on the probe. **In this case, /v1/probe/1324/fluxvalue is a better name, and /v1/probe/1324/flux works too.
Why?
RESTFUL APIs almost exclusively use nouns in their URIs - remember that each URI needs to describe a specific thing you can GET POST PUT or DELETE or whatever. That means that any time there is a processed value we should give the resource the name of the processed (or calculated) value. This way, we remain RESTful by adhering to the always-current data (We can re-calculate the Flux value any time) and we haven't changed the state of the probe (we didn't save any values using GET).
Well, I can tell you that I know about this.
GET // Returns, JUST return
DELETE // Delete
POST // Send information that will be processed on server
PUT // Update a information
This schema is for laravel framework. Will be most interesting that you read the link in ref
Ref:
https://rafaell-lycan.com/2015/construindo-restful-api-laravel-parte-1/
You should start with the following process:
Identify the resources (nouns) in your system.
They should all respond to GET.
Let's take your translation example. You could decide that every word in the source language is a resource. This would give:
http://example.com/translations/en-fr/hello
Which might return:
Content-Type: text/plain
Content-Language: fr
bonjour
If your processes are long-running, you should create a request queue that clients can POST to, and provide them with another (new) resource that they can query to see if the process has completed.

Correct HTTP request method for nullipotent action that takes binary data

Consider a web API method that has no side effects, but which takes binary data as a parameter. An example would be a method that tells the user whether or not their image is photoshopped, but does not permanently store the image or the result on its servers.
Should such a method be a GET or a POST?
GET doesn't seem to have a recommended way of sending data outside of URL parameters, but the behavior of the method implies a GET, which according to the HTTP spec is for safe, stateless responses. This becomes particularly constraining under the semantics of REST, which imply that POST methods create a new object on the server.
This becomes particularly constraining under the semantics of REST, which imply that POST methods create a new object on the server.
While a POST request means that the entity sent will be treated "as a new subordinate of the resource identified by the Request-URI", there is no requirement that this result in the creation of a new permanent object or that any such new object be identified by a URI (so no new object as far as the client knows). An object can be transient, representing the results of e.g. "Providing a block of data, such as the result of submitting a form, to a data-handling process" and not persisting after the entity representing that object has been sent.
While this means that a POST can create a new resource, and is certainly the best way to do so when it is the server that will give that new resource its URI (with PUT being the more appropriate method when the client dictates the new URI) it is can also be used for cases that delete objects (though again if it's a deletion of a single* resource identifiable by a URI then DELETE is far more appropriate), both create and delete objects, change multiple objects, it can mean that your kitchen light turns on but that the response is the same whether that worked or failed because the communication from the webserver to the kitchen light doesn't allow for feedback about success. It really can do anything at all.
But, your instincts are good in wanting this to be a GET: While the looseness of POST means we can make a case for it for just about every request (as done by approaches that use HTTP for an RPC-like protocol, essentially treating HTTP as if it was a transport protocol), this is inelegant in theory, inefficient in practice and clumsy in definition. You have an idempotent function that depends on solely on what the client is interested in, and that maps most obviously the GET in a few ways.
If we could fit everything in a URI then GET would be easy. E.g we can define a simple integer addition with something like http://example.net/addInts?x=1;y=2 representing the addition of 71 and 2 and hence being a permanent immutable resource representing the number 3 (since the results of GET can vary with changes to a resource over time, but this resource never changes) and then use a mechanism like HTML's <form> or javascript to allow the server to inform the client as to how to construct the URIs for other numbers (to maintain the HATEOS and/or COD constraints). Simples!
Your problem here is that you do not have input that can be represented as concisely as the numbers 1 and 2 can above. In theory you could do something like http://example.net/photoshoppedCheck?image=… and hence create a URI that represents the resource of the results of the check. This URI will though will have 4 characters for every 3 bytes in the image. While there's no absolute limit on URI length both the theory and the practice allow this to fail (in theory HTTP allows for proxies and servers to set a limit on URI length, and in practice they do).
An argument could be made for using GET and sending a request body the same way as you would with a POST, and some webservers will even allow you to do this. However, GET is defined as returning an entity describing the resource identified in the URI with headers restricting how that entity does that describing: Since the request body is not part of that definition it must be ignored by your code! If you were tempted to bend this rule then you must consider that:
Some webservers will refuse the request or strip the body, so you may not be able to.
If your webserver does allow it, the fact that its not specified means you can't be sure an upgrade won't "fix" this and so break your code.
Some proxies will refuse or strip the request.
Some client libraries will most certainly refuse to allow developers to send a request body along with a GET.
So it's a no-no in both theory and practice.
About the only other approach we could do apart from POST is to have a URI that we consider as representing an image that was not photoshopped. Hence if you GET that you get an entity describing the image (obviously it could be the actual image, though it could also be something else if we stretch the concept of content-negotiation) and then PUT will check the image and if its deemed to not be photoshopped it responds with the same image and a 200 or just a 204 while if it is deemed to be photoshopped it responds with a 400 because we've tried to PUT a photoshopped image as a resource that can only be a non-photoshopped image. Because we respond immediately, there's no race-condition with simultaneous requests.
Frankly, this would be darn-right horrible. While I think I've made a case for it by the letter of the specs, it's just nasty: REST is meant to help us design clear APIs, not obtuse APIs we can offer a too-clever-for-its-own-good justification of.
No, in all the way to go here is to POST the image to a fixed URI which then returns a simple entity describing the analysis.
It's perfectly justifiable as REST (the POST creates a transient object based on that image, and then responds with an entity describing that object, and then that object disappears again). It's straight-forward. It's about as efficient as it could be (we can't do any HTTP caching† but most of the network delay is going to be on the upload rather than the download anyway). It also fits into the general use-case of "process something" that POST was first invented for. (Remember that first there was HTTP, then REST described why it worked so well, and then HTTP was refined to better play to those strengths).
In all, while the classic mistake that moves a web application away from REST is to abuse POST into doing absolutely everything when GET, PUT and DELETE (and perhaps the WebDAV methods) would be superior, don't be afraid to use its power when those don't meet the bill, and don't think that the "new subordinate of the resource" has to mean a full long-lived resource.
*Note that a "single" resource here might be composed of several resources that may have their own URIs, so it can be easy to have a single DELETE that deletes multiple objects, but if deleting X deletes A, B & C then it better be obvious that you can't have A, B or C if you don't have X or your API is not going to be understandable. Generally this comes down to what is being modelled, and how obvious it is that one thing depends on another.
†Strictly speaking we can, as we're allowed to send cache headers indicating that sending an identical entity to the same URI will have the same results, but there's no general-purpose web-software that will do this and your custom client can just "remember" the opinion about a given image itself anyway.
It's a difficult one. Like with many other scenarios there is no absolutely correct way of doing it. You have to try to interpret RESTful principles in terms of the limitations of the semantics of HTTP. (Incidentally, I don't think it's right to think of REST having semantics, REST is an architectural style which is commonly used with HTTP services, but can be used for any type of interface.)
I've faced a similar situation in my current project. We chose to use a POST but with the response code being a 200 (OK) rather than the 201 (Resource Created) usually returned by RESTful Web APIs.

RESTful resource - accepts a list of objects

I'm building a collection of RESTful resources that work like the following: (I'll use "people" as an example):
GET /people/{key}
- returns a person object (JSON)
GET /people?first_name=Bob
- returns a list of person objects who's "first_name" is "Bob" (JSON)
PUT /people/{key}
- expects a person object in the payload (JSON), updates the person in the
datastore with the {key} found in the URL parameter to match the payload.
If it is a new object, the client specifies the key of the new object.
I feel pretty comfortable with the design so far (although any input/criticism is welcome).
I'd also like to be able to PUT a list of people, however I'm not confident in the RESTfulness of my design. This is what I have in mind:
PUT /people
- expects a list of objects in JSON form with keys included in the object
("key":"32948"). Updates all of the corresponding objects in the datastore.
This operation will be idempotent, so I'd like to use "PUT". However its breaking a rule because a GET request to this same resource will not return the equivalent of what the client just PUT, but would rather return all "people" objects (since there would be no filters on the query). I suspect there are also a few other rules that might be being broken here.
Someone mentioned the use of a "PATCH" request in an earlier question that I had: REST resource with a List property
"PATCH" sounds fantastic, but I don't want to use it because its not in wide use yet and is not compatible with a lot of programs and APIs yet.
I'd prefer not to use POST because POST implies that the request is not idempotent.
Does anyone have any comments / suggestions?
Follow-up:::
While I hesitated to use POST because it seems to be the least-common-denominator, catch-all for RESTful operations and more can be said about this operation (specifically that it is idempotent), PUT cannot be used because its requirements are too narrow. Specifically: the resource is not being completely re-written and the equivalent resource is not being sent back from a GET request to the same resource. Using a PUT with properties outside of it's specifications can cause problems when applications, api's, and/or programmers attempt to work with the resource and are met with unexpected behavior from the resource.
In addition to the accepted answer, Darrel Miller had an excellent suggestion if the operation absolutely had to be a PUT and that was to append a UUID onto the end of the resource path so an equivalent GET request will return the equivalent resource.
POST indicates a generic action other than GET, PUT, and DELETE (the generic hashtable actions). Because the generic hashtable actions are inappropriate, use POST. The semantics of POST are determined by the resource to which an entity is POSTed. This is unlike the semantics of the generic hashtable methods, which are well-known.
POST /people/add-many HTTP/1.1
Host: example.com
Content-Type: application/json
[
{ "name": "Bob" },
{ "name": "Robert" }
]
Using PUT is definitely the wrong verb in this case. POST is meant to do exactly what you are asking. From the HTTP specification:
The fundamental difference between the POST and PUT requests is reflected in the different meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request -- the user agent knows what URI is intended and the server MUST NOT attempt to apply the request to some other resource...
As such, if you want to update multiple resources in a single call, you have to use POST.
Just be cause PUT is required to be idempotent and POST is not, does not mean that POST cannot be idempotent. Your choice of HTTP verb should not be based on that, but based on the relationship of the requested resource and the resource acted upon. If your application is directly handling the resource requested, use PUT. If it is acting on some other resource (or resources, as in your case), use POST.
I really don't see any easy way you could use PUT to create an arbitrary set of people. Unless, you are prepared to have the client generate a GUID and do something like,
PUT /PeopleList/{1E8157D6-3BDC-43b7-817D-C3DA285DD606}
On the server side you could take the people from the list and add them to the /People resource.
A slight variation to this approach would be to get the server to include a link such as
<link rel="AddList" href="/PeopleList/{1E8157D6-3BDC-43b7-817D-C3DA285DD606}"/>
in the People resource. The client would need to know that it needs to PUT a list of people to the AddList link. The server would need to make sure that each time it renders the /People resource it creates a new url for the AddList link.
Regarding Darren Miller's suggestion of using PUT to a GUID (I can't comment...), the point of using PUT would be to achieve idempotency for the operation. The litmus test if idempotency would be this conversation between the client and server:
PUT /PeopleList/{1E8157D6-3BDC-43b7-817D-C3DA285DD606}
204 NO CONTENT (indicates that it all went well)
client loses connection and doesn't see the 204
client automatically retries
PUT /PeopleList/{1E8157D6-3BDC-43b7-817D-C3DA285DD606}
How would the server differentiate the two? If the GUID is "used up" so to speak then the server would have to respond 404 or 410. This introduces a tiny bit of conversational state on the server to remember all the GUIDs that have been used.
Two clients would often see the same I guess, because of caching or simply keeping stale responses around.
I think a smart solution is to use POST to create an (initially empty, short lived) holding area for a resource to which you can PUT, i.e. clients need to POST to create the GUID resource instead of discovering it via a link:
POST /PeopleList/CreateHoldingArea
201 CREATED and Location: /PeopleList/{1E8157D6-3BDC-43b7-817D-C3DA285DD606}
PUT /PeopleList/{1E8157D6-3BDC-43b7-817D-C3DA285DD606}
This would mean that the lost idempotency would not result in much overhead; clients simply create new GUIDs (by POSTing) if they didn't see the initial 201 CREATED response. The "tiny bit of conversational state" would now only be the created but not yet used holding areas.
The ideal solution would of course not require any conversational state on the server, but it eludes me.