REST - HTTP DELETE - semantics - only delete descendants - rest

In our project, a list of all books can be retrieved through REST:
GET http://server/api/books/
A specific book can be retrieved as follows:
GET http://server/api/books/:id/
Deleting a specific book is easy:
DELETE http://server/api/books/:id/
Now, to my question: what should be the result of the following call:
DELETE http://server/api/books/
Obviously, all books are deleted. But should the resource books/ also be deleted? That is, after the request:
should GET /books/ return 200 OK with an empty list? or
should GET /books/ return 404 not found?
According to the spec, which says that the concrete URI will be gone afterwards, I'd go for the second option. However, this makes things complicated and illogical, in my opinion. It makes more sense to have an empty list of books, instead of no books.
What do you think?

If it makes you feel better, assume that the server has logic that recreates the books resource automatically after it has been deleted. :-)
I'd go for 200 OK and an empty list. If you really don't like that, then create a new resource called /books/all
and do
DELETE /books/all

However, this makes things complicated and illogical
How? You are requesting that a resource be deleted. That resource is deleted. The result is that it isn't there any more.
If anything, it's confusing to have it still be present after you deleted it.
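To make the two candidate behaviours concrete, here is a minimal in-memory sketch in Python. The `BookStore` class and its `collection_survives_delete` flag are hypothetical names invented for illustration; this is not the asker's server, only a toy model of the two outcomes being debated.

```python
class BookStore:
    """Toy model of /books/ (hypothetical; not the asker's real server)."""

    def __init__(self, books, collection_survives_delete):
        self.books = dict(books)            # id -> title
        self.collection_exists = True
        # Assumption: a switch that selects which of the two semantics to model.
        self.collection_survives_delete = collection_survives_delete

    def delete_collection(self):
        """Model DELETE /books/: every book is removed in both variants."""
        self.books.clear()
        if not self.collection_survives_delete:
            self.collection_exists = False  # the URI itself is gone
        return 204

    def get_collection(self):
        """Model GET /books/: returns (status, body)."""
        if not self.collection_exists:
            return 404, None
        return 200, sorted(self.books)
```

With `collection_survives_delete=True`, a later GET yields 200 with an empty list; with `False`, it yields 404.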

I think allowing DELETE /books is too risky. Part of a well-designed API is avoiding "easy" mistakes on the API client's side. It could easily happen that something goes wrong in client code, the id is accidentally missing (e.g. an empty-string variable), and an unintended DELETE /books is sent.
What I would do is force the client to iterate through DELETE /books/{id} if it wants to delete all books.
Maybe you can give more input on your use case: I wonder how likely it is that DELETE /books is called on a root resource (it is quite radical to delete root resources). Maybe you are offering deletion of a sub-resource, e.g. /user/{id}/shopping-cart/{id}/books. If it is a more "transient" resource (like a shopping cart), an API for deleting all books would make more sense.
Regarding your other question: For /books I would return 200 and empty list. In collection cases I much prefer empty-lists over 'null' values.
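The "empty id" mistake described above can also be caught on the client side. The following is a small sketch, assuming a hypothetical helper `book_url` and the URL layout from the question; it refuses to build a URL when the id is missing, so a bug cannot silently turn into a request against the collection.

```python
from urllib.parse import quote


def book_url(base, book_id):
    """Build the URL of a single book; never fall back to the collection URL.

    `base` and the /books/{id}/ layout are taken from the question; the
    function name is made up for this example.
    """
    text = "" if book_id is None else str(book_id).strip()
    if not text:
        raise ValueError("refusing to build a book URL from an empty id")
    return f"{base.rstrip('/')}/books/{quote(text, safe='')}/"
```

A client would call `book_url("http://server/api", book_id)` before issuing the DELETE, and treat the `ValueError` as a programming error.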

Related

Good REST API design for operations on resource sets

With REST it is pretty clear how to operate on resources, e.g.
PUT /users/{userId} - updates the user with userId
GET /users/{userId} - reads the user with userId
Similarly for resource sets
POST /users - creates a new user
GET /users/{userId}/books - reads list of books from a user
GET /users/{userId}/books?filter=x - reads list of books from a user with specific filter
What if I want to develop more elaborate operations on resource sets, e.g.
with the request body, add a list of books to the existing list and accepting duplicates (basically concatenating the list)
POST /users/{userId}/books
or PUT /users/{userId}/books
or PATCH?
or POST /users/{userId}/books/concatenate
with the request body, add a list of books to the existing list but no duplicates (basically merging the list)
POST /users/{userId}/books
or PUT /users/{userId}/books
or PATCH?
or POST /users/{userId}/books/merge
also for deleting parts of resource sets:
with the request body, delete a list of books from the existing list that have a certain property
POST /users/{userId}/books/delete?category=x
or DELETE /users/{userId}/books?category=x
or deleting all resources in a resource set:
POST /users/{userId}/books/delete_all
or DELETE /users/{userId}/books
Would be thankful for some hints or guidelines
"Resource sets", from the point of view of REST, are a fiction. There are only resources. As far as a general-purpose HTTP component is concerned, there is no relation implied by the following URIs:
/users
/users/{userId}
/users/{userId}/books
/users/{userId}/books?filter=x
/users/{userId}/books/concatenate
They are completely independent of one another; for instance, DELETE /users does not imply anything about the other resources.
We human beings tend to assign identifiers in patterns that make sense, but the machines don't care.
with the request body, add a list of books to the existing list and accepting duplicates (basically concatenating the list)
PUT and PATCH have remote authoring semantics; they act like you would expect if you were trying to edit a copy of a file on the server. You GET a copy of the current representation of the resource, make edits to your local copy, and then request that the server change its copy to match your copy. With PUT, you send a complete copy of your representation of the resource; with PATCH, you send a patch-document that describes the changes you made.
It's okay to use POST; HTML got along just fine using nothing but GET and POST, and the web took over the world.
You don't need a separate resource for POST; you can use one if you like, but it isn't necessary to do so.
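As an illustration of the PUT/PATCH distinction described above: PATCH does not prescribe a patch format, so this sketch assumes JSON Merge Patch (RFC 7396) as the patch document. The function name `merge_patch` is chosen here for the example; with PUT the client would send the whole edited document instead.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the patched document.

    A null (None) value in the patch removes the member; nested objects are
    patched recursively; anything else replaces the target value outright.
    """
    if not isinstance(patch, dict):
        return patch                      # non-object patch replaces the target
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)         # null means "delete this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result
```

For example, patching `{"title": "A", "owner": {"id": 1, "name": "n"}}` with `{"title": "B", "owner": {"name": None}}` changes the title and removes only the owner's name, whereas a PUT would have to carry the complete new representation.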
with the request body, add a list of books to the existing list but no duplicates (basically merging the list)
Not really any different; what we agree upon in HTTP is the semantics of the request and response messages. What the server chooses to do is an implementation concern. See Fielding 2002.
So if I send to you a representation of a list with duplicate entries, and you strip out the duplicates, that's "fine"; you just need to exercise some care with your responses to ensure that you don't imply that you accepted the requested representation as is.
With PATCH, it's a bit fuzzy, in that the RFC describes all or nothing semantics, but based on the language used it is reasonable to infer that the implementation is restricted as well.
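The difference between the two server-side behaviours from the question (concatenating versus merging) is purely an implementation choice behind the same request. A minimal sketch, with hypothetical function names and books represented as plain strings:

```python
def concatenate(existing, incoming):
    """Append the incoming books, keeping duplicates."""
    return list(existing) + list(incoming)


def merge(existing, incoming):
    """Append the incoming books, dropping duplicates but keeping first-seen order."""
    return list(dict.fromkeys(list(existing) + list(incoming)))
```

Either function could sit behind the same POST; what matters for the HTTP contract is that the response does not claim the submitted list was stored as is when duplicates were stripped.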
also for deleting parts of resource sets: with the request body, delete a list of books from the existing list that have a certain property
Give RFC 7231 a careful read: DELETE doesn't quite mean what your examples hint at. DELETE breaks the association between a key (the target URI) and a value (the resource representations), but that doesn't necessarily mean "and also garbage collect the representation".
The same idea expressed another way -- suppose I GET /list-of-books from the server, and the returned representation is a list of three books. In the case where I want that resource to instead return a representation of an empty list, DELETE is the wrong tool. DELETE tells the server that I want future calls to GET /list-of-books to return 404 Not Found or possibly 410 Gone. If what I really want is a 200 OK with an empty list, then I need to PUT/PATCH/POST/etc. the resource.
deleting all resources in a resource set
Same problem as before.
With REST it is pretty clear how to operate on resources
This is the problem - it is NOT clear how to operate on resources. The web is cluttered with literature that makes a complete hash of it (we use REST to fetch documents that mangle the lessons of REST -- fabulous irony).
REST includes a uniform interface as a constraint. In HTTP, that interface is effectively a document store. PUT and PATCH just edit document contents - which is perfectly satisfactory if your domain is anemic or declarative. For anything else where we don't have standardized semantics, we use POST.
See Jim Webber, 2011: "You have to learn how to use HTTP to trigger business activity as a side effect of moving documents around the network."

RESTful API - DELETE some of a resource collection?

I can DELETE a single resource like:
// Removes group 12 from employee 46
DELETE /employees/46/groups/12
I can DELETE a whole resource collection like:
// Removes all groups from employee 46
DELETE /employees/46/groups
I'm looking for the proper RESTful way to DELETE some of a resource collection.
DELETE /employees/46/groups { ids: [12, 15, 32] }
DELETE /employees/46/groups?ids=12,15,32
DELETE /employees/46/groups/xx (single, but call it 3 times)
Should query string parameters (?ids=12,15,32) only be used with GET..?
Should the request body ({ ids: [12, 15, 32] }) always be used with POST, PUT and DELETE..?
All three of these will work, but which one is the standard way to DELETE only some of a resource collection..?
JSON API uses approach number 1 (DELETE /employees/46/groups with a body). I think that’s fishy, because RFC 7231 § 4.3.5 basically says that the entire target resource (/employees/46/groups) is to be deleted, regardless of what’s sent in the body. However, others disagree.
I think DELETE /employees/46/groups?ids=12,15,32 is best, because it considers the set of groups you want to delete as a resource of its own. You can link to it in your hypermedia. You can later support GET on it (but you don’t have to).
No, there is absolutely nothing preventing you from sending non-GET requests with a query string. The query string is not some kind of “parameter” (although it’s often useful to treat it like that), it’s an integral part of the resource’s URI. In fact, you could use DELETE /api.php?type=employee&id=46&groups=12,15,32 and that would still be perfectly RESTful. The whole point of REST is that URIs (among other things) should be opaque to the client.
However, the query string approach may pose problems when you want to delete a really large number of groups in one request. If that happens, the simplest approach is a POST /bulk-delete-groups RPC call. You may also consider PATCH /employees/46/groups (but please read RFC 5789 errata first).
Most APIs don't allow deleting a collection of resources in one request, but it is possible to address several entities at once, like:
DELETE /employees?id=12,15,32
or
DELETE /employees?id=12&id=15&id=32
A good choice may be the non-REST way: send a custom JSON object containing the IDs marked for deletion.
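For the query-string variants shown in this thread, the server has to recover the ids from the target URI. A short sketch using only the standard library; the function name `ids_from_target` is made up, and it accepts both the comma-separated form (`?ids=12,15,32`) and the repeated-parameter form (`?id=12&id=15&id=32`):

```python
from urllib.parse import parse_qs, urlsplit


def ids_from_target(target, param="ids"):
    """Extract integer ids from the query string of a request target."""
    values = parse_qs(urlsplit(target).query).get(param, [])
    # Each value may itself be a comma-separated list; empty parts are skipped.
    return [int(part) for value in values for part in value.split(",") if part]
```

A non-numeric id raises `ValueError` here; a real handler would presumably translate that into a 400 response.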

What should DELETE /collection do if some items can't be deleted?

I have a collection of items and some of them may or may not be deleted, depending on some preconditions. If a user wants to delete a resource (DELETE /collection/1) and there are external dependencies on that resource, the server will return an error. But what should happen if the user wants to delete the entire collection (DELETE /collection)?
Should all the resources which can be deleted be deleted and the server return a 2xx, or should the server leave everything intact and return a 4xx? Which would be the expected behavior?
As a REST API consumer, I'd expect the operation to be atomic and maybe get back a 409 Conflict with details if one of the deletes fails. Plus the DELETE method is theoretically idempotent, as @jbarrueta pointed out.
Now, if undeletable resources are a normal occurrence in your use case and come up frequently, you may want to stray from the norm a little bit, delete all that can be deleted and return something like a 206 Partial Content (don't know if that's legal for DELETE though) with details about undeleted resources.
However, if you need to manage error cases finely, you might be better off sending separate DELETE commands.
I think the proper result is 204 No Content on success and 409 Conflict on failure because of the dependencies (as the others pointed out). I support atomicity as well.
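The atomic variant favoured in these answers can be sketched as follows. Everything here is illustrative: `delete_collection`, the `is_deletable` callback and the shape of the error body are assumptions, not part of any framework.

```python
def delete_collection(items, is_deletable):
    """All-or-nothing DELETE /collection over an in-memory dict.

    Returns (status, body): 409 with the blocking ids if anything cannot be
    deleted (and nothing is removed), otherwise 204 after removing everything.
    """
    blocked = [item_id for item_id in items if not is_deletable(item_id)]
    if blocked:
        return 409, {"undeletable": blocked}
    items.clear()
    return 204, None
```

Because the check runs before any removal, a 409 leaves the collection exactly as it was.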
I think you are thinking about REST as SOAP/RPC, which it is clearly not. Your REST service MUST fulfill the uniform interface constraint, which includes the HATEOAS interface constraint, so you MUST send hyperlinks to the client.
If we are talking about a simple link, like DELETE /collection, then you must send the link to the client only if the resource state transition it represents is available from the current resource state. So if you cannot delete the collection because of the dependencies, then you don't send a link for this transition, because it is not possible.
If it is a templated link, then you have to attach the "removable" property to the items, and set the checkboxes to disabled if it is false.
This way conflict happens only when the client got the link from a representation of a previous (stale) resource state, and in that case you must update the client state by querying the server again with GET.
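A rough sketch of that idea: the server only advertises the delete transition when it is currently possible. The link format (`_links`, `href`, `method`) and the function name are assumptions for illustration, not a specific hypermedia standard.

```python
def collection_representation(items, is_deletable):
    """Build a representation of /collection with a conditional delete link."""
    members = [{"id": i, "removable": is_deletable(i)} for i in sorted(items)]
    links = {"self": {"href": "/collection"}}
    # Advertise DELETE /collection only if every member can be removed now.
    if members and all(m["removable"] for m in members):
        links["delete"] = {"href": "/collection", "method": "DELETE"}
    return {"items": members, "_links": links}
```

A client that follows only advertised links would then hit a 409 only when acting on a stale representation.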
Another possible solution (of course, in combination with the previous ones) is to show the link and automatically remove the dependencies.
I guess you can use PATCH for bulk updates, which includes bulk removal, so that can be another solution as well.
I'd say it depends on your domain (although I'd rather use DELETE /collection/all instead of DELETE /collection/).
When you use delete-all but some items can't be deleted, what to do depends on your domain. If you are deleting everything to free up resources, and your business process suffers otherwise, then it makes sense to delete what can be deleted and put the rest into a retry queue. In that case the response should be OK.
Also situations could arise where there could be two operations
Clean Up - only delete unused
Delete All - delete all
In either situation I'd rather use a specific URL than DELETE on the root URL:
for Clean Up - DELETE /collection/unused
for Delete ALL - DELETE /collection/all

What is the restful way to represent a resource clone operation in the URL?

I have a REST API that exposes a large, complex resource, and I want to be able to clone this resource. Assume that the resource is exposed at /resources/{resourceId}
To clone resource 10 I could do something like:
GET /resources/10
POST /resources/ with a body containing a duplicate of the representation returned by GET /resources/10, without the id, so that the POST creates a new resource.
The problem with this approach is that the resource is very large and complex. It really makes no sense to return a full representation to the client and then have the client send it back, as that would be a total waste of bandwidth and of CPU on the server. Cloning the resource on the server is so much easier, so I want to do that.
I could do something like POST /resources/10/clone or POST /resources/clone/10, but both of these approaches feel wrong because of the verb in the URL.
What is the most "restful/nouny" way to build a URL that can be used in this type of situation?
Since there is no copy or clone method in HTTP, it's really up to you what you want to do. In this case a POST seems perfectly reasonable, but other standards have taken different approaches:
WebDAV added a COPY method.
Amazon S3 uses PUT with no body and a special x-amz-copy-source header. They call this a PUT Object - Copy.
Both of these approaches assume that you know the destination URI. Your example seems to lack a known destination uri, so you pretty much must use POST. You can't use PUT or COPY because your creation operation is not idempotent.
If your service defines POST /resources as "create a new resource", then why not simply define another way to specify the resource other than as the body of the POST? E.g. POST /resources?source=/resources/10 with an empty body.
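A server-side sketch of this suggestion, using an in-memory store. The `ResourceStore` class is hypothetical, and returning 404 for an unknown source is an assumption made here (the answer does not say what should happen in that case).

```python
import copy
from urllib.parse import parse_qs, urlsplit


class ResourceStore:
    """Toy handler for POST /resources and POST /resources?source=/resources/{id}."""

    def __init__(self):
        self.resources = {}   # id -> representation
        self._next_id = 1

    def _store(self, representation):
        new_id = self._next_id
        self._next_id += 1
        self.resources[new_id] = representation
        return f"/resources/{new_id}"

    def post(self, target, body=None):
        """Return (status, location). The copy happens entirely on the server."""
        query = parse_qs(urlsplit(target).query)
        if "source" in query:
            # Take the last path segment of the source URI as the id.
            source_id = query["source"][0].rstrip("/").rsplit("/", 1)[-1]
            if not source_id.isdigit() or int(source_id) not in self.resources:
                return 404, None
            representation = copy.deepcopy(self.resources[int(source_id)])
        else:
            representation = body
        return 201, self._store(representation)
```

The client never downloads or uploads the large representation; it only names the source and reads the new location from the response.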
Francis' answer is a great one and probably what you're looking for. With that said, it's not technically RESTful since (as he says in the comments) it does rely on the client providing out of band information. Since the question was "what is the restful way" and not "what is a good way/the best way", that got me thinking about whether there is a RESTful solution. And I think what follows is a RESTful solution, although I'm not sure that it's necessarily any better in practice.
Firstly, as you've already identified, GET followed by POST is the simple and obvious RESTful way, but it's not efficient. So we're looking for an optimization, and we shouldn't be too surprised if it feels a little less natural than that solution!
The POST + sourceId solution creates a special URL - one that points not to a resource, but to an instruction to do something. Any time you find yourself creating special URLs like that, it's worth considering whether you can work around the need to do that by simply defining more resources.
We want the ability to copy
resources/10
What if we come up with another resource:
resources/10/copies
...and the definition of this resource is simply "the collection of resources that are copies of resources/10".
With this resource defined, we can now re-state our copy operation in different terms - instead of saying "I want the server to copy resources/10", we can say "I want to add a new thing to the collection of things that are copies of resources/10".
This sounds strange, but it fits naturally into REST semantics. For instance, let's say this resource currently looks like this (I'm going to use a JSON representation here):
[]
We can just update that with a POST or PATCH [1]:
POST resources/10/copies
["resources/11"]
Note that all we're sending to the server is metadata about a collection, so it's very efficient. We can assume that the server now knows where to get the data to copy, since that's part of the definition of this resource. We can also assume that the client knows that this results in a new resource being created at "resources/11" for the same reason.
With this solution, everything is defined clearly as a resource, and everything has one canonical URL, and no out-of-band information is ever required by the client.
Ultimately, is it worth going with this strange-feeling solution just for the sake of being more RESTful? That probably depends on your individual project. But it's always interesting to try and frame the problem differently by creating different resources!
[1] I don't know if it makes sense to allow GET on "resources/10/copies". Obviously, as soon as either the original resource or a copy of it changes, the copy isn't really a copy any more and shouldn't be in this collection. Implementation-wise, I don't see the point in burdening the server with keeping track of that, so I think this should be treated as an update-only resource.
I think POST /resources/{id} would be a good solution to copy a resource.
Why?
POST /resources is the default REST-Standard to CREATE a new resource
POST /resources/{id} should not be possible in most REST apis, because that id already exists - you will never generate a new resource with you (the client) defining the id. The server will define the id.
Also note that you will never copy resource A on resource B. So if you want to copy existing resource with id=10, some answers suggest this kind of thing:
POST /resources?sourceId=10
POST /resources?copyid=10
but this is simpler:
POST /resources/10
which creates a copy of 10 - you have to retrieve 10 from storage, so if you don't find it, you cannot copy it = throw a 404 Not Found.
If it does exist, you create a copy of it.
So using this idea, you can see it does not make sense to do the following, copying some b resource to some a resource:
POST /resources?source=/resources/10
POST /resources-a?source=/resources-b/10
So why not simply use POST /resources/{id}
It will CREATE a new resource
The copy parent is defined by the {id}
The copy will be only on the same resource
It's the most REST-like variant
What do you think about this?
You want to create a copy of a specific resource. My approach in that case would be to use the following endpoint:
POST /resources/{id}/copy, read as "create a copy of resource {id}"
Just putting this out there in case it helps anyone.
We had a similar scenario, where we were providing "clone vm" as a feature for scaling out on our IaaS offering. So if a user wanted to scale out, they would have to hit the POST /vms/vm101 endpoint with the request body being
{
  "action": "clone",                      // Specifies the action to take, since our users can do a couple of other actions on a vm, like power_off/power_on etc.
  "body": {
    "name": ["vm102", "vm103", "vm104"],  // Names of the clones to make (one clone per name)
    "storage": 50, ...                    // Optional parameters for specifying differences in specs one would want from the base virtual machine
  }
}
and 3 clones of vm101, viz. vm102, vm103 and vm104, would be spun up.

Why use two step approach to deleting multiple items with REST

We all know the 'standard' way of deleting a single item via REST is to send a single DELETE request to a URI example.com/Items/666. Grand, let's move on to deleting many at once. As we do not require atomic deleting (or true transaction, ie all or nothing) we could just tell the client 'tough luck, make many requests' but that's not very nice is it. So we need a way to allow a client to request many 'Items' be deleted at once.
From my understanding, the 'typical' solution to this problem is a 'two step' approach. First the client POSTs a list of item IDs and is returned a URI such as example.com/Items/Collection/1. Once that collection is created, they call DELETE on it.
Now, I see that this works just fine, except that, to me, it is a bad solution. Firstly, you are forcing the client to make two requests to accommodate the server. Secondly, I thought DELETE was supposed to delete an Item; shouldn't calling DELETE on this URI effectively cancel the transaction (it's not a true transaction though)? How would we even cancel it? It really would be better if there was some form of 'EXECUTE' action, but I can't rock the boat that much. It also forces the server to have to consider 'the JSON that was POSTed looks more like a request to modify these Items, but the request was DELETE... so I guess I will delete them'. This approach also starts to impose a sort of state on the client/server; not a true state, I will admit, but it is sort of.
In my opinion, a better solution would be to simply call DELETE on example.com/Items (or maybe example.com/Items/Collection to imply this is a multiple delete) and pass JSON data containing a list of IDs that you wish to delete. As far as I can see, this basically solves all the problems the first method had. It is easier to use as a client, reduces the work the server has to do, is truly stateless, is more semantic.
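As a sketch of what such a single request could look like from a Python client, using only the standard library: the helper name and the `{"ids": [...]}` body shape are assumptions for illustration. Note that RFC 7231 says a payload in a DELETE request has no defined semantics, so some servers or intermediaries may reject or ignore it.

```python
import json
import urllib.request


def build_bulk_delete(url, ids):
    """Build (but do not send) a DELETE request carrying the ids as JSON."""
    payload = json.dumps({"ids": list(ids)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; everything the server needs is contained in that one request.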
I would really appreciate feedback on this: am I missing something about REST that makes my solution to this problem unrealistic? I would also appreciate links to articles, especially if they compare these two methods; I am aware this is not normally approved of for SO. I need to be able to disprove that only the first method is truly RESTful, and prove that the second approach is a viable solution. Of course, if I am barking up the wrong tree, do tell me.
I have spent the last week or so reading a fair bit on REST, and to the best of my understanding, it would be wrong to describe either of these solutions as 'RESTful'; rather, you should say that 'neither solution goes against what REST means'.
The short answer is simply that REST, as laid out in Roy Fielding's dissertation (see chapter 5), does not cover the topic of how to go about deleting resources, singular or multiple, in a REST manner. That's right, there is no 'correct RESTful way to delete a resource'... well, not quite.
REST itself does not define how to delete a resource, but it does define that whatever protocol you are using (remember that REST is not a protocol) will dictate how to perform these actions. The protocol will usually be HTTP; 'usually' being the key word, as Fielding will point out that REST is not synonymous with HTTP.
So we look to HTTP to say which method is 'right'. Sadly, as far as HTTP is concerned, both approaches are viable. Yes, 'viable'. HTTP will allow a client to send a POST request with a payload (to create a collection resource), and then call the DELETE method on this new collection to delete the resources; it will also allow you to send the data within the payload of a single DELETE request to delete the list of resources. HTTP is simply the medium by which you send requests to the server; it is up to the server to respond appropriately. To me, the HTTP protocol seems to be rather open to interpretation in places, but it does seem to lay down fairly clear guidelines for what actions mean, how they should be dealt with and what response should be given; it's just that it is a 'you should do this' rather than a 'you must do this', but perhaps I am being a little pedantic about the wording.
Some people would argue that the 'two stage' approach cannot possibly be 'REST' as the server has to store a 'state' for the client to perform the second action. This is simply a misunderstanding. It must be understood that neither the client nor the server is storing any 'state' information about the other between the list being POSTed and then subsequently being DELETEd. Yes, the list must have been created before it can be deleted, but the server does not remember that it was client alpha that made this list (such an approach would allow the client to simply call 'DELETE' as the next request and have the server remember to use that list, which would not be stateless at all). As such, the client must tell the server to DELETE that specific list, the list it was given a specific URI for. If the client attempted to DELETE a collection list that did not already exist, it would simply be told 'the resource cannot be found' (the classic 404 error, most likely). If you wish to claim that this two step approach does maintain a state, you must also claim that to simply GET a URI requires a state, as the URI must first exist. To claim that there is this 'state' persisting is to misunderstand what 'state' means. And as further 'proof' that such a two stage approach is indeed stateless, you could quite happily have client alpha POST the list and later have client beta (without having had any communication with the other client) call DELETE on the list resource.
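The two-step flow discussed here can be modelled in a few lines. This is a toy sketch with hypothetical names (`ItemService`, `post_collection`, `delete`); it only illustrates that the second step depends on the URI alone, not on which client created the collection.

```python
class ItemService:
    """Toy model of the two-step bulk delete over in-memory items."""

    def __init__(self, items):
        self.items = dict(items)
        self.collections = {}   # collection URI -> list of item ids
        self._next = 1

    def post_collection(self, ids):
        """Step 1: POST a list of ids; returns (201, URI of the new collection)."""
        uri = f"/Items/Collection/{self._next}"
        self._next += 1
        self.collections[uri] = list(ids)
        return 201, uri

    def delete(self, uri):
        """Step 2: DELETE the collection URI, removing the items it lists."""
        if uri not in self.collections:
            return 404
        for item_id in self.collections.pop(uri):
            self.items.pop(item_id, None)
        return 204
```

Nothing in `delete` refers to a client identity, so any client that knows the URI can perform the second step, and repeating it yields 404.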
I think it is rather self-evident that the second option, of just sending the list in the payload of the DELETE request, is stateless. All the information required to complete the request is stored completely within the one request.
It could be argued, though, that the DELETE action should only be called on a 'tangible' resource, but in doing so you are blatantly ignoring the REpresentational part of REST; it's in the name! It is the representational aspect that 'permits' URIs such as http://example.com/myService/timeNow, a URI that when 'got' will return, dynamically, the current time, without having to load some file or read from some database. It is a key concept that the URIs are not mapping directly to some 'tangible' piece of data.
There is however one aspect of that stateless nature that must be questioned. As Fielding describes the 'client-stateless-server' in section 5.1.3, he states:
We next add a constraint to the client-server interaction: communication must
be stateless in nature, as in the client-stateless-server (CSS) style of
Section 3.4.3 (Figure 5-3), such that each request from client to server must
contain all of the information necessary to understand the request, and
cannot take advantage of any stored context on the server. Session state is
therefore kept entirely on the client.
The key part here in my eyes is "cannot take advantage of any stored context on the server". Now I will grant you that 'context' is somewhat open to interpretation. But I find it hard to see how storing a list (either in memory or on disk) that will be used to give actual useful meaning would not violate this 'rule'. Without this 'list context' the DELETE operation makes no sense. As such, I can only conclude that making use of a two step approach to perform an action such as deleting multiple resources cannot and should not be considered 'RESTful'.
I also begrudge somewhat the effort that has had to be put into finding arguments either way for this. The Internet at large seems to have become swept up with this idea that the two step approach is the 'RESTful' way of doing such actions, with the reasoning 'it is the RESTful way to do it'. If you step back for a moment from what everybody else is doing, you will see that either approach requires sending the same list, so it can be ignored from the argument. Both approaches are 'representational' and 'stateless'. The only real difference is that for some reason one approach has decided to require two requests. These two requests then come with follow-up questions, such as 'how long do you keep that data for' and 'how does a client tell a server that it no longer wants this collection, but wishes to keep the actual resources it refers to'.
So I am, to a point, answering my question with the same question, 'Why would you even consider a two step approach?'
IMO:
HTTP DELETE on an existing collection to delete all of its members seems fine. Creating the collection just to delete all of the members sounds odd. As you yourself suggest, just pass the IDs of the items to be deleted using JSON (or any other payload format). I think that the server should try to make multiple deletes an internal transaction though.
I would argue that HTTP already provides a method of deleting multiple items in the form of persistent connections and pipelining. At the HTTP protocol level it is absolutely fine to request idempotent methods like DELETE in a pipelined way - that is, send all the DELETE requests at once on a single connection and wait for all the responses.
This may be problematic for an AJAX client running in a browser since few browsers have pipelining support enabled by default. This is not the fault of HTTP, though, it is the fault of those specific clients.