How to design DELETE REST API that requires lots of data?

I want to implement a DELETE REST API. But I need the option to provide a list of IDs to be deleted. This list could be arbitrarily long and may not fit within a URL.
I know POST supports a request body, but support for a body with DELETE seems debatable. I wonder how others are handling this case.
How would an API be designed to handle this case?

This is unfortunately one of the biggest limitations in REST, but there are ways around it.
In this case I would abstract out a new entity, DeletionRequest, and have that get posted or put with the appropriate IDs. Since it is a new entity it would have its own REST endpoints.
A nice side effect of this is that the endpoints and entity can be expanded out to support async requests. If you want to delete a ton of data you don't want to rely on it happening in a single request, as things like timeouts can get in the way. With a DeletionRequest the user can get an ID for the deletion request on the first push, and then check the status with a GET request. Behind the scenes you can use an async system (celery, sidekiq, etc) to actually delete things and update the status of the DeletionRequest.
You don't have to take it that far to start, of course, but this would allow you to expand the application in that direction without having to change your API.
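To make the DeletionRequest idea concrete, here is a minimal in-memory sketch. The class names, the "pending"/"completed" status values, and the `run_pending` worker stand-in are all assumptions for illustration; a real service would expose these methods as HTTP endpoints and hand the work to a job queue.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class DeletionRequest:
    """Hypothetical resource representing one bulk-delete job."""
    id: int
    ids: list
    status: str = "pending"  # becomes "completed" once the worker has run
    deleted: list = field(default_factory=list)


class DeletionService:
    def __init__(self, records):
        self.records = records   # record id -> record
        self.requests = {}       # request id -> DeletionRequest
        self._next_id = itertools.count(1)

    def post_deletion_request(self, ids):
        """POST on the deletion-request collection: store the job, return at once."""
        request = DeletionRequest(id=next(self._next_id), ids=list(ids))
        self.requests[request.id] = request
        return request

    def get_deletion_request(self, request_id):
        """GET on one deletion request: lets the client poll the status."""
        return self.requests[request_id]

    def run_pending(self):
        """Stand-in for the background worker (celery, sidekiq, etc)."""
        for request in self.requests.values():
            if request.status != "pending":
                continue
            for record_id in request.ids:
                if self.records.pop(record_id, None) is not None:
                    request.deleted.append(record_id)
            request.status = "completed"


service = DeletionService({1: "a", 2: "b", 3: "c"})
req = service.post_deletion_request([1, 3, 99])
print(service.get_deletion_request(req.id).status)  # pending
service.run_pending()
print(service.get_deletion_request(req.id).status)  # completed
```

The client only ever sees the request ID and its status, so the deletion can move from synchronous to asynchronous later without changing the API.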

The URI is the resource identifier, so in my opinion the DELETE should not contain a body even if you can do it with your client and server. Either you send your data in the URI or you send it prior to the DELETE.
I see 3 options here, but maybe there are others:
Do what Robert says and POST a transaction resource instead like DeletionRequest.
Group the resources you want to delete and DELETE the entire group.
Do a massive hack and PATCH the collection of resources you want to delete from.

Related

How to delete items from a subset with REST API

I'm wondering what are the best ways to delete items from a subset in a restful way. I got users and series, each user has his own list of series (watching, completed, etc). For example, if we want to get a list from a user we can do it with: GET /users/:id_user/series
If we want to delete a series from the list of that user (but we don't want to delete the series itself), how should it be done?
I thought about the possibility of using DELETE /users/:id_user/series/:id_serie, but I'm not sure if it's the correct way for this case (maybe PATCH?).
I have another case: series and reviews. We can get the reviews like this: GET /series/:serie_id/reviews. In the first case we didn't want to delete the series itself when deleting from a user's list of series, but in this case we do want to delete the review, because its existence depends on the series. So I guess in this case DELETE /series/:serie_id/reviews/:review_id is correct.
Is this difference important in order to choose the rest operation to delete the object/item from the subset?
How would you do it on the web?
You'd follow a link to a form, with input controls. You might have something like a dropdown if you wanted to delete one series at a time, or lots of check boxes if you wanted to support a bulk delete. The user would provide input, hit the submit button, and the browser would create an application/x-www-form-urlencoded document and send it to the server.
What method would be used? Normally POST, because we are intending an edit to some resource on the server.
What resource would we be editing? Well, in truth, it could be anything -- the server would include that information in the form metadata, so the client can just do what it is told.
So where should the server tell it to submit the form? Again, it could be anywhere... but a useful approach is to think about what resource in the client's cache is being updated. Because if we send the request to that resource, we get intelligent cache invalidation "for free".
So on the web, we would expect to see:
POST /users/:id_user/series
Does it have to be POST? On the HTML web, maybe it does, because the ubiquitous client of the web is a browser, not an editor.
It is okay to use POST.
But a perfectly valid alternative would be to edit the local copy of /users/:id_user/series, and then send back to the server a complete copy of the new version (PUT) or a patch-document describing the edits (PATCH). Notice that with both of these choices, the target uri is still /users/:id_user/series, so we still get the cache invalidation magic.
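As a rough sketch of the two alternatives, the function below takes a local copy of the series list and produces both the full replacement a PUT would carry and a patch document a PATCH could carry. The patch is written in the style of JSON Patch (RFC 6902) using only top-level "remove" operations; the field names `id` and `title` and both helper functions are made up for the example.

```python
import json


def remove_series(local_copy, ids_to_remove):
    """Return (put_body, patch_body) for removing entries from a list.

    put_body is the complete new representation; patch_body is a list of
    JSON Patch style "remove" operations describing the same edit.
    """
    remove = set(ids_to_remove)
    put_body = [item for item in local_copy if item["id"] not in remove]
    # Emit removals from the highest index down, so an earlier removal never
    # shifts the position that a later operation refers to.
    patch_body = [
        {"op": "remove", "path": f"/{index}"}
        for index in range(len(local_copy) - 1, -1, -1)
        if local_copy[index]["id"] in remove
    ]
    return put_body, patch_body


def apply_remove_patch(document, patch):
    """Apply the small subset of JSON Patch used above (top-level remove only)."""
    result = list(document)
    for operation in patch:
        if operation["op"] != "remove":
            raise ValueError("only 'remove' is supported in this sketch")
        del result[int(operation["path"].lstrip("/"))]
    return result


series = [{"id": 10, "title": "A"}, {"id": 11, "title": "B"}, {"id": 12, "title": "C"}]
put_body, patch_body = remove_series(series, [10, 12])
print(json.dumps(put_body))    # body for PUT /users/:id_user/series
print(json.dumps(patch_body))  # body for PATCH /users/:id_user/series
```

Either body is sent to the same target URI, which is the point of the argument above: the cached copy of the list is the thing being invalidated.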
Creating a new resource in your model just to have something to DELETE is probably the wrong idea.
There are cases where an edit, or a delete, will necessarily impact more than one resource.
There are some specific circumstances when you can get the right magic cache invalidation with two resources (for instance, delete one document, and send back an updated copy of another).
But we don't, today, have a general purpose cache invalidation mechanism. (Closest thing I've been able to find is this, which seems to have stalled out in 2012.)

Restful business logic on property update

I'm building a REST API and I'm trying to keep it as RESTful as possible, but some things are still not quite clear to me. I have seen a lot of topics about similar questions, but they all focus on the "simple" problem of updating data; my issue is more about the business logic around that.
My main issue is with business logic triggered by a partial update of a model. I see a lot of different opinions online about PATCH methods, creating new sub-resources or adding actions, but these often seem counterproductive given the REST approach of keeping URIs simple and structured.
I have some records that need to be processed (refused, validated, partially validated, etc.), and each change triggers additional actions.
If it's refused, an email with the reason should be sent.
If it's partially validated, a link to fill in the missing data is sent.
If it's validated, some other resources must be created.
There are a few other changes that can be made to the status, but this is enough for the example.
What would be a RESTful way to do that ?
My first idea would be to create actions :
POST /record/:id/refuse
POST /record/:id/validate ..etc
It seems RESTful to me but too complicated, and moreover, this approach means having multiple routes performing essentially the same thing: updating one field in the record object.
I also see the possibility of a PATCH method like:
PATCH /record/:id, in which I check whether the field to update is status, and use the new value to decide which action to perform.
But I feel it could become too complex when I need to perform similar actions for other properties of the record.
My last option, and I think maybe the best but I'm not sure if it's RESTful, would be to use a sub-resource status and to use PUT to update it:
PUT /record/:id/status, with a switch on the new value.
No matter what the previous value was, switching to accepted will always trigger the creation, switching to refused will always trigger the email, etc.
Are those ways of achieving that RESTful, and which one makes more sense? Are there other alternatives I didn't think about?
Thanks
What would be a RESTful way to do that ?
In HTTP, your "uniform interface" is that of a document store. Your Rest API is a facade, that takes messages with remote authoring semantics (PUT/POST/PATCH), and your implementation produces useful work as a side effect of its handling of those messages.
See Jim Webber 2011.
I have some records that need to be processed (refused, validated, partially validated, etc.), and each change triggers additional actions.
So think about how we might do this on the web. We GET some resource, and what is returned is an html representation of the information of the record and a bunch of forms that describe actions we can do. So there's a refused form, and a validated form, and so on. The user chooses the correct form to use in the browser, fills in any supplementary information, and submits the form. The browser, using the HTML form processing rules, converts the form information into an HTTP request.
For unsafe operations, the form is configured to use POST, and the browsers therefore know that the form data should be part of the message-body of the request.
The target-uri of the request is just whatever was used as the form action -- which is to say, the representation of the form includes in it the information that describes where the form should be submitted.
As far as the browser and the user are concerned, the target-uri can be anything. So you could have separate resources to handle validate messages and refused messages and so on.
Caching is an important idea, both in REST and in HTTP; HTTP has specific rules baked into it for cache invalidation. Therefore, it is often the case that you will want to use a target-uri that identifies the document you want the client to reload if the command is successful.
So it might go something like this: we GET /record/123, and that gives us a bunch of information, and also some forms describing how we can change the record. So fill one out, submit it successfully, and now we expect the forms to be gone - or a new set of forms to be available. Therefore, it's the record document itself that we would expect to be reloading, and the target-uri of the forms should be /record/123.
(So the API implementation would be responsible for looking at the HTTP request, and figuring out the meaning of the message. They might all go to a single /record/:id POST handler, and that code looks through the message-body to figure out which internal function should do the work).
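A single POST handler of that kind might look roughly like the sketch below. The form field names `action` and `reason`, the handler functions, and the returned strings are assumptions for illustration; only `urllib.parse.parse_qs` is a real library call, used here to decode an application/x-www-form-urlencoded body.

```python
from urllib.parse import parse_qs


def refuse(record, form):
    """Hypothetical internal function: mark refused and report the email."""
    record["status"] = "refused"
    reason = form.get("reason", [""])[0]
    return f"email sent: {reason}"


def validate(record, form):
    """Hypothetical internal function: mark validated."""
    record["status"] = "validated"
    return "dependent resources created"


HANDLERS = {"refuse": refuse, "validate": validate}


def post_record(record, body):
    """Handle POST /record/<id>: pick the internal function from the body."""
    form = parse_qs(body)
    action = form.get("action", [None])[0]
    handler = HANDLERS.get(action)
    if handler is None:
        return 400, "unknown action"
    return 200, handler(record, form)


record = {"id": 123, "status": "pending"}
print(post_record(record, "action=refuse&reason=missing+signature"))
print(record["status"])
```

Every form targets the same URI, so a successful submission invalidates the cached copy of the record itself.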
PUT/PATCH are the same sort of idea, except that instead of submitting forms, we send edited representations of the resource itself. We GET /record/123, change the status (for example, to Rejected), and then send a copy of our new representation of the record to the server for processing. It would therefore be the responsibility of the server to examine the differences between its representation of the resource and the new provided copy, and calculate from them any necessary side effects.
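For the PUT variant, the server-side comparison could be sketched as follows. The status names come from the question; the side-effect labels and the `put_record` function are hypothetical, and a real implementation would call the mailer or create resources instead of returning strings.

```python
# Hypothetical mapping from a newly set status to the side effect it triggers.
SIDE_EFFECTS = {
    "refused": "send_refusal_email",
    "partially_validated": "send_completion_link",
    "validated": "create_dependent_resources",
}


def put_record(stored, submitted):
    """Handle PUT /record/<id>: replace the stored copy, return side effects.

    Side effects are derived from the difference between the stored
    representation and the submitted one, not from the request URI.
    """
    effects = []
    old_status = stored.get("status")
    new_status = submitted.get("status")
    if new_status != old_status and new_status in SIDE_EFFECTS:
        effects.append(SIDE_EFFECTS[new_status])
    replacement = dict(submitted)
    stored.clear()
    stored.update(replacement)
    return effects


stored = {"id": 123, "status": "pending", "owner": "alice"}
edited = dict(stored, status="refused")
print(put_record(stored, edited))  # side effect fires on the change
print(put_record(stored, edited))  # same PUT again: nothing changed, no effect
```

Because the effects depend on the difference rather than on the request itself, repeating the same PUT does not send a second email, which keeps the method idempotent in practice.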
My last option, and I think maybe the best but I'm not sure if it's RESTful, would be to use a sub-resource status and to use PUT to update it
It's fine -- think of any web page you have ever seen where the source has a link to an image, or a link to JavaScript. The result is two resources instead of one, with separate cache entries for each -- which is great, when you want fine grained control over the caching of the resources.
But there's a trade - you also need to fetch more resources. (Server-push mitigates some of this problem).
Making things easier on the server may make things harder on the client - you're really trying to find the design with the best balance.

REST API - Design a POST API - if its called multiple times for a same user

I'm new to designing RESTful APIs and currently developing APIs to manage students in a school.
Each student has a unique roll number that clients provide while adding/creating a user. Service creates an internal id that is unique for every user that is added.
If clients make multiple POST calls for the same user, what are the recommended options in this scenario? Success with an existing resource id? or an error? or something else.
If clients make multiple POST calls for the same user, what are the recommended options in this scenario? Success with an existing resource id? or an error? or something else.
One important thing to remember is that, on an unreliable network, the client cannot distinguish between a lost request and a lost response. So you will probably benefit from having a clear protocol in place to handle that condition.
Idempotent request handling is probably your best bet: tell the client that the user was created successfully as many times as it takes.
There's an edge case where you get two messages with the same unique identifier, but the other data is different, and you should work through the protocol to figure out the correct behavior in that case (first writer wins? last writer wins? raise a conflict?) keeping in mind that you have no guarantees that requests arrive in the order that they were sent.
Note: because you are using POST, general purpose components will not know that the request is idempotent, and won't be able to take advantage of that, which is fine. A resource model that supports PUT, rather than POST, would allow the general purpose components to handle lost messages, but there are other trade offs (for instance, HTML forms don't support PUT).
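A minimal sketch of idempotent handling keyed on the roll number is shown below. The `StudentRegistry` class and the choice of status codes (201 on first creation, 200 on an identical repeat, 409 when the repeat carries different data) are assumptions; "raise a conflict" is only one of the policies mentioned above.

```python
import itertools


class StudentRegistry:
    """Hypothetical in-memory store keyed by the client-supplied roll number."""

    def __init__(self):
        self._by_roll = {}
        self._ids = itertools.count(1)

    def create(self, roll_number, data):
        """Handle POST /students; returns (status_code, student)."""
        existing = self._by_roll.get(roll_number)
        if existing is None:
            student = {"id": next(self._ids), "roll_number": roll_number, **data}
            self._by_roll[roll_number] = student
            return 201, student
        stored_data = {
            key: value
            for key, value in existing.items()
            if key not in ("id", "roll_number")
        }
        if stored_data == data:
            # Same request seen again (for example a retry after a lost
            # response): report success with the same internal id.
            return 200, existing
        # Same roll number but different data: this sketch raises a conflict.
        return 409, existing


registry = StudentRegistry()
print(registry.create("R-17", {"name": "Ada"}))
print(registry.create("R-17", {"name": "Ada"}))
print(registry.create("R-17", {"name": "Grace"})[0])
```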
You have two options, POST and PUT, you can choose one of these or both based on your requirement.
If you choose POST, and if the resource already exists, throw an error saying the resource exists.
If you choose PUT, and if the resource already exists, then update the resource and return the existing resource id.
These are widely followed conventions which are intuitive for the api consumers. If you are deviating from these for any special cases then you have to make sure that the api consumers are aware of your convention.
This link might be super useful - PUT vs. POST in REST

Delete multiple records using REST

What is the REST-ful way of deleting multiple items?
My use case is that I have a Backbone Collection wherein I need to be able to delete multiple items at once. The options seem to be:
1. Send a DELETE request for every single record (which seems like a bad idea if there are potentially dozens of items);
2. Send a DELETE where the ID's to delete are strung together in the URL (i.e., "/records/1;2;3");
3. In a non-REST way, send a custom JSON object containing the ID's marked for deletion.
All options are less than ideal.
This seems like a gray area of the REST convention.
1. Is a viable RESTful choice, but obviously has the limitations you have described.
2. Don't do this. It would be construed by intermediaries as meaning “DELETE the (single) resource at /records/1;2;3”, so a 2xx response to this may cause them to purge their cache of /records/1;2;3; not purge /records/1, /records/2 or /records/3; proxy a 410 response for /records/1;2;3, or other things that don't make sense from your point of view.
3. This choice is best, and can be done RESTfully. If you are creating an API and you want to allow mass changes to resources, you can use REST to do it, but exactly how is not immediately obvious to many. One method is to create a ‘change request’ resource (e.g. by POSTing a body such as records=[1,2,3] to /delete-requests) and poll the created resource (specified by the Location header of the response) to find out if your request has been accepted, rejected, is in progress or has completed. This is useful for long-running operations. Another way is to send a PATCH request to the list resource, /records, the body of which contains a list of resources and actions to perform on those resources (in whatever format you want to support). This is useful for quick operations where the response code for the request can indicate the outcome of the operation.
Everything can be achieved whilst keeping within the constraints of REST, and usually the answer is to make the "problem" into a resource, and give it a URL.
So, batch operations, such as delete here, or POSTing multiple items to a list, or making the same edit to a swathe of resources, can all be handled by creating a "batch operations" list and POSTing your new operation to it.
Don't forget, REST isn't the only way to solve any problem. “REST” is just an architectural style and you don't have to adhere to it (but you lose certain benefits of the internet if you don't). I suggest you look down this list of HTTP API architectures and pick the one that suits you. Just make yourself aware of what you lose out on if you choose another architecture, and make an informed decision based on your use case.
There are some bad answers to this question on Patterns for handling batch operations in REST web services? which have far too many upvotes, but ought to be read too.
If GET /records?filteringCriteria returns array of all records matching the criteria, then DELETE /records?filteringCriteria could delete all such records.
In this case the answer to your question would be DELETE /records?id=1&id=2&id=3.
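On the server side, a URL with repeated `id` parameters could be handled roughly like this. The function names and the decision to reject a DELETE with no filter (instead of deleting every record) are assumptions of the sketch; `urlsplit` and `parse_qs` are from the Python standard library.

```python
from urllib.parse import parse_qs, urlsplit


def ids_from_url(url):
    """Collect every repeated `id` query parameter as an integer."""
    query = parse_qs(urlsplit(url).query)
    return [int(value) for value in query.get("id", [])]


def delete_records(records, url):
    """Handle DELETE /records?id=...; returns (status_code, deleted_ids)."""
    ids = ids_from_url(url)
    if not ids:
        # Assumption: an unfiltered DELETE is rejected rather than
        # treated as "delete the whole collection".
        return 400, []
    deleted = [i for i in ids if records.pop(i, None) is not None]
    return 200, deleted


records = {1: "a", 2: "b", 3: "c", 4: "d"}
print(delete_records(records, "/records?id=1&id=2&id=3"))
print(sorted(records))
```

This still runs into URL length limits for very long lists, which is the limitation the original question started from.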
I think Mozilla Storage Service SyncStorage API v1.5 is a good way to delete multiple records using REST.
Deletes an entire collection.
DELETE https://<endpoint-url>/storage/<collection>
Deletes multiple BSOs from a collection with a single request.
DELETE https://<endpoint-url>/storage/<collection>?ids=<ids>
ids: deletes BSOs from the collection whose ids that are in the provided comma-separated list. A maximum of 100 ids may be provided.
Deletes the BSO at the given location.
DELETE https://<endpoint-url>/storage/<collection>/<id>
http://moz-services-docs.readthedocs.io/en/latest/storage/apis-1.5.html#api-instructions
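With a cap of 100 ids per request, a client with a longer list has to split it into several DELETE calls. The helper below is a sketch of that client-side batching; the function name is made up, the path is shown relative to the endpoint URL, and only the `ids` parameter and the limit of 100 come from the quoted documentation.

```python
from urllib.parse import quote

MAX_IDS_PER_REQUEST = 100  # limit stated in the quoted documentation


def batched_delete_urls(collection, ids, limit=MAX_IDS_PER_REQUEST):
    """Return one relative DELETE URL per batch of at most `limit` ids."""
    urls = []
    for start in range(0, len(ids), limit):
        batch = ids[start:start + limit]
        # Percent-encode each id on its own so the separating commas stay literal.
        joined = ",".join(quote(str(item), safe="") for item in batch)
        urls.append(f"/storage/{quote(collection, safe='')}?ids={joined}")
    return urls


urls = batched_delete_urls("bookmarks", [f"bso{n}" for n in range(250)])
print(len(urls))
print(urls[2][:45])
```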
This seems like a gray area of the REST convention.
Yes, so far I have only come across one REST API design guide that mentions batch operations (such as a batch delete): the Google API design guide.
This guide mentions the creation of "custom" methods that can be associated with a resource by using a colon, e.g. https://service.name/v1/some/resource/name:customVerb. It also explicitly mentions batch operations as a use case:
A custom method can be associated with a resource, a collection, or a service. It may take an arbitrary request and return an arbitrary response, and also supports streaming request and response. [...] Custom methods should use HTTP POST verb since it has the most flexible semantics [...] For performance critical methods, it may be useful to provide custom batch methods to reduce per-request overhead.
So you could do the following according to Google's API guide:
POST /api/path/to/your/collection:batchDelete
...to delete a bunch of items of your collection resource.
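A server could recognise the `:batchDelete` suffix roughly as sketched here. The routing helper, the `/api/shoes` collection, and the `{"ids": [...]}` request body are all assumptions for illustration and are not taken from Google's guide.

```python
def split_custom_method(path):
    """Split '/some/collection:verb' into (resource_path, verb_or_None)."""
    resource, separator, verb = path.rpartition(":")
    if not separator or not verb or "/" in verb:
        return path, None
    return resource, verb


def handle_post(collections, path, body):
    """Handle POST <collection>:batchDelete; returns (status_code, deleted_ids)."""
    resource, verb = split_custom_method(path)
    if verb != "batchDelete" or resource not in collections:
        return 404, []
    items = collections[resource]
    deleted = [i for i in body.get("ids", []) if items.pop(i, None) is not None]
    return 200, deleted


collections = {"/api/shoes": {1: "boot", 2: "sandal", 3: "clog"}}
print(handle_post(collections, "/api/shoes:batchDelete", {"ids": [1, 3]}))
print(collections["/api/shoes"])
```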
I've allowed for a wholesale replacement of a collection, e.g. PUT ~/people/123/shoes where the body is the entire collection representation.
This works for small child collections of items where the client wants to review the items, prune out some, add some others, and then update the server. They could PUT an empty collection to delete all.
This would mean GET ~/people/123/shoes/9 would still remain in cache even though a PUT deleted it, but that's just a caching issue and would be a problem if some other person deleted the shoe.
My data/systems APIs always use ETags as opposed to expiry times so the server is hit on each request, and I require correct version/concurrency headers to mutate the data. For APIs that are read-only and view/report aligned, I do use expiry times to reduce hits on origin, e.g. a leaderboard can be good for 10 mins.
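The ETag plus concurrency-header requirement could be sketched like this for the collection PUT described above. The hashing scheme, the function names, and the use of 428 when the header is missing are assumptions of the sketch; 412 Precondition Failed is the standard response when `If-Match` does not match the current version.

```python
import hashlib
import json


def etag_for(representation):
    """Derive a quoted ETag from the JSON form of a representation."""
    payload = json.dumps(representation, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'


def put_collection(store, path, new_representation, if_match):
    """Handle PUT <path> with a mandatory If-Match header; returns a status code."""
    if if_match is None:
        return 428  # Precondition Required: client must send a version
    if if_match != etag_for(store[path]):
        return 412  # Precondition Failed: client edited a stale copy
    store[path] = new_representation
    return 200


store = {"/people/123/shoes": [{"id": 9}, {"id": 10}]}
tag = etag_for(store["/people/123/shoes"])
print(put_collection(store, "/people/123/shoes", [{"id": 10}], tag))  # accepted
print(put_collection(store, "/people/123/shoes", [], tag))  # tag is now stale
```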
For much larger collections, like ~/people, I tend not to need multiple delete, the use-case tends not to naturally arise and so single DELETE works fine.
In future, and from experience with building REST APIs and hitting the same issues and requirements, like audit, I'd be inclined to use only GET and POST verbs and design around events, e.g. POST a change of address event, though I suspect that'll come with its own set of problems :)
I'd also allow front-end devs to build their own APIs that consume stricter back-end APIs since there's often practical, valid client-side reasons why they don't like strict "Fielding zealot" REST API designs, and for productivity and cache layering reasons.
You can POST a deleted resource :). The URL will be
POST /deleted-records
and the body will be
{"ids": [1, 2, 3]}

RESTful Soft Delete

I'm trying to build a RESTful webapp wherein I utilize GET, POST, PUT, and DELETE. But I had a question about the use of DELETE in this particular app.
A bit of background first:
My webapp manages generic entities that are also managed (and, it happens, always created) in another system. So within my webapp, each entity will be stored in the database with a unique key. But the way we will be accessing them through URLs is with the unique key of the other system.
A simple example will make this clear, I think. Take the URL /entity/1. This will display information for the entity with ID 1 in the other system, and not my own system. In fact, IDs in my system will be completely hidden. There will be no URL scheme for accessing the entity with ID of 1 in my own system.
Alright, so now that we know how my webapp is structured, let's return to deleting those entities.
There will be a way to 'delete' entities in my system, but I put quotes around it because it won't actually be deleting them from the database. Rather, it will flag them with a property that prevents it from appearing when you go to /entity/1.
Because of this, I feel like I should be using PUT ('deleting' in this way will be idempotent), since I am, from the perspective of the data, simply setting a property.
So, the question: does the RESTful approach have fidelity to the data (in which case it is clear that I am PUTing), or the representation of the data in the app (in which case it seems that I am DELETEing)?
You should use DELETE.
What you intend to do with your data is called "soft deleting": you set a flag and prevent flagged items from appearing. This is internal to your webapp, and the user doesn't have to know whether you're soft deleting or really deleting. This is why you should use the DELETE verb.
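A minimal sketch of a soft delete hidden behind DELETE follows. The `EntityStore` class and the `deleted` flag name are hypothetical, and returning 404 for a second DELETE is a choice made here so that the behaviour looks the same as a hard delete from the client's side.

```python
class EntityStore:
    """Hypothetical store keyed by the other system's entity id."""

    def __init__(self, entities):
        self._rows = {
            key: {"data": value, "deleted": False}
            for key, value in entities.items()
        }

    def get(self, external_id):
        """Handle GET /entity/<id>; flagged rows look like missing rows."""
        row = self._rows.get(external_id)
        if row is None or row["deleted"]:
            return 404, None
        return 200, row["data"]

    def delete(self, external_id):
        """Handle DELETE /entity/<id> by setting the flag, not removing the row."""
        row = self._rows.get(external_id)
        if row is None or row["deleted"]:
            return 404
        row["deleted"] = True
        return 204

    def row_count(self):
        """Number of rows still physically stored, flagged or not."""
        return len(self._rows)


entities = EntityStore({1: {"name": "widget"}})
print(entities.get(1))
print(entities.delete(1))
print(entities.get(1))
print(entities.row_count())
```

Repeating the DELETE leaves the server state unchanged, which is what idempotence requires; the status code is allowed to differ between the first and later calls.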
I think there is no definitive answer. I'd rely on whether 1. the soft-delete, recover and destroy actions are an actual feature of your api OR 2. soft-delete is merely a "paranoid" database engineering pattern.
The "soft" deletion is transparent for the api client, in which case using the DELETE verb seems like the way to go
Everything is as if the item was to be removed once and for all, but engineers want to keep it somewhere in the database
Api clients have the ability to recover or destroy the soft deleted resource, in which case soft deletion and recovery can use POST on a different action url like /resource/:id/softdelete and the destroy action would be the one using DELETE.
Another way to go may be to use DELETE with no query parameter to soft delete, and add ?destroy=true to actually destroy. But this approach seems less explicit and more prone to errors.
The DELETE method has very specific semantics in HTTP, which must not be overloaded
or stretched by a REST API’s design. Specifically, an API should not distort the intended
meaning of DELETE by mapping it to a lesser action that leaves the resource, and its URI,
available to clients. For example, if an API wishes to provide a “soft” delete or some
other state-changing interaction, it should employ a special controller resource and
direct its clients to use POST instead of DELETE to interact.
Source: REST API Design Rulebook by Mark Massé
Suggestion:
POST: /entity/1/your-soft-delete-controller-name