Good REST API design for operations on resource sets


With REST it is pretty clear how to operate on resources, e.g.
PUT /users/{userId} - updates the user with userId
GET /users/{userId} - reads the user with userId
Similarly for resource sets
POST /users - creates a new user
GET /users/{userId}/books - reads list of books from a user
GET /users/{userId}/books?filter=x - reads list of books from a user with specific filter
What if I want to develop more elaborate operations on resource sets, e.g.
with the request body, add a list of books to the existing list and accepting duplicates (basically concatenating the list)
POST /users/{userId}/books
or PUT /users/{userId}/books
or PATCH?
or POST /users/{userId}/books/concatenate
with the request body, add a list of books to the existing list but no duplicates (basically merging the list)
POST /users/{userId}/books
or PUT /users/{userId}/books
or PATCH?
or POST /users/{userId}/books/merge
also for deleting parts of resource sets:
with the request body, delete a list of books from the existing list that have a certain property
POST /users/{userId}/books/delete?category=x
or DELETE /users/{userId}/books?category=x
or deleting all resources in a resource set:
POST /users/{userId}/books/delete_all
or DELETE /users/{userId}/books
Would be thankful for some hints or guidelines

"Resource sets", from the point of view of REST, are a fiction. There are only resources. As far as a general-purpose HTTP component is concerned, there is no relation implied by the following URIs:
/users
/users/{userId}
/users/{userId}/books
/users/{userId}/books?filter=x
/users/{userId}/books/concatenate
They are completely independent of one another; for instance, DELETE /users does not imply anything about the other resources.
We human beings tend to assign identifiers in patterns that make sense, but the machines don't care.
with the request body, add a list of books to the existing list and accepting duplicates (basically concatenating the list)
PUT and PATCH have remote authoring semantics; they act like you would expect if you were trying to edit a copy of a file on the server. You GET a copy of the current representation of the resource, make edits to your local copy, and then request that the server change its copy to match your copy. With PUT, you send a complete copy of your representation of the resource; with PATCH, you send a patch-document that describes the changes you made.
It's okay to use POST; HTML got along just fine using nothing but GET and POST, and the web took over the world.
You don't need a separate resource for POST; you can use one if you like, but it isn't necessary to do so.
with the request body, add a list of books to the existing list but no duplicates (basically merging the list)
Not really any different; what we agree upon in HTTP is the semantics of the request and response messages. What the server chooses to do is an implementation concern. See Fielding 2002.
So if I send to you a representation of a list with duplicate entries, and you strip out the duplicates, that's "fine"; you just need to exercise some care with your responses to ensure that you don't imply that you accepted the requested representation as is.
With PATCH, it's a bit fuzzy, in that the RFC describes all or nothing semantics, but based on the language used it is reasonable to infer that the implementation is restricted as well.
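To make the "implementation concern" point concrete, here is a minimal sketch of how one server could handle the same request body either way. The function names, and the choice of the title as the key that defines a duplicate, are assumptions made for this illustration only.

```python
from typing import Dict, List

Book = Dict[str, str]


def concatenate(existing: List[Book], incoming: List[Book]) -> List[Book]:
    """Append every incoming book, keeping duplicates."""
    return existing + incoming


def merge(existing: List[Book], incoming: List[Book]) -> List[Book]:
    """Append only books whose (assumed) key, the title, is not present yet."""
    seen = {book["title"] for book in existing}
    result = list(existing)
    for book in incoming:
        if book["title"] not in seen:
            seen.add(book["title"])
            result.append(book)
    return result


shelf = [{"title": "Dune"}]
body = [{"title": "Dune"}, {"title": "Emma"}]

concatenated = concatenate(shelf, body)  # three entries, "Dune" twice
merged = merge(shelf, body)              # two entries, "Dune" once
```

Whichever behaviour the server picks, the response should describe the resulting list rather than imply that the body was stored exactly as sent.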
also for deleting parts of resource sets: with the request body, delete a list of books from the existing list that have a certain property
Give RFC 7231 a careful read: DELETE doesn't quite mean what your examples hint at. DELETE breaks the association between a key (the target URI) and a value (the resource representations), but that doesn't necessarily mean "and also garbage collect the representation".
The same idea expressed another way -- suppose I GET /list-of-books from the server, and the returned representation is a list of three books. In the case where I want that resource to instead return a representation of an empty list, DELETE is the wrong tool. DELETE tells the server that I want future calls to GET /list-of-books to return 404 Not Found or possibly 410 Gone. If what I really want is a 200 OK with an empty list, then I need to PUT/PATCH/POST/etc. the resource.
deleting all resources in a resource set
Same problem as before.
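The difference between "no representation" and "an empty list" can be sketched with a toy document store. The `store` dictionary and the three handler functions are hypothetical stand-ins for a real server, assuming the simplest possible mapping from URI to representation.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory store mapping a target URI to its representation.
store: Dict[str, List[str]] = {"/list-of-books": ["A", "B", "C"]}


def get(uri: str) -> Tuple[int, object]:
    """Return (status, body): 200 with the representation, or 404 if there is none."""
    if uri in store:
        return 200, store[uri]
    return 404, None


def put(uri: str, representation: List[str]) -> int:
    """Replace (or create) the representation stored under the URI."""
    created = uri not in store
    store[uri] = list(representation)
    return 201 if created else 204


def delete(uri: str) -> int:
    """Remove the association between the URI and its representation."""
    store.pop(uri, None)
    return 204


put("/list-of-books", [])
after_put = get("/list-of-books")      # the resource still exists, but is empty

delete("/list-of-books")
after_delete = get("/list-of-books")   # the resource has no current representation
```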
With REST it is pretty clear how to operate on resources
This is the problem - it is NOT clear how to operate on resources. The web is cluttered with literature that makes a complete hash of it (we use REST to fetch documents that mangle the lessons of REST -- fabulous irony).
REST includes a uniform interface as a constraint. In HTTP, that interface is effectively a document store. PUT and PATCH just edit document contents - which is perfectly satisfactory if your domain is anemic or declarative. For anything else where we don't have standardized semantics, we use POST.
See Jim Webber, 2011: "You have to learn how to use HTTP to trigger business activity as a side effect of moving documents around the network."

Related

How to request REST API correctly for GET & DELETE without resources ID?

Please let me know if I am misunderstanding.
Getting all active users
GET /api/users?active
What if I want to get all active user's messages
GET /api/users/active/messages
Or what if I want to delete all active user's messages
DELETE /api/users/no-active/messages

How to request REST API correctly for GET & DELETE without resources ID?
From the perspective of REST, this question doesn't make a lot of sense. Any named information can be a resource, and we use the resource identifier (aka, the URI) to identify which resource we are talking about.
GET /api/users?active
In this query, /api/users?active is a resource identifier (what RFC 7230 refers to as the request-target expressed in origin form).
Your resource, in this case is "all active users", or perhaps more precisely "the list of all active users"; the representation of that list will change over time depending on which users are currently active.
GET /api/users/active/messages
Same idea here, the resource is the list of messages.
Now normally when we are trying to modify a resource, we use the identifier of the resource as the target-uri for the change. So modifications to the list of messages would all share a common target-uri
POST /api/users/active/messages
PUT /api/users/active/messages
PATCH /api/users/active/messages
DELETE /api/users/active/messages
This is because the URI serves as a cache key, and general-purpose components that are familiar with HTTP caching semantics will know to invalidate any previously cached representations of the resource.
In HTTP, DELETE has a precise semantic meaning, which is to remove the association between the identifier and its representations. The natural consequence of a successful DELETE is that a subsequent GET would return a 404 Not Found (which means that the requested target-uri has no current representation).
If what you are intending is to modify the representation, then POST/PUT/PATCH are the more natural choices.
PUT /api/users/active/messages
Content-Type: application/json
[]
is a message that means "replace your current representation with this one".
Replacing one representation with another is pretty trivial when your implementation is just a document store - you validate the incoming representation, and then overwrite the old representation with the new. With dynamically generated representations, supporting the same semantics is a lot more work.
It may ease your life considerably to POST a "delete all messages" request to the resource, rather than trying to PUT a new representation.

Different methods can have same route:
Delete (DELETE) can still be the same:
DELETE /api/users/active/messages

REST delete multiple items in the batch

I need to delete multiple items by id in the batch however HTTP DELETE does not support a body payload.
Work around options:
1. #DELETE /path/abc?itemId=1&itemId=2&itemId=3 on the server side it will be parsed as List of ids and DELETE operation will be performed on each item.
2. #POST /path/abc including JSON payload containing all ids. { ids: [1, 2, 3] }
How bad this is and which option is preferable? Any alternatives?
Update: Please note that performance is a key here, it is not an option execute delete operation for each individual id.

Over the years, many people have had doubts about this, as we can see in the related questions listed alongside. The accepted answers range from "for sure do it" to "it's clearly mistreating the protocol". Since many of those questions were asked years ago, let's dig into the HTTP/1.1 specification from June 2014 (RFC 7231) to better understand what is clearly discouraged and what is not.
The first proposed workaround:
First, about resources and the URI itself on Section 2:
The target of an HTTP request is called a "resource". HTTP does not limit the nature of a resource; it merely defines an interface that might be used to interact with resources. Each resource is identified by a Uniform Resource Identifier (URI).
Based on it, some may argue that since HTTP does not limit the nature of a resource, a URI containing more than one id would be possible. I personally believe it's a matter of interpretation here.
About your first proposed workaround (DELETE '/path/abc?itemId=1&itemId=2&itemId=3') we can conclude that it's something discouraged if you think about a resource as a single document in your entity collection while being good to go if you think about a resource as the entity collection itself.
The second proposed workaround:
About your second proposed workaround (POST '/path/abc' with body: { ids: [1, 2, 3] }), using the POST method for deletion could be misleading. Section 4.3.3 says about POST:
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others): Providing a block of data, such as the fields entered into an HTML form, to a data-handling process; Posting a message to a bulletin board, newsgroup, mailing list, blog, or similar group of articles; Creating a new resource that has yet to be identified by the origin server; and Appending data to a resource's existing representation(s).
While there's some room for interpretation about the "among others" functions of POST, it clearly conflicts with the fact that we have the DELETE method for resource removal, as we can see in Section 4.1:
The DELETE method removes all current representations of the target resource.
So I personally strongly discourage the use of POST to delete resources.
An alternative workaround:
Inspired by your second workaround, we'd suggest one more:
DELETE '/path/abc' with body: { ids: [1, 2, 3] }
It's almost the same as proposed in workaround two, but uses the correct HTTP method for deletion. Here, we arrive at the confusion about using an entity body in a DELETE request. There are many people out there stating that it isn't valid, but let's stick with Section 4.3.5 of the specification:
A payload within a DELETE request message has no defined semantics; sending a payload body on a DELETE request might cause some existing implementations to reject the request.
So, we can conclude that the specification doesn't prevent DELETE from having a body payload. Unfortunately some existing implementations could reject the request... But how is this affecting us today?
It's hard to be 100% sure, but a modern request made with fetch only disallows a body for GET and HEAD. This is what the Fetch Standard states in Section 5.3, Item 34:
If either body exists and is non-null or inputBody is non-null, and request’s method is GET or HEAD, then throw a TypeError.
And we can confirm it's implemented in the same way in the fetch polyfill at line 342.
Final thoughts:
Since the alternative workaround with DELETE and a body payload is permitted by the HTTP specification, and is supported by all modern browsers with fetch and by IE10 and later with the polyfill, I recommend this way of doing batch deletes as valid and fully working.
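For comparison, here is a small sketch of what the query-string variant and the DELETE-with-body variant look like when built with Python's standard library. Nothing is sent over the network; the host name is a placeholder, and whether a given server or intermediary accepts a body on DELETE remains something to verify in your own setup.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

ids = [1, 2, 3]
base = "https://api.example.com/path/abc"  # hypothetical host

# Variant 1: the ids travel in the query string, so they are part of the target URI.
query_url = base + "?" + urlencode({"itemId": ids}, doseq=True)
by_query = Request(query_url, method="DELETE")

# Variant 2: the ids travel in a JSON body on the DELETE request.
by_body = Request(
    base,
    data=json.dumps({"ids": ids}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="DELETE",
)

# Server side, repeated query parameters parse back into a list of strings.
parsed_ids = [int(v) for v in parse_qs(urlsplit(query_url).query)["itemId"]]
```

Note that the two variants target different URIs: the first one addresses /path/abc?itemId=1&itemId=2&itemId=3, the second one addresses /path/abc itself.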

It's important to understand that the HTTP methods operate in the domain of "transferring documents across a network", and not in your own custom domain.
Your resource model is not your domain model is not your data model.
Alternative spelling: the REST API is a facade to make your domain look like a web site.
Behind the facade, the implementation can do what it likes, subject to the consideration that if the implementation does not comply with the semantics described by the messages, then it (and not the client) is responsible for any damages caused by the discrepancy.
DELETE /path/abc?itemId=1&itemId=2&itemId=3
So that HTTP request says specifically "Apply the delete semantics to the document described by /path/abc?itemId=1&itemId=2&itemId=3". The fact that this document is a composite of three different items in your durable store, each of which needs to be removed independently, is an implementation detail. Part of the point of REST is that clients are insulated from precisely this sort of knowledge.
However, and I feel like this is where many people get lost, the metadata returned by the response to that delete request tells the client nothing about resources with different identifiers.
As far as the client is concerned, /path/abc is a distinct identifier from /path/abc?itemId=1&itemId=2&itemId=3. So if the client did a GET of /path/abc, and received a representation that includes itemIds 1, 2, 3; and then submits the delete you describe, it will still have within its own cache the representation of /path/abc that includes those items after the delete succeeds.
This may, or may not, be what you want. If you are doing REST (via HTTP), it's the sort of thing you ought to be thinking about in your design.
POST /path/abc
some-useful-payload
This method tells the client that we are making some (possibly unsafe) change to /path/abc, and if it succeeds then the previous representation needs to be invalidated. The client should repeat its earlier GET /path/abc request to refresh its prior representation rather than using any earlier invalidated copy.
But as before, it doesn't affect the cached copies of other resources
/path/abc/1
/path/abc/2
/path/abc/3
All of these are still going to be sitting there in the cache, even though they have been "deleted".
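The invalidation behaviour described above can be sketched with a toy cache keyed by the exact request URI. This is a deliberate simplification: the `ToyCache` class is hypothetical, and it ignores freshness rules and the Location / Content-Location handling that real HTTP caches also apply.

```python
from typing import Dict

SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


class ToyCache:
    """A deliberately simplified cache keyed by the exact request URI."""

    def __init__(self) -> None:
        self.entries: Dict[str, str] = {}

    def store(self, uri: str, representation: str) -> None:
        self.entries[uri] = representation

    def on_successful_request(self, method: str, uri: str) -> None:
        # A successful unsafe request invalidates only the entry for its own target URI.
        if method not in SAFE_METHODS:
            self.entries.pop(uri, None)


cache = ToyCache()
cache.store("/path/abc", "items 1, 2, 3")
cache.store("/path/abc/1", "item 1")

# The DELETE targets a different identifier, so nothing cached is invalidated.
cache.on_successful_request("DELETE", "/path/abc?itemId=1&itemId=2&itemId=3")
still_cached = sorted(cache.entries)

# The POST targets /path/abc, so that entry (and only that entry) is dropped.
cache.on_successful_request("POST", "/path/abc")
after_post = sorted(cache.entries)
```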
To be completely fair, a lot of people don't care, because they aren't thinking about clients caching the data they get from the web server. And you can add metadata to the responses sent by the web server to communicate to the client (and intermediate components) that the representations don't support caching, or that the results can be cached but they must be revalidated with each use.
Again: Your resource model is not your domain model is not your data model. A REST API is a different way of thinking about what's going on, and the REST architectural style is tuned to solve a particular problem, and therefore may not be a good fit for the simpler problem you are trying to solve.
That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style. -- Fielding, 2008

Correct URI for REST calls to create & delete relationship between two entities

I need to create and delete relationships between two different entities through REST calls.
Let's say user A (the current user) is going to follow or un-follow user B. The existence of a follow relationship is denoted by the presence or absence of the Follow relationship entity (Follow(B, A) means that A follows B).
Should the calls be:
POST /api/follow/{user-b-id} // to follow
and
DELETE /api/follow/{user-b-id} // to un-follow
where the identity of user A is deduced from the token sent along to authenticate the call.
Or should they be based on the action being carried out:
POST /api/follow/{user-b-id} // to follow
and
POST /api/unfollow/{user-b-id} // to un-follow
I have doubts about which methods (POST, PUT, DELETE etc.) to use and whether the URIs should reference the action (verb?) being carried out. Since I am re-designing an API, I want to get as close to "correct" (yes, I do realize that's a little subjective) REST API design as makes sense for my project.

Correct URI for REST calls to create & delete relationship between two entities
REST doesn't care what spelling you use for your URI; /182b2559-5772-40fd-af84-297e3a4b4bcb is a perfectly fine URI as far as REST is concerned. The constraints on spelling don't come from REST, but instead from whatever the local coding standard is.
A common standard is to consider a collection resource that includes items; such that adding an item resource to a collection is modeled by sending a message to the collection resource, and removing the item resource is modeled by sending a message to the item resource. The Atom Publishing Protocol, for instance, works this way - a POST to a collection resource adds a new entry, a DELETE to the item resource removes the entry.
Following this approach, the usual guideline would be that the collection resource is named for the collection, with the item resources subordinate to it.
// Follow
POST /api/relationships
// Unfollow
DELETE /api/relationships/{id}
id here might be user-b-id or it might be something else; one of the core ideas in REST is that the server is the authority for its URI space; the server may embed information into the URI, at its own discretion and for its own exclusive use. Consumers are expected to treat the identifiers as opaque units.
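A minimal sketch of that collection/item pattern, assuming an in-memory store and a server-generated UUID as the item id (both are illustrative choices, not something the pattern requires):

```python
import uuid
from typing import Dict, Tuple

# Hypothetical in-memory store: relationship id -> (follower, followed).
relationships: Dict[str, Tuple[str, str]] = {}


def post_relationship(current_user: str, followed: str) -> Tuple[int, str]:
    """POST /api/relationships: create an item and return its server-chosen URI."""
    relationship_id = str(uuid.uuid4())  # opaque to the client
    relationships[relationship_id] = (current_user, followed)
    return 201, "/api/relationships/" + relationship_id


def delete_relationship(uri: str) -> int:
    """DELETE on the item URI returned earlier; 404 if no such item exists."""
    relationship_id = uri.rsplit("/", 1)[-1]
    if relationships.pop(relationship_id, None) is None:
        return 404
    return 204


status, location = post_relationship("user-a", "user-b")
first_delete = delete_relationship(location)   # removes the relationship
second_delete = delete_relationship(location)  # nothing left to remove
```

The client never constructs the item URI itself; it only replays the one the server handed back.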
I have doubts about which methods (POST, PUT, DELETE etc.) to use and whether the URIs should reference the action (verb?) being carried out.
It's sometimes helpful to keep in mind that the world wide web has been explosively successful even though the primary media type in use (HTML) supports only GET and POST natively.
Technically, you can use POST for everything. The HTTP uniform interface gives you carte blanche.
PUT, DELETE, PATCH can all be considered specializations of POST: unsafe methods with additional semantics. PUT suggests idempotent replace semantics, DELETE suggests remove, PATCH for an all or nothing update.
Referencing the action isn't wrong (REST doesn't care about spelling, remember), but it does suggest that you are thinking about the effects of the messages rather than about the resources that the messages are acting upon.
JSON Patch may be a useful example to keep in mind. The operations (add, remove, replace, and so on) are encoded into the patch document, the URI specifies which resource should be modified with those operations.
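To illustrate how the operations live in the document rather than in the URI, here is a heavily simplified sketch of applying JSON Patch style operations. It is not a conforming RFC 6902 implementation: it only handles add, remove and replace on object members, and skips array indices and pointer escaping. The `apply_patch` name and the sample document are made up for the example.

```python
import copy
from typing import Any, Dict, List


def apply_patch(document: Dict[str, Any], operations: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply a simplified subset of JSON Patch (add, remove, replace) to nested dicts.

    Only object members are supported; array indices and pointer escapes are left out.
    """
    result = copy.deepcopy(document)  # work on a copy so a failure changes nothing
    for op in operations:
        *parents, last = op["path"].lstrip("/").split("/")
        target = result
        for key in parents:
            target = target[key]
        if op["op"] == "add":
            target[last] = op["value"]
        elif op["op"] == "replace":
            if last not in target:
                raise KeyError(op["path"])
            target[last] = op["value"]
        elif op["op"] == "remove":
            del target[last]
        else:
            raise ValueError("unsupported operation: " + op["op"])
    return result


user = {"name": "A", "following": {"b": True}}
patched = apply_patch(user, [
    {"op": "add", "path": "/following/c", "value": True},
    {"op": "remove", "path": "/following/b"},
    {"op": "replace", "path": "/name", "value": "Alice"},
])
```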
Jim Webber expressed the idea this way - HTTP is a document transfer application. Useful work is a side effect of exchanging documents. The URI identify the documents that are used to navigate your integration protocol.
So if you need consistent, human readable spellings for your URI, one way to achieve this is by articulating that protocol and the documents from which it is composed.
Would it be correct to say that PUT is for replacing the entire entity (resource) and PATCH is for modifying a sub-set of the entity's (resource's) properties?
Not quite. PUT means the message-body of the request is a replacement representation of the resource. PATCH means the message-body of the request is a patch document.
There's nothing in the semantics that prevents you from using PUT to change a single element in a large document, or PATCH to completely replace a representation.
But a client might prefer PATCH to PUT because the patch document is much smaller than the replacement representation. Or it might prefer PUT to PATCH because the message transport is unreliable, and the idempotent semantics of PUT make retry easier.
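The retry argument can be shown with a toy store: repeating a PUT leaves the same state, while repeating a non-idempotent POST (here assumed to append) does not. The handler names and the /notes URI are hypothetical.

```python
from typing import Dict, List

documents: Dict[str, List[str]] = {"/notes": ["first"]}


def put(uri: str, representation: List[str]) -> None:
    """Replace the stored representation with the one supplied."""
    documents[uri] = list(representation)


def post_append(uri: str, item: str) -> None:
    """One possible POST behaviour: append the item to the existing list."""
    documents[uri].append(item)


for _ in range(2):  # the second call simulates a retry after a lost response
    put("/notes", ["first", "second"])
after_put_retry = list(documents["/notes"])

for _ in range(2):  # the same retry with an appending POST duplicates the item
    post_append("/notes", "third")
after_post_retry = list(documents["/notes"])
```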

The right decision also depends on the way other resources are mapped in the project. Using the same style throughout is better; however, if there's no preference, the following could have the advantage of being easier to implement and remember:
POST /api/follow/{user-b-id} // to follow
and
POST /api/unfollow/{user-b-id} // to un-follow

I would say, use the delete verb if you are passing in the id of the relationship/link/follow from a to b. This way, it is fairly explicit what your route is doing. It is accepting an id of some object and deleting it.
However, in your example, you are passing in the id of the other user, so you have to do some logic to find the relationship/link/follow object between the two and delete it. In my mind, this is more of a post than a delete because of the additional work you have to do. Regardless, it seems fairly subjective as to which one is "right".

What is the proper HTTP method for modifying a subordinate of the named resource?

I am creating a web client which has the purpose of modifying a set of database tables by adding records to them and removing records from them. It must do so atomically, so both deletion and insertion must be done with a single HTTP request. Clearly, this is a write operation of some sort, but I struggle to identify which method is appropriate.
POST seemed right at first, except that RFC 2616 specifies that a POST request must describe "a new subordinate" of the named resource. That isn't quite what I'm doing here.
PUT can be used to make changes to existing things, so that seemed about right, except that RFC 2616 also specifies that "the URI in a PUT request identifies the entity enclosed with the request [...] and the server MUST NOT attempt to apply the request to some other resource," which rules that method out because my URI does not directly specify the database tables.
PATCH seemed closer - now I am not cheating by only partly overwriting a resource - but RFC 5789 makes it clear that this method, like PUT, must actually modify the resource specified by the URI, not some subordinate resource.
So what method should I be using?
Or, more broadly for the benefit of other users:
For a request to X, you use
POST to create a new subordinate of X,
PUT to create a new X,
PATCH to modify X.
But what method should you use if you want to modify a subordinate of X?

To start.. not everything has to be REST. If REST is your hammer, everything may look like a nail.
If you really want to conform to REST ideals, PATCH is kind of out of the question. You're only really supposed to transfer state.
So the common 'solution' to this problem is to work outside the resources that you already have, but invent a new resource that represents the 'transaction' you wish to perform. This transaction can contain information about the operations you're doing in sequence, potentially atomically.
This allows you to PUT (or maybe POST) the transaction, and if needed, also GET the current state of the transaction to find out if it was successful.
In most designs this is not really appropriate though, and you should just fall back on POST and define a simple rpc-style action you perform on the parent.

First, allow me to correct your understanding of these methods.
POST is all about creating a brand new resource. You send some data to the server, and expect a response back saying where this new resource is created. The expectation would be that if you POST to /things/ the new resource will be stored at /things/theNewThing/. With POST you leave it to the server to decide the name of the resource that was created. Sending multiple identical POST requests results in multiple resources, each their own 'thing' with their own URI (unless the server has some additional logic to detect the duplicates).
PUT is mostly about creating a resource. The first major difference between PUT and POST is that PUT leaves the client in control of the URI. Generally, you don't really want this, but that's getting off the point. The other thing is that PUT does not modify: if you read the specification carefully, it states that you replace whatever resource is at a URI with a brand new version. This has the appearance of making a modification, but is actually just a brand new resource at the same URI.
PATCH is for, as the name suggests, patching a resource. You send data to the server describing how to modify a particular resource. Consider a huge resource: PATCH allows you to send just the tiny bit of data that you wish to change, whilst PUT would require you to send the entire new version.
Next, consider the resources. You have a set of tables each with many rows, which equates to a set of collections with many resources. Now, your problem is that you want to be able to atomically add resources and remove them at the same time. So you can't just POST then DELETE, as that's clearly not atomic. PATCHing the table, however, can be...
{ "add": [
{ /* a resource */ },
{ /* a resource */ } ],
"remove" : [ "id one", "id two" ] }
In that one body, we have sent the data to the server to both create two resources and delete two resources on the server. Now, there is a drawback to this, and that is that it's hard to let clients know what is going on. There's no 'proper' way of telling the client about the two new resources; 201 Created is sort of there, but it is meant to have a header for the URI of the one new resource... but we added two. Sadly, this is a problem you are going to face no matter what; HTTP simply isn't designed to handle multiple resources at once.
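On the server side, the all-or-nothing behaviour of such a patch could be sketched as "apply everything to a draft copy, then swap". The `apply_changes` function is hypothetical, and it assumes that added rows carry a client-supplied "id" field, which the body shown above does not specify.

```python
from typing import Any, Dict, List

table: Dict[str, Dict[str, Any]] = {
    "id one": {"name": "old-1"},
    "id two": {"name": "old-2"},
}


def apply_changes(
    rows: Dict[str, Dict[str, Any]],
    add: List[Dict[str, Any]],
    remove: List[str],
) -> Dict[str, Dict[str, Any]]:
    """Return a new table with all changes applied, or raise without touching `rows`."""
    draft = dict(rows)  # shallow copy: the original stays intact until we succeed
    for row_id in remove:
        if row_id not in draft:
            raise KeyError(row_id)
        del draft[row_id]
    for row in add:
        if row["id"] in draft:
            raise ValueError("duplicate id: " + row["id"])
        draft[row["id"]] = {k: v for k, v in row.items() if k != "id"}
    return draft


patch_body = {
    "add": [{"id": "id three", "name": "new-3"}, {"id": "id four", "name": "new-4"}],
    "remove": ["id one", "id two"],
}
table = apply_changes(table, patch_body["add"], patch_body["remove"])

# A patch that cannot be applied in full leaves the table exactly as it was.
try:
    apply_changes(table, add=[], remove=["missing"])
    failed = False
except KeyError:
    failed = True
```

In a real database the same effect would normally come from a transaction rather than from copying the table.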
Transaction Resources
So this is a common solution people propose, and I think it stinks. The basic idea is that you first POST/PUT a blob of data on the server that encodes the transaction you wish to make. You then use another method to 'activate' this transaction.
Well hang on... that's two requests... it sends the same data that you would via PATCH, and then you have to fudge HTTP even more in order to somehow 'activate' this transaction. And what's more, we have this 'transaction' resource now floating around! What do we even do with that?

I know this question has been asked already some time ago, but I thought I should provide some commentary to this myself. This is actually not a real "answer" but a response to thecoshman's answer. Unfortunately, I am unable to comment on his answer which would be the right thing to do, but I don't have enough "reputation" which is a strange (and unnecessary) concept, IMHO.
So, now on to my comment for @thecoshman:
You seem to question the concept of "transactional resources", but from your answer it looks to me as though you might have misunderstood the concept. In your answer, you describe that you first do a POST with the resource and the associated transaction and then POST another resource to "activate" this transaction. But I believe the concept of transactional resources is somewhat different.
Let me give you a simple example:
In a system you have a "customer" resource and its address, with the customer as the primary (or named) resource and the address as the subordinate resource. For this example, let us assume we have a customer with a customerId of 1234. The URI to reach this customer would be /api/customer/1234. So, how would you now update just the customer's address without having to update the entire customer resource? You could define a "transaction resource" called "updateCustomerAddress". With that you would then POST the updated customer address data (JSON or even XML) to the following URI: POST /api/customer/1234/updateCustomerAddress. The service would then create this new transactional resource to be applied to the customer with customerId=1234. Once the transaction resource has been created, the call would return with 201, although the actual change may not yet have been applied to the customer resource. So a subsequent GET /api/customer/1234 may return the old address, or already the new and updated address. This supports an asynchronous model for updating subordinate resources, or even named resources.
And what would we do with the created transactional resource? It would be completely opaque to the client and discarded as soon as the transaction has been completed. So the call may actually not return a URI of the transactional resource since it may have disappeared already by the time a client would try to access it.
As you can see, transactional resources should not require two HTTP calls to a service and can be done in just one.
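A rough sketch of that flow, assuming an in-memory queue stands in for whatever asynchronous processing the service really uses. The function names, the queue, and the sample addresses are all invented for the illustration.

```python
import uuid
from typing import Dict, List, Tuple

customers: Dict[str, Dict[str, str]] = {"1234": {"address": "Old Street 1"}}
pending: List[Tuple[str, str, str]] = []  # (transaction id, customer id, new address)


def post_update_customer_address(customer_id: str, address: str) -> Tuple[int, str]:
    """Accept the change as a transaction resource; the customer is not modified yet."""
    transaction_id = str(uuid.uuid4())
    pending.append((transaction_id, customer_id, address))
    return 201, transaction_id


def process_pending() -> None:
    """Apply and discard every queued transaction (stand-in for a background worker)."""
    while pending:
        _, customer_id, address = pending.pop(0)
        customers[customer_id]["address"] = address


status, _ = post_update_customer_address("1234", "New Road 2")
address_before = customers["1234"]["address"]  # a GET at this point may see the old address
process_pending()
address_after = customers["1234"]["address"]   # once processed, the new address is visible
```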

RFC 2616 is obsolete. Please read RFC 723* instead, in particular https://datatracker.ietf.org/doc/html/rfc7231#section-4.3.3.

PUT vs. POST when creating multiple resources that already have IDs

I'm building an API where the majority of endpoints accept and return multiple resources. I'm having trouble determining whether to use POST or PUT requests.
In my situation, resources are always created and identified outside of the API using UUIDs for identifiers. Since the resources already exist and are already identified, a PUT request seems appropriate. However, it would be impractical to include the UUID of each resource in the URI (the UUIDs are very long and there could be many resources in a single request).
A POST request also seems appropriate because even though the resource is already identified with an ID, it does not yet exist in our database. Also, with POST requests, there is no expectation of having the IDs in the URI.
Which would be the correct HTTP verb to use in this situation?
PUT /resources/1,2,3 ---> Impractical due to the number of resources per request
PUT /resources ---------> More practical but lacks IDs in the URI
POST /resources ---------> Possibly inaccurate verb since resource is already identified

Your three choices, as I see them, are:
Use PUT /resources/1, then PUT /resources/2, then PUT /resources/3. This is the "as designed" approach. It does result in 3 calls instead of 1, but lets you leverage the benefits of using PUT.
Use POST /resources, with the body of the POST containing all details on all the resources going up to the server, including the id. The server can create the resource from the id in the body. You lose the benefits of PUT, but save on wire traffic.
Use PATCH /resources, with the body of the PATCH containing all details on the resources to be created. This really only works if you're using JSON, since patch semantics for XML are iffy at best. The semantics are described in RFC 6902.
It is generally not suggested to perform a request on multiple ids. If you call PUT /resources, semantically you're saying that you're replacing the contents of /resources with the body you just sent up, which is not what you intend.
I would suggest the first approach unless you have a proven reason to avoid it (tested and have a performance issue). In that case, I'd seriously consider PATCH over POST.
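For the first approach, the client side amounts to building one PUT per resource, addressed by its client-assigned id. The sketch below only constructs the requests with the standard library and does not send them; the base URL and the resource fields are placeholders.

```python
import json
from typing import Dict, List
from urllib.request import Request

resources = [
    {"id": "1", "name": "first"},
    {"id": "2", "name": "second"},
    {"id": "3", "name": "third"},
]
base = "https://api.example.com/resources"  # hypothetical endpoint


def put_requests(items: List[Dict[str, str]]) -> List[Request]:
    """Build one PUT per resource, addressed by its client-assigned id."""
    return [
        Request(
            base + "/" + item["id"],
            data=json.dumps(item).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="PUT",
        )
        for item in items
    ]


requests_to_send = put_requests(resources)
targets = [r.full_url for r in requests_to_send]
```

Because each request is an independent PUT, any one of them can be retried on its own if its response is lost.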
