Updating composite entities in a RESTful resource


I have an entity with several attributes, say «project». Apart from simple attributes, the project may have a list of «statuses» of which the last one is the current one. I have a web form to create/edit a project. All attributes of this project can be changed in this form, and also users can add a new status for the project (but they can’t change or delete old statuses).
Project statuses are purely composite entities, they don’t have any distinctive meaning or identity outside of the project scope, and they are not to be addressed directly, so they obviously don’t deserve a special root REST resource.
According to REST architecture, I created a resource called /projects. POST is used to create a new project, and PUT is used to change an existing project.
However, I don’t want the client to PUT the project together with all its historical statuses, firstly because this collection is too heavy, and secondly because the business logic allows only for adding statuses, not changing or deleting them, so PUTting the project together with all of its statuses doesn’t make any sense anyway.
PUTting a project with only a new status is also not an option, because it violates the idempotency of PUT.
I also don’t like the idea of POSTing a status in a second HTTP-request, say /project/{id}/status, because that would break the atomicity of the update operation from the user’s standpoint. If this second request gets lost on the wire, then the project will appear inconsistent to the user who edited it (the attributes changed, but the status stayed the same). Creating RESTful "transactions" seems like overkill (and also error prone) for this simple task of updating a seemingly monolithic entity.
This kind of problem is quite ubiquitous in my work, and may be generalized as such: what is the RESTfully correct and atomic way of updating a complex composite entity for which only partial update is allowed by the business logic?

I think that if you want to do partial updates (which is your case), you should use the PATCH method. It lets you update either the project without its dependencies (the statuses) or the dependencies without the project's own attributes.
There is a standard format for describing the operations to perform within a PATCH request. It's called JSON Patch (see https://www.rfc-editor.org/rfc/rfc6902). It describes what you want to do in your request: add an element, update it, remove it, and so on.
You could send something like the following if you want (for example) to update the name of a specific project, remove a status (only as an illustration, since I read that you want to forbid this!) and add a new one in one atomic request:
PATCH /projects/1
[
  {
    "op": "replace",
    "path": "/name",
    "value": "the new name of the project"
  },
  {
    "op": "remove",
    "path": "/statuses/1"
  },
  {
    "op": "add",
    "path": "/statuses/-",
    "value": {
      "name": "my status",
      (...)
    }
  }
]
Notice that the server decides how the path identifies the related element in the resource state. Under a strict reading of RFC 6902 the path is a JSON Pointer, so /statuses/1 is the second element of the array; with a patch format of your own it could instead mean the status whose id is 1, or something else.
The server-side processing for the request can be atomic.
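To make the atomicity point concrete, here is a minimal sketch of how a server might apply such a patch all-or-nothing while enforcing the "statuses are append-only" rule from the question. The function and exception names are hypothetical, only two operations are supported, and appending uses the "-" token that JSON Pointer defines for "end of array"; a real service would more likely lean on a JSON Patch library and a database transaction.

```python
import copy


class PatchRejected(Exception):
    """Raised when an operation is not permitted by the business rules."""


def apply_project_patch(project, operations):
    """Apply every operation to a copy; return the copy only if all succeed."""
    draft = copy.deepcopy(project)  # the stored project is never modified directly
    for op in operations:
        kind, path = op.get("op"), op.get("path")
        if kind == "replace" and path == "/name":
            draft["name"] = op["value"]
        elif kind == "add" and path == "/statuses/-":
            draft["statuses"].append(op["value"])  # "-" means "append to the array"
        else:
            # Anything else, including removing or replacing old statuses, is refused.
            raise PatchRejected(f"operation not allowed: {kind} {path}")
    return draft


project = {"name": "old name", "statuses": [{"name": "created"}]}
patch = [
    {"op": "replace", "path": "/name", "value": "new name"},
    {"op": "add", "path": "/statuses/-", "value": {"name": "in review"}},
]
updated = apply_project_patch(project, patch)
```

Because the work happens on a copy, a rejected operation in the middle of the list leaves the original project exactly as it was, which is the behaviour the user expects from a single form submission.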
I wrote a blog post about bulk updates: https://templth.wordpress.com/2015/05/14/implementing-bulk-updates-within-restful-services/. I think that the section "Implementing bulk updates" could correspond to what you are looking for.
Hope it helps you,
Thierry

Do you need HTTP PATCH? It is the verb to express delta updates to a resource.
https://www.rfc-editor.org/rfc/rfc5789

Related

What is the most RESTful way to update a resource in a non-idempotent way?

What is the most RESTful way to update part of a resource, where the generation of that resource is done on the server side, not on the client side? This is not an idempotent action, as the supporting data on the server may change between requests.
I'm creating a REST API, and I've come to a design choice where I'm not quite sure of the way to move forward.
I have a resource that I want to refresh, which involves creating a large json blob based on support data, then saving that json blob to a database before serving it back to the user.
My question is, what is the most RESTful way to perform this action? As the client doesn't perform the calculations, and it also isn't idempotent as the data set may change between each call, I feel it is unnatural to use a PUT.
I settled on a POST, but that doesn't sit right either.
A third option would be to have a sub-resource that describes the action of refreshing - this doesn't feel correct either.
For example, I have a document:
GET /document/<documentId>
which would return something like:
"body": {
  "createdAt": "2019-01-01 12:00:00",
  "updatedAt": "2019-01-01 13:00:00",
  "name": "example",
  "location": "example",
  "city": "example"
}
These fields are generated by the server when the document is created, the client doesn't update them.
To allow the client to signal that they would like the server to regenerate the document, I have settled on:
POST /document/<documentId>
"body": {
  "param1": "updatedparam1",
  "param2": "updatedparam2"
}
An alternative approach would be to do something like:
POST /document/<documentId>/refresh
"body": {...}
but that feels more like an RPC call rather than REST.
Does this make sense logically? I haven't seen many suggestions that POST can be to a single resource as opposed to a collection.
Please do let me know if I can expand on anything, I've been banging my head against this for a little while and have probably missed something.

I settled on a POST, but that doesn't sit right either.
POST is fine.
HTTP semantics include rules about invalidating cached representations of resources. Presumably, when you tell the server to regenerate the document, you don't want to keep using the old copy yourself. So the target uri of the request should be the same as that which you use to GET the resource.
So:
POST /document/<documentId>
Is a good start.
An alternative, assuming the semantics match, would be to use PATCH. That's an appropriate choice if what you are doing is proposing replacement values for the representation. In that case, the body of the request should be a "patch document". You can, of course, define your own type of patch document; generic clients may already understand one or more of the standard formats, RFC 6902 (JSON Patch) or RFC 7386 (JSON Merge Patch, since superseded by RFC 7396), so you can potentially save some work by supporting one of them.
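As an illustration of what a "patch document" looks like in the JSON Merge Patch case, here is a small sketch of the merge algorithm as I understand it from the RFC: object members are merged recursively, a null value deletes the member, and anything that is not an object replaces the target wholesale. The function name and the sample document are my own, not part of the question's API.

```python
def merge_patch(target, patch):
    """Return the result of applying a JSON Merge Patch to target."""
    if not isinstance(patch, dict):
        return patch  # non-object patches replace the target entirely
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


document = {"name": "example", "location": "example", "city": "example"}
patched = merge_patch(document, {"city": "Paris", "location": None})
```

Here `patched` keeps `name`, changes `city` and drops `location`, while `document` itself is left untouched.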
I haven't seen many suggestions that POST can be to a single resource as opposed to a collection.
Part of the point of REST is that resources support a uniform interface - "single resources" and "collection resources" look the same. Historically, we got a bit unlucky with the early specifications for POST, which were easily misinterpreted as CREATE.
But generic clients don't know, or care, whether or not the resource you specify in a web form is a "collection resource"; it just packs up the data and sends the request, confident that the server will know what to do.

REST design for update/add/delete item from a list of subresources

I would like to know what the best practice is when you have a resource which contains a list of subresources. For example, you have the resource Author, which has info like name, id, birthday and a list of books. This list of books exists only in relation to the Author. So, you have the following scenarios:
You want to add a new book to the book list
You want to update the name of a book from the list
You want to delete a book from the list
SOLUTION 1
I searched which is the correct design and I found multiple approaches. I want to know if there is a standard way of designing this. I think the design by the book says to have the following methods:
To add: POST /authors/{authorId}/book/
To update: PUT /authors/{authorId}/book/{bookId}
To delete: DELETE /authors/{authorId}/book/{bookId}
SOLUTION 2
My solution is to have only one PUT method which does all these 3 things because the list of books exists only inside object author and you are actually updating the author. Something like:
PUT /authors/{authorId}/updateBookList (and send the whole updated book list inside the author object)
I can see several drawbacks in this approach: sending more data from the client, putting some logic on the client, more validation in the API, and relying on the client having the latest version of the book list.
My question is: is it anti-pattern to do this?
SITUATION 1. In my situation, my API is using another API, not a database. The used API has just one method of "updateBookList", so I am guessing it is easier to duplicate this behavior inside my API too. Is it also correct?
SITUATION 2. But, supposing my API would use a database would it be more suitable to use SOLUTION 1?
Also, it would help if you could point to some articles or books that cover similar ground. I know this kind of design is not set in stone, but some guidelines would help. (Example: REST API Design Rulebook, Masse, O'Reilly)

Solution 2 sounds very much like old-style RPC, where a method is invoked to perform some processing. This is a REST antipattern, as REST's focus is on resources and not on methods. The operations you can perform on a resource are given by the underlying protocol (HTTP in your case), and REST should adhere to the semantics of that protocol (one of its few constraints).
In addition, REST doesn't care how you set up your URIs, so there is actually no such thing as a RESTful URL. For an automated system, a URI following a certain structure has exactly the same semantics as a randomly generated string acting as a URI. It's we humans who read meaning into the string; an application should instead use the rel attribute, which gives the URI a logical name the application can rely on. An application that expects a certain logical composition of a URL is already tightly coupled to the API and thereby defeats what REST tries to achieve, namely decoupling clients from server APIs.
If you want to update (sub)resources via PUT in a RESTful way, you have to follow the semantics of PUT, which basically state that the received payload replaces the representation that was accessible at the given URI before the update.
The PUT method requests that the state of the target resource be
created or replaced with the state defined by the representation
enclosed in the request message payload.
...
The target resource in a POST request is intended to handle the
enclosed representation according to the resource's own semantics,
whereas the enclosed representation in a PUT request is defined as
replacing the state of the target resource. Hence, the intent of PUT
is idempotent and visible to intermediaries, even though the exact
effect is only known by the origin server.
Regarding partial updates, RFC 7231 states that they are possible either by using PATCH, as suggested by @Alexandru, or by issuing a PUT request directly at a sub-resource, where the payload replaces the content of that sub-resource. For the resource containing the sub-resource, this has the effect of a partial update.
Partial content updates are possible by
targeting a separately identified resource with state that overlaps a
portion of the larger resource, or by using a different method that
has been specifically defined for partial updates (for example, the
PATCH method defined in [RFC5789]).
In your case you could therefore send the updated book collection directly via a PUT operation to something like an .../author/{authorId}/books resource, which replaces the old collection. As this might not scale well for authors who have written many publications, PATCH is probably preferable. Note, however, that PATCH requires atomic, transactional behavior: either all actions succeed or none do. If an error occurs partway through, you have to roll back all the steps already executed.
Regarding your request for further literature, SO isn't the right place to ask, as there is a dedicated off-topic close/flag reason for exactly this.

I'd go with the first option and have separate methods instead of cramming all logic inside a generic PUT. Even if you're relying on an API instead of a database, that's just a 3rd party dependency that you should be able to switch at any point, without having to refactor too much of your code.
That being said, if you're going to allow the update of a large number of books at once, then PATCH might be your friend:
Looking at RFC 6902 (which defines the JSON Patch format), from the client's perspective the API could be called like
PATCH /authors/{authorId}/book
[
{ "op": "add", "path": "/ids", "value": [ "24", "27", "35" ]},
{ "op": "remove", "path": "/ids", "value": [ "20", "30" ]}
]

Technically, solution 1 hands down.
REST API URLs consist of resources (and identifiers and filter attribute names/values). They should not contain actions (verbs). Using verbs encourages the creation of stupid APIs.
E.g. I know a real-life-in-production API that wants you to
do POST on /getrecords to get all records
do POST on /putrecords to add a new record
Reasons to choose solution 2 would not be technical.
For requirement #2 (You want to update the name of a book from the list), it is possible to use JSON PATCH semantics, but use HTTP PATCH (https://tools.ietf.org/html/rfc5789) semantics to design the URL (not JSON PATCH semantics as suggested by Alexandru Marculescu).
I.e.
Do PATCH on /authors/{authorId}/book/{bookId}, where body contains only PK and changed attributes. Instead of:
To update: PUT on /authors/{authorId}/book/{bookId}
JSON PATCH semantics may of course be used to design the body of a PATCH request, but it just complicates things IMO.

How do you model a RESTful API for a single resource?

I am looking to expose some domain RESTful APIs on top of an existing project. One of the entities I need to model has a single document: settings. Settings are created with the application and are a singleton document. I'd like to expose it via a well-designed resource-based RESTful API.
Normally, when modeling an API for a resource with many items, it's something like:
GET /employees/ <-- returns [] of 1-* items
GET /employees/{id}/ <-- returns 1 item
POST /employees/ <-- creates an item
PUT /employees/{id}/ <-- updates all fields on specific item
PATCH /employees/{id}/ <-- updates a subset of fields specified on an item
DELETE /employees/{id}/ <-- deletes a specific item
OPTION 1: If I modeled settings in the same way then the following API is built:
GET /settings/ <-- returns [] of 1-* items
[{ "id": "06e24c15-f7e6-418e-9077-7e86d14981e3", "property": "value" }]
GET /settings/{id}/ <-- returns 1 item
{ "id": "06e24c15-f7e6-418e-9077-7e86d14981e3", "property": "value" }
PUT /settings/{id}/
PATCH /settings/{id}/
To me this has a few issues:
We return an array when only 1 item CAN and EVER WILL exist. Settings are a singleton that the application creates.
We require knowing the id to make a request only returning 1 item
We require the id of a singleton just to PUT or PATCH it
OPTION 2: My mind then goes in this direction:
GET /settings/ <-- returns 1 item
{ "id": "06e24c15-f7e6-418e-9077-7e86d14981e3", "property": "value" }
PUT /settings/
PATCH /settings/
This design removes the issues brought up above and doesn't require an id to PUT or PATCH. This feels the most consistent to me, as all requests have the same shape.
OPTION 3: Another option is to add the id back to the PUT and the PATCH to require it to make updates, but then an API user must perform a GET just to obtain the id of a singleton:
GET /settings/ <-- returns 1 item
{ "id": "06e24c15-f7e6-418e-9077-7e86d14981e3", "property": "value" }
PUT /settings/{id}/
PATCH /settings/{id}/
This seems inconsistent because the GET 1 doesn't have the same shape as the UPDATE 1 request. It also requires a consumer to perform a GET to find the identifier of the singleton.
Is there a preferred way to model this?
Does anyone have any good reference material on modeling RESTful APIs for singleton resources? I am currently leaning towards OPTION 2 but I'd like to know if there are good resources or standards that I can look into.
Is there a compelling reason to require an API consumer to make a GET for the id of a resource to then use it in an update request, perhaps for security reasons, etc?

The ID of the Resource is the Url itself and not necessarily a Guid or UUID. The Url should uniquely IDentify the Resource, in your case the Settings entity.
But, in order to be RESTful, you must point to this resource from your index URL (i.e. the / path) with an appropriate rel attribute, so that the client does not hardcode the URL, like this:
GET /
{ ....
"links": [
{ "url" : "/settings", "rel" : "settings" }
], ...
}
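On the client side, following the link instead of hardcoding the URL could look roughly like this. It is only a sketch: `find_link` is a hypothetical helper, and the index document shape is simply the one from the example above.

```python
def find_link(index_document, rel):
    """Return the URL of the first link whose rel matches."""
    for link in index_document.get("links", []):
        if link.get("rel") == rel:
            return link["url"]
    raise LookupError(f"no link with rel={rel!r}")


# The parsed body of GET / as shown above.
index_document = {"links": [{"url": "/settings", "rel": "settings"}]}
settings_url = find_link(index_document, "settings")
```

The client only knows the relation name "settings"; if the server later moves the resource, the client keeps working as long as the index document is updated.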
There are no specifics to accessing a singleton resource, other than that the URL will not contain a GUID, UUID or any other numeric value.

Option 2 is perfectly RESTful, as far as I can tell.
The core idea behind RESTful APIs is that you're manipulating "resources". The word "resource" is intentionally left vague so that it can refer to whatever is important to the specific application, and so that the API can focus only on how content will be accessed regardless of what content will be accessed.
If your resource is a singleton, it does not make sense to attribute an ID value to it. IDs are very useful and commonly used in RESTful APIs, but they are not a core part of what makes an API RESTful, and, as you have noticed, would actually make accessing singleton resources more cumbersome.
Therefore, you should just do away with IDs and have both
GET /settings/
and
GET /settings/{id}
always return the settings singleton object. (access-by-id is not required, but it's nice to have just in case someone tries it). Also, be sure to document your API endpoint so consumers don't expect an array :)
Re: your questions,
I believe option 2 would be the preferred way of modeling this, and I believe requiring your consumer to make a GET for the id would actually be somewhat of an anti-pattern.

I think the confusion here is because the word settings is plural, but the resource is a singleton.
Why not rename the resource to /configuration and go with option 2?
It would probably be less surprising to consumers of your API.

You're probably overthinking it. There's no concept of singleton in HTTP or REST.
GET /settings/ is perfectly fine.
By the way, we can hardly relate this to DDD - at least not if you don't give more context about what settings means in your domain.
It might also be that you're trying to tack an "Entity with ID" approach on Settings when it's not appropriate. Not all objects in a system are entities.

REST: How to update a row and create zero or more of other resource on same request?

I'll try to make this as simple as possible; it may be a somewhat dumb question.
I'm rewriting a 7-year-old web application (help desk software). It needs to become a REST API that different clients (web browser and mobile app) can consume. It is also required to keep the business logic as close as possible to the original, which has been working well so far.
The current struggle about REST:
What I need is to update a resource, let's say /tickets/47321.
If the status_id of this resource changes, a record of the change needs to be saved. Both must happen in the same DB transaction because it needs to be all-or-nothing. This is how the original application worked, and we would like to keep this behavior.
So question is:
Can I PUT the whole resource, or a partial state representation, to tickets/47321 so that the server updates the resource and creates the new history record (if status_id is different), and then returns them both to the client as JSON:
{
ticket: {}, // new ticket state
history: {} // the new created record of history change
}
This way the client can update the ticket and, if a history record is returned, add it to its list of history changes?

Technically, yes, you can do this. However, it might not be as straightforward as simply returning 2 objects side by side; looking at the Richardson Maturity Model (Level 1), one would expect to receive the same type of resource after calling (PUT) an API endpoint.
That being said, you could either embed the additional resource (append a history change to a ticket, following the Hypertext Application Language (HAL) proposed draft specification), or better yet (if you're aiming towards REST level 3), provide a link relationship, in conformity with the "Target IRI" defined in Web Linking specification (RFC 5988) from the ticket:
/api:history?ticketId=47321 would return all the history records belonging to that ticket, paged and sorted by created date, for example (and you can just select the latest)
/api:history?id=123 you would do some work on the server to ensure this points straight to the latest history record (related to that ticket id)
Regarding the partial update, looking at RFC 6902 (which defines the JSON Patch format), from the client's perspective the API could be called like
PATCH /ticket/47321
[
{ "op": "replace", "path": "/author", "value": "David"},
{ "op": "replace", "path": "/statusId", "value": "1"}
]
More examples can be found here.
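Whichever verb you pick, the all-or-nothing requirement from the question lives on the server. A rough sketch with Python's built-in sqlite3 module is below; the table and column names are invented for the example, and the response shape simply mirrors the ticket/history pair from the question.

```python
import sqlite3


def update_ticket(conn, ticket_id, new_status_id):
    """Update the ticket and, if the status changed, record history in one transaction."""
    history = None
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT status_id FROM tickets WHERE id = ?", (ticket_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"ticket {ticket_id} not found")
        old_status_id = row[0]
        conn.execute(
            "UPDATE tickets SET status_id = ? WHERE id = ?", (new_status_id, ticket_id)
        )
        if old_status_id != new_status_id:
            cur = conn.execute(
                "INSERT INTO ticket_history (ticket_id, old_status_id, new_status_id) "
                "VALUES (?, ?, ?)",
                (ticket_id, old_status_id, new_status_id),
            )
            history = {
                "id": cur.lastrowid,
                "old_status_id": old_status_id,
                "new_status_id": new_status_id,
            }
    return {"ticket": {"id": ticket_id, "status_id": new_status_id}, "history": history}


# Hypothetical schema, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, status_id INTEGER)")
conn.execute(
    "CREATE TABLE ticket_history (id INTEGER PRIMARY KEY, ticket_id INTEGER, "
    "old_status_id INTEGER, new_status_id INTEGER)"
)
conn.execute("INSERT INTO tickets VALUES (47321, 1)")
conn.commit()
result = update_ticket(conn, 47321, 2)
```

If the status is unchanged, `history` comes back as `None`, so the client can tell whether there is anything to append to its history list.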

DELETE in a REST API without an entity ID best practice

I have a timespan I want to delete in a REST API.
It doesn't have an id so calling HTTP DELETE on "/timespan/" is not really possible. The implementation would be possible, but I would rather not put in the extra effort (requires some database modifications) unless there is a good reason to add it.
I considered calling DELETE on "/timespan/" with "start" and "end" inside the request but to my understanding this clashes with the way REST works.
Is it legit to call DELETE on "/timespan//" or maybe a concatenation such as "/timespan/+" or should I implement IDs after all?

You are correct. DELETE doesn't take a body.
RFC 7231:
A payload within a DELETE request message has no defined semantics;
sending a payload body on a DELETE request might cause some existing
implementations to reject the request.
I've seen what you want done as
DELETE /sites/{siteId}/maintenance
but that's really not optimal. If maintenance is a resource, it needs some way of being uniquely identified. If it's a property of a resource, then you delete it via PUT or PATCH on that resource.

Assume your resource is a maintenance performed on a site. The API could have:
DELETE /sites/{site-id}/maintenances/{maintenance-id} to delete one maintenance. The typical delete.
DELETE /sites/{site-id}/maintenances to delete all maintenances of a given site. Not usual, and dangerous.
DELETE /sites/{site-id}/maintenances?start={start-date}&end={end-date} to delete all maintenances in a timespan. The answer to your question.
Rationale: the timespan is not part of the resource; it's a filtering attribute on the collection of resources. Therefore, it should not be part of the URI path. We should use query string parameters for filtering.
Another option:
POST /sites/{site-id}/clean-up to delete all maintenances of a given site that are within the timespan specified in the request body. Also an answer to your question.
Rationale: some advocate that a REST API can offer more "coarse-grained" operations that closely resemble the business capability in complement to (or in the place of) more CRUD-like APIs. Here's a reference. In the particular case, the POST operation executes a "clean-up business process". The start and end dates go in the request body. The operation should return 200 (not 201) for success.
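For the query-string variant, the server-side handling might look roughly like the sketch below. The helper names, the in-memory list and the inclusive bounds are assumptions made for the example; a real handler would run the equivalent filter as a single database statement.

```python
from datetime import date
from urllib.parse import parse_qs, urlsplit


def parse_timespan(url):
    """Read the start and end query parameters as dates."""
    query = parse_qs(urlsplit(url).query)
    return date.fromisoformat(query["start"][0]), date.fromisoformat(query["end"][0])


def delete_in_timespan(maintenances, start, end):
    """Split maintenances into (kept, deleted); both bounds are inclusive here."""
    kept, deleted = [], []
    for item in maintenances:
        target = deleted if start <= item["date"] <= end else kept
        target.append(item)
    return kept, deleted


maintenances = [
    {"id": 1, "date": date(2015, 1, 10)},
    {"id": 2, "date": date(2015, 3, 5)},
    {"id": 3, "date": date(2015, 9, 1)},
]
start, end = parse_timespan("/sites/42/maintenances?start=2015-01-01&end=2015-06-30")
kept, deleted = delete_in_timespan(maintenances, start, end)
```

The same filtering function would serve the POST clean-up variant too; only the place where start and end are read from (query string versus request body) changes.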

Instead of designing a low-level CRUD-like REST API and letting the callers know details about your domain, you could let the client POST their intent and let the server decide what to delete. This seems like the right approach if you want to prevent users from accidentally (maliciously?) deleting resources. More in "Fine grained CRUD resources versus Coarse Grained resources" at http://www.thoughtworks.com/insights/blog/rest-api-design-resource-modeling

If you are following the RESTful principles closely, you would need to retrieve a list of resources (partial, as proposed by @moonwave99) and invoke DELETE on every resource whose timestamp is older than a certain threshold. This conveys the semantics of delete while being completely idempotent. For that, however, you need a resource identifier for each entry (which you should have obtained via the previous retrieval).
The next choice would be to send a PUT request containing each and every entry that should still exist after the request. In practice, however, this is a no-go, as too much data would need to be transferred.
Last but not least, you could delete resources via PATCH, where you pass the server the instructions it needs to transform the resource(s) from state_before to state_after. In the case of json-patch you have the remove operation at hand. However, the spec does not provide any way to remove a state only if a certain condition is met; a combination of test+delete would be handy in that case. The quintessence, though, is that it is not the server that filters out certain data but the client, which has to send each necessary step to the server. Therefore, a client has to retrieve the current state of a resource collection like /api/sampleResources, which could be an array of JSON objects, and make the decision locally:
HTTP/1.1 200 OK
...
ETag: "7776cdb01f44354af8bfa4db0c56eebcb1378975"
...
[
{
...
"timestamp": "2015-07-17T19:40:00",
...
},
{
...
"timestamp": "2014-10-05T10:00:00",
...
},
{
...
"timestamp": "2015-07-16T15:00:00",
...
},
{
...
"timestamp": "2014-12-31T00:00:00",
...
}
]
If all entries from the last year should be deleted, a JSON-PATCH request would need to look like this:
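PATCH /api/sampleResources
...
If-Match: "7776cdb01f44354af8bfa4db0c56eebcb1378975"
...
[
  { "op": "remove", "path": "/3" },
  { "op": "remove", "path": "/1" }
]
The path is specified in JSON Pointer notation and is 0-based, so the sample above removes the fourth and second entries of the list, which are the entries from 2014. The operations are applied in order, so the higher index is removed first; removing /1 first would shift the remaining entries and /3 would no longer exist. The ETag value obtained from the GET is sent back in the If-Match request header.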
As the resource can be modified between the lookup and the patch generation, it is highly recommended to use ETag features to guarantee that the patch is executed on the right state. If the ETags do not match, the request fails with a precondition failure.
Note, however, that PATCH is not idempotent! Sending the same request twice may delete more entries than intended. This is not the case with PUT, as the state after the request is exactly the state you told the resource to have. Therefore, sending the same request twice does not change anything state-wise. In comparison to a bunch of DELETE requests, PATCH is atomic: either all or none of the operations are executed, while the DELETE requests are all independent.
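A small sketch may help to see both the ETag check and the non-idempotency. It assumes, as I read the JSON Patch spec, that operations are applied one after another, so each remove shifts the indices of the entries behind it. The function name, the exception and the simplified ETag strings are made up for the example.

```python
class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


def apply_removals(entries, current_etag, if_match, operations):
    """Apply 'remove' operations in order to a copy of entries, all or nothing."""
    if if_match != current_etag:
        raise PreconditionFailed("resource changed since it was fetched")
    draft = list(entries)  # work on a copy so a failure leaves entries untouched
    for op in operations:
        if op["op"] != "remove":
            raise ValueError("only 'remove' is supported in this sketch")
        index = int(op["path"].lstrip("/"))
        del draft[index]  # an invalid index raises IndexError and aborts the patch
    return draft


entries = ["2015-07-17", "2014-10-05", "2015-07-16", "2014-12-31"]
# Removing the higher index first keeps the lower index valid.
patch = [{"op": "remove", "path": "/3"}, {"op": "remove", "path": "/1"}]
remaining = apply_removals(entries, '"v1"', '"v1"', patch)

# Without a precondition, replaying a patch removes a further entry each time.
single = [{"op": "remove", "path": "/0"}]
once = apply_removals(entries, '"v1"', '"v1"', single)
twice = apply_removals(once, '"v2"', '"v2"', single)
```

In a real exchange the replayed request would still carry the old ETag, so the server would answer with a precondition failure instead of deleting another entry, which is exactly why the ETag check matters here.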
I strongly agree that there is a need for conditional and partial PUT and DELETE operations to meet business needs. For simplicity, I would really recommend using partial GET and DELETE operations instead of PUT or PATCH.
