Proper response to client for a RESTful PUT endpoint for updating multiple entities in a single batch?

For a standard REST PUT request to update a single entity, for example a document, using an endpoint that looks something like this:
[Route("documents/{id}")]
public void Put(int id, [FromBody]Document document)
there is a well-defined way to use HTTP status codes to communicate with the client: an HTTP 200 status for a successful update, an HTTP 404 if the document with the specified Id was not found, an HTTP 500 if there was a problem updating the record, etc.
My issue is that we have a RESTful API with potentially extremely high usage. For performance reasons, we would like to create an endpoint that accepts multiple document entities to update in a single PUT operation:
[Route("documents")]
public void Put([FromBody]IEnumerable<Document> documents)
with input such as this:
[
{"Id":1,"Name":"doc one","Author":"Fred"},
{"Id":2,"Name":"doc two","Author":"John"},
{"Id":3,"Name":"doc three","Author":"Mary"}
]
If a user submits 10 documents and I am only able to successfully update 9 of them, with the remaining one failing due to some issue, I would like to commit the 9 successfully updated documents and then communicate to the user which updates succeeded and which ones failed.
One approach I could take is that if any of the submitted documents successfully update, return an HTTP 200. In the response object that I return to the client, I can include a list of those documents that succeeded and a list of documents that failed. For each of those that failed, I can include the reason why, along with maybe an HTTP status code for each failed document.
But should I be returning an HTTP 200 if some of the requests failed? This approach counts on the client to inspect the list of failed documents to see if there are problems. My fear is that the user will see the HTTP 200 and assume everything is fine.
The other option is that if the client submits 10 documents, and I am able to successfully update 9 of them and one fails, return the HTTP status code for the one that failed. For example, if one failed because the specified Id could not be found, return an HTTP 404, if it failed because the DB was unavailable, return an HTTP 500, etc.
This approach also has problems. For example, if two documents fail for different reasons, which HTTP status code should be returned? And does it make sense to return, for example, an HTTP 500 status for a request that successfully updated some of the items?
Do the REST guidelines give any suggestions for this issue of batch updates? Are there any recommended approaches for this issue?

HTTP status 207 Multi-Status can be used to handle batch processing.
When processing more than one entity, your API can return a 207 response containing a list of responses:
Each entity and its response share a key, allowing the consumer to know which response corresponds to which submitted entity. In this use case, the document's Id could be used as the key.
Each response contains the same data you would have received when processing the corresponding entity alone (including the HTTP status).
The RFC that defines 207 (RFC 4918, WebDAV) specifies an XML message body, but you can use JSON with your own structure.
You can take a look at the Jive API, which handles batch processing, to see an example.
Given the input
[
{"Id":1,"Name":"doc one","Author":"Fred"},
{"Id":2,"Name":"doc two","Author":"John"},
{"Id":3,"Name":"doc three","Author":"Mary"}
]
A full success would return a 207 HTTP status, with the response containing three 200 statuses:
[
{
"Id": 1,
"status": 200,
"data" : { data returned for a single processing }
},
{
"Id": 2,
"status": 200,
"data" : { data returned for a single processing }
},
{
"Id": 3,
"status": 200,
"data" : { data returned for a single processing }
}
]
If there's a problem with the entity with id 3, such as a missing author:
[
{"Id":1,"Name":"doc one","Author":"Fred"},
{"Id":2,"Name":"doc two","Author":"John"},
{"Id":3,"Name":"doc three"}
]
The response will still be a 207, but it will contain 200 statuses for ids 1 and 2 and a 400 status for id 3:
[
{
"Id": 1,
"status": 200,
"data" : { data returned for a single processing }
},
{
"Id": 2,
"status": 200,
"data" : { data returned for a single processing }
},
{
"Id": 3,
"status": 400,
"data" : { data returned for a single processing 400 error }
}
]
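To make the idea concrete, here is a minimal Python sketch of a batch handler that processes each document independently and wraps the per-document outcomes in a 207 response. The in-memory `STORE`, the `REQUIRED_FIELDS` validation rule, and the function names are assumptions made for illustration only; they are not part of any framework or of the original endpoint.

```python
from http import HTTPStatus

# Hypothetical in-memory store standing in for the real database.
STORE = {
    1: {"Id": 1, "Name": "old one", "Author": "Fred"},
    2: {"Id": 2, "Name": "old two", "Author": "John"},
    3: {"Id": 3, "Name": "old three", "Author": "Mary"},
}

# Assumed validation rule: every document must carry these fields.
REQUIRED_FIELDS = ("Id", "Name", "Author")


def update_one(doc):
    """Process one document the way the single-entity PUT would.

    Returns a (status, data) pair.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in doc]
    if missing:
        return HTTPStatus.BAD_REQUEST, {"error": "missing fields: " + ", ".join(missing)}
    if doc["Id"] not in STORE:
        return HTTPStatus.NOT_FOUND, {"error": "document not found"}
    STORE[doc["Id"]] = dict(doc)
    return HTTPStatus.OK, STORE[doc["Id"]]


def put_documents(docs):
    """Batch PUT: commit what succeeds and report every outcome under a 207."""
    results = []
    for doc in docs:
        status, data = update_one(doc)
        # The document Id is the key that ties each result to its input.
        results.append({"Id": doc.get("Id"), "status": int(status), "data": data})
    return int(HTTPStatus.MULTI_STATUS), results
```

With this shape the overall status never claims success or failure for the whole batch, so the client has to look at the per-document statuses.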

Related

BulkDocs API used to save CouchDB documents takes more time compared to the PUT method?

We checked the request and response timings of both APIs in the Chrome network tab.
According to those timings, the bulk docs API takes about 2x as long to complete the document save in CouchDB. Sometimes this grows to 3x or 4x, depending on how long we wait for the server response.
By comparison, the PUT method takes about 1/4 of the time to save the data, and this PUT request is called from another API. It looks like saving records using PUT requests is faster than using the BulkDocs API.
Below are the requests, the responses, and the screenshots for reference.
BulkDocs Request:
{"docs":[{"_id":"pfm718215_2_BE1A8AC4-EB53-4C8E-B3F7-5D4FB4329963","data":{"pfm718093_1595329":null,"pfm_718215_id":null,"createdby":52803,"createdon":1665575674775,"lookupname":null,"lookupmail":null,"lastmodifiedby":52803,"lastmodifiedon":1665575674775,"guid":"Xj0JpEofDDy37Z2","name":"test","pfm_718093_1595327_id":null,"display_name":"pfm718215","couch_id":null,"couch_rev_id":null,"pfm718093_1595325":null,"pfm_718093_1595325_id":null,"pfm718093_1595327":null,"pfm_718093_1595329_id":null,"type":"pfm718215","sync_flag":"C","org_id":3}}],"new_edits":true}
BulkDocs Response :
[{
"ok": true,
"id": "pfm718215_2_BE1A8AC4-EB53-4C8E-B3F7-5D4FB4329963",
"rev": "1-05f3e8e3e96844cb51a8143891b81d16"
}]
BulkDocs timings screenshots: Header, Request, Timings
PUT Request :
{"webserviceInput":{"processInfo":{"orgId":3,"userId":52803},"dataParams":{"data":{"pfm718093_1595329":null,"pfm_718215_id":null,"createdby":52803,"createdon":1665569303482,"lookupname":null,"lookupmail":null,"lastmodifiedby":52803,"lastmodifiedon":1665569303482,"guid":"DRtlY2FlKAwVHBq","name":"test","pfm_718093_1595327_id":null,"display_name":"pfm718215","couch_id":null,"couch_rev_id":null,"pfm718093_1595325":null,"pfm_718093_1595325_id":null,"pfm718093_1595327":null,"pfm_718093_1595329_id":null,"type":"pfm718215","sync_flag":"C","org_id":3}},"sessionType":"NODEJS"}}
PUT Response :
{
"ok": true,
"id": "5f1eee08c843d01257c8b698d923fb02",
"rev": "1-0f67c9b8c2acf7aead7e991e344b04df"
}
PUT timings screenshots: Header, Request, Timing
CouchDB version details:
CouchDB 3.2.0 and {"erlang_version":"20.3.8.26","javascript_engine":{"name":"spidermonkey","version":"1.8.5"}}

Design a RESTful api for creating a resource and its related resources at once

I want to know how to design a RESTful API for creating a resource and its related resources at once.
For example, I want to create an order which contains a list of items using my RESTful API:
{
order_id:1,
description: "XXX",
items: [
{item_id:1, price:30, ...},
{item_id:2, price:40, ...}
]
}
One way is to provide two APIs:
POST api/orders => create a new order and return the order id
POST api/orders/id/items => create related items using the order_id
However, the order and its items should be created together. If the second call fails, I end up with an order that has no items in it, which is a situation I don't want to see. I want the backend server to run a transaction and create the order and items at once, so that they succeed or fail together.
So, is it a good approach to put the items in the body of the request and POST only once to api/orders? Or is there a better design for this situation?
Thank you!
"I want to know how to design a RESTFUL api for creating a resource and its related resources at once."
Perfectly reasonable thing to do. Concentrate your attention on how to describe to the client how to create the appropriate request, and how to interpret the document describing the result.
The clue is in the definition of the 201 Created status code
The 201 (Created) status code indicates that the request has been fulfilled and has resulted in one or more new resources being created.
The 201 response payload typically describes and links to the resource(s) created.
(emphasis added)
In the case of the web, the way we would do this is to have a form; the client would provide information in the form and submit it. The browser, following the standard for form processing, would generate a POST (because the semantics are unsafe) request with the form data encoded within the message body and the appropriate content type defined in the header (for instance application/x-www-form-urlencoded).
The response, in turn, would be an HTML document with a bunch of links to all of the interesting resources that were created.
Does it have to be HTML? No, of course not - you could use text/plain if it suited your needs. You have somewhat better long term prospects when using a media type that has built into it a notion of links that general purpose components will understand.
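As a rough illustration of a single POST that creates the order and its items together and answers with links to what was created, here is a Python sketch using the standard-library sqlite3 module. The table layout, the URI scheme in the response, the `open_db` and `create_order` names, and the choice of 400 for a failed insert are all assumptions for the example, not something the question prescribes.

```python
import sqlite3


def open_db():
    """Create a throwaway in-memory database with an assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, description TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL, "
        "price INTEGER NOT NULL CHECK (price >= 0))"
    )
    conn.commit()
    return conn


def create_order(conn, payload):
    """Handle a POST to api/orders; returns (status, body).

    The order and all of its items are written in one transaction, so
    either everything is stored or nothing is.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute(
                "INSERT INTO orders (description) VALUES (?)",
                (payload["description"],),
            )
            order_id = cur.lastrowid
            item_ids = []
            for item in payload["items"]:
                cur = conn.execute(
                    "INSERT INTO items (order_id, price) VALUES (?, ?)",
                    (order_id, item["price"]),
                )
                item_ids.append(cur.lastrowid)
    except (sqlite3.Error, KeyError) as exc:
        # Assumed mapping: treat bad or incomplete input as a client error.
        return 400, {"error": str(exc)}

    # 201 body that describes and links to every resource that was created.
    return 201, {
        "order": {"href": f"/api/orders/{order_id}"},
        "items": [{"href": f"/api/orders/{order_id}/items/{i}"} for i in item_ids],
    }
```

If any item fails to insert, the order row is rolled back with it, so the client never sees an order without items.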
Creating an order without items is definitely a bad idea. It leads to a fragile API and inconsistent entities. Also, you can't create items using the api/orders URI, because this violates basic REST principles.
For your business logic, the REST API may look like:
POST api/item
{
price: 40,
name: "xxx",
...
}
<<<<< 201
{
id: 1
}
GET api/item/{id}
<<<<< 200
{
id: 4,
price: 40,
name: "xxx",
...
}
POST api/order
{
description: "xxx",
items: [
{id: 1, count: 5},
{id: 23456, count: 1}
]
}
<<<<< 201
{
id: 123442
}
I think it's unnecessary to put the full items in the body of the create-order request. Item IDs are enough to create the order-item bindings on the backend.

What is the standard practice for designing REST API if it is being used for inserting / updating a list of records

We are building an API which will be used for inserting and updating the records in a database. So if the record exists based on the Key the record will be updated and if it does not then it will be inserted.
I have two questions.
As per REST guidelines, what are the options for designing such an API e.g. PUT / POST OR PATCH? How should the list of objects be represented?
NOTE: I know from other answers that I read that there is confusion over how it should be as per REST guidelines. So I am OK if I can get some guidance on general best practice (irrespective of REST part)
Secondly, the part where I am really confused about is how to represent the output of this or what this API should return.
Specific guidance/inputs on above topic would be really appreciated.
I've seen many different implementations for inserts/updates across various vendors (Stripe, HubSpot, PayPal, Google, Microsoft). Even though they differ, the difference somehow fits well with their overall API implementation and is not usually a cause for stress.
With that said, the "general" rule for inserts is:
POST /customers - provide the customer details within the body.
This will create a new customer and return the unique ID and customer details in the response (along with createdDate and other auto-generated attributes).
Most API vendors, if not all, implement this logic for inserts.
Updates are quite different. Your options include:
POST
POST /customer/<customer_id> - include attributes and values you want to update within the body.
Here you use a POST to update the customer. It's not a very common implementation, but I've seen it in several places.
PUT
PUT /customer/<customer_id> - include either all attributes, or only the updated ones, within the body.
Given PUT is technically an idempotent method, you can either stay true to the REST convention and expect your users to provide all the attributes to update the resource, or make it simpler by only accepting the attributes they want to update. The second option is not very "RESTful", but is easier to handle from a user's perspective (and reduces the size of the payload).
PATCH
PATCH /customer/<customer_id> - include the operation and attributes that you want to update / remove / replace / etc. within the body. More about how to PATCH.
The PATCH method is used for partial updates, and it's how you're "meant" to invoke partial updates. It's a little harder to use from a consumer's perspective.
Now, this is where the bias kicks in. I personally prefer to use POST, where I am not required to provide all the attributes to invoke an update (just the ones I want to update). The reason is simplicity of usage.
In terms of the response body for updates, the API will usually return the object, including the updated attributes (and updated auto-generated attributes, such as updatedDate).
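The difference between a full replacement and a partial update is easy to see in a few lines of Python. This is only a sketch: the `CUSTOMERS` dictionary and both function names are made up for the example, and the partial update is a plain attribute merge rather than an operation-based PATCH document.

```python
# Hypothetical in-memory store with one existing customer.
CUSTOMERS = {
    "777": {
        "id": "777",
        "first_name": "Jane",
        "last_name": "Doe",
        "email": "jane@example.com",
    }
}


def put_customer(customer_id, body):
    """Strict PUT: the stored resource becomes exactly what was sent."""
    CUSTOMERS[customer_id] = {**body, "id": customer_id}
    return CUSTOMERS[customer_id]


def partial_update_customer(customer_id, body):
    """Partial update: only the attributes present in the body change."""
    CUSTOMERS[customer_id] = {**CUSTOMERS[customer_id], **body, "id": customer_id}
    return CUSTOMERS[customer_id]
```

Calling `partial_update_customer("777", {"last_name": "Smith"})` keeps the email, whereas `put_customer("777", {"first_name": "Jane", "last_name": "Doe"})` drops it because the client did not send it. That is the trade-off between the two styles.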
Bulk inserts/ updates
Looking at the Facebook Graph HTTP API (Batch Request) for inspiration, and assuming POST is your preferred method for updates, you could embed an array of requests using a dedicated batch resource as an option.
Endpoint: POST /batch/customers
Body:
[
{"first_name": "John", "last_name": "Smith"...}, //CREATE
{"id": "777", "first_name": "Jane", "last_name": "Doe"...}, //UPDATE
{"id": "999", "first_name": "Mike", "last_name": "Smith"...}, //UPDATE
{....}
]
Sample Response
{
"id": "123",
"result":[
{ // Creation successful
"code": 200,
"headers":{..},
"body": {..},
"uri": "/customers/345"
},
{ // Update successful
"code": 200,
"headers":{..},
"body": {..},
"uri": "/customers/777"
},
{ // A failed update request
"code": 404,
"headers":{..},
"body": {..} // body includes error details
}
]
}
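A server-side handler for such a batch resource could look roughly like the following Python sketch. The in-memory `CUSTOMERS` store, the id counter starting at 345, and the `process_batch` name are assumptions for illustration. The sketch also reports 201 for creations, which is a choice made here; the sample response above uses 200 for both.

```python
import itertools

# Hypothetical in-memory store and id generator.
CUSTOMERS = {"777": {"id": "777", "first_name": "Jane", "last_name": "Smith"}}
_next_id = itertools.count(345)


def process_batch(requests):
    """Create entries without an id, update entries with a known id.

    Results are returned in the same order as the incoming requests, so the
    client can match each result to the request at the same position.
    """
    results = []
    for req in requests:
        if "id" not in req:
            new_id = str(next(_next_id))
            CUSTOMERS[new_id] = {**req, "id": new_id}
            results.append(
                {"code": 201, "body": CUSTOMERS[new_id], "uri": f"/customers/{new_id}"}
            )
        elif req["id"] in CUSTOMERS:
            CUSTOMERS[req["id"]].update(req)
            results.append(
                {"code": 200, "body": CUSTOMERS[req["id"]], "uri": f"/customers/{req['id']}"}
            )
        else:
            # An unknown id is reported per item; the rest of the batch still runs.
            results.append({"code": 404, "body": {"error": "customer not found"}})
    return {"result": results}
```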

REST API sub resources, data to return?

If we have customers and orders, I'm looking for the correct RESTful way to get this data:
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
}, {
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
}
1. GET /customers/123/orders
2. GET /customers/123?inc-orders=1
Am I correct that the last part/folder of the URL, excluding query string params, should be the resource returned..?
If so, number 1 should only return order data and not include the customer data, while number 2 points directly at customer 123 and uses query string params to affect/filter the customer data returned, in this case including the order data.
Which of these two calls is the correct RESTful call for the above JSON..? ...or is there a secret number 3 option..?
You have 3 options which I think could be considered RESTful.
1)
GET /customers/123
But always include the orders. Do you have a situation in which the client would not want to use the orders? Or can the orders array get really big? If so you might want another option.
2)
GET /customers/123, which could include a link to their orders like so:
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": {
"href": "<link to your orders goes here>"
}
}
}
With this, your client would have to make 2 requests to get a customer and their orders. The good thing about this approach, though, is that you can easily implement clean paging and filtering on orders.
3)
GET /customers/123?fields=orders
This is similar to your second approach. This will allow clients to use your API more efficiently, but I wouldn't go this route unless you really need to limit the fields that are coming back from your server. Otherwise it will add unnecessary complexity to your API which you will have to maintain.
The Resource (identified by the complete URL) is the same, a customer. Only the Representation is different, with or without embedded orders.
Use Content Negotiation to get different Representations for the same Resource.
Request
GET /customers/123/
Accept: application/vnd.acme.customer.short+json
Response
200 OK
Content-Type: application/vnd.acme.customer.short+json
{
"customer": {
"id": 123,
"name": "Jim Bloggs"
}
}
Request
GET /customers/123/
Accept: application/vnd.acme.customer.full+json
Response
200 OK
Content-Type: application/vnd.acme.customer.full+json
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
}, {
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
}
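On the server, picking the representation from the Accept header could be sketched in Python like this. The handler name and the sample data are assumptions, and the parsing is deliberately naive: it ignores q-values and only recognises the two vendor types plus the `*/*` wildcard.

```python
CUSTOMER = {"id": 123, "name": "Jim Bloggs"}
ORDERS = [
    {"id": 123, "item": "Union Jack Keyring", "qty": 1},
    {"id": 987, "item": "London Eye Ticket", "qty": 5},
]

SHORT = "application/vnd.acme.customer.short+json"
FULL = "application/vnd.acme.customer.full+json"


def get_customer(accept):
    """Return (status, content_type, body) for GET /customers/123/."""
    # Strip parameters such as ";q=0.8" and keep the bare media types in order.
    offered = [part.split(";")[0].strip() for part in accept.split(",")]
    for media_type in offered:
        if media_type == FULL:
            return 200, FULL, {"customer": {**CUSTOMER, "orders": ORDERS}}
        if media_type in (SHORT, "*/*"):
            return 200, SHORT, {"customer": dict(CUSTOMER)}
    # None of the requested types can be produced.
    return 406, None, None
```

The URI stays the same in both cases. Only the Accept header decides whether the orders are embedded.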
The JSON that you posted looks like what would be the result of
GET /customers/123
provided the Customer resource contains a collection of Orders as a property; alternatively you could either embed them, or provide a link to them.
The latter would result in something like this:
GET /customers/123/orders
which would return something like
{
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
},
{
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
"I'm looking for the correct RESTful way to get this data"
Simply perform a HTTP GET request on a URI that points to a resource that produces this data!
TL;DR
REST does not care about URI design, only about its constraints!
Clients perform state transitions through possible actions returned by the server through dynamically identified hyperlinks contained within the response.
Clients and servers can negotiate on a preferred hypermedia type
Instead of embedding the whole (sub-)resource consider only returning the link to that resource so a client can look it up if interested
First, REST does not really care about the URI design as long as the URI is unique. Sure, a simple URI design is easier for humans to understand, but, as in HTML, the actual link can be hidden behind more meaningful text and is thus not that important for humans either, as long as they are able to find the link and can perform an action against it. Next, why do you think your "response" or API is RESTful? To call an API RESTful, the API should respect a couple of constraints. Among these constraints is one that is quite buzzword-famous: hypertext as the engine of application state (HATEOAS).
REST is a generalized concept of the Web we use every day. A quite common task in a web session is that a client requests something and the server sends an HTML document with plenty of links and other resources the client can use to request further pages or stream a video (or whatever). A user operating a client can use the returned information to proceed further, request new pages, send information to the server, and so on. The same holds true for RESTful applications. This is what REST defines as HATEOAS. If you now have a look at your "response" and check it against the HATEOAS constraint, you will see that your response does not contain any links to start with. A client therefore needs domain knowledge to proceed further.
JSON itself isn't the best hypermedia type IMO, as it only defines the overall syntax of the data but does not carry any semantics, similar to plain XML, which at least may have a DTD or schema a client can use to validate the document and check whether further semantic rules are available elsewhere. There are a couple of hypermedia types that build on JSON and are probably better suited, e.g. application/hal+json (a good comparison of JSON-based hypermedia types can be found in this blog post). You are of course entitled to define your own hypermedia type, though certain clients may not be able to understand it out of the box.
If you take a look at HAL, for example, you see that it defines an _embedded element where you can put certain sub-resources. This seems to be ideal in your case. Depending on your design, orders could also be a resource of its own and thus be reachable via GET /orders/{orderId} itself. Instead of embedding the whole sub-resource, you can also just include the link to that (sub-)resource so a client can look up the data if interested.
If there are cases where you want to return only customer data and other cases where you also want to include order data, you can, for example, define different hypermedia types (based on HAL, say) for both, one returning just the customer data while the other also includes the order data. These types could be named like this: application/vnd.yourcompanyname.version.customers.hal+json or application/vnd.yourcompanyname.version.customer_orders.hal+json. While this is certainly a development overhead compared to adding a simple query parameter to the request, the semantics are clearer and the documentation overhead is on the hypermedia type (or representation) rather than the HTTP operation.
You can of course also define some kind of view structure where one view only returns the customer data as is while a different view returns the customer data including the orders, similar to a response I gave on a not so unrelated topic.
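To give an idea of what such a HAL document looks like, here is a small Python sketch that builds the customer representation either with a plain link to the orders or with the orders embedded. The `_links`, `_embedded`, `self` and `href` names come from HAL. The URI layout, the `customer_as_hal` function and the sample data are assumptions for the example.

```python
ORDERS = [
    {"id": 123, "item": "Union Jack Keyring", "qty": 1},
    {"id": 987, "item": "London Eye Ticket", "qty": 5},
]


def customer_as_hal(customer, orders, embed_orders):
    """Build a HAL-style representation of a customer.

    The links are always present; the orders themselves are only embedded
    when embed_orders is True.
    """
    doc = {
        "id": customer["id"],
        "name": customer["name"],
        "_links": {
            "self": {"href": f"/customers/{customer['id']}"},
            "orders": {"href": f"/customers/{customer['id']}/orders"},
        },
    }
    if embed_orders:
        doc["_embedded"] = {
            "orders": [
                # Each embedded order carries its own self link.
                {**order, "_links": {"self": {"href": f"/orders/{order['id']}"}}}
                for order in orders
            ]
        }
    return doc
```

A client that only understands the links can still reach the orders by following `_links.orders.href`, so embedding becomes an optimisation rather than a requirement.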

Sending application specific messages

There is a change to our business logic. Previously, one of our APIs returned a list, e.g. a list of employees. Recently we introduced authorization checks to see whether a particular user has permission to view a specific employee.
Say there are 10 employees that should be returned through a GET, but due to missing permissions only 5 are returned. The request itself is successful in this case. I am currently not sure how to tell the client that 5 employees were filtered out due to missing permissions.
Should this be mapped to HTTP status codes? If yes, which status code fits? Or is this not an error at all?
What would be the best approach in this case?
A status code by itself wouldn't be sufficient to indicate the partial response. Status code 206 (Partial Content) sounds close by name, but it is used when a client specifically requests a partial set of data via the Range header.
Use 200. The request was fulfilled successfully after all, and the reason for the smaller set of data is proprietary to your API so extra metadata in the response to indicate a message might be sufficient.
Assuming JSON response:
{
"data": [ ... ],
"messages": [
"Only some data was returned due to permissions."
]
}
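A handler along those lines might be sketched in Python as follows. The `EMPLOYEES` list, the `visible_ids` permission set and the `list_employees` name are assumptions for the example. A real permission check would live wherever your authorization logic does.

```python
# Hypothetical data set: ten employees with ids 1 to 10.
EMPLOYEES = [{"id": i, "name": f"employee {i}"} for i in range(1, 11)]


def list_employees(visible_ids):
    """Return (status, body) for GET on the employee list.

    visible_ids is the set of employee ids the caller is allowed to see.
    """
    visible = [e for e in EMPLOYEES if e["id"] in visible_ids]
    body = {"data": visible, "messages": []}
    if len(visible) < len(EMPLOYEES):
        # The request still succeeded; the message is extra metadata only.
        body["messages"].append("Only some data was returned due to permissions.")
    return 200, body
```

The status stays 200 either way, and clients that don't care about the filtering can simply ignore the messages array.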
If you have many consumers and are worried about backward compatibility you may also want to provide a vendor specific versioned JSON media type:
"Content-Type": "application/vnd.myorg-v2+json"