RESTful API response for data transfer objects


I have the following scenario in my application. I am logged in as a user and I create a group. There is a REST API for creating the group (POST /groups/api/v1/groups) and for getting the group details (GET /groups/api/v1/groups/{group id}).
The response returned on success is not just the JSON representation of the group resource. It's a DTO which contains a lot of other information (to avoid multiple calls to the server).
For instance, the response can include
Actions that can be performed on the group (for example, inviting a user to the group) and the corresponding URLs that need to be hit for each action
Count of members in the group.
Recent activity in the group
Member information
etc
Right now the only client using the REST APIs is the UI, which needs the additional information. If the APIs are exposed to developers later, they may not need all the information being returned. How do we handle REST responses in such scenarios, where we have to return DTOs which contain more information?
Is it good design to return DTOs in the REST response for GET, or should it be avoided?

It helps if you accept the fact that RESTful HTTP is noisy. The design compensation for the noise is caching, which you should try to use as much as you can to save server hits. A well-cached application can use multiple resources, rather than one large resource, because many of the requests will not ever leave the client.
As for your specific question, use an expand query parameter to identify which child objects to include. You can further specify which properties of that child to include. For example,
GET /groups/api/v1/groups/12345
{
"id": 12345,
"name": "The Magnificent Seven",
"location": {
"self": "/groups/api/v1/locations/43"
}
}
GET /groups/api/v1/groups/12345?expand=location
{
"id": 12345,
"name": "The Magnificent Seven",
"location": {
"self": "/groups/api/v1/locations/43",
"latitude": "24°01′N",
"longitude": "104°40′W"
}
}
GET /groups/api/v1/groups/12345?expand=location[latitude]
{
"id": 12345,
"name": "The Magnificent Seven",
"location": {
"self": "/groups/api/v1/locations/43",
"latitude": "24°01′N"
}
}
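The expand behaviour shown above could be implemented roughly as follows. This is only a minimal sketch: the in-memory GROUP and LOCATIONS data, the location_id field and the helper names parse_expand and render_group are assumptions made for illustration, not part of any particular framework.

```python
import re

# Hypothetical in-memory data standing in for a real data store.
GROUP = {"id": 12345, "name": "The Magnificent Seven", "location_id": 43}
LOCATIONS = {43: {"latitude": "24°01′N", "longitude": "104°40′W"}}


def parse_expand(expand):
    """Parse 'location' or 'location[latitude]' into {child: [fields] or None}."""
    wanted = {}
    for match in re.finditer(r"(\w+)(?:\[([^\]]*)\])?", expand or ""):
        name, fields = match.group(1), match.group(2)
        wanted[name] = fields.split(",") if fields else None
    return wanted


def render_group(group, expand=None):
    """Build the group representation, expanding the location only on request."""
    wanted = parse_expand(expand)
    location_id = group["location_id"]
    # Without expansion, the child is represented by its link only.
    location = {"self": f"/groups/api/v1/locations/{location_id}"}
    if "location" in wanted:
        details = LOCATIONS[location_id]
        fields = wanted["location"] or list(details)
        location.update({f: details[f] for f in fields if f in details})
    return {"id": group["id"], "name": group["name"], "location": location}
```

Calling render_group(GROUP, "location[latitude]") returns the link plus the latitude only, while render_group(GROUP) returns just the link.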

Giving more data than required can prove harmful, and you will have to start explaining to everyone why you are returning so much data. You can have a query parameter that is known only to a subset of users, say "alldetails=true", which returns the full DTO.
If you are using Java with Codehaus Jackson or some other JSON utility on the REST server, you can specify which elements to expose by using a mixin. Jackson has "addMixInAnnotations()" for that.
A good REST response for GET should have an identifier, the required details and URLs, in JSON or XML.

Related

How to delete an element from a collection association in Spring Data REST?

Suppose you have a One-To-Many or Many-To-Many relationship in Spring Data REST. Let's say you have groups that have a One-to-Many relationship with users. If you get the list of associations from a group you will get back links like this:
{
"_embedded": {
"users": [
{
"username": "test25",
"enabled": false,
"firstName": "strifng",
"lastName": "sdfdffff",
"_links": {
"self": {
"href": "…/users/78"
}
}
},
{
"username": "test33",
"enabled": true,
"firstName": "sd",
"lastName": "asdfsa",
"_links": {
"self": {
"href": "…/users/77"
}
}
}
]
}
}
This is useless if you are trying to remove a particular user from a group. You are forced to use PUT with /groups/{id}/users, but that is impractical if you have thousands of users. You can POST to /groups/{id}/users with a list of URIs, but you can't DELETE to /groups/{id}/users.
Why?
The only way DELETE works is by calling /groups/{id}/users/{id} but there's no way to construct this URI from the front end as it is not returned in the collection.
How do you get around this?

The pattern that you'd need to use here is to access the association resource, asking it for the text/uri-list media type, get the URIs of all linked resources, modify the list as you see fit and PUT it back to the association resource. I.e.:
GET /groups/4711/users
200 OK
…/users/3149
…/users/41
…/users/4711
Followed by a:
PUT /groups/4711/users
…/users/3149
…/users/4711
Basically removing the user with id 41 from the association.
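On the client side, the list manipulation between the GET and the PUT could look like the sketch below. The function name remove_from_uri_list is hypothetical, and the sketch assumes the usual text/uri-list conventions (one URI per line, lines starting with # are comments, CRLF line endings); the elided "…" prefixes are kept exactly as in the example above.

```python
def remove_from_uri_list(body, uri_to_remove):
    """Return a text/uri-list body with one URI removed.

    Blank lines and '#' comment lines are dropped; the remaining URIs
    keep their original order and are joined with CRLF.
    """
    uris = [
        line.strip()
        for line in body.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return "\r\n".join(uri for uri in uris if uri != uri_to_remove)


# Body as returned by GET /groups/4711/users in the example above.
current = "…/users/3149\r\n…/users/41\r\n…/users/4711"
# Body to send with PUT /groups/4711/users.
updated = remove_from_uri_list(current, "…/users/41")
```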
The problem
The problem with this suggestion is that it currently doesn't work 🙃. It's broken in the sense that the lookup of the list of URIs currently fails due to a bug. It looks like that functionality went off the radar at some point (as it's not even advertised in the reference docs anymore). The good news is that I filed and fixed a ticket for you. If you give the latest snapshots a try, the suggested protocol should be working.
Some general considerations
In general it's hard to provide an API to generically remove individual items from a collection of associations. The HTTP DELETE method unfortunately operates on the target URI only and does not take any request body. I.e. you'd have to expose some kind of identification mechanism for the individual collection elements within the URI. There's no spec that I am aware of that defines how to do that and we don't want to get into the business of defining one.
One could investigate the ability to use JSON Patch requests against collection-like association resources, but that's not without problems either. I've filed a ticket to keep track of that idea.
Besides that, a potentially ever-growing list of references to other resources is pretty hard to manage in the first place. It might be a better choice to augment the resource space with a custom resource that handles the unassignment of the user from the group and advertise that through a custom link.

What is the standard practice for designing REST API if it is being used for inserting / updating a list of records

We are building an API which will be used for inserting and updating the records in a database. So if the record exists based on the Key the record will be updated and if it does not then it will be inserted.
I have two questions.
As per REST guidelines, what are the options for designing such an API e.g. PUT / POST OR PATCH? How should the list of objects be represented?
NOTE: I know from other answers that I read that there is confusion over how it should be as per REST guidelines. So I am OK if I can get some guidance on general best practice (irrespective of REST part)
Secondly, the part I am really confused about is how to represent the output of this, i.e. what this API should return.
Specific guidance/inputs on above topic would be really appreciated.

I've seen many different implementations for inserts/updates across various vendors (Stripe, HubSpot, PayPal, Google, Microsoft). Even though they differ, the difference somehow fits well with their overall API implementation and is not usually a cause for stress.
With that said, the "general" rule for inserts is:
POST /customers - provide the customer details within the body.
This will create a new customer and return the unique ID and customer details in the response (along with createdDate and other auto-generated attributes).
Most, if not all, API vendors implement this logic for inserts.
Updates are quite different. Your options include:
POST
POST /customer/<customer_id> - include attributes and values you want to update within the body.
Here you use a POST to update the customer. It's not a very common implementation, but I've seen it in several places.
PUT
PUT /customer/<customer_id> - include either all, or partially updated attributes within the body.
Given that PUT is defined as an idempotent method that replaces the target resource, you can either stay true to the REST convention and expect your users to provide all the attributes to update the resource, or make it simpler by only accepting the attributes they want to update. The second option is not very "RESTful", but is easier to handle from a user's perspective (and reduces the size of the payload).
PATCH
PATCH /customer/<customer_id> - include the operation and the attributes that you want to update, remove, replace, etc. within the body. More about how to PATCH.
The PATCH method is used for partial updates, and it's how you're "meant" to invoke partial updates. It's a little harder to use from a consumer's perspective.
Now, this is where the bias kicks in. I personally prefer to use POST, where I am not required to provide all the attributes to invoke an update (just the ones I want to update). The reason is simplicity of usage.
In terms of the response body for updates, they will usually return the object within the response body, including the updated attributes (and updated auto-generated attributes, such as updatedDate).
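To make the difference between the update styles concrete, here is a minimal sketch of the two server-side behaviours discussed above: full replacement (strict PUT) versus merging only the supplied attributes (the relaxed PUT or POST style). The function names and the sample customer are assumptions for illustration only.

```python
def replace_customer(stored, payload):
    """Strict PUT semantics: the payload becomes the new state.

    Attributes that are missing from the payload are dropped; only the
    server-assigned id is carried over.
    """
    return {"id": stored["id"], **payload}


def merge_customer(stored, payload):
    """Partial-update semantics: only the supplied attributes change."""
    return {**stored, **payload}


stored = {"id": "777", "first_name": "Jane", "last_name": "Doe"}
change = {"first_name": "Janet"}

replaced = replace_customer(stored, change)  # last_name is lost
merged = merge_customer(stored, change)      # last_name is kept
```

With strict replacement the client must resend last_name to keep it, which is exactly the extra payload the relaxed variants try to avoid.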
Bulk inserts/ updates
Looking at the Facebook Graph HTTP API (Batch Request) for inspiration, and assuming POST is your preferred method for updates, you could embed an array of requests using a dedicated batch resource as an option.
Endpoint: POST /batch/customers
Body:
[
{"first_name": "John", "last_name": "Smith"...}, //CREATE
{"id": "777", "first_name": "Jane", "last_name": "Doe"...}, //UPDATE
{"id": "999", "first_name": "Mike", "last_name": "Smith"...}, //UPDATE
{....}
]
Sample Response
{
"id": "123",
"result":[
{ // Creation successful
"code": 200,
"headers":{..},
"body": {..},
"uri": "/customers/345"
},
{ // Update successful
"code": 200,
"headers":{..},
"body": {..},
"uri": "/customers/777"
},
{ // A failed update request
"code": 404,
"headers":{..},
"body": {..} // body includes error details
}
]
}
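A handler behind such a batch endpoint could be sketched as below. This is an illustration under stated assumptions, not the Facebook implementation: the in-memory store, the process_batch name and the id_source counter are hypothetical, and the sketch reports 201 for a creation where the sample response above shows 200.

```python
import itertools


def process_batch(store, items, id_source):
    """Apply each item as a create (no id) or an update (id present).

    Every item gets its own result entry, so one failure does not hide
    the outcome of the others.
    """
    results = []
    for item in items:
        if "id" not in item:
            new_id = str(next(id_source))
            store[new_id] = {"id": new_id, **item}
            results.append(
                {"code": 201, "body": store[new_id], "uri": f"/customers/{new_id}"}
            )
        elif item["id"] in store:
            store[item["id"]].update(item)
            results.append(
                {"code": 200, "body": store[item["id"]], "uri": f"/customers/{item['id']}"}
            )
        else:
            results.append({"code": 404, "body": {"error": "customer not found"}})
    return {"result": results}


store = {"777": {"id": "777", "first_name": "Jane", "last_name": "Roe"}}
response = process_batch(
    store,
    [
        {"first_name": "John", "last_name": "Smith"},  # create
        {"id": "777", "last_name": "Doe"},             # update
        {"id": "999", "first_name": "Mike"},           # unknown id
    ],
    itertools.count(345),
)
```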

How to post two resources in the same request?

I have /companies and /users resources. My business logic prevents creating company without first user. Only way I can think of:
POST /companies
{
"name": "Harvey's Broiler",
"user": {
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com",
"password": "password"
}
}
Response:
{
"id": 10001,
"name": "Harvey's Broiler",
"user": {
"id": 10002,
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com"
}
}
Later, they can be reachable as:
GET /companies/10001
Response:
{
"id": 10001,
"name": "Harvey's Broiler"
}
and
GET /users/10002
or
GET /companies/10001/users/10002
Response:
{
"id": 10002,
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com"
}

This is a common question developers encounter when designing APIs.
The first question you should ask yourself is: are the Company and User resources citizens of the same order? By order, I mean: are they of the same importance, are they both resources you operate on heavily, and do they have independent roles and operations in the system you are building? My hunch is the answer is yes. If the answer is no, then the User is only a way of denoting the founder of the Company, and you have no problem: just put the User in as an embedded object like you already did.
However, if my hunch is right that you have some business logic around User and Company, I would do just that: keep them separate, under separate endpoints.
If you need a User to create a Company, then implement that logic. If someone attempts to create a Company without a User, return an error (HTTP response 400, or something along those lines). Of course, that should be documented for the API consumer. The request then looks like this:
POST /companies
{
"name": "Harvey's Broiler",
"user": 1234
}
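The rule that a Company can only be created for an existing User could be enforced along these lines. This is a rough sketch: the in-memory users and companies dictionaries, the create_company name and the ID scheme are assumptions made for illustration.

```python
def create_company(payload, users, companies):
    """Create a company only if it references an existing user.

    Returns a (status, body) pair; 400 signals a missing or unknown user.
    """
    user_id = payload.get("user")
    if user_id is None:
        return 400, {"error": "a company requires an existing user"}
    if user_id not in users:
        return 400, {"error": f"user {user_id} does not exist"}
    # Hypothetical ID scheme: next integer after the highest known ID.
    company_id = max(companies, default=10000) + 1
    companies[company_id] = {
        "id": company_id,
        "name": payload.get("name"),
        "user": user_id,
    }
    return 201, companies[company_id]


users = {1234: {"id": 1234, "firstName": "Jojo"}}
companies = {}
status, body = create_company({"name": "Harvey's Broiler", "user": 1234}, users, companies)
```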
Mashing the creation of two objects into the same request artificially can only lead to issues. Now you need to return the status for both (what if the User is created, but the Company failed?), return two IDs (what if you also need to add other information, such as tax details? Then you get a third ID) and so on.
The only valid reason for creating the User along with the Company is if a User is very often created along with the Company, if not always, and you need to reduce the number of API calls, so you only fire one, but I am not sure that is the case.
If you can't even have a User without a Company, then see if you can revise the requirements or create the User/Company in two steps. First fire a request for a User "placeholder" (let's say the User will not be visible in a list of Users or such; it will not be valid), and after creating the Company, the User becomes valid and visible, and other operations become permitted. Until then, there is no User, only a placeholder for it. The same logic can be reversed for the Company.
And another thing: I would not go for this kind of nesting:
GET /companies/10001/users/10002
First, it is usually hard to program (you get a lot of boilerplate), and it is a possible maintenance nightmare. You can extrapolate the case to:
GET /companies/10001/users/10002/accounts/24314/bank/address
And fetch a bank address for a bank of a user who founded a company. I'm hesitant to implement this kind of approach if I do not have to.
Also, please consider reading about HATEOAS. It might help you if you need this kind of nesting. Actually I will always encourage at least considering the HATEOAS principle when starting a new API.

REST API sub resources, data to return?

If we have customers and orders, I'm looking for the correct RESTful way to get this data:
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
}, {
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
}
1) GET /customers/123/orders
2) GET /customers/123?inc-orders=1
Am I correct that the last part/folder of the URL, excluding query string params, should be the resource returned?
If so, number 1 should only return order data and not include the customer data, while number 2 points directly at customer 123 and uses query string params to affect/filter the customer data returned, in this case including the order data.
Which of these two calls is the correct RESTful call for the above JSON? Or is there a secret number 3 option?

You have 3 options which I think could be considered RESTful.
1)
GET /customers/123
But always include the orders. Do you have a situation in which the client would not want to use the orders? Or can the orders array get really big? If so you might want another option.
2)
GET /customers/123, which could include a link to their orders like so:
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": {
"href": "<link to your orders goes here>"
}
}
}
With this, your client would have to make 2 requests to get a customer and their orders. The good thing about this approach, though, is that you can easily implement clean paging and filtering on orders.
3)
GET /customers/123?fields=orders
This is similar to your second approach. This will allow clients to use your API more efficiently, but I wouldn't go this route unless you really need to limit the fields that are coming back from your server. Otherwise it will add unnecessary complexity to your API which you will have to maintain.

The Resource (identified by the complete URL) is the same, a customer. Only the Representation is different, with or without embedded orders.
Use Content Negotiation to get different Representations for the same Resource.
Request
GET /customers/123/
Accept: application/vnd.acme.customer.short+json
Response
200 OK
Content-Type: application/vnd.acme.customer.short+json
{
"customer": {
"id": 123,
"name": "Jim Bloggs"
}
}
Request
GET /customers/123/
Accept: application/vnd.acme.customer.full+json
Response
200 OK
Content-Type: application/vnd.acme.customer.full+json
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
}, {
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
}
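On the server, choosing between the two representations could be sketched as follows. This is a deliberately simplified illustration: it matches a single exact media type and ignores wildcards and q-values, which a real Accept header parser would need to handle. The function name represent_customer is hypothetical.

```python
SHORT = "application/vnd.acme.customer.short+json"
FULL = "application/vnd.acme.customer.full+json"


def represent_customer(customer, orders, accept):
    """Return (status, content_type, body) for the requested media type."""
    if accept == FULL:
        return 200, FULL, {"customer": {**customer, "orders": orders}}
    if accept == SHORT:
        return 200, SHORT, {"customer": customer}
    # 406 Not Acceptable: no representation matches the Accept header.
    return 406, None, None


customer = {"id": 123, "name": "Jim Bloggs"}
orders = [
    {"id": 123, "item": "Union Jack Keyring", "qty": 1},
    {"id": 987, "item": "London Eye Ticket", "qty": 5},
]
short_response = represent_customer(customer, orders, SHORT)
full_response = represent_customer(customer, orders, FULL)
```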

The JSON that you posted looks like what would be the result of
GET /customers/123
provided the Customer resource contains a collection of Orders as a property; alternatively you could either embed them, or provide a link to them.
The latter would result in something like this:
GET /customers/123/orders
which would return something like
{
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
},
{
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}

I'm looking for the correct RESTful way to get this data
Simply perform an HTTP GET request on a URI that points to a resource that produces this data!
TL;DR
REST does not care about URI design, only about its constraints!
Clients perform state transitions through possible actions returned by the server through dynamically identified hyperlinks contained within the response.
Clients and servers can negotiate on a preferred hypermedia type
Instead of embedding the whole (sub-)resource consider only returning the link to that resource so a client can look it up if interested
First, REST does not really care about the URI design as long as the URI is unique. Sure, a simple URI design is easier for humans to understand, but, as in HTML, the actual link can be hidden behind more meaningful text, so the URI is not that important for humans either, as long as they are able to find the link and can perform an action against it. Next, why do you think your "response" or API is RESTful? To call an API RESTful, the API should respect a couple of constraints. Among these constraints is one that is quite buzzword-famous: hypertext as the engine of application state (HATEOAS).
REST is a generalized concept of the Web we use every day. A quite common task in a web session is that a client requests something and the server sends back an HTML document with plenty of links and other resources the client can use to request further pages or stream a video (or whatever). A user operating a client can use the returned information to proceed further, request new pages, send information to the server, and so on. The same holds true for RESTful applications. This is what REST simply defines as HATEOAS. If you now have a look at your "response" and check it against the HATEOAS constraint, you might see that your response does not contain any links to start with. A client therefore needs domain knowledge to proceed further.
JSON itself isn't the best hypermedia type IMO, as it only defines the overall syntax of the data but does not carry any semantics, similar to plain XML, which may at least have a DTD or schema a client can use to validate the document and check whether further semantic rules are available elsewhere. There are a couple of hypermedia types that build on JSON that are probably better suited, e.g. application/hal+json (a good comparison of JSON-based hypermedia types can be found in this blog post). You are of course entitled to define your own hypermedia type, though certain clients may not be able to understand it out of the box.
If you take a look at HAL, for example, you see that it defines an _embedded element where you can put certain sub-resources. This seems to be ideal in your case. Depending on your design, orders could also be a resource on its own and thus be reachable via GET /orders/{orderId} itself. Instead of embedding the whole sub-resource, you can also just include the link to that (sub-)resource so a client can look up the data if interested.
If there are cases where you want to return only customer data and other cases where you also want to include order data, you can define different hypermedia types (based on HAL, for example) for both, one returning just the customer data while the other also includes the order data. These types could be named like this: application/vnd.yourcompanyname.version.customers.hal+json or application/vnd.yourcompanyname.version.customer_orders.hal+json. While this is certainly a development overhead compared to adding a simple query parameter to the request, the semantics are clearer and the documentation overhead is on the hypermedia type (or representation) rather than the HTTP operation.
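As a rough illustration of the two HAL variants just described (link only versus embedded sub-resources), a representation builder might look like this. The customer_hal name and the URI layout are assumptions; only the _links and _embedded property names come from HAL itself.

```python
def customer_hal(customer, orders, embed_orders=False):
    """Build a HAL-style document for a customer.

    The orders are always advertised as a link; they are additionally
    embedded only when embed_orders is true.
    """
    customer_uri = f"/customers/{customer['id']}"
    doc = {
        "id": customer["id"],
        "name": customer["name"],
        "_links": {
            "self": {"href": customer_uri},
            "orders": {"href": f"{customer_uri}/orders"},
        },
    }
    if embed_orders:
        doc["_embedded"] = {
            "orders": [
                {**order, "_links": {"self": {"href": f"/orders/{order['id']}"}}}
                for order in orders
            ]
        }
    return doc


customer = {"id": 123, "name": "Jim Bloggs"}
orders = [{"id": 987, "item": "London Eye Ticket", "qty": 5}]
linked = customer_hal(customer, orders)
embedded = customer_hal(customer, orders, embed_orders=True)
```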
You can of course also define some kind of view structure where one view only returns the customer data as is while a different view returns the customer data including the orders similar to a response I gave on a not so unrelated topic.

REST design for "related" resources

Suppose I have a customer that has multiple accounts. Really any data object that has a "1 to many" relationship with another complex object will do.
An example might be:
{ id: 1,
name: "Bob",
accounts: [ { id: 2, name: "Work account" },
{ id: 3, name: "Home account" } ] }
My question is, when is it appropriate/better to expose the accounts as a sub-resource of the customer, vs. as a separate resource? Or both?
For example, my first intuition would be to have: /customers/1 return the object above. If you wanted to modify one of the accounts, you'd have to POST to /accounts/2.
The other way to go about it (I have seen in some APIs) is to expose another route /customers/1/accounts which would return the array above, and then set up POST/PATCH routes there to allow API users to mutate the array of accounts.
My problem with that approach is that if the array of accounts are actually "included by reference", it's not really clear whether that REST route is modifying the account or if it's merely modifying the linkage between customer and the account.
Is there a best practice here?

This is a good question and is up for discussion (there isn't a "correct" answer). Here are some points you may want to consider:
Having the child account resource embedded in the customer resource will cause more data to always be sent back with the /customers/{id} request.
Having the child account resource non-embedded will require a client to send multiple HTTP requests if it needs both basic customer information and also account information.
You'll want to determine exactly how your security paradigm will work with embedded resources (i.e. is it possible to be allowed to get the information of a customer but not be allowed to see the customer's accounts?).
Does it ever make sense to have an account without a customer in your domain? Can accounts transfer ownership? If not, then /customers/{id}/accounts/{acct_id} makes more sense.
Implied in the definition of REST, issuing HTTP methods on a URI is modifying a resource identified by the URI, so by default, you're always modifying the account and not the linkage between the customer and account.
If you needed functionality to modify the linkage of accounts, you could invent a resource like an "account link request" (POST /accounts/{id}/linkrequest or something of that nature). This request could have a state in its own right, where you would have back-end functionality that would evaluate the request, determine whether an account should be attached to or detached from a customer, and then carry out the attach/detach process.
In summary, there's nothing in REST that prevents you from referencing the same resource with different links (/accounts/{id}; /customers/{id}/accounts/{acct_id}). My preference is if there are no security implications to having the sub-resource, then have it in conjunction with an endpoint to access the sub-resource by itself. If accounts could be tied to multiple users (or have no customers), I would also expose the /accounts/{id} endpoint.
ex. /customers/1:
{
"id": "1",
"name": "John Doe",
"accounts": {
"collection": [
{
"id": "1",
"type": "savings",
"balance": "$400",
"links": {
"self": "/customers/1/accounts/1"
}
},
{
"id": "2",
"type": "checking",
"balance": "$600",
"links": {
"self": "/customers/1/accounts/2",
"detach" : "/customers/1/accounts/2/linkrequest?action=detach"
}
}
],
"links": {
"self": "/customers/1/accounts",
"attach": "/customers/1/accounts/linkrequest?action=attach"
}
},
"links": {
"self": "/customers/1"
}
}

As you already suspected, it all comes down to the data representation model in your database: whether it is an owned relationship (an actual list of entities), an unowned relationship (what you referred to as "included by reference"), or even an implicit 1:N relationship by backlink reference (a customer backlink field on the account).
For retrieving the data by GET, the following can be used depending on the representation model:
/customers/1/accounts
/accounts?customer eq 1
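Under the backlink model, both routes can resolve to the same lookup, as in this minimal sketch. The ACCOUNTS list, the customer backlink field name and the accounts_for_customer helper are assumptions for illustration.

```python
# Hypothetical account records, each carrying a backlink to its customer.
ACCOUNTS = [
    {"id": 2, "name": "Work account", "customer": 1},
    {"id": 3, "name": "Home account", "customer": 1},
    {"id": 4, "name": "Other account", "customer": 7},
]


def accounts_for_customer(accounts, customer_id):
    """Shared lookup behind /customers/{id}/accounts and the filtered /accounts route."""
    return [
        {"id": account["id"], "name": account["name"]}
        for account in accounts
        if account["customer"] == customer_id
    ]


bobs_accounts = accounts_for_customer(ACCOUNTS, 1)
```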