REST API sub resources, data to return? - rest

If we have customers and orders, I'm looking for the correct RESTful way to get this data:
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
}, {
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
}
1. GET /customers/123/orders
2. GET /customers/123?inc-orders=1
Am I correct that the last part/folder of the URL, excluding query string params, should be the resource returned..?
If so, number 1 should only return order data and not include the customer data, while number 2 points directly at customer 123 and uses query string params to affect/filter the customer data returned, in this case including the order data.
Which of these two calls is the correct RESTful call for the above JSON..? ...or is there a secret number 3 option..?

You have 3 options which I think could be considered RESTful.
1)
GET /customers/123
But always include the orders. Do you have a situation in which the client would not want to use the orders? Or can the orders array get really big? If so you might want another option.
2)
GET /customers/123, which could include a link to their orders like so:
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": {
"href": "<link to your orders goes here>"
}
}
}
With this, your client has to make two requests to get a customer and their orders. The upside is that you can easily implement clean paging and filtering on orders.
3)
GET /customers/123?fields=orders
This is similar to your second approach. This will allow clients to use your API more efficiently, but I wouldn't go this route unless you really need to limit the fields that are coming back from your server. Otherwise it will add unnecessary complexity to your API which you will have to maintain.
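A minimal sketch of the difference between returning a link (option 2) and embedding the orders on request (option 3). The function name `render_customer`, the `embed_orders` flag and the link layout are assumptions made for illustration, not part of any particular framework.

```python
def render_customer(customer, orders, embed_orders=False):
    """Build the response body for GET /customers/{id}.

    embed_orders=False: only a link to the orders sub-resource is returned.
    embed_orders=True:  the orders are embedded, e.g. when ?fields=orders is sent.
    """
    body = {"id": customer["id"], "name": customer["name"]}
    if embed_orders:
        body["orders"] = list(orders)
    else:
        # Hypothetical link layout; any URI works as long as the server returns it.
        body["orders"] = {"href": f"/customers/{customer['id']}/orders"}
    return {"customer": body}


customer = {"id": 123, "name": "Jim Bloggs"}
orders = [
    {"id": 123, "item": "Union Jack Keyring", "qty": 1},
    {"id": 987, "item": "London Eye Ticket", "qty": 5},
]
linked = render_customer(customer, orders)
embedded = render_customer(customer, orders, embed_orders=True)
```

Here `linked` carries only the href, so the client needs a second request, while `embedded` matches the JSON from the question.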

The Resource (identified by the complete URL) is the same, a customer. Only the Representation is different, with or without embedded orders.
Use Content Negotiation to get different Representations for the same Resource.
Request
GET /customers/123/
Accept: application/vnd.acme.customer.short+json
Response
200 OK
Content-Type: application/vnd.acme.customer.short+json
{
"customer": {
"id": 123,
"name": "Jim Bloggs"
}
}
Request
GET /customers/123/
Accept: application/vnd.acme.customer.full+json
Response
200 OK
Content-Type: application/vnd.acme.customer.full+json
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
}, {
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
}
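On the server side, the negotiation above could look roughly like the following sketch. The function name `negotiate` and the simplified Accept parsing (no q-values) are assumptions; a real framework would normally do this parsing for you.

```python
SHORT = "application/vnd.acme.customer.short+json"
FULL = "application/vnd.acme.customer.full+json"


def negotiate(accept_header, customer, orders):
    """Return (status, content_type, body) for GET /customers/{id}."""
    # Simplified parsing: split on commas, drop parameters such as q-values.
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    body = {"id": customer["id"], "name": customer["name"]}
    if FULL in accepted:
        body["orders"] = list(orders)
        return 200, FULL, {"customer": body}
    if SHORT in accepted or "*/*" in accepted:
        return 200, SHORT, {"customer": body}
    # None of the offered representations matches what the client asked for.
    return 406, None, None
```

The URL stays the same in both cases; only the Accept header decides which representation is returned.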

The JSON that you posted looks like what would be the result of
GET /customers/123
provided the Customer resource contains a collection of Orders as a property; alternatively you could either embed them, or provide a link to them.
The latter would result in something like this:
GET /customers/123/orders
which would return something like
{
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
},
{
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}

I'm looking for the correct RESTful way to get this data
Simply perform a HTTP GET request on a URI that points to a resource that produces this data!
TL;DR
REST does not care about URI design, but about its constraints!
Clients perform state transitions through possible actions returned by the server through dynamically identified hyperlinks contained within the response.
Clients and servers can negotiate on a preferred hypermedia type
Instead of embedding the whole (sub-)resource consider only returning the link to that resource so a client can look it up if interested
First, REST does not really care about the URI design as long as the URI is unique. Sure, a simple URI design is easier for humans to understand, but, as in HTML, the actual link can be hidden behind more meaningful text, so it is not that important for humans either, as long as they can find the link and perform an action against it. Next, why do you think your "response" or API is RESTful? To call an API RESTful, the API should respect a couple of constraints. Among these constraints is one quite buzzword-famous: hypertext as the engine of application state (HATEOAS).
REST is a generalized concept of the Web we use every day. A quite common task in a web session is that a client requests something and the server sends an HTML document with plenty of links and other resources the client can use to request further pages or stream a video (or whatever). A user operating a client can use the returned information to proceed further, request new pages, send information to the server, and so on. The same holds true for RESTful applications. This is simply what REST defines as HATEOAS. If you now have a look at your "response" and double-check it against the HATEOAS constraint, you might see that your response does not contain any links to start with. A client therefore needs domain knowledge to proceed further.
JSON itself isn't the best hypermedia type IMO, as it only defines the overall syntax of the data but does not carry any semantics, similar to plain XML, which may at least have a DTD or schema a client can use to validate the document and check whether further semantic rules are available elsewhere. There are a couple of hypermedia types that build on JSON and are probably better suited, e.g. application/hal+json (a good comparison of JSON-based hypermedia types can be found in this blog post). You are of course entitled to define your own hypermedia type, though certain clients may not be able to understand it out of the box.
If you take a look at HAL, for example, you see that it defines an _embedded element where you can put certain sub-resources. This seems to be ideal in your case. Depending on your design, orders could also be a resource of its own and thus be reachable via GET /orders/{orderId} itself. Instead of embedding the whole sub-resource, you can also just include the link to that (sub-)resource so a client can look up the data if interested.
If there are cases where you want to return only customer data and other cases where you also want to include order data, you can e.g. define different hypermedia types (based on HAL, for instance) for both, one returning just the customer data while the other also includes the order data. These types could be named like this: application/vnd.yourcompanyname.version.customers.hal+json or application/vnd.yourcompanyname.version.customer_orders.hal+json. While this is certainly a development overhead compared to adding a simple query parameter to the request, the semantics are clearer and the documentation overhead is on the hypermedia type (or representation) rather than the HTTP operation.
You can of course also define some kind of view structure where one view only returns the customer data as is while a different view returns the customer data including the orders similar to a response I gave on a not so unrelated topic.
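To make the HAL idea more concrete, here is a small sketch that builds an application/hal+json style document for the customer from the question. The function name `customer_as_hal`, the link relation name `orders` and the URI layout are assumptions chosen for illustration; only the `_links` and `_embedded` members come from HAL itself.

```python
def customer_as_hal(customer, orders, embed_orders=True):
    """Build a HAL-style document for one customer."""
    customer_uri = f"/customers/{customer['id']}"
    doc = {
        "id": customer["id"],
        "name": customer["name"],
        "_links": {
            "self": {"href": customer_uri},
            # Clients that want the orders can follow this link.
            "orders": {"href": f"{customer_uri}/orders"},
        },
    }
    if embed_orders:
        # Each embedded order is a resource of its own with a self link.
        doc["_embedded"] = {
            "orders": [
                {**order, "_links": {"self": {"href": f"/orders/{order['id']}"}}}
                for order in orders
            ]
        }
    return doc
```

With `embed_orders=False` the document only advertises the link, which corresponds to the "just return the link" variant described above.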

Related

How to delete an element from a collection association in Spring Data REST?

Suppose you have a One-To-Many or Many-To-Many relationship in Spring Data REST. Let's say you have groups that have a One-to-Many relationship with users. If you get the list of associations from a group you will get back links like this:
{
"_embedded": {
"users": [
{
"username": "test25",
"enabled": false,
"firstName": "strifng",
"lastName": "sdfdffff",
"_links": {
"self": {
"href": "…/users/78"
}
}
},
{
"username": "test33",
"enabled": true,
"firstName": "sd",
"lastName": "asdfsa",
"_links": {
"self": {
"href": "…/users/77"
}
}
}
]
}
}
Which is useless if you are trying to remove a particular user from a group. You are forced to use PUT with /groups/{id}/users, but that is impossible if you have thousands of users. You can POST to /groups/{id}/users with a list of URIs, but you can't DELETE to /groups/{id}/users.
Why?
The only way DELETE works is by calling /groups/{id}/users/{id} but there's no way to construct this URI from the front end as it is not returned in the collection.
How do you get around this?
The pattern that you'd need to use here is to access the association resource asking it for the text/uri-list media type, get the URIs of all linked resources, modify the list as you see fit and PUT it back to the association resource. I.e.:
GET /groups/4711/users
200 OK
…/users/3149
…/users/41
…/users/4711
Followed by a:
PUT /groups/4711/users
…/users/3149
…/users/4711
Basically removing the user with id 41 from the association.
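The "modify the list" step in between could be sketched on the client like this. The helper name `remove_uri` is made up, and the host `example.test` used in the usage note is a placeholder, not the elided prefix from the listing above. The HTTP calls themselves are left out.

```python
def remove_uri(uri_list_body, uri_to_remove):
    """Return a text/uri-list body without uri_to_remove."""
    kept = []
    for line in uri_list_body.splitlines():
        line = line.strip()
        # Blank lines carry nothing; '#' starts a comment line in text/uri-list.
        if not line or line.startswith("#"):
            continue
        if line != uri_to_remove:
            kept.append(line)
    # Entries in text/uri-list are separated by CRLF.
    return "\r\n".join(kept)
```

Calling `remove_uri(body, "https://example.test/users/41")` on the body fetched with GET gives the body to send back with PUT.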
The problem
The problem with this suggestion is that it currently doesn't work 🙃. It's broken in the sense that the lookup of the list of URIs currently fails due to a bug. Looks like that functionality went off the radar at some point (as it's not even advertised in the reference docs anymore). Good news is that I filed and fixed a ticket for you. If you give the latest snapshots a try, the suggested protocol should be working.
Some general considerations
In general it's hard to provide an API to generically remove individual items from a collection of associations. The HTTP DELETE method unfortunately operates on the target URI only and does not take any request body. I.e. you'd have to expose some kind of identification mechanism for the individual collection elements within the URI. There's no spec that I am aware of that defines how to do that and we don't want to get into the business of defining one.
One could investigate the ability to use JSON Patch requests against collection-like association resources, but that's not without problems either. I've filed a ticket to keep track of that idea.
Besides that a potentially ever-growing list of references to other resources is pretty hard to manage in the first place. It might be a better choice to augment the resources space with a custom resource that handles the unassignment of the user from the group and advertise that through a custom link.

What is the standard practice for designing REST API if it is being used for inserting / updating a list of records

We are building an API which will be used for inserting and updating the records in a database. So if the record exists based on the Key the record will be updated and if it does not then it will be inserted.
I have two questions.
As per REST guidelines, what are the options for designing such an API e.g. PUT / POST OR PATCH? How should the list of objects be represented?
NOTE: I know from other answers that I read that there is confusion over how it should be as per REST guidelines. So I am OK if I can get some guidance on general best practice (irrespective of REST part)
Secondly, the part where I am really confused about is how to represent the output of this or what this API should return.
Specific guidance/inputs on above topic would be really appreciated.
I've seen many different implementations for inserts/updates across various vendors (Stripe, HubSpot, PayPal, Google, Microsoft). Even though they differ, the difference somehow fits well with their overall API implementation and is not usually a cause for stress.
With that said, the "general" rule for inserts is:
POST /customers - provide the customer details within the body.
This will create a new customer and return the unique ID and customer details in the response (along with createdDate and other auto-generated attributes).
Pretty much most, if not all API vendors, implement this logic for inserts.
Updates, are quite different. Your options include:
POST
POST /customer/<customer_id> - include attributes and values you want to update within the body.
Here you use a POST to update the customer. It's not a very common implementation, but I've seen it in several places.
PUT
PUT /customer/<customer_id> - include either all, or partially updated attributes within the body.
Given PUT is technically an idempotent method, you can either stay true to the REST convention and expect your users to provide all the attributes to update the resource, or make it simpler by only accepting the attributes they want to update. The second option is not very "RESTful", but is easier to handle from a user's perspective (and reduces the size of the payload).
PATCH
PATCH /customer/<customer_id> - include the operation and attributes that you want to update / remove/ replace / etc within the body. More about how to PATCH.
The PATCH method is used for partial updates, and it's how you're "meant" to invoke partial updates. It's a little harder to use from a consumer's perspective.
Now, this is where the bias kicks-in. I personally prefer to use POST, where I am not required to provide all the attributes to invoke an update (just the ones I want to update). Reason is due to simplicity of usage.
As for the response body for updates, APIs usually return the object in the response body, including the updated attributes (and updated auto-generated attributes, such as updatedDate).
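The difference between a full replacement and a partial update can be sketched with an in-memory store. The function names `put_customer` and `partial_update` and the dictionary store are assumptions for illustration only.

```python
def put_customer(store, customer_id, body):
    """Full replacement: attributes missing from the body are dropped."""
    store[customer_id] = {**body, "id": customer_id}
    return store[customer_id]


def partial_update(store, customer_id, changes):
    """Partial update: only the supplied attributes change."""
    store[customer_id] = {**store[customer_id], **changes, "id": customer_id}
    return store[customer_id]


store = {"777": {"id": "777", "first_name": "Jane", "last_name": "Doe"}}
merged = partial_update(store, "777", {"last_name": "Smith"})
replaced = put_customer(store, "777", {"first_name": "Janet"})
```

After the partial update the first name is still there; after the replacement the last name is gone because the body did not contain it.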
Bulk inserts/ updates
Looking at the Facebook Graph HTTP API (Batch Request) for inspiration, and assuming POST is your preferred method for updates, you could embed an array of requests using a dedicated batch resource as an option.
Endpoint: POST /batch/customers
Body:
[
{"first_name": "John", "last_name": "Smith"...}, //CREATE
{"id": "777", "first_name": "Jane", "last_name": "Doe"...}, //UPDATE
{"id": "999", "first_name": "Mike", "last_name": "Smith"...}, //UPDATE
{....}
]
Sample Response
{
"id": "123",
"result":[
{ // Creation successful
"code": 200,
"headers":{..},
"body": {..},
"uri": "/customers/345"
},
{ // Update successful
"code": 200,
"headers":{..},
"body": {..},
"uri": "/customers/777"
},
{ // A failed update request
"code": 404,
"headers":{..},
"body": {..} // body includes error details
}
]
}
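A rough sketch of how a server could process such a batch and build the per-item results. The function name `process_batch`, the in-memory store and the id generator are assumptions; the status codes follow the sample response above (200 for a successful create or update, 404 for an update of an unknown id), although 201 for a create would be just as reasonable.

```python
import itertools


def process_batch(items, store, id_source):
    """Apply a list of create/update items and report one result per item."""
    results = []
    for item in items:
        customer_id = item.get("id")
        if customer_id is None:
            # No id supplied: treat the item as a create.
            customer_id = str(next(id_source))
            store[customer_id] = {**item, "id": customer_id}
        elif customer_id in store:
            # Known id: merge the supplied attributes into the stored record.
            store[customer_id].update(item)
        else:
            # Unknown id: report the failure but keep processing the rest.
            results.append({"code": 404, "body": {"error": "customer not found"}})
            continue
        results.append({
            "code": 200,
            "body": store[customer_id],
            "uri": f"/customers/{customer_id}",
        })
    return {"result": results}


store = {"777": {"id": "777", "first_name": "Jane", "last_name": "Roe"}}
response = process_batch(
    [
        {"first_name": "John", "last_name": "Smith"},
        {"id": "777", "first_name": "Jane", "last_name": "Doe"},
        {"id": "999", "first_name": "Mike", "last_name": "Smith"},
    ],
    store,
    itertools.count(345),
)
```

Each item succeeds or fails on its own, so the overall HTTP status of the batch call only says that the batch was processed, not that every item went through.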

How to post two resources in the same request?

I have /companies and /users resources. My business logic prevents creating a company without a first user. The only way I can think of:
POST /companies
{
"name": "Harvey's Broiler",
"user": {
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com",
"password": "password"
}
}
Response:
{
"id": 10001,
"name": "Harvey's Broiler",
"user": {
"id": 10002,
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com"
}
}
Later, they can be reachable as:
GET /companies/10001
Response:
{
"id": 10001,
"name": "Harvey's Broiler"
}
and
GET /users/10002
or
GET /companies/10001/users/10002
Response:
{
"id": 10002,
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com"
}
This is a common question developers encounter when designing APIs.
The first question you should ask yourself is: are the Company and User resources citizens of the same order? By order, I mean: are they of the same importance, are they both resources you operate on heavily, and do they have independent roles and operations in the system you are building? My hunch is the answer is yes. If the answer is no, then the User is only a way of denoting the founder of the Company, and you have no problem: just put the User in as an embedded object like you already did.
However, if my hunch is right that you have some business logic around User and Company, I would do just that, keep them separate, under separate endpoints.
If you need a User to create a Company, then implement that logic. If someone attempts to create a Company without a User, return an error (HTTP response 400, or something along those lines). Of course, that should be documented for the user. The request then looks like this:
POST /companies
{
"name": "Harvey's Broiler",
"user": 1234
}
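A minimal sketch of that check, assuming in-memory dictionaries for users and companies. The function name `create_company`, the error payload and the id scheme are made up for illustration.

```python
def create_company(payload, users, companies):
    """Handle POST /companies; return (status, body)."""
    user_id = payload.get("user")
    if user_id is None or user_id not in users:
        # The business rule: no company without an existing user.
        return 400, {"error": "a company requires an existing user"}
    # Hypothetical id scheme: one past the highest id so far, starting at 10001.
    company_id = max(companies, default=10000) + 1
    companies[company_id] = {
        "id": company_id,
        "name": payload["name"],
        "user": user_id,
    }
    return 201, companies[company_id]
```

The User is created beforehand through its own endpoint, so this handler only ever reports the outcome for one resource.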
Artificially mashing the creation of two objects into the same request can only lead to issues. Now you need to return the status for both (a User is created, but Company failed?), return two IDs (what if you also need to add other information, tax details, you get a third ID) and so on.
The only valid reason for creating the User along with the Company is if a User is very often created along with the Company, if not always, and you need to reduce the number of API calls, so you only fire one, but I am not sure that is the case.
If you can't even have a User without a Company, then see if you can revise the requirements or create the User/Company in two steps. First fire a request for a User "placeholder" (let's say the User will not be visible in a list of Users or such, and it will not be valid); after the Company is created, the User becomes valid and visible, and other operations become permitted. Until then, there is no User, only a placeholder for it. The same logic can be reversed for the Company.
And another thing: I would not go for this kind of nesting:
GET /companies/10001/users/10002
First, it is usually hard to program (you get a lot of boilerplate), and it is a possible maintenance nightmare. You can extrapolate the case to:
GET /companies/10001/users/10002/accounts/24314/bank/address
And fetch a bank address for a bank of a user who founded a company. I'm hesitant to implement this kind of approach if I do not have to.
Also, please consider reading about HATEOAS. It might help you if you need this kind of nesting. Actually I will always encourage at least considering the HATEOAS principle when starting a new API.

REST - put IDs in body or not?

Let's say I want to have a RESTful resource for people, where the client is able to assign the ID.
A person looks like this: {"id": <UUID>, "name": "Jimmy"}
Now, how should the client save (or "PUT") it?
1. PUT /person/UUID {"id": <UUID>, "name": "Jimmy"} - now we have this nasty duplication that we have to verify all the time: does the ID in the body match the one in the path?
2. Asymmetric representation:
PUT /person/UUID {"name": "Jimmy"}
GET /person/UUID returns {"id": <UUID>, "name": "Jimmy"}
3. No IDs in body - ID only in location:
PUT /person/UUID {"name": "Jimmy"}
GET /person/UUID returns {"name": "Jimmy"}
No kind of POST seems like a good idea since the ID is generated by the client.
What are the common patterns and ways to solve it? IDs only in location seems like the most dogmatically correct way, but it also makes the practical implementation harder.
There is nothing wrong in having different read/write models: the client can write one resource representation, after which the server can return another representation with added/calculated elements in it (or even a completely different representation - there is nothing in any spec against that, the only requirement is that PUT should create or replace the resource).
So I would go for the asymmetric solution in (2) and avoid the "nasty duplication check" on the server side when writing:
PUT /person/UUID {"name": "Jimmy"}
GET /person/UUID returns {"id": <UUID>, "name": "Jimmy"}
If it is a public API you should be conservative when you reply, but accept liberally.
By that I mean, you should support both 1 and 2. I agree that 3 doesn't make sense.
The way to support both 1 and 2 is to get the id from the url if none is supplied in the request body, and if it is in the request body, then validate that it matches the id in the url. If the two do not match, then return a 400 Bad Request response.
When returning a person resource be conservative and always include the id in the json, even though it is optional in the put.
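The "accept liberally, reply conservatively" rule described above could be sketched as follows. The function name `resolve_person` and the error payload are assumptions; storage is left out.

```python
def resolve_person(url_id, body):
    """Handle the id for PUT /person/{url_id}; return (status, person)."""
    body_id = body.get("id")
    if body_id is not None and body_id != url_id:
        # The body names a different resource than the URL: reject.
        return 400, {"error": "id in body does not match id in URL"}
    # The id is optional on input but always present in what is returned.
    return 200, {**body, "id": url_id}
```

Both `PUT /person/UUID {"name": "Jimmy"}` and the variant with a matching id in the body end up with the same stored representation.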
Just FYI, the answers here are wrong.
TLDR; If you're using PUT, you should have the id in the body. If you are using PATCH, you do not need the id in the body.
See:
https://restfulapi.net/rest-api-design-tutorial-with-example/
https://restfulapi.net/rest-put-vs-post/
https://restfulapi.net/http-methods/#patch
PUT
Use PUT APIs primarily to update existing resource (if the
resource does not exist, then API may decide to create a new resource
or not). If a new resource has been created by the PUT API, the origin
server MUST inform the user agent via the HTTP response code 201
(Created) response and if an existing resource is modified, either the
200 (OK) or 204 (No Content) response codes SHOULD be sent to indicate
successful completion of the request.
If the request passes through a cache and the Request-URI identifies
one or more currently cached entities, those entries SHOULD be treated
as stale. Responses to this method are not cacheable.
Use PUT when you want to modify a singular resource which is already a
part of resources collection. PUT replaces the resource in its
entirety. Use PATCH if request updates part of the resource.
PATCH
HTTP PATCH requests are to make partial update on a resource. If you
see PUT requests also modify a resource entity so to make more clear –
PATCH method is the correct choice for partially updating an existing
resource and PUT should only be used if you’re replacing a resource in
its entirety.
So you should use it in this way:
POST /device-management/devices : Create a new device
PUT /device-management/devices/{id} : Update the device information identified by "id"
PATCH /device-management/devices/{id} : Partial-update the device information identified by "id"
RESTful practices indicate that it shouldn't matter what you PUT at /{id}--the content of the record should be updated to the one provided by the payload--but GET /{id} should still link to the same resource.
In other words, PUT /3 may update the payload id to 4, but GET /3 should still link to the same payload (and return the one with id set to 4).
If you decide that your API requires the same identifier in the URI and the payload, it's your job to make sure they match, but definitely use PATCH instead of PUT if you are excluding the id from a payload that should be there in its entirety. This is where the accepted answer got it wrong. PUT must replace the entire resource, whereas PATCH may be partial.
One solution to this issue involves the somewhat confusing concept of "Hypertext As The Engine Of Application State," or "HATEOAS." This means that a REST response contains the available resources or actions to be performed as hyperlinks. Using this method, which was part of the original conception of REST, the unique identifiers/IDs of resources are themselves hyperlinks. So, for example, you could have something like:
GET /person/<UUID> {"person": {"location": "/person/<UUID>", "data": { "name": "Jimmy"}}}
Then, if you want to update that resource, you could do (pseudocode):
person = response.person
updatedPerson = person.data
updatedPerson.name = "Timmy"
PUT(URI: person.location, data: updatedPerson)
One advantage of this is that the client doesn't have to have any idea about the server's internal representation of User IDs. The IDs could change, and even the URLs themselves could change, as long as the client has a way to discover them. For example, when getting a collection of people, you could return a response like this:
GET /people
{ "people": [
"/person/1",
"/person/2"
]
}
(You could, of course, also return the full person object for each person, depending on the needs of the application).
With this method, you think of your objects more in terms of resources and locations, and less in terms of ID. The internal representation of unique identifier is thus decoupled from your client logic. This was the original impetus behind REST: to create client-server architectures that are more loosely coupled than the RPC systems that existed before, by using the features of HTTP. For more information on HATEOAS, look at the Wikipedia article as well as this short article.
In an insert you do not need to add the id to the URL. This way, if you send an ID in a PUT, it may be interpreted as an UPDATE that changes the primary key.
INSERT:
PUT /persons/
{"id": 1, "name": "Jimmy"}
HTTP/1.1 201 Created
{"id": 1, "name": "Jimmy", "other_field": "filled_by_server"}
GET /persons/1
HTTP/1.1 200 OK
{"id": 1, "name": "Jimmy", "other_field": "filled_by_server"}
UPDATE
PUT /persons/1
{"id": "2", "name": "Jimmy Jr"}
HTTP/1.1 200 OK
{"id": "2", "name": "Jimmy Jr", "other_field": "filled_by_server"}
GET /persons/2
HTTP/1.1 200 OK
{"id": "2", "name": "Jimmy Jr", "other_field": "updated_by_server"}
The JSON API uses this standard and solves some issues by returning the inserted or updated object with a link to the new object. Some updates or inserts may include business logic that changes additional fields.
You will also see that you can avoid the get after the insert and update.
While it's OK to have different representations for different operations, a general recommendation for PUT is to contain the WHOLE payload. That means that the id should be there as well. Otherwise, you should use PATCH.
Having said that, I think PUT should mostly be utilised for updates and the id should always be passed in the URL as well. As a result of that, using PUT to update the resource identifier is a bad idea.
It leaves us in an undesirable situation when id in the URL can be different from the id in the body.
So, how do we resolve such a conflict? We basically have 2 options:
throw a 4XX exception
add a Warning(X-API-Warn etc) header.
That's as close as I can get to answering this question because the topic in general is a matter of opinion.
This has been asked before - the discussion is worth a look:
Should a RESTful GET response return a resource's ID?
This is one of those questions where it's easy to get bogged down into debate around what is and is not "RESTful".
For what it's worth, I try to think in terms of consistent resources and not change the design of them between methods. However, IMHO the most important thing from a usability perspective is that you are consistent across the entire API!
There is nothing bad in using different approaches, but I think the best way is the second solution.
PUT /person/UUID {"name": "Jimmy"}
GET /person/UUID returns {"id": <UUID>, "name": "Jimmy"}
This is how it is mostly done. Even Entity Framework uses this technique: when an entity is added to the DbContext without an ID, the generated ID is set on that same object by reference.
You may need to look into PATCH/PUT request types.
PATCH requests are used to update a resource partially whereas in PUT requests, you have to send the entire resource where it gets overridden on the server.
As far as having an ID in the url is concerned, I think you should always have it as it is a standard practice to identify a resource. Even the Stripe API works that way.
You can use a PATCH request to update a resource on the server with ID to identify it but do not update the actual ID.
I'm looking at this from a JSON-LD/ Semantic Web point of view because that's a good way to go to achieve real REST conformance as I have outlined in these slides. Looking at it from that perspective, there is no question to go for option (1.) as the ID (IRI) of a Web resource should always be equal to the URL which I can use to look-up/ dereference the resource.
I think the verification is neither hard to implement nor computationally intensive, so I don't consider this a valid reason for going with option (2.).
I think option (3.) is not really an option as POST (create new) has a different semantics than PUT (update/ replace).

RESTful API response for data transfer objects

I have the following scenario in my application. I am logged in as a user and I create a group. There is a REST API for creating the group (POST /groups/api/v1/groups) and getting the group details (GET /groups/api/v1/groups/{group id}).
The response returned on success is not just the JSON representation of the group resource. It's a DTO which contains a lot of other information (to avoid multiple calls to the server).
For instance, the response can include
Actions that can be performed on the group (for ex: inviting a user to the group) and the corresponding urls that need to be hit for each action
Count of members in the group.
Recent activity in the group
Member information
etc
Right now the only client using the REST APIs is the UI, which needs the additional information. If the APIs are exposed to developers later, they may not need all the information being returned. How do we handle REST responses in such scenarios, where we have to return DTOs which contain more information?
Is it good design to return DTOs in REST responses for GET, or should it be avoided?
It helps if you accept the fact that RESTful HTTP is noisy. The design compensation for the noise is caching, which you should try to use as much as you can to save server hits. A well-cached application can use multiple resources, rather than one large resource, because many of the requests will not ever leave the client.
As far as your specific question, use the expand query parameter to identify what child objects to include. You can further specify what properties of that child to include. For example,
GET /groups/api/v1/groups/12345
{
"id": 12345,
"name": "The Magnificent Seven",
"location": {
"self": "/groups/api/v1/locations/43"
}
}
GET /groups/api/v1/groups/12345?expand=location
{
"id": 12345,
"name": "The Magnificent Seven",
"location": {
"self": "/groups/api/v1/locations/43",
"longitude": "104°40′W",
"latitude": "24°01′N"
}
}
GET /groups/api/v1/groups/12345?expand=location[latitude]
{
"id": 12345,
"name": "The Magnificent Seven",
"location": {
"self": "/groups/api/v1/locations/43",
"latitude": "24°01′N"
}
}
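One way to handle such an expand parameter on the server is sketched below. The function names `parse_expand` and `render_group`, the `location_id` field and the lookup table of locations are assumptions made for illustration; the bracket syntax is the one used in the requests above.

```python
import re


def parse_expand(value):
    """Parse 'location[latitude],owner' into {'location': ['latitude'], 'owner': None}."""
    spec = {}
    for match in re.finditer(r"(\w+)(?:\[([^\]]*)\])?", value):
        name, props = match.group(1), match.group(2)
        # None means "expand every property of this child".
        spec[name] = props.split(",") if props else None
    return spec


def render_group(group, locations, expand=""):
    """Build the body for GET /groups/api/v1/groups/{id}?expand=..."""
    spec = parse_expand(expand)
    location = {"self": f"/groups/api/v1/locations/{group['location_id']}"}
    if "location" in spec:
        full = locations[group["location_id"]]
        wanted = spec["location"] or full.keys()
        location.update({key: full[key] for key in wanted if key in full})
    return {"id": group["id"], "name": group["name"], "location": location}
```

Without expand only the self link is returned; `expand="location"` adds every property of the location, and `expand="location[latitude]"` adds only the latitude.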
Giving out more data than required can prove harmful, and you will have to start explaining to everyone why you return so much data. You could have a query parameter known only to a subset of users, say "alldetails=true", which returns the full DTO.
If you are using Java with codehaus or some other JSON utility on the REST server you can specify what elements to expose by using a mixin. codehaus has "addMixInAnnotations()" for that.
A good REST response for GET should have an identifier, required details and URLs in JSON or XML.