Suppose I have a customer that has multiple accounts. Really any data object that has a "1 to many" relationship with another complex object will do.
An example might be:
{ id: 1,
name: "Bob",
accounts: [ { id: 2, name: "Work account" },
{ id: 3, name: "Home account" } ] }
My question is, when is it appropriate/better to expose the accounts as a sub-resource of the customer, vs. as a separate resource? Or both?
For example, my first intuition would be to have: /customers/1 return the object above. If you wanted to modify one of the accounts, you'd have to POST to /accounts/2.
The other way to go about it (I have seen in some APIs) is to expose another route /customers/1/accounts which would return the array above, and then set up POST/PATCH routes there to allow API users to mutate the array of accounts.
My problem with that approach is that if the array of accounts is actually "included by reference", it's not really clear whether that REST route is modifying the account or merely modifying the linkage between the customer and the account.
Is there a best practice here?
This is a good question and is up for discussion (there isn't a "correct" answer). Here are some points you may want to consider:
Having the child account resource embedded in the customer resource will cause more data to always be sent back with the /customers/{id} request.
Having the child account resource non-embedded will require a client to send multiple HTTP requests if it needs both basic customer information and also account information.
You'll want to determine exactly how your security paradigm will work with embedded resources (i.e., is it possible to be allowed to get a customer's information but not be allowed to see that customer's accounts?).
Does it ever make sense to have an account without a customer in your domain? Can accounts transfer ownership? If not, then /customers/{id}/accounts/{acct_id} makes more sense.
Implied in the definition of REST, issuing HTTP methods on a URI is modifying a resource identified by the URI, so by default, you're always modifying the account and not the linkage between the customer and account.
If you needed functionality to modify the linkage of accounts, you could invent a resource such as an "account link request" (POST /accounts/{id}/linkrequest or something of that nature). This request could have a state in its own right: back-end functionality would evaluate the request, determine whether the account should be attached to or detached from a customer, and then carry out the attach/detach process.
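A minimal sketch of such a request resource in Python. Everything here is an assumption made for illustration: the LinkRequest and LinkStore names, the three states, and the rule that duplicate attaches and detaches of missing links are rejected. It only shows that the request can carry a state of its own, separate from the account.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class LinkRequest:
    """Hypothetical resource created by POST /accounts/{id}/linkrequest."""
    account_id: int
    customer_id: int
    action: str  # "attach" or "detach"
    status: Status = Status.PENDING


@dataclass
class LinkStore:
    """In-memory stand-in for the customer/account linkage table."""
    links: set = field(default_factory=set)  # {(customer_id, account_id)}

    def process(self, request: LinkRequest) -> LinkRequest:
        pair = (request.customer_id, request.account_id)
        if request.action == "attach" and pair not in self.links:
            self.links.add(pair)
            request.status = Status.APPROVED
        elif request.action == "detach" and pair in self.links:
            self.links.remove(pair)
            request.status = Status.APPROVED
        else:
            # Assumed rule: unknown actions, duplicate attaches and
            # detaches of missing links are rejected.
            request.status = Status.REJECTED
        return request
```

The account itself is never touched by process; only the linkage changes, which keeps "modify the account" and "modify the link" on separate resources.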
In summary, there's nothing in REST that prevents you from referencing the same resource with different links (/accounts/{id}; /customers/{id}/accounts/{acct_id}). My preference is if there are no security implications to having the sub-resource, then have it in conjunction with an endpoint to access the sub-resource by itself. If accounts could be tied to multiple users (or have no customers), I would also expose the /accounts/{id} endpoint.
ex. /customers/1:
{
"id": "1",
"name": "John Doe",
"accounts": {
"collection": [
{
"id": "1",
"type": "savings",
"balance": "$400",
"links": {
"self": "/customers/1/accounts/1"
}
},
{
"id": "2",
"type": "checking",
"balance": "$600",
"links": {
"self": "/customers/1/accounts/2",
"detach" : "/customers/1/accounts/2/linkrequest?action=detach"
}
}
],
"links": {
"self": "/customers/1/accounts",
"attach": "/customers/1/accounts/linkrequest?action=attach"
}
},
"links": {
"self": "/customers/1"
}
}
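To illustrate that both URIs can identify the same account, here is a small Python sketch. The in-memory ACCOUNTS and CUSTOMERS tables and the resolve function are hypothetical; a real server would use its framework's router instead of splitting paths by hand.

```python
ACCOUNTS = {
    1: {"id": "1", "type": "savings", "balance": "$400"},
    2: {"id": "2", "type": "checking", "balance": "$600"},
}
CUSTOMERS = {1: {"id": "1", "name": "John Doe", "account_ids": [1, 2]}}


def resolve(path: str):
    """Map /accounts/{id} and /customers/{id}/accounts/{acct_id} to one account."""
    parts = [p for p in path.split("/") if p]
    if len(parts) == 2 and parts[0] == "accounts":
        return ACCOUNTS.get(int(parts[1]))
    if len(parts) == 4 and parts[0] == "customers" and parts[2] == "accounts":
        customer = CUSTOMERS.get(int(parts[1]))
        acct_id = int(parts[3])
        # The nested route only resolves accounts linked to that customer.
        if customer and acct_id in customer["account_ids"]:
            return ACCOUNTS.get(acct_id)
    return None  # a real server would answer 404 here
```

Both routes return the very same object, so a change made through one URI is visible through the other.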
As you already suspected, it all comes down to the data representation model in your database: whether it is an owned relationship (a list of actual entities), an unowned relationship (what you referred to as "included by reference"), or even an implicit 1:N relationship by backlink reference (a customer backlink field on the account).
For retrieving the data by GET, the following can be used depending on the representation model:
/customers/1/accounts
/accounts?customer eq 1
Suppose you have a One-To-Many or Many-To-Many relationship in Spring Data REST. Let's say you have groups that have a One-to-Many relationship with users. If you get the list of associations from a group you will get back links like this:
{
"_embedded": {
"users": [
{
"username": "test25",
"enabled": false,
"firstName": "strifng",
"lastName": "sdfdffff",
"_links": {
"self": {
"href": "…/users/78"
}
}
},
{
"username": "test33",
"enabled": true,
"firstName": "sd",
"lastName": "asdfsa",
"_links": {
"self": {
"href": "…/users/77"
}
}
}
]
}
}
This is useless if you are trying to remove a particular user from a group. You are forced to use PUT with /groups/{id}/users, but that is impractical if you have thousands of users. You can POST to /groups/{id}/users with a list of URIs, but you can't DELETE to /groups/{id}/users.
Why?
The only way DELETE works is by calling /groups/{id}/users/{id} but there's no way to construct this URI from the front end as it is not returned in the collection.
How do you get around this?
The pattern that you'd need to use here is to access the association resource asking it for the text/uri-list media type, get the URIs of all linked resources, modify the list as you see fit and PUT it back to the association resource. I.e.:
GET /groups/4711/users
200 OK
…/users/3149
…/users/41
…/users/4711
Followed by a:
PUT /groups/4711/users
…/users/3149
…/users/4711
Basically removing the user with id 41 from the association.
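The client-side part of that exchange can be sketched in Python as below. This only covers editing the text/uri-list body (one URI per line, lines starting with '#' are comments, CRLF line endings); the function names are made up for illustration and the actual GET and PUT calls are left out.

```python
def parse_uri_list(body: str) -> list[str]:
    """Parse a text/uri-list body: one URI per line, '#' lines are comments."""
    return [
        line.strip()
        for line in body.splitlines()
        if line.strip() and not line.startswith("#")
    ]


def without(body: str, uri_to_remove: str) -> str:
    """Return a new text/uri-list body with one URI dropped."""
    kept = [uri for uri in parse_uri_list(body) if uri != uri_to_remove]
    # text/uri-list uses CRLF line endings (RFC 2483).
    return "\r\n".join(kept) + "\r\n"
```

The result of without(...) is what would be sent as the PUT body, with Content-Type: text/uri-list.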
The problem
The problem with this suggestion is that it currently doesn't work 🙃. It's broken in the sense that the lookup of the list of URIs currently fails due to a bug. It looks like that functionality went off the radar at some point (it's not even advertised in the reference docs anymore). The good news is that I filed and fixed a ticket for you. If you give the latest snapshots a try, the suggested protocol should be working.
Some general considerations
In general it's hard to provide an API to generically remove individual items from a collection of associations. The HTTP DELETE method unfortunately operates on the target URI only and does not take any request body. I.e. you'd have to expose some kind of identification mechanism for the individual collection elements within the URI. There's no spec that I am aware of that defines how to do that and we don't want to get into the business of defining one.
One could investigate the ability to use JSON Patch requests on collection-like association resources, but that's not without problems either. I've filed a ticket to keep track of that idea.
Besides that, a potentially ever-growing list of references to other resources is pretty hard to manage in the first place. It might be a better choice to augment the resource space with a custom resource that handles the unassignment of the user from the group and advertise that through a custom link.
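As a generic illustration of that last idea (not something taken from Spring Data REST's API), the server could decorate each user in the group listing with a link to a resource that handles the unassignment. The "unassign" relation name, the URI layout and the presence of an "id" field are all assumptions of this Python sketch.

```python
def with_unassign_links(group_id: int, users: list[dict]) -> list[dict]:
    """Add a hypothetical 'unassign' link to each user in a group listing."""
    decorated = []
    for user in users:
        # Copy the links so the stored user representation is not mutated.
        links = dict(user.get("_links", {}))
        # Assumed URI layout; the server is free to pick any other.
        links["unassign"] = {"href": f"/groups/{group_id}/users/{user['id']}"}
        decorated.append({**user, "_links": links})
    return decorated
```

A client then follows the advertised link instead of constructing the URI itself.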
I have /companies and /users resources. My business logic prevents creating company without first user. Only way I can think of:
POST /companies
{
"name": "Harvey's Broiler",
"user": {
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com",
"password": "password"
}
}
Response:
{
"id": 10001,
"name": "Harvey's Broiler",
"user": {
"id": 10002,
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com"
}
}
Later, they can be reachable as:
GET /companies/10001
Response:
{
"id": 10001,
"name": "Harvey's Broiler"
}
and
GET /users/10002
or
GET /companies/10001/users/10002
Response:
{
"id": 10002,
"firstName": "Jojo",
"lastName": "Stomopolous",
"email": "jojostomopolous@harveysbroiler.com"
}
This is a common question developers encounter when designing APIs.
The first question you should ask yourself is whether the Company and User resources are citizens of the same order. By order, I mean: are they of the same importance, are they both resources you operate on heavily, and do they have independent roles and operations in the system you are building? My hunch is the answer is yes. If the answer is no, then the User is only a way of denoting the founder of the Company, and you have no problem: just put the User in as an embedded object like you already did.
However, if my hunch is right that you have some business logic around User and Company, I would do just that, keep them separate, under separate endpoints.
If you need a User to create a Company, then implement that logic. If someone attempts to create a Company without a User, return an error (HTTP response 400, or something along those lines). Of course, that should be documented for the API user. The request then looks like this:
POST /companies
{
"name": "Harvey's Broiler",
"user": 1234
}
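A minimal sketch of that validation in Python, assuming in-memory USERS and COMPANIES tables, a toy id generator and made-up error messages. The create_company function stands in for whatever handler your framework calls on POST /companies.

```python
USERS = {1234: {"id": 1234, "firstName": "Jojo"}}
COMPANIES = {}


def create_company(payload: dict) -> tuple[int, dict]:
    """Handle POST /companies; returns (HTTP status, response body)."""
    user_id = payload.get("user")
    if not payload.get("name"):
        return 400, {"error": "name is required"}
    if user_id is None:
        return 400, {"error": "a company needs an existing user"}
    if user_id not in USERS:
        return 400, {"error": f"user {user_id} does not exist"}
    company_id = 10001 + len(COMPANIES)  # toy id generator
    COMPANIES[company_id] = {
        "id": company_id,
        "name": payload["name"],
        "user": user_id,
    }
    return 201, COMPANIES[company_id]
```

Only one resource is created per request, so there is a single status and a single new id to report.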
Artificially mashing the creation of two objects into the same request can only lead to issues. Now you need to return the status for both (what if the User is created but the Company fails?), return two IDs (what if you also need to add other information, such as tax details, and get a third ID?) and so on.
The only valid reason for creating the User along with the Company is if a User is very often created along with the Company, if not always, and you need to reduce the number of API calls, so you only fire one, but I am not sure that is the case.
If you can't even have a User without a Company, then see if you can revise the requirements or create the User/Company in two steps. First fire a request for a User "placeholder" (let's say the User will not be visible in a list of Users or such, and it will not be valid), and after the Company is created, the User becomes valid and visible, and other operations become permitted. Until then, there is no User, only a placeholder for it. The same logic can be reversed for the Company.
And another thing, I will not go into this kind of nesting:
GET /companies/10001/users/10002
First, it is usually hard to program (you get a lot of boilerplate), and it is a possible maintenance nightmare. You can extrapolate the case to:
GET /companies/10001/users/10002/accounts/24314/bank/address
And fetch a bank address for a bank of a user who founded a company. I'm hesitant to implement this kind of approach if I do not have to.
Also, please consider reading about HATEOAS. It might help you if you need this kind of nesting. Actually I will always encourage at least considering the HATEOAS principle when starting a new API.
If we have customers and orders, I'm looking for the correct RESTful way to get this data:
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
}, {
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
}
1) GET /customers/123/orders
2) GET /customers/123?inc-orders=1
Am I correct that the last part/folder of the URL, excluding query string params, should be the resource returned..?
If so, number 1 should only return order data and not include the customer data, while number 2 points directly at customer 123 and uses query string params to affect/filter the customer data returned, in this case including the order data.
Which of these two calls is the correct RESTful call for the above JSON..? ...or is there a secret number 3 option..?
You have 3 options which I think could be considered RESTful.
1)
GET /customers/123
But always include the orders. Do you have a situation in which the client would not want to use the orders? Or can the orders array get really big? If so you might want another option.
2)
GET /customers/123, which could include a link to their orders like so:
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": {
"href": "<link to your orders goes here>"
}
}
}
With this your client would have to make 2 requests to get a customer and their orders. Good thing about this way though is that you can easily implement clean paging and filtering on orders.
3)
GET /customers/123?fields=orders
This is similar to your second approach. This will allow clients to use your API more efficiently, but I wouldn't go this route unless you really need to limit the fields that are coming back from your server. Otherwise it will add unnecessary complexity to your API which you will have to maintain.
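A rough Python sketch of how options 2 and 3 could be combined on the server: orders are returned as a link by default and embedded only when the fields parameter asks for them. The render_customer function and the sample data are assumptions for illustration; accepting comma-separated values in fields is also just one possible convention.

```python
from urllib.parse import parse_qs, urlsplit

CUSTOMER = {"id": 123, "name": "Jim Bloggs"}
ORDERS = [
    {"id": 123, "item": "Union Jack Keyring", "qty": 1},
    {"id": 987, "item": "London Eye Ticket", "qty": 5},
]


def render_customer(url: str) -> dict:
    """Build the customer body; embed orders only when ?fields= asks for them."""
    query = parse_qs(urlsplit(url).query)
    # ?fields=a,b and ?fields=a&fields=b are both accepted in this sketch.
    fields = {f for value in query.get("fields", []) for f in value.split(",")}
    customer = dict(CUSTOMER)
    if "orders" in fields:
        customer["orders"] = ORDERS
    else:
        customer["orders"] = {"href": "/customers/123/orders"}
    return {"customer": customer}
```

Every extra value that fields understands is another code path to document and maintain, which is the complexity cost mentioned above.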
The Resource (identified by the complete URL) is the same, a customer. Only the Representation is different, with or without embedded orders.
Use Content Negotiation to get different Representations for the same Resource.
Request
GET /customers/123/
Accept: application/vnd.acme.customer.short+json
Response
200 OK
Content-Type: application/vnd.acme.customer.short+json
{
"customer": {
"id": 123,
"name": "Jim Bloggs"
}
}
Request
GET /customers/123/
Accept: application/vnd.acme.customer.full+json
Response
200 OK
Content-Type: application/vnd.acme.customer.full+json
{
"customer": {
"id": 123,
"name": "Jim Bloggs",
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
}, {
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
}
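On the server side, the selection could look roughly like the Python sketch below. It is deliberately simplified: q-values and wildcards in the Accept header are ignored and the first supported media type wins, and the negotiate function and 406 error body are assumptions for illustration.

```python
CUSTOMER = {"id": 123, "name": "Jim Bloggs"}
ORDERS = [
    {"id": 123, "item": "Union Jack Keyring", "qty": 1},
    {"id": 987, "item": "London Eye Ticket", "qty": 5},
]

SHORT = "application/vnd.acme.customer.short+json"
FULL = "application/vnd.acme.customer.full+json"


def negotiate(accept_header: str) -> tuple[int, str, dict]:
    """Return (status, Content-Type, body) for GET /customers/123/."""
    # Simplification: q-values and wildcards are ignored; first match wins.
    requested = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for media_type in requested:
        if media_type == SHORT:
            return 200, SHORT, {"customer": dict(CUSTOMER)}
        if media_type == FULL:
            return 200, FULL, {"customer": {**CUSTOMER, "orders": ORDERS}}
    # None of the requested types is supported: 406 Not Acceptable.
    return 406, "application/json", {"error": "not acceptable"}
```

The URI stays the same in both cases; only the representation chosen by the Accept header differs.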
The JSON that you posted looks like what would be the result of
GET /customers/123
provided the Customer resource contains a collection of Orders as a property; alternatively you could either embed them, or provide a link to them.
The latter would result in something like this:
GET /customers/123/orders
which would return something like
{
"orders": [
{
"id": 123,
"item": "Union Jack Keyring",
"qty": 1
},
{
"id": 987,
"item": "London Eye Ticket",
"qty": 5
}
]
}
I'm looking for the correct RESTful way to get this data
Simply perform a HTTP GET request on a URI that points to a resource that produces this data!
TL;DR
REST does not care about URI design, only about its constraints!
Clients perform state transitions through possible actions returned by the server through dynamically identified hyperlinks contained within the response.
Clients and servers can negotiate on a preferred hypermedia type
Instead of embedding the whole (sub-)resource consider only returning the link to that resource so a client can look it up if interested
First, REST does not really care about URI design as long as the URI is unique. Sure, a simple URI design is easier for humans to understand, though, as in HTML, the actual link can be hidden behind more meaningful text and is thus not that important for humans either, as long as they are able to find the link and can perform an action against it. Next, why do you think your "response" or API is RESTful? To call an API RESTful, the API should respect a couple of constraints. Among these constraints is one that is quite buzzword-famous: hypertext as the engine of application state (HATEOAS).
REST is a generalized concept of the Web we use every day. A quite common task in a web session is that a client requests something and the server sends an HTML document with plenty of links and other resources the client can use to request further pages or stream a video (or whatever). A user operating a client can use the returned information to proceed further, request new pages, send information to the server, and so on. The same holds true for RESTful applications. This is what REST defines as HATEOAS. If you now look at your "response" and check it against the HATEOAS constraint, you will see that your response does not contain any links to start with. A client therefore needs domain knowledge to proceed further.
JSON itself isn't the best hypermedia type IMO, as it only defines the overall syntax of the data but does not carry any semantics, similar to plain XML (which, though, may have a DTD or schema a client can use to validate the document and check whether further semantic rules are available elsewhere). There are a couple of hypermedia types that build on JSON and are probably better suited, e.g. application/hal+json (a good comparison of JSON-based hypermedia types can be found in this blog post). You are of course entitled to define your own hypermedia type, though certain clients may not be able to understand it out of the box.
If you take a look at HAL, for example, you see that it defines an _embedded element where you can put certain sub-resources. This seems to be ideal in your case. Depending on your design, orders could also be a resource of their own and thus be reachable via GET /orders/{orderId}. Instead of embedding the whole sub-resource, you can also just include the link to that (sub-)resource so a client can look up the data if interested.
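As a sketch of both variants, the Python function below builds a HAL-style document that always links to the orders and embeds them only on request. The customer_hal name, the embed flag and the /customers/{id}/orders link target are assumptions for illustration; _links and _embedded are the reserved HAL properties.

```python
def customer_hal(customer: dict, orders: list[dict], embed: bool) -> dict:
    """Build an application/hal+json style document for a customer."""
    base = f"/customers/{customer['id']}"
    doc = {
        **customer,
        "_links": {
            "self": {"href": base},
            "orders": {"href": f"{base}/orders"},
        },
    }
    if embed:
        # Each embedded order carries its own self link so it can also
        # be fetched as a stand-alone resource.
        doc["_embedded"] = {
            "orders": [
                {**o, "_links": {"self": {"href": f"/orders/{o['id']}"}}}
                for o in orders
            ]
        }
    return doc
```

With embed=False the client gets only the link and decides for itself whether the extra request is worth it.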
If there are cases where you want to return only customer data and other cases where you also want to include order data, you can e.g. define different hypermedia types (based on HAL, for example) for both, one returning just the customer data while the other also includes the order data. These types could be named like this: application/vnd.yourcompanyname.version.customers.hal+json or application/vnd.yourcompanyname.version.customer_orders.hal+json. While this is certainly a development overhead compared to adding a simple query parameter to the request, the semantics are clearer and the documentation overhead is on the hypermedia type (or representation) rather than the HTTP operation.
You can of course also define some kind of view structure where one view only returns the customer data as is while a different view returns the customer data including the orders similar to a response I gave on a not so unrelated topic.
Say I have two collection resources:
/persons
/organizations
A GET to /persons/id/ returns a specific person. Likewise, a GET to /organizations/id returns a specific organization.
A person can be member of one or more organizations. In this relation context, we have data such as the role of the person in the organization, the date on which the person joined the organization, ...
Which of the designs make most sense?
A membership resource /memberships/id, to which a GET returns the data of the relation context (together with a link to the person and the organization).
A /persons/id/organizations/id and a /organizations/id/persons/id. A GET to one of the two returns the relation context, and a GET to the other one redirects (http status code 303) to the other.
Something else?
Another option is to embed the relationships right into the resources themselves. This makes it easier for a client to follow relationships between resources as they consume the service. For example, here's a hypothetical person with relationships to two organization resources via two membership resources, and one of those membership resources:
"person890": {
"firstName": "Jane",
"lastName": "Smith",
"links": [{
"rel": "membership",
"href": "memberships/123"
}, {
"rel": "membership",
"href": "memberships/456"
}]
}
"membership123": {
"role": "chairwoman",
"date": "12/23/2013",
"term": "3 years",
"links": [{
"rel": "person",
"href": "persons/890"
}, {
"rel": "organization",
"href": "organizations/7575"
}]
}
The basic principle at work here is HATEOAS - "Hypermedia as the Engine of Application State" - which enables a client with minimal understanding of your data to still interact with your API.
If your question is limited to the structure, I think there's no objectively correct answer. In principle, you should stick with whatever keeps consistency across your API. If there's nothing like this already implemented, I think it depends on what your goal is. If you want to keep the API as simple as possible, option 1 seems good enough.
Usually, I try to make the API as flexible as possible for the clients, so that they can get the exact information they need with as few requests as possible, and without bothering me to implement custom endpoints. Assuming organizations can be huge and have a lot of members, while a person can't be a member of a lot of organizations, this is what I'd do:
I see no reason to have the two-level URI on both sides, so /persons/id can be the canonical URI for the person and /persons the paginated collection of all persons across all organizations. /organizations/id can be the URI for the organization, and /organizations/id/persons can give you a collection of all persons within an organization, and an alternative URI for the person.
I see no need for the 303, but that's a matter of opinion. You may have /organizations/id/persons/id redirect to /persons/id if you want.
Keep the /memberships/id as you described in 1.
Assuming you're using some form of HATEOAS, all resources should have links to the related resources.
A few other ideas I often implement that help usability and flexibility are:
All resources should have a self link to the canonical URI.
You should be able to query the collections. Like /memberships?person_id=X should generate a subset of the collection that lists all membership instances for that person.
You should be able to expand a resource representation to include an embedded representation. It may be something explicit, like /persons/id?expand=memberships should generate a representation of person with a field containing an embedded list of all memberships, or you can use something I call the zoom protocol. You have a parameter that indicates how many levels of relationships should be embedded, decreasing it as you progress through the relationships. So, /persons/id?zoom=1 will embed memberships, /persons/id?zoom=2 will embed memberships, and apply zoom=1 to the membership representations themselves, embedding organizations.
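The zoom idea can be sketched in Python as a recursive renderer that spends one unit of zoom per level of embedding. The sample data, the RELATIONS table and the represent function are all hypothetical; at zoom 0 a relation is rendered as a bare link.

```python
PERSONS = {890: {"firstName": "Jane", "lastName": "Smith", "memberships": [123]}}
MEMBERSHIPS = {123: {"role": "chairwoman", "organization": 7575}}
ORGANIZATIONS = {7575: {"name": "Example Org"}}

# Which relations each resource type can embed, and the type they lead to.
RELATIONS = {
    "persons": {"memberships": "memberships"},
    "memberships": {"organization": "organizations"},
    "organizations": {},
}
STORES = {
    "persons": PERSONS,
    "memberships": MEMBERSHIPS,
    "organizations": ORGANIZATIONS,
}


def represent(kind: str, res_id: int, zoom: int = 0) -> dict:
    """Render a resource; embed related resources while zoom > 0."""
    data = dict(STORES[kind][res_id])
    data["links"] = {"self": f"/{kind}/{res_id}"}
    for field, target in RELATIONS[kind].items():
        ids = data.pop(field)
        many = isinstance(ids, list)
        ids = ids if many else [ids]
        if zoom > 0:
            # Each level of embedding spends one unit of zoom.
            embedded = [represent(target, i, zoom - 1) for i in ids]
        else:
            embedded = [{"links": {"self": f"/{target}/{i}"}} for i in ids]
        data[field] = embedded if many else embedded[0]
    return data
```

Here represent("persons", 890, 1) corresponds to /persons/890?zoom=1 and embeds the memberships, while zoom=2 additionally embeds the organization inside each membership.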
Say I have the following two root resources:
.../organizations
.../persons
A GET on .../organizations/id returns all the information about a specific organization, such as the name, location, etc.
A GET on .../persons/id returns all the information about a specific person, such as the name, age, gender, etc.
What is the preferred RESTful way to model the membership of a person in an organization (for retrieval and creation)? I do not only want to model the membership itself, but also add extra properties, such as the date on which the person joined the organization, his/her role in the organization, ...
Some thoughts:
If we provide .../organizations/id/persons/id, what should a GET return? Only the membership data (join date, role, ...) and a link to .../persons/id? The REST API user can use the link to fetch all the information about the person.
Do we provide a possibility to POST to .../persons for creating a person, and another/separate POST to .../organizations/id/persons for creating the membership?
Going further, let's say a person must always be member of at least one organization. In that case, we need one POST for atomically creating the person and the membership at the same time.
How do we model that? Preferably, I would like to keep the root resources .../organizations and .../persons. It doesn't make sense to create a person on .../organizations/id/persons, nor does it make sense to create a membership on .../persons/.
Wouldn't using HAL and its simple format fulfil your needs?
Let's suppose we have defined resources for persons, organizations and memberships
and we are attempting to retrieve information related to a person identified by "42".
Request:
GET /persons/42 HTTP/1.1
Accept: application/hal+json
Response:
HTTP/1.1 200 OK
Content-Type: application/hal+json
{
"id": 42,
"name": "Smith",
"firstName": "John",
"organization": {
"id": 1234,
"name": "blah",
"href": "http://myserver/organizations/1234"
},
"membership": {
"id": 5678,
"name": "blih",
"href": "http://myserver/memberships/5678"
},
"_links": {
"self" : {
"href" : "http://myserver/persons/42"
}
}
}
The person resource refers to the parent organization through the "organization"
relation. That relation allows you to easily navigate to the corresponding organization
resource through the corresponding href link.
In the same manner, the membership relation gives access to the corresponding
membership data (once again through the "href" link), if you consider that a membership
associates one person with one organization.
Request:
GET /memberships/5678 HTTP/1.1
Accept: application/hal+json
Response:
HTTP/1.1 200 OK
Content-Type: application/hal+json
{
"id": 5678,
"name": "blih",
"person": {
"id": 42,
"href": "http://myserver/persons/42"
},
"organization": {
"id": 1234,
"href": "http://myserver/organizations/1234"
},
"_links": {
"self": {
"href": "http://myserver/memberships/5678"
}
}
}
Please note that I'm not saying that the model above is the right one for your
needs (one person can probably belong to several organizations, for example, and you then need an array in the serialization).
My point is that using HAL might help you model what you want.
Think of a resource as an object that could exist independently, without a dependency on another object. This is just a guideline, and you can see how it works in terms of your projects and organizations.
The way I see it, a membership should be its own independent resource, because it could even exist after the person resource has been deleted, for historical purposes for example.
In that model I would create a resource /memberships, because it's not a property of a person or an organization that would prompt you to add that as a person or organization sub-resource.
I'm not sure if I agree with @hellraiser; perfect would be hard to define even by Roy's standards. I usually try to achieve the higher levels of REST as described by Fowler: http://martinfowler.com/articles/richardsonMaturityModel.html
Actually, perfect RESTful API design is unreachable. I've never seen a system that satisfies all the REST principles formulated by Roy Fielding. But you can improve your REST API design skills from project to project by following best practices.
To start with, have a look at this article.