I am building an event management system. The schema is described below. The API understands these relations and returns links to related resources in the request result. For example,
GET /Events/1
"links": {
"Venue": "/Venues/1",
"Tickets": "/Tickets?event_id=1",
"Teams": "/Teams?event_id=1",
"Registrations": "/Registrations?event_id=1"
}
Most of what I've read about REST and HATEOAS suggests this is the "right" way to do it; however, it is very inefficient. For example, generating a report of the users and teams participating in an event requires many requests. This is analogous to running several select queries instead of a single join query against a DB. So my question is: should I expand the relationships and embed related resources within a resource request? This also poses a problem, because the request above would then return A LOT of data. The answer might be to stick with the relationship links and set up proper caching. Regardless, I would like your opinions.
Schema
events
hasMany registrations
hasMany tickets
hasMany teams
team
belongsTo event
ticket
belongsTo event
hasMany registrations
user
hasMany registrations
registrations
belongsTo event
belongsTo ticket
belongsTo user
belongsTo team
There is nothing wrong with returning the full representation of resources in the body of another request. This can be on the verbose side though, as you mentioned.
Some callers of the service may only want the URIs returned, while at other times you want to reduce the number of round trips across the network, i.e., get everything in one call. The term you are searching for is projections.
These are different representations of your resources catered to the needs of the client.
You can specify these in URI parameters, e.g., GET /Events/1?venueProjection=full&teamProjection=uri
And then return the projection according to what the client asked.
"links": {
"Venue": {
"uri": "/Venues/1",
"attr1": "value1",
"attrN": "valueN"
},
"Tickets": "/Tickets?event_id=1",
"Teams": "/Teams?event_id=1",
"Registrations": "/Registrations?event_id=1"
}
Note: Always return the URI with your projections so that, if they aren't full, the client has easy access to the full resource later.
I suggest you do some reading around "rest projections" from Google or check out the RESTful Cookbook.
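As a rough illustration of the projection idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the in-memory VENUES store, the PROJECTION_PARAMS mapping and the project_links function are hypothetical names, and the query string is assumed to join its parameters with & as in a standard query string.

```python
from urllib.parse import parse_qs

# Hypothetical in-memory data standing in for the real venue resource.
VENUES = {"/Venues/1": {"attr1": "value1", "attrN": "valueN"}}

EVENT_LINKS = {
    "Venue": "/Venues/1",
    "Tickets": "/Tickets?event_id=1",
    "Teams": "/Teams?event_id=1",
    "Registrations": "/Registrations?event_id=1",
}

# Assumed mapping from relation name to its projection query parameter.
PROJECTION_PARAMS = {"Venue": "venueProjection", "Teams": "teamProjection"}


def project_links(links, query_string, store):
    """Return the links map, expanding relations whose projection is 'full'."""
    params = parse_qs(query_string)
    result = {}
    for rel, uri in links.items():
        param = PROJECTION_PARAMS.get(rel)
        # Default to the plain URI when no projection was requested.
        mode = params.get(param, ["uri"])[0] if param else "uri"
        if mode == "full" and uri in store:
            # Always keep the URI next to the embedded attributes.
            result[rel] = {"uri": uri, **store[uri]}
        else:
            result[rel] = uri
    return result
```

Calling project_links(EVENT_LINKS, "venueProjection=full&teamProjection=uri", VENUES) would embed the venue attributes under "Venue" and leave the other relations as plain URIs.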
Related
Suppose you have a one-to-many or many-to-many relationship in Spring Data REST. Let's say you have groups that have a one-to-many relationship with users. If you get the list of associations from a group, you will get back links like this:
{
"_embedded": {
"users": [
{
"username": "test25",
"enabled": false,
"firstName": "strifng",
"lastName": "sdfdffff",
"_links": {
"self": {
"href": "…/users/78"
}
}
},
{
"username": "test33",
"enabled": true,
"firstName": "sd",
"lastName": "asdfsa",
"_links": {
"self": {
"href": "…/users/77"
}
}
}
]
}
}
This is useless if you are trying to remove a particular user from a group. You are forced to PUT to /groups/{id}/users, which is impractical if you have thousands of users. You can POST to /groups/{id}/users with a list of URIs, but you can't DELETE to /groups/{id}/users.
Why?
The only way DELETE works is by calling /groups/{id}/users/{id} but there's no way to construct this URI from the front end as it is not returned in the collection.
How do you get around this?
The pattern that you'd need to use here is to access the association resource asking it for the text/uri-list media type, get the URIs of all linked resources, modify the list as you see fit and PUT it back to the association resource. I.e.:
GET /groups/4711/users
200 OK
…/users/3149
…/users/41
…/users/4711
Followed by a:
PUT /groups/4711/users
…/users/3149
…/users/4711
Basically removing the user with id 41 from the association.
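On the client side, the "modify the list" step between the GET and the PUT could look roughly like the sketch below. The remove_uri helper is a hypothetical name, not part of Spring Data REST; it only assumes that the text/uri-list body has one URI per line and that lines starting with # are comments, which this sketch drops rather than preserves.

```python
def remove_uri(uri_list_body, uri_to_remove):
    """Return a text/uri-list body without uri_to_remove."""
    kept = []
    for line in uri_list_body.splitlines():
        entry = line.strip()
        # Skip blank lines and '#' comment lines (comments are not preserved).
        if not entry or entry.startswith("#"):
            continue
        if entry != uri_to_remove:
            kept.append(entry)
    # text/uri-list separates entries with CRLF.
    return "\r\n".join(kept)
```

The returned string would then be sent as the body of the PUT to the association resource, with Content-Type: text/uri-list.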
The problem
The problem with this suggestion is that it currently doesn't work 🙃. It's broken in the sense that the lookup of the list of URIs currently fails due to a bug. It looks like that functionality went off the radar at some point (it's not even advertised in the reference docs anymore). The good news is that I filed and fixed a ticket for you. If you give the latest snapshots a try, the suggested protocol should be working.
Some general considerations
In general it's hard to provide an API to generically remove individual items from a collection of associations. The HTTP DELETE method unfortunately operates on the target URI only and does not take any request body. I.e. you'd have to expose some kind of identification mechanism for the individual collection elements within the URI. There's no spec that I am aware of that defines how to do that and we don't want to get into the business of defining one.
One could investigate the ability to use JSON Patch requests on collection-like association resources, but that's not without problems either. I've filed a ticket to keep track of that idea.
Besides that, a potentially ever-growing list of references to other resources is pretty hard to manage in the first place. It might be a better choice to augment the resource space with a custom resource that handles the unassignment of the user from the group and advertise that through a custom link.
Consider a User resource stored in a database that contains a list of addresses.
GET /users -> Returns list of users
[
{
"name":"Rick",
"email":"abc#example.com",
"addresses":[
"home" : {
....
},
"work" : {
...
}
]
}
]
Here addresses is part of the User resource. Now, when designing an API for CRUD on addresses, the structure below seems good.
GET /users/{user-id}/addresses
POST /users/{user-id}/addresses
But I'm confused, as addresses don't map directly to a domain model in the database.
So the question is:
Do resources need to map directly to domain models, or is the design above fine?
Do resources need to map directly to domain models?
No. In fact, it's been argued that they shouldn't be.
Your data model is not your object model is not your resource model is not your affordance model. -- Amundsen, 2016.
So, are GET /users/{user-id}/addresses and POST /users/{user-id}/addresses valid?
It's not clear which question you are asking.
Having a resource that returns representations of a collection of addresses for a user? Sure, there's nothing wrong with doing that. You may want to think through some of the caching implications -- what happens to cached representations of /users/{user-id} when somebody posts a change to /users/{user-id}/addresses? -- and make the trade-offs appropriate to your situation.
Are those identifier spellings appropriate for a resource? Sure -- but so are any other ones you might imagine. REST doesn't care what spelling you use, so long as you are following the appropriate standard (in the case of URI: RFC 3986).
Let's take the following example:
We want to expose company and employee information from a RESTful API.
Company data should be quite simple:
GET api/v1/companies
GET api/v1/companies/{id}
Employees BELONG to a company, but we still want to retrieve them individually as well, so which solution is best:
Solution 1: Using sub-resources
Get all employees for a company:
GET api/v1/companies/{companyId}/employees
Get a specific employee:
GET api/v1/companies/{companyId}/employees/{employeeId}
Solution 2: Using an independent resource
Get all employees for a company:
GET api/v1/employees?companyId={companyId}
Get a specific employee:
GET api/v1/employees/{employeeId}
Both options seem to have their pros and cons.
With sub-resources, I may not always have the CompanyId on hand when wanting to retrieve an individual employee.
With an independent resource, getting all employees for a company should use the sub-resource approach if we want to be RESTful.
Otherwise, we could use a mix, but this lacks consistency:
Get all employees for a company:
GET api/v1/companies/{companyId}/employees
Get a specific employee:
GET api/v1/employees/{employeeId}
What is the best approach to take in such a situation if we want to stay true to RESTful standards?
For me this sounds like the common many-to-many relationship problem for RESTful services. (see How to handle many-to-many relationships in a RESTful API?)
Your first solution seems good at first but you will have problems whenever you want to access the relation itself.
Instead of returning the employee with the following GET request you should return the relation.
GET api/v1/companies/{companyId}/employees/{employeeId}
If the relation can be identified by two keys, this solution seems to be fine. But what happens if the relation is identified by three or more IDs? The URI becomes rather long.
GET api/v1/companies/{companyId}/employees/{employeeId}/categories/{categoryId}
In this case I would come up with a separate resource for the relation:
GET api/v1/company-employees/{id}
The returned model in JSON would look like this:
{
"id": 1, <- the id of the relation
"company": {
"id": 2
},
"employee": {
"id": 3
},
"category": {
"id": 4
}
}
I think it would be okay to provide both. If you want the client to browse through the list of companies first, then select a company and then get the list of all employees, the first approach is necessary. If, in addition, you want the client to be able to filter employees by name or age without knowing the company identifier, you must provide the second approach as well. It depends on what you want the client to do. In my opinion, it would not be necessary to provide the second approach if clients can only filter employees by company identifier.
I would go for the first approach and provide some links to retrieve the subordinate resource.
Take the example of adding a new employee to a company. With the second approach, it seems difficult for the client to POST to your collection. Why? Because the client has to know the company id, which is "somewhere else".
With the first approach, as you followed a path, you already know this information (the companyId)... so it's easier for the client to add a new employee.
Back to your example, the main benefit of the second approach is if your client wants something like "the number of employees in a city", where you don't care about the notion of company.
But it seems that you need the notion of company, so I would go for the first.
Also, very related to this question: RESTful design: when to use sub-resources?
Say I have two collection resources:
/persons
/organizations
A GET to /persons/id/ returns a specific person. Likewise, a GET to /organizations/id returns a specific organization.
A person can be member of one or more organizations. In this relation context, we have data such as the role of the person in the organization, the date on which the person joined the organization, ...
Which of the designs make most sense?
A membership resource /memberships/id, to which a GET returns the data of the relation context (together with a link to the person and the organization).
A /persons/id/organizations/id and a /organizations/id/persons/id. A GET to one of the two returns the relation context, and a GET to the other one redirects (http status code 303) to the other.
Something else?
Another option is to embed the relationships right into the resources themselves. This makes it easier for a client to follow relationships between resources as they consume the service. For example, here's a hypothetical person with relationships to two organization resources via two membership resources, and one of those membership resources:
"person890": {
"firstName": "Jane",
"lastName": "Smith",
"links": [{
"rel": "membership",
"href": "memberships/123"
}, {
"rel": "membership",
"href": "memberships/456"
}]
}
"membership123": {
"role": "chairwoman",
"date": "12/23/2013",
"term": "3 years",
"links": [{
"rel": "person",
"href": "persons/890"
}, {
"rel": "organization",
"href": "organizations/7575"
}]
}
The basic principle at work here is HATEOAS - "Hypermedia as the Engine of Application State" - which enables a client with minimal understanding of your data to still interact with your API.
If your question is limited to the structure, I think there's no objectively correct answer. In principle, you should stick with whatever keeps consistency across your API. If there's nothing like this already implemented, I think it depends on what your goal is. If you want to keep the API as simple as possible, option 1 seems good enough.
Usually, I try to make the API as flexible as possible for the clients, so that they can get the exact information they need with as few requests as possible, and without bothering me to implement custom endpoints. Assuming organizations can be huge and have a lot of members, while a person can't be a member of a lot of organizations, this is what I'd do:
I see no reason to have the two-level URI on both sides, so /persons/id can be the canonical URI for the person and /persons the paginated collection of all persons across all organizations. /organizations/id can be the URI for the organization, and /organizations/id/persons can give you a collection of all persons within an organization, and an alternative URI for the person.
I see no need for the 303, but that's a matter of opinion. You may have /organizations/id/persons/id redirect to /persons/id if you want.
Keep the /memberships/id as you described in 1.
Assuming you're using some form of HATEOAS, all resources should have links to the related resources.
A few other ideas I often implement that help usability and flexibility are:
All resources should have a self link to the canonical URI.
You should be able to query the collections. For example, /memberships?person_id=X should return the subset of the collection listing all membership instances for that person.
You should be able to expand a resource representation to include an embedded representation. It may be something explicit, where /persons/id?expand=memberships generates a representation of the person with a field containing an embedded list of all memberships, or you can use something I call the zoom protocol: a parameter indicates how many levels of relationships should be embedded, decreasing as you progress through the relationships. So /persons/id?zoom=1 will embed memberships, and /persons/id?zoom=2 will embed memberships and apply zoom=1 to the membership representations themselves, embedding organizations.
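The zoom idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the RESOURCES dictionary is a hypothetical in-memory stand-in for a real data store, the organization name is made-up sample data, and represent is an invented function name.

```python
# Hypothetical in-memory resources: each has plain data plus named relations.
RESOURCES = {
    "/persons/890": {
        "data": {"firstName": "Jane", "lastName": "Smith"},
        "relations": {"memberships": ["/memberships/123"]},
    },
    "/memberships/123": {
        "data": {"role": "chairwoman"},
        "relations": {"organization": ["/organizations/7575"]},
    },
    "/organizations/7575": {
        "data": {"name": "Example Org"},  # made-up sample value
        "relations": {},
    },
}


def represent(uri, zoom=0):
    """Build a representation, embedding related resources while zoom > 0."""
    resource = RESOURCES[uri]
    body = dict(resource["data"])
    # Every representation carries a self link to its canonical URI.
    body["links"] = [{"rel": "self", "href": uri}]
    for rel, targets in resource["relations"].items():
        if zoom > 0:
            # Embed each target with one less level of zoom.
            body[rel] = [represent(target, zoom - 1) for target in targets]
        else:
            # Out of zoom: expose the relation as links only.
            body["links"] += [{"rel": rel, "href": t} for t in targets]
    return body
```

With zoom=0 the person only links to its memberships; with zoom=1 the memberships are embedded but still only link to their organization; with zoom=2 the organization is embedded as well.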
I'm designing a HATEOAS API for internal data at my company, but have been having troubles with the discovery of links. Consider the following set of steps for someone to retrieve information about a specific employee in this system:
User sends GET to http://coredata/ to get all available resources, returns a number of links including one tagged as rel = "http://coredata/rels/employees"
User follows HREF on the rel from the first request, performing a GET at (for example) http://coredata/employees
The data returned from this last call is my conundrum and a situation where I've heard mixed suggestions. Here are some of them:
That GET will return all employees (with perhaps truncated data), and the client would be responsible for picking the one it wants from that list.
That GET would return a number of URI templated links describing how to query / get one employee / get all employees. Something like:
"_links": {
"http://coredata/rels/employees#RetrieveOne": {
"href": "http://coredata/employees/{id}"
},
"http://coredata/rels/employees#Query": {
"href": "http://coredata/employees{?login,firstName,lastName}"
},
"http://coredata/rels/employees#All": {
"href": "http://coredata/employees/all"
}
}
I'm a little stuck here with what remains closest to HATEOAS. For option 1, I really do not want to make my clients retrieve all employees every time for the sake of navigation, but I can see how using URI templating in example two introduces some out-of-band knowledge.
My other thought was to use the RetrieveOne, Query, and All operations as my cool URLs, but that seems to violate the concept that you should be able to navigate to the resources you want from one base URI.
Has anyone else managed to come up with a good way to handle this? Navigation is dead simple once you've retrieved one resource or a set of resources, but it seems very difficult to use for discovery.
Option 2 is not too bad as you're using RFC 6570 to characterize the URI patterns; while HATEOAS is usually stated in terms of not having clients synthesize URIs, if a server is prepared to make guarantees on the URI template and to tell it to clients explicitly in a standard format, it's acceptable. (I would be tempted to have the “list all employees” URL be without the all suffix, so as to distinguish it from the employee with that ID; the client should not — in principle — know what an employee ID looks like.)
In fact, the main problem is actually that clients have to understand what those tag URIs mean; there's just no real way to guess that “http://coredata/rels/employees#All” means “list all employees”. That's where you get into embedding knowledge in clients, semantic labeling, etc. and HATEOAS doesn't really address those things.
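To show what a client has to do with such templated links, here is a minimal Python sketch of URI template expansion. It deliberately covers only the two RFC 6570 forms used in the question, simple expansion ({id}) and form-style query expansion ({?login,firstName,lastName}); the expand function is a hypothetical helper, not a full implementation of the RFC.

```python
import re
from urllib.parse import quote


def expand(template, **values):
    """Expand {var} and {?a,b,c} expressions; undefined variables are omitted."""

    def substitute(match):
        expression = match.group(1)
        if expression.startswith("?"):
            # Form-style query expansion: keep only the variables that are defined.
            names = expression[1:].split(",")
            pairs = [
                f"{name}={quote(str(values[name]), safe='')}"
                for name in names
                if name in values
            ]
            return "?" + "&".join(pairs) if pairs else ""
        if expression not in values:
            return ""
        # Simple expansion: percent-encode everything outside the unreserved set.
        return quote(str(values[expression]), safe="")

    return re.sub(r"\{([^}]*)\}", substitute, template)
```

For instance, expand("http://coredata/employees{?login,firstName,lastName}", firstName="Jane") yields http://coredata/employees?firstName=Jane, so the client fills in a server-provided pattern rather than inventing the URI structure itself.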
TL;DR: Use OPTIONS method to return programmatically consumable documentation and always implement pagination.
We create a number of internal REST services at my work. We have standardized on the use of the OPTIONS method to return the metadata of a resource. The metadata we return acts as parsable documentation of that resource. It indicates URL templates, various options such as PAGE and PAGESIZE, and the different methods that the resource supports. We also return rel links, so top-level resource discovery can occur with the use of OPTIONS without pulling any actual data.
We also implement pagination specifically to prevent issues around returning large amounts of data unnecessarily.
My HATEOAS API returns HTML as well as HAL+JSON, as you are using, and they both use the same URIs, so my JSON responses simply return what a human web user would see (minus all the pretty colours). e.g.
GET /
{"_links": {
"http://coredata/companies": { "href": "/companies?page=1" }
...
}}
GET /companies?page=1
{"_links": {
"next": { "href": "?page=2" }
...
}}
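A server-side helper for producing such paginated links might look like the following Python sketch. The page_links function and its parameters are hypothetical, and it emits hrefs that include the path rather than the relative "?page=2" form shown above; "prev" and "next" are the standard registered link relation names.

```python
import math


def page_links(base_path, page, page_size, total_items):
    """Return a HAL-style _links object for one page of a collection."""
    # Even an empty collection has one (empty) page.
    last_page = max(1, math.ceil(total_items / page_size))
    links = {"self": {"href": f"{base_path}?page={page}"}}
    if page > 1:
        links["prev"] = {"href": f"{base_path}?page={page - 1}"}
    if page < last_page:
        links["next"] = {"href": f"{base_path}?page={page + 1}"}
    return {"_links": links}
```

A client then only needs to follow "next" until it is absent, without knowing the page size or the total count.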