REST API design: how to handle resources that can also be sub-resources - rest

I have to put a (read-only) REST service atop of an existing product database. The easy part is having a top level product resource, like:
/api/products/
Now, actually callers of this service will rather need to get their relevant products based on the ID of a store and of a specific process (like "retail"). Behind the scenes, the combination of those two values results in a configured subset of products. This must be transparent for the caller, it should not need to know about these "product portfolios".
So I thought about designing the URI like this, where 1234 is the StoreID and retail is the process:
/api/stores/1234/retail/products
The first question that comes up here is if I should return full products here or URIs to their individual resources on /api/products/ ... the pro would be clearly that the caller does not need to retrieve each individual product from /api/products, the con would be that this would cause a caching headache on the /api/stores/1234/retail/products URI.
To complicate things, those products of course also have prices. Also here, a product does not have one price, but multiple ones that is also dependent of the StoreID and the Process, besides other factors. In reality, prices are direct children of products, so:
/api/products/ABCD/prices
would be the obvious choice, but again, as StoreID and Process are relevant to pre-filter the prices, an URI like:
/api/stores/1234/retail/products/ABCD/prices
would be more appropriate.
At the same time, there are other subresources of products that will not make sense to have under this URI, like product details. Those would clearly only make sense directly under /api/products/ABCD/details as they are not dependant on the store or process.
But this looks somehow messy to me. But at the same time, solving this by only having queryparam filters to solve it directly on the product resource, is not much nicer and does not enforce the caller to provide both, StoreId and process:
/api/products?store=1234&process=retail
/api/products/ABCD/prices?store=1234&process=retail
Even more, process or storeid does not have anything to do with the product, so querying for it directly on product seems odd. For prices, it would make sense, though.
So my question is: is there a good way to solve this that i don't see? And: would you recommend returning full products when they are a subresource - and what do you think about handling (HTTP) caching when doing that?

The first question that comes up here is if I should return full
products here or URIs to their individual resources on /api/products/
[...] the con would be that this
would cause a caching headache on the /api/stores/1234/retail/products
URI.
I would definitely return the full products - imagine the amount the client would have to do if it would simply want to display a list of product names. Ideally this endpoint would be paginated (query string can include something like &pageSize=10&pageNumber=2, for example).
Also there are various caching solutions around this - for example you can cache all the products in a data structure service like Redis.
To complicate things, those products of course also have prices [...]
and details subresource.
Looking at the Richardson Maturity Model level 3, this would be where links come into play, and you could have something like this under a product resource:
<link rel = "/linkrels/products/ABCD/prices"
uri = "/products/ABCD/prices?store=1234&process=retail"/>
and another similar link for the product details resource.
#Roman is right, REST is meant to be discoverable, clients should simply follow links (that can have long/ugly uris), instead of having to memorize them (like in SOAP, for example).

Related

How should I design a REST API

I'm thinking about a REST API design. There are several tables in my database. For example Customer and Order.
Of course - each Order has its Customer (and every customer can have many Orders).
I've decided to provide such an interface
/api/v1/Customers/ -- get list of Customers, add new Customer
/api/v1/Customers/:id: -- get Customer with id=:id:
/api/v1/Orders/ -- get list of Orders, add new Order
/api/v1/Orders/:id: -- get Order with id=:id:
It works flawlessly. But my frontend has to display a list of orders with customer names. With this interface, I will have to make a single call to /api/v1/Orders/ and then another call to /api/v1/Customer/:id: for each record from the previous call. Or perform two calls to /api/v1/Orders/ and /api/v1/Customers/ and combine them on the frontend side.
It looks like overkill, this kind of operation should be done at the database level. But how can/should I provide an appropriate interface?
/api/v1/OrdersWithCustomers
/api/v1/OrdersWithCustomers/:id:
Seems weir. Is it a right way to go
There's no rule that says you cannot "extend" the data being returned from a REST API call. So instead of returning "just" the Order entity (as stored in the backend), you could of course return an OrderResponseDTO which includes all (revelant) fields of the Order entity - plus some from the Customer entity that might are relevant in your use case.
The data model for your REST API does not have to be an exact 1:1 match to your underlying database schema - it does give you the freedom to leave out some fields, or add some additional information that the consumers of your API will find helpful.
Great question, and any API design will tend to hit pragmatic reality at some point like this.
One option is to include a larger object graph for each resource (ie include the customer linked to each order) but use filter query parameters to allow users to specify what properties they require or don't require.
Personally I think that request parameters on a restful GET are fine for either search semantics when retrieving a list of resources, or filtering what is presented for each resource as in this case
Another option for your use case might be to look into a GraphQL approach.
How would you do it on the web?
You've got a web site, and that website serves documents about Customers, and documents about Orders. But your clients aren't happy, because its too much boring, mistake-prone work to aggregate information in the two kinds of documents.
Can we please have a document, they ask, with the boring work already done?
And so you generate a bunch of these new reports, and stick them on your web server, and create links to make it easier to navigate between related documents. TA-DA.
A "REST-API" is a facade that makes your information look and act like a web site. The fact that you are generating your representations from a database is an implementation details, deliberately hidden behind the "uniform interface".

REST URL Design for One to Many and Many to Many Relationships

Your backend has two Models:
One Company to Many Employees.
You want to accomplish the following:
Get all Companies
Get a Company by ID
Get all Employees for a Company
Get all Employees
Get a Employee by ID
What is the best practice for handling the REST URLs when your models have 1:M relationships? This is what I have thought of so far:
/companies/
/companies/<company_id>/
/companies/<company_id>/employees/
/employees/
/employees/id/<employee_id>/
Now let's pretend One Company has Many Models. What is the best name to use for "Adding an employee to a Company" ? I can think of several alternatives:
Using GET:
/companies/<company_id>/add-employee/<employee_id>/
/employees/<employee_id/add-company/<company_id>/
Using POST:
/companies/add-employee/
/employees/add-company/
The URIs look fine to me, except maybe the last one, that does not need an additional "id" in the path. Also, I prefer singular forms of words, but that is just me perhaps:
/company/
/company/<company_id>/
/company/<company_id>/employee/
/employee/
/employee/<employee_id>/
The URIs do not matter that much actually, and can be changed at any point later in time when done properly. That is, all the URIs are linked to, instead of hardcoded into the client.
As far as adding an employee, I would perhaps use the same URIs defined above, and the PUT method:
PUT /employee/123
With some representation of an employee. I would prefer the PUT because it is idempotent. This means, if the operation seems to fail (timeout, network error occurs, whatever) the operation can be repeated without checking whether the previous one "really" failed on the server or not. The PUT requires some additional work on the server side, and some additional work to properly link to (such as forms), but offers a more robust design.
As an alternative you can use
POST /employee
With the employee representation as body. This does not offer any guarantees, but it is easier to implement.
Do not use GET to add an employee (or anything for that matter). This would go against the HTTP Specification for the GET method, which states that it should be a pure information retrieval method.

RESTful services naming: GET the details of a list of products

I am trying to understand which should be the correct REST approach to name some of a e-commerce style endpoint.
If I am not mistaken, getting a list of products and the details of one of each will end up with two GET endpoints as
A) GET /products
B) GET /products/id
(I deliberately skipped pagination issues)
If I am looking at the list of shops which sell a specific product I can specify and endpoint as such
C) GET /products/id/shops
I struggle to understand though what happens if I need to specify more than one product for shop reserach.
Can the above endpoint be expanded to take multiple parameters or this is somehow discouraged?
In other words, should I be looking into something like
1) GET /products/id1,id2,id3/shops
2) GET /products/id1/shops [id2,id3]
or rather a completely new
3) GET /shops [id1,id2,id3]
?
Notes
Unanswered question in SO seems to underline that this is a sort of an untold story in RESTful services... :)
My current source of reference
As referred in many SO answers, as for example here, the URI does not make the service RESTful.
I agree with it, so to extend the concept a little my point above is that the implementation of the a service like 1) above may be (and in my case is, for server implementation details) different from easily combining the result given by 3 endpoints in the form of C).
In a more general sense, the implementation of such a combination could be kept internal.
Thus, yes the URI does not make the service RESTful but it would be nice to extend the cleaniness and expresiveness of the C) form for multiple ids.
Edit
In response to the correct note by Lutz in answer, shops may be trated as resources on their own.
What if, I came out with this not so clever example, the subresource does not really exists on its own, as for example the free places for 2 movies in a cinema where
GET /movies/12,14/places
where
GET /places?movies=12,14
is obviously feasible but not that RESTful imho.
I agree with the answer given by Lutz Horn. Furthermore, I'm not sure why you would think that using the query string in a GET request is "not that RESTful".
To quote from O'Reilly's RESTful Web Services, pg 233, under the header URI Design (emphasis mine):
Use punctuation characters to separate multiple pieces of data at the same level of hierarchy. Use commas when the order of the items matters, as it does in latitude and longitude: /Earth/37.0,-95.2. Use semicolons when the order doesn't matter: /color-blends/red;blue.
Use query variables only to suggest arguments being plugged into an algorithm, or when the other two techniques fail. If two URIs differ only in their query variables, it implies that they're the different sets of inputs into the same underlying algorithm.
This should give you sufficient direction in determining how to construct your routes. While you could easily construct it thusly (using semicolons, because it doesn't seem that order matters here):
GET /products/id1;id2;id3/shops
You could just as easily write it this way:
GET /shops?productIds=id1,id2,id3
And this might be a more reasonable approach given that what you indicate you are doing is searching for all shops that carry a particular item, and search is an algorithm for which product ids are input parameters.
To your example of movies (if I am understanding it correctly), if you are looking for places (our resource) where the movie (or movies) is showing AND seating is available:
GET /places?movie=id1,id2,id3&availability=true
It looks like you just have additional search parameters. If I am misunderstanding your "subresource may not exist" comment, please clarify this for us so that we can address it properly.
I would make shops a separate resource.
GET /shops lists all shops
GET /shops/123 gets the details of shop 123
GET /shops?sellsProduct=id1,id2,id3 lists all shop that sell the products
In URLs like /products/id/shops the shops are a subresource of a product. But since a shop can exist independently of any product, this makes not much sense.

Dynamic representation of a REST resource

Lets assume I have an object that I expose as a REST resource in my application. This object has many fields and contains many other objects including associated collections. Something like this, but think MUCH bigger:
Customer
List<Order> orders
List<Address> shippingAddresses;
// other fields for name, etc.
Order
List<Product> products
// fields for total, tax, shipping, etc.
Product
// fields for name, UPC, description, etc.
I expose the customer in my api as /customer/{id}
Some of my clients will want all of the details for every product in each order. If I follow HATEOAS I could supply a link to get the product details. That would lead to n+1 calls to the service to populate the products within the orders for the customer. On the other hand, if I always populate it then many clients receive a bunch of information they don't need and I do a ton of database lookups that aren't needful.
How do I allow for a customer representation of my resource based on the needs of the client?
I see a few options.
Use Jackson's JsonView annotation to specify in advance what is used. The caller asks for a view appropriate to them. i.e. /customer/{id}?view=withProducts. This would require me to specify all available views at compile time and would not be all that flexible.
Allow the caller to ask for certain fields to be populated in the request, i.e. /customer/{id}?fields=orders,firstName,lastName. This would require me to have some handler that could parse the fields parameter and probably use reflection to populate stuff. Sounds super messy to me. The what do you do about sub-resources. Could I do fields=orders.products.upc and join into the collection that way? Sounds like I'm trying to write hibernate on top of REST or something.
Follow HATEOAS and require the client to make a million HTTP calls in order to populate what they need. This would work great for those that don't want to populate the item most of the time, but gets expensive for someone that is attempting to show a summary of order details or something like that.
Have separate resources for each view...
Other?
I would do something like this:
/customers/{id}/orders/?include=entities
Which is a kind of a more specific variation of your option 1.
You would also have the following options:
Specific order from a specific customer without list of products:
/customers/{id}/orders/{id}
Just the orders of a customer without products:
/customers/{id}/orders/
I tend to avoid singular resources, because most of the time or eventually someone always wants a list of things.
Option 2 (client specifies fields) is a filtering approach, and acts more like a query interface than a GETable resource. Your filter could be more expressive if you accept a partial template in a POST request that your service will populate. But that's complicated.
I'm willing to bet all you need is 2 simple representations of any complex entity. That should handle 99.9% of the cases in your domain. Given that, make a few more URIs, one for each "view" of things.
To handle the 0.1% case (for example, when you need the Products collection fully populated), provide query interfaces for the nested entities that allow you to filter. You can even provide hypermedia links to retrieve these collections as part of the simplified representations above.

REST API Design: Nested Collection vs. New Root

This question is about optimal REST API design and a problem I'm facing to choose between nested resources and root level collections.
To demonstrate the concept, suppose I have collections City, Business, and Employees. A typical API may be constructed as follows. Imagine that ABC, X7N and WWW are keys, e.g. guids:
GET Api/City/ABC/Businesses (returns all Businesses in City ABC)
GET Api/City/ABC/Businesses/X7N (returns business X7N)
GET Api/City/ABC/Businesses/X7N/Employees (returns all employees at business X7N)
PUT Api/City/ABC/Businesses/X7N/Employees/WWW (updates employee WWW)
This appears clean because it follows the original domain structure - business are in a city, and employees are at a business. Individual items are accessible via key under the collection (e.g. ../Businesses returns all businesses, while ../Businesses/X7N returns the individual business).
Here is what the API consumer needs to be able to do:
Get businesses in a city (GET Api/City/ABC/Businesses)
Get all employees at a business (GET Api/City/ABC/Businesses/X7N/Employees)
Update individual employee information (PUT Api/City/ABC/Businesses/X7N/Employees/WWW)
That second and third call, while appearing to be in the right place, use a lot of parameters that are actually unnecessary.
To get employees at a business, the only parameter needed is the key of the business (X7N).
To update an individual employee, the only parameter needed it the key of the employee (WWW)
Nothing in the backend code requires non-key information to look up the business or update the employee. So, instead, the following endpoints appear better:
GET Api/City/ABC/Businesses (returns all Businesses in City ABC)
GET Api/Businesses/X7N (returns business X7N)
GET Api/Businesses/X7N/Employees (returns all employees at business X7N)
PUT Api/Employees/WWW (updates employee WWW)
As you can see, I've created a new root for businesses and employees, even though from a domain perspective they are a sub/sub-sub-collection.
Neither solution appears very clean to me.
The first example asks for unnecessary information, but is structured in a way that appears "natural" to the consumer (individual items from a collection are retrieved via lower leafs)
The second example only asks for necessary information, but isn't structured in a "natural" way - subcollections are accessible via roots
The individual employee root would not work when adding a new employee, as we need to know which business to add the employee to, which means that call would at least have to reside under the Business root, such as POST Api/Businesses/X7N7/Employees, which makes everything even more confusing.
Is there a cleaner, third way that I'm not thinking of?
I don't see how REST adds a constraint that two resources could not have the same value. The resourceType/ID is just an example of the easiest use case rather than the best way to go from a RESTful point of view.
If you read paragraph 5.2.1.1 of Roy Fielding's dissertation carefully, you will notice that Fielding makes the disctinction between a value and a resource. Now a resource should have a unique URI, that's true. But nothing prevents two resources from having the same value:
For example, the "authors' preferred version" of an academic paper is a mapping whose value changes over time, whereas a mapping to "the paper published in the proceedings of conference X" is static. These are two distinct resources, even if they both map to the same value at some point in time. The distinction is necessary so that both resources can be identified and referenced independently. A similar example from software engineering is the separate identification of a version-controlled source code file when referring to the "latest revision", "revision number 1.2.7", or "revision included with the Orange release."
So nothing prevents you from, as you say, changing the root. In your example, a Business is a value not a resource. It is perfectly RESTful to create a resource which is a list of "every business located in a city" (just like Roy's example, "revisions included with the Orange release"), while having a "business which ID is x" resource as well (like "revision number x").
For Employees, I would keep API/Businesses/X7N/Employees as the relation between a business and its employees is a composition relationship, and thus as you say, Employees can and should only be accessed through the Businesses class root. But this is not a REST requirement, and the other alternative is perfectly RESTful as well.
Note that this goes in pair with the application of the HATEAOS principle. In your API, the list of Businesses located in a city could (and perhaps should from a theoretical point of view) be just a list of links to the API/Businesses. But this would mean that the clients would have to do one round-trip to the server for each of the items in the list. This is not efficient and, to stay pragmatic, what I do is embed the representation of the business in the list along with the self link to the URI that would be in this example API/Businesses.
You should not confuse REST with the application of a specific URI naming convention.
HOW the resources are named is entirely secondary. You are trying to use HTTP resource naming conventions - this has nothing to do with REST. Roy Fielding himself states so repeatedly in the documents quoted above by others. REST is not a protocol, it is an architectural style.
In fact, Roy Fielding states in his 2008 blog comment (http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven 6/20/2012):
"A REST API must not define fixed resource names or hierarchies (an obvious coupling of
client and server). Servers must have the freedom to control their own namespace. Instead,
allow servers to instruct clients on how to construct appropriate URIs, such as is done in
HTML forms and URI templates, by defining those instructions within media types and link relations."
So in essence:
The problem you describe is not actually a problem of REST - conceptually, it is a problem of HIERARCHY STRUCTURES versus RELATIONAL STRUCTURES.
While a business is "in" a city and so can be considered to be part of the city "hierarchy" - what about international companies which have offices in 75 cities. Then the city suddenly becomes the junior element in a hierarchy with the business name at the senior level of the structure.
The point is, you can view data from various angles, and depending on the viewpoint you take, it may be simplest to see it as a hierarchy. But the same data can be seen as a hierarchy with different levels. When you are using HTTP type resource names, then you have entered a hierarchy structure defined by HTTP. This is a constraint, yes, but it's not a REST constraint, it's a HTTP constraint.
From that angle, you can chose the solution which fits better to your scenario. If your customer cannot supply the city name when he supplies the company name (he may not know), then it would be better to have the key with only city name. As I said, it's up to you, and REST won't stand in your way ...
More to the point:
The only real REST constraints you have, if you have already decided to use HTTP with GET
PUT and so on, are:
Thou shalt not presumeth any prior ("out of band") knowledge between client and servers. *
Look at your proposal #1 above in that light. You assume that customers know the keys for the cities which are contained in your system? Wrong - that's not restful. So the server has to give the list of cities as a list of choices in some way. So are you going to list every city in the world here?
I guess not, but then you'll have to do some work on how you are planning to do this, which brings us to:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state ...
I think, reading the mentioned Roy Fielding blog will help you out considerably.
In a RESTful-API URL design should be quite unimportant - or at least a side issue since the discoverability is encoded in the hypertext and not in the URL path. Have a look at the resources linked in the REST tag wiki here on StackOverflow.
But if you want to design human readable URLs for your UC, I would suggest the following:
Use the resource type you are creating/updating/querying as the first part of the URL (after your API prefix). So when somebody sees the URL he immediately knows to which resources this URL points. GET /Api/Employees... is the only only way to receive Employee resources from the API.
Use Unique IDs for each resource independent of the relations they are in. So GET /Api/<CollectionType>/UniqueKey should return a valid resource representation. Nobody should have to worry where the Employee is located. (But the returned Employee should have the links to the Business (and for convenience sake City) he belongs to.) GET /Api/Employees/Z6W returns the Employee with this ID no matter where is is located.
If you want to get a specific resource: Put your query parameter at the end (instead in the hierarchical order described in the question). You can use the URL query string (GET /Api/Employees?City=X7N) or a matrix parameter expression (GET /Api/Employees;City=X7N;Business=A4X,A5Y). This will allow you to easily express a collection of all Employees in a specific City - independent of the Business they are in.
Side node:
In my experience an initial hierarchical domain data model seldom survives additional requirements that come up during a project. In your case: Consider a business located in two Cities. You could create a workaround by modelling it as two separate businesses but what about the employee who works half his time in one place and the other half at the other location? Or even worse: It's only clear for which business he works but it's undefined, in which city?
The third way that I see is to make Businesses and Employees root resources and use query parameters to filter collections:
GET Api/Businesses?city=ABC (returns all Businesses in City ABC)
GET Api/Businesses/X7N (returns business X7N)
GET Api/Employees?businesses=X7N (returns all employees at business X7N)
PUT Api/Employees/WWW (updates employee WWW)
Your both solutions use concept of REST sub-resources which requires that subresource is included in parent resource so:
GET Api/City/ABC/Businesses
in response should also return data provided by:
GET Api/City/ABC/Businesses/X7N
GET Api/City/ABC/Businesses/X7N/Employees
similar for:
GET Api/Businesses/X7N
which should return data provided by:
GET Api/Businesses/X7N/Employees
It will make size of the response huge and time required to generate will increase.
To make REST API clean each resource should have only one bounded URI which fallow below patterns:
GET /resources
GET /resources/{id}
POST /resources
PUT /resources/{id}
If you need to make links between resources use HATEOAS
Go with example 1. I wouldn't worry about unnecessary information from the point of view of the server. A URL should clearly identify a resource in a unique fashion from the point of view of the client. If the client would not know what /Employee/12 means without first knowing that it is actually /Businesses/X7N/Employees/12 then the first URL seems redundant.
The client should be dealing with URLs rather than the individual parameters that make up the URLs, so there is nothing wrong with long URLs. To the client they are just strings. The server should be telling the client the URL to do what it needs to do, not the individual parameters that then require the client to construct the URL.