Representing multiple many-to-one relationships in REST - rest

I am creating a RESTful API that contains the following resources shown in the following UML-ish diagram. As shown by the multiplicities (between parentheses), there are four one-to-many relationships.
I currently have the following GET methods defined:
GET /farmers
GET /farmers/[farmer_id]
GET /farms
GET /farms/[farm_id]
GET /llamas
GET /llamas/[llama_id]
GET /events
I am trying to decide what is the best and most RESTful way to access these relations, as well as accessing events related to Farms and Farmers (via Llamas). All of these relations will be made available using hypermedia links. The two options I have come up with so far are:
Multiple URIs
GET /farmers/[farmer_id]/farms
GET /farmers/[farmer_id]/llamas
GET /farmers/[farmer_id]/events
GET /farms/[farm_id]/farmer
GET /farms/[farm_id]/llamas
GET /farms/[farm_id]/events
GET /llamas/[llama_id]/farm
GET /llamas/[llama_id]/farmer
GET /llamas/[llama_id]/events
Single URIs and Filtering
GET /farms?farmer=[farmer_id]
GET /llamas?farmer=[farmer_id]
GET /events?farmer=[farmer_id]
GET /farmers/[farm_farmer_id]
GET /llamas?farm=[farm_id]
GET /events?farm=[farm_id]
GET /farms/[llama_farm_id]
GET /farmers/[llama_farmer_id]
GET /events?llama=[llama_id]
Are either of these methods preferred for REST, or can I just pick the one I like the most?

It really doesn't matter from a REST perspective, as long as you provide the links to the client.
From an implementation perspective, you may want to favor one of those options over the others if you're using a framework that uses a specific convention for automatic link relation publishing. For instance, Spring Data REST gives you the multiple URI scheme you defined above out of the box. Therefore, it would be a lot less work for you to simply go with that convention.

As Jonathon W says, it doesn't matter from a REST perspective. However, it does matter from a HTTP perspective.
From HTTP perspective, the multiple URI represents a unique, potentially-cachable resource. Cache control headers will obviously control any caching. However, Most proxies will not cache URIs with a query string, so the single URI with filtering option will generally not be cachable.
You also need to consider what are the consequences of not including the query string parameters. Ie. What would /events output? If the llama_id parameter is required, this is not evident.
Personally, I would go for the multiple URI option

Related

Sub-resource creation url

Lets assume we have some main-resource and a related sub-resource with 1-n relation;
User of the API can:
list main-resources so GET /main-resources endpoint.
list sub-resources so GET /sub-resources endpoint.
list sub-resources of a main-resource so one or both of;
GET /main-resources/{main-id}/sub-resources
GET /sub-resouces?main={main-id}
create a sub-resource under a main-resource
POST /main-resource/{main-id}/sub-resouces: Which has the benefit of hierarchy, but in order to support this one needs to provide another set of endpoints(list, create, update, delete).
POST /sub-resouces?main={main-id}: Which has the benefit of having embedded id inside URL. A middleware can handle and inject provided values into request itself.
create a sub-resource with all parameters in body POST /sub-resources
Is providing a URI with main={main-id} query parameter embedded a good way to solve this or should I go with the route of hierarchical URI?
In a true REST environment the spelling of URIs is not of importance as long as the characters used in the URI adhere to the URI specification. While RFC 3986 states that
The path component contains data, usually organized in hierarchical form, that, along with data in the non-hierarchical query component (Section 3.4), serves to identify a resource within the scope of the URI's scheme and naming authority (if any). The path is terminated by the first question mark ("?") and number sign ("#") character, or by the end of the URI. (Source)
it does not state that a URI has to have a hierarchical structure assigned to it. A URI as a whole is a pointer to a resource and as such a combination of various URIs may give the impression of some hierarchy involved. The actual information of whether URIs have some hierarchical structure to it should though stem from link relations that are attached to URIs. These can be registered names like up, fist, last, next, prev and the like or Web linking extensions such as https://acme.org/rel/parent which acts more like a predicate in a Semantic Web relation basically stating that the URI at hand is a parent to the current resource. Don't confuse rel-URIs for real URIs though. Such rel-URIs do not necessarily need to point to an actual resource or even to a documentation. Such link relation extensions though my be defined by media-types or certain profiles.
In a perfect world the URI though is only used to send the request to the actual server. A client won't parse or try to extract some knowledge off an URI as it will use accompanying link relation names to determine whether the URI is of relevance to the task at hand or not. REST is full of such "indirection" mechanism in order to help decoupling clients from servers.
I.e. what is the difference between a URI like https://acme.org/api/users/1 and https://acme.org/api/3f067d90-8b55-4b60-befc-1ce124b4e080? Developers in the first case might be tempted to create a user object representing the data returned by the URI invoked. Over time the response format might break as stuff is renamed, removed and replaced by other stuff. This is what Fielding called typed resources which REST shouldn't have.
The second URI doesn't give you a clue on what content it returns, and you might start questioning on what benefit it brings then. While you might not be aware of what actual content the service returns for such URIs, you know at least that your client is able to process the data somehow as otherwise the service would have responded with a 406 Not Acceptable response. So, content-type negotiation ensures that your client will with high certainty receive data it is able to process. Maintaining interoperability in a domain that is likely to change over time is one of RESTs strong benefits and selling points. Depending on the capabilities of your client and the service, you might receive a tailored response-format, which is only applicable to that particular service, or receive a more general-purpose one, like HTML i.e.. Your client basically needs a mapping to translate the received representation format into something your application then can use. As mentioned, REST is probably all about introducing indirections for the purpose of decoupling clients from servers. The benefit for going this indirection however is that once you have it working it will work with responses issued not only from that server but for any other service that also supports returning that media type format. And just think a minute what options your client has when it supports a couple of general-purpose formats. It then can basically communicate and interoperate with various other services in that ecosystem without a need for you touching it. This is how browsers operate on the Web for decades now.
This is exactly why I think that this phrase of Fielding is probably one of the most important ones but also the one that is ignored and or misinterpreted by most in the domain of REST:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. (Source)
So, in a true REST environment the form of the URI is unimportant as clients rely on other mechanisms to determine whether to use that URI or not. Even for so called "REST APIs" that do not really care about the true meaning of REST and treat it more like old-school RPC the question at hands is probably very opinionated and there probably isn't that one fits all solution. If your framework supports injecting stuff based on the presence of certain query parameters, use that. If you prefer the more hierarchical structure of URIs, go for those. There isn't a right or wrong in such cases.
According to the URI standard when you have a hierarchical relationship between resources, then better to add it to the path instead of the query. https://datatracker.ietf.org/doc/html/rfc3986#page-22 Sometimes it is better to describe the relation itself, not just the sub-resource, but that happens only if the sub-resource can belong to multiple main resources, which is n:m relationship.

REST strategy for overloading GET verb

Consider a need to create a GET endpoint for fetching Member details using either of 4 options (It's common in legacy application with RPC calls)
Get member by ID
Get member by SSN
Get member by a combination of Phone and LastName (both must be passed)
What's a recommended strategy to live the REST spirit and yet provide this flexibility?
Some options I could think of are:
Parameters Based
/user/{ID}
/user?ssn=?
/user?phone=?&lname=?
Separate Endpoints
/user/{ID}
/user/SSN/{SSNID}
/user/{lname}/{phone}
RPC for custom
/user/{ID}
/user/findBySSN/
/user/findbycontact/
REST doesn't care what spelling you use for your identifiers.
For example, think about how you would do this on the web. You would provide forms, one for each set of search criteria. The consumer would choose which form to use, and submit the form, without ever knowing what the URI is.
In the case of HTML forms, there are specific processing rules for describing how the form information will be copied into the URI. The form takes on the aspect of a URI Template.
A URI Template provides both a structural description of a URI space and, when variable values are provided, machine-readable instructions on how to construct a URI corresponding to those values.
But there aren't any rules saying that restrict the server from providing a URI template that directs the client to copy the variable values into path segments rather than into the query string.
In other words, in REST, the server retains control of its own URI space.
You might sometimes prefer to use path segments because of their hierarchical nature, which may be convenient if you want the client to use relative resolution of relative references in your representations.
REST ≠ pretty URLs. The two are orthogonal.
Your question is about the latter, I feel.
Whilst the other answers have been good, if you want your API to work with HTML forms, go with query parameters on the collection /user resource for all fields, even for ID (assuming a human is typing these in based on information they are getting from sheets of paper on their desk, etc.)
If your server is able to produce links to each record, always produce canonical links such as /users/{id}, don't duplicate data under different URLs.

REST URL Design for One to Many and Many to Many Relationships

Your backend has two Models:
One Company to Many Employees.
You want to accomplish the following:
Get all Companies
Get a Company by ID
Get all Employees for a Company
Get all Employees
Get a Employee by ID
What is the best practice for handling the REST URLs when your models have 1:M relationships? This is what I have thought of so far:
/companies/
/companies/<company_id>/
/companies/<company_id>/employees/
/employees/
/employees/id/<employee_id>/
Now let's pretend One Company has Many Models. What is the best name to use for "Adding an employee to a Company" ? I can think of several alternatives:
Using GET:
/companies/<company_id>/add-employee/<employee_id>/
/employees/<employee_id/add-company/<company_id>/
Using POST:
/companies/add-employee/
/employees/add-company/
The URIs look fine to me, except maybe the last one, that does not need an additional "id" in the path. Also, I prefer singular forms of words, but that is just me perhaps:
/company/
/company/<company_id>/
/company/<company_id>/employee/
/employee/
/employee/<employee_id>/
The URIs do not matter that much actually, and can be changed at any point later in time when done properly. That is, all the URIs are linked to, instead of hardcoded into the client.
As far as adding an employee, I would perhaps use the same URIs defined above, and the PUT method:
PUT /employee/123
With some representation of an employee. I would prefer the PUT because it is idempotent. This means, if the operation seems to fail (timeout, network error occurs, whatever) the operation can be repeated without checking whether the previous one "really" failed on the server or not. The PUT requires some additional work on the server side, and some additional work to properly link to (such as forms), but offers a more robust design.
As an alternative you can use
POST /employee
With the employee representation as body. This does not offer any guarantees, but it is easier to implement.
Do not use GET to add an employee (or anything for that matter). This would go against the HTTP Specification for the GET method, which states that it should be a pure information retrieval method.

Designing a REST api by URI vs query string

Let's say I have three resources that are related like so:
Grandparent (collection) -> Parent (collection) -> and Child (collection)
The above depicts the relationship among these resources like so: Each grandparent can map to one or several parents. Each parent can map to one or several children. I want the ability to support searching against the child resource but with the filter criteria:
If my clients pass me an id reference to a grandparent, I want to only search against children who are direct descendants of that grandparent.
If my clients pass me an id reference to a parent, I want to only search against children who are direct descendants of my parent.
I have thought of something like so:
GET /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}
and
GET /myservice/api/v1/parents/{parentID}/children?search={text}
for the above requirements, respectively.
But I could also do something like this:
GET /myservice/api/v1/children?search={text}&grandparentID={id}&parentID=${id}
In this design, I could allow my client to pass me one or the other in the query string: either grandparentID or parentID, but not both.
My questions are:
1) Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.
2) What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.
3) What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.
Thanks!
Something to note, is that since you use GET in your above examples,
dependant on the end user browser settings, proxy settings or other calling applications internal settings,
the response is likely to be cached somewhere on the message route, and the original request message may never actually reach your API layer where the most recent data is accessed.
so first, you may want to review your requirement of using the verb GET.
If you want your API to return the most recent data every time, then don't use GET, or if you still want to use GET, then require the caller to append a random number to the end of the url, to decrease the likely hood of caching.
or get the client to send the verb PURGE, after every GET.
this proxy caching behaviour that is present across the internet, in browsers, and server architectures is where REST fails for me, since caching of GET can be very useful but only sometimes.
stick to basic querystrings if you want to make it really simple for the end developer and caching of responses offers no problems for you,
or
just use POST and forget about this tiresome REST argument

How do you represent "thin" and "fat" versions of a RESTful resource?

How would you model a resource that can have two different representations. For example, one representation may be "thin" withe most of its related resources accessible by links. Another representation may be "fat" where most of its related resources are embedded. The idea being, some clients don't mind having to make many calls to browse around the linked resources, but others want to get the data all at once.
Consider a movie resource that is associated with a director, actors, etc. Perhaps the thin version of it has the movie title only, and to get the data for the director, list of actors, etc., one must make additional requests via the embedded links to them. Perhaps the fat version of it contains all the movie nested inside, including the director's data, the data for the various actor's, etc.
How should one model this?
I see a few options:
these two representations are really two different resources and require different URIs
these two representations are in fact the same resource, and you can select between the two representations via custom media types, for example application/vnd.movie.thin+json and application/vnd.movie.fat+json.
these two representations are in fact the same resource, and selecting the different representations should be done with query parameters (e.g. /movies/1?view=thin).
Something else...
What do you consider the proper approach to this kind of API?
You could use the prefer header with the return-minimal parameter.
I prefer using Content-Type for this. You can use parameters, too:
application/vnd.myapp; profile=light
The Fielding dissertation about REST tells you about the resource interface, that you have to bind your IRIs to resources, which are entity sets. (This is different from SOAP, because by there you usually bind your IRIs to operations.)
According to Darrel Miller, the path is for describing hierarchical data and the query string is for describing non-hierarchical data in IRIs, but we use the path and the query together to identify a resource inside an API.
So based on these you have two approaches:
You can say, that the same entity with fewer properties can be mapped to a new resource with an own IRI. In this case the /movies/1?view=thin or the /movies/1/view:thin is okay.
Pros:
According to the
RDF a
property has rdf:type of rdf:Property and rdfs:Resource either, and REST has connections to the semantic web and linked data.
It is a common practice to create an IRI for a single property, for example /movies/1/title, so if we can do this by a single property, then we can do this by a collection of properties as well.
It is similar to a map reduce we already use for collection of entites: /movies/recent, etc... The only difference, that by the collection of entities we reduce a list or ordered set, and by the collection of properties we reduce a map. It is much more interesting to use the both in a combination, like: /movies/recent/title, which can return the titles of the recent movies.
Cons:
By RDF everything has an rdf:type of rdfs:Resource and maybe REST does not follow the same principles by web documents.
I haven't found anything about single properties or property collections can be or cannot be considered as resources in the dissertation, however I may accidentally skipped that section of the text (pretty dry stuff)...
You can say that the same entity with fewer properties is just a different representation of the same resource, so it should not have a different IRI. In this case you have to put your data about the preferred view to somewhere else into the request. Since by GET requests there is no body, and the HTTP method is not for storing this kind of things, the only place you can put it are the HTTP headers. By long term user specific settings you can store it on the server, or in cookies maintained by the client. By short term settings you can send it in many headers. By the content-type header you can define your own MIME type which is not recommended, because we don't like having hundreds of custom MIME types probably used by a single application only. By the content-type header you can add a profile to your MIME type as Doug Moscrop suggested. By a prefer header you can use the return-minimal settings as Darrel Miller suggested. By range headers you can do the same in theory, but I met with range headers only by pagination.
Pros:
It is certainly a RESTful approach.
Cons:
Existing HTTP frameworks not always support extracting these kind of header params, so you have to write your own short code to do that.
I cannot find anything about how these headers are affecting the client and server side caching mechanisms, so some of them may not be supported in some browsers, and by servers you have to write your own caching implementation, or find a framework which supports the header you want to use.
note: I personally prefer using the first approach, but that's just an opinion.
According to Darrel Miller the naming of the IRI does not really count by REST.
You just have to make sure, that a single IRI always points to the same resource, and that's all. The structure of the IRI does not count by client side, because your client has to meet the HATEOAS constraint if you don't want it to break by any changes of the IRI naming. This means, that always the server builds the IRIs, and the client follows these IRIs it get in a hypermedia response. This is just like using web browsers to follow link, and so browse the web... By REST you can add some semantics to your hypermedia which explains to your client what it just get. This can be some RDF vocabulary, for example schema.org, microdata, iana link relations, and so on (even your own application specific vocab)...
So using nice IRIs is not a concern by REST, it is a concern only by configuring the routing on the server side. What you have to make sure by REST IRIs, that you have a resource - IRI mapping and not an operation - IRI mapping, and you don't use IRIs for maintaining client state, for example storing user id, credentials, etc...