How should I deal with object hierarchies in a RESTful API? - class

I am currently designing the API for an existing PHP application, and to this end am investigating REST as a sensible architectural approach.
I believe I have a reasonable grasp of the key concepts, but I'm struggling to find anybody that has tackled object hierarchies and REST.
Here's the problem...
In the [application] business object hierarchy we have:
Users
L which have one-to-many Channel objects
L which have one-to-many Member objects
In the application itself we use a lazy load approach to populate the User object with arrays of these objects as required. I believe in OO terms this is object aggregation, but I have seen various naming inconsistencies and do not care to start a war about the precise naming convention </flame war>.
For now, consider I have some loosely coupled objects that I may / may not populate depending on application need.
From a REST perspective, I am trying to ascertain what the approach should be. Here is my current thinking (considering GET only for the time being):
Option 1 - fully populate the objects:
GET api.example.com/user/{user_id}
Read the User object (resource) and return the User object with all possible Channel and Member objects pre-loaded and encoded (JSON or XML).
PROS: reduces the number of objects, no traversal of object hierarchies required
CONS: objects must be fully populated (expensive)
Option 2 - populate the primary object and include links to the other object resources:
GET api.example.com/user/{user_id}
Read the User object (resource) and return the User object User data populated and two lists.
Each list references the appropriate (sub) resource i.e.
api.example.com/channel/{channel_id}
api.example.com/member/{member_id}
I think this is close to (or exactly) the implications of hypermedia - the client can get the other resources if it wants (as long as I tag them sensibly).
PROS: client can choose to load the subordinates or otherwise, better separation of the objects as REST resources
CONS: further trip required to get the secondary resources
Option 3 - enable recursive retrieves
GET api.example.com/user/{user_id}
Read the User object and include links to lists of the sub-objects i.e.
api.example.com/user/{user_id}/channels
api.example.com/user/{user_id}/members
the /channels call would return a list of channel resources in the form (as above):
api.example.com/channel/{channel_id}
PROS: primary resources expose where to go to get the subordinates but not what they are (more RESTful?), no requirement to get the subordinates up front, the subordinate list generators (/channels and /members) provide interfaces (method like) making the response more service like.
CONS: three calls now required to fully populate the object
Option 4 - (re)consider the object design for REST
I am re-using the [existing] application object hierarchy and trying to apply it to REST - or perhaps more directly, provide an API interface to it.
Perhaps the REST object hierarchy should be different, or perhaps the new RESTful thinking is exposing limitations of the existing object design.
Any thoughts on the above welcomed.

There's no reason not to combine these.
api.example.com/user/{user_id} – return a user representation
api.example.com/channel/{channel_id} – return a channel representation
api.example.com/user/{user_id}/channels – return a list of channel representations
api.example.com/user/{user_id}/channel_list – return a list of channel ids (or links to their full representations, using the above links)
When in doubt, think about how you would display the data to a human user without "API" concerns: a user wants both index pages ({user_id}/channel_list) and full views ({user_id}/channels).
Once you have that, just support JSON instead of (or in addition to) HTML as the representation format, and you have REST.

The best advice I can give is to try and avoid thinking about your REST api as exposing your objects. The resources you create should support the use cases you need. If necessary you might create resources for all three options:
api.example.com/completeuser/{id}
api.example.com/linkeduser/{id}
api.example.com/lightweightuser/{id}
Obviously my names are a bit goofy, but it really doesn't matter what you call them. The idea is that you use the REST api to present data in the most logical way for the particular usage scenario. If there are multiple scenarios, create multiple resources, if necessary. I like to think of my resources more like UI models rather than business entities.

I would recommend Restful Obects which is standards for exposing domain model's restful
The idea of Restful Objects is to provide a standard, generic RESTful interface for domain object models, exposing representations of their structure using JSON and enabling interactions with domain object instances using HTTP GET, POST, PUT and DELETE.
According to the standard, the URIs will be like:
api.example.com/object/user/31
api.example.com/object/user/31/properties/username
api.example.com/object/user/31/collections/channels
api.example.com/object/user/31/collections/members
api.example.com/object/user/31/actions/someFunction
api.example.com/object/user/31/actions/someFunction/invoke
There are also other resources
api.example.com/services
api.example.com/domain-types
The specification defines a few primary representations:
object (which represents any domain object or service)
list (of links to other objects)
property
collection
action
action result (typically containing either an object or a list, or just feedback messages)
and also a small number of secondary representations such as home, and user
This is interesting as you’ll see that representations are fully self-describing, opening up the possibility of generic viewers to be implemented if required.
Alternatively, the representations can be consumed directly by a bespoke application.

Here's my conclusions from many hours searching and with input from the responders here:
Where I have an object that is effectively a multi-part object, I need to treat that as a single resource. Thus if I GET the object, all the sub-ordinates should be present. This is required in order that the resource is cacheable. If I part load the object (and provide an ETag stamp) then other requestors may receive a partial object when they expected a full one. Conclude - objects should be fully populated if they are being made available as resources.
Associated object relationships should be made available as links to other (primary) resources. In this way the objects are discoverable by traversing the API.
Also, the object hierarchy that made sense for main application site may appear not be what you need to act in RESTful manner, but is more likely revealing problems with the existing hierarchy. Having said this the API may require more specialised use cases than had been previously envisaged, and specialised resources may be required.
Hope that helps someone

Related

Sub-resource creation url

Lets assume we have some main-resource and a related sub-resource with 1-n relation;
User of the API can:
list main-resources so GET /main-resources endpoint.
list sub-resources so GET /sub-resources endpoint.
list sub-resources of a main-resource so one or both of;
GET /main-resources/{main-id}/sub-resources
GET /sub-resouces?main={main-id}
create a sub-resource under a main-resource
POST /main-resource/{main-id}/sub-resouces: Which has the benefit of hierarchy, but in order to support this one needs to provide another set of endpoints(list, create, update, delete).
POST /sub-resouces?main={main-id}: Which has the benefit of having embedded id inside URL. A middleware can handle and inject provided values into request itself.
create a sub-resource with all parameters in body POST /sub-resources
Is providing a URI with main={main-id} query parameter embedded a good way to solve this or should I go with the route of hierarchical URI?
In a true REST environment the spelling of URIs is not of importance as long as the characters used in the URI adhere to the URI specification. While RFC 3986 states that
The path component contains data, usually organized in hierarchical form, that, along with data in the non-hierarchical query component (Section 3.4), serves to identify a resource within the scope of the URI's scheme and naming authority (if any). The path is terminated by the first question mark ("?") and number sign ("#") character, or by the end of the URI. (Source)
it does not state that a URI has to have a hierarchical structure assigned to it. A URI as a whole is a pointer to a resource and as such a combination of various URIs may give the impression of some hierarchy involved. The actual information of whether URIs have some hierarchical structure to it should though stem from link relations that are attached to URIs. These can be registered names like up, fist, last, next, prev and the like or Web linking extensions such as https://acme.org/rel/parent which acts more like a predicate in a Semantic Web relation basically stating that the URI at hand is a parent to the current resource. Don't confuse rel-URIs for real URIs though. Such rel-URIs do not necessarily need to point to an actual resource or even to a documentation. Such link relation extensions though my be defined by media-types or certain profiles.
In a perfect world the URI though is only used to send the request to the actual server. A client won't parse or try to extract some knowledge off an URI as it will use accompanying link relation names to determine whether the URI is of relevance to the task at hand or not. REST is full of such "indirection" mechanism in order to help decoupling clients from servers.
I.e. what is the difference between a URI like https://acme.org/api/users/1 and https://acme.org/api/3f067d90-8b55-4b60-befc-1ce124b4e080? Developers in the first case might be tempted to create a user object representing the data returned by the URI invoked. Over time the response format might break as stuff is renamed, removed and replaced by other stuff. This is what Fielding called typed resources which REST shouldn't have.
The second URI doesn't give you a clue on what content it returns, and you might start questioning on what benefit it brings then. While you might not be aware of what actual content the service returns for such URIs, you know at least that your client is able to process the data somehow as otherwise the service would have responded with a 406 Not Acceptable response. So, content-type negotiation ensures that your client will with high certainty receive data it is able to process. Maintaining interoperability in a domain that is likely to change over time is one of RESTs strong benefits and selling points. Depending on the capabilities of your client and the service, you might receive a tailored response-format, which is only applicable to that particular service, or receive a more general-purpose one, like HTML i.e.. Your client basically needs a mapping to translate the received representation format into something your application then can use. As mentioned, REST is probably all about introducing indirections for the purpose of decoupling clients from servers. The benefit for going this indirection however is that once you have it working it will work with responses issued not only from that server but for any other service that also supports returning that media type format. And just think a minute what options your client has when it supports a couple of general-purpose formats. It then can basically communicate and interoperate with various other services in that ecosystem without a need for you touching it. This is how browsers operate on the Web for decades now.
This is exactly why I think that this phrase of Fielding is probably one of the most important ones but also the one that is ignored and or misinterpreted by most in the domain of REST:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. (Source)
So, in a true REST environment the form of the URI is unimportant as clients rely on other mechanisms to determine whether to use that URI or not. Even for so called "REST APIs" that do not really care about the true meaning of REST and treat it more like old-school RPC the question at hands is probably very opinionated and there probably isn't that one fits all solution. If your framework supports injecting stuff based on the presence of certain query parameters, use that. If you prefer the more hierarchical structure of URIs, go for those. There isn't a right or wrong in such cases.
According to the URI standard when you have a hierarchical relationship between resources, then better to add it to the path instead of the query. https://datatracker.ietf.org/doc/html/rfc3986#page-22 Sometimes it is better to describe the relation itself, not just the sub-resource, but that happens only if the sub-resource can belong to multiple main resources, which is n:m relationship.

How do you represent "thin" and "fat" versions of a RESTful resource?

How would you model a resource that can have two different representations. For example, one representation may be "thin" withe most of its related resources accessible by links. Another representation may be "fat" where most of its related resources are embedded. The idea being, some clients don't mind having to make many calls to browse around the linked resources, but others want to get the data all at once.
Consider a movie resource that is associated with a director, actors, etc. Perhaps the thin version of it has the movie title only, and to get the data for the director, list of actors, etc., one must make additional requests via the embedded links to them. Perhaps the fat version of it contains all the movie nested inside, including the director's data, the data for the various actor's, etc.
How should one model this?
I see a few options:
these two representations are really two different resources and require different URIs
these two representations are in fact the same resource, and you can select between the two representations via custom media types, for example application/vnd.movie.thin+json and application/vnd.movie.fat+json.
these two representations are in fact the same resource, and selecting the different representations should be done with query parameters (e.g. /movies/1?view=thin).
Something else...
What do you consider the proper approach to this kind of API?
You could use the prefer header with the return-minimal parameter.
I prefer using Content-Type for this. You can use parameters, too:
application/vnd.myapp; profile=light
The Fielding dissertation about REST tells you about the resource interface, that you have to bind your IRIs to resources, which are entity sets. (This is different from SOAP, because by there you usually bind your IRIs to operations.)
According to Darrel Miller, the path is for describing hierarchical data and the query string is for describing non-hierarchical data in IRIs, but we use the path and the query together to identify a resource inside an API.
So based on these you have two approaches:
You can say, that the same entity with fewer properties can be mapped to a new resource with an own IRI. In this case the /movies/1?view=thin or the /movies/1/view:thin is okay.
Pros:
According to the
RDF a
property has rdf:type of rdf:Property and rdfs:Resource either, and REST has connections to the semantic web and linked data.
It is a common practice to create an IRI for a single property, for example /movies/1/title, so if we can do this by a single property, then we can do this by a collection of properties as well.
It is similar to a map reduce we already use for collection of entites: /movies/recent, etc... The only difference, that by the collection of entities we reduce a list or ordered set, and by the collection of properties we reduce a map. It is much more interesting to use the both in a combination, like: /movies/recent/title, which can return the titles of the recent movies.
Cons:
By RDF everything has an rdf:type of rdfs:Resource and maybe REST does not follow the same principles by web documents.
I haven't found anything about single properties or property collections can be or cannot be considered as resources in the dissertation, however I may accidentally skipped that section of the text (pretty dry stuff)...
You can say that the same entity with fewer properties is just a different representation of the same resource, so it should not have a different IRI. In this case you have to put your data about the preferred view to somewhere else into the request. Since by GET requests there is no body, and the HTTP method is not for storing this kind of things, the only place you can put it are the HTTP headers. By long term user specific settings you can store it on the server, or in cookies maintained by the client. By short term settings you can send it in many headers. By the content-type header you can define your own MIME type which is not recommended, because we don't like having hundreds of custom MIME types probably used by a single application only. By the content-type header you can add a profile to your MIME type as Doug Moscrop suggested. By a prefer header you can use the return-minimal settings as Darrel Miller suggested. By range headers you can do the same in theory, but I met with range headers only by pagination.
Pros:
It is certainly a RESTful approach.
Cons:
Existing HTTP frameworks not always support extracting these kind of header params, so you have to write your own short code to do that.
I cannot find anything about how these headers are affecting the client and server side caching mechanisms, so some of them may not be supported in some browsers, and by servers you have to write your own caching implementation, or find a framework which supports the header you want to use.
note: I personally prefer using the first approach, but that's just an opinion.
According to Darrel Miller the naming of the IRI does not really count by REST.
You just have to make sure, that a single IRI always points to the same resource, and that's all. The structure of the IRI does not count by client side, because your client has to meet the HATEOAS constraint if you don't want it to break by any changes of the IRI naming. This means, that always the server builds the IRIs, and the client follows these IRIs it get in a hypermedia response. This is just like using web browsers to follow link, and so browse the web... By REST you can add some semantics to your hypermedia which explains to your client what it just get. This can be some RDF vocabulary, for example schema.org, microdata, iana link relations, and so on (even your own application specific vocab)...
So using nice IRIs is not a concern by REST, it is a concern only by configuring the routing on the server side. What you have to make sure by REST IRIs, that you have a resource - IRI mapping and not an operation - IRI mapping, and you don't use IRIs for maintaining client state, for example storing user id, credentials, etc...

Is it RESTful to create complex objects in a single POST?

I have a form where users create Person records. Each Person can have several attributes -- height, weight, etc. But they can also have lists of associated data such as interests, favorite movies, etc.
I have a single form where all this data is collected. To me it seems like I should POST all of this data in a single request. But is that RESTful? My reading suggests that the interests, favorite movies and other lists should be added in separate POST requests. But I don't think that makes sense because one of those could fail and then there would be a partial insert of the Person and it may be missing their interests or favorite movies.
I'd say that it depends entirely upon the addressability and uniqueness of the dependent data.
If your user-associated data is dependent upon the user (i.e., a "distinct" string, e.g. an attribute such as a string representing an (unvalidated) name of a movie), then it should be included in the POST creation of the user representation; however, if the data is independent of the user (where the data can be addressed independently of the user, e.g. a reference, such as a movie from a set of movies) then it should be added independently.
The reasoning behind this is that reference addition when bundled with the original POST implies transactionality; that is, if another user deletes the movie reference for the "favorite" movie between when it is chosen on the client and when the POST goes through, the user add will (should by that design) fail, whereas if the "favorite" movie is not associative but is just an attribute, there's nothing to fail on (attributes (presumably) cannot be invalidated by a third party).
And again, this goes very much to your specific needs, but I fall on the side of allowing the partial inserts and indicating the failures. The proper way to handle this sort of thing if you really want to not allow partial inserts is to just implement transactions on the back end; they're the only way to truly handle a situation where a critical associated resource is removed mid-process.
The real restriction in REST is that for a modifiable resource that you GET, you can also turn around and PUT the same representation back to change its state. Or POST. Since it's reasonable (and very common) to GET resources that are big bundles of other things, it's perfectly reasonable to PUT big bundles of things, too.
Think of resources in REST very broadly. They can map one-to-one with database rows, but they don't have to. An addressable resource can embed other addressable resources, or include links to them. As long as you're honoring your representation and the semantics of the underlying protocol's operations (i.e. HTTP GET POST PUT etc.), REST doesn't have anything to say about other design considerations that might make your life easier or harder.
I don't think there is a problem with adding all data in one request as long as its inherently associated with the main resource (i.e. the person in your case). If interest, fav. movies etc are resources of their own, they should also be handled as such.

REST returning an object graph

I am new to the REST architecural design, however I think I have the basics of it covered.
I have a problem with returning objects from a RESTful call. If I make a request such as http://localhost/{type A}/{id} I will return an instance of A from the database with the specified id.
My question is what happens when A contains a collection of B objects? At the moment the XML I generate returns A with a collection of B objects inside of it. As you can imagine if the B type has a collection of C objects then the XML returned will end up being a quite complicated object graph.
I can't be 100% sure but this feels to be against the RESTful principles, the XML for A should return the fields etc. for A as well as a collection of URI's to the collection of B's that it owns.
Sorry if this is a bit confusing, I can try to elaborate more. This seems like a relatively basic question, however I can't decide which approach is "more" RESTful.
Cheers,
Aidos
One essential RESTful principle is that everything has a URI.
You have URI's like this.
/A/ and /A/id/ to get a list of A's and a specific A. The A response includes the ID's of B's.
/B/ and /B/id/ to get a list of B's and a specific B. The B response includes the ID's of C's.
/C/ and /C/id/ to get a list of C's and a specific C.
You can, through a series of queries, rebuild the A-B-C structure. You get the A, then get the relevant B's. When getting a B, you get the various C's that are referenced.
Edit
Nothing prevents you from returning more.
For example, you might have the following kinds of URI's.
/flat/A/id/, /flat/B/id/ and /flat/C/id/ to return "flat" (i.e., no depth) structures.
/deep/A/id/, /deep/B/id/ and /deep/C/id/ to return structures with complete depth.
The /deep/A/id/ would be the entire structure, in a big, nested XML document. Fine for clients that can handle it. /flat/A/id/ would be just the top level in a flat document. Best for clients that can't handle depth.
There's nothing saying your REST interface can't be relational.
/bookstore/{bookstoreID}
/bookstore/{bookstoreID}/books
/book/{bookID}
Basically, you have a 1:1 correspondence with your DB schema.
Except for Many-to-Many relations forming child lists. For example,
/bookstore/657/books should return a list of book IDs or URLs. Then if you want a specific book's data you can invoke the 3rd URL.
This is just off the top of my head, please debate the merits.
Make a flat Universe that you expose to the world.
Even when I use SOAP, which can easily handle hierarchical object graphs up to whatever depth, I flatten the graph and link everything with simple IDs (you could even use your database IDs, although the idea is that you want you don't want to expose your PKs to the world).
Your object universe inside your app is not necessarily the same one you expose to the world. Let A have children and let B have children, but there's no need to reflect that in the REST request URLs.
Why flatten? Because then you can do things like fetch the objects later by ID, or send them in bursts (both the same case, more or less)... And better than all that, the request URIs don't change when the object hierarchy changes (object 37252 is always the same, even when it's been reclassed).
Edit: Well, you asked for it... Here's the architecture I ended up using:
package: server - contains the superclasses that are shared between the front-end server and the back-end server
package: frontEndServer - contains a Server interface which the front-end server must adhere to. The interface is nice because if you decide to change from SOAP to a straight Web client (that uses JSON or whatever, as well), you've got the interface all laid out. It also contains all the implementations for the frontEnd classes that will be tossed to the client, and all the logic for the interaction between classes except how to talk to the client.
package: backEndServer - contains a Server interface which the back-end server will adhere to. An example of a Server implementation would be one that talks to a MySql DB or one that talks to an XML DB, but the Server interface is neutral. This package also contains all the classes that the implementations of the Server interface use to get work done, and all the logic for the backend except for persistence.
then you have implementation packages for each of these... which include stuff like how to persist for the backend and how to talk to the client for the front end. The front-end implementation package might know, for instance, that a user has logged in, whereas the frontEndServer just knows that it has to implement methods for creating users and logging in.
After beginning to write this up I realize that it would take a while more to describe everything, but here you have the gist of it.

Isn't resource-oriented really object-oriented?

When you think about it, doesn't the REST paradigm of being resource-oriented boil down to being object-oriented (with constrained functionality, leveraging HTTP as much as possible)?
I'm not necessarily saying it's a bad thing, but rather that if they are essentially the same very similar then it becomes much easier to understand REST and the implications that such an architecture entails.
Update: Here are more specific details:
REST resources are equivalent to public classes. Private classes/resources are simply not exposed.
Resource state is equivalent to class public methods or fields. Private methods/fields/state is simply not exposed (this doesn't mean it's not there).
While it is certainly true that REST does not retain client-specific state across requests, it does retain resource state across all clients. Resources have state, the same way classes have state.
REST resources are are globally uniquely identified by a URI in the same way that server objects are globally uniquely identified by their database address, table name and primary key. Granted there isn't (yet) a URI to represent this, but you can easily construct one.
REST is similar to OO in that they both model the world as entities that accept messages (i.e., methods) but beyond that they're different.
Object orientation emphasizes encapsulation of state and opacity, using as many different methods necessary to operate on the state. REST is about transfer of (representation of) state and transparency. The number of methods used in REST is constrained and uniform across all resources. The closest to that in OOP is the ToString() method which is very roughly equivalent to an HTTP GET.
Object orientation is stateful--you refer to an object and can call methods on it while maintaining state within a session where the object is still in scope. REST is stateless--everything you want to do with a resource is specified in a single message and all you ever need to know regarding that message is sent back in a single response.
In object-orientation, there is no concept of universal object identity--objects either get identity from their memory address at any particular moment, a framework-specific UUID, or from a database key. In REST all resources are identified with a URI and don't need to be instantiated or disposed--they always exist in the cloud unless the server responds with a 404 Not Found or 410 Gone, in whch case you know there's no resource with that URI.
REST has guarantees of safety (e.g., a GET message won't change state) and idempotence (e.g., a PUT request sent multiple times has same effect as just one time). Although some guidelines for particular object-oriented technologies have something to say about how certain constructs affect state, there really isn't anything about object orientation that says anything about safety and idempotence.
I think there's a difference between saying a concept can be expressed in terms of objects and saying the concept is the same as object orientation.
OO offers a way to describe REST concepts. That doesn't mean REST itself implements OO.
You are right. Dan Connolly wrote an article about it in 1997. The Fielding thesis also talks about it.
Objects bundle state and function together. Resource-orientation is about explicitly modeling state(data), limiting function to predefined verbs with universal semantics (In the case of HTTP, GET/PUT/POST/DELETE), and leaving the rest of the processing to the client.
There is no equivalent for these concepts in the object-orientation world.
Only if your objects are DTOs (Data Transfer Objects) - since you can't really have behavior other than persistence.
Yes, your parallel to object-orientation is correct.
The thing is, most webservices (REST, RESTful, SOAP,..) can pass information in the form of objects, so that isn't what makes it different. SOAP tends to lead to fewer services with more methods. REST tends to lead to more services (1 per resource type) with a few calls each.
Yes, REST is about transfer of objects. But it isn't the whole object; just the object's current state. The implicit assumption is that the class definitions on both sides of the REST are potentially similar; otherwise the object state has been coerced into some new object.
REST only cares about 4 events in the life on an object, create (POST), retrieve (GET), update (PUT) and delete. They're significant events, but there's only these four.
An object can participate in lots of other events with lots of other objects. All the rest of this behavior is completely outside the REST approach.
There's a close relationship -- REST moves Objects -- but saying they're the same reduces your objects to passive collections of bits with no methods.
REST is not just about objects, its also about properties :: a post request to /users/john/phone_number with a new phone number is not adding a new object, its setting a property of the user object 'john'
This is not even the whole state of the object, but only a change to a small part of the state.
It's certainly not a 1:1 match.