I can't get comfortable with defining 'good REST' URIs. The scenario is an existing site with products for sale. You can view data in a number of views, drilling down a hierarchy, but basically cat1/cat/ products, or cat 2/cat3/products or any combination of categories 1 to 4. The other products view is based on a search.
How do you form the URI's?
products/??????
Having designed a site that follows the principles of the REST architecture, my advice is to base your decision on your URI structure solely on your server design, that is how your server will parse an incoming URI and deliver the corresponding resource.
One principle of REST design (as I see it) is that your users/clients will NEVER have to form a URL to find a resource, they will instead read some hypertext (i.e. HTML) which describes resources, identify what they want, and get the appropriate link from the hypertext (for example, in HTML, the href attribute of the tag holds the URL, which could just as well be a random string of characters.
Again, in my opinion, if your application requires that a user/client be able to perform searches for arbitrarily named categories, it's probably not suitable to have a RESTfully designed interface.
You can use use a query string in your URI:
/products?categories=german,adult,foo,bar
In order to avoid multiple URIs you could enforce some logic like alphabetical ordering of categories on the server side so a request to the above URI would actually redirect via a 301 Moved Permanently response to:
/products?categories=adult,bar,foo,german
For that above query part pattern to work in browsers you will have to use JavaScript to generate the query from any html forms - you could, alternatively, do it like this if you wanted to avoid that particular problem:
/products?cat1=adult&cat2=bar&cat3=foo&cat4=german
Yes - the problem here is that you don't have a unique ID for the resource because:
products/cat1/cat4
Is actually the same page as
products/cat4/cat1
(Presumably a view of everything tagged with cat1 and cat4, or all matches from cat1 and cat4)
If your categories where organised into a hierarchy you would know that "cat6" is a child of "cat1", for example.
So your two options would be
1) Enforce a natural order for the categories
or
2) Have two different URI's essentially pointing to the same resource (with a 301 permanant redirect to your preferred resource).
Related
Lets assume we have some main-resource and a related sub-resource with 1-n relation;
User of the API can:
list main-resources so GET /main-resources endpoint.
list sub-resources so GET /sub-resources endpoint.
list sub-resources of a main-resource so one or both of;
GET /main-resources/{main-id}/sub-resources
GET /sub-resouces?main={main-id}
create a sub-resource under a main-resource
POST /main-resource/{main-id}/sub-resouces: Which has the benefit of hierarchy, but in order to support this one needs to provide another set of endpoints(list, create, update, delete).
POST /sub-resouces?main={main-id}: Which has the benefit of having embedded id inside URL. A middleware can handle and inject provided values into request itself.
create a sub-resource with all parameters in body POST /sub-resources
Is providing a URI with main={main-id} query parameter embedded a good way to solve this or should I go with the route of hierarchical URI?
In a true REST environment the spelling of URIs is not of importance as long as the characters used in the URI adhere to the URI specification. While RFC 3986 states that
The path component contains data, usually organized in hierarchical form, that, along with data in the non-hierarchical query component (Section 3.4), serves to identify a resource within the scope of the URI's scheme and naming authority (if any). The path is terminated by the first question mark ("?") and number sign ("#") character, or by the end of the URI. (Source)
it does not state that a URI has to have a hierarchical structure assigned to it. A URI as a whole is a pointer to a resource and as such a combination of various URIs may give the impression of some hierarchy involved. The actual information of whether URIs have some hierarchical structure to it should though stem from link relations that are attached to URIs. These can be registered names like up, fist, last, next, prev and the like or Web linking extensions such as https://acme.org/rel/parent which acts more like a predicate in a Semantic Web relation basically stating that the URI at hand is a parent to the current resource. Don't confuse rel-URIs for real URIs though. Such rel-URIs do not necessarily need to point to an actual resource or even to a documentation. Such link relation extensions though my be defined by media-types or certain profiles.
In a perfect world the URI though is only used to send the request to the actual server. A client won't parse or try to extract some knowledge off an URI as it will use accompanying link relation names to determine whether the URI is of relevance to the task at hand or not. REST is full of such "indirection" mechanism in order to help decoupling clients from servers.
I.e. what is the difference between a URI like https://acme.org/api/users/1 and https://acme.org/api/3f067d90-8b55-4b60-befc-1ce124b4e080? Developers in the first case might be tempted to create a user object representing the data returned by the URI invoked. Over time the response format might break as stuff is renamed, removed and replaced by other stuff. This is what Fielding called typed resources which REST shouldn't have.
The second URI doesn't give you a clue on what content it returns, and you might start questioning on what benefit it brings then. While you might not be aware of what actual content the service returns for such URIs, you know at least that your client is able to process the data somehow as otherwise the service would have responded with a 406 Not Acceptable response. So, content-type negotiation ensures that your client will with high certainty receive data it is able to process. Maintaining interoperability in a domain that is likely to change over time is one of RESTs strong benefits and selling points. Depending on the capabilities of your client and the service, you might receive a tailored response-format, which is only applicable to that particular service, or receive a more general-purpose one, like HTML i.e.. Your client basically needs a mapping to translate the received representation format into something your application then can use. As mentioned, REST is probably all about introducing indirections for the purpose of decoupling clients from servers. The benefit for going this indirection however is that once you have it working it will work with responses issued not only from that server but for any other service that also supports returning that media type format. And just think a minute what options your client has when it supports a couple of general-purpose formats. It then can basically communicate and interoperate with various other services in that ecosystem without a need for you touching it. This is how browsers operate on the Web for decades now.
This is exactly why I think that this phrase of Fielding is probably one of the most important ones but also the one that is ignored and or misinterpreted by most in the domain of REST:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. (Source)
So, in a true REST environment the form of the URI is unimportant as clients rely on other mechanisms to determine whether to use that URI or not. Even for so called "REST APIs" that do not really care about the true meaning of REST and treat it more like old-school RPC the question at hands is probably very opinionated and there probably isn't that one fits all solution. If your framework supports injecting stuff based on the presence of certain query parameters, use that. If you prefer the more hierarchical structure of URIs, go for those. There isn't a right or wrong in such cases.
According to the URI standard when you have a hierarchical relationship between resources, then better to add it to the path instead of the query. https://datatracker.ietf.org/doc/html/rfc3986#page-22 Sometimes it is better to describe the relation itself, not just the sub-resource, but that happens only if the sub-resource can belong to multiple main resources, which is n:m relationship.
Consider a need to create a GET endpoint for fetching Member details using either of 4 options (It's common in legacy application with RPC calls)
Get member by ID
Get member by SSN
Get member by a combination of Phone and LastName (both must be passed)
What's a recommended strategy to live the REST spirit and yet provide this flexibility?
Some options I could think of are:
Parameters Based
/user/{ID}
/user?ssn=?
/user?phone=?&lname=?
Separate Endpoints
/user/{ID}
/user/SSN/{SSNID}
/user/{lname}/{phone}
RPC for custom
/user/{ID}
/user/findBySSN/
/user/findbycontact/
REST doesn't care what spelling you use for your identifiers.
For example, think about how you would do this on the web. You would provide forms, one for each set of search criteria. The consumer would choose which form to use, and submit the form, without ever knowing what the URI is.
In the case of HTML forms, there are specific processing rules for describing how the form information will be copied into the URI. The form takes on the aspect of a URI Template.
A URI Template provides both a structural description of a URI space and, when variable values are provided, machine-readable instructions on how to construct a URI corresponding to those values.
But there aren't any rules saying that restrict the server from providing a URI template that directs the client to copy the variable values into path segments rather than into the query string.
In other words, in REST, the server retains control of its own URI space.
You might sometimes prefer to use path segments because of their hierarchical nature, which may be convenient if you want the client to use relative resolution of relative references in your representations.
REST ≠ pretty URLs. The two are orthogonal.
Your question is about the latter, I feel.
Whilst the other answers have been good, if you want your API to work with HTML forms, go with query parameters on the collection /user resource for all fields, even for ID (assuming a human is typing these in based on information they are getting from sheets of paper on their desk, etc.)
If your server is able to produce links to each record, always produce canonical links such as /users/{id}, don't duplicate data under different URLs.
How would you model a resource that can have two different representations. For example, one representation may be "thin" withe most of its related resources accessible by links. Another representation may be "fat" where most of its related resources are embedded. The idea being, some clients don't mind having to make many calls to browse around the linked resources, but others want to get the data all at once.
Consider a movie resource that is associated with a director, actors, etc. Perhaps the thin version of it has the movie title only, and to get the data for the director, list of actors, etc., one must make additional requests via the embedded links to them. Perhaps the fat version of it contains all the movie nested inside, including the director's data, the data for the various actor's, etc.
How should one model this?
I see a few options:
these two representations are really two different resources and require different URIs
these two representations are in fact the same resource, and you can select between the two representations via custom media types, for example application/vnd.movie.thin+json and application/vnd.movie.fat+json.
these two representations are in fact the same resource, and selecting the different representations should be done with query parameters (e.g. /movies/1?view=thin).
Something else...
What do you consider the proper approach to this kind of API?
You could use the prefer header with the return-minimal parameter.
I prefer using Content-Type for this. You can use parameters, too:
application/vnd.myapp; profile=light
The Fielding dissertation about REST tells you about the resource interface, that you have to bind your IRIs to resources, which are entity sets. (This is different from SOAP, because by there you usually bind your IRIs to operations.)
According to Darrel Miller, the path is for describing hierarchical data and the query string is for describing non-hierarchical data in IRIs, but we use the path and the query together to identify a resource inside an API.
So based on these you have two approaches:
You can say, that the same entity with fewer properties can be mapped to a new resource with an own IRI. In this case the /movies/1?view=thin or the /movies/1/view:thin is okay.
Pros:
According to the
RDF a
property has rdf:type of rdf:Property and rdfs:Resource either, and REST has connections to the semantic web and linked data.
It is a common practice to create an IRI for a single property, for example /movies/1/title, so if we can do this by a single property, then we can do this by a collection of properties as well.
It is similar to a map reduce we already use for collection of entites: /movies/recent, etc... The only difference, that by the collection of entities we reduce a list or ordered set, and by the collection of properties we reduce a map. It is much more interesting to use the both in a combination, like: /movies/recent/title, which can return the titles of the recent movies.
Cons:
By RDF everything has an rdf:type of rdfs:Resource and maybe REST does not follow the same principles by web documents.
I haven't found anything about single properties or property collections can be or cannot be considered as resources in the dissertation, however I may accidentally skipped that section of the text (pretty dry stuff)...
You can say that the same entity with fewer properties is just a different representation of the same resource, so it should not have a different IRI. In this case you have to put your data about the preferred view to somewhere else into the request. Since by GET requests there is no body, and the HTTP method is not for storing this kind of things, the only place you can put it are the HTTP headers. By long term user specific settings you can store it on the server, or in cookies maintained by the client. By short term settings you can send it in many headers. By the content-type header you can define your own MIME type which is not recommended, because we don't like having hundreds of custom MIME types probably used by a single application only. By the content-type header you can add a profile to your MIME type as Doug Moscrop suggested. By a prefer header you can use the return-minimal settings as Darrel Miller suggested. By range headers you can do the same in theory, but I met with range headers only by pagination.
Pros:
It is certainly a RESTful approach.
Cons:
Existing HTTP frameworks not always support extracting these kind of header params, so you have to write your own short code to do that.
I cannot find anything about how these headers are affecting the client and server side caching mechanisms, so some of them may not be supported in some browsers, and by servers you have to write your own caching implementation, or find a framework which supports the header you want to use.
note: I personally prefer using the first approach, but that's just an opinion.
According to Darrel Miller the naming of the IRI does not really count by REST.
You just have to make sure, that a single IRI always points to the same resource, and that's all. The structure of the IRI does not count by client side, because your client has to meet the HATEOAS constraint if you don't want it to break by any changes of the IRI naming. This means, that always the server builds the IRIs, and the client follows these IRIs it get in a hypermedia response. This is just like using web browsers to follow link, and so browse the web... By REST you can add some semantics to your hypermedia which explains to your client what it just get. This can be some RDF vocabulary, for example schema.org, microdata, iana link relations, and so on (even your own application specific vocab)...
So using nice IRIs is not a concern by REST, it is a concern only by configuring the routing on the server side. What you have to make sure by REST IRIs, that you have a resource - IRI mapping and not an operation - IRI mapping, and you don't use IRIs for maintaining client state, for example storing user id, credentials, etc...
So, I'm building a Web site application that will comprise a small set of content files each dedicated to a general purpose, rather than a large set of files each dedicated to a particular piece of information. In other words, instead of /index.php, /aboutme.php, /contact.php, etc., there would just be /index.php (basically just a shell with HTML and ) and then content.php, error.php, 404.php, etc.
To deliver content, I plan to store "directory structures" and associated content in a data table, capture URIs, and then query the data table to see if the URI matches a stored "directory structure". If there's a match, the associated content will be returned to the application, which will use Pear's HTTP_Request2 to send a request to content.php. Then, content.php will display the appropriate content that was returned from the database.
EDIT 1: For example:
a user types www.somesite.com/contact/ into their browser
index.php loads
a script upstream of the HTML header on index.php does the following:
submits a mysql query and looks for WHERE path = $_SERVER[REQUEST_URI]
if it finds a match, it sets $content = $dbResults->content and POSTs $content to /pages/content.php for display. The original URI is preserved, although /pages/content.php and /index.php are actually delivering the content
if it finds no match, the contents of /pages/404.php are returned to the user. Here, too, the original URI is preserved even though index.php and /pages/404.php are actually delivering the content.
Is this a RESTful approach? On the one hand, the URI is being used to access a particular resource (a tuple in a data table), but on the other hand, I guess I always thought of a "resource" as an actual file in an actual directory.
I'm not looking to be a REST purist, I'm just really delving into the more esoteric aspects of and approaches to working with HTTP and am looking to refine my knowledge and understanding...
OK, my conclusion is that there is nothing inherently unRESTful about my approach, but how I use the URIs to access the data tables seems to be critical. I don't think it's in the spirit of REST to store a resource's full path in that resource's row in a data table.
Instead, I think it's important to have a unique tuple for each "directory" referenced in a URI. The way I'm setting this up now is to create a "collection" table and a "resource" table. A collection is a group of resources, and a resource is a single entity (a content page, an image, etc., or even a collection, which allows for nesting and taxonomic structuring).
So, for instance, /portfolio would correspond with a portfolio entry in the collection table and the resource table, as would /portfolio/spec-ads; however, /portfolio/spec-ads/hersheys-spec-ad would correspond to an entry only in the resource table. That entry would contain, say, embed code for a Hershey's spec ad on YouTube.
I'm still working out an efficient way to build a query from a parsed URI with multiple "directories," but I'm getting there. The next step will be to work the other way and build a query that constructs a nav system with RESTful URIs. Again, though, I think the approach I laid out in the original question is RESTful, so long as I properly correlate URIs, queries, and the data architecture.
The more I walk down this path, the more I like it...
Which one of these URIs would be more 'fit' for receiving POSTs (adding product(s))? Are there any best practices available or is it just personal preference?
/product/ (singular)
or
/products/ (plural)
Currently we use /products/?query=blah for searching and /product/{productId}/ for GETs PUTs & DELETEs of a single product.
Since POST is an "append" operation, it might be more Englishy to POST to /products, as you'd be appending a new product to the existing list of products.
As long as you've standardized on something within your API, I think that's good enough.
Since REST APIs should be hypertext-driven, the URI is relatively inconsequential anyway. Clients should be pulling URIs from returned documents and using those in subsequent requests; typically applications and people aren't going to need to guess or visually interpret URIs, since the application will be explicitly instructing clients what resources and URIs are available.
Typically you use POST to create a resource when you don't know the identifier of the resource in advance, and PUT when you do. So you'd POST to /products, or PUT to /products/{new-id}.
With both of these you'll return 201 Created, and with the POST additionally return a Location header containing the URL of the newly created resource (assuming it was successfully created).
In RESTful design, there are a few patterns around creating new resources. The pattern that you choose largely depends on who is responsible for choosing the URL for the newly created resource.
If the client is responsible for choosing the URL, then the client should PUT to the URL for the resource. In contrast, if the server is responsible for the URL for the resource then the client should POST to a "factory" resource. Typically the factory resource is the parent resource of the resource being created and is usually a collection which is pluralized.
So, in your case I would recommend using /products
You POST or GET a single thing: a single PRODUCT.
Sometimes you GET with no specific product (or with query criteria). But you still say it in the singular.
You rarely work plural forms of names. If you have a collection (a Catalog of products), it's one Catalog.
I would only post to the singular /product. It's just too easy to mix up the two URL-s and get confused or make mistakes.
As many said, you can probably choose any style you like as long as you are consistent, however I'd like to point out some arguments on both sides; I'm personally biased towards singular
In favor of plural resource names:
simplicity of the URL scheme as you know the resource name is always at plural
many consider this convention similar to how databases tables are addressed and consider this an advantage
seems to be more widely adopted
In favor of singular resource names (this doesn't exclude plurals when working on multiple resources)
the URL scheme is more complex but you gain more expressivity
you always know when you are dealing with one or more resources based on the resource name, as opposed to check whether the resource has an additional Id path component
plural is sometimes harder for non-native speakers (when is not simply an "s")
the URL is longer
the "s" seems to be a redundant from a programmers' standpoint
is just awkward to consider the path parameter as a sub-resource of the collection as opposed to consider it for what it is: simply an ID of the resource it identifies
you can apply the filtering parameters only where they are needed (endpoint with plural resource name)
you could use the same url for all of them and use the MessageContext to determine what type of action the caller of the web service wanted to perform.
No language was specified but in Java you can do something like this.
WebServiceContext ws_ctx;
MessageContext ctx = ws_ctx.getMessageContext();
String action = (String)ctx.get(MessageContext.HTTP_REQUEST_METHOD);
if(action.equals("GET")
// do something
else if(action.equals("POST")
// do something
That way you can check the type of request that was sent to the web service and perform the appropriate action based upon the request method.