RESTful API design - using a resource URI vs an ID

This is my first post, so please bear with me.
I am designing a new RESTful API and I have two design choices in how my clients interact with resources that they create.
As an example, I have a resource: "book", which is a simple, singleton resource.
Creating a new book is very simple:
I know I can also use PUT if I want the operation to be idempotent.
This question is solely about the 200 OK response options, returning either:
an anonymous resource identifier (UUID) of the created "book":
book_id = 12345-67890
title = "a fantastic story"
a full FQDN URI to the created "book":
book_uri = "
title = "a fantastic story"
This of course significantly affects the subsequent manipulation of the "book" by the client.
To get the title of the above book, the client API calls would be either:
Example: GET
Notes: The client will always use the same endpoint as the POST call, with the book-id simply appended.
GET {book-uri}
Example: GET
Notes: The client will use the {book-uri} object variable directly from the POST response. Importantly, the returned {book-uri} may be a completely different URI to that of the POST used to create the "book".
So my questions (please) are:
Q1) which is the better model for the client to use and why?
Q2) can you see any issues with using Option 2 in a high volume, commercial system?
Thanks for any help and answers in advance.

can you see any issues with using Option 2 in a high volume, commercial system?
So, Option 2, where the HTTP response includes a URI for the newly created resource, is how the web itself works, and the web seems to be doing pretty well as a high volume commercial system.
Note also that option #2 allows the server to control its URIs. For instance, if you later decide that you want to revise the resource model, and use different spellings for the resource identifiers, then you can do that without needing to make any changes to the client.
You can also introduce, for example, a URI shortening component, because again you've got an identifier with standardized rules for how it works.
You don't necessarily need to use a full URI - we've also got standardized rules (RFC 3986) for how a relative reference can be resolved against a base URI in a given context, so you'll likely have options like
book_uri = "/upstairs/book/12345-67890",
title = "a fantastic story"
... depending on whether or not the book resource is staged on the same host as the resource that handles the POST request.
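A minimal sketch of that resolution step, using Python's standard `urllib.parse.urljoin`. The host `api.example.com` and the POST endpoint path are assumptions for illustration; only the relative reference comes from the example above.

```python
from urllib.parse import urljoin

# Hypothetical URI of the resource that handled the POST request.
post_uri = "https://api.example.com/books"

# Relative reference returned in the response body (from the example above).
book_ref = "/upstairs/book/12345-67890"

# Resolve the reference against the base URI, following the RFC 3986 rules.
book_uri = urljoin(post_uri, book_ref)
print(book_uri)  # https://api.example.com/upstairs/book/12345-67890
```

The client never has to know how the server spells its URIs; it only applies the generic resolution rules to whatever reference it was given.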
Is this better? That's going to depend on what tradeoffs you need to make, and how much you value each of the benefits versus the costs.
The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction. -- Fielding, 2000


REST API: Does validation on identifiers break encapsulation? [closed]

I figured I'd post here to get some ideas/feedback on something I've come up against recently. The API I've developed has validation on an identifier that's passed through as a path parameter:
e.g. /resource/resource_identifier
There are some specific business rules as to what makes an identifier valid, and my API has validation which enforces these rules and returns a 400 when they are violated.
Now the reason I'm writing this is that I've been doing this sort of thing in every REST (ish) API I've ever written. It's kind of ingrained in me now but recently I've been told that this is 'bad' and that it breaks encapsulation. Furthermore, it does this by forcing a consumer to have knowledge about the format of an identifier. I'm told that I should be returning a 404 instead and simply accept anything as an identifier.
We've had some pretty heated debates about this and what encapsulation actually means in the context of REST. I've found numerous definitions but they aren't specific. As with any REST contention it's hard to substantiate an argument for either.
If StackOverflow would allow me, I'd like to try and gain a consensus on this and why APIs like Spotify, for example, use 400 in this scenario.
I've been doing this sort of thing in every REST (ish) API I've ever written. It's kind of ingrained in me now but recently I've been told that this is 'bad'
In the context of HTTP, it is an "anti-pattern", yes.
I'm told that I should be returning a 404 instead
And that is the right pattern when you want the advantages of responding like a general purpose web server.
Here's the point: if you want general purpose components in the HTTP application to be able to do sensible things with your response messages, then you need to provide them with the appropriate meta data.
In the case of a target resource identifier that satisfies the request-target production rules defined in RFC 9112 but is otherwise unsatisfactory, you can choose any response semantics you want (400? 403? 404? 499? 200?).
But if you choose 404, then general purpose components will know that the response is an error that can be re-used for other requests (under appropriate conditions - see RFC 9111).
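As a rough sketch of what "general purpose components will know" means here: RFC 9110 lists the status codes that are heuristically cacheable, and 404 is on that list while 400 is not. The helper name `reusable_by_default` is made up for this illustration.

```python
# Status codes that RFC 9110 (section 15.1) defines as heuristically cacheable.
HEURISTICALLY_CACHEABLE = frozenset(
    {200, 203, 204, 206, 300, 301, 308, 404, 405, 410, 414, 501}
)


def reusable_by_default(status: int) -> bool:
    """Return True if a cache may store the response without explicit
    freshness information (still subject to the other rules in RFC 9111)."""
    return status in HEURISTICALLY_CACHEABLE


print(reusable_by_default(404))  # True
print(reusable_by_default(400))  # False
```

A 400 response can still be cached if the server adds explicit cache directives; the difference is only in what a generic cache may do by default.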
why APIs like Spotify for example, use 400 in this scenario.
Remember: engineering is about trade offs.
The benefits of caching may not outweigh more cost effective request processing, or more efficient incident analysis, or ....
It's also possible that it's just habit - it's done that way because that's the way that they have always done it; or because they were taught it as a "best practice", or whatever. One of the engineering trade offs we need to consider is whether or not to invest in analyzing a trade off!
An imperfect system that ships earns more market share than a perfect solution that doesn't.
While it may sound natural to expose the resource's internal ID as the ID used in the URI, remember that the whole URI itself is the identifier of a resource and not only the last bit of the URI. Clients are usually also not interested in the characters that form the URI (or at least they shouldn't care about them) but only in the state they receive upon requesting it from the API/server.
Further, if you think long-term, which should be the reason why you want to build your design on top of a REST architecture, is there a chance that the internal identifier of a resource could ever change? If so, introducing an indirection could make more sense, e.g. by using UUIDs instead of product IDs in the URI and keeping a further table/collection that maps each UUID to the domain object ID. Think of a resource that exposes some data of a product. It may sound like a good idea to use the product ID at the end of the URI, as it identifies the product in your domain model clearly. But what happens if your company merges with another company that happens to have an overlap in products but uses different identifiers than you? I've seen such cases in reality, unfortunately, and almost all of them wanted to avoid change for their clients and so had to support multiple URIs for the same products in the end.
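A minimal sketch of such an indirection, assuming an in-memory mapping stands in for the extra table/collection. The class name `ResourceDirectory`, the `/products/` path and the product IDs are hypothetical.

```python
import uuid
from typing import Dict, Optional


class ResourceDirectory:
    """Maps public UUIDs (used in URIs) to internal domain IDs."""

    def __init__(self) -> None:
        self._public_to_internal: Dict[str, str] = {}

    def publish(self, internal_id: str) -> str:
        """Create a public identifier for an internal one."""
        public_id = str(uuid.uuid4())
        self._public_to_internal[public_id] = internal_id
        return public_id

    def resolve(self, public_id: str) -> Optional[str]:
        """Look up the internal ID, or None if the public ID is unknown."""
        return self._public_to_internal.get(public_id)

    def remap(self, public_id: str, new_internal_id: str) -> None:
        """Point an existing public ID at a new internal ID."""
        if public_id not in self._public_to_internal:
            raise KeyError(public_id)
        self._public_to_internal[public_id] = new_internal_id


directory = ResourceDirectory()
public_id = directory.publish("PROD-1001")
uri = f"/products/{public_id}"

# After a merger the internal ID changes, but the URI handed out stays valid.
directory.remap(public_id, "MERGED-77")
print(directory.resolve(public_id))  # MERGED-77
```

The cost is one extra lookup per request; the benefit is that renumbering the domain model never invalidates a URI a client already holds.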
This is exactly why Mike Amundsen said
... your data model is not your object model is not your resource model ... (Source)
REST is full of such indirection mechanisms to allow such systems to avoid coupling. For example, besides the mechanism mentioned above, you also have link relations, which allow servers to switch URIs when needed while clients can still look up the URI via the exposed relation name, and the focus on negotiated media types and their representation formats rather than forcing clients to speak an API-specific, RPC-like, plain-JSON slang.
Jim Webber further coined the term domain application protocol to describe that HTTP is an application protocol for exchanging documents and any business rules we infer are just side effects of the actual document management performed by HTTP. So all we do in "REST" is basically to send documents back and forth and infer some business logic to act upon receiving certain documents.
In regards to encapsulation, this isn't the scope of REST nor HTTP. What data you return depends on your business needs and/or on the capabilities of the representation formats exchanged. If a certain media-type isn't able to express a certain capability, providing such data to clients might not make much sense.
In general, I would recommend not using domain-internal IDs as part of URIs, for the reasons mentioned above. Usually that information should be part of the exchanged payload to give users/customers the option to refer to those resources on other channels like e-mail, telephone, ... Of course, that depends on the resource and its purpose at hand. As a user I'd prefer to refer to myself with my full name rather than some internal user or customer ID or the like.
edit: sorry, missed the validation aspect ...
If you expect user/client input on the server/API side, you should always validate the data before starting to process it. Usually though, URIs are provided by the server and should only trigger business activities if the requested URI matches one of your defined rules. In general, most frameworks will respond with a 400 Bad Request when they cannot map the URI to a concrete action, giving the client a chance to correct its mistake and reissue the updated request. As URIs shouldn't be generated or altered by clients anyway, validating such parameters might be unnecessary overhead unless they introduce security risks. A better approach might then be to tighten the rules that map URIs to actions and let the framework respond with a 400 when clients use something they aren't supposed to.
Encapsulation makes sense when we want to hide data and implementation behind an interface. Here we want to expose the structure of the data, because it is for communication, not for storage, and the service certainly needs this communication in order to function. Validation of data is a very basic concept, because it makes the service reliable and because it protects against hacking attempts. The id here is a parameter and checking its structure is just parameter validation, which should return 400 if it fails. This is not restricted to the body of the request; the problem can be anywhere in the HTTP message, as you can read below. Another argument against 404 is that the requested resource cannot possibly exist, because we are talking about a malformed id and so a malformed URI. It is very important to validate every user input, because a malformed parameter can be used for injections, e.g. for SQL injection, if it is not validated.
The HyperText Transfer Protocol (HTTP) 400 Bad Request response status
code indicates that the server cannot or will not process the request
due to something that is perceived to be a client error (for example,
malformed request syntax, invalid request message framing, or
deceptive request routing).
The HTTP 404 Not Found response status code indicates that the server
cannot find the requested resource. Links that lead to a 404 page are
often called broken or dead links and can be subject to link rot.
A 404 status code only indicates that the resource is missing: not
whether the absence is temporary or permanent. If a resource is
permanently removed, use the 410 (Gone) status instead.
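A sketch of the distinction argued for here, assuming a made-up identifier rule ("BK-" followed by six digits) and an in-memory dictionary in place of the database. Neither is from the original question.

```python
import re
from typing import Dict, Tuple

# Hypothetical business rule: identifiers are "BK-" followed by six digits.
ID_PATTERN = re.compile(r"BK-\d{6}")

# Hypothetical in-memory store standing in for the database.
BOOKS: Dict[str, dict] = {"BK-000001": {"title": "a fantastic story"}}


def get_book(resource_id: str) -> Tuple[int, dict]:
    """Return (status, body): 400 for a malformed id, 404 for a well-formed
    id with no matching resource, 200 otherwise."""
    if ID_PATTERN.fullmatch(resource_id) is None:
        return 400, {"error": "malformed identifier"}
    book = BOOKS.get(resource_id)
    if book is None:
        return 404, {"error": "not found"}
    return 200, book


print(get_book("BK-000001")[0])  # 200
print(get_book("BK-999999")[0])  # 404
print(get_book("1 OR 1=1")[0])   # 400
```

The malformed value is rejected before it gets anywhere near a query, which is the parameter-validation point made above.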
In the case of REST we describe the interface using the HTTP protocol, URI standard, MIME types, etc. instead of the actual programming language, because they are language-independent standards. As for your specific case, it would be worth checking the uniform interface constraints, including the HATEOAS constraint, because if your service builds the URIs as it should, then it is clear that a malformed id is something malicious. As for Spotify and other APIs, 99% of them are not REST APIs, maybe REST-ish. Read the Fielding dissertation and standards instead of trying to figure it out based on SO answers and examples. So this is a classic RTFM situation.
In the context of REST a very simple example of data hiding is storing a number something like:
PUT /x {"value": "111"} "content-type:application/vnd.example.binary+json"
GET /x "accept:application/vnd.example.decimal+json" -> {"value": 7}
Here we don't expose how we store the data. We just send the binary and decimal representations of it. This is called data hiding. In the case of the id, it does not make sense to have an external id and convert it to an internal id, which is why you use the same one in your database, but it is fine to check whether its structure is valid. Normally you validate it and convert it into a DTO.
Implementation hiding is more complicated in this context. It is a matter of avoiding micromanagement of the service by its consumers and instead implementing a new feature when a pattern of requests occurs frequently. It might involve consumer surveys about what features they need, checking logs, figuring out why certain consumers send far too many messages, and working out how to merge those messages into a single one. For example we have a math service:
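The PUT/GET pair above could be sketched like this, assuming the two vendor media types simply select between a binary-string and a decimal representation. The function names are invented for the illustration.

```python
from typing import Dict

# Internal storage: clients never see how the value is kept.
_store: Dict[str, int] = {}


def put_binary(name: str, body: dict) -> None:
    """Handle PUT with the (hypothetical) binary+json representation."""
    _store[name] = int(body["value"], 2)


def get_decimal(name: str) -> dict:
    """Handle GET when the client accepts the decimal+json representation."""
    return {"value": _store[name]}


def get_binary(name: str) -> dict:
    """Handle GET when the client accepts the binary+json representation."""
    return {"value": format(_store[name], "b")}


put_binary("x", {"value": "111"})
print(get_decimal("x"))  # {'value': 7}
print(get_binary("x"))   # {'value': '111'}
```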
PUT /x 7
PUT /y 8
PUT /z 9
PUT /s 0
PATCH /s {"add": "x"}
PATCH /s {"add": "y"}
PATCH /s {"add": "z"}
GET /s -> 24
POST /expression {"sum": [7,8,9]} -> 24
If you want to translate between structured programming, OOP and REST, then it is something like this:
Number countCartTotal(CartId cartId);
interface iCart {
    Number countTotal();
}
GET api/cart/{cartid}/total -> {total}
So an endpoint represents an exposed operation, something like verbNoun(details), e.g. countCartTotal(cartId), which you can split into verb=countTotal, noun=cart, details=cartId and build the URI from it. The verb must be transformed into an HTTP method. In this case using GET makes the most sense, because we need data instead of sending data. The rest of the verb must be transformed into a noun, so countTotal -> GET totalCount. Then you can merge the two nouns: totalCount + cart -> cartTotal. Then you can build a URI template based on the resulting noun and the details: cartTotal + cartId -> cart/{cartid}/total, and you are done with the endpoint design GET {root}/cart/{cartid}/total. Now you can bind it to countCartTotal(cartId) or to repo.resource(iCart, cartId).countTotal().
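The final binding step might look roughly like this. It is a sketch with a hand-rolled route table rather than any particular framework; the cart contents and the `dispatch` helper are assumptions.

```python
import re

# Hypothetical cart data: cart id -> item prices.
CARTS = {"42": [7, 8, 9]}


def count_cart_total(cart_id: str) -> int:
    """The structured-programming form of the operation."""
    return sum(CARTS[cart_id])


# (method, compiled path pattern, handler) for GET api/cart/{cartid}/total
ROUTES = [
    ("GET", re.compile(r"/api/cart/(?P<cartid>[^/]+)/total"), count_cart_total),
]


def dispatch(method: str, path: str):
    """Find the route matching method and path and call its handler."""
    for route_method, pattern, handler in ROUTES:
        match = pattern.fullmatch(path)
        if route_method == method and match:
            return handler(match.group("cartid"))
    return None  # no matching endpoint


print(dispatch("GET", "/api/cart/42/total"))  # 24
```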
So I think if the structure of the id does not change, then you can even add it to the API documentation if you want to. Though it is not necessary to do so.
From a security perspective you can return 404 if the only possible reason to send such a request is a hacking attempt, so the hacker won't know for certain why it failed and you don't expose details of the protection. In this situation it would be overthinking the problem, but in certain scenarios it makes sense, e.g. where the API can leak data. For example, when you send a password reset link, a web application usually asks for an email address and most of them send an error message if it is not registered. This can be used to check whether somebody is registered on the site, so it is better to hide these kinds of errors. I guess in your case the id is not something sensitive, and if you have proper access control, then even if a hacker knows the id, they cannot do much with that information.
Another possible aspect is what happens if the structure of the id changes. In that case we write different validation code, which allows only the new structure or maybe both structures, and make a new version of the API with v2/api and v2/docs root and documentation URIs.
So I fully support your point of view and I think the other developer you mentioned does not even understand OOP and encapsulation, not to mention webservices and REST APIs.

Sub-resource creation url

Let's assume we have some main-resource and a related sub-resource with a 1-n relation.
User of the API can:
list main-resources so GET /main-resources endpoint.
list sub-resources so GET /sub-resources endpoint.
list sub-resources of a main-resource so one or both of;
GET /main-resources/{main-id}/sub-resources
GET /sub-resources?main={main-id}
create a sub-resource under a main-resource
POST /main-resources/{main-id}/sub-resources: Which has the benefit of hierarchy, but in order to support this one needs to provide another set of endpoints (list, create, update, delete).
POST /sub-resources?main={main-id}: Which has the benefit of having the id embedded inside the URL. A middleware can handle and inject the provided values into the request itself.
create a sub-resource with all parameters in body POST /sub-resources
Is providing a URI with main={main-id} query parameter embedded a good way to solve this or should I go with the route of hierarchical URI?
In a true REST environment the spelling of URIs is not of importance as long as the characters used in the URI adhere to the URI specification. While RFC 3986 states that
The path component contains data, usually organized in hierarchical form, that, along with data in the non-hierarchical query component (Section 3.4), serves to identify a resource within the scope of the URI's scheme and naming authority (if any). The path is terminated by the first question mark ("?") or number sign ("#") character, or by the end of the URI. (Source)
it does not state that a URI has to have a hierarchical structure assigned to it. A URI as a whole is a pointer to a resource, and as such a combination of various URIs may give the impression of some hierarchy involved. The actual information on whether URIs have some hierarchical structure should, though, stem from link relations that are attached to URIs. These can be registered names like up, first, last, next, prev and the like, or a Web linking extension, i.e. a URI used as the relation name, which acts more like a predicate in a Semantic Web relation, for instance stating that the URI at hand is a parent to the current resource. Don't confuse rel-URIs with real URIs though. Such rel-URIs do not necessarily need to point to an actual resource or even to documentation. Such link relation extensions may, though, be defined by media types or certain profiles.
In a perfect world the URI is only used to send the request to the actual server. A client won't parse or try to extract some knowledge from a URI, as it will use the accompanying link relation names to determine whether the URI is of relevance to the task at hand or not. REST is full of such "indirection" mechanisms in order to help decouple clients from servers.
I.e. what is the difference between a URI like and Developers in the first case might be tempted to create a user object representing the data returned by the URI invoked. Over time the response format might break as stuff is renamed, removed and replaced by other stuff. This is what Fielding called typed resources which REST shouldn't have.
The second URI doesn't give you a clue about what content it returns, and you might start questioning what benefit it brings. While you might not be aware of what actual content the service returns for such URIs, you know at least that your client is able to process the data somehow, as otherwise the service would have responded with a 406 Not Acceptable response. So content-type negotiation ensures that your client will with high certainty receive data it is able to process. Maintaining interoperability in a domain that is likely to change over time is one of REST's strong benefits and selling points. Depending on the capabilities of your client and the service, you might receive a tailored response format, which is only applicable to that particular service, or a more general-purpose one, like HTML for example. Your client basically needs a mapping to translate the received representation format into something your application can then use. As mentioned, REST is largely about introducing indirections for the purpose of decoupling clients from servers. The benefit of this indirection is that once you have it working, it will work with responses issued not only by that server but by any other service that also supports returning that media type format. And just think for a minute about what options your client has when it supports a couple of general-purpose formats. It can then basically communicate and interoperate with various other services in that ecosystem without a need for you to touch it. This is how browsers have operated on the Web for decades now.
This is exactly why I think that this phrase of Fielding is probably one of the most important ones but also the one that is ignored and/or misinterpreted by most in the domain of REST:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. (Source)
So, in a true REST environment the form of the URI is unimportant, as clients rely on other mechanisms to determine whether to use that URI or not. Even for so-called "REST APIs" that do not really care about the true meaning of REST and treat it more like old-school RPC, the question at hand is probably very opinionated and there probably isn't a one-size-fits-all solution. If your framework supports injecting stuff based on the presence of certain query parameters, use that. If you prefer the more hierarchical structure of URIs, go for those. There isn't a right or wrong in such cases.
According to the URI standard, when you have a hierarchical relationship between resources, it is better to add it to the path instead of the query. Sometimes it is better to describe the relation itself, not just the sub-resource, but that happens only if the sub-resource can belong to multiple main resources, which is an n:m relationship.

Is it a good practice to use 'createModel' in REST?

I'm looking for the best way to implement an endpoint of a RESTful application that will be responsible for creating new library orders. Let's assume that I have the following resources.
If I want to get all books of a particular author I can use the next endpoint:
If I want to fetch all orders of a particular book I can use the endpoint provided below:
My question is what will be the most suitable URL and a request model for an endpoint that will create orders?
From my perspective it can be
And one more question. Is it a good practice in REST to use request models like CreateOrder? If I want to create a RESTful web application can I use the following request model:
class CreateOrder {
    AuthorId: number;
    BookId: number;
    ClientId: number;
}
Sometimes it makes me confused. Should request models look like our resources or not?
Let's assume that I have the following resources.
Your "resources" look suspiciously like "tables". Resources are closer to (logical) documents about information.
what will be the most suitable URL and a request model for an endpoint that will create orders
For the most part, it doesn't matter what URL you use to create orders. In a hypermedia application (think HTML), I'm going to submit a "form", and the meta data associated with that form are going to describe for the client how to compose a request from the form data.
So the human, or the code, that is manipulating the form doesn't need to know anything about the URL (when is the last time that you looked to see where Google was actually sending your search?)
As far as general purpose web components are concerned, the URL/URI is just an opaque identifier - they don't care what the spelling means.
A thing they do care about is whether the spelling is the same as something that they have cached. One of the consequences of a successful POST /x message is that the cached representation(s) of /x are invalidated.
So if you like, you can think about which cached document should be refreshed when an order is created, and send the request to the identifier for that document.
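A toy sketch of that invalidation rule (RFC 9111, section 4.4): a cache drops its stored representation of the target URI when it sees a non-error (2xx or 3xx) response to an unsafe method. The class `TinyCache` and the `/orders` URI are assumptions for illustration, and real caches do considerably more.

```python
from typing import Dict, Optional

SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


class TinyCache:
    """Keeps one stored representation per URI."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def store(self, uri: str, representation: str) -> None:
        self._entries[uri] = representation

    def lookup(self, uri: str) -> Optional[str]:
        return self._entries.get(uri)

    def observe(self, method: str, uri: str, status: int) -> None:
        """Invalidate the target URI after a non-error response
        to an unsafe request method."""
        if method not in SAFE_METHODS and 200 <= status < 400:
            self._entries.pop(uri, None)


cache = TinyCache()
cache.store("/orders", "[]")
cache.observe("POST", "/orders", 201)  # order created
print(cache.lookup("/orders"))  # None
```

So choosing `/orders` as the POST target means the cached order list is the thing that gets refreshed on the next GET.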
Should request models look like our resources or not?
It's not necessary. Again, think about the web -- what would the representation of create order look like if you were POSTing form data?
or maybe
If the "who is creating an order" is answered using the authorization headers.
In our HTTP requests and responses, we are fundamentally sending message representations - sequences of bytes that conform to some schema. There's no particular reason that those sequences of bytes must, or must not, be the same as those we use elsewhere in the implementation.
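To make the form-data idea concrete, here is one way such a message body could be produced and read back with Python's standard library. The field names `bookId` and `clientId` are hypothetical; the original post does not specify them.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical form fields for a "create order" form.
form_data = {"bookId": 12345, "clientId": 678}

# Body for a request with Content-Type: application/x-www-form-urlencoded
body = urlencode(form_data)
print(body)  # bookId=12345&clientId=678

# Server side: parse the body back into fields (values arrive as strings).
fields = {name: values[0] for name, values in parse_qs(body).items()}
print(fields)  # {'bookId': '12345', 'clientId': '678'}
```

The bytes on the wire are just a representation; how the server turns them into its own domain objects is a separate decision.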
Your end-point does not always need to start with /books. You can introduce another end-point, /orders, for creating or getting orders. So, to create an order, you can:
And by 'request model', do you mean the HTTP request body structure? If yes, it does not need to match your back-end persisted/domain model 100%. Just include enough parameters for the server to create an order (e.g. include bookId rather than the whole book object).
BTW, to get all books for a particular author, it is more common to use a query parameter such as:
What you are doing is not REST, it is CRUD over HTTP. REST does not care about your URI structures, and resources are very far from database tables. If CRUD is all you need, then download a CRUD generator library, which will generate all of the above so you won't need to write it manually.

bulk GET using HATEOAS

I've seen many examples of HATEOAS where every resource has links to related resources. An API that returns N items of a certain resource per page, the client would probably need N calls to fetch any nested resource by consuming HATEOAS. For example:
GET city/documents:
[{
  id: 1,
  city: {
    self: ''
  },
  document: { ... }
}, {
  id: 2,
  city: {
    self: ''
  },
  document: { ... }
}]
FYI, the query parameter uses the FIQL syntax to define the filters.
Now, if the client was to fetch the city details for each document (to show on UI), it will probably need N additional calls. However in my case, the /cities API can additionally take multiple city ids like this: /cities?filter=id=in=(1,2) that can reduce N calls to one. Is there a way to articulate something like this using HATEOAS? I've read about the templates but not sure how should the template look like and how would client consume it?
I've seen many examples of HATEOAS where every resource has links to related resources. An API that returns N items of a certain resource per page, the client would probably need N calls to fetch any nested resource by consuming HATEOAS.
Yes. Less true in a world with Server-Push, where the server can proactively provide multiple resources in response to a query. If you imagine asking for a web page, and getting the HTML, and then also the images and the JavaScript resources too, then you've got the right sort of idea.
API can additionally take multiple city ids like this: /cities?filter=id=in=(1,2) that can reduce N calls to one. Is there a way to articulate something like this using HATEOAS?
Let's walk through it carefully. What you've done here is introduced a new resource, with identifier /cities?filter=id=in=(1,2). You might have another resource /cities?filter=id=in=(1,20) and another resource /cities?filter=id=in=(1,2000). In your implementation, these might be a "single endpoint" that extracts parameters from the identifier and uses them to generate the correct representation.
So what you get is something like a data transfer object - a large grained resource fetched in a single go.
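A small sketch of how a client might arrive at the identifier of that large-grained resource, assuming each document exposes a city id under a hypothetical `city_id` key (the original payload only shows links). In a hypermedia design the server would ideally supply this URI or a template for it rather than have the client hard-code the pattern.

```python
from typing import Dict, List


def cities_bulk_uri(documents: List[Dict]) -> str:
    """Build one URI that identifies all cities referenced by the documents,
    using the FIQL-style filter from the question."""
    ids = sorted({doc["city_id"] for doc in documents})  # de-duplicate
    id_list = ",".join(str(city_id) for city_id in ids)
    return f"/cities?filter=id=in=({id_list})"


docs = [{"id": 1, "city_id": 2}, {"id": 2, "city_id": 1}, {"id": 3, "city_id": 2}]
print(cities_bulk_uri(docs))  # /cities?filter=id=in=(1,2)
```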
I've read about the templates but not sure how should the template look like and how would client consume it?
The simplest example, which you have likely seen already, is a web form. You allow the client to provide the start and end elements, and the form processing takes that information and creates the specified URI from it.
So the client needs to understand what the form is for, and how to identify the semantics of the different elements in the form. The agent needs to understand the processing rules that transfer the form data into the URI.
URI Templates are the same basic idea; they give you a domain agnostic language with which to describe where the parameters go in a resource identifier. The basic pattern is the same - there needs to be agreement about the semantics of the parameters, the server provides a URI, the client provides a parameter map, and the generic code can take care of the merge
uri = template.apply(parameterMap)
URI Templates aren't quite as powerful as forms; with a form, you can introduce a default value for a parameter, but there is no analogous capability in URI templates.
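For a feel of what that generic merge code does, here is a deliberately tiny expander that handles only two RFC 6570 forms: simple `{var}` expansion and form-style query `{?a,b}` expansion. It is a sketch, not a complete implementation, and the `/cities` templates in the usage lines are made up.

```python
import re
from urllib.parse import quote

# Matches {name}, {a,b} and {?a,b}; other RFC 6570 operators are not handled.
_EXPRESSION = re.compile(r"\{(\??)([^{}]+)\}")


def expand(template: str, params: dict) -> str:
    """Expand a URI template, omitting variables that are not defined."""

    def replace(match: "re.Match") -> str:
        operator = match.group(1)
        names = match.group(2).split(",")
        # Percent-encode everything outside the unreserved set.
        defined = [
            (name, quote(str(params[name]), safe=""))
            for name in names
            if name in params
        ]
        if operator == "?":
            if not defined:
                return ""
            return "?" + "&".join(f"{name}={value}" for name, value in defined)
        return ",".join(value for _, value in defined)

    return _EXPRESSION.sub(replace, template)


print(expand("/cities/{id}", {"id": 7}))            # /cities/7
print(expand("/cities{?page,size}", {"page": 2}))   # /cities?page=2
```

The client only supplies the parameter map; where the values end up in the identifier is entirely the server's choice.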
HAL-Forms may give you a better sense of how a form based approach might work in JSON.

HATEOAS - Discovery and URI Templating

I'm designing a HATEOAS API for internal data at my company, but have been having troubles with the discovery of links. Consider the following set of steps for someone to retrieve information about a specific employee in this system:
User sends GET to http://coredata/ to get all available resources, returns a number of links including one tagged as rel = "http://coredata/rels/employees"
User follows HREF on the rel from the first request, performing a GET at (for example) http://coredata/employees
The data returned from this last call is my conundrum and a situation where I've heard mixed suggestions. Here are some of them:
That GET will return all employees (with perhaps truncated data), and the client would be responsible for picking the one it wants from that list.
That GET would return a number of URI templated links describing how to query / get one employee / get all employees. Something like:
"_links": {
  "http://coredata/rels/employees#RetrieveOne": {
    "href": "http://coredata/employees/{id}"
  },
  "http://coredata/rels/employees#Query": {
    "href": "http://coredata/employees{?login,firstName,lastName}"
  },
  "http://coredata/rels/employees#All": {
    "href": "http://coredata/employees/all"
  }
}
I'm a little stuck here with what remains closest to HATEOAS. For option 1, I really do not want to make my clients retrieve all employees every time for the sake of navigation, but I can see how using URI templating in example two introduces some out-of-band knowledge.
My other thought was to use the RetrieveOne, Query, and All operations as my cool URLs, but that seems to violate the concept that you should be able to navigate to the resources you want from one base URI.
Has anyone else managed to come up with a good way to handle this? Navigation is dead simple once you've retrieved one resource or a set of resources, but it seems very difficult to use for discovery.
Option 2 is not too bad as you're using RFC 6570 to characterize the URI patterns; while HATEOAS is usually stated in terms of not having clients synthesize URIs, if a server is prepared to make guarantees on the URI template and to tell it to clients explicitly in a standard format, it's acceptable. (I would be tempted to have the “list all employees” URL be without the all suffix, so as to distinguish it from the employee with that ID; the client should not — in principle — know what an employee ID looks like.)
In fact, the main problem is actually that clients have to understand what those tag URIs mean; there's just no real way to guess that “http://coredata/rels/employees#All” means “list all employees”. That's where you get into embedding knowledge in clients, semantic labeling, etc. and HATEOAS doesn't really address those things.
TL;DR: Use OPTIONS method to return programmatically consumable documentation and always implement pagination.
We create a number of internal REST services at my work. We have standardized on the use of the OPTIONS method to return the metadata of a resource. The metadata we return acts as parsable documentation of that resource. It indicates URL templates, various options such as PAGE and PAGESIZE, and the different methods that the resource supports. We also return rel links so top-level resource discovery can occur with the use of OPTIONS without pulling any actual data.
We also implement pagination specifically to prevent issues around returning large amounts of data unnecessarily.
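On the client side, consuming a paginated collection can be as simple as following `next` links until there are none. This is a sketch under assumptions: the canned `PAGES` dictionary stands in for real HTTP responses, and the HAL-style `_links` layout and URIs are illustrative only.

```python
from typing import List
from urllib.parse import urljoin

# Hypothetical canned responses keyed by absolute URI, in place of real HTTP calls.
PAGES = {
    "http://coredata/companies?page=1": {"_links": {"next": {"href": "?page=2"}}},
    "http://coredata/companies?page=2": {"_links": {}},
}


def walk(start_uri: str) -> List[str]:
    """Follow 'next' links from start_uri and return the URIs visited."""
    visited = []
    uri = start_uri
    while uri is not None:
        visited.append(uri)
        next_link = PAGES[uri]["_links"].get("next")
        # Relative hrefs are resolved against the URI of the current page.
        uri = urljoin(uri, next_link["href"]) if next_link else None
    return visited


print(walk("http://coredata/companies?page=1"))
```

The client knows only the relation name `next`; the page size and the spelling of the query string stay under the server's control.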
My HATEOAS API returns HTML as well as HAL+JSON, as you are using, and they both use the same URIs, so my JSON responses simply return what a human web user would see (minus all the pretty colours). e.g.
{"_links": {
  "http://coredata/companies": { "href": "/companies?page=1" }
}}
GET /companies?page=1
{"_links": {
  "next": { "href": "?page=2" }
}}