Handling IDs in RESTful CRUD APIs - rest

I am designing a Create/Read/Update/Delete public web API endpoint, using RESTful patterns over HTTP with JSON payloads, and am wondering about a design issue that must be extremely common, yet I'm having a hard time finding guidance on it.
Let's focus only on the "Read" and "Update" part of the API. I am a bit confused about the current "proper" REST best practices when it comes to handling IDs. Here's what I mean:
"Reading" widgets over HTTP GET returns one or more "widget" JSON objects (for example, I have a way to retrieve all widgets that meet certain criteria). Each widget object has an ID field which is a GUID.
I see several design options for the "Update widget" endpoint:
1. HTTP PUT to /api/widgets/{widget-id} of the entire widget object. The API will fail if the "ID" field in the object is present. (I don't like this approach because data from the "Read" endpoint cannot be round-tripped to the "Update" endpoint without modification)
2. HTTP PUT to /api/widgets/{widget-id} of the entire widget object. The API will ignore the "ID" field in the object, if present. (I think it's better than the above, but the ID supplied can be incorrect, and I think it's wrong to silently ignore bad data)
3. HTTP PUT to /api/widgets/{widget-id} of the entire widget object. The API will verify that the ID field in the object matches the ID in the URI, and will fail otherwise. (I think this is even better, but data is still duplicated between the URI and message body)
4. HTTP PUT to /api/widgets/{widget-id} of the entire widget object. The API will verify that the ID field in the object is either absent or matches the ID in the URI, and will fail otherwise. (This is the approach I'm leaning towards)
5. HTTP PUT to /api/widgets of the entire widget object, including the ID field - i.e. ID of the object to update will come from the message body instead of the URI.
6. Same as #5 but with HTTP POST - perhaps with "Update" semantics if the ID is specified, and "Create" semantics if it isn't.
I can see various tradeoffs here. Option 6 seems particularly elegant to me, but not particularly "RESTful" and may be an unfamiliar pattern to the API's users. Most of the API design guideline documents I've seen seem to recommend the "PUT to /api/widgets/{widget-id}" approach but are silent on the #1/2/3/4 distinction above.
So what is the "RESTfully correct"/best-practices way to do this? (Something that would be most familiar and least confusing to developers using my public API endpoint). And are there any other design options (or design considerations) I'm not thinking of here?
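To make the rule in option 4 concrete, here is a minimal sketch of the check in Python. The function name, the exception class, the lowercase "id" key, and the case-insensitive GUID comparison are all assumptions made for illustration. How the failure maps to an HTTP status (some 4xx response) is left to the handler.

```python
class IdMismatchError(ValueError):
    """Raised when the ID in the body disagrees with the ID in the URI."""


def resolve_widget_id(uri_id: str, body: dict) -> str:
    """Return the ID to update, enforcing the option 4 rule.

    The body's "id" field may be absent; if present it must equal the
    URI ID. GUIDs are compared case-insensitively (an assumption).
    """
    body_id = body.get("id")
    if body_id is not None and str(body_id).lower() != uri_id.lower():
        raise IdMismatchError(
            f"body id {body_id!r} does not match URI id {uri_id!r}"
        )
    return uri_id
```

With this rule, a widget fetched from the "Read" endpoint can be sent back unchanged, while a body carrying a different ID is rejected rather than silently ignored.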

You can expose the ID if it's definitely mandatory. Another approach is to have a unique field in your entity: without passing the ID, you can create a DTO which contains the unique field too. In your case {widget-id} is unique and the ID is, as always, an auto-generated int. Use POST; that's the best approach in your case of a public API.
If you need multiple actions on widgets, create 4 different endpoints with "Widget" (example: site.com/widget) which define GET, POST, PUT and DELETE as different methods. This means that a single API will function differently depending on the method it is invoked with.

Related

Should an element in a REST API return its own ID?

What is the benefit of returning the ID of the element? Isn't it already part of the URL and therefore known? I am not talking about using the REST API with HAL or something similar.
api/employees/1
{
"Id" : 1,
"Name" : "Joe Bloggs",
"Department" : "IT"
}
api/employees/1
{
"Name" : "Joe Bloggs",
"Department" : "IT"
}
I guess it makes sense to add more information regarding the usage of the API:
The API in question is a public API in a closed network (not the internet). We provide sample clients, but our customers write their own clients for our API. The ID of an element is not sensitive information. The data is not about employees (as stated in the question) but about asset management.
The reason I am asking is that customers are complaining that if they use some kind of middleware (whatever this is), they only receive the content of an element but do not have access to the URL of the element (how?).
If you write your own client, is there any kind of situation where you can't get the ID based on the URL? Should we add the ID for people, who somehow do not have access to the url?
What is the client actually using the ID for? Presenting a product ID isn't that wrong IMO, but does a user have to know the ID under which you store the user entity in the DB when she uses an email to authenticate with the API anyway? So to answer the actual question: it depends. If the client, however, is using it to construct the next URI to invoke, I strongly recommend returning links with meaningful relation names instead, as this helps to decouple the client from the API: the client does not need a-priori knowledge of the API itself.
Depending on the resource it might not be beneficial to have an ascending ID, as this might favor guessing attacks and may also lead to strange situations if you remove an item in the middle of the collection. Are the IDs of the subsequent items updated? Is a gap exposed between items? Usually UUIDs or the like are a much safer way to expose such information.
One further aspect to consider is that clients in an ideal REST environment should not interpret URIs themselves but should instead use the relation name the URI was returned for to determine whether to invoke that URI or not. A client which extracts an ID from a URI most likely has some a-priori knowledge of the API, is thus tightly coupled to that API, and will almost certainly break if the API is ever changed.
With that being said, there is the concept of URI patterns which should help a client in extracting things like IDs and names from URIs. Personally I'm not that keen on such things as they promote a misleading approach to the application of REST in an API design.
If you write your own client, is there any kind of situation where you can't get the ID based on the URL? Should we add the ID for people, who somehow do not have access to the url?
Extracting the ID from a URI requires knowledge of the URI structure. If you ever want to change the URI structure at some later time, for whatever reason, all clients that were built around that knowledge will break. URIs shouldn't carry content; that is what the body is for. As the ID seems to be content for some of the clients, include it in the response body. You are of course free to add some of the information to the URI, though you shouldn't require clients to parse that URI and extract the required information from it.
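As a rough illustration of this advice, the sketch below builds a response body that carries the ID itself and also hands the client ready-made links keyed by relation name. The "links" member, the relation names, and the base URL are hypothetical and not part of the API described in the question.

```python
def employee_representation(record: dict, base_url: str) -> dict:
    """Build a response body that carries the ID and ready-made links."""
    employee_id = record["Id"]
    return {
        "Id": employee_id,  # the ID is content, so it lives in the body
        "Name": record["Name"],
        "Department": record["Department"],
        # Hypothetical link section: relation name -> URI. Clients look up
        # a relation name instead of assembling or parsing URIs themselves.
        "links": {
            "self": f"{base_url}/employees/{employee_id}",
            "collection": f"{base_url}/employees",
        },
    }
```

A client behind middleware that only sees the body still gets the ID, and a client that needs the next URI can follow a link instead of relying on knowledge of the URI structure.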

REST design for update/add/delete item from a list of subresources

I would like to know which is the best practice when you have a resource which contains a list of subresources. For example, you have the resource Author which has info like name, id, birthday and a list of books. This list of books exists only in relation to the Author. So, you have the following scenario:
1. You want to add a new book to the book list
2. You want to update the name of a book from the list
3. You want to delete a book from the list
SOLUTION 1
I searched which is the correct design and I found multiple approaches. I want to know if there is a standard way of designing this. I think the design by the book says to have the following methods:
To add: POST /authors/{authorId}/book/
To update: PUT /authors/{authorId}/book/{bookId}
To delete: DELETE /authors/{authorId}/book/{bookId}
SOLUTION 2
My solution is to have only one PUT method which does all these 3 things because the list of books exists only inside object author and you are actually updating the author. Something like:
PUT /authors/{authorId}/updateBookList (and send the whole updated book list inside the author object)
I find multiple problems in my scenario. For example, sending more data from the client, having some logic on the client, more validation on the API, and also relying on the client having the latest version of the book list.
My question is: is it anti-pattern to do this?
SITUATION 1. In my situation, my API is using another API, not a database. The used API has just one method of "updateBookList", so I am guessing it is easier to duplicate this behavior inside my API too. Is it also correct?
SITUATION 2. But, supposing my API would use a database would it be more suitable to use SOLUTION 1?
Also, if you could provide some articles, books where you can find similar information. I know this kind of design is not written in stone but some guidelines would help. (Example: from Book REST API Design Rulebook - Masse - O'Reilly)
Solution 2 sounds very much like old-style RPC where a method is invoked that performs some processing. This is like a REST antipattern as REST's focus is on resources and not on methods. The operations you can perform on a resource are given by the underlying protocol (HTTP in your case) and thus REST should adhere to the semantics of the underlying protocol (one of its few constraints).
In addition, REST doesn't care how you set up your URIs, hence there are actually no RESTful URLs. For an automated system, a URI following a certain structure has just the same semantics as a randomly generated string acting as a URI. It's us humans who read meaning into the string; an application should instead use the rel attribute, which gives the URI a logical name the application can use. An application that expects a certain logical composition of a URL is already tightly coupled to the API and hence violates the principle REST tries to uphold, namely the decoupling of clients from server APIs.
If you want to update (sub)resources via PUT in a RESTful way, you have to follow the semantics of PUT, which basically state that the received payload replaces the payload accessible at the given URI before the update.
The PUT method requests that the state of the target resource be
created or replaced with the state defined by the representation
enclosed in the request message payload.
...
The target resource in a POST request is intended to handle the
enclosed representation according to the resource's own semantics,
whereas the enclosed representation in a PUT request is defined as
replacing the state of the target resource. Hence, the intent of PUT
is idempotent and visible to intermediaries, even though the exact
effect is only known by the origin server.
Regarding partial updates, RFC 7231 states that they are possible either by using PATCH, as suggested by @Alexandru, or by issuing a PUT request directly at a sub-resource, where the payload replaces the content of the sub-resource. For the resource containing the sub-resource this has the effect of a partial update.
Partial content updates are possible by
targeting a separately identified resource with state that overlaps a
portion of the larger resource, or by using a different method that
has been specifically defined for partial updates (for example, the
PATCH method defined in [RFC5789]).
In your case you could therefore send the updated book collection directly via a PUT operation to something like an .../author/{authorId}/books resource, which replaces the old collection. As this might not scale well for authors that have written many publications, PATCH is probably preferable. Note, however, that PATCH requires atomic, transactional behavior: either all actions succeed or none do. If an error occurs in the middle of the actions, you have to roll back all steps already executed.
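One simple way to get this all-or-nothing behavior for an in-memory collection is to apply every change to a copy and only hand the copy back if nothing failed. The sketch below assumes a hypothetical change format ("action", "id", "name") and a dictionary of books keyed by ID. A real service backed by a database or another API would need a transaction or compensating calls instead.

```python
import copy


def apply_book_changes(books: dict, changes: list) -> dict:
    """Apply all changes to a copy and return it; the input is never mutated.

    If any change fails, the exception propagates and the caller keeps the
    original collection, which gives all-or-nothing behaviour.
    """
    working = copy.deepcopy(books)
    for change in changes:
        action, book_id = change["action"], change["id"]
        if action == "add":
            if book_id in working:
                raise ValueError(f"book {book_id} already exists")
            working[book_id] = {"name": change["name"]}
        elif action == "rename":
            working[book_id]["name"] = change["name"]  # KeyError if missing
        elif action == "delete":
            del working[book_id]  # KeyError if missing
        else:
            raise ValueError(f"unknown action {action!r}")
    return working
```

The caller replaces the stored collection with the returned one only when the call succeeds, so a failure halfway through leaves the stored state untouched.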
In regards to your request for further literature, SO isn't the right place to ask this as there is an own off-topic close/flag reason exactly for this.
I'd go with the first option and have separate methods instead of cramming all logic inside a generic PUT. Even if you're relying on an API instead of a database, that's just a 3rd party dependency that you should be able to switch at any point, without having to refactor too much of your code.
That being said, if you're going to allow the update of a large number of books at once, then PATCH might be your friend:
Looking at RFC 6902 (which defines the JSON Patch format), from the client's perspective the API could be called like
PATCH /authors/{authorId}/book
[
{ "op": "add", "path": "/ids", "value": [ "24", "27", "35" ]},
{ "op": "remove", "path": "/ids", "value": [ "20", "30" ]}
]
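Read strictly against RFC 6902, a "remove" operation carries only "op" and "path" and removes whatever sits at that path. An "add" whose path names an existing object member replaces that member's value. To append to an array the path ends in "/-", and to remove one element it ends in the element's index. The sketch below shows a patch document in that shape together with a deliberately tiny applier that supports only those two forms on an assumed "ids" array. It is an illustration, not a full JSON Patch implementation.

```python
import copy


def apply_ids_patch(document: dict, patch: list) -> dict:
    """Apply a tiny subset of JSON Patch to the "ids" array of a document.

    Supported: {"op": "add", "path": "/ids/-", "value": v} (append) and
    {"op": "remove", "path": "/ids/<index>"} (remove by position).
    """
    result = copy.deepcopy(document)
    for operation in patch:
        op, path = operation["op"], operation["path"]
        prefix, _, last = path.rpartition("/")
        if prefix != "/ids":
            raise ValueError(f"unsupported path {path!r}")
        if op == "add" and last == "-":
            result["ids"].append(operation["value"])
        elif op == "remove" and last.isdigit():
            del result["ids"][int(last)]
        else:
            raise ValueError(f"unsupported operation {operation!r}")
    return result


# Append "24", then remove the element at index 0.
patch = [
    {"op": "add", "path": "/ids/-", "value": "24"},
    {"op": "remove", "path": "/ids/0"},
]
```

Because index-based removal depends on the current order of the array, a client using this form needs an up-to-date view of the collection.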
Technically, solution 1 hands down.
REST API URLs consist of resources (and identifiers and filter attribute name/values). It should not contain actions (verbs). Using verbs encourages creation of stupid APIs.
E.g. I know a real-life-in-production API that wants you to
do POST on /getrecords to get all records
do POST on /putrecords to add a new record
Reasons to choose solution 2 would not be technical.
For requirement #2 (You want to update the name of a book from the list), it is possible to use JSON PATCH semantics, but use HTTP PATCH (https://tools.ietf.org/html/rfc5789) semantics to design the URL (not JSON PATCH semantics as suggested by Alexandru Marculescu).
I.e.
Do PATCH on /authors/{authorId}/book/{bookId}, where body contains only PK and changed attributes. Instead of:
To update: PUT on /authors/{authorId}/book/{bookId}
JSON PATCH semantics may of course be used to design the body of a PATCH request, but it just complicates things IMO.

RESTful API required parameters in query string?

When designing a RESTful API, what to do if a GET request only makes sense if there are specific parameters associated with the request? Should the parameters be passed as a query string, and if so, what to do when all the parameters aren't specified or are formatted incorrectly?
For example, let's say I have a Post resource, which can be accessed by the `api/posts` endpoint. Each post has a geographical location, and posts can be retrieved ONLY when specifying an area that the posts may reside in. Thus, 3 parameters are required: latitude, longitude and radius.
I can think of 2 options in this case:
1. Putting the parameters in query string: api/posts/?lat=5.54158&lng=71.5486&radius=10
2. Putting the parameters in the URL: api/posts/lat/5.54158/lng/71.5486/radius/10
Which of these would be the correct approach? It seems wrong to put required parameters in the query string, but the latter approach feels somewhat 'uglier'.
PS. I'm aware there are many discussion on this topic already (for example: REST API Best practices: Where to put parameters?), but my question is specifically addressed to the case when parameters are required, not optional.
The first approach is better.
api/posts/?lat=5.54158&lng=71.5486&radius=10
The second approach is a little misleading.
api/posts/lat/5.54158/lng/71.5486/radius/10
You should think of each of your directories as resources. In this case, the sub-resources (for example: "api/posts/lat/5.54158") are not really resources, and are thus misleading. There are cases where this pattern is a better solution, but looking at what's given, I'd go with using the query string. Unless you have some entity linking to link you directly to this URL, I don't really like it.
You should put everything in the query string and set the server to return an error code when not receiving the 3 required parameters.
Because it's a group of parameters that together identify an object.
Taking the example:
lat=5.54158; lng=71.5486; radius=10
It is very unlikely that a URL like this would make sense:
api/posts/lat/5.54158/lng/yyyy/radius/zz
It's different than:
api/memb/35/..
because the member with id 35 can have a lot of functions ( so, valid urls ) as:
api/memb/35/status or
api/memb/35/lastlogin
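The "return an error code" part could look roughly like the sketch below, which uses Python's standard `urllib.parse` to read the three parameters from the query string. The function name, the shape of the returned tuple, and the choice of 400 as the error status are assumptions for illustration. Another answer here argues for 404 instead.

```python
from urllib.parse import parse_qs, urlsplit

REQUIRED = ("lat", "lng", "radius")


def parse_area_query(url: str):
    """Return (status, payload): 200 with floats, or 400 with error messages."""
    params = parse_qs(urlsplit(url).query)
    errors, values = [], {}
    for name in REQUIRED:
        raw = params.get(name)
        if not raw:
            # Absent or empty parameters are both reported as missing.
            errors.append(f"missing required parameter: {name}")
            continue
        try:
            values[name] = float(raw[0])
        except ValueError:
            errors.append(f"parameter {name} is not a number: {raw[0]!r}")
    if errors:
        return 400, {"errors": errors}
    return 200, values
```

Reporting every missing or malformed parameter at once lets a client fix its request in a single round trip.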
When designing a RESTful API, what to do if a GET request only makes
sense if there are specific parameters associated with the request?
Should the parameters be passed as a query string, and if so, what to
do when all the parameters aren't specified or are formatted
incorrectly?
By REST your API must fulfill the REST constraints, which are described in the Fielding dissertation. One of these constraints is the uniform interface constraint, which includes the HATEOAS constraint. According to the HATEOAS constraint your API must serve a standard hypermedia format as response. This hypermedia contains hyperlinks (e.g. HTML links, forms) annotated with metadata (e.g. link relation or RDF annotation). The clients check the metadata, which explains to them what the hyperlink does. After that they can decide whether they want to follow the link or not. When they follow the link, they can build the HTTP request based on the URI template, parameters, etc... and send it to the REST service.
In your case it does not matter which URI structure you use; it is for service usage only, since the client always uses the given URI template and does not care what is in that template as long as it is a valid URI template which it can fill with parameters.
In most cases your client has enough validation information to test whether the params are incorrect or missing. In that case it does not send an HTTP request, so you have nothing to do in the service. If an invalid param gets through, then in your case your service sends back a 404 - not found, since the URI is the resource identifier, and no resource belongs to an invalid URI (generated from the given URI template and invalid params).

RESTFul approach for updating a field

I wonder what would be the more RESTful, flexible and better approach to updating(!) a field (state) of an item
/api/v1/items/:id?action=start
/api/v1/items/:id/start
/api/v1/items/:id/ + action in the body
/api/v1/items/:id/status/{active|stopped}
or items
/api/v1/items?action=start
/api/v1/items/start
/api/v1/items/ + action in the body
/api/v1/items/status/{active|stopped}
I would prefer the third API structure:
/api/v1/items/:id/ + action in the body
My reasons include:
According to the Richardson Maturity Model the URL should point to a specific resource or set of resources. You do not want to add update information within the URL, as it doesn't qualify as a valid endpoint.
You want to use PUT for update/replacement operations which affect a resource. Let the URL select the resource and let the body define the exact fields you want to update, and any other logic otherwise.
Using the body rather than the query string allows you to insert arbitrarily large information (to a certain limit, but greater than a query string) which logically might be paired with the operation (start in your case). It allows greater flexibility in extending the operation in the future as well.
You can probably list the relevant actions that can be performed on the endpoint inside the response of /api/v1/items. This would be a list of informative hypermedia controls. Again, the Richardson maturity model provides a very good example.
As an alternative you can implement the PATCH method. It would let you update selected fields. The only problem with PATCH is that it's not widely known because the RFC is young. The actual implementation depends on your server and client side libraries and frameworks.
If you don't want to use PATCH, the only alternative is to implement an overridden POST and define the update mechanism. For example, you can say: every field != null will override the resource field value.
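The "every field != null overrides" rule can be sketched in a few lines of Python, with `None` standing in for null. The function name and the dictionary-based resource are assumptions made for illustration.

```python
def apply_partial_update(resource: dict, changes: dict) -> dict:
    """Return a copy of resource where every non-None field in changes wins."""
    updated = dict(resource)
    for field, value in changes.items():
        if value is not None:
            updated[field] = value
    return updated
```

One consequence of this convention is that a client cannot use it to set a field to null, because null means "leave unchanged".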
Let's rephrase the question:
How do I change a few attributes of my resource? (Status is just another attribute.)
Answer:
Identify a resource.
Use POST (since the request is non-idempotent).
Supply the change in the body, since in the future you may need to change more attributes than just status for this resource.
POST /api/v1/items/:id + action in the body
Use only the POST method.
Reason:
PUT should be used when it changes the complete set of properties, not one or a few of them.
Please, let’s move on. We don’t need to use PUT for every state change in HTTP. REST has never said that we should. It is okay to use POST - Roy T. Fielding

RESTful url to GET resource by different fields

Simple question I'm having trouble finding an answer to..
If I have a REST web service, and my design is not using url parameters, how can I specify two different keys to return the same resource by?
Example
I want (and have already implemented)
/Person/{ID}
which returns a person as expected.
Now I also want
/Person/{Name}
which returns a person by name.
Is this the correct RESTful format? Or is it something like:
/Person/Name/{Name}
You should only use one URI to refer to a single resource. Having multiple URIs will only cause confusion. In your example, confusion would arise due to two people having the same name. Which person resource are they referring to then?
That said, you can have multiple URIs refer to a single resource, but for anything other than the "true" URI you should simply redirect the client to the right place using a status code of 301 - Moved Permanently.
Personally, I would never implement a multi-ID scheme or redirection to support it. Pick a single identification scheme and stick with it. The users of your API will thank you.
What you really need to build is a query API, so focus on how you would implement something like a /personFinder resource which could take a name as a parameter and return potentially multiple matching /person/{ID} URIs in the response.
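A query resource of this kind could be sketched as below. The function name and the in-memory dictionary mapping IDs to names are assumptions for illustration, and so is the case-insensitive match. The point is only that one name can yield several /person/{ID} URIs.

```python
def find_person_uris(people: dict, name: str) -> list:
    """Return the /person/{ID} URI of every person whose name matches."""
    wanted = name.casefold()
    return [
        f"/person/{person_id}"
        for person_id, person_name in sorted(people.items())
        if person_name.casefold() == wanted
    ]
```

An empty list is a perfectly valid answer for a query, which is another way a search differs from fetching a single resource by its identifier.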
I guess technically you could have both URIs point to the same resource (perhaps with one of them as the canonical resource) but I think you wouldn't want to do this from an implementation perspective. What if there is an overlap between IDs and names?
It sure does seem like a good place to use query parameters, but if you insist on not doing so, perhaps you could do
person/{ID}
and
personByName/{Name}
I generally agree with this answer that for clarity and consistency it'd be best to avoid multiple ids pointing to the same entity.
Sometimes however, such a situation arises naturally. An example I work with is Polish companies, which can be identified by their tax id ('NIP' number) or by their national business registry id ('KRS' number).
In such case, I think one should first add the secondary id as a criterion to the search endpoint. Thus users will be able to "translate" between secondary id and primary id.
However, if users still keep insisting on being able to retrieve an entity directly by the secondary id (as we experienced), one other possibility is to provide a "secret" URL, not described in the documentation, performing such an operation. This can be given to users who made the effort to ask for it, and the potential ambiguity and confusion is then on them, if they decide to use it, not on everyone reading the documentation.
In terms of ambiguity and confusion for the API maintainer, I think this can be kept reasonably minimal with a helper function to immediately detect and translate the secondary id to primary id at the beginning of each relevant API endpoint.
It obviously matters much less than normal what scheme is chosen for the secret URL.
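The helper function mentioned above might look something like this sketch. Since NIP and KRS numbers can look alike, it assumes the caller marks the secondary ID with a hypothetical "nip:" or "krs:" prefix. The lookup tables, sample numbers, and function name are likewise made up for illustration.

```python
NIP_TO_PRIMARY = {"1234567890": 42}  # hypothetical sample data
KRS_TO_PRIMARY = {"0000123456": 42}  # hypothetical sample data


def to_primary_id(identifier: str) -> int:
    """Translate 'nip:<number>' or 'krs:<number>' to the primary id.

    A bare integer string is treated as a primary id already.
    """
    scheme, sep, value = identifier.partition(":")
    if not sep:
        return int(identifier)
    table = {"nip": NIP_TO_PRIMARY, "krs": KRS_TO_PRIMARY}.get(scheme.lower())
    if table is None:
        raise ValueError(f"unknown identifier scheme {scheme!r}")
    if value not in table:
        raise LookupError(f"no company with {scheme} {value}")
    return table[value]
```

Calling such a helper first in each relevant endpoint keeps the rest of the handler working with primary IDs only.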