REST API and ETag on individual resources of a list

REST API and ETag on individual resources of a list - rest

Considering I have a REST API exposing a repository of users :
/users/ -> returns an array of UserModel
/users/{Id} -> returns a UserModel
I needs to implement a client application that support offline mode (API not available) that will display the list of users and the detail of each user.
I am considering to synchronize in the client app the users this way :
Get the full list of users by calling a GET on /users/ and persist the list of users
Each time a user is accessing a user profile, if REST API available, check if the user has been updated by calling the REST API and update the user details if necessary
Display the user profile
I am considering using ETag (https://en.wikipedia.org/wiki/HTTP_ETag) to implement this behavior.
My issue is that I don't want my client application to get user details one by one by calling GET /users/{Id} but in a bulk by calling GET /users/ (with some paging if needed). If I do so, the client application will get a global ETAG of the list of users, but not ETags of each user. Thus it will not be able to verify individually if a user entity is up-to-date.
As a workaround, I am considering to add an ETAG field to the UserModel of the API. This way after calling GET /users/, the client app will be able to check if a specific user has been updated by calling GET /users/{Id} with the If-None-Match <User'sETagValue> header.
I know that the solution do no stick to he HTTP 1.1 standard, and that it adds a little complexity to the ETag generation.
However, I can't find any other post describing such a solution and I am wondering if it presents major issues ? And If there are more elegant solutions ?
Thanks for your help,
Edit : WebDav standard defines a "DAV:getetag" property that looks similar to my approach (http://www.webdav.org/specs/rfc4918.html#cache-control)

The WebDAV spec is also the first thing that came to mind for me.
I don't see an issue with adding etags to the response of your collection. You might even define a collection in a more general way, so that the format is just a list of URIs, their responses and headers so your client can treat it as a list of resources that need to be written to a cache.

Related

REST endpoint for complex actions

I have a REST API which serves data from the database to the frontend React app and to Android app.
The API have multiple common endpoints for each model:
- GET /model/<id> to retrieve a single object
- POST /model to create
- PATCH /model/<id> to update a single model
- GET /model to list objects
- DELETE /model/<id> to delete an object
Currently I'm developing an Android app and I find such scheme to make me do many extra requests to the API. For example, each Order object has a user_creator entry. So, if I want to delete all the orders created by specified user I need to
1) List all users GET /user
2) Select the one I need
3) List all orders he created GET /order?user=user_id
4) Select the order I want to delete
5) Delete the order DELETE /order/<id>
I'm wondering whether this will be okay to add several endpoints like GET /order/delete?user=user_id. By doing this I can get rid of action 4 and 5. And all the filtering will be done at the backend. However it seems to me as a bad architecture solution because all the APIs I've used before don't have such methods and all the filtering, sorting and other "beautifying" stuff is usually at the API user side, not the backend.
In your answer please offer a solution that is the best in your opinion for this problem and explain your point of view at least in brief, so I can learn from it

Taking your problem is in isolation:
You have an Order collection and a User collection
User 1..* Orders
You want to delete all orders for a given user ID
I would use the following URI:
// delete all orders for a given user
POST /users/:id/orders/delete
Naturally, this shows the relationship between Users & Orders and is self-explanatory that you are only dealing with orders associated with a particular user. Also, given the operation will result in side-effects on the server then you should POST rather than GET (reading a resource should never change the server). The same logic could be used to create an endpoint for pulling only user orders e.g.
// get all orders for a given user
GET /users/:id/orders

The application domain of HTTP is the transfer of documents over a network. Your "REST API" is a facade that acts like a document store, and performs useful work as a side effect of transferring documents. See Jim Webber (2011).
So the basic idioms are that we post a document, or we send a bunch of edits to an existing document, and the server interprets those changes and does something useful.
So a simple protocol, based on the existing remote authoring semantics, might look like
GET /orders?user=user_id
Make local edits to the representation of that list provided by the server
PUT /orders?user=user_id
The semantics of how to do that are something that needs to be understood by both ends of the exchange. Maybe you remove unwanted items from the list? Maybe there is a status entry for each record in the list, and you change the status from active to expired.
On the web, instead of remote authoring semantics we tend to instead use form submissions. You get a blank form from somewhere, you fill it out yourself, you post it to the indicated inbox, and the person responsible for processing that inbox does the work.
So we load a blank form into our browser, and we make our changes to it, and then we post it to the resource listed in the form.
GET /the-blank-form?user=user_id
Make changes in the form...
POST ????
What should the target-uri be? The web browser doesn't care; it is just going to submit the form to whatever target is specified by the representation it received. One answer might be to send it right back where we got it:
POST /the-blank-form?user=user_id
And that works fine (as long as you manage the metadata correctly). Another possibility is to instead send the changes to the resource you expect to reflect those changes:
POST /orders?user=user_id
and it turns out that works fine too. HTTP has interesting cache invalidation semantics built into the specification, so we can make sure the client's stale copy or the orders collection resource is invalidated by using that same resource as the target of the POST call.
Currently my API satisfies the table from the bottom of the REST, so, any extra endpoint will break it. Will it be fatal or not, that's the question.
No, it will be fine -- just add/extend a POST handler on the appropriate resource to handle the new semantics.
Longer answer: the table in wikipedia is a good representation of common practices; but common practices aren't quite on the mark. Part of the problem is that REST includes a uniform interface. Among other things, that means that all resources understand the same message semantics. The notion of "collection resources" vs "member resources" doesn't exist in REST -- the semantics are the same for both.
Another way of saying this is that a general-purpose component never knows if the resource it is talking to is a collection or a member. All unsafe methods (POST/PUT/PATCH/DELETE/etc) imply invalidation of the representations of the target-uri.
Now POST, as it happens, means "do something that hasn't been standardized" -- see Fielding 2009. It's the method that has the fewest semantic constraints.
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. -- RFC 7231
It's perfectly fine for a POST handler to branch based on the contents of the request payload; if you see X, create something, if you see Y delete something else. It's analogous to having two different web forms, with different semantics, that submit to the same target resource.

Creating user record / profile for first time sign in

I use an authentication service Auth0 to allow users to log into my application. The application is a Q&A platform much like stackoverflow. I store a user profile on my server with information such as: 'about me', votes, preferences, etc.
When new user signs in i need to do 1 of 2 things:
For an existing user - retrieve the user profile from my api server
For a new user - create a new profile on the database
After the user signs in, Auth0(the authentication service) will send me some details(unique id, name and email) about the user but it does not indicate whether this is a new user(a sign up) or a existing user(a sign in).
This is not a complex problem but it would be good to understand best practice. I can think of 2 less than ideal ways to deal with this:
**Solution 1 - GET request **
Send a get request to api server passing the unique id
If a record is found return it
Else create new profile on db and return the new profile
This seems incorrect because the GET request should not be writing to the server.
**Solution 2 - One GET and a conditional POST request **
Send a get request to api server passing the unique id
The server checks the db and returns the profile or an error message
If the api server returns an error message send a post request to create a new profile
Else redirect to the home page
This seems inefficient because we need 2 requests to achieve a simple result.
Can anyone shed some light on what's best practice?

There's an extra option. You can use a rule in Auth0 to send a POST to the /users/create endpoint in your API server when it's the first time the user is logging in, assuming both the user database in Auth0 and in your app are up-to-date.
It would look something like this:
[...]
var loginCount = context.stats.loginsCount;
if (loginCount == 1) {
// send POST to your API and create the user
// most likely you'll want to await for response before moving on with the login flow
}
[...]
If, on the other hand, you're referring to proper API design and how to implement a find-or-create endpoint that's RESTful, maybe this answer is useful.

There seems to be a bit of disagreement on the best approach and some interesting subtleties as discussed in this post: REST Lazy Reference Create GET or POST?
Please read the entire post but I lean towards #Cormac Mulhall and #Blake Mitchell answers:
The client wants the current state of the resource from the server. It is not aware this might mean creating a resource and it does not care one jolt that this is the first time anyone has attempted to get this resource before, nor that the server has to create the resource on its end.
The following quote from The RESTful cookbook provided by #Blake Mitchell makes a subtle distinction which also supports Mulhall's view:
What are idempotent and/or safe methods?
Safe methods are HTTP methods that do not modify resources. For instance, using GET or HEAD on a resource URL, should NEVER change the resource. However, this is not completely true. It means: it won't change the resource representation. It is still possible, that safe methods do change things on a server or resource, but this should not reflect in a different representation.
Finally this key distinction is made in Section 9.1.1 of the HTTP specification:
Naturally, it is not possible to ensure that the server does not
generate side-effects as a result of performing a GET request; in
fact, some dynamic resources consider that a feature. The important
distinction here is that the user did not request the side-effects,
so therefore cannot be held accountable for them.
Going back to the initial question, the above seems to support Solution 1 which is to create the profile on the server if it does not already exist.

How to make initial request for nested resource from self describing REST API

Background:
I have a single page application that pulls data from a REST API. The API is designed such that the only URL necessary is the API root, ie https://example.com/api which provides URLs for other resources so that the client doesn't need to have any knowledge of how they are constructed.
API Design
The API has three main classes of data:
Module: Top level container
Category: A sub-container in a specific module
Resource: An item in a category
SPA Design
The app consuming the API has views for listing modules, viewing a particular module's details, and viewing a particular resource. The way the app works is it keeps all loaded data in a store. This store is persistent until the page is closed/refreshed.
The Problem:
My question is, if the user has navigated to a resource's detail view (example.com/resources/1/) and then they refresh the page, how do I load that particular resource without knowing its URL for the API?
Potential Solutions:
Hardcode URLs
Hardcoding the URLs would be fairly straightforward since I control both the API and the client, but I would really prefer to stick to a self describing API where the client doesn't need to know about the URLs.
Recursive Fetch
I could fetch the data recursively. For example, if the user requests a Resource with a particular ID, I could perform the following steps.
Fetch all the modules.
For each module, fetch its categories
Find the category that contains the requested resource and fetch the requested resource's details.
My concern with this is that I would be making a lot of unnecessary requests. If we have 100 modules but the user is only ever going to view 1 of them, we still make 100 requests to get the categories in each module.
Descriptive URLs
If I nested URLs like example.com/modules/123/categories/456/resources/789/, then I could do 3 simple lookups since I could avoid searching through the received data. The issue with this approach is that the URLs quickly become unwieldy, especially if I also wanted to include a slug for each resource. However, since this approach allows me to avoid hardcoding URLs and avoid making unnecessary network requests, it is currently my preferred option.
Notes:
I control both the client application and the API, so I can make changes in either place.
I am open to redesigning the API if necessary
Any ideas for how to address this issue would by greatly appreciated.

Expanding on my comment in an answer.
I think this is a very common problem and one I've struggled with myself. I don't think Nicholas Shanks's answer truly solves this.
This section in particular I take some issues with:
The user reloading example.com/resources/1/ is simply re-affirming the current application state, and the client does not need to do any API traversal to get back here.
Your client application should know the current URL, but that URL is saved on the client machine (in RAM, or disk cache, or a history file, etc.)
The implication I take from this, is that urls on your application are only valid for the life-time of the history file or disk cache, and cannot be shared with other users.
If that is good enough for your use-case, then this is probably the simplest, but I feel that there's a lot of cases where this is not true. The most obvious one indeed being the ability to share urls from the frontend-application.
To solve this, I would sum the issue up as:
You need to be able to statelessly map a url from a frontend to an API
The simplest, but incorrect way might simply be to map a API url such as:
http://api.example.org/resources/1
Directly to url such as:
http://frontend.example.org/resources/1
The issue I have with this, is that there's an implication that /resource/1 is taken from the frontend url and just added on to the api url. This is not something we're supposed to do, because it means we can't really evolve this api. If the server decides to link to a different server for example, the urls break.
Another option is that you generate uris such as:
http://frontend.example.org/http://api.example.org/resources/1
http://frontend.example.org/?uri=http://api.example.org/resources/1
I personally don't think this is too crazy. It does mean that the frontend needs to be able to load that uri and figure out what 'view' to load for the backend uri.
A third possibility is that you add another api that can:
Generate short strings that the frontend can use as unique ids (http://frontend.example.org/[short-string])
This api would return some document to the frontend that informs what view to load and what the (last known) API uri was.
None of these ideas sound super great to me. I want a better solution to this problem, but these are things I came up with as I was contemplating this.
Super curious if there's better ideas out there!

The current URL that the user is viewing, and the steps it took to get to the current place, are both application state (in the HATEOAS sense).
The user reloading example.com/resources/1/ is simply re-affirming the current application state, and the client does not need to do any API traversal to get back here.
Your client application should know the current URL, but that URL is saved on the client machine (in RAM, or disk cache, or a history file, etc.)
The starting point of the API is (well, can be) compiled-in to your client. Commpiled-in URLs are what couple the client to the server, not URLs that the user has visited during use of the client, including the current URL.
Your question, "For example, if the user requests a Resource with a particular ID", indicates that you have not grasped the decoupling that HATEOAS provides.
The user NEVER asks for a resource with such-and-such an ID. The user can click a link to get a query form, and then the server provides a form that generates requests to /collection/{id}. (In HTML, this is only possible for query strings, not path components, but other hypermedia formats don't have this limitation).
When the user submits the form with the ID number in the field, the client can build the request URL from the data supplied by the server+user.

Capturing audit trail information via REST

I'm struggling with coming up with the "right" way to capture audit information via a REST service. Let's say I've got an internal REST API for an Employee resource. I want to capture things when an Employee is added/modified/removed such as the user who did the change, the application the user was using, when it was done (assume this could be asynchronous so the user's action may have taken place at a different time than the REST call), etc. Also, the user that initiated the change may not be the authenticated user making the REST call.
My thoughts are that those properties do not belong in the body of the request - meaning that they are not attributes of the Employee object. They are not something that would be retrieved and returned on a GET, so they shouldn't be in the POST/PUT. They also do not belong as a parameter because parameters should be for specifying additional things about Employees or a search/filter critiera on GET requests for Employees.
My current thoughts are to have the client specify this information in the HTTP headers. That keeps the URL parameters & body pure for the Employee resource. Is that an appropriate use of the headers? Are there other options that I'm not seeing?

I'm working on a project with a very similar problem, and we did end up using HTTP headers to track auditing information. Actually, this was a byproduct of requiring an Authorization header which specifies the client user and application, and we use this information inside the REST service to store details in an audit log.
In your case, I don't think it's "wrong" to add custom X headers to specify the original user/application/time the request was made and storing these to an audit history in the service somewhere. Basically proxying on information via extra request headers. I also agree that these should not be part of the request body or URL parameters.

Is it bad practice for a REST endpoint to return different response fields based on the request?

I'm developing a 'user' endpoint for my mobile app. When the authenticated user GETs another user's profile, I want to return fewer fields than when they GET their own.
Is it semantically bad/against REST principles to return a different set of fields from a REST endpoint depending on some criteria such as whether the requesting user is retrieving their data vs another user's, or should I just have 2 endpoints for the same data source?

It's totally fine to return differing sets of data for the same URL based on authentication criteria. Think about a plain old web site. If you're logged in, you usually see diff contextual information than you do if you're anonymous, right? So getting more info back in the content of your response when it's the "current" user vs. a diff user is the same thing. If you really wanted to separate the fields you could with a sub-URL, but you definitely don't have to.