How to specify a range of data or multiple entities in a RESTful web service

To access an instance of a User in a RESTful web service, the URL is structured as shown in the curl request below:
curl -v -X GET -s "$BASE_URL/User/${customer_id}.json"
If I wanted to specify all User entities, or page through a range of User entities such as the first 50 Users in my database, how would I structure my request so that it is compliant with REST?

You should start by trying to de-emphasize the meaning of the characters in a URI. While nice, pretty and readable URIs are a good thing, they have nothing to do with REST -- in fact it's a good exercise to judge the design of a RESTful interface by changing all the URIs to meaningless strings and adding the prettiness afterwards. A client should never have any knowledge of the server's URI structure, for the simple reason that the server won't be able to change it afterwards.
Next, if a list of customers is a meaningful concept to you, you should by all means turn it into a resource in its own right. In its representation, you might want to include some information about each individual customer, along with a link to each individual customer resource.
Paging? Good idea -- turn each page into its own resource, and link them together. Take a look at the way Google presents search results for a hint.
Often that's all you need because it's not necessary for the client to be able to specify the page size. But if it is, you essentially have two options: You can have a resource that encapsulates a form that allows you to specify the parameters that, when submitted, will directly or indirectly via a redirect take you to the appropriate URI. Or you can include a poor man's form, a URI template.
In any case, and to re-iterate my first point: Don't view a RESTful interface as an API that manifests itself as a set of URI construction rules. Instead, think of resources, representations, and hypermedia.
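As a rough illustration of "turn each page into its own resource, and link them together", here is a minimal sketch of a client that pages through a list purely by following links. Everything in it is an assumption made for the example: the opaque URIs, the "items" and "links" keys, and the in-memory PAGES table that stands in for real HTTP GET requests.

```python
from typing import Callable, Iterator

# Hypothetical in-memory "server": opaque URI -> page representation.
# The URIs are deliberately meaningless strings; only the links matter.
PAGES = {
    "/a91f": {"items": ["alice", "bob"], "links": {"next": "/77c2"}},
    "/77c2": {"items": ["carol", "dave"], "links": {"prev": "/a91f", "next": "/e0b4"}},
    "/e0b4": {"items": ["erin"], "links": {"prev": "/77c2"}},
}


def fetch(uri: str) -> dict:
    """Stand-in for an HTTP GET that returns a parsed representation."""
    return PAGES[uri]


def iter_items(entry_uri: str, get: Callable[[str], dict] = fetch) -> Iterator[str]:
    """Yield every item by following 'next' links from the entry point."""
    uri = entry_uri
    while uri is not None:
        page = get(uri)
        yield from page["items"]
        # The client never builds a URI itself; it only follows what it was given.
        uri = page["links"].get("next")


if __name__ == "__main__":
    print(list(iter_items("/a91f")))
```

The client only needs to know the entry point and the meaning of the "next" link, so the server stays free to change its URI layout or page size later.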

Related

REST endpoint for complex actions

I have a REST API which serves data from the database to the frontend React app and to an Android app.
The API has multiple common endpoints for each model:
- GET /model/<id> to retrieve a single object
- POST /model to create
- PATCH /model/<id> to update a single model
- GET /model to list objects
- DELETE /model/<id> to delete an object
Currently I'm developing an Android app, and I find that this scheme forces me to make many extra requests to the API. For example, each Order object has a user_creator entry. So, if I want to delete all the orders created by a specified user, I need to
1) List all users GET /user
2) Select the one I need
3) List all orders he created GET /order?user=user_id
4) Select the order I want to delete
5) Delete the order DELETE /order/<id>
I'm wondering whether it would be okay to add several endpoints like GET /order/delete?user=user_id. By doing this I can get rid of steps 4 and 5, and all the filtering will be done at the backend. However, it seems to me like a bad architectural solution, because none of the APIs I've used before have such methods, and all the filtering, sorting and other "beautifying" stuff is usually done on the API user's side, not the backend.
In your answer please offer a solution that is the best in your opinion for this problem and explain your point of view at least in brief, so I can learn from it
Taking your problem in isolation:
You have an Order collection and a User collection
User 1..* Orders
You want to delete all orders for a given user ID
I would use the following URI:
// delete all orders for a given user
POST /users/:id/orders/delete
This shows the relationship between Users and Orders and makes it self-explanatory that you are only dealing with orders associated with a particular user. Also, given that the operation will result in side effects on the server, you should use POST rather than GET (reading a resource should never change the server). The same logic could be used to create an endpoint for pulling only a user's orders, e.g.
// get all orders for a given user
GET /users/:id/orders
The application domain of HTTP is the transfer of documents over a network. Your "REST API" is a facade that acts like a document store, and performs useful work as a side effect of transferring documents. See Jim Webber (2011).
So the basic idioms are that we post a document, or we send a bunch of edits to an existing document, and the server interprets those changes and does something useful.
So a simple protocol, based on the existing remote authoring semantics, might look like
GET /orders?user=user_id
Make local edits to the representation of that list provided by the server
PUT /orders?user=user_id
The semantics of how to do that are something that needs to be understood by both ends of the exchange. Maybe you remove unwanted items from the list? Maybe there is a status entry for each record in the list, and you change the status from active to expired.
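To make the "remove unwanted items from the list" reading concrete, here is a minimal sketch of what a server might do when the edited list is PUT back. It assumes one particular interpretation (an order missing from the submitted list is deleted); the function name, the "id" field and the in-memory dict are all hypothetical.

```python
def apply_edited_list(stored: dict[str, dict], edited: list[dict]) -> list[str]:
    """Make `stored` match the edited list; return the IDs that were removed.

    Assumed semantics: the PUT body is the complete desired state of the
    collection, so any stored order that is absent from it gets deleted.
    """
    kept_ids = {order["id"] for order in edited}
    removed = sorted(oid for oid in stored if oid not in kept_ids)
    for oid in removed:
        del stored[oid]
    # Orders that are still present are overwritten with the submitted version.
    for order in edited:
        stored[order["id"]] = order
    return removed


if __name__ == "__main__":
    orders = {"1": {"id": "1", "status": "active"}, "2": {"id": "2", "status": "active"}}
    # The client fetched the list, dropped order 2, and sent the rest back.
    print(apply_edited_list(orders, [{"id": "1", "status": "active"}]))
```

The alternative reading from the paragraph above, flipping a status from active to expired, would keep every record in the list and change only its fields.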
On the web, instead of remote authoring semantics we tend to instead use form submissions. You get a blank form from somewhere, you fill it out yourself, you post it to the indicated inbox, and the person responsible for processing that inbox does the work.
So we load a blank form into our browser, and we make our changes to it, and then we post it to the resource listed in the form.
GET /the-blank-form?user=user_id
Make changes in the form...
POST ????
What should the target-uri be? The web browser doesn't care; it is just going to submit the form to whatever target is specified by the representation it received. One answer might be to send it right back where we got it:
POST /the-blank-form?user=user_id
And that works fine (as long as you manage the metadata correctly). Another possibility is to instead send the changes to the resource you expect to reflect those changes:
POST /orders?user=user_id
and it turns out that works fine too. HTTP has interesting cache invalidation semantics built into the specification, so we can make sure the client's stale copy of the orders collection resource is invalidated by using that same resource as the target of the POST call.
Currently my API satisfies the table at the bottom of the Wikipedia article on REST, so any extra endpoint will break it. Will that be fatal or not? That's the question.
No, it will be fine -- just add/extend a POST handler on the appropriate resource to handle the new semantics.
Longer answer: the table in wikipedia is a good representation of common practices; but common practices aren't quite on the mark. Part of the problem is that REST includes a uniform interface. Among other things, that means that all resources understand the same message semantics. The notion of "collection resources" vs "member resources" doesn't exist in REST -- the semantics are the same for both.
Another way of saying this is that a general-purpose component never knows if the resource it is talking to is a collection or a member. All unsafe methods (POST/PUT/PATCH/DELETE/etc) imply invalidation of the representations of the target-uri.
Now POST, as it happens, means "do something that hasn't been standardized" -- see Fielding 2009. It's the method that has the fewest semantic constraints.
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. -- RFC 7231
It's perfectly fine for a POST handler to branch based on the contents of the request payload; if you see X, create something, if you see Y delete something else. It's analogous to having two different web forms, with different semantics, that submit to the same target resource.
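A minimal sketch of such a branching POST handler, under stated assumptions: the "action" discriminator, the payload shapes and the status codes chosen here are illustrative only, and the orders live in a plain dict rather than a real database.

```python
def handle_post(orders: dict[str, dict], body: dict) -> tuple[int, dict]:
    """Single POST handler for the orders resource; branches on the payload."""
    action = body.get("action")  # hypothetical discriminator field

    if action == "create":
        # Pick the next numeric ID; good enough for an in-memory sketch.
        next_id = str(max((int(k) for k in orders), default=0) + 1)
        orders[next_id] = dict(body["order"])
        return 201, {"id": next_id}

    if action == "delete_by_user":
        doomed = sorted(oid for oid, o in orders.items() if o["user"] == body["user"])
        for oid in doomed:
            del orders[oid]
        return 200, {"deleted": doomed}

    # Unknown semantics: reject rather than guess.
    return 400, {"error": "unsupported action"}


if __name__ == "__main__":
    db = {"1": {"user": "u1"}, "2": {"user": "u2"}, "3": {"user": "u1"}}
    print(handle_post(db, {"action": "delete_by_user", "user": "u1"}))
```

Both "forms" target the same resource, so a successful request of either kind invalidates cached representations of that one URI.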

REST API design for resource modification: catch all POST vs multiple endpoints

I'm trying to figure out best or common practices for API design.
My concern is basically this:
PUT /users/:id
In my view this endpoint could be used for a wide array of functions.
I would use it to change the user name or profile, but what about, for example, resetting a password?
From a "model" point of view, that could be a flag, a property of the user, so it would "work" to send a modification.
But I would expect more something like
POST /users/:id/reset_password
But that means that almost for each modification I could create a different endpoint according to the meaning of the modification, i.e
POST /users/:id/enable
POST /users/:id/birthday
...
or even
GET /user/:id/birthday
compared to simply
GET /users/:id
So basically I don't understand when to stop using a single POST/GET and creating instead different endpoints.
It looks to me as a simple matter of choice, I just want to know if there is some standard way of doing this or some guideline. After reading and looking at example I'm still not really sure.
Disclaimer: In a lot of cases, people ask about REST when what they really want is an HTTP compliant RPC design with pretty URLs. In what follows, I'm answering about REST.
In my view this endpoint could be used for a wide array of functions. I would use it to change the user name or profile, but what about, for example, resetting a password?
Sure, why not?
I don't understand when to stop using a single POST/GET and creating instead different endpoints.
A really good starting point is Jim Webber's talk Domain Driven Design for RESTful systems.
First key idea - your resources are not your domain model entities. Your REST API is really a facade in front of your domain model, which supports the illusion that you are just a website.
So your resources are analogous to documents that represent information. The URI identifies the document.
Second key idea - that URI is used by clients to cache representations of the resource, so that we don't need to send requests back to the server all the time. Instead, we have built into HTTP a bunch of standard ways for communicating caching metadata from the server to the client.
Critical to that is the rule for cache invalidation: a successful unsafe request invalidates previously cached representations of the same resource (ie, the same URI).
So the general rule is, if the client is going to do something that will modify a resource they have already cached, then we want the modification request to go to that same URI.
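The invalidation rule can be illustrated with a toy client-side cache. This is only a sketch: TinyCache, the origin callable and the /users/1 data are made up for the example, and a real HTTP cache also honours freshness headers, validators and so on.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


class TinyCache:
    """Toy cache keyed by URI, in front of an `origin(method, uri, body)` callable."""

    def __init__(self, origin):
        self._origin = origin
        self._store = {}
        self.origin_hits = 0  # how many requests actually reached the origin

    def request(self, method, uri, body=None):
        if method == "GET" and uri in self._store:
            return self._store[uri]  # served from cache
        self.origin_hits += 1
        status, payload = self._origin(method, uri, body)
        if method == "GET" and status == 200:
            self._store[uri] = (status, payload)
        elif method not in SAFE_METHODS and 200 <= status < 400:
            # A successful unsafe request invalidates the cached copy of that URI.
            self._store.pop(uri, None)
        return status, payload


def make_origin():
    """Hypothetical origin server holding a single user document."""
    state = {"/users/1": {"name": "Ann"}}

    def origin(method, uri, body=None):
        if uri not in state:
            return 404, {}
        if method == "POST":
            state[uri].update(body or {})
        return 200, dict(state[uri])

    return origin


if __name__ == "__main__":
    cache = TinyCache(make_origin())
    cache.request("GET", "/users/1")
    cache.request("POST", "/users/1", {"name": "Bea"})
    print(cache.request("GET", "/users/1"))
```

Had the change been posted to a different URI, the cached copy of /users/1 would have stayed in place and the client would keep seeing the old name until it expired.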
Your REST API is a facade to make your domain model look like a web site. So if we think about how we might build a web site to do the same thing, it can give us insights to how we arrange our resources.
So to borrow your example, we might have a web page representation of the user. If we were going to allow the client to modify that page, then we might think through a bunch of use cases (enable, change birthday, change name, reset password). For each of these supported cases, we would have a link to a task-specific form. Each of those forms would have fields allowing the client to describe the change, and a url in the form action to decide where the form gets submitted.
Since what the client is trying to achieve is to modify the profile page itself, we would have each of those forms submit back to the profile page URI, so that the client would know to invalidate the previously cached representations if the request were successful.
So your resource identifiers might look like:
/users/:id
/users/:id/forms/enable
/users/:id/forms/changeName
/users/:id/forms/changeBirthday
/users/:id/forms/resetPassword
Where each of the forms submits its information to /users/:id.
That does mean, in your implementation, you are probably going to end up with a lot of different requests routed to the same handler, and so you may need to disambiguate them there.

RESTful endpoints with different permissions

I'm creating an API where, based on the permissions of the authenticated user, different properties of objects can be changed.
What's the common way to approach this problem?
Should I have endpoints like
/admin/users and /users with different API definitions and capabilities?
That sounds like a rather inflexible design. What about a situation where a user can have permissions like can_modify_foo_prop and can_modify_bar_prop?
I was thinking a better solution would be to provide just one endpoint, /users, where some fields would be read-only or hidden based on the authenticated user's roles. That seems more flexible but could be more annoying to document and implement.
Remember, URI means Uniform Resource Identifier. Which means a given user (concept) should be always identified by the same URI, hence I would suggest your second proposal, to have a single hierarchy/list of users:
/users/1
/users/2
...
It is acceptable to define (in the appropriate Media-Type) the returning document to hold properties based on the current user's permissions.
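One way this could look in practice, as a sketch only: a single /users/<id> resource whose representation tells the client which fields it may edit. The permission names come from the question; the EDIT_PERMISSIONS mapping, the "editable" key and both helper functions are assumptions for illustration.

```python
# Hypothetical mapping from a user field to the permission needed to change it.
EDIT_PERMISSIONS = {
    "foo": "can_modify_foo_prop",
    "bar": "can_modify_bar_prop",
}


def represent_user(user: dict, permissions: set[str]) -> dict:
    """Build the representation, listing the fields this caller may edit."""
    editable = sorted(
        field
        for field, needed in EDIT_PERMISSIONS.items()
        if needed in permissions and field in user
    )
    return {"data": dict(user), "editable": editable}


def apply_patch(user: dict, patch: dict, permissions: set[str]) -> None:
    """Apply a partial update, refusing fields the caller may not change."""
    denied = sorted(f for f in patch if EDIT_PERMISSIONS.get(f) not in permissions)
    if denied:
        raise PermissionError(f"not allowed to modify: {', '.join(denied)}")
    user.update(patch)


if __name__ == "__main__":
    u = {"name": "Ann", "foo": 1, "bar": 2}
    print(represent_user(u, {"can_modify_foo_prop"}))
```

The URI stays the same for every caller; only the advertised capabilities differ, which is roughly what an HTML form does when it renders some inputs as read-only.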
Now, regarding whether that is easy to use or not is subjective. I would argue that returning pure data is always somewhat inconvenient, because the client has to parse and understand the data. This is why HTML and HTML Forms were created, so the client does not need to know how to present the data, and also doesn't need to know what's editable and what is not. Then again, I don't know your exact use-case.

Authorization and the query string in REST API design

We have a design where a user has access to products, but only in a certain category.
A superuser can list all products in all categories with a simple GET request, such as GET /products.
Simple filtering already exists via the query string, so the superuser can already restrict his search to category N by requesting GET /products?category=N.
Say that user X is authenticated, and user X has access to products with category of 3.
Should an API server:
mandate that the user pass the filter for the appropriate category -- ie require GET /products?filter=3, and GET /products would fail with 403 Forbidden? -- or
expect a simple GET /products and silently filter the results to those that the user is authorized to access?
expect a simple GET /products and silently filter the results to those that the user is authorized to access?
Changing the representation of a resource depending on which user is looking at it strikes me as a Bad Idea [tm].
Consider, for instance, the use case where a URI gets shared between two users. They each navigate to the "same" resource, but get back representations of two different entities.
Remember - here's what Fielding had to say about resources
The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on. In other words, any concept that might be the target of an author's hypertext reference must fit within the definition of a resource. A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time.
Now, there are conceptual mappings which depend on the perspective of the user (e.g. "today's weather in my city"). But I'm a lot more comfortable addressing those by delegating the work to another resource (Moved Temporarily to "today's weather in Camelot") than I am in treating the authorization credentials as a hidden part of the resource identifier.
If consumers of your api are arriving at this resource by following a link, then 403 Forbidden on the unfiltered spelling is fine. On the other hand, if consumers are supposed to be bookmarking this URI, then implementing it as a redirecting dispatcher may make more sense.
The redirecting approach is consistent with remarks made by Fielding in a conversation about content negotiation on the rest-discuss list in 2006
In general, I avoid content negotiation without redirects like the
plague because of its effect on caching.
E.g., if you follow a link to
http://www.day.com/
then the site will redirect according to the user's preference to one of
http://www.day.com/site/en/index.html
http://www.day.com/site/de/index.html
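The redirecting dispatcher described above could be sketched like this. It is an illustration under assumptions: the user record shape, the "category" query parameter and the choice of 302 (the "Moved Temporarily" mentioned earlier) are picked for the example, and the function only returns a status and headers rather than talking to a real web framework.

```python
def dispatch_products(query: dict[str, str], user: dict) -> tuple[int, dict[str, str]]:
    """Decide how to answer GET /products for an authenticated user.

    Returns (status, headers). 200 means "serve the collection identified by
    this exact URI"; the body itself is out of scope for this sketch.
    """
    if user.get("superuser"):
        return 200, {}

    allowed = str(user["category"])
    requested = query.get("category")

    if requested is None:
        # Unfiltered spelling: send the user to the resource they may see,
        # instead of silently changing what /products means for them.
        return 302, {"Location": f"/products?category={allowed}"}
    if requested == allowed:
        return 200, {}
    return 403, {}


if __name__ == "__main__":
    user_x = {"name": "X", "category": 3}
    print(dispatch_products({}, user_x))
```

Each URI keeps a single meaning this way: /products?category=3 is the same collection for everyone who is allowed to see it, which keeps shared links and caches predictable.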
Both of the solutions are perfectly valid. You can always hide things when the user does not have the permission to see them. This means, that you can display a filtered representation of /products by the actual user, or you can hide the /products resource and give link only to the /products?filter=3 resource. I don't think there is a best practice here, since both solutions are easy to implement.
@VoiceOfUnreason:
Apparently you did not read Fielding's work carefully enough.
Changing the representation of a resource depending on which user is
looking at it strikes me as a Bad Idea [tm].
The representation depends on the whole request, not just on the GET /resource part. The Authorization header is part of the HTTP request as well. So changing the representation depending on the Authorization header is valid in REST. It is just the same as changing the representation depending on the Accept header. Or are you saying that we should serve XML even if somebody requests JSON? It is a huge mistake to think that a single resource can have only a single representation. Another mistake is to return a broken link to the client and let them waste resources by sending pointless HTTP messages...
Consider, for instance, the use case where a URI gets shared between
two users. They each navigate to the "same" resource, but get back
representations of two different entities.
REST is for machine to machine communication and not for human to machine communication! The browser is not a REST client. Or did you mean that a Facebook bot will share the link with a Google crawler? Such nonsense...

REST: Why URI as Data Container?

I am supposed to build web services for an app and thought I could do a nice job by following good practice. As I found out, that means using REST. But there is one thing about it that makes very little sense to me.
Why use URI to pass any variable?
What we did in our last project is use POST only and pass whatever as raw POST data (which was JSON). That's not very RESTful. But it has some advantages. It was quite simple on the client side - I had a general function that takes URI and data as arguments and then it wraps it up and sends it.
Now, if I used proper REST, I would have to pass some data as part of the URI (user ID, for instance). All the other data (username, email, etc.) would have to go as raw data, like we did, I guess. That means I would have to separate the user ID and the other data at some point. That's not so bad but still - why?
EDIT
Here is a more detailed example:
Let's say you want to access (GET) and update (POST) user data. You may have a service accessible under /user, but what a RESTful service would do is accept the user's ID as part of the URI (/user/1234). All the other data (name, email, etc.) would go in the request content (probably as JSON).
What I pose is that it seems useless to put the user ID in the URI. If you wanted to update user data - you would send additional data as content anyway. If you wanted to access it - you could use the same generic method to request the web service.
I know GET gets cached by a browser but I believe you have to cache it manually anyway if you use AJAX (web) or any HTTP client library (other platforms).
From the point of view of scalability - you can always add more services.
You use the URI to identify the resource (user/document/webpage) you want to work with, and pass the related data inside the request.
It has the advantage that web infrastructure components can find out the location of the resource without having any idea how your content is represented. For example, you can use standard caches and load balancers; all they need to know is the URL and headers (which are always represented the same way). Whether you use JSON, protobuf or WAV audio to communicate with your resource is irrelevant.
This will, for example, let you keep different resources in totally different places, such as http://cloud.google.com/resource1 and http://cloud.amazon.com/resource2. If you send it all as content, you lose that ability.
All this will allow you to scale massively, which you won't be able to do if you put it all on http://my.url.com/rest and pass all resource info as content.
Re: Your edit
Passing the user id in the URL is the only way to identify the individual resource (user). Remember, it's the user that's the resource, not the "user store".
For example, a cache that caches http://my.url/user won't be much good, since it would return the same cached page for every user. If the cache can work with http://my.url/user/4711, it can cache every user separately. In the same way, a load balancer could know that users 1-5000 are handled by one machine, 5001-10000 by another etc. and make intelligent decisions based on the URL only.
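The load balancer idea can be sketched in a few lines. This is a toy, not how any particular balancer is configured: the /user/<id> path pattern, the backend names and the shard size of 5000 are assumptions taken from the example numbers above.

```python
import re

# Hypothetical path layout: /user/<numeric id>
USER_PATH = re.compile(r"^/user/(\d+)$")


def pick_backend(path: str, shard_size: int = 5000) -> str:
    """Choose a backend from the URL path alone, without reading the body."""
    match = USER_PATH.match(path)
    if match is None:
        return "backend-default"
    user_id = int(match.group(1))
    # Users 1..5000 map to shard 0, 5001..10000 to shard 1, and so on.
    shard = max(user_id - 1, 0) // shard_size
    return f"backend-{shard}"


if __name__ == "__main__":
    for p in ("/user/4711", "/user/5001", "/user"):
        print(p, "->", pick_backend(p))
```

With the ID buried in a POST body, the same decision would require the intermediary to parse whatever format the service happens to use.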
Imagine a RESTful web service as a database.
To get or modify specific object you need to identify it by providing its primary key.
You identify a user by his ID, not his Name+Nickname+e-mail+mother's maiden name.
The information that identifies an object or selects a set of objects goes to the URL. The information that modifies objects should be POSTed to the corresponding URL.