How to design endpoints for data not considered as a resource in a REST API

How to design endpoints for data not considered as a resource in a REST API - rest

User context:
An school administrator logs into a dashboard. The page displays a block of data at the top of the page:
Number of students who used the service over the past week
The aggregate feedback (positive, negative, neutral) left by the students over the past week in percentages.
Other aggregate data
Underneath is a bunch of charts and graphs representing usage of the service broken down by month, daily usage broken down by hour, etc.
My problem:
I'm trying to build an API following REST principals where endpoints should define a resource and HTTP verbs as the action to take on those resources. My problem is going about building endpoints for this more 'analytical' and aggregate data that doesn't really seem to fit anywhere in my resources. Ideally, each graph or chart could be one request to an endpoint, and the block of aggregate data at the top would also be its own request, rather than 3 requests (1 for each piece of data). Can someone guide me in the right direction on how to go about building the endpoints for these specific scenarios?
Thanks

Can someone guide me in the right direction on how to go about building the endpoints for these specific scenarios?
TL;DR: How would you build a web site to support those scenarios? Do that.
If you were using something like a document store, then you would take the URI, say /feedbackReports/lastWeek, and use that as a key, and pull from the document store a representation of that report, and return it to the client (along with various bits of metadata).
If you were using something like a file system, then you would take the URI, and construct some reference to a file, like /www_root/feedbackReports/lastWeek, and read the representation of that report from disk, and return it to the client (along with various bits of metadata).
Is you were using something like a relational database, then you would take the URI, and see that the "last week" report was being asked for, and from that you would inject a bunch of "-7 days" parameters into prepared statements, and run them, then reshape the data in memory into some representation of that report, and return it to the client (along with various bits of metadata).
I'm trying to build an API following REST principals where endpoints should define a resource and HTTP verbs as the action to take on those resources
The REST principle in question is that the API isolates the clients (and intermediary components) from all of the implementation details. The API is the mask that your application wears so that web integrations just work.
My problem is going about building endpoints for this more 'analytical' and aggregate data that doesn't really seem to fit anywhere in my resources.
So create more resources.
Note: these are integration resources; which is to say that they produce the representations that web clients need to interact with your domain.
Jim Webber, in 2008
URIs do NOT map onto domain objects - that violates encapsulation.
Work (ex: issuing commands to the domain model) is a side effect of
managing resources. In other words, the resources are part of the
anti-corruption layer. You should expect to have many many more
resources in your integration domain than you do business objects
in your business domain.

Related

Rest API design: Managing access to sub-entities

Note: I realize that this is close to being off-topic for being opinion-based, but I am hoping that there is some accepted best practice to handle this that I just don't know about.
My problem is the following: I need to design a Rest API for a program where users can create their own projects, and each project contains files that can only be seen by users that have access. I am stuck with how to design the "List all files of a project" query.
Standard Rest API practice would suggest two endpoints, like:
`GET /projects` # List all projects
`POST /projects` # Create new project
`GET /projects/id` # Get specific project
etc.
and the same for the files of a project.
However, there should never be a reason to list all files - only the files of a single project. To make it more complicated, access management needs to be a thing, users should never see files that are in projects they don't have access to.
I can see multiple ways to handle that:
So the obvious way is to implement the GET function, optionally with a filter. However, this isn't optimal, since if the user doesn't set a filter, it would have to crawl through all projects, check for each project whether the user has access, and then list all files the user has access to:
GET /files?project=test1
I could also make the files command a subcommand of the projects command - e.g.
GET /projects/#id/files
However, I have the feeling this isn't too restful, since it doesn't expose entities directly?
Is there any concencus on which should usually be implemented? Is it okay to "force" users to set a parameter in the first one? Or is there a third alternative that solves what I am looking for? Happy about any literature recommendations on how to design this as well.

Standard Rest API practice would suggest two endpoints
No, it wouldn't. REST practice would suggest figuring out the resources in your resource model.
Think "documents": I should be able to retrieve (GET) a document that describes all of the files in the project. Great! This document should only be accessible when the request authorization matches some access control list. Also good.
Maybe there should also be a document for each user, so they can see a list of all of the projects they have access to, where that document includes links to the "all of the files in the project" documents. And of course that document should also be subject to access control.
Note that "documents" here might be text, or media files, or scripts, or CSS, or pretty much any kind of information that you can transmit over the network. We can gloss the details, because "uniform interface" means that we manage them all the same way.
In other words, we're just designing a "web site" filled with interlinked documents, with access control.
Each document is going to need a unique identifier. That identifier can be anything we want: /5393d5b0-0517-4c13-a821-c6578cb97668 is fine. Because it can be anything we want, we have extra degrees of freedom.
For example, we might design our identifiers such that the document whose identifiers begin with /users/12345 are only accessible by requests with authorization headers that match user 12345, and that all documents whose identifiers begin with /projects/12345 are only accessible by requests with authorization headers that match any of the users that have access to that specific project, and so on.
In other words, it is completely acceptable to choose resource identifier spellings that make your implementation easier.
(Note: in an ideal world, you would have "cool" identifiers that are implementation agnostic, so that they still work even if you change the underlying implementation details of your server.)
I have the feeling this isn't too restful, since it doesn't expose entities directly?
It's fine. Resource models and entity models are different things; we shouldn't expect them to always match one to one.

After looking further, I came across this document from Microsoft. Some quotes:
Also consider the relationships between different types of resources and how you might expose these associations. For example, the /customers/5/orders might represent all of the orders for customer 5. You could also go in the other direction, and represent the association from an order back to a customer with a URI such as /orders/99/customer. However, extending this model too far can become cumbersome to implement. A better solution is to provide navigable links to associated resources in the body of the HTTP response message. This mechanism is described in more detail in the section Use HATEOAS to enable navigation to related resources.
In more complex systems, it can be tempting to provide URIs that enable a client to navigate through several levels of relationships, such as /customers/1/orders/99/products. However, this level of complexity can be difficult to maintain and is inflexible if the relationships between resources change in the future. Instead, try to keep URIs relatively simple. Once an application has a reference to a resource, it should be possible to use this reference to find items related to that resource. The preceding query can be replaced with the URI /customers/1/orders to find all the orders for customer 1, and then /orders/99/products to find the products in this order.
This makes me think that using solution 2 is probably the best case for me, since each file will be associated with only a single project, and should be deleted when a project is deleted. Files cannot exist on their own, outside of projects.

How to name REST API endpoints for very especific features?

So, I'm building a RESTful API, that's like a Parking system, it have ParkingLots and Cars that enter or leave ParkingLots, at this moment, my endpoints looks like this.
POST '/parking-lots' // To create a ParkingLot
POST '/cars' // To create a Car
But, how to name an endpoint that has EnterParkingLot or LeaveParkingLot feature following REST best pratices? I didn't found an article or blog post that answer this question so far.

But, how to name an endpoint that has EnterParkingLot or LeaveParkingLot feature following REST best pratices? I didn't found an article or blog post that answer this question so far.
The key abstraction of information in REST is a resource. Resources are generalizations of documents, not endpoints.
Useful work is a side effect of editing documents (Webber, 2011).
If you are having trouble figuring out the URI, that probably means that you haven't been thinking enough about the documents (aka you don't yet have a clear understanding of your "resource model").
One idea that can sometimes help is to think about a client with cached documents. When they send you one of these messages, which document in the cache is the most important one to invalidate?
In REST, it is normal (not necessarily common) to have a single resource handle many different kinds of edits. For instance, we might submit an EnterParkingLot form, that would update a parking lot resource, and then later submit a LeaveParkingLot form that updates the same parking lot resource (with the code on the server parsing the requests to distinguish the different kinds of edits).
But it would also be fine to add a "parking ticket" resource to your resource model, with a different ticket tracking the arrival and departure of each car.
Domain experts are typically good at telling you what the documents are, and what names make sense.

I would do it like:
POST '/parking-lot/$n/enter'
POST '/parking-lot/$n/leave'

RESTful design for accessing the same resource in different formats from different audiences

Use case
I am building a webshop where people can register/sign in and can purchase (and afterwards manage) their SaaS licenses. For this purpose I created (among others) the following REST endpoints:
// Lists all licenses which are linked to this user
r.Get("/users/{userID}/licenses", api.LicenseSvc.HandleGetLicenses())
// List details (such as purchase date, seat information, ...) for a given licenseID
r.Get("/users/{userID}/licenses/{licenseID}", api.LicenseSvc.HandleGetLicense())
// Creates a new license for the signed in user
r.Post("/users/{userID}/licenses", api.LicenseSvc.HandleCreateLicense())
// Each license has a limited number of seats. The license manager can free up seats to make room for others
r.Delete("/users/{userID}/licenses/{licenseID}/seats/{seatID}", api.LicenseSvc.HandleDeleteSeat())
The above endpoints are only supposed to be used in the webshop/license management panel. At the same time the same service has to serve endpoints for the SaaS products which actually use the license(s) a user has created/purchased before. This SaaS product needs different endpoints such as:
An endpoint to check at startup whether a given license is valid at all
An endpoint which also gets the license by ID (see above), but it should only return a subset of the license resource (e. g. it shouldn't return the date when the license was purchased)
My question
Due to the fact that I am building one REST API which is consumed by two different "audiences" (on the one hand the license manager/customer and on the other hand the SaaS software) I feel like I am running into conflicts.
The authentication of both audiences is different, both audiences want to access the same kind of resource (e. g. a license) but the resource "format" (no customer sensitive data for the SaaS requester) should be different depending on the audience.
Should this be reflected via different REST URLs or should I handle all that logic inside of my route handlers?
Should I even create two different services serving these different audiences? Like one API for the userpanel/license management and one for the SaaS products!?

REST is not very well prepared for these multi-client scenarios. i have seen many REST apis where certain attributes of resources will only be filled under certain circumstances where i find that perfectly fine. however i have also seen many examples of multiple use cases bunched into single resource models where the swagger documentation with its limiting structure terribly fails to communicate the purpose of each field.
so: as long as the use cases do not differ too much, i would try to keep the count of endpoints low.
tip: have a look at GraphQL, it is much better equipped for handling such cases with querying only for certain sets or even only asking for certain fields, putting the client in control. however using GraphQL as the primary interface is still somewhat exotic and comes with quite substantial initial cost compared to the plentiful REST infrastructure available. still worth it.

REST versus more complicated data requests

REST APIs work great for get-one, get-a-list etc.
But our frontend has a dashboard, and one part of the dashboard is a more complicated. It requires a query that aggregates/joins several different resources.
Returning the data is not a problem. But what of the taxonomy of the endpoint that returns this data? Since the data is not a resource, what should the URL look like?

For REST principles it does not matter much if data returned 'aggregates/joins several different resources'. It is implementation detail of underlying data store. The dashboard should not care how exactly that store is implemented, if it uses joins, multiple queries.
Whatever is displayed on dashboard (single item or list of items) still may be treated as resource.
Example: Imagine use case when dashboard shows aggregated user profile from multiple portals (Facebook, Linkedin, etc). You may still have REST resource /user/id for that, even if obtaining that single resource would require many complex operations.

Comparing data with RESTful API

For a website I am working on defining a RESTful API. I believe I got it (mostly) correct using proper resource URIs and proper use of GET/POST/UPDATE/DELETE.
However there is one point where I can't quite figure out what the proper way to do it "in" REST would be - comparison of lists.
Let's say I have a bookstore and a customer can have a wishlist. The wishlist consists of books (their full Book record, i.e. name, synopsis, etc) and a full copy of the list exists on the client. What would be a good way to design the RESTful API to allow a client to query the correctness of its local wishlist (i.e. get to know what books have been added/removed on the wishlist on the server side)?
One option would be to just download the full wishlist from the server and compare it locally. However this is quite a large amount of data (due to the embedded content) and this is a mobile client with a low-bandwidth connection, so this would cause a lot of problems.
Another option would be to download not the whole wishlist (i.e. not including book infos) but only a list of the books' identifiers. This would be not much data (compared to the previous option) and the client could compare the lists locally. However to get the full book record for newly added books, a REST call would have to be made for every single new book. Again, as this is a mobile client with bad network connectivity, this could be problematic.
A third option and my favorite, would be that the client sends its list of identifiers to the server and the server compares it to the wishlist and returns what books were removed and the data for books that were added. This would mean a single roundtrip and only the necessary amount of data. As the wishlist size is estimated to be less than 100 entries, sending just the IDs would be a minimal amount of data (~0.5kb). However I don't know what kind of call would be appropriate - it can't be GET as we are sending data (and putting it all in the URL does not feel right), it can't be POST/UPDATE as we do not change anything on the server. Obviously it's not DELETE either.
How would you implement this third option?
Side-question: how would you solve this problem (i.e. why is option 3 stupid or what better, simple solutions may there be)?
Thank you.
P.S.: A fourth option would be to implement a more sophisticated protocol where the server keeps track of changes to the list (additions/deletes) and the client can e.g. query for changes based on a version identifier or simply a timestamp. However I like the third option better as implementation-wise it is much more simpler and less error-prone on both client and server.

There is nothing in HTTP that says that POST must update the server. People seem to forget the following line in RFC2616 regarding one use of POST:
Providing a block of data, such as the result of submitting a
form, to a data-handling process;
There is nothing wrong with taking your client side wishlist and POSTing to a resource whose sole purpose is to return a set of differences.
POST /Bookstore/WishlistComparisonEngine

The whole concept behind REST is that you leverage the power of the underlying HTTP protocol.
In this case there are two HTTP headers that can help you find out if the list on your mobile device is stale. An added benefit is that the client on your mobile device probably supports these headers natively, which means you won't have to add any client side code to implement them!
If-Modified-Since: check to see if the server's copy has been updated since your client first retrieved it
Etag: check to see if a unique identifier for your client's local copy matches that which is on the server. An easy way to generate the unique string required for ETags on your server is to just hash the service's text output using MD5.
You might try reading Mark Nottingham's excellent HTTP caching tutorial for information on how these headers work.
If you are using Rails 2.2 or greater, there is built in support for these headers.
Django 1.1 supports conditional view processing.
And this MIX video shows how to implement with ASP.Net MVC.

I think the key problems here are the definitions of Book and Wishlist, and where the authoritative copies of Wishlists are kept.
I'd attack the problem this way. First you have Books, which are keyed by ISBN number and have all the metadata describing the book (title, authors, description, publication date, pages, etc.) Then you have Wishlists, which are merely lists of ISBN numbers. You'll also have Customer and other resources.
You could name Book resources something like:
/book/{isbn}
and Wishlist resources:
/customer/{customer}/wishlist
assuming you have one wishlist per customer.
The authoritative Wishlists are on the server, and the client has a local cached copy. Likewise the authoritative Books are on the server, and the client has cached copies.
The Book representation could be, say, an XML document with the metadata. The Wishlist representation would be a list of Book resource names (and perhaps snippets of metadata). The Atom and RSS formats seem good fits for Wishlist representations.
So your client-server synchronization would go like this:
GET /customer/{customer}/wishlist
for ( each Book resource name /book/{isbn} in the wishlist )
GET /book/{isbn}
This is fully RESTful, and lets the client later on do PUT (to update a Wishlist) and DELETE (to delete it).
This synchronization would be pretty efficient on a wired connection, but since you're on a mobile you need to be more careful. As #marshally points out, HTTP 1.1 has a lot of optimization features. Do read that HTTP caching tutorial, and be sure to have your web server properly set Expires headers, ETags, etc. Then make sure the client has an HTTP cache. If your app were browser-based, you could leverage the browser cache. If you're rolling your own app, and can't find a caching library to use, you can write a really basic HTTP 1.1 cache that stores the returned representations in a database or in the file system. The cache entries would be indexed by resource names, and hold the expiration dates, entity tag numbers, etc. This cache might take a couple days or a week or two to write, but it is a general solution to your synchronization problems.
You can also consider using GZIP compression on the responses, as this cuts down the sizes by maybe 60%. All major browsers and servers support it, and there are client libraries you can use if your programming language doesn't already (Java has GzipInputStream, for instance).

If I strip out the domain-specific details from your question, here's what I get:
In your RESTful client-server application, the client stores a local copy of a large resource. Periodically, the client needs to check with the server to determine whether its copy of the resource is up-to-date.
marshally's suggestion is to use HTTP caching, which IMO is a good approach provided it can be done within your app's constraints (e.g., authentication system).
The downside is that if the resource is stale in any way, you'll be downloading the entire list, which sounds like it's not feasible in your situation.
Instead, how about re-evaluating the need to keep a local copy of the Wishlist in the first place:
How is your client currently using the local Wishlist?
If you had to, how would you replace the local copy with data fetched from the server?
What have you done to minimize your client's data requirements when building its Wishlist view(s) and executing business logic?

Your third alternative sounds nice, but I agree that it doesn't feel to RESTfull ...
Here's another suggestion that may or may not work: If you keep a version history of of your list, you could ask for updates since a specific version. This feels more like something that can be a GET operation. The version identifiers could either be simple version numbers (like in e.g. svn), or if you want to support branching or other non-linear history they could be some kind of checksums (like in e.g. monotone).
Disclaimer: I'm not an expert on REST philosophy or implementation by any means.
Edit: Did you ad that PS after I loaded the question? Or did I simply not read your question all the way through before writing an answer? Sorry. I still think the versioning might be a good idea, though.