Transient REST Representations - rest

Let's say I have a RESTful, hypertext-driven service that models an ice cream store. To help manage my store better, I want to be able to display a daily report listing quantity and dollar value of each kind of ice cream sold.
It seems like this reporting capability could be exposed as a resource called DailyReport. A DailyReport can be generated quickly, and there doesn't seem to be any advantage to actually storing reports on the server. I only want a DailyReport for some days, other days I don't care about getting a DailyReport. Furthermore, storing DailyReports on the server would complicate client implementations, which would need remember to delete reports they no longer need.
A DailyReport is transient; its representation can be retrieved only once. One way to implement this would be to offer a link "/daily-reports", a POST to which will return a response containing a DailyReport representation listing the information for that day's sales.
Edit: Let's also say that I really do want to do a POST request. A DailyReport has many different options for creating a view such as sorting ice cream types alphabetically, by dollar value - or including an hourly breakdown - or optionally including the temperature for that day - or filtering out certain ice cream types (as a list). Rather than using query parameters with a GET, I'd rather POST a DailyReport representation with the appropriate options (using a well-defined custom media type to document each option). The representation I get back would display my options along with the report itself.
Is this the correct way to think about the problem, or should some other approach be used instead? If correct, what special considerations might be important when implementing the DailyReport resource? (For example, it probably wouldn't be appropriate to set the Location header when returning after a POST request).

If you want to make daily reports for past days available, you could implement it as a GET to /daily_reports/2009/08/20. I agree with John Millikin that a POST is unnecessary here - there's no need for something like this to be a user-creatable resource.
The advantage of making the report for each day available as its own URI is cacheability.
EDIT: A good solution might be to merge the two answers, making daily_report/ a no-cache representation of the current day's data and daily_reports/yyyy/mm/dd a cacheable representation of a full day's data.

There's no need to use a POST for this, since requesting the report doesn't change the state of the server. I would use a resource like this:
GET /daily-report/
200 OK
Pragma: no-cache
<daily-report for="2009-04-20" generated-at="2009-4-20T12:13:14Z">
<!-- contents of the report here -->
</daily-report>
Responding to your edit: if you are POSTing a description of the report to a URL, and retrieving a temporary data set as a result, that's not REST at all. It's RPC, in the same vein as SOAP. RPC is not an inherently bad thing, but please, please don't call it RESTful.

Sometimes it is desirable to keep a record of requests for reports, in those cases it is not unreasonable to POST to a collection resource. It is also useful for long running reports where you want handle the execution asynchronously. How long the server holds onto those report requests is up to you.
I would do something like
POST /DailyReportRequests
which would return a representation of the request, including options, and when the report is completed, a link to the report results.
Another alternative which is good when you have a set of pre-canned reports is to create a DailyReports resource that contains a list of preconfigured report links. The OpenSearchDescription spec allows you to do something similar to this using the Query tag.

I think Greg's approach is the correct one. To expound upon it, I don't think you should provide a /daily-report resource that changes daily, because running the report on Tuesday at 11:59 would yield different results than running it Wednesday at 00:01, which can be A) confusing for clients expecting the resource to be the same, and B) doesn't allow clients to retrieve a previous day's data after the day has passed. You should provide a unique resource identifier for each daily report that's available, that way clients can access the information they need at any time.

Related

REST resource for checking if a resource has existing subresources

My application provides the following resource:
GET /user/:id/orders
As commonly used, this returns a list of all the user's orders.
Now, a client wants to check if a user has any orders at all without actually getting the complete list.
My current approach looks like this:
GET /user/:id/orders/exist
But it looks kind of odd to me.
Is there a more "standard" way of designing this? In the end, this resource only needs to return the information:
yes, user has orders
no, user doesn't have any orders
What you will see in some API is the notion of a resource that exists (204) or does not exist (404).
But I really don't recommend that: saving a few bytes in your representation of the resource doesn't help very much when you are already sending a response-line and a bunch of HTTP headers.
Your "resource model" can be anything you want.
The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction -- Fielding, 2000
So you can create fine grained resources if you like; but there are can be consequences to that. "Tradeoffs" are a thing.
Resources are generalizations of "documents"; if it makes sense to have a report that is just a count of the number of orders, or a statement that the number of orders is greater than zero, or whatever, then that report can certainly be a resource in your resource model.
If you know what the report is, then you might be able to guess at a name for the report, and from there to a spelling for it.
There's no particular reason that the report identifier has to be part of the /user hierarchy; the machines don't care what spelling conventions you use.
/user/:id/orders/report
/user/:id/orders-report
/user/:id/report
/report/:id
/report?user=:id
/report/user=:id
Those are all fine; choose whichever variation is appropriate to your local conventions.
Note that you want to be aware of caching - when you have information in two different resources, it is easy for the client's locally cached copies to contradict each other (report says that there are orders, but the orders list is empty; or the other way around). As far as REST, and general purpose components are concerned, different resources are different, and vary independently of each other.
In the large grained world, you don't have that problem so often, because you throw the kitchen sink into a single resource; as long as its produced representations are internally consistent, the cached copies will be as well.

What REST method should be used to implement a simple inter-process communication?

This is more a theorical question than a practical one.
We have a backend application that uploads csv files to a frontend application, then and only then the backend sends an empty POST request to tell the frontend to start to process those files to update its database.
For this question it doesn't matter if this is a good design (I think it isn't), what are those files, and what database is: I am only want to know better about the REST "sintax".
I'm referring to wikipedia and restfulapi.net, but I'm not convinced about any alternative, because:
GET: Request sender doesn't receive data;
POST (the currently used): Request sender doesn't want to insert data that are on the request body (just data from external files, if existent. Also they can be insert/update/delete);
PUT: Sounds good, but again, data are not on the request body;
PATCH: Sounds best, but data are not on the body (Also, I am wrong or is it deprecated/unused?);
DELETE: Doesn't always need to delete.
I know it is habit to use POST requests to let machines yell "go!" to each other, but I never thought it was right.
What do you think - in theory - would be the proper method?
The actual reference for the semantics of the HTTP methods is the RFC 7231 and not the ones you referenced in your question.
POST is a catch all method and requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.
4.3.3. POST
The POST method requests that the target resource process the
representation enclosed in the request according to the resource's
own specific semantics. For example, POST is used for the following
functions (among others):
Providing a block of data, such as the fields entered into an HTML
form, to a data-handling process;
Posting a message to a bulletin board, newsgroup, mailing list,
blog, or similar group of articles;
Creating a new resource that has yet to be identified by the
origin server; and
Appending data to a resource's existing representation(s).
[...]
Responses to POST requests are only cacheable when they include
explicit freshness information. However, POST caching is not widely implemented.
In these scenarios, the receiving application knows where the CSV files will be and monitors that location. When it finds one, it processes it and then deletes or archives it. The application will likely have its own criteria for considering itself ready to process, e.g. time of day, size of file etc.
If the data load on the front end takes a long time you could "partition" the updates based on "importance". How you define importance would be up to your business rules. You could then POST a list of CSV filenames/locations to the front end. The list would be ordered by importance. The front end could then update its database based on that importance. Scheduling less important data for a more appropriate time of day.
If the backend knows the difference between new users and updated users you could use PUT and POST. The front end could assign higher priority to PUT requests as they relate to new users, perhaps assigning lower priority and staggered syncing for CSV filenames in POST requests.

REST Best practise for filtering and knowing the result is singular: List or single?

Variety of REST practises suggest (i.e. 1, 2, 3) to use plurals in your endpoints and the result is always a list of objects, unless it's filtered by a specific value, such as /users/123 Query parameters are used to filter the list, but still result in a list, nevertheless. I want to know if my case should 'abandon' those best practices.
Let's use cars for my example below.
I've got a database full of cars and each one has a BuildNumber ("Id"), but also a model and build year which combination is unique. If I then query for /cars/ and search for a specific model and year, for example /cars?model=golf&year=2018 I know, according to my previous sentence, my retrieve will always contain a single object, never multiple. My result, however, will still be a list, containing just one object, nevertheless.
In such case, what will be the best practise as the above would mean the object have to be extracted from the list, even though a single object could've been returned instead.
Stick to best practises and export a list
Make a second endpoind /car/ and use the query parameters ?model=golf&year=2018, which are primarily used for filtering in a list, and have the result be a single object, as the singular endpoint states
The reason that I'm asking this is simply for the cleanness of the action: I'm 100% sure my GET request will result in single object, but still have to perform actions to extract it from the list. These steps should've been unnecessary. Aside of that, In my case I don't know the unique identifier, so cars/123 for retrieving a specific car isn't an option. I know, however, filters that will result in one object and one specific object altogether. The additional steps simply feel redundant.
1: https://learn.microsoft.com/en-us/azure/architecture/best-practices/api-design
2: https://blog.mwaysolutions.com/2014/06/05/10-best-practices-for-better-restful-api/
3: https://medium.com/hashmapinc/rest-good-practices-for-api-design-881439796dc9
As you've specifically asked for best practices in regards to REST:
REST doesn't care how you specify your URIs or that semantically meaningful tokens are used inside the URI at all. Further, a client should never expect a certain URI to return a certain type but instead rely on content-type negotiation to tell the server all of the capabilities the client supports.
You should furthermore not think of REST in terms of object orientation but more in terms of affordance and statemachines where a client get served every information needed in order to make an educated decision on what to do next.
The best sample to give here is probably to take a close look at the Web and how it's done for HTML pages. How can you filter for a specific car and how it will be presented to you? The same concepts that are used in the Web also apply to REST as both use the same interaction model. In regards to your car sample, the API should initially return some control-structures that teach a client how a request needs to be formed and what options could be filtered for. In HTML this is done via forms. For non-HTML based REST APIs dedicated media-types should be defined that translate the same approach to non-HTML structures. On sending the request to the server, your client would include all of the supported media-types it supports in an Accept HTTP header, which informs the server about the capabilities of the client. Media-types are just human-readable specification on how to process payloads of such types. Such specifications may include hints on type information a link relation might return. In order to gain wide-usage of media-types they should be defined as generic as possible. Instead of defining a media-type specific for a car, which is possible, it probably would be more convenient to use an existing or define a new general data-container format (similar to HTML).
All of the steps mentioned here should help you to design and implement an API that is free to evolve without having to risk to break clients, that furthermore is also scalable and minimizes interoperability concerns.
Unfortunately your question targets something totally different IMO, something more related to RPC. You basically invoke a generic method via HTTP on an endpoint, similar like SOAP, RMI or CORBA work. Whether you respect the semantics of HTTP operations or not is only of sub-interest here. Even if you'd reached level 3 of the Richardson Maturity Model (RMM) it does not mean that you are compliant to REST. Your client might still break if the server changes anything within the response. The RMM further doesn't even consider media-types at all, hence I consider it as rather useless.
However, regardless if you use a (true) REST or RPC/CRUD client, if retrieving single items is your preference instead of feeding them into a collection you should consider to include the URI of the items of interest instead of its data directly into the collection, as Evert also has suggested. While most people seem to be concerned on server performance and round-trip-times, it actually is very elegant in terms of caching. Further certain link-relation names such as prefetch may inform the client that it may fetch the targets payload early as it is highly possible that it's content will be requested next. Through caching a request might not even have to be triggered or sent to the server for processing, which is probably the best performance gain you can achieve.
1) If you use query like cars/where... - use CARS
2) If you whant CAR - make method GetCarById
You might not get a perfect answer to this, because all are going to be a bit subjective and often in a different way.
My general thought about this is that every item in my system will have its own unique url, for example /cars/1234. That case is always singular.
But this specific item might appear as a member in collections and search results. When /cars/1234 apears in these, they will always appear as a list with 1 item (or 0 or more depending on the query).
I feel that this is ultimately the most predictable.
In my case though, if a car appears as a member of a search or colletion, it's 'true url' will still be displayed.

What to do about huge resources in REST API

I am bolting a REST interface on to an existing application and I'm curious about what the most appropriate solution is to deal with resources that would return an exorbitant amount of data if they were to be retrieved.
The application is an existing timesheet system and one of the resources is a set of a user's "Time Slots".
An example URI for these resources is:
/users/44/timeslots/
I have read a lot of questions that relate to how to provide the filtering for this resource to retrieve a subset and I already have a solution for that.
I want to know how (or if) I should deal with the situation that issuing a GET on the URI above would return megabytes of data from tens or hundreds of thousands of rows and would take a fair amount of server resource to actually respond in the first place.
Is there an HTTP response that is used by convention in these situations?
I found HTTP code 413 which relates to a Request entity that is too large, but not one that would be appropriate for when the Response entity would be too large
Is there an alternative convention for limiting the response or telling the client that this is a silly request?
Should I simply let the server comply with this massive request?
EDIT: To be clear, I have filtering and splitting of the resource implemented and have considered pagination on other large collection resources. I want to respond appropriately to requests which don't make sense (and have obviously been requested by a client constructing a URI).
You are free to design your URIs as you want encoding any concept.
So, depending on your users (humans/machines) you can use that as a split on a conceptual level based on your problem space or domain. As you mentioned you probably have something like this:
/users/44/timeslots/afternoon
/users/44/timeslots/offshift
/users/44/timeslots/hours/1
/users/44/timeslots/hours/1
/users/44/timeslots/UTC1624
Once can also limit by the ideas/concepts as above. You filter more by adding queries /users/44/timeslots?day=weekdays&dow=mon
Making use or concept and filters like this will naturally limit the response size. But you need to try design your API not go get into that situation. If your client misbehaves, give it a 400 Bad Request. If something goes wrong on your server side use a 5XX code.
Make use of one of the tools of REST - hypermedia and links (See also HATEOAS) Link to the next part of your hypermedia, make use of "chunk like concepts" that your domain understands (pages, time-slots). No need to download megabytes which also not good for caching which impacts scalability/speed.
timeslots is a collection resource, why won't you simply enable pagination on that resource
see here: Pagination in a REST web application
calling get on the collection without page information simply returns the first page (with a default page size)
Should I simply let the server comply with this massive request?
I think you shouldn't, but that's up to you to decide, can the server handle big volumes? do you find it a valid usecase?
This may be too weak of an answer but here is how my team has handled it. Large resources like that are Required to have the additional filtering information provided. If the filtering information is not there to keep the size within a specific range then we return an Internal Error (500) with an appropriate message to denote that it was a failure to use the RESTful API properly.
Hope this helps.
You can use a custom Range header - see http://otac0n.com/blog/2012/11/21/range-header-i-choose-you.html
Or you can (as others have suggested) split your resource up into smaller resources at different URLs (representing sections, or pages, or otherwise filtered versions of the original resource).

RESTful way to create multiple items in one request

I am working on a small client server program to collect orders. I want to do this in a "REST(ful) way".
What I want to do is:
Collect all orderlines (product and quantity) and send the complete order to the server
At the moment I see two options to do this:
Send each orderline to the server: POST qty and product_id
I actually don't want to do this because I want to limit the number of requests to the server so option 2:
Collect all the orderlines and send them to the server at once.
How should I implement option 2? a couple of ideas I have is:
Wrap all orderlines in a JSON object and send this to the server or use an array to post the orderlines.
Is it a good idea or good practice to implement option 2, and if so how should I do it.
What is good practice?
I believe that another correct way to approach this would be to create another resource that represents your collection of resources.
Example, imagine that we have an endpoint like /api/sheep/{id} and we can POST to /api/sheep to create a sheep resource.
Now, if we want to support bulk creation, we should consider a new flock resource at /api/flock (or /api/<your-resource>-collection if you lack a better meaningful name). Remember that resources don't need to map to your database or app models. This is a common misconception.
Resources are a higher level representation, unrelated with your data. Operating on a resource can have significant side effects, like firing an alert to a user, updating other related data, initiating a long lived process, etc. For example, we could map a file system or even the unix ps command as a REST API.
I think it is safe to assume that operating a resource may also mean to create several other entities as a side effect.
Although bulk operations (e.g. batch create) are essential in many systems, they are not formally addressed by the RESTful architecture style.
I found that POSTing a collection as you suggested basically works, but problems arise when you need to report failures in response to such a request. Such problems are worse when multiple failures occur for different causes or when the server doesn't support transactions.
My suggestion to you is that if there is no performance problem, for example when the service provider is on the LAN (not WAN) or the data is relatively small, it's worth it to send 100 POST requests to the server. Keep it simple, start with separate requests and if you have a performance problem try to optimize.
Facebook explains how to do this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Simple batched requests
The batch API takes in an array of logical HTTP requests represented
as JSON arrays - each request has a method (corresponding to HTTP
method GET/PUT/POST/DELETE etc.), a relative_url (the portion of the
URL after graph.facebook.com), optional headers array (corresponding
to HTTP headers) and an optional body (for POST and PUT requests). The
Batch API returns an array of logical HTTP responses represented as
JSON arrays - each response has a status code, an optional headers
array and an optional body (which is a JSON encoded string).
Your idea seems valid to me. The implementation is a matter of your preference. You can use JSON or just parameters for this ("order_lines[]" array) and do
POST /orders
Since you are going to create more resources at once in a single action (order and its lines) it's vital to validate each and every of them and save them only if all of them pass validation, ie. you should do it in a transaction.
I've actually been wrestling with this lately, and here's what I'm working towards.
If a POST that adds multiple resources succeeds, return a 200 OK (I was considering a 201, but the user ultimately doesn't land on a resource that was created) along with a page that displays all resources that were added, either in read-only or editable fashion. For instance, a user is able to select and POST multiple images to a gallery using a form comprising only a single file input. If the POST request succeeds in its entirety the user is presented with a set of forms for each image resource representation created that allows them to specify more details about each (name, description, etc).
In the event that one or more resources fails to be created, the POST handler aborts all processing and appends each individual error message to an array. Then, a 419 Conflict is returned and the user is routed to a 419 Conflict error page that presents the contents of the error array, as well as a way back to the form that was submitted.
I guess it's better to send separate requests within single connection. Of course, your web-server should support it
You won't want to send the HTTP headers for 100 orderlines. You neither want to generate any more requests than necessary.
Send the whole order in one JSON object to the server, to: server/order or server/order/new.
Return something that points to: server/order/order_id
Also consider using CREATE PUT instead of POST