REST API for processed data

It may be opinionated and not belong here, but I don't seem to find any info about this.
I have a 'sales' resource so I can GET, POST, PUT, DELETE sales of my store.
I want to build a dashboard with information about those sales, e.g. the average sales per day over the last month.
Since REST is resource-oriented, does that mean I have to retrieve all sales of the last month with GET /sales?sale_date>=... and calculate the average per day on the client? That doesn't seem optimal, since I could have thousands of sales in that period.
Also, I don't think REST can allow a URL like GET /sales/average-per-day-last-month. What is the alternative?

I don't think REST can allow a URL like GET /sales/average-per-day-last-month
Change your thinking - that's a perfectly satisfactory URL for a perfectly satisfactory resource.
Any information that can be named can be a resource -- Fielding, 2000
"Any information that can be named" of course includes "any sales report that you can imagine".
GET /sales/average-per-day-last-month HTTP/x.y
That's perfect. Depending on your needs, you might also want to have other resources that are the "average-per-day" report for specific months.
GET /sales/2021-02/average-per-day HTTP/x.y
REST doesn't constrain your resource model at all; and it doesn't constrain your resource identifier design beyond expecting conformance to the production rules described by RFC 3986.
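As a rough sketch of what could sit behind such a report resource, the server can compute the figure itself and return only the result. The function name average_sales_per_day, the (date, amount) shape of a sale, and the reading of "average sales" as a count of sales per calendar day are all assumptions made for illustration:

```python
from calendar import monthrange
from datetime import date


def average_sales_per_day(sales, year, month):
    """Return the average number of sales per calendar day in the given month.

    `sales` is an iterable of (sale_date, amount) pairs; this shape is an
    assumption made for illustration.
    """
    days_in_month = monthrange(year, month)[1]
    count = sum(1 for sale_date, _ in sales
                if sale_date.year == year and sale_date.month == month)
    return count / days_in_month


# What a handler for GET /sales/2021-02/average-per-day might return.
sales = [
    (date(2021, 2, 1), 10.0),
    (date(2021, 2, 1), 5.0),
    (date(2021, 2, 14), 7.5),
    (date(2021, 3, 2), 3.0),  # outside February, ignored
]
report = {"month": "2021-02",
          "average_per_day": average_sales_per_day(sales, 2021, 2)}
```

The client then receives one small representation instead of thousands of individual sales.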

Related

POST to get REST resource - three approaches - which one would you recommend?

I have a REST resource (e.g. Tickets). To obtain the set of Tickets that match a given set of constraints (e.g. start date, end date, price and other criteria), a user needs to pass information. This information can be included as query parameters, and the protocol can define:
GET: Tickets?start-date=date&end-date=date&price=someprice...
The set of constraints to pass could be a lot.
In such situations, is it better to use a POST and pass the set of constraints as JSON object within the body?
POST: Tickets
Body:
{
"start-date": "date",
"end-date" : "date"
. . .
}
What are the drawbacks of such an approach? Does it still agree with the REST guidelines? Ref: http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
Another alternative is the client could create a new resource called "Constraints" on the server, obtain a constraint-id (ex:123) as a response. Then it could use:
GET: Tickets?constraints-id=123
But this means that the server will periodically have to expire and delete "Constraints" objects, as clients might keep creating them without completing the business flow (e.g. without confirming a Ticket in the end).
A third approach could be to still use POST, but not create any resource. We can use a URI scheme like this:
POST: Tickets/Constraints
Body:
{
"start-date": "date",
"end-date" : "date"
. . .
}
Response:
200 OK ...
Tickets
This means that although no resource was created on the server, the need to POST the constraints to obtain Tickets is still made clear.
Which of these approaches would you recommend? Which is most intuitive? Or is there any other alternative you would recommend?

Simply according to the HTTP spec, a POST is not a valid method to send a large amount of data for a query, as the intention is that the body of the request is to be stored by the server in some way, which is not the case in your example.
My current project faced the same problem and we decided to go with the more correct GET with many templated query parameters. We support over a dozen query parameters, some of which can be quite long, but most servers specify a maximum GET request length of 8KB, which I would expect to be ample. I suppose this limit could be reached if you were attempting to send a GET that repeats the same query parameter many times to describe a long list, but if this is the case then I would suggest taking a step back and looking at how this became a requirement of the API.
In my opinion a GET is the most intuitive and clearest use, and definitely seems to be the "correct" RESTful implementation. If the size of the request is an issue for you and you control the environment you are deploying to, you can even increase your server's max request size.
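A minimal sketch of the GET approach, assuming a hypothetical helper build_ticket_query and an 8 KB limit taken from the figure above (the real limit depends on the server you deploy to):

```python
from urllib.parse import urlencode

MAX_REQUEST_LINE = 8 * 1024  # assumed limit; check your server's configuration


def build_ticket_query(base_url, **constraints):
    """Build a GET URL for the Tickets collection from keyword constraints."""
    # Underscores in keyword names become hyphens, e.g. start_date -> start-date.
    params = {key.replace("_", "-"): value
              for key, value in constraints.items() if value is not None}
    url = f"{base_url}?{urlencode(params)}"
    if len(url) > MAX_REQUEST_LINE:
        raise ValueError("query is too long for a single GET request")
    return url


url = build_ticket_query("/Tickets", start_date="2024-01-01",
                         end_date="2024-01-31", price=50)
```

Constraints that are not set are simply left out of the URL, so the same helper covers short and long queries.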

Yes, definitely OK and a good idea, especially if the post data is large, as it may exceed the max url length. It is better as part of the body of the message rather than on the url.

REST API Design: Nested Collection vs. New Root

This question is about optimal REST API design and a problem I'm facing to choose between nested resources and root level collections.
To demonstrate the concept, suppose I have collections City, Business, and Employees. A typical API may be constructed as follows. Imagine that ABC, X7N and WWW are keys, e.g. guids:
GET Api/City/ABC/Businesses (returns all Businesses in City ABC)
GET Api/City/ABC/Businesses/X7N (returns business X7N)
GET Api/City/ABC/Businesses/X7N/Employees (returns all employees at business X7N)
PUT Api/City/ABC/Businesses/X7N/Employees/WWW (updates employee WWW)
This appears clean because it follows the original domain structure: businesses are in a city, and employees are at a business. Individual items are accessible via key under the collection (e.g. ../Businesses returns all businesses, while ../Businesses/X7N returns the individual business).
Here is what the API consumer needs to be able to do:
Get businesses in a city (GET Api/City/ABC/Businesses)
Get all employees at a business (GET Api/City/ABC/Businesses/X7N/Employees)
Update individual employee information (PUT Api/City/ABC/Businesses/X7N/Employees/WWW)
That second and third call, while appearing to be in the right place, use a lot of parameters that are actually unnecessary.
To get employees at a business, the only parameter needed is the key of the business (X7N).
To update an individual employee, the only parameter needed is the key of the employee (WWW).
Nothing in the backend code requires non-key information to look up the business or update the employee. So, instead, the following endpoints appear better:
GET Api/City/ABC/Businesses (returns all Businesses in City ABC)
GET Api/Businesses/X7N (returns business X7N)
GET Api/Businesses/X7N/Employees (returns all employees at business X7N)
PUT Api/Employees/WWW (updates employee WWW)
As you can see, I've created a new root for businesses and employees, even though from a domain perspective they are a sub/sub-sub-collection.
Neither solution appears very clean to me.
The first example asks for unnecessary information, but is structured in a way that appears "natural" to the consumer (individual items from a collection are retrieved via lower leafs)
The second example only asks for necessary information, but isn't structured in a "natural" way - subcollections are accessible via roots
The individual employee root would not work when adding a new employee, as we need to know which business to add the employee to. That call would at least have to reside under the Business root, such as POST Api/Businesses/X7N/Employees, which makes everything even more confusing.
Is there a cleaner, third way that I'm not thinking of?

I don't see how REST adds a constraint that two resources could not have the same value. The resourceType/ID is just an example of the easiest use case rather than the best way to go from a RESTful point of view.
If you read paragraph 5.2.1.1 of Roy Fielding's dissertation carefully, you will notice that Fielding makes the distinction between a value and a resource. Now a resource should have a unique URI, that's true. But nothing prevents two resources from having the same value:
For example, the "authors' preferred version" of an academic paper is a mapping whose value changes over time, whereas a mapping to "the paper published in the proceedings of conference X" is static. These are two distinct resources, even if they both map to the same value at some point in time. The distinction is necessary so that both resources can be identified and referenced independently. A similar example from software engineering is the separate identification of a version-controlled source code file when referring to the "latest revision", "revision number 1.2.7", or "revision included with the Orange release."
So nothing prevents you from, as you say, changing the root. In your example, a Business is a value, not a resource. It is perfectly RESTful to create a resource which is a list of "every business located in a city" (just like Roy's example, "revision included with the Orange release"), while having a "business whose ID is x" resource as well (like "revision number x").
For Employees, I would keep API/Businesses/X7N/Employees as the relation between a business and its employees is a composition relationship, and thus as you say, Employees can and should only be accessed through the Businesses class root. But this is not a REST requirement, and the other alternative is perfectly RESTful as well.
Note that this goes hand in hand with the application of the HATEOAS principle. In your API, the list of Businesses located in a city could (and perhaps should, from a theoretical point of view) be just a list of links to API/Businesses. But this would mean that clients would have to do one round-trip to the server for each item in the list. This is not efficient, so to stay pragmatic, what I do is embed the representation of each business in the list along with a self link to its URI, which in this example would be under API/Businesses.
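A small sketch of that embedding, assuming a JSON-like dictionary layout with a "links" key; the field names and the helper business_list_for_city are hypothetical and not taken from any particular hypermedia format:

```python
def business_list_for_city(city_id, businesses):
    """Build a collection representation that embeds each business with a self link."""
    return {
        "links": {"self": f"/Api/City/{city_id}/Businesses"},
        "items": [
            {
                "id": b["id"],
                "name": b["name"],
                # The self link points at the root-level business resource.
                "links": {"self": f"/Api/Businesses/{b['id']}"},
            }
            for b in businesses if b["city"] == city_id
        ],
    }


businesses = [
    {"id": "X7N", "name": "Example Bakery", "city": "ABC"},
    {"id": "Q2R", "name": "Other Shop", "city": "DEF"},
]
listing = business_list_for_city("ABC", businesses)
```

The client gets the data it needs in one round-trip and still follows the self link instead of constructing URLs itself.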

You should not confuse REST with the application of a specific URI naming convention.
HOW the resources are named is entirely secondary. You are trying to use HTTP resource naming conventions - this has nothing to do with REST. Roy Fielding himself states so repeatedly in the documents quoted above by others. REST is not a protocol, it is an architectural style.
In fact, Roy Fielding states in his 2008 blog comment (http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven 6/20/2012):
"A REST API must not define fixed resource names or hierarchies (an obvious coupling of
client and server). Servers must have the freedom to control their own namespace. Instead,
allow servers to instruct clients on how to construct appropriate URIs, such as is done in
HTML forms and URI templates, by defining those instructions within media types and link relations."
So in essence:
The problem you describe is not actually a problem of REST - conceptually, it is a problem of HIERARCHY STRUCTURES versus RELATIONAL STRUCTURES.
While a business is "in" a city and so can be considered part of the city "hierarchy", what about international companies that have offices in 75 cities? Then the city suddenly becomes the junior element in a hierarchy, with the business name at the senior level of the structure.
The point is, you can view data from various angles, and depending on the viewpoint you take, it may be simplest to see it as a hierarchy. But the same data can be seen as a hierarchy with different levels. When you are using HTTP type resource names, then you have entered a hierarchy structure defined by HTTP. This is a constraint, yes, but it's not a REST constraint, it's a HTTP constraint.
From that angle, you can choose the solution which fits your scenario better. If your customer cannot supply the city name when he supplies the company name (he may not know it), then it would be better to have a key with only the company name. As I said, it's up to you, and REST won't stand in your way ...
More to the point:
The only real REST constraints you have, if you have already decided to use HTTP with GET, PUT and so on, are:
Thou shalt not presumeth any prior ("out of band") knowledge between client and servers. *
Look at your proposal #1 above in that light. You assume that customers know the keys for the cities which are contained in your system? Wrong - that's not restful. So the server has to give the list of cities as a list of choices in some way. So are you going to list every city in the world here?
I guess not, but then you'll have to do some work on how you are planning to do this, which brings us to:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state ...
I think, reading the mentioned Roy Fielding blog will help you out considerably.

In a RESTful-API URL design should be quite unimportant - or at least a side issue since the discoverability is encoded in the hypertext and not in the URL path. Have a look at the resources linked in the REST tag wiki here on StackOverflow.
But if you want to design human-readable URLs for your use case, I would suggest the following:
Use the resource type you are creating/updating/querying as the first part of the URL (after your API prefix). That way anybody who sees the URL immediately knows which resources it points to. GET /Api/Employees... is the only way to receive Employee resources from the API.
Use unique IDs for each resource, independent of the relations they are in. So GET /Api/<CollectionType>/UniqueKey should return a valid resource representation. Nobody should have to worry about where the Employee is located. (But the returned Employee should have links to the Business, and for convenience's sake the City, he belongs to.) GET /Api/Employees/Z6W returns the Employee with this ID no matter where it is located.
If you want to query for specific resources: put your query parameters at the end (instead of in the hierarchical order described in the question). You can use the URL query string (GET /Api/Employees?City=X7N) or a matrix parameter expression (GET /Api/Employees;City=X7N;Business=A4X,A5Y). This will allow you to easily express a collection of all Employees in a specific City, independent of the Business they are in.
Side note:
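For the matrix parameter form, a server has to split the path itself. A minimal sketch, assuming a hypothetical helper parse_matrix_params that only handles parameters attached to the end of the path:

```python
def parse_matrix_params(path):
    """Split '/Api/Employees;City=X7N;Business=A4X,A5Y' into (path, params)."""
    segment, *pairs = path.split(";")
    params = {}
    for pair in pairs:
        name, _, value = pair.partition("=")
        # Comma-separated values become a list, so one name can carry several keys.
        params[name] = value.split(",")
    return segment, params


resource, filters = parse_matrix_params("/Api/Employees;City=X7N;Business=A4X,A5Y")
```

The query string variant needs no custom parsing, since most HTTP libraries already decode it.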
In my experience an initial hierarchical domain data model seldom survives additional requirements that come up during a project. In your case: Consider a business located in two Cities. You could create a workaround by modelling it as two separate businesses but what about the employee who works half his time in one place and the other half at the other location? Or even worse: It's only clear for which business he works but it's undefined, in which city?

The third way that I see is to make Businesses and Employees root resources and use query parameters to filter collections:
GET Api/Businesses?city=ABC (returns all Businesses in City ABC)
GET Api/Businesses/X7N (returns business X7N)
GET Api/Employees?businesses=X7N (returns all employees at business X7N)
PUT Api/Employees/WWW (updates employee WWW)
Both of your solutions use the concept of REST sub-resources, which requires that a sub-resource be included in its parent resource, so:
GET Api/City/ABC/Businesses
in response should also return data provided by:
GET Api/City/ABC/Businesses/X7N
GET Api/City/ABC/Businesses/X7N/Employees
similar for:
GET Api/Businesses/X7N
which should return data provided by:
GET Api/Businesses/X7N/Employees
This makes the response huge and increases the time required to generate it.
To keep a REST API clean, each resource should be bound to only one URI, following the patterns below:
GET /resources
GET /resources/{id}
POST /resources
PUT /resources/{id}
If you need to make links between resources, use HATEOAS.
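A minimal sketch of the filtered root collection, assuming an in-memory EMPLOYEES list and a hypothetical handler get_employees; the query parameter name businesses follows the URL shown above:

```python
from urllib.parse import parse_qs, urlsplit

# Stand-in data; a real service would query its data store instead.
EMPLOYEES = [
    {"id": "WWW", "business": "X7N"},
    {"id": "ZZZ", "business": "X7N"},
    {"id": "YYY", "business": "K9P"},
]


def get_employees(url):
    """Handle GET Api/Employees, optionally filtered by ?businesses=<id>."""
    query = parse_qs(urlsplit(url).query)
    wanted = query.get("businesses")
    if wanted is None:
        return list(EMPLOYEES)
    return [e for e in EMPLOYEES if e["business"] in wanted]


at_x7n = get_employees("Api/Employees?businesses=X7N")
```

Repeating the parameter (?businesses=X7N&businesses=K9P) selects employees from several businesses with the same handler.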

Go with example 1. I wouldn't worry about unnecessary information from the point of view of the server. A URL should clearly identify a resource in a unique fashion from the point of view of the client. If the client would not know what /Employee/12 means without first knowing that it is actually /Businesses/X7N/Employees/12 then the first URL seems redundant.
The client should be dealing with URLs rather than the individual parameters that make up the URLs, so there is nothing wrong with long URLs. To the client they are just strings. The server should be telling the client the URL to do what it needs to do, not the individual parameters that then require the client to construct the URL.

What is a good strategy for adding additional information in a GET query over REST?

Given that we provide a restful api that serves book entities listening at
/books
And a client can get a book at the usual
GET /books/{id}
Suppose that we want to begin offering discounts on books to only our most vigilant buyers. These buyers would be given a discount code, and that code will reduce the price of the book.
Thus, a generic response may be
GET /books/4
{"id":4, "price":"24.95"}
Where a response to a query with a discount code may be
GET /books/4
{"id":4, "price":"24.95", "yourPrice":"19.95"}
The back-end processing we can get figured out, but what is the best practice for a client submitting a discount code over a restful api?
Certain books will be eligible for discounts while others will not. Discounts will not be broad (20% off everything), but instead will map to a specific price for that particular code (or client/code combo).
We've considered:
kludging the url
GET /codes/{someCode}/books/{id}
Adding the code in a header value
Using a query string
GET /books?code=myCode
anything else?
EDIT: Our goal is not to implement single-use codes. Instead, these discount codes could be used some fixed number of times for some fixed set of books.

I like using query variables. I just looked at the RESTful Web Services book, my main reference in this area, and they say:
Use query variables only to suggest arguments being plugged into an algorithm... If two URIs differ only in their query variables, it implies they're the different sets of inputs into the same underlying algorithm.
It seems to me your discount codes are inputs to a discounting algorithm.
Charles

If you're going to be submitting anything that's not idempotent, I would suggest using POST instead of GET. You wouldn't want a client to be able to use their code more than once.

Anything you add to the URL or to header values is open to interception, which could allow other users to 'fake' their discount ID. One approach would be to introduce a new POST call that allows the ID to be encrypted with simple HTTPS. The POSTed data could be as simple as the discountID or customerID.
Added - Sorry Michael, you already said that :)

You can register the code in a table, so that when the user retrieves a book the server automatically returns it with the proper discount. For example:
The user can add some code
POST /register/{code}
This will add an entry to a table {user} - {code} so when the user retrieves by
GET /books/{id}
the server will use that entry to apply the discount. I'm guessing that you already have some relation between {code} and {book}, so I won't get into that.
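A rough sketch of that flow with in-memory dictionaries standing in for the tables. The names register_code, get_book, USER_CODES and CODE_PRICES, and the code value "VIGILANT", are made up for illustration:

```python
BOOKS = {4: {"id": 4, "price": "24.95"}}
# (code, book id) -> discounted price; this relation is assumed to exist already.
CODE_PRICES = {("VIGILANT", 4): "19.95"}
USER_CODES = {}  # user -> set of registered codes


def register_code(user, code):
    """POST /register/{code}: remember the code for this user."""
    USER_CODES.setdefault(user, set()).add(code)


def get_book(user, book_id):
    """GET /books/{id}: add yourPrice when a registered code applies."""
    book = dict(BOOKS[book_id])  # copy, so the stored book is not modified
    for code in USER_CODES.get(user, ()):
        discounted = CODE_PRICES.get((code, book_id))
        if discounted is not None:
            book["yourPrice"] = discounted
    return book
```

The book URL stays the same for everyone; only the representation differs per authenticated user, which has consequences for shared caches.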

Appending to a resource's attribute RESTfully

This is a follow up to Updating a value RESTfully with Post
How do I simply append to a resource's attribute using REST? Imagine I have customer.balance and balance is an int. Let's say I just want to tell the server to add 5 to whatever the current balance is. Can I do this RESTfully? If so, how?
Keep in mind that the client doesn't know the customer's existing balance, so it can't just
get customer
customer.balance += 5
post customer
(there would also be concurrency issues with the above.)

Simple, slightly ugly:
This is a simpler variation of my answer to your other question.
I think you're still within the constraints of REST if you do the following. However, I'm curious about what others think about this situation as well, so I hope to hear from others.
Your URI will be:
/customer/21/credits
You POST a credit resource (maybe <credit>5</credit>) to the URI; the server can then take the customer's balance and += it with the provided value. Additionally, you can support negative credits (e.g. <credit>-10</credit>).
Note that /customer/21/credits doesn't have to support all methods. Supporting POST only is perfectly acceptable.
However, this gets a little weird if credits aren't a true resource within your system. The HTTP spec says:
If a resource has been created on the origin server, the response SHOULD be 201 (Created) and contain an entity which describes the status of the request and refers to the new resource, and a Location header.
Technically you're not creating a resource here, you're appending to the customer's balance (which is really an aggregate of all previous credits in the system). Since you're not keeping the credit around (presumably), you wouldn't really be able to return a reference to the newly "created" credit resource. You could probably return the customer's balance, or the <customer> itself, but that's a bit unintuitive to clients. This is why I think treating each credit as a new resource in the system is easier to work with (see below).
My preferred solution:
This is adapted from my answer in your other question. Here I'll try to approach it from the perspective of what the client/server are doing:
Client:
Builds a new credit resource:
<credit>
<amount>5</amount>
</credit>
POSTs resource to /customer/21/credits
POSTing here means, "append this new <credit> I'm providing to the list of <credit>s for this customer."
Server:
Receives POST to /customer/21/credits
Takes the amount from the request and +=s it to the customer's balance
Saves the credit and its information for later retrieval
Sends response to client:
<credit href="/customer/21/credits/credit-id-4321234">
<amount>5</amount>
<date>2009-10-16 12:00:23</date>
<ending-balance>45.03</ending-balance>
</credit>
This gives you the following advantages:
Credits can be accessed at a later date by id (with GET /customer/21/credits/[id])
You have a complete audit trail of credit history
Clients can, if you support it, update or remove credits by id (with PUT or DELETE)
Clients can retrieve an ordered list of credits, if you support it; e.g. GET /customer/21/credits might return:
<credits href="/customer/21/credits">
<credit href="/customer/21/credits/credit-id-7382134">
<amount>13</amount>
...
</credit>
<credit href="/customer/21/credits/credit-id-134u482">
...
</credit>
...
</credits>
Makes sense, since the customer's balance is really the end result of all credits applied to that customer.
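The server side of this can be sketched as follows, assuming an in-memory store; the class CreditLedger, its method names, and the sequential credit ids are hypothetical:

```python
from decimal import Decimal
from itertools import count


class CreditLedger:
    """In-memory stand-in for the credits collection of one customer."""

    def __init__(self, customer_id):
        self.customer_id = customer_id
        self.credits = {}
        self._ids = count(1)

    def post_credit(self, amount):
        """POST /customer/{id}/credits: store the credit, return (status, location, body)."""
        credit_id = next(self._ids)
        self.credits[credit_id] = Decimal(amount)
        location = f"/customer/{self.customer_id}/credits/{credit_id}"
        body = {"amount": str(amount), "ending-balance": str(self.balance())}
        return 201, location, body

    def balance(self):
        """The balance is the sum of every credit applied so far."""
        return sum(self.credits.values(), Decimal("0"))


ledger = CreditLedger(21)
ledger.post_credit("40.03")
status, location, body = ledger.post_credit("5")
```

Because every credit is stored, the response can carry a real Location header and the balance never has to be sent by the client.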

To think about this in a RESTful way, you would need to think about the action itself as a resource. For example, if this were banking and you wanted to update the balance on an account, you would define a deposit resource and then create one of those. The consequence of this would be to update the customer's balance.
This also helps deal with concurrency issues, because you would be submitting a +5 action rather than requiring prior knowledge of the customer's balance. You would also be able to recall that resource (say deposit/51 for the deposit with an ID of 51) and see other details about it (i.e. reason for deposit, date of deposit etc.).
EDIT: Realised that using an id of 5 for the deposit actually confuses the issue, so changed it to 51.

Well, there is an alternative to #Rob-Hruska's solution.
The fundamental idea is the same: think of each credit/debit operation as a standalone transaction. However, I once used a backend which supports storing schema-less data in JSON, so I ended up defining the API as a PUT with dynamic field names. Something like this:
PUT /customer/21
{"transaction_yyyymmddHHMMSS": 5}
I know this is NOT appropriate in the "credit/debit" context because an active account could have growing transaction records. But in my context I am using such tactics to store finite data (actually I was storing different batches of GPS way points during a driving trip).
Cons: This API style depends heavily on the backend's schema-less storage.
Pros: At least my approach is fully RESTful from the semantic point of view.
By contrast, #Rob-Hruska's "Simple, slightly ugly" solution 1 does not have a valid Location header to return in the "201 Created" response, which is not common RESTful behavior. (Or perhaps we can let #Rob-Hruska's solution 1 also return a dummy Location header, which points to a "410 Gone" or "404 Not Found" page. Is this more RESTful? Comments are welcome!)

Transient REST Representations

Let's say I have a RESTful, hypertext-driven service that models an ice cream store. To help manage my store better, I want to be able to display a daily report listing quantity and dollar value of each kind of ice cream sold.
It seems like this reporting capability could be exposed as a resource called DailyReport. A DailyReport can be generated quickly, and there doesn't seem to be any advantage to actually storing reports on the server. I only want a DailyReport for some days; on other days I don't care about getting one. Furthermore, storing DailyReports on the server would complicate client implementations, which would need to remember to delete reports they no longer need.
A DailyReport is transient; its representation can be retrieved only once. One way to implement this would be to offer a link "/daily-reports", a POST to which will return a response containing a DailyReport representation listing the information for that day's sales.
Edit: Let's also say that I really do want to do a POST request. A DailyReport has many different options for creating a view such as sorting ice cream types alphabetically, by dollar value - or including an hourly breakdown - or optionally including the temperature for that day - or filtering out certain ice cream types (as a list). Rather than using query parameters with a GET, I'd rather POST a DailyReport representation with the appropriate options (using a well-defined custom media type to document each option). The representation I get back would display my options along with the report itself.
Is this the correct way to think about the problem, or should some other approach be used instead? If correct, what special considerations might be important when implementing the DailyReport resource? (For example, it probably wouldn't be appropriate to set the Location header when returning after a POST request).

If you want to make daily reports for past days available, you could implement it as a GET to /daily_reports/2009/08/20. I agree with John Millikin that a POST is unnecessary here - there's no need for something like this to be a user-creatable resource.
The advantage of making the report for each day available as its own URI is cacheability.
EDIT: A good solution might be to merge the two answers, making daily_report/ a no-cache representation of the current day's data and daily_reports/yyyy/mm/dd a cacheable representation of a full day's data.
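A minimal sketch of that split, assuming a hypothetical helper report_cache_headers and an arbitrary one-day cache lifetime for finished days:

```python
from datetime import date


def report_cache_headers(report_day, today):
    """Pick Cache-Control for a daily report: only finished days are cacheable."""
    if report_day < today:
        # A past day's data no longer changes, so caches may keep it
        # (86400 seconds is an assumed lifetime, not a recommendation).
        return {"Cache-Control": "public, max-age=86400"}
    # The current day is still accumulating sales, so force revalidation.
    return {"Cache-Control": "no-cache"}


past = report_cache_headers(date(2009, 8, 20), today=date(2009, 8, 21))
current = report_cache_headers(date(2009, 8, 21), today=date(2009, 8, 21))
```

The same function can serve both URIs: the dated one passes the date from the path, the undated one passes today's date.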

There's no need to use a POST for this, since requesting the report doesn't change the state of the server. I would use a resource like this:
GET /daily-report/
200 OK
Cache-Control: no-cache
<daily-report for="2009-04-20" generated-at="2009-04-20T12:13:14Z">
<!-- contents of the report here -->
</daily-report>
Responding to your edit: if you are POSTing a description of the report to a URL, and retrieving a temporary data set as a result, that's not REST at all. It's RPC, in the same vein as SOAP. RPC is not an inherently bad thing, but please, please don't call it RESTful.

Sometimes it is desirable to keep a record of requests for reports; in those cases it is not unreasonable to POST to a collection resource. It is also useful for long-running reports where you want to handle the execution asynchronously. How long the server holds onto those report requests is up to you.
I would do something like
POST /DailyReportRequests
which would return a representation of the request, including options, and when the report is completed, a link to the report results.
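A rough sketch of such a request resource, using an in-memory dictionary. The function names, the "status" and "links" fields, and the report URL are assumptions made for illustration:

```python
from itertools import count

REQUESTS = {}  # request id -> stored report request
_ids = count(1)


def create_report_request(options):
    """POST /DailyReportRequests: record the request and return (status, representation)."""
    request_id = next(_ids)
    REQUESTS[request_id] = {
        "options": options,
        "status": "pending",
        "links": {"self": f"/DailyReportRequests/{request_id}"},
    }
    return 201, REQUESTS[request_id]


def complete_report_request(request_id, report_url):
    """Called by the report worker: mark the request done and link the results."""
    request = REQUESTS[request_id]
    request["status"] = "completed"
    request["links"]["report"] = report_url
    return request


status, pending = create_report_request({"sort": "alphabetical"})
finished = complete_report_request(1, "/DailyReports/2009-08-20")
```

The client polls the self link until the representation contains a report link, then follows it.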
Another alternative which is good when you have a set of pre-canned reports is to create a DailyReports resource that contains a list of preconfigured report links. The OpenSearchDescription spec allows you to do something similar to this using the Query tag.

I think Greg's approach is the correct one. To expound upon it, I don't think you should provide a /daily-report resource that changes daily, because running the report on Tuesday at 23:59 would yield different results than running it on Wednesday at 00:01. That A) can be confusing for clients expecting the resource to stay the same, and B) doesn't allow clients to retrieve a previous day's data after the day has passed. You should provide a unique resource identifier for each daily report that's available, so that clients can access the information they need at any time.
