REST interface for finding average - rest

Suppose I want to create a REST interface to find the average of a list of numbers. Assume that the numbers are submitted one at a time. How would you do this?
POST a number to https://example.com/api/average
If this is the first number a hash will be returned
POST a number to https://example.com/api/average/hash
....
GET https://example.com/api/average/hash to find the average
DELETE https://example.com/api/average/hash since we don't need it any more
Is this the right way to do it? Any suggestions?

It makes more sense to think of the list of numbers as the resource. Suppose each list's resource URL is /list/{id} where {id} is a placeholder for the list's ID. Then:
POST /list creates a new list, the server generates a list ID (or 'hash') and specifies the /list/{id} URL in the response's Location header.
POST /list/{id} adds a number to the list
GET /list/{id}/average returns the average
DELETE /list/{id} deletes the list.
An alternative to GET /list/{id}/average would be for GET /list/{id} to return the list as structured data, e.g. XML, that includes the average as a generated property.

What you are talking about is doing a stateless transformation of a request representation (list of numbers) into a response representation (single number).
Lets categorize your resource:
Stateless -- The request is stateless, but so is the resource. It should be able to take your request, process it, and return a response without maintaining any internal state. Further discussion below.
Unlikely to be cacheable -- I am making an assumption here that your lists of numbers are never/seldom identical.
Idempotent -- Requests have no side effects. This is because the resource is stateless.
Now lets examine the different HTTP methods:
GET - Gets the state of a resource. Since your resource has no state, it is not appropriate for your situation. (idempotent, cacheable)
DELETE - Removes a resource or clears its state. Also not appropriate for your situation. (not idempotent, not cacheable)
PUT - Used to set the state of a resource (or create it if it does not exist). (idempotent, not cacheable)
POST - Used to process requests which may or may not modify the state of a resource. May create other resources. (no guarantee of idempotence -- depends on whether the resource is stateful or stateless, not cacheable)
As you see in the other answers, POST is most popularly used as a synonym for 'create'. While this is ok, POST is not limited to just 'create' in REST. Mark Baker does a good job of explaining this here: http://www.markbaker.ca/2001/09/draft-baker-http-resource-state-model-01.txt (Section 3.1.4).
While POST does not have a perfect semantic mapping to your problem, it is the best of all the HTTP methods for what you are trying to do. It also leads to a simple, stateless, and scalable solution, which is the point of REST.
In summary, the answer to your question is:
Method: POST
Request: A representation of a list of numbers
Response: A representation of a single number (average of the list)
While this may look like a SOAP-style web service invocation, it is not. Don't let your visceral reaction to SOAP cloud your use of the POST method and place unnecessary constraints on it.
KISS (Keep it simple, stupid).

You cannot just return a hash or an ID, you have to return URIs or a URI template plus the field values. The only URI that can be part of your API is the entry point, otherwise your API is not REST.

To maximize REST philosophies I would do the following
Do a PUT this to indicate a new structure that would generate a hash, that is not based on the number passed. Just a "random" hash. Then each subsequent post would include the id-hash with returned result hash of the numbers sent. Then when a get is presented on that you can cache the results.
1. PUT /api/average/{number} //return id-hash
2. POST /api/average/{id-hash}/{number} // return average-hash
3. GET /api/average/{average-hash}
4. DELETE /api/average/{id-hash}
Then you can cache the get of the average hash, even when you may get to the result in a different way, or different servers get that same average.

Related

REST API - PUT or GET?

I am designing and building a REST API. I understand the basic concept underlying the different request types. In particular PUT requests are intended for updating data.
I have a number of cases where an API call will modify the database, changing the values of a data object's attributes. However, the new values are not sent by the client but rather are implicit in the specific endpoint invoked. There are arguments needed to select the object to be modified, but not to supply attribute values for that object.
Originally I set these up to be PUT requests. However, I am now wondering whether they should be GET requests instead, because the body does not in fact contain update data.
Which would be recommended?
Just because the body doesn't contain update data doesn't mean it is not an update. Look at it from user's or at least from your API user's point of view. Is it an update from their point of view or retrieval of an object where update is not important from their point of view. If it is an update from user's point of view use PUT.
Originally I set these up to be PUT requests. However, I am now wondering whether they should be GET requests instead, because the body does not in fact contain update data.
If the semantics of the request are a change to the representation(s) of a resource on the server, then GET is inappropriate.
If the payload/entity enclosed in the request is not a candidate representation of the target resource ("make your representation look like this one right here"), then PUT is inappropriate.
"Update yourself however you see fit, here is some information that will help" will normally use POST.
POST serves many useful purposes in HTTP, including the general purpose of "this action isn’t worth standardizing." -- Roy Fielding, 2009
POST is the general solution for requests that are intended to modify resource state; PUT (and PATCH) are specializations with narrower semantics (specifically, remote authoring).

Good REST API design for operations on resource sets

With REST it is pretty clear how to operate on resources, e.g.
PUT /users/{userId} - updates the user with userId
GET /users/{userId} - reads the user with userId
Similarly for resource sets
POST /users - creates a new user
GET /users/{userId}/books - reads list of books from a user
GET /users/{userId}/books?filter=x - reads list of books from a user with specific filter
What if I want to develop more elaborate operations on resource sets, e.g.
with the request body, add a list of books to the existing list and accepting duplicates (basically concatenating the list)
POST /users/{userId}/books
or PUT /users/{userId}/books
or PATCH?
or POST /users/{userId}/books/concatenate
with the request body, add a list of books to the existing list but no duplicates (basically merging the list)
POST /users/{userId}/books
or PUT /users/{userId}/books
or PATCH?
or POST /users/{userId}/books/merge
also for deleting parts of resource sets:
with the request body, delete a list of books from the existing list that have a certain property
POST /users/{userId}/books/delete?category=x
or DELETE /users/{userId}/books?category=x
or deleting all resources in a resource set:
POST /users/{userId}/books/delete_all
or DELETE /users/{userId}/books
Would be thankful for some hints or guidelines
"Resource sets", from the point of view of REST, are a fiction. There are only resources. As far as a general-purpose HTTP component is concerned, there is _no relation implied by the following URI:
/users
/users/{userId}
/users/{userId}/books
/users/{userId}/books?filter=x
/users/{userId}/books/concatenate
They are completely independent of one another; for instance, DELETE /users does not imply anything about the other resources.
We human beings tend to assign identifiers in patterns that make sense, but the machines don't care.
with the request body, add a list of books to the existing list and accepting duplicates (basically concatenating the list)
PUT and PATCH have remote authoring semantics; they act like you would expect if you were trying to edit a copy of a file on the server. You GET a copy of the current representation of the resource, make edits to your local copy, and then request that the server change its copy to match your copy. With PUT, you send a complete copy of your representation of the resource; with PATCH, you send a patch-document that describes the changes you made.
It's okay to use POST; HTML got along just fine using nothing but GET and POST, and the web took over the world.
You don't need a separate resource for POST; you can use one if you like, but it isn't necessary to do so.
with the request body, add a list of books to the existing list but no duplicates (basically merging the list)
Not really any different; what we agree upon in HTTP is the semantics of the request and response messages. What the server chooses to do is an implementation concern. See Fielding 2002.
So if I send to you a representation of a list with duplicate entries, and you strip out the duplicates, that's "fine"; you just need to exercise some care with your responses to ensure that you don't imply that you accepted the requested representation as is.
With PATCH, it's a bit fuzzy, in that the RFC describes all or nothing semantics, but based on the language used it is reasonable to infer that the implementation is restricted as well.
also for deleting parts of resource sets: with the request body, delete a list of books from the existing list that have a certain property
Give RFC 7231 a careful read: DELETE doesn't quite mean what your examples hint at. DELETE breaks the associate between a key (the target uri) and a value (the resource representations), but that doesn't necessarily mean "and also garbage collect the representation".
The same idea expressed another way -- suppose I GET /list-of-books from the server, and the returned representation is a list of three books. In the case where I want that resource to instead return a representation of an empty list, DELETE is the wrong tool. DELETE tells the server that I want future calls to GET /list-of-books to return 404 Not Found or possibly 410 Gone. If what I really want is a 200 OK with an empty list, then I need to PUT/PATCH/POST/etc. the resource.
deleting all resources in a resource set
Same problem as before.
With REST it is pretty clear how to operate on resources
This is the problem - it is NOT clear how to operate on resources. The web is cluttered with literature that makes a complete hash of it (we use REST to fetch documents that mangle the lessons of REST -- fabulous irony).
REST includes a uniform interface as a constraint. In HTTP, that interface is effectively a document store. PUT and PATCH just edit document contents - which is perfectly satisfactory if your domain is anemic or declarative. For anything else where we don't have standardized semantics, we use POST.
See Jim Webber, 2011: "You have to learn how to use HTTP to trigger business activity as a side effect of moving documents around the network."

Which HTTP Verb for Read endpoint with request body

We are exposing an endpoint that will return a large data set. There is a background process which runs once per hour and generates the data. The data will be different after each run.
The requester can ask for either the full set of data or a subset. The sub set is determined via a set of parameters but the parameters are too long to fit into a uri which has a max length of 2,083 characters. https://www.google.co.uk/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=uri%20max%20length
The parameters can easily be sent in the request body but which which is the correct HTTP verb to use?
GET would be ideal but use of a body 'has no semantic meaning to a GET request' HTTP GET with request body
PUT is not appropriate because there is no ID and no data is being updated or replaced.
POST is not appropriate because a new resource is not being replaced and more importantly the server is not generating and Id.
http://www.restapitutorial.com/lessons/httpmethods.html
GET (read) would seem to be the most appropriate but how can we include the complex set of parameters to determine the response?
Many thanks
John
POST is the correct method. POST should be used for any operation that's not standardized by HTTP, which is your case, since there's no standard for a GET operation with a body. The reference you linked is just directly mapping HTTP methods to CRUD, which is a REST anti-pattern.
You are right that GET with body is to be avoided. You can experiment with other safe methods that take a request body (such as REPORT or SEARCH), or you can indeed use POST. I see no reason why the latter is wrong; what you're citing is just an opinion, not the spec.
Assuming that the queries against that big dataset are not totally random, you should consider adding stored queries to your API. This way clients can add, remove, update queries (through request body) using POST DELETE PUT. Maybe you can call them "reports".
This way the GET requests need only a reference as query parameter to these queries/reports, you don't have to send all the details with every requests.
But only if not all the requests from clients are unique.

Is it valid to modify a REST API representation based on a If-Modified-Since header?

I want to implement a "get changed values" capability in my API. For example, say I have the following REST API call:
GET /ws/school/7/student
This gets all the students in school #7. Unfortunately, this may be a lot. So, I want to modify the API to return only the student records that have been modified since a certain time. (The use case is that a nightly process runs from another system to pull all the students from my system to theirs.)
I see http://blog.mugunthkumar.com/articles/restful-api-server-doing-it-the-right-way-part-2/ recommends using the if-modified-since header and returning a representation as follows:
Search all the students updated since the time requested in the if-modified-since header
If there are any, return those students with a 200 OK
If there are no students returned from that query, return a 304 Not Modified
I understand what he wants to do, but this seems the wrong way to go about it. The definition of the If-Modified-Since header (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24) says:
The If-Modified-Since request-header field is used with a method to make it conditional: if the requested variant has not been modified since the time specified in this field, an entity will not be returned from the server; instead, a 304 (not modified) response will be returned without any message-body.
This seems wrong to me. We would not be returning the representation or a 304 as indicated by the RFC, but some hybrid. It seems like client side code (or worse, a web cache between server and client) might misinterpret the meaning and replace the local cached value, when it should really just be updating it.
So, two questions:
Is this a correct use of the header?
If not (and I suspect not), what is the best practice? Query string parameter?
This is not the correct use of the header. The If-Modified-Since header is one which an HTTP client (browser or code) may optionally supply to the server when requesting a resource. If supplied the meaning is "I want resource X, but only if it's changed since time T." Its purpose is to allow client-side caching of resources.
The semantics of your proposed usage are "I want updates for collection X that happened since time T." It's a request for a subset of X. It does not seem like your motivation is to enable caching. Your client-side cached representation seemingly contains all of X, even though the typical request will only return you a small set of changes to X; that is, the response is not what you are directly caching, so the caching needs to happen in custom user logic client-side.
A query string parameter is a much more appropriate solution. Below {seq} would be something like a sequence number or timestamp.
GET /ws/schools/7/students/updates?since={seq}
Server-side I imagine you have a sequence of updates since the beginning of your system and a request of the above form would grab the first N updates that had a sequence value greater than {seq}. In this way, if a client ever got very far behind and needed to catch up, the results would be paged.

RESTful way to create multiple items in one request

I am working on a small client server program to collect orders. I want to do this in a "REST(ful) way".
What I want to do is:
Collect all orderlines (product and quantity) and send the complete order to the server
At the moment I see two options to do this:
Send each orderline to the server: POST qty and product_id
I actually don't want to do this because I want to limit the number of requests to the server so option 2:
Collect all the orderlines and send them to the server at once.
How should I implement option 2? a couple of ideas I have is:
Wrap all orderlines in a JSON object and send this to the server or use an array to post the orderlines.
Is it a good idea or good practice to implement option 2, and if so how should I do it.
What is good practice?
I believe that another correct way to approach this would be to create another resource that represents your collection of resources.
Example, imagine that we have an endpoint like /api/sheep/{id} and we can POST to /api/sheep to create a sheep resource.
Now, if we want to support bulk creation, we should consider a new flock resource at /api/flock (or /api/<your-resource>-collection if you lack a better meaningful name). Remember that resources don't need to map to your database or app models. This is a common misconception.
Resources are a higher level representation, unrelated with your data. Operating on a resource can have significant side effects, like firing an alert to a user, updating other related data, initiating a long lived process, etc. For example, we could map a file system or even the unix ps command as a REST API.
I think it is safe to assume that operating a resource may also mean to create several other entities as a side effect.
Although bulk operations (e.g. batch create) are essential in many systems, they are not formally addressed by the RESTful architecture style.
I found that POSTing a collection as you suggested basically works, but problems arise when you need to report failures in response to such a request. Such problems are worse when multiple failures occur for different causes or when the server doesn't support transactions.
My suggestion to you is that if there is no performance problem, for example when the service provider is on the LAN (not WAN) or the data is relatively small, it's worth it to send 100 POST requests to the server. Keep it simple, start with separate requests and if you have a performance problem try to optimize.
Facebook explains how to do this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Simple batched requests
The batch API takes in an array of logical HTTP requests represented
as JSON arrays - each request has a method (corresponding to HTTP
method GET/PUT/POST/DELETE etc.), a relative_url (the portion of the
URL after graph.facebook.com), optional headers array (corresponding
to HTTP headers) and an optional body (for POST and PUT requests). The
Batch API returns an array of logical HTTP responses represented as
JSON arrays - each response has a status code, an optional headers
array and an optional body (which is a JSON encoded string).
Your idea seems valid to me. The implementation is a matter of your preference. You can use JSON or just parameters for this ("order_lines[]" array) and do
POST /orders
Since you are going to create more resources at once in a single action (order and its lines) it's vital to validate each and every of them and save them only if all of them pass validation, ie. you should do it in a transaction.
I've actually been wrestling with this lately, and here's what I'm working towards.
If a POST that adds multiple resources succeeds, return a 200 OK (I was considering a 201, but the user ultimately doesn't land on a resource that was created) along with a page that displays all resources that were added, either in read-only or editable fashion. For instance, a user is able to select and POST multiple images to a gallery using a form comprising only a single file input. If the POST request succeeds in its entirety the user is presented with a set of forms for each image resource representation created that allows them to specify more details about each (name, description, etc).
In the event that one or more resources fails to be created, the POST handler aborts all processing and appends each individual error message to an array. Then, a 419 Conflict is returned and the user is routed to a 419 Conflict error page that presents the contents of the error array, as well as a way back to the form that was submitted.
I guess it's better to send separate requests within single connection. Of course, your web-server should support it
You won't want to send the HTTP headers for 100 orderlines. You neither want to generate any more requests than necessary.
Send the whole order in one JSON object to the server, to: server/order or server/order/new.
Return something that points to: server/order/order_id
Also consider using CREATE PUT instead of POST