REST collections: pagination enabled by default? - rest

I am trying to figure out the common way to respond to a GET request that retrieves multiple resources. For example, if a user calls /books or /books?name=*foo*, what should a good REST API return?
[A] A list of all resources in the collection. Only if the user specifies a range (using start and limit, or in some other way) is a single page of results returned.
[B] Always return the first page of resources, even when nothing is specified. The user may then continue paginating using the parameters (or any other way).
[C] A document indicating that paging is involved, with the total number of resources in the collection but without any returned resources, and with an appropriate status code set (like 300, if I remember correctly at this moment). This response tells the user that they can start fetching the data using pagination parameters.
I like approach C, but could not find any APIs that do this.

This depends on whether the pagination parameters are mandatory or not. For most APIs they are mandatory, simply because /books could return millions of entries.
How about [D]: a redirect.
If the client accesses /books or /books?name=foobar, redirect it to /books?page=1&size=15 or /books?name=foobar&page=1&size=15 and return results according to those default parameters.
You could also include pagination links in your response (as per HATEOAS), each with a rel attribute that specifies whether the link is for the next, previous, first or last page, so the client can navigate back and forth between the result pages.
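As a rough sketch of the default parameters and rel-style links described above (not taken from any particular framework), the paging logic could look like this in Python. The function name paginate, the response keys total, items and links, and the clamping of out-of-range pages are all assumptions made for illustration; the defaults page=1 and size=15 mirror the redirect target in the answer.

```python
from urllib.parse import urlencode

DEFAULT_PAGE = 1   # mirrors page=1 in the redirect example
DEFAULT_SIZE = 15  # mirrors size=15 in the redirect example


def paginate(items, base_path, page=DEFAULT_PAGE, size=DEFAULT_SIZE, **filters):
    """Return one page of items plus first/prev/next/last links (hypothetical shape)."""
    total = len(items)
    last_page = max(1, -(-total // size))  # ceiling division, at least one page
    page = min(max(page, 1), last_page)    # assumption: clamp out-of-range pages

    def link(p):
        # Keep any filter parameters (e.g. name=foobar) in every link.
        return f"{base_path}?{urlencode({**filters, 'page': p, 'size': size})}"

    links = {"first": link(1), "last": link(last_page)}
    if page > 1:
        links["prev"] = link(page - 1)
    if page < last_page:
        links["next"] = link(page + 1)

    start = (page - 1) * size
    return {"total": total, "items": items[start:start + size], "links": links}
```

Calling paginate(books, "/books") with no paging arguments returns the first 15 entries together with a next link, which is the behaviour a client would see after being redirected to the default parameters.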

Related

Retrieve all of a user's playlist from SoundCloud limited to 50?

I'm trying to retrieve all the playlists from my account via
http://api.soundcloud.com/users/145295911/playlists?client_id=xxxxxx, as the API reference shows.
However, I can only retrieve the 50 most recent playlists instead of all of my playlists. I've been searching for a solution, but it seems like no one has had this issue before. Is there a way to get all of them?
Check out the section of their API documentation on pagination.
Most results from our API are returned as a collection. The number of items in the collection returned is limited to 50 by default with a maximum value of 200. Most endpoints support a linked_partitioning parameter that will allow you to page through collections. When this parameter is passed, the response will contain a next_href property if there are additional results. To fetch the next page of results, simply follow that URI. If the response does not contain a next_href property, you have reached the end of the results.
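A minimal sketch of following next_href as the quoted documentation describes, in Python. It assumes the first URL already carries the linked_partitioning parameter and that each page is a JSON object holding its items under a "collection" key; that key name and the get_json helper are assumptions here, so check them against the actual responses.

```python
def fetch_all(first_url, get_json):
    """Collect every item by following next_href until it is absent.

    get_json is a caller-supplied (hypothetical) function that takes a URL
    and returns the decoded JSON response as a dict.
    """
    items = []
    url = first_url
    while url:
        page = get_json(url)
        items.extend(page.get("collection", []))  # "collection" key is an assumption
        url = page.get("next_href")               # missing property means last page
    return items
```

Passing the HTTP call in as get_json keeps the loop independent of whichever HTTP client you use.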

HTTP request to search for multiple ObjectIds in a Mongo-based API?

I'm looking to add search functionality to an API, on a resource called Organizations. Organizations can have different Location and Audience ids tagged onto them (which I would like to use in searching). Since these ids are MongoDB ObjectIds, they are quite long and I'm worried about reaching the max query string limit of the browser with a GET request. For example:
GET http://my-site.com/api/organizations?locations=5afa54e5516c5b57c0d43227,5afa54e5516c5b57c0d43226,5afa54e5516c5b57c0d43225,5afa54e5516c5b57c0d43224&audiences=5afa54e5516c5b57c0d43223,5afa54e5516c5b57c0d43222
That is probably about an average search; however, I don't want it to break if users select many Locations or Audiences.
Any advice on how I could handle this situation?
I've run into your situation before. You can change your method to POST.
For an input of locations and audiences, your resource is not already sitting there. You have to compute it.
By the definition of POST:
Perform resource-specific processing on the request payload.
Providing a block of data, such as the fields entered into an HTML form, to a data-handling process;
You have to compute and create a new resource for the response, so using POST here is REST-compliant.

Should I allow user-provided values to be passed through a query string?

I'm adding a search endpoint to a RESTful API. After reading this SO answer, I'd like the endpoint to be designed like:
GET /users?firstName=Otis&hobby=golf,rugby,hunting
That seems like a good idea so far. But the values that I'll be using to perform the search will be provided by the user via a standard HTML input field. I'll guard against malicious injections on the server-side, so that's not my concern. I'm more concerned about the user providing a value that causes the URL to exceed the max URL length of ~2000 characters.
I can do some max-length validation and add some user prompts, etc, but I'm wondering if there's a more standard way to handle this case.
I thought about providing the values in the request body using POST /users, but that endpoint is reserved for new user creation, so that's out.
Any thoughts? Thanks.
I see these possible solutions:
1. Not actually a solution: go with the query parameters and accept the length constraints.
2. Go with POST, but not designed the way you describe. As you point out, if you POST a user to .../users you create a new user entity, which is not what you want to do. You want to submit a search ticket to the server, which returns a list of results matching your criteria. I would design it like this:
POST .../search/users passing in the body a representation of your search item
3. Distribute the query between server side and client side. Say you have complex criteria to match. Set up a taxonomy of them so that the strictest ones are handled server side. The server is then able to return a manageable list of items that you can subsequently filter on the client side. With this approach you save space in the query string by sending the server only a subset of the criteria you want your search to meet.
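To make the POST variant concrete, here is a small Python sketch that builds (but does not send) such a request with the standard library. The base URL, the JSON body format and the field names firstName and hobby are assumptions carried over from the question's example, not a prescribed format.

```python
import json
from urllib.request import Request


def build_search_request(base_url, criteria):
    """Build a POST request carrying the search criteria in the body."""
    body = json.dumps(criteria).encode("utf-8")
    return Request(
        url=f"{base_url}/search/users",  # search endpoint suggested in the answer
        data=body,
        headers={"Content-Type": "application/json"},  # assumed body format
        method="POST",
    )


# Hypothetical usage; the host name is made up for illustration.
request = build_search_request(
    "https://api.example.com",
    {"firstName": "Otis", "hobby": ["golf", "rugby", "hunting"]},
)
```

Because the criteria travel in the body, the list of hobbies can grow without running into any URL length limit.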

What should be the response of GET for multiple requested resources with some invalid ids?

What should be the response to a request to
http://localhost:8080/users/1,2,3 when the system doesn't have a user with id 3?
When all users are present I return a 200 response code with all user objects in the response body. When the user requests a single missing user I return a 404 with an error message in the body.
However, what should be the body and status code for a mix between valid and missing ids?
I assume that you want to follow REST API principles. To keep the API design clear, you should use the query string for filtering instead:
http://localhost:8080/users?id=1,2,3
Then you won't have such dilemmas: you can return only the users whose id is contained in the provided list, with a 200 status code (even if the list is empty). In general, this endpoint
http://localhost:8080/users/{id}
should be reserved for requesting a single resource (user) by its primary key.
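A sketch of that filtering behaviour in Python, assuming an in-memory USERS table with made-up records and well-formed integer ids in the id parameter:

```python
# Hypothetical data store; user 3 does not exist.
USERS = {1: {"id": 1, "name": "Ann"}, 2: {"id": 2, "name": "Bob"}}


def get_users(id_param):
    """Handle GET /users?id=1,2,3 and return (status, body)."""
    wanted = [int(part) for part in id_param.split(",") if part]
    found = [USERS[i] for i in wanted if i in USERS]
    return 200, found  # unknown ids are simply absent; the list may be empty
```

Here get_users("1,2,3") yields a 200 with users 1 and 2, and get_users("3") yields a 200 with an empty list.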
What you are requesting there is a collection. The request essentially reads: "give me all users whose ID is in {1, 2, 3}." A subset of those users (let's say there is no user yet with the ID 3) would still be a successful operation, which is asking for a 200 (OK).
If you are overly concerned by this, there's still the possibility to redirect the client via 303 (See Other) to a resource representation without the offending elements.
If all of the IDs are invalid, things get a bit tricky. One may be tempted to simply return a 404 (Not Found), but strictly speaking that would not be correct:
The 404 status code indicates that the origin server did not find a current representation for the target resource
Indeed, there is one: The empty set. From a programmatic standpoint, it may indeed be easier to just return that instead of throwing an error. This relies on clients being able to process empty sets/documents.
The RFC grants you the freedom to go either way:
[…] the origin server did not find a current representation for the target resource or is not willing to disclose that one exists.
So if you wish to hide the existence of an empty set, it's okay. It bears mentioning that a set containing nothing is not nothing itself ;)
I would recommend not offering the method in the first place, but rather forcing the user of your API to make three separate requests and returning unambiguous responses (two 200s for users 1 and 2 and a 404 for user 3). Additionally, the API could offer a GET method that responds with all available user ids or similar (depending on your system).
Alternatively, if that's not possible, I guess you have two options:
Return 404 as soon as one user is not found, which is technically more accurate in my opinion. After all, the request was for 1, 2 AND 3, and 3 was not found.
Return 200 with users 1 and 2 and null for the missing user, which is probably the most useful for your scenario.

REST interface for finding average

Suppose I want to create a REST interface to find the average of a list of numbers. Assume that the numbers are submitted one at a time. How would you do this?
POST a number to https://example.com/api/average
If this is the first number a hash will be returned
POST a number to https://example.com/api/average/hash
....
GET https://example.com/api/average/hash to find the average
DELETE https://example.com/api/average/hash since we don't need it any more
Is this the right way to do it? Any suggestions?
It makes more sense to think of the list of numbers as the resource. Suppose each list's resource URL is /list/{id} where {id} is a placeholder for the list's ID. Then:
POST /list creates a new list, the server generates a list ID (or 'hash') and specifies the /list/{id} URL in the response's Location header.
POST /list/{id} adds a number to the list
GET /list/{id}/average returns the average
DELETE /list/{id} deletes the list.
An alternative to GET /list/{id}/average would be for GET /list/{id} to return the list as structured data, e.g. XML, that includes the average as a generated property.
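The four operations map onto a small in-memory store, sketched here in Python. The class name ListStore, the use of a random UUID as the list ID, and returning None for an empty list are assumptions for illustration; a real service would sit behind an HTTP framework.

```python
import uuid


class ListStore:
    """In-memory stand-in for the /list resources (illustrative only)."""

    def __init__(self):
        self._lists = {}

    def create(self):
        """POST /list: create a list and return its URL for the Location header."""
        list_id = uuid.uuid4().hex
        self._lists[list_id] = []
        return f"/list/{list_id}"

    def add(self, list_id, number):
        """POST /list/{id}: add a number to the list."""
        self._lists[list_id].append(number)

    def average(self, list_id):
        """GET /list/{id}/average: return the average, or None if the list is empty."""
        numbers = self._lists[list_id]
        return sum(numbers) / len(numbers) if numbers else None

    def delete(self, list_id):
        """DELETE /list/{id}: remove the list."""
        del self._lists[list_id]
```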
What you are talking about is doing a stateless transformation of a request representation (list of numbers) into a response representation (single number).
Let's categorize your resource:
Stateless -- The request is stateless, but so is the resource. It should be able to take your request, process it, and return a response without maintaining any internal state. Further discussion below.
Unlikely to be cacheable -- I am making an assumption here that your lists of numbers are never/seldom identical.
Idempotent -- Requests have no side effects. This is because the resource is stateless.
Now let's examine the different HTTP methods:
GET - Gets the state of a resource. Since your resource has no state, it is not appropriate for your situation. (idempotent, cacheable)
DELETE - Removes a resource or clears its state. Also not appropriate for your situation. (idempotent, not cacheable)
PUT - Used to set the state of a resource (or create it if it does not exist). (idempotent, not cacheable)
POST - Used to process requests which may or may not modify the state of a resource. May create other resources. (no guarantee of idempotence -- depends on whether the resource is stateful or stateless, not cacheable)
As you see in the other answers, POST is most popularly used as a synonym for 'create'. While this is ok, POST is not limited to just 'create' in REST. Mark Baker does a good job of explaining this here: http://www.markbaker.ca/2001/09/draft-baker-http-resource-state-model-01.txt (Section 3.1.4).
While POST does not have a perfect semantic mapping to your problem, it is the best of all the HTTP methods for what you are trying to do. It also leads to a simple, stateless, and scalable solution, which is the point of REST.
In summary, the answer to your question is:
Method: POST
Request: A representation of a list of numbers
Response: A representation of a single number (average of the list)
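A minimal sketch of such a stateless handler in Python, assuming a JSON request body of the form {"numbers": [...]} and a JSON response of the form {"average": ...}. Both shapes, and the 400 for an empty list, are assumptions rather than anything the answer prescribes.

```python
import json


def handle_average(request_body):
    """Handle the POST: list of numbers in, single number out, no stored state."""
    numbers = json.loads(request_body)["numbers"]
    if not numbers:
        # Assumption: an empty list is treated as a client error.
        return 400, json.dumps({"error": "numbers must not be empty"})
    return 200, json.dumps({"average": sum(numbers) / len(numbers)})
```

Nothing is kept between calls, so any server instance can answer any request.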
While this may look like a SOAP-style web service invocation, it is not. Don't let your visceral reaction to SOAP cloud your use of the POST method and place unnecessary constraints on it.
KISS (Keep it simple, stupid).
You cannot just return a hash or an ID; you have to return URIs, or a URI template plus the field values. The only URI that can be part of your API is the entry point, otherwise your API is not REST.
To follow the REST philosophy as closely as possible, I would do the following.
Do a PUT to create a new structure. This generates a hash that is not based on the number passed, just a "random" hash. Each subsequent POST then includes that id-hash and returns a result hash for the numbers sent so far. When a GET is made on that result hash, the result can be cached.
1. PUT /api/average/{number} //return id-hash
2. POST /api/average/{id-hash}/{number} // return average-hash
3. GET /api/average/{average-hash}
4. DELETE /api/average/{id-hash}
Then you can cache the GET of the average hash, even when the same result is reached in a different way or when different servers arrive at that same average.