Passing sizable data in an REST GET request - rest

A REST question. Let's say I have a database will a million items in it. I want to retrieve say 10,000 of them via an REST GET, passing in the GET request the ID's of the 10,000 items. Using URL request query parameters, it'll quickly exceed the maximum length of a URL. How does people solve this? Use a POST instead and pass it in the body? That seems hacky.

You should not address this form through the URL parameters, it has a limit: 2000 characters
Url limit
I guess what you are doing is something like this:
https://localhost/api/applicationcollections/(47495dde-67d2-4854-add0-7e25b2fe88e4,1b470799-cc8a-4284-9ca7-76dc34a5aebb)
If you are planning to get more than 10k records you can pass the information on the body of the request which doesn't have any limit. Technically speaking you should do it through a POST request, but that is not the intent with the semantic of the POST verb. Even for the GET you can include a body:HTTP GET with request body but it should not consider as part of the semantic.
Normally you don't filter 10k elements by id, instead, you get 10k elements on a request, passing a pagination parameter if you want through the URL, but that can kill your app, especially considering that the DTO has more than one field, like
ApplicationDto
field1
field2....
field15
Bellow, you have an example of how to pass pagination parameters and get the first 10k records
https://localhost:44390/api/applications?pageNumber=1&pageSize=10000
Also, the APIs should return an extra header, let's call it X-Pagination where you can get the information if you have more pages to paginate, including as well the total amount of elements.
As an extra effort to reduce the size of the request, you can shape the data and only get the fields you need.
ApplicationDto should bring only: field1, field3 see bellow:
https://localhost:44390/api/applications?fields=field1,field3
See how Twitter address this problem as well:
Twitter cursor response
Hope this helps

Related

Determine the proper REST API method

I have the following functionalities in my API:
Getting a user by their name
Getting a user by their ID
Getting a user, or if it doesn't exist create one
Getting multiple users by their ID
Currently I'm handling the two first functionalities with a GET request, and the third with a POST request. I could use a GET request for getting multiple users, but sending potentially hundreds of IDs through a query parameter seems like the wrong approach, as I would get a very long URL. I could also use a POST request to send the long list of IDs through its body, but I doubt a POST request is meant for this purpose.
What method would be appropriate to use for the last one?
I see 2 possible ways here:
Get all users and do your filtering post response.
Get a range of IDs, which resumes to only 2 parameters, the lower and the upper limits of the interval. (If this satisfy your needs)
Behaving the way you described and avoiding long URLs in the same time will not work together.

RESTful API, what if the query string isn't long enough?

We have a resource collection for products: /products.
We want to filter this collection to only return the members which have one of a list of specific class id's. For example:
GET /products?classes=100,101,102
This should return a collection of product members which have any of the classes listed.
The issue we have, is that we're working with thousands of products and classes, so the class list of id's could be thousands long - too long for a query string.
I'm keen to stick to RESTful principles whenever we can, so I like the fact that the resource /products?classes=100,101,102 when called with GET returns a filtered products collection.
Obviously, we could include the id's list in the body in JSON format, but that would mean that the call GET /products won't return a representation of the state of the resource (the resource being the URL), because the body is being used to provide filter options.
What's the best way to request a collection which is filtered, but the filter options are too long to use the query string..?
Interesting comment from #C. Smith who suggests making a POST call using a X-HTTP-Method-Override header set to GET and passing the id's in the body. This would work.
After thinking about it we're probably going to limit the number of class id's allowed in the query string, and suggest making multiple calls, breaking up the id's list into say, groups of 200. Submitting more than 200 would return an error.
GET /products?classes=1004,2342,8753... (limited to 200 id's)
GET /products?classes=2326,3343,6981... (limited to 200 id's)
Then the results can easily be stitched together after.
This method would let you use 5,000 id's for example, by doing 25 calls, which while not ideal, is ok for our use case.

Design a REST API in which a search request can take parameters for multiple Queries

I have to design a REST API in which a search request can take parameters for multiple Queries ( i.e. when the client make a call using this API, he should be able to send parameters to form multiple queries).
We have an existing API where we are using GET and it takes multiple parameters which together forms a single Query and then this API call returns the response for this query.
e.g. currently I can pass firstName, lastName, age etc in the request and then get back the person.
But now I have to enhance this service(or have a separate service) where I should be able to send parameters like firstName1, lastName1, age1 to search person1 ; firstName2, lastName2, age2 to search person2 and so on.
Should I use POST for the new API and then send list of parameters(params for query1, params for query2 and so on)?
Or is there a better approach.
We are using Spring Boot for REST implementation.
Its better to use POST because GET is good for 2,3 parameter but when you have a set of parameter or object then POST is Good.
The best thing to do here will be do POST and then return a JSON object with all the details of the Person in an array.
That way it will be faster and you would not have to deal with long urls for GET.
Also GET has limitations regarding the length of the request whereas there is no such limitation in case of POST.
It is really hard to give a right answer here. In general sending a GET request does have the advantage that you can leverage caching easily on a HTTP level, e.g. by using products like varnish, nginx, etc. But if you already can forsee that your URL including all params you'll have to send a POST request to make it work in all Browsers.
RESTfull architecture should respect the principle of addressability.
Since multiple users can be accessed through a unique request, then ideally this group of user should get an address, which would identify it as a resource.
However I understand that in the real world, URIs have a limited length (maximum length of HTTP GET request?). A POST request would indeed work well, but we lose the benefit of addressability.
Another way would be to expose a new resource : group,.
Lets suppose that your current model is something like this :
.../users/{id}
.../users/search?{arg1}={val1};{arg2}={val2}
You could eventually do something like :
.../users/groups/
.../users/groups/{id}
.../users/search?group={id}
(explanation below)
then you could split your research in two :
first a POST on .../users/groups/ with, as proposed by other response, a JSON description of the search parameters. This request could scan the .../users/groups/ directory, and if this set of parameters exists, return the corresponding address .../users/groups/{id}. (for performance issues you could for instance define {id} with a first part which would give the number of users requested).
Then you could make a request for this group with a GET with something like this : .../users/search?group={id}.
This approach would be a bit more complex to implement, but is more consistent with the resource oriented paradigm.

Rest POST VS GET if payload is huge

I understand the definition of GET and POST as below.
GET: List the members of the collection, complete with their member URIs for further navigation. For example, list all the cars for sale.
POST: Create a new entry in the collection where the ID is assigned automatically by the collection. The ID created is usually included as part of the data returned by this operation.
MY API searches for some detail in server with huge request payload with JSON Message in that case Which Verb should i use ?
Also can anyone please let me know the length of the characters that can be passed in query string.
The main difference between a GET and POST request is that in the former, the entire request is encoded as part of the URL itself, whereas in the latter, parameters are sent after the header. In addition, in GET request, different browsers will impose different limits on how big the URL can be. Most modern browsers will allow at least 200KB, however Internet Explorer seems to limit the URL size to 2KB.
That being said, if you have any suspicion that you will be passing in a large number of parameters which could exceed the limit imposed on GET requests by the receiving web server, you should switch to POST instead.
Here is a site which surveyed the GET behavior of most modern browsers, and it is worth a read.
Late to the party but for anyone searching for a solution, this might help.
I just came up with 2 different strategies to solve this problem. I'll create proof of concept API and test which one suites me better. Here are the solution I'm currently thinking:
1. X-HTTP-Method-Override:
Basically we would tunnel a GET request using POST/PUT method, with added X-HTTP-Method-Override request header, so that server routes the request to GET call. Simple to implement and does work in one trip.
2. Divide and Rule:
Divide requests into two separate requests. Send a POST/PUT request with all payload, to which server will create necessary response and store it in cache/db along with a key/id to access the data. Then server will respond with either "Location" header or the Key/id through which the stored response can be accessed.
Now send GET request with the key/location given by server on previous POST request. A bit complicated to implement and needs two requests, also requires a separate strategy to clean the cached responses.
If this is going to be a typical situation for your API then a RESTful approach could be to POST query data to a buffer endpoint which returns a URI from which you can GET your results.
Who knows maybe a cache of these will mitigate the need to send "huge" blobs of data about.
Well You Can Use Both To get Results From Server By Passing Some Data To server
In Case Of One Or Two Parameters like Id
Here Only One Parameter Is Used .But 3 to 4 params can Be used This Is How I Used In angularjs
Prefer : Get
Example : $http.get('/getEmployeeDataById?id=22');
In Case It Is Big Json Object
Prefer : Post
Example : var dataObj =
{
name : $scope.name,
age : $scope.age,
headoffice : $scope.headoffice
};
var res = $http.post('/getEmployeesList', dataObj);
And For Size Of Characters That Can Be Passed In Query String Here Is Already Answered
If you're getting data from the server, use GET. If you want to post something, use POST. Payload size is irrelevent. If you want to work with smaller payloads, you could implement pagination.

Handling long queries without violating REST

We have a REST api, and we've done a pretty good job at sticking to the spirit of REST. However, we have an important consumer, and they're requesting a way to reconcile their datastore. The flow works like this:
Consumer makes a GET call to retrieve all inventory objects created within a date range. Lets say this returns 1 million inventory VINs.
Consumer compares the payload with their own datastore, see's that they're missing 5,000 inventory objects
Consumer would like to make a request with the 5,000 VIN id's, and return those 5,000 objects.
The problem is that the long query string (JSON array of vins) bumps into the query string length limits imposed by our server. Possbile ideas - make 5k separate calls (seems horrible), increase querystring length limit on server (would like not to do this), use POST instead (not RESTful?).
So, I'm wondering what Roy Fielding would do...
What about a POST submitting the JSON file with the id's list to a new resource, e.g. called /inventory/difference?
If the computation goes any long, you can answer with 202 Accepted and the id of the resource being generated, then point back to it at /inventory/difference/:id.
Somewhat similar to what moonwave99 suggested, but instead you create a resource called a "set".
You POST to /set a list of identifiers that you wish to be in the set. The result of the POST is a redirect URL to the resource that names the specific set.
So:
POST /set
Result:
301 Moved Permanently
Location: /set/123
Then:
GET /set/123
Returns the list of items in the set.
Sets are orthogonal to the use case of "fetching differences", they're simply a compilation of items.
If the creation of a set takes a long time, and you consider the set itself to be a snapshot of the data, when the user tries to do the GET /set/123 can simply reply with a 202 Accepted until the actual dataset has been completed.
You can then use:
GET /set/123/identifiers
To get a collection of the actual identifiers in the set, for example, if you like.
You can do something like
POST /setfromquery
and send a list of criteria (name like "John*", city = "Los Angeles", etc.). This doesn't really need its own specific resource, just define your query "language" to include both simple lists of IDs as well as perhaps other filter criteria.
Set operations (unions, differences, etc.). Lots of powerful things can be done with a set resource.
Finally, of course, there's the ever popular:
DELETE /set/123
I don't think anyone would fault you in working around GET not accepting a request body by using POST for a request that needs a request body. You are just being pragmatic.
I agree, making 5000 individual requests or upping the query string limit are ugly. POST is the way forward.
Using a post without creating a resource just seemed too dirty for me. In the end, we made it so that there was a limit of 100 ids requested in a "chunk". In practice, these requests will rarely be > 100, so hacking REST principles to accomodate an edge case seemed like a bad idea. I made sure the limitation was clearly defined in our API docs, done and done...