Get request with very long query string. (CSV) - rest

I'm looking to implement an API call where you can specify any combination of up to ~6000 ids to get data from the server. The trouble is that a request will quite likely contain a large number of ids, say around 4000. The query string would therefore be very long, and possibly too long for the browser.
I wonder, what would be the best approach? I could use a POST but it doesn't really fit with REST - but then again I'm not too fussed about that. Is there a better way of doing this?

In this case, POST really is the solution. From a REST perspective and also from an optimization perspective, if you expect this call to be invoked multiple times with the same list of IDs, you may want to consider one POST call to create a server-side named/defined list and then for subsequent GET requests to reference the created list so that this data doesn't have to be repeated each and every time.
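A minimal sketch of the "POST once, then GET by name" idea, assuming an in-memory store and a name derived from the list's content so that the same ids always map to the same list. The class name `IdListStore`, the 16-character name, and the `/data?list=` URL are illustrative assumptions, not part of any real API.

```python
import hashlib
import json


class IdListStore:
    """In-memory stand-in for server-side storage of named ID lists."""

    def __init__(self):
        self._lists = {}

    def create(self, ids):
        """Handle the POST: store the list and return its name."""
        canonical = sorted(set(ids))
        # Derive the name from the content so identical lists share one name.
        digest = hashlib.sha256(json.dumps(canonical).encode("utf-8")).hexdigest()
        name = digest[:16]
        self._lists[name] = canonical
        return name

    def get(self, name):
        """Handle the GET: return the stored IDs, or None if the name is unknown."""
        return self._lists.get(name)


store = IdListStore()
name = store.create([42, 7, 19])
# Subsequent requests only need the short name (hypothetical URL shape).
print(f"GET /data?list={name}")
```

Because the name depends only on the ids, a client that posts the same list twice gets the same name back and the short GET URL stays stable.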

Related

Complex requests with REST API

I am wondering if it is possible to adhere to REST principles when creating what will essentially amount to a BI tool.
In my scenario I have high data volume with 100,000's of IDs (frankly more than this but for the sake of this example let's go with that.). These are presented in a traditional table that allows for necessary features when accessing large data sets such as pagination. The user also has the ability to filter by one, or many of these ID's to drill down the data set as they see fit.
It is theoretically possible that the user would want to filter on 100 of the IDs, thus rendering a GET URI incredibly long. As I understand it, that would more or less break the resource identification principle of a REST API, not to mention that it could bump into the character limit for a GET request in certain browsers, since the IDs may be quite long. Normally I would just use a POST, since I can put all of the applied filters in the body and generate a where clause on the server.
Since a POST in a REST API is supposed to "create a new entry in the collection", it would appear, at least to me, that generating a complex query for something like this means a REST API is not possible. Or perhaps it means that I am approaching the solution wrong (totally plausible).
It would seem that in my scenario using a GET simply isn't possible due to the potential length of the parameters. Thus I am forced to use a POST. Though using a POST as I am seems to violate REST style, which isn't the end of the world. I mostly just wanted to double check that I am not missing something and there is not a solution using a GET.
To follow the resources principle, create a search-like resource. POST your ids in a body to this endpoint; it will prepare a list of results for you and redirect you to searchresults/{id}.
See for example: https://gooroo.io/GoorooTHINK/Article/16583/HTTP-Patterns---Bouncer/25829#.W3aBsugzaUk

How to pass a large number of input parameters to RESTful service?

I have a RESTful service that returns detailed data about a machine by the supplied list of Ids. GET api/machine/
http://service.com/api/machine/1,2,3,4
Up till now this has been fine since I am getting a small number of machines at a time, but now I need to get all machines (more than 1000). This exceeds the 2000 character limit on URLs.
I have gotten both of the options below to work and I'm looking for some community feedback on which way to go.
Option 1: Split up my GET. Make multiple calls with a subset of the ids. Pros: I am doing a get so using the HTTP verb GET makes sense. Cons: If a person new to the service doesn't know about this limit, or doesn't use my client, it would cause problems.
Option 2: Add a PUT/POST method and include the full list of ids in the body. Pros: Makes 1 call to get all data. Cons: I am now doing a get from a PUT/POST.
Probably your best course of action would be something along the lines of option 2: create a JSON document on your side with an array of the ids you want to send in the body of the message. If there's a possibility of it still being far too large, you can split it into several messages: when you receive the response to one, you send the next item in the queue, and so on.
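For comparison, the splitting in option 1 could be sketched roughly as follows. This assumes the comma-separated URL shape from the question and the 2000 character limit mentioned there; the function name `chunked_urls` is made up for illustration.

```python
def chunked_urls(base_url, ids, max_url_length=2000):
    """Yield URLs of the form base_url + "1,2,3", each within max_url_length."""
    current = []
    current_len = len(base_url)
    for id_ in ids:
        piece = str(id_)
        # A comma is needed before every ID except the first in a chunk.
        extra = len(piece) + (1 if current else 0)
        if current and current_len + extra > max_url_length:
            # Adding this ID would overflow, so emit the chunk and start a new one.
            yield base_url + ",".join(current)
            current = []
            current_len = len(base_url)
            extra = len(piece)
        current.append(piece)
        current_len += extra
    if current:
        yield base_url + ",".join(current)


BASE = "http://service.com/api/machine/"
urls = list(chunked_urls(BASE, range(1, 1501)))
for url in urls:
    print(len(url))
```

A single ID longer than the limit would still be emitted on its own, so a real client would want to reject that case explicitly.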
Another option, used by the Facebook API among others, is to create a "/batch" POST method which can be used to make multiple requests in one go.
So instead of having http://service.com/api/machine/1,2,3,4,5,.... you'll have a batch of requests with /machine/1, /machine/2, /machine/3, etc.
The advantage is that you keep clean RESTful URLs (no more comma-separated values) and it scales very well since you can batch as many requests as you want.
The disadvantage is that it is slightly more complex to build.
See there for more information - https://developers.facebook.com/docs/graph-api/making-multiple-requests
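As a rough illustration, the body of such a batch request could be built like this. The `method` and `relative_url` field names are modeled on the Graph API's batch format, but treat them as assumptions here: your own "/batch" endpoint is free to define a different shape, and `build_batch` is a hypothetical helper.

```python
import json


def build_batch(ids, resource="machine"):
    """Return a JSON array containing one GET sub-request per ID."""
    return json.dumps(
        [{"method": "GET", "relative_url": f"/{resource}/{id_}"} for id_ in ids]
    )


# The resulting string would be sent as the body of POST /batch.
body = build_batch([1, 2, 3])
print(body)
```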

Using a POST method in WebApi rather than a GET method. Should I?

I have a REST API that provides access to a resource using GET.
Using this method I can get a specific instance or all instances.
In this case one instance isn't enough and all is too many.
What I've done is create a new Controller with a pattern like /api/filteredresource and made that a POST request with the body containing a representation of a filter to be used to limit the list of items returned.
I'm not looking for a "How do I..." answer, more a "Should I do it this way..." one.
What's the best practice here?
This StackOverflow article seems to suggest I shouldn't do it this way as the data cannot (or rather should not) be cached, but in this instance caching this filtered data doesn't make sense. I suppose I'm looking for a pragmatic answer rather than a technically correct one.
** EDIT **
The initial requirement was to just search for instances of the resource matching a particular status, but it seems that this was to be a 'first step'. They have a 'search key' that they want to use that contains all sorts of properties matching, in many cases, elements of the resource itself, and they want to be able to use this 'search key' (or a representation of it) as the filter.
** END EDIT **
It's fine to use POST for any operation that isn't standardized, but retrieval of any kind is standardized and should be done with GET. As you figured, this depends on how pragmatic you want to be, and how much you want to stick to the standards.
If your problem is that the query string isn't readable or easy to represent, and you really want to stick to REST principles, you should have a query or filter resource subordinated to the filteredresource you want to filter, then it's semantically correct to make a POST with a body of filter parameters. This POST should return a 303 with a Location URI for a GET to the filteredresource, with a querystring that will yield the result you expect. Since this is generated by the API, and doesn't have to be readable, how easy it is to build or read shouldn't be an issue at this point. That URI will be cacheable and you'll be doing your retrieval with GET.
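A minimal sketch of that POST-then-303 flow, assuming the filter arrives as a dictionary and that sorting the keys is enough to make equivalent filters produce the same URI. The function name `handle_filter_post` and the `/api/filteredresource` path are illustrative only.

```python
from urllib.parse import urlencode


def handle_filter_post(filter_params, target="/api/filteredresource"):
    """Turn a posted filter into a (status, headers) pair pointing at a GET URI."""
    # Sort keys so equivalent filters yield the same, cacheable URI.
    query = urlencode(sorted(filter_params.items()), doseq=True)
    return 303, {"Location": f"{target}?{query}"}


status, headers = handle_filter_post({"status": "active", "tag": ["a", "b"]})
print(status, headers["Location"])
```

The client then follows the Location header with an ordinary GET, which is the request that intermediaries can cache.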
A compromise between pragmatism and purism is having that POST simply returning the result.
If you want to be pragmatic, just POST to 'filteredresource' and don't worry about it.
Use query parameters to filter:
GET /rest/things/1
gets the thing with id=1.
GET /rest/things
gets all things.
GET /rest/things?color=yellow
gets only the yellow things.
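On the server side, the collection cases might be handled along these lines. This is only a sketch: `THINGS` is made-up sample data, `get_things` is a hypothetical handler, and the lookup by path (/rest/things/1) is left out.

```python
from urllib.parse import parse_qsl, urlsplit

# Made-up sample data standing in for a real datastore.
THINGS = [
    {"id": 1, "color": "yellow"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "yellow"},
]


def get_things(url):
    """Return the things whose fields match every query parameter in the URL."""
    criteria = parse_qsl(urlsplit(url).query)
    # With no query string, criteria is empty and every thing matches.
    return [t for t in THINGS if all(str(t.get(k)) == v for k, v in criteria)]


print(get_things("/rest/things?color=yellow"))
```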

Handling long queries without violating REST

We have a REST api, and we've done a pretty good job at sticking to the spirit of REST. However, we have an important consumer, and they're requesting a way to reconcile their datastore. The flow works like this:
Consumer makes a GET call to retrieve all inventory objects created within a date range. Let's say this returns 1 million inventory VINs.
Consumer compares the payload with their own datastore and sees that they're missing 5,000 inventory objects.
Consumer would like to make a request with the 5,000 VIN id's, and return those 5,000 objects.
The problem is that the long query string (JSON array of VINs) bumps into the query string length limits imposed by our server. Possible ideas: make 5k separate calls (seems horrible), increase the query string length limit on the server (would like not to do this), use POST instead (not RESTful?).
So, I'm wondering what Roy Fielding would do...
What about a POST submitting the JSON file with the id's list to a new resource, e.g. called /inventory/difference?
If the computation takes a long time, you can answer with 202 Accepted and the id of the resource being generated, then point back to it at /inventory/difference/:id.
Somewhat similar to what moonwave99 suggested, but instead you create a resource called a "set".
You POST to /set a list of identifiers that you wish to be in the set. The result of the POST is a redirect URL to the resource that names the specific set.
So:
POST /set
Result:
303 See Other
Location: /set/123
Then:
GET /set/123
Returns the list of items in the set.
Sets are orthogonal to the use case of "fetching differences", they're simply a compilation of items.
If the creation of a set takes a long time, and you consider the set itself to be a snapshot of the data, the server can simply reply to GET /set/123 with a 202 Accepted until the actual dataset has been completed.
You can then use:
GET /set/123/identifiers
To get a collection of the actual identifiers in the set, for example, if you like.
You can do something like
POST /setfromquery
and send a list of criteria (name like "John*", city = "Los Angeles", etc.). This doesn't really need its own specific resource, just define your query "language" to include both simple lists of IDs as well as perhaps other filter criteria.
Set operations (unions, differences, etc.). Lots of powerful things can be done with a set resource.
Finally, of course, there's the ever popular:
DELETE /set/123
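The lifecycle of such a set resource can be sketched with an in-memory stand-in that returns (status, headers, body) tuples. Everything here is an assumption for illustration: the class name `SetResource`, the `mark_ready` hook that simulates the background work finishing, and the choice of 303 See Other for the redirect after the POST.

```python
import itertools


class SetResource:
    """In-memory stand-in for the /set endpoints."""

    def __init__(self):
        self._sets = {}
        self._next_id = itertools.count(1)

    def post(self, identifiers):
        """POST /set: register the set and redirect to its URL."""
        set_id = next(self._next_id)
        # The set starts out pending; a real server would build it in the background.
        self._sets[set_id] = {"identifiers": list(identifiers), "ready": False}
        return 303, {"Location": f"/set/{set_id}"}, None

    def mark_ready(self, set_id):
        """Simulate the background job completing."""
        self._sets[set_id]["ready"] = True

    def get(self, set_id):
        """GET /set/{id}: 202 while pending, 200 with the items once ready."""
        entry = self._sets.get(set_id)
        if entry is None:
            return 404, {}, None
        if not entry["ready"]:
            return 202, {}, None
        return 200, {}, entry["identifiers"]

    def delete(self, set_id):
        """DELETE /set/{id}: 204 if the set existed, 404 otherwise."""
        existed = self._sets.pop(set_id, None) is not None
        return (204 if existed else 404), {}, None


api = SetResource()
print(api.post(["VIN1", "VIN2"]))
```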
I don't think anyone would fault you in working around GET not accepting a request body by using POST for a request that needs a request body. You are just being pragmatic.
I agree, making 5000 individual requests or upping the query string limit are ugly. POST is the way forward.
Using a POST without creating a resource just seemed too dirty for me. In the end, we made it so that there was a limit of 100 ids requested in a "chunk". In practice, these requests will rarely be > 100, so hacking REST principles to accommodate an edge case seemed like a bad idea. I made sure the limitation was clearly defined in our API docs, done and done...

REST parameters vs URI

I'm just learning REST and trying to figure out how to apply it in practice. I have a sampling of data that I want to query, but I'm not sure how the URLs are meant to be formed, i.e. where I put the query. For example, for querying the most recent 100 data records:
GET http://data.com/data/latest/100
GET http://data.com/data?amount=100
which of the previous two queries is the better, and why? And the same for the following:
GET http://data.com/data/latest-days/2
GET http://data.com/data?days=2
GET http://data.com/data?fromDate=01-01-2000
Thanks in advance.
Personally, I would use the query string format in this case. If your /data path is returning all of the data, and you would like to perform this type of query, I believe it makes the most sense. You could also pass query string parameters such as ?since=01-01-2000 to get entries after a specified date or pass column names such as ?category=clothing to retrieve all entries with category equaling clothing.
Additionally, you would want paths such as /data/{id} to be available to retrieve certain entries given their unique id.
It really depends on a lot of things. If you're using any sort of MVC framework, you'd use the URI segments to define your get request to your API which I personally prefer.
It's not a big deal either way, it's all based on preference and how predictable you want the URL to be to your user. In some cases, I'd say go with the REST parameters, but more often than not a URI based GET is quite clean if your setup supports it.