HTTP request to search for multiple ObjectIds in a Mongo-based API? - mongodb

I'm looking to add search functionality to an API, on a resource called Organizations. Organizations can have different Location and Audience ids tagged onto them (which I would like to use in searching). Since these ids are MongoDB ObjectIds, they are quite long and I'm worried about reaching the max query string limit of the browser with a GET request. For example:
GET http://my-site.com/api/organizations?locations=5afa54e5516c5b57c0d43227,5afa54e5516c5b57c0d43226,5afa54e5516c5b57c0d43225,5afa54e5516c5b57c0d43224&audiences=5afa54e5516c5b57c0d43223,5afa54e5516c5b57c0d43222
Which would probably be about an average search, however I don't want it to break if users select many Locations or Audiences.
Any advice on how I could handle this situation?

I've ran into your situation before. You can change your method to POST
For a input of locations and audiences, your resource is not already sitting there. You have to compute it.
By the definition of POST:
Perform resource-specific processing on the request payload.
Providing a block of data, such as the fields entered into an HTML
form, to a data-handling process;
You have to compute and create new resource for response. So it's REST-compliance to do so.

Related

Determine the proper REST API method

I have the following functionalities in my API:
Getting a user by their name
Getting a user by their ID
Getting a user, or if it doesn't exist create one
Getting multiple users by their ID
Currently I'm handling the two first functionalities with a GET request, and the third with a POST request. I could use a GET request for getting multiple users, but sending potentially hundreds of IDs through a query parameter seems like the wrong approach, as I would get a very long URL. I could also use a POST request to send the long list of IDs through its body, but I doubt a POST request is meant for this purpose.
What method would be appropriate to use for the last one?
I see 2 possible ways here:
Get all users and do your filtering post response.
Get a range of IDs, which resumes to only 2 parameters, the lower and the upper limits of the interval. (If this satisfy your needs)
Behaving the way you described and avoiding long URLs in the same time will not work together.

Should I allow user-provided values to be passed through a query string?

I'm adding a search endpoint to a RESTful API. After reading this SO answer, I'd like the endpoint to be designed like:
GET /users?firstName=Otis&hobby=golf,rugby,hunting
That seems like a good idea so far. But the values that I'll be using to perform the search will be provided by the user via a standard HTML input field. I'll guard against malicious injections on the server-side, so that's not my concern. I'm more concerned about the user providing a value that causes the URL to exceed the max URL length of ~2000 characters.
I can do some max-length validation and add some user prompts, etc, but I'm wondering if there's a more standard way to handle this case.
I thought about providing the values in the request body using POST /users, but that endpoint is reserved for new user creation, so that's out.
Any thoughts? Thanks.
I see these possible solutions:
not actually a solution. Go with the query parameter and accept the length constraints
go with the POST solution that shouldn't be designed as you mention. As you point out, if you POST a user to .../users you will create a new user entity. But this is not what you want to do. You want to submit a search ticket to the server that will return a list of results matching your criteria. I'll design something as such
POST .../search/users passing in the body a representation of your search item
distribute the query both server side and client side. Say you have complex criteria to match. Set up a taxonomy of them so that the most strict ones are handled server side. Thus, the server is able to return a manageable list of items you can subsequently filter on the client side. In this approach you can save space in the query string by sending to the server only a subset of the criteria you want to meet in your search.

Design a REST API in which a search request can take parameters for multiple Queries

I have to design a REST API in which a search request can take parameters for multiple Queries ( i.e. when the client make a call using this API, he should be able to send parameters to form multiple queries).
We have an existing API where we are using GET and it takes multiple parameters which together forms a single Query and then this API call returns the response for this query.
e.g. currently I can pass firstName, lastName, age etc in the request and then get back the person.
But now I have to enhance this service(or have a separate service) where I should be able to send parameters like firstName1, lastName1, age1 to search person1 ; firstName2, lastName2, age2 to search person2 and so on.
Should I use POST for the new API and then send list of parameters(params for query1, params for query2 and so on)?
Or is there a better approach.
We are using Spring Boot for REST implementation.
Its better to use POST because GET is good for 2,3 parameter but when you have a set of parameter or object then POST is Good.
The best thing to do here will be do POST and then return a JSON object with all the details of the Person in an array.
That way it will be faster and you would not have to deal with long urls for GET.
Also GET has limitations regarding the length of the request whereas there is no such limitation in case of POST.
It is really hard to give a right answer here. In general sending a GET request does have the advantage that you can leverage caching easily on a HTTP level, e.g. by using products like varnish, nginx, etc. But if you already can forsee that your URL including all params you'll have to send a POST request to make it work in all Browsers.
RESTfull architecture should respect the principle of addressability.
Since multiple users can be accessed through a unique request, then ideally this group of user should get an address, which would identify it as a resource.
However I understand that in the real world, URIs have a limited length (maximum length of HTTP GET request?). A POST request would indeed work well, but we lose the benefit of addressability.
Another way would be to expose a new resource : group,.
Lets suppose that your current model is something like this :
.../users/{id}
.../users/search?{arg1}={val1};{arg2}={val2}
You could eventually do something like :
.../users/groups/
.../users/groups/{id}
.../users/search?group={id}
(explanation below)
then you could split your research in two :
first a POST on .../users/groups/ with, as proposed by other response, a JSON description of the search parameters. This request could scan the .../users/groups/ directory, and if this set of parameters exists, return the corresponding address .../users/groups/{id}. (for performance issues you could for instance define {id} with a first part which would give the number of users requested).
Then you could make a request for this group with a GET with something like this : .../users/search?group={id}.
This approach would be a bit more complex to implement, but is more consistent with the resource oriented paradigm.

Handling long queries without violating REST

We have a REST api, and we've done a pretty good job at sticking to the spirit of REST. However, we have an important consumer, and they're requesting a way to reconcile their datastore. The flow works like this:
Consumer makes a GET call to retrieve all inventory objects created within a date range. Lets say this returns 1 million inventory VINs.
Consumer compares the payload with their own datastore, see's that they're missing 5,000 inventory objects
Consumer would like to make a request with the 5,000 VIN id's, and return those 5,000 objects.
The problem is that the long query string (JSON array of vins) bumps into the query string length limits imposed by our server. Possbile ideas - make 5k separate calls (seems horrible), increase querystring length limit on server (would like not to do this), use POST instead (not RESTful?).
So, I'm wondering what Roy Fielding would do...
What about a POST submitting the JSON file with the id's list to a new resource, e.g. called /inventory/difference?
If the computation goes any long, you can answer with 202 Accepted and the id of the resource being generated, then point back to it at /inventory/difference/:id.
Somewhat similar to what moonwave99 suggested, but instead you create a resource called a "set".
You POST to /set a list of identifiers that you wish to be in the set. The result of the POST is a redirect URL to the resource that names the specific set.
So:
POST /set
Result:
301 Moved Permanently
Location: /set/123
Then:
GET /set/123
Returns the list of items in the set.
Sets are orthogonal to the use case of "fetching differences", they're simply a compilation of items.
If the creation of a set takes a long time, and you consider the set itself to be a snapshot of the data, when the user tries to do the GET /set/123 can simply reply with a 202 Accepted until the actual dataset has been completed.
You can then use:
GET /set/123/identifiers
To get a collection of the actual identifiers in the set, for example, if you like.
You can do something like
POST /setfromquery
and send a list of criteria (name like "John*", city = "Los Angeles", etc.). This doesn't really need its own specific resource, just define your query "language" to include both simple lists of IDs as well as perhaps other filter criteria.
Set operations (unions, differences, etc.). Lots of powerful things can be done with a set resource.
Finally, of course, there's the ever popular:
DELETE /set/123
I don't think anyone would fault you in working around GET not accepting a request body by using POST for a request that needs a request body. You are just being pragmatic.
I agree, making 5000 individual requests or upping the query string limit are ugly. POST is the way forward.
Using a post without creating a resource just seemed too dirty for me. In the end, we made it so that there was a limit of 100 ids requested in a "chunk". In practice, these requests will rarely be > 100, so hacking REST principles to accomodate an edge case seemed like a bad idea. I made sure the limitation was clearly defined in our API docs, done and done...

RESTful way to create multiple items in one request

I am working on a small client server program to collect orders. I want to do this in a "REST(ful) way".
What I want to do is:
Collect all orderlines (product and quantity) and send the complete order to the server
At the moment I see two options to do this:
Send each orderline to the server: POST qty and product_id
I actually don't want to do this because I want to limit the number of requests to the server so option 2:
Collect all the orderlines and send them to the server at once.
How should I implement option 2? a couple of ideas I have is:
Wrap all orderlines in a JSON object and send this to the server or use an array to post the orderlines.
Is it a good idea or good practice to implement option 2, and if so how should I do it.
What is good practice?
I believe that another correct way to approach this would be to create another resource that represents your collection of resources.
Example, imagine that we have an endpoint like /api/sheep/{id} and we can POST to /api/sheep to create a sheep resource.
Now, if we want to support bulk creation, we should consider a new flock resource at /api/flock (or /api/<your-resource>-collection if you lack a better meaningful name). Remember that resources don't need to map to your database or app models. This is a common misconception.
Resources are a higher level representation, unrelated with your data. Operating on a resource can have significant side effects, like firing an alert to a user, updating other related data, initiating a long lived process, etc. For example, we could map a file system or even the unix ps command as a REST API.
I think it is safe to assume that operating a resource may also mean to create several other entities as a side effect.
Although bulk operations (e.g. batch create) are essential in many systems, they are not formally addressed by the RESTful architecture style.
I found that POSTing a collection as you suggested basically works, but problems arise when you need to report failures in response to such a request. Such problems are worse when multiple failures occur for different causes or when the server doesn't support transactions.
My suggestion to you is that if there is no performance problem, for example when the service provider is on the LAN (not WAN) or the data is relatively small, it's worth it to send 100 POST requests to the server. Keep it simple, start with separate requests and if you have a performance problem try to optimize.
Facebook explains how to do this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Simple batched requests
The batch API takes in an array of logical HTTP requests represented
as JSON arrays - each request has a method (corresponding to HTTP
method GET/PUT/POST/DELETE etc.), a relative_url (the portion of the
URL after graph.facebook.com), optional headers array (corresponding
to HTTP headers) and an optional body (for POST and PUT requests). The
Batch API returns an array of logical HTTP responses represented as
JSON arrays - each response has a status code, an optional headers
array and an optional body (which is a JSON encoded string).
Your idea seems valid to me. The implementation is a matter of your preference. You can use JSON or just parameters for this ("order_lines[]" array) and do
POST /orders
Since you are going to create more resources at once in a single action (order and its lines) it's vital to validate each and every of them and save them only if all of them pass validation, ie. you should do it in a transaction.
I've actually been wrestling with this lately, and here's what I'm working towards.
If a POST that adds multiple resources succeeds, return a 200 OK (I was considering a 201, but the user ultimately doesn't land on a resource that was created) along with a page that displays all resources that were added, either in read-only or editable fashion. For instance, a user is able to select and POST multiple images to a gallery using a form comprising only a single file input. If the POST request succeeds in its entirety the user is presented with a set of forms for each image resource representation created that allows them to specify more details about each (name, description, etc).
In the event that one or more resources fails to be created, the POST handler aborts all processing and appends each individual error message to an array. Then, a 419 Conflict is returned and the user is routed to a 419 Conflict error page that presents the contents of the error array, as well as a way back to the form that was submitted.
I guess it's better to send separate requests within single connection. Of course, your web-server should support it
You won't want to send the HTTP headers for 100 orderlines. You neither want to generate any more requests than necessary.
Send the whole order in one JSON object to the server, to: server/order or server/order/new.
Return something that points to: server/order/order_id
Also consider using CREATE PUT instead of POST