RESTful way to create multiple items in one request - rest

I am working on a small client server program to collect orders. I want to do this in a "REST(ful) way".
What I want to do is:
Collect all orderlines (product and quantity) and send the complete order to the server
At the moment I see two options to do this:
Send each orderline to the server: POST qty and product_id
I actually don't want to do this because I want to limit the number of requests to the server so option 2:
Collect all the orderlines and send them to the server at once.
How should I implement option 2? a couple of ideas I have is:
Wrap all orderlines in a JSON object and send this to the server or use an array to post the orderlines.
Is it a good idea or good practice to implement option 2, and if so how should I do it.
What is good practice?

I believe that another correct way to approach this would be to create another resource that represents your collection of resources.
Example, imagine that we have an endpoint like /api/sheep/{id} and we can POST to /api/sheep to create a sheep resource.
Now, if we want to support bulk creation, we should consider a new flock resource at /api/flock (or /api/<your-resource>-collection if you lack a better meaningful name). Remember that resources don't need to map to your database or app models. This is a common misconception.
Resources are a higher level representation, unrelated with your data. Operating on a resource can have significant side effects, like firing an alert to a user, updating other related data, initiating a long lived process, etc. For example, we could map a file system or even the unix ps command as a REST API.
I think it is safe to assume that operating a resource may also mean to create several other entities as a side effect.

Although bulk operations (e.g. batch create) are essential in many systems, they are not formally addressed by the RESTful architecture style.
I found that POSTing a collection as you suggested basically works, but problems arise when you need to report failures in response to such a request. Such problems are worse when multiple failures occur for different causes or when the server doesn't support transactions.
My suggestion to you is that if there is no performance problem, for example when the service provider is on the LAN (not WAN) or the data is relatively small, it's worth it to send 100 POST requests to the server. Keep it simple, start with separate requests and if you have a performance problem try to optimize.

Facebook explains how to do this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Simple batched requests
The batch API takes in an array of logical HTTP requests represented
as JSON arrays - each request has a method (corresponding to HTTP
method GET/PUT/POST/DELETE etc.), a relative_url (the portion of the
URL after graph.facebook.com), optional headers array (corresponding
to HTTP headers) and an optional body (for POST and PUT requests). The
Batch API returns an array of logical HTTP responses represented as
JSON arrays - each response has a status code, an optional headers
array and an optional body (which is a JSON encoded string).

Your idea seems valid to me. The implementation is a matter of your preference. You can use JSON or just parameters for this ("order_lines[]" array) and do
POST /orders
Since you are going to create more resources at once in a single action (order and its lines) it's vital to validate each and every of them and save them only if all of them pass validation, ie. you should do it in a transaction.

I've actually been wrestling with this lately, and here's what I'm working towards.
If a POST that adds multiple resources succeeds, return a 200 OK (I was considering a 201, but the user ultimately doesn't land on a resource that was created) along with a page that displays all resources that were added, either in read-only or editable fashion. For instance, a user is able to select and POST multiple images to a gallery using a form comprising only a single file input. If the POST request succeeds in its entirety the user is presented with a set of forms for each image resource representation created that allows them to specify more details about each (name, description, etc).
In the event that one or more resources fails to be created, the POST handler aborts all processing and appends each individual error message to an array. Then, a 419 Conflict is returned and the user is routed to a 419 Conflict error page that presents the contents of the error array, as well as a way back to the form that was submitted.

I guess it's better to send separate requests within single connection. Of course, your web-server should support it

You won't want to send the HTTP headers for 100 orderlines. You neither want to generate any more requests than necessary.
Send the whole order in one JSON object to the server, to: server/order or server/order/new.
Return something that points to: server/order/order_id
Also consider using CREATE PUT instead of POST

Related

What REST method should be used to implement a simple inter-process communication?

This is more a theorical question than a practical one.
We have a backend application that uploads csv files to a frontend application, then and only then the backend sends an empty POST request to tell the frontend to start to process those files to update its database.
For this question it doesn't matter if this is a good design (I think it isn't), what are those files, and what database is: I am only want to know better about the REST "sintax".
I'm referring to wikipedia and restfulapi.net, but I'm not convinced about any alternative, because:
GET: Request sender doesn't receive data;
POST (the currently used): Request sender doesn't want to insert data that are on the request body (just data from external files, if existent. Also they can be insert/update/delete);
PUT: Sounds good, but again, data are not on the request body;
PATCH: Sounds best, but data are not on the body (Also, I am wrong or is it deprecated/unused?);
DELETE: Doesn't always need to delete.
I know it is habit to use POST requests to let machines yell "go!" to each other, but I never thought it was right.
What do you think - in theory - would be the proper method?
The actual reference for the semantics of the HTTP methods is the RFC 7231 and not the ones you referenced in your question.
POST is a catch all method and requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.
4.3.3. POST
The POST method requests that the target resource process the
representation enclosed in the request according to the resource's
own specific semantics. For example, POST is used for the following
functions (among others):
Providing a block of data, such as the fields entered into an HTML
form, to a data-handling process;
Posting a message to a bulletin board, newsgroup, mailing list,
blog, or similar group of articles;
Creating a new resource that has yet to be identified by the
origin server; and
Appending data to a resource's existing representation(s).
[...]
Responses to POST requests are only cacheable when they include
explicit freshness information. However, POST caching is not widely implemented.
In these scenarios, the receiving application knows where the CSV files will be and monitors that location. When it finds one, it processes it and then deletes or archives it. The application will likely have its own criteria for considering itself ready to process, e.g. time of day, size of file etc.
If the data load on the front end takes a long time you could "partition" the updates based on "importance". How you define importance would be up to your business rules. You could then POST a list of CSV filenames/locations to the front end. The list would be ordered by importance. The front end could then update its database based on that importance. Scheduling less important data for a more appropriate time of day.
If the backend knows the difference between new users and updated users you could use PUT and POST. The front end could assign higher priority to PUT requests as they relate to new users, perhaps assigning lower priority and staggered syncing for CSV filenames in POST requests.

Should I use GET or POST REST API Method?

I want to retrieve data about a bunch of resources. Let's say an Array of book id and the response is JSON Array of book objects. I want to send the request payload as JSON to the server.
Should I use GET and POST method?
Note:
I don't want to make multiple GET request for each book ID.
POST seems to be confusing as it is supposed to be used only when the request creates a resource or modifies the server state.
I want to retrieve data about a bunch of resources. Let's say an Array of book id and the response is JSON Array of book objects.
If you are thinking about passing the array of book id as the message body of the HTTP Request, then GET is a bad idea.
A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.
You should use POST instead
POST seems to be confusing as it is supposed to be used only when the request creates a resource or modifies the server state.
That's not quite right. POST can be used for anything -- see GraphQL or SOAP. But what you give up by using POST is the ability of intermediate components to participate in the conversation.
For example, for cases that are effectively read-only, you would like to use a safe method, because that allows pre-caching optimization, and automated retry of lost responses on an unreliable network. POST doesn't have extra semantic constraints, so you lose out.
What HTTP really wants is that you GET using the URI; this can be done in one of two relatively straightforward ways:
POST the ids to the server, to create a new resource (meaning that the server retains for itself a copy of the list of ids), and receive a new resource identifier back in exchange. Then GET using this new identifier any time you want to know the current representation of the results.
Encode the information you need into the URI itself. Most commonly, this is done using the query part of the URI, although that isn't strictly necessary. The downside here is that if the URI encoded representation of the array of ids is very long, you may have trouble with some implementations that enforce arbitrary URI limits.
There aren't always great answers:
The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction.
If I understand correctly, you want to get a list of all of the items in a list, in one pull. This would be possible using GET, as REST returns the JSON it can by default be up to 100 items, and you can get more items if needed by specifying $top.
As far as writing back or to the server, POST would be what your looking for, this to my understanding would need to be one for one.
you are going to use a GET-Request and put your request-data (book-id array) in the data-section of your ajax (or whatever you're going to use) request. See How to pass parameters in GET requests with jQuery

How to pass a large number of input parameters to RESTful service?

I have a RESTful service that returns detailed data about a machine by the supplied list of Ids. GET api/machine/
http://service.com/api/machine/1,2,3,4
Up till now this has been fine since I am getting a small number of machines at a time, but now I need to get all machines (more then 1000). This exceeds the 2000 character limit on URLs.
I have gotten both of the options below to work and I'm looking for some community feedback on which way to go.
Option 1: Split up my GET. Make multiple calls with a subset of the ids. Pros: I am doing a get so using the HTTP verb GET makes sense. Cons: If a person new to the service doesn't know about this limit, or doesn't use my client, it would cause problems.
Option 2: Add a PUT/POST method and include the full list of ids in the body. Pros: Makes 1 call to get all data. Cons: I am now doing a get from a PUT/POST.
Probably your best course-of-action would be something in the lines of option 2, you can create a JSON on your side with an array of the numbers you want to send in the Body of the message. If there's the possibility of it still being far too large, you can split it in several messages, when you receive the response of one you'd send the next item in the queue, and so on.
Another option, used by the Facebook API among others, is to create a "/batch" POST method which can be used to make multiple requests in one go.
So instead of having http://service.com/api/machine/1,2,3,4,5,.... you'll have a batch of requests with /machine/1, /machine/2, /machine/3, etc.
The advantage is that you keep clean RESTful URLs (no more coma-separated values) and it scales very well since you can batch as many requests as you want.
The disadvantage is that it is slightly more complex to build.
See there for more information - https://developers.facebook.com/docs/graph-api/making-multiple-requests

Rest POST VS GET if payload is huge

I understand the definition of GET and POST as below.
GET: List the members of the collection, complete with their member URIs for further navigation. For example, list all the cars for sale.
POST: Create a new entry in the collection where the ID is assigned automatically by the collection. The ID created is usually included as part of the data returned by this operation.
MY API searches for some detail in server with huge request payload with JSON Message in that case Which Verb should i use ?
Also can anyone please let me know the length of the characters that can be passed in query string.
The main difference between a GET and POST request is that in the former, the entire request is encoded as part of the URL itself, whereas in the latter, parameters are sent after the header. In addition, in GET request, different browsers will impose different limits on how big the URL can be. Most modern browsers will allow at least 200KB, however Internet Explorer seems to limit the URL size to 2KB.
That being said, if you have any suspicion that you will be passing in a large number of parameters which could exceed the limit imposed on GET requests by the receiving web server, you should switch to POST instead.
Here is a site which surveyed the GET behavior of most modern browsers, and it is worth a read.
Late to the party but for anyone searching for a solution, this might help.
I just came up with 2 different strategies to solve this problem. I'll create proof of concept API and test which one suites me better. Here are the solution I'm currently thinking:
1. X-HTTP-Method-Override:
Basically we would tunnel a GET request using POST/PUT method, with added X-HTTP-Method-Override request header, so that server routes the request to GET call. Simple to implement and does work in one trip.
2. Divide and Rule:
Divide requests into two separate requests. Send a POST/PUT request with all payload, to which server will create necessary response and store it in cache/db along with a key/id to access the data. Then server will respond with either "Location" header or the Key/id through which the stored response can be accessed.
Now send GET request with the key/location given by server on previous POST request. A bit complicated to implement and needs two requests, also requires a separate strategy to clean the cached responses.
If this is going to be a typical situation for your API then a RESTful approach could be to POST query data to a buffer endpoint which returns a URI from which you can GET your results.
Who knows maybe a cache of these will mitigate the need to send "huge" blobs of data about.
Well You Can Use Both To get Results From Server By Passing Some Data To server
In Case Of One Or Two Parameters like Id
Here Only One Parameter Is Used .But 3 to 4 params can Be used This Is How I Used In angularjs
Prefer : Get
Example : $http.get('/getEmployeeDataById?id=22');
In Case It Is Big Json Object
Prefer : Post
Example : var dataObj =
{
name : $scope.name,
age : $scope.age,
headoffice : $scope.headoffice
};
var res = $http.post('/getEmployeesList', dataObj);
And For Size Of Characters That Can Be Passed In Query String Here Is Already Answered
If you're getting data from the server, use GET. If you want to post something, use POST. Payload size is irrelevent. If you want to work with smaller payloads, you could implement pagination.

Avoid duplicate POSTs with REST

I have been using POST in a REST API to create objects. Every once in a while, the server will create the object, but the client will be disconnected before it receives the 201 Created response. The client only sees a failed POST request, and tries again later, and the server happily creates a duplicate object...
Others must have had this problem, right? But I google around, and everyone just seems to ignore it.
I have 2 solutions:
A) Use PUT instead, and create the (GU)ID on the client.
B) Add a GUID to all objects created on the client, and have the server enforce their UNIQUE-ness.
A doesn't match existing frameworks very well, and B feels like a hack. How does other people solve this, in the real world?
Edit:
With Backbone.js, you can set a GUID as the id when you create an object on the client. When it is saved, Backbone will do a PUT request. Make your REST backend handle PUT to non-existing id's, and you're set.
Another solution that's been proposed for this is POST Once Exactly (POE), in which the server generates single-use POST URIs that, when used more than once, will cause the server to return a 405 response.
The downsides are that 1) the POE draft was allowed to expire without any further progress on standardization, and thus 2) implementing it requires changes to clients to make use of the new POE headers, and extra work by servers to implement the POE semantics.
By googling you can find a few APIs that are using it though.
Another idea I had for solving this problem is that of a conditional POST, which I described and asked for feedback on here.
There seems to be no consensus on the best way to prevent duplicate resource creation in cases where the unique URI generation is unable to be PUT on the client and hence POST is needed.
I always use B -- detection of dups due to whatever problem belongs on the server side.
Detection of duplicates is a kludge, and can get very complicated. Genuine distinct but similar requests can arrive at the same time, perhaps because a network connection is restored. And repeat requests can arrive hours or days apart if a network connection drops out.
All of the discussion of identifiers in the other anwsers is with the goal of giving an error in response to duplicate requests, but this will normally just incite a client to get or generate a new id and try again.
A simple and robust pattern to solve this problem is as follows: Server applications should store all responses to unsafe requests, then, if they see a duplicate request, they can repeat the previous response and do nothing else. Do this for all unsafe requests and you will solve a bunch of thorny problems. Repeat DELETE requests will get the original confirmation, not a 404 error. Repeat POSTS do not create duplicates. Repeated updates do not overwrite subsequent changes etc. etc.
"Duplicate" is determined by an application-level id (that serves just to identify the action, not the underlying resource). This can be either a client-generated GUID or a server-generated sequence number. In this second case, a request-response should be dedicated just to exchanging the id. I like this solution because the dedicated step makes clients think they're getting something precious that they need to look after. If they can generate their own identifiers, they're more likely to put this line inside the loop and every bloody request will have a new id.
Using this scheme, all POSTs are empty, and POST is used only for retrieving an action identifier. All PUTs and DELETEs are fully idempotent: successive requests get the same (stored and replayed) response and cause nothing further to happen. The nicest thing about this pattern is its Kung-Fu (Panda) quality. It takes a weakness: the propensity for clients to repeat a request any time they get an unexpected response, and turns it into a force :-)
I have a little google doc here if any-one cares.
You could try a two step approach. You request an object to be created, which returns a token. Then in a second request, ask for a status using the token. Until the status is requested using the token, you leave it in a "staged" state.
If the client disconnects after the first request, they won't have the token and the object stays "staged" indefinitely or until you remove it with another process.
If the first request succeeds, you have a valid token and you can grab the created object as many times as you want without it recreating anything.
There's no reason why the token can't be the ID of the object in the data store. You can create the object during the first request. The second request really just updates the "staged" field.
Server-issued Identifiers
If you are dealing with the case where it is the server that issues the identifiers, create the object in a temporary, staged state. (This is an inherently non-idempotent operation, so it should be done with POST.) The client then has to do a further operation on it to transfer it from the staged state into the active/preserved state (which might be a PUT of a property of the resource, or a suitable POST to the resource).
Each client ought to be able to GET a list of their resources in the staged state somehow (maybe mixed with other resources) and ought to be able to DELETE resources they've created if they're still just staged. You can also periodically delete staged resources that have been inactive for some time.
You do not need to reveal one client's staged resources to any other client; they need exist globally only after the confirmatory step.
Client-issued Identifiers
The alternative is for the client to issue the identifiers. This is mainly useful where you are modeling something like a filestore, as the names of files are typically significant to user code. In this case, you can use PUT to do the creation of the resource as you can do it all idempotently.
The down-side of this is that clients are able to create IDs, and so you have no control at all over what IDs they use.
There is another variation of this problem. Having a client generate a unique id indicates that we are asking a customer to solve this problem for us. Consider an environment where we have a publicly exposed APIs and have 100s of clients integrating with these APIs. Practically, we have no control over the client code and the correctness of his implementation of uniqueness. Hence, it would probably be better to have intelligence in understanding if a request is a duplicate. One simple approach here would be to calculate and store check-sum of every request based on attributes from a user input, define some time threshold (x mins) and compare every new request from the same client against the ones received in past x mins. If the checksum matches, it could be a duplicate request and add some challenge mechanism for a client to resolve this.
If a client is making two different requests with same parameters within x mins, it might be worth to ensure that this is intentional even if it's coming with a unique request id.
This approach may not be suitable for every use case, however, I think this will be useful for cases where the business impact of executing the second call is high and can potentially cost a customer. Consider a situation of payment processing engine where an intermediate layer ends up in retrying a failed requests OR a customer double clicked resulting in submitting two requests by client layer.
Design
Automatic (without the need to maintain a manual black list)
Memory optimized
Disk optimized
Algorithm [solution 1]
REST arrives with UUID
Web server checks if UUID is in Memory cache black list table (if yes, answer 409)
Server writes the request to DB (if was not filtered by ETS)
DB checks if the UUID is repeated before writing
If yes, answer 409 for the server, and blacklist to Memory Cache and Disk
If not repeated write to DB and answer 200
Algorithm [solution 2]
REST arrives with UUID
Save the UUID in the Memory Cache table (expire for 30 days)
Web server checks if UUID is in Memory Cache black list table [return HTTP 409]
Server writes the request to DB [return HTTP 200]
In solution 2, the threshold to create the Memory Cache blacklist is created ONLY in memory, so DB will never be checked for duplicates. The definition of 'duplication' is "any request that comes into a period of time". We also replicate the Memory Cache table on the disk, so we fill it before starting up the server.
In solution 1, there will be never a duplicate, because we always check in the disk ONLY once before writing, and if it's duplicated, the next roundtrips will be treated by the Memory Cache. This solution is better for Big Query, because requests there are not imdepotents, but it's also less optmized.
HTTP response code for POST when resource already exists