RESTful: creating multiple records in one request - rest

I have a form that allows the user to send invites to others. The number of invites is configurable by the user in the user interface, and could theoretically be unlimited. The user needs to define an email address per invite.
When clicking 'send' it should ideally post one request to the server, wrapping all records in one bulk submit. Even though this is not truly RESTful (I heard), it seems favourable over sending possibly 50 separate requests. However, what would be the proper way to do this?
It gets tricky when one of the invites fails, due to a malformed email address or duplicate invites or so. It is fine to properly process the other valid requests, and provide errors on the invalid requests, but what response status code would one use for this?
Generally I try to use the JSONAPI request format. The errors would be in a top object called errors and would be an array consisting of multiple objects. The field key within an error object would point to the record index number (as received in the request) and field name of the error, i.e. "field": "/invites/0/email" for an error on the email field in the first received record.
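A minimal sketch of the error shape described above. The `validate_invites` helper, the `detail` key and the loose email pattern are assumptions made for illustration; only the `field` pointer format comes from the question.

```python
import re

# Deliberately loose pattern, for illustration only; real validation may differ.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_invites(invites):
    """Split a bulk payload into accepted records and per-record errors.

    `invites` is a list of dicts such as {"email": "a@example.com"}.
    Returns (accepted, errors); each error carries a "field" pointer to the
    record index and field name as received in the request.
    """
    accepted, errors, seen = [], [], set()
    for index, invite in enumerate(invites):
        email = str(invite.get("email", "")).strip().lower()
        pointer = f"/invites/{index}/email"
        if not EMAIL_RE.match(email):
            errors.append({"field": pointer, "detail": "malformed email address"})
        elif email in seen:
            errors.append({"field": pointer, "detail": "duplicate invite"})
        else:
            seen.add(email)
            accepted.append({"email": email})
    return accepted, errors
```

The valid records can then be processed while the `errors` list is returned in the top-level errors object, so the client can map each failure back to the row it submitted.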

The best solution I've seen to the "batch request" problem is Google Calendar's API. It is a RESTful API, and therefore there is a URL for every resource, which you can manipulate using standard REST semantics (i.e. GET, POST, PUT, DELETE). But the API also exposes a "/batch" endpoint, which accepts a content-type of "multipart/mixed", and the request body contains several nested HTTP requests, each with its own headers, method, URL and everything. The response is also one HTTP response with a content-type of "multipart/mixed" containing a collection of individual HTTP responses, one response per request.
The advantages of this solution are that:
1. It allows you to design your system in a RESTful manner, which we all know and love.
2. It generalizes well to any combination of HTTP requests that your system can deal with.
For more info see: https://developers.google.com/google-apps/calendar/batch
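To make the nesting concrete, here is a rough sketch of how a client could assemble such a body. It is a simplified illustration, not Google's exact wire format: the `build_batch_body` helper, the boundary string, the `/invites` path and the `application/http` part header are all assumptions.

```python
def build_batch_body(requests, boundary="batch_boundary"):
    """Serialize several HTTP requests into one multipart/mixed body.

    `requests` is a list of (method, path, headers, body) tuples. Each part
    is declared as application/http and contains a complete nested request.
    Returns (content_type_header_value, body_text).
    """
    lines = []
    for method, path, headers, body in requests:
        lines.append(f"--{boundary}")
        lines.append("Content-Type: application/http")
        lines.append("")  # blank line ends the part headers
        lines.append(f"{method} {path} HTTP/1.1")
        lines.extend(f"{name}: {value}" for name, value in headers.items())
        lines.append("")  # blank line ends the nested request headers
        lines.append(body)
    lines.append(f"--{boundary}--")  # closing delimiter
    content_type = f"multipart/mixed; boundary={boundary}"
    return content_type, "\r\n".join(lines) + "\r\n"
```

Each invite would become one nested POST, and the server can answer with one nested response per part, so every record gets its own status code.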

Related

What is the best REST API design that enables a resource to be sent by email?

I'm designing a REST API for an ordering system using the CRUD paradigm.
My routes are as follows:
- GET orders
- POST orders
- GET orders/{order}
- PATCH orders/{order}
- DELETE orders/{order}
This makes perfect sense to me, however, each order can then be sent by email once it has been reviewed and I'm not 100% sure what way to approach it.
I had thought of using:
- POST orders/{order}/sendemail
but then I think I'm using POST incorrectly because it's used for creating resources. Also the route now has a verb in it, which isn't ideal for a resource based REST API.
Then my next thought was to use:
- POST emails/orders/{order}
but that would then imply that emails are a resource which they aren't.
Or should I be using a combination of routes and query strings?
- POST orders/{order}?send-email=true
What would be the best way?
I think I'm using POST incorrectly because it's used for creating resources. Also the route now has a verb in it, which isn't ideal for a resource based REST API.
Both of those ideas are wrong. POST doesn't mean create, it means POST. See also Fielding 2009:
POST serves many useful purposes in HTTP, including the general purpose of "this action isn’t worth standardizing."
REST doesn't care what spelling conventions you use for your resource identifiers; as an experiment, see if the following links work the way you expect:
https://www.merriam-webster.com/dictionary/get
https://www.merriam-webster.com/dictionary/post
https://www.merriam-webster.com/dictionary/put
https://www.merriam-webster.com/dictionary/patch
https://www.merriam-webster.com/dictionary/delete
Presumably, you don't want to send emails any time a web crawler happens to come by, indexing your resources, or when a smart cache tries to optimistically load all the links that it knows about, so you are going to want to consider unsafe methods.
POST is the most straightforward choice:
POST /??? HTTP/2.0
Content-Type: text/plain
Hey Bob,
Please send the email for order #12345
Riddle: if all of the information that you need is in the body of the HTTP request, does it matter what the target-uri in the request-line is?
Answer: yes! It matters because of caching; the target-uri for the request also identifies the document that should be invalidated in the cache if the request is successful.
In other words, we want to look for some related resource (document) whose representations will be changed if the request is successful. In this specific case, that might be a document that somewhere in its representation includes the status of the email itself.
The most obvious candidate? The resource for the order itself
POST /orders/12345 HTTP/2.0
Content-Type: text/plain
Hey Bob,
Please send the email for order #12345
You can also use a remote authoring idiom; the basic idea being that the client edits the document, and then the server figures out what to do based on the edits. For example
GET /???
200 OK
Content-Type: text/plain
Order #12345
Email:
Status: PENDING APPROVAL
PUT /???
Order #12345
Email:
Status: APPROVED
And then it's up to the server to diff the two representations of the resource, figure out that the status changed, and conclude that the email should be sent.
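A small sketch of that server-side comparison, assuming the representations have already been parsed into dicts. The `side_effects_for_put` name and the `send_order_email` action label are hypothetical.

```python
def side_effects_for_put(current, proposed):
    """Compare the stored representation with the one sent via PUT.

    Both arguments are dicts such as {"order": "12345", "status": "APPROVED"}.
    Returns a list of action names the server should perform after saving.
    """
    actions = []
    status_changed = current.get("status") != proposed.get("status")
    if status_changed and proposed.get("status") == "APPROVED":
        actions.append("send_order_email")
    return actions
```

Because the email is tied to the transition rather than to the request, repeating the same PUT against an already approved order yields no further action, which keeps the PUT idempotent.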
Again, you've got some freedom about what /??? should be - is it the same document as the order itself? is it some other document? is that other document in a subordinate location to the order document, or in some other part of the hierarchy?
Neither REST nor HTTP cares very much, other than noting that if the resource identifiers are different, then they are different documents with independent caching policies.
Your problem is that you are trying to think of how you request an email in REST terms as a resource, but you said yourself that emails aren't a resource. It sounds like emails are an indirect result of some sort of state change (the review and approval of an order).
Maybe it makes sense to have order approvals as a resource. If the email generation is tied to the approval process you can have:
/approvals/orders/{order}
With a POST operation to create an approval for the order. This would trigger the email.
Alternatively, if the status is tied into the order you might have:
/orders/{order}
with a PUT operation to update a status field from pending to approved and this would trigger the email.

Are PUT and DELETE HTTP methods indispensable just because of their idempotency property?

I have a REST API and I want to handle all HTTP requests via POST request.
Is there any performance or other kind of issue in using just POST to perform all CRUD operations requested by a user, which sends a JSON containing some data and the operation to be performed?
Technically, HTML as used on the Web only supports GET and POST, and the Web is more or less the reference implementation of a REST architecture.
So, while this is possible, I wouldn't advocate it: the idempotency of PUT and DELETE provides benefits in case of network issues, because a client can automatically resend the request regardless of whether the initial request, whose response might have been lost mid-way, actually performed its task. The result should always be an updated or created resource, or a removed URI mapping to the actual resource (or even a removal of the actual resource), as DELETE technically just removes the URI mapping.
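The retry benefit can be sketched on the client side as follows. The `send` callable and its `ConnectionError` behaviour are assumptions; the set of idempotent methods is the one RFC 7231 defines.

```python
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"}


def send_with_retry(send, method, url, body=None, max_attempts=3):
    """Call `send(method, url, body)` and retry lost responses when safe to do so.

    `send` is a caller-supplied function (hypothetical) that raises
    ConnectionError when no response arrives. Non-idempotent methods such as
    POST are attempted once, because the first attempt may have taken effect.
    """
    attempts = max_attempts if method.upper() in IDEMPOTENT_METHODS else 1
    for attempt in range(1, attempts + 1):
        try:
            return send(method, url, body)
        except ConnectionError:
            if attempt == attempts:
                raise
```

If every operation is tunnelled through POST, a generic client like this has no basis for retrying anything automatically.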
As for putting the operation in the payload, it depends. This sounds very RPC-like to me, similar to SOAP, for example. If the operation is defined by a well-defined media type, however, as in the JSON Patch case, I guess this is not wrong. As on the Web, though, a server should offer some resource that teaches a client how to build a request, like HTML does with forms. This teaches the client not only which fields the server supports for the target resource but also where to send the request, as well as the media type and HTTP method to use, which might be fixed to POST as in the HTML case.

Which REST HTTP verb to use for "Q&A" scenario?

An auth system I work on has this new function:
1. Auth system allows users to specify Relying Parties they transact with,
2. The Relying Party can approve/deny/maybe the request (authorisation) - maybe causes a redirect to the RP website for further authorisation questions by the RP.
The RP has to implement a web service specified by the Auth System to perform the approve/deny/maybe request that the auth system generates.
My problem is what this looks like as a REST service. As the auth system can't really dictate the URI style for the RP system, I would like to specify that the path does not have any parameters in it; the auth system just needs to know the URI of the service. The data of the request (user name/id) might be in a bit of JSON in the request body (suggesting the POST HTTP verb; GET might be OK, but I'm loath to expose user ids in the URI). The auth system does not care what the RP does with the request data; it just wants a "yes/no/maybe" reply (so this may not really fit a GET/POST/PATCH/DELETE/etc. paradigm).
What would be the best verb to use? And how should the reply be facilitated? It's not really a success/failure response, as there are 3 possible results to the query. Is it acceptable to have some JSON returned with the response (and then what HTTP verb should be used)?
I'm a bit baffled by this. GET seems the most obvious
GET /api/user_link_authorize/{userid}
except then I'm forced to put user ids in the URI (which I don't want to do)...
Any suggestions?
My problem is what this looks like as a REST service.
Think about how it would look as a web site.
You would start with some known URI in your list of bookmarks. Fetching that page would give you a representation of a form, which would have input controls that describe what data needs to be provided (and possibly include default values). The client provides the data it knows about, and submits the form. The data in the form is used to create an HTTP request as described by HTML's form processing rules. The response to that request includes a representation of the answer, or possibly the next bit of work to be done.
That's REST.
Retrieving the form (via the bookmarked URI) would be a GET of course; we're just updating our locally cached copy of the form's "current" representation. Submitting the form could be a GET or a POST; we don't necessarily need to know that in advance, because that information is carried in the representation of the form itself.
GET vs POST involves a number of trade-offs. Semantically, GET is safe: it implies that the resource can be fetched at any time, that spiders can crawl it, that accessing the resource in that way is "free". That is great when the resource is free, because clients on an unreliable network can automatically retry the request if the response is lost. On the other hand, announcing to the world that the request is safe when it is actually expensive to produce responses is not a winning play.
Furthermore, GET doesn't support a message body (more precisely, the payload has no defined semantics). That means that information provided by the client needs to be part of the target resource identifier itself. If you are dealing with sensitive information, that can be problematic -- not necessarily in transit (you can use a secured socket), but certainly in making sure that the URI with sensitive information is not logged where the sensitive data can leak.
POST supports including a payload with the request, but it doesn't promise that the query is safe, which means that generic components won't know if they can automatically retry the request when a response is lost.
Given that you don't want the user id in the URI, that's a point against GET, and therefore in favor of POST.
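As a rough illustration of the POST variant, the Relying Party's endpoint could look something like the sketch below. The `user_id` field, the `decision` key, the `decide` policy callback and the specific status codes are assumptions, not part of any specification mentioned in the question.

```python
import json

DECISIONS = {"approve", "deny", "maybe"}


def handle_authorisation_query(raw_body, decide):
    """Handle a POST to the Relying Party's fixed URI.

    `raw_body` is the JSON request body, e.g. b'{"user_id": "u-42"}'.
    `decide` is the RP's own (hypothetical) policy function returning one of
    "approve", "deny" or "maybe". Returns (status_code, response_json).
    """
    try:
        user_id = json.loads(raw_body)["user_id"]
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "body must be JSON with a user_id"})
    decision = decide(user_id)
    if decision not in DECISIONS:
        return 500, json.dumps({"error": "policy returned an unknown decision"})
    return 200, json.dumps({"decision": decision})
```

All three answers are successful outcomes of the query, so in this sketch they share one success status and differ only in the JSON body; the user id never appears in the URI.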

Differentiating REST status codes

Lately, I have started adding status codes to my responses instead of returning them directly.
Let's assume /person/1 returns a person with id 1 from the DB. If the person does not exist, should I return 404 status? How am I supposed to differentiate if the endpoint does not exist on the server or the resource does not exist?
Now, let's assume I have a POST endpoint for inserting users. What if that endpoint checks if the email is formed correctly and I return 400? How should I know if the request was not formed correctly and did not route to any servlets or if it indeed reached the servlet which decided that email is badly formed?
Is it a good practice to always return a 200 OK response from all of my servlets indicating that the application has done its job regardless of the outcome and write the status in a json field status or is this an overkill and an anti-pattern?
I do not have a lot of experience nor knowledge of HTTP servers so I am not sure I am explaining this (nor using it) right, so I apologize for the broad descriptions.
Let's assume /person/1 returns a person with id 1 from the DB. If the person does not exist, should I return 404 status? How am I supposed to differentiate if the endpoint does not exist on the server or the resource does not exist?
To a client it doesn't matter whether the resource or the endpoint did not exist. All it is told by the server is that for the given URI there is no representation available.
As inf3rno already mentioned, a client is usually served all of the URIs it will need by the server directly in a response. Through bookmarking or links included in some external resource, certain links might become invalid over time, and as such a 404 Not Found response just informs the client that no representation is available for the given URI.
A client is typically also not interested in the internals of an API; it just wants to send or receive data it can work with.
A further misconception many users unfortunately have is that they assume certain resources return certain types. Such assumptions may lead to failures on the client side if the expected representation format ever changes. In addition, the URI structure itself, including any path, matrix and query parameters, should not be used to deduce any logical structure of the API, its exposed endpoints, or the relationship of a resource to other resources of that API. A URI as a whole is a pointer to a resource. A resource may have dozens of links pointing to it. You might think of a URI as a cache key for returned representations that, on subsequent invocations, are served by the cache instead of the actual server. This is actually one of the constraints REST imposes and is widely used on the Web.
Now, let's assume I have a POST endpoint for inserting users. What if that endpoint checks if the email is formed correctly and I return 400? How should I know if the request was not formed correctly and did not route to any servlets or if it indeed reached the servlet which decided that email is badly formed?
RFC 7231 defines POST as an all-purpose tool that should be used if other methods don't fit the task at hand. It explicitly states that the payload provided with that method will be processed according to the resource's own specific semantics. So, if you need to validate a user's email address before persisting it or before starting a calculation, background process or whatever, fine, do that :) Even PUT, which is often said to only replace the current representation with the one given in the request, is not only allowed but also encouraged to verify any constraints the server has for the target resource, and it should therefore refuse payloads that do not fit its expectations.
The quintessence here is that a server should always provide a client with as much information as possible to let the client determine what to do next. Think of a Web-based application which you access through your browser. If you receive a 400 Bad Request, the browser will usually tell you what the server didn't like about your request, e.g. incomplete syntax or a missing value for a required field. The same holds true for REST APIs, as they are basically just a generalization of the interaction model used on the Web. So the same concepts that apply to the Web also apply to REST :)
Accordingly, each HTTP status code has its own semantics and should help the client determine what to do next. A 400 Bad Request, for example, states that the server either cannot or will not process the request due to something the server considers to be a client error, and it's up to the client to correct that failure and resend the request.
A 405 Method Not Allowed, on the other hand, indicates that the client used an HTTP method not supported by the targeted endpoint. The error response not only indicates that to the client but also lists, in an Allow response header, which methods are allowed on the targeted endpoint.
Each of the HTTP status codes specified in RFC 7231 has its own semantics, and it's probably advisable to at least skim over them. You can also look up all available status codes at IANA, which provides links to the specifications describing those status codes.
Is it a good practice to always return a 200 OK response from all of my servlets indicating that the application has done its job regardless of the outcome and write the status in a json field status or is this an overkill and an anti-pattern?
As with error codes, the success codes (in the 200 range) also have their own semantics. If a new resource is created as the outcome of processing a request (via PUT or POST), a client should be notified with a 201 Created status response that furthermore contains an HTTP Location header with a URI pointing at the newly created resource.
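The status codes discussed so far can be put side by side in a small sketch. The `/users` route table, the in-memory `users` list and the loose email check are assumptions made purely for illustration; routes for individual users are omitted.

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # illustrative check only
ROUTES = {"/users": {"GET", "POST"}}  # hypothetical routing table
users = []  # in-memory stand-in for the database


def handle(method, path, payload=None):
    """Return (status, headers, body) for a request."""
    allowed = ROUTES.get(path)
    if allowed is None:
        return 404, {}, {"error": "no representation for this URI"}
    if method not in allowed:
        # 405 responses list the supported methods in the Allow header.
        return 405, {"Allow": ", ".join(sorted(allowed))}, {"error": "method not allowed"}
    if method == "GET":
        return 200, {}, {"users": users}
    email = str((payload or {}).get("email", ""))
    if not EMAIL_RE.match(email):
        return 400, {}, {"error": "email is malformed", "field": "email"}
    users.append({"email": email})
    # 201 responses point at the new resource through the Location header.
    return 201, {"Location": f"/users/{len(users)}"}, {"email": email}
```

Note that the body of the 400 response is what tells the client which field was rejected; the status code alone only says that the client has to fix something.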
If a server may take some time to calculate a response, it is probably advisable to return a 202 Accepted response in order to inform the client about the pending request. The client can later poll for the result, either after some threshold period or after being notified by the server through a callback mechanism such as an email notification. For example, due to German legal requirements, German companies have to maintain archives of the messages they exchange via EDI. We, as an EDI provider, let our clients create an archive of their exchanged messages by triggering one of our HTTP endpoints. Depending on the number of messages exchanged by that company and the time period the archive should cover, this process may take some time (a couple of hours, to be more concrete), and instead of letting the client wait for that period we simply return 202 Accepted and start the archiving process in the background. Depending on the configuration, they either poll for the finished archive, get a notification about the final result, or get the archive sent directly by email if the file isn't too large.
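A bare-bones sketch of the accept-then-poll flow. The `/archive-jobs/...` status URI, the in-memory `jobs` table and the `finish_archive` worker hook are hypothetical, and pointing at the status resource through a Location header on the 202 is a convention assumed here rather than something the answer prescribes.

```python
import itertools

_job_ids = itertools.count(1)
jobs = {}  # job id -> job record; in-memory for illustration


def start_archive(params):
    """Accept a long-running archive request and return immediately."""
    job_id = next(_job_ids)
    jobs[job_id] = {"state": "pending", "params": params, "result": None}
    # 202 tells the client the work was accepted but is not finished yet.
    return 202, {"Location": f"/archive-jobs/{job_id}"}, {"state": "pending"}


def finish_archive(job_id, result):
    """Called by the (hypothetical) background worker when the archive is ready."""
    jobs[job_id].update(state="done", result=result)


def poll_archive(job_id):
    """Let the client check on a job it started earlier."""
    job = jobs.get(job_id)
    if job is None:
        return 404, {}, {"error": "unknown job"}
    if job["state"] != "done":
        return 200, {}, {"state": job["state"]}
    return 200, {}, {"state": "done", "result": job["result"]}
```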
204 No Content is also quite useful if a client performs an update on a resource. As PUT is generally defined as "replace the current representation with the one provided in the payload", upon receiving a 204 No Content response the client knows that the server applied the update and that the current representation looks like the one the client requested. Thus the server does not need to tell the client what the current representation looks like, as the client already knows. However, in case the server had to convert the payload to a different representation that may have led to a different outcome, it is probably beneficial to inform the client about the new state of the resource within a 200 OK response that includes a representation of the outcome of the update.
Returning 200 OK for a failure, with a JSON payload whose fields describe the error, is for sure a bad way to proceed. Not only does it give clients a wrong hint, but the response might also be cached by intermediaries and returned to other clients requesting the same resource, even when the failure is only temporary (a DB crash or the like). In addition, such a JSON payload probably uses a non-standardized format and thus requires out-of-band knowledge to actually process the message. While we humans are quite capable of figuring out what's going on, computers aren't yet that smart on their own.
I hope you can see that HTTP offers a lot of semantics on when to use what method or response code. They are there for a reason and therefore also should be used if the circumstances are right.
For a GET request, 404 is just a response code. You have to provide an error message in the body of the response when no record is found for the id provided.
For a POST request, you can return a 400 error code and specify in the body which fields are missing or failing validation.
For a URL that is not found, the user will always get the 404 error code.
For a successful GET or POST request, you can return the response with a 200 status.
How am I supposed to differentiate if the endpoint does not exist on the server or the resource does not exist?
The endpoint is the IRI (URI) of the web resource in this case. If the endpoint does not exist, then there is a good chance that the web resource does not exist either. It is an unlikely scenario, since you got your URIs from the server (HATEOAS), but it can happen if something changes between two requests, e.g. the URI template changes or somebody deletes the resource. In all of these cases the 404 is a fine HTTP status code. You can elaborate in the error message or use an additional error code, but for me it does not make sense, because the URI template change is a rare event. It would make the client more flexible though, since it could clear the cache and retry with a new link.

Should I use a POST request to send a retrieval request to my server for a large array of ids?

I read the following posts; however, I still haven't found a conclusive answer to my question.
When do you use POST and when do you use GET?
How should I choose between GET and POST methods in HTML forms?
So why should we use POST instead of GET for posting data? [duplicate]
I want to make an HTTP request to my server to retrieve some data based on an array of ids that I will pass to the server. Since each id will have a length of 23 characters, sending 100 of these ids as query parameters of a GET request will exceed the character length limit of some browsers. Since a standard GET request is not feasible due to URL limits, I have been considering my other options.
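For a sense of scale, the arithmetic can be checked with a few lines. The `ids` parameter name, the comma separator and the example host are assumptions; only the id length and count come from the description above.

```python
ID_LENGTH = 23  # characters per id, from the question
ID_COUNT = 100

ids = ["x" * ID_LENGTH for _ in range(ID_COUNT)]  # placeholder ids
query = "ids=" + ",".join(ids)  # hypothetical parameter name and separator
url = "https://api.example.com/items?" + query  # hypothetical endpoint

# 4 characters for "ids=", 100 * 23 for the ids, 99 comma separators
print(len(query))  # 2403
print(len(url))
```

The query string alone comes to 2,403 characters before any percent-encoding, which is already above the limit of roughly 2,000 characters often quoted for older browsers.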
Option 1: Use request body of HTTP GET request (not advisable according to following SO thread)
HTTP GET with request body
Option 2: Use body of HTTP POST request to send the array of Ids. This is the method that Dropbox appear to have used for their public-facing API.
I know that POST requests should be reserved for requests that are not idempotent and in my case, I should be using a GET request because the query is idempotent. I also know that REST is purely a guideline and since this API will only be consumed by me, I can do whatever I like; however, I thought I'd get a second opinion on the matter before I commit to any decision.
So, what should I do in my situation? Are there better alternatives that I have yet to discover and is there anything I should consider if I do use a POST request?
So, what should I do in my situation?
The first step is to review the HTTP Method Registry, which is defined within RFC 7231:
Additional methods, outside the scope of this specification, have been standardized for use in HTTP. All such methods ought to be registered within the "Hypertext Transfer Protocol (HTTP) Method Registry" maintained by IANA
The registry is currently here: https://www.iana.org/assignments/http-methods/http-methods.xhtml
So you can review methods that have already been standardized, to see if any of them have matching semantics.
In your case, you are trying to communicate a query with a message-body. As a rule, queries are not merely idempotent but also safe.
A quick skim of the registry might lead you to consider SEARCH
SEARCH is a safe method; it does not have any significance other than executing a query and returning a query result
That looks like a good option, until you read through the specification carefully, and notice the constraints relating to the message body. In short, WebDAV probably isn't what you want.
But maybe something else is a fit.
A second option is to consider your search idiom to be a protocol. You POST (or PUT, or PATCH) the ids to the server to create a resource, and then GET a representation of that resource when you want the results.
By itself, that's not quite the single call and response that you want. What it does do is set you up to think about how to return a representation of a query result resource. In particular, you can use Content-Location to communicate to intermediaries that the response body is in fact the representation of a resource.
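One possible shape for that protocol, sketched with in-memory dictionaries. The `/queries/{key}` URI layout, the hash-derived key and the `DATA` store are assumptions for illustration only.

```python
import hashlib
import json

DATA = {"id-a": {"name": "Alpha"}, "id-b": {"name": "Beta"}}  # stand-in data store
queries = {}  # query key -> sorted list of ids


def get_query(key):
    """Handle GET /queries/{key}: safe retrieval of a stored query's result."""
    ids = queries.get(key)
    if ids is None:
        return 404, {}, {"error": "unknown query"}
    return 200, {}, {"items": {i: DATA[i] for i in ids if i in DATA}}


def post_query(ids):
    """Handle POST /queries: store the id list and answer with the result."""
    normalized = sorted(set(ids))
    key = hashlib.sha256(json.dumps(normalized).encode()).hexdigest()[:16]
    queries[key] = normalized
    # Content-Location says the body is a representation of that resource.
    return 200, {"Content-Location": f"/queries/{key}"}, get_query(key)[2]
```

The POST still answers in a single round trip, while the Content-Location header gives the client a URI it can later GET to fetch the same result again.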
I know that POST requests should be reserved for requests that are not idempotent
That's not quite right. When making requests that align with the semantics of another method, we prefer using that other method so that intermediate components can take advantage of the semantics: an idempotent request can be retried, a safe request can be pre-fetched, and so on. Because POST doesn't offer those guarantees, clients cannot take advantage of them even if they happen to apply.
Depending on how you need to manage the origin server's URI namespace, you could use PUT -- conceptually, the query and the results are dual to one another, so they can be thought of as two different representations of the same thing. You might manage this with media types - one for the request, a different one for the response.
That gets you idempotency back, but it doesn't get you safety.
I suspect safe requests with payloads are always going to be a problem; the Vary header in HTTP doesn't have an affordance to allow the server to announce that the returned representation depends on the request body (in part because GET isn't supposed to have a request body), so it's going to be difficult for an intermediate component to understand the caching implications of the request body.
I did come across another alternative method from another SO thread, which was to tunnel a GET request using the POST/PUT method by adding the X-HTTP-Method-Override request header. Do you think it's a legitimate solution to my question?
No, I don't think it solves your problem at all. X-HTTP-Method-Override (and its variant spellings) are for method tunneling, not method-override-the-specification-ing. X-HTTP-Method-Override: GET tells the server that the payload has no defined semantics, which puts you back into the same boat as just using a GET request.