Are PUT and DELETE HTTP methods indispensable just because of their idempotency property? - rest

I have a REST API and I want to handle all HTTP requests via POST request.
Is there any performance or other kind of issue in using just POST to perform all CRUD operations requested by a user, which sends a JSON containing some data and the operation to be performed?

Technically, the HTML used in the Web only supports GET and POST and this is more or less the reference implementation of a REST architecture.
So, while this is possible, I wouldn't advocate for it. The idempotency of PUT and DELETE provides additional benefits when the network misbehaves: a client can automatically resend the request regardless of whether the initial request, whose response might simply have been lost mid-way, actually performed its task or not. The result should always be an updated/created resource or a removed URI mapping to the actual resource (or even a removal of the actual resource), as DELETE technically just removes the URI mapping.
Regarding putting the operation in the payload: it depends. This sounds very RPC-like to me, similar to SOAP, for example. If the operation is defined by a well-defined media type, however, as in the JSON Patch case, I guess this is not wrong. As on the Web, though, a server should offer some resource that teaches a client how to build a request, like HTML does with forms. This tells the client not only which fields the server supports for the target resource but also where to send the request, as well as the media type and HTTP method to use, which might be fixed to POST as in the HTML case.
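To make the retry argument concrete, here is a minimal in-memory sketch. It is not a real HTTP client or server: `LossyServer`, `send_with_retry` and the URIs are hypothetical names invented for illustration. The server does its work and then "loses" the first response, so the client retries blindly.

```python
class LossyServer:
    """In-memory stand-in for a server whose responses can get lost."""

    def __init__(self):
        self.resources = {}              # URI -> representation
        self.next_id = 1
        self.drop_next_response = False

    def _respond(self, status):
        # The work is already done at this point; only the response is lost.
        if self.drop_next_response:
            self.drop_next_response = False
            raise TimeoutError("response lost")
        return status

    def put(self, uri, body):
        created = uri not in self.resources
        self.resources[uri] = body       # same URI, same end state on every repeat
        return self._respond(201 if created else 200)

    def post(self, collection_uri, body):
        uri = f"{collection_uri}/{self.next_id}"
        self.next_id += 1
        self.resources[uri] = body       # every repeat creates another resource
        return self._respond(201)


def send_with_retry(send, attempts=3):
    """Blindly resend when no response arrives, as a generic client might."""
    for _ in range(attempts):
        try:
            return send()
        except TimeoutError:
            continue
    raise TimeoutError("gave up")


put_server = LossyServer()
put_server.drop_next_response = True
put_status = send_with_retry(lambda: put_server.put("/users/42", {"name": "Ann"}))

post_server = LossyServer()
post_server.drop_next_response = True
post_status = send_with_retry(lambda: post_server.post("/users", {"name": "Ann"}))
```

After the retry, `put_server` holds exactly one resource while `post_server` holds two copies of the same user, which is the practical difference idempotency makes.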

Related

HTTP method for both sending and returning information

I'm building a web application that needs to process some information on a server. There is no database involved, the server (using Flask) just needs to receive some (complex) information, process it, and send back the result.
My question is which HTTP method is most suitable here (if any). When I read about HTTP methods, they are usually explained in terms of a REST api, where a GET request is used to retrieve data from the server and a POST request is used to create new data on the server. In my case however, I don't need to store any information on the server. A GET request doesn't seem suitable here, as the information sent to the server is rather complex, and can't be easily encoded in the URL. I think a POST request should work here, as I can send the data in JSON format, but the specifications say POST should be used when you want to create something on the server, and a response should only contain a success message and/or location.
Am I missing something here? Should I use something different like WebSocket, or is a POST request fine here, although it doesn't abide by the REST principles?
Thanks in advance.
the specifications say POST should be used when you want to create something on the server
No, they don't. A lot of people say that, but the specification is not so restrictive.
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics
Here's how Roy Fielding explained it in 2009:
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”
Yes, POST isn't ideal - the semantics of POST are neither safe nor idempotent, and your particular case would benefit from communicating those properties to general purpose components.
But it is good enough, until the work is done to standardize the semantics of a new method token that better handles this case.
We use POST method to send data to the server. What the server does with the data is encoded in the server logic.
As a client if you want to just send data to server use POST.
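As a rough illustration of "POST the data, get the processed result back", here is a sketch using only the standard library's WSGI conventions rather than Flask, so it runs without any installation. The processing step (counting and summing numbers) and the helper `call` are assumptions made up for the example.

```python
import io
import json


def app(environ, start_response):
    """WSGI app: POST a JSON list of numbers, get summary statistics back."""
    if environ["REQUEST_METHOD"] != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    numbers = json.loads(environ["wsgi.input"].read(length))
    # Nothing is stored; the resource simply processes the enclosed representation.
    result = {"count": len(numbers), "total": sum(numbers)}
    body = json.dumps(result).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(method, payload=b""):
    """Invoke the app in-process, without opening a socket."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"REQUEST_METHOD": method,
               "CONTENT_LENGTH": str(len(payload)),
               "wsgi.input": io.BytesIO(payload)}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call("POST", json.dumps([1, 2, 3]).encode("utf-8"))
```

The same `app` callable could be served with `wsgiref.simple_server.make_server` for a quick local trial.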

Which REST HTTP verb to use for "Q&A" scenario?

An auth system I work on has this new function:
1. Auth system allows users to specify Relying Parties they transact with,
2. The Relying Party can approve/deny/maybe the request (authorisation) - maybe causes a redirect to the RP website for further authorisation questions by the RP.
The RP has to implement a web service specified by the Auth System to perform the approve/deny/maybe request that the auth system generates.
My problem is what this looks like as a REST service. As the auth system can't really dictate the URI style for the RP system, I would like to specify that the path does not have any parameters in it; the auth system just needs to know the URI of the service. The data of the request (user name/id) might be in a bit of JSON in the request body (suggesting the POST HTTP verb; GET might be OK, but I'm loath to expose user ids in the URI). The auth system does not care what the RP does with the request data; it just wants a "yes/no/maybe" reply (so this may not really fit a GET/POST/PATCH/DELETE/etc. paradigm).
What would be the best verb to use, and how should the reply be conveyed? It's not really a success/failure response, as there are 3 possible results to the query. Is it acceptable to have some JSON returned with the response (and then which HTTP verb should be used)?
I'm a bit baffled by this. GET seems the most obvious
GET /api/user_link_authorize/{userid}
except then I'm forced to put user ids in the URI (which I don't want to do)...
Any suggestions?
My problem is what this looks like as a REST service.
Think about how it would look as a web site.
You would start with some known URI in your list of bookmarks. Fetching that page would give you a representation of a form, which would have input controls that describe what data needs to be provided (and possibly includes default values). The client provides the data it knows about, and submits the form. The data in the form is used to create a HTTP request as described by HTML's form processing rules. The response to that request includes a representation of the answer, or possibly the next bit of work to be done.
That's REST.
Retrieving the form (via the bookmarked URI) would be a GET of course; we're just updating our locally cached copy of the forms "current" representation. Submitting the form could be a GET or a POST; we don't necessarily need to know that in advance, because that information is carried in the representation of the form itself.
GET vs POST involves a number of trade offs. Semantically, GET is safe, it implies that the resource can be fetched at any time, that spiders can crawl it, that accessing the resource in that way is "free". Which is great when the resource is free, because clients on an unreliable network can automatically retry the request if the response is lost. On the other hand, announcing to the world that the request is safe when it is actually expensive to produce responses is not a winning play.
Furthermore, GET doesn't support a message body (more precisely, the payload has no defined semantics). That means that information provided by the client needs to be part of the target resource identifier itself. If you are dealing with sensitive information, that can be problematic -- not necessarily in transit (you can use a secured socket), but certainly in making sure that the URI with sensitive information is not logged where the sensitive data can leak.
POST supports including a payload with the request, but it doesn't promise that the query is safe, which means that generic components won't know if they can automatically retry the request when a response is lost.
Given that you don't want the user id in the URI, that's a point against GET, and therefore in favor of POST.
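A sketch of what the Relying Party side of that POST might look like, with the three-way answer carried as a representation in the response body. The field names (`user_id`, `decision`, `continue_at`), the example users and the `rp.example` URL are all assumptions for illustration; nothing in the question fixes them.

```python
import json

# Hypothetical Relying Party data, for illustration only.
APPROVED = {"alice"}
NEEDS_REVIEW = {"bob"}


def handle_authorize(request_body: bytes):
    """Handle a POST to the RP's authorisation URI; returns (status, headers, body)."""
    try:
        user_id = json.loads(request_body)["user_id"]
        if not isinstance(user_id, str):
            raise TypeError("user_id must be a string")
    except (ValueError, KeyError, TypeError):
        problem = json.dumps({"error": "body must be JSON with a string user_id"})
        return 400, {"Content-Type": "application/json"}, problem.encode()

    # All three answers are successful outcomes of the query, so all are 200.
    if user_id in APPROVED:
        answer = {"decision": "approve"}
    elif user_id in NEEDS_REVIEW:
        answer = {"decision": "maybe", "continue_at": "https://rp.example/questions"}
    else:
        answer = {"decision": "deny"}
    return 200, {"Content-Type": "application/json"}, json.dumps(answer).encode()
```

The user id travels in the request body, so it never appears in the URI or in access logs that record URIs.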

Differentiating REST status codes

Lately, I have started adding status codes to my responses instead of returning them directly.
Let's assume /person/1 returns a person with id 1 from the DB. If the person does not exist, should I return 404 status? How am I supposed to differentiate if the endpoint does not exist on the server or the resource does not exist?
Now, let's assume I have a POST endpoint for inserting users. What if that endpoint checks if the email is formed correctly and I return 400? How should I know if the request was not formed correctly and did not route to any servlets or if it indeed reached the servlet which decided that email is badly formed?
Is it a good practice to always return a 200 OK response from all of my servlets indicating that the application has done its job regardless of the outcome and write the status in a json field status or is this an overkill and an anti-pattern?
I do not have a lot of experience nor knowledge of HTTP servers so I am not sure I am explaining this (nor using it) right, so I apologize for the broad descriptions.
Let's assume /person/1 returns a person with id 1 from the DB. If the person does not exist, should I return 404 status? How am I supposed to differentiate if the endpoint does not exist on the server or the resource does not exist?
To a client it doesn't matter whether the resource or the endpoint did not exist. All it is told by the server is that for the given URI there is no representation available.
As inf3rno already mentioned, a client is usually served all of the URIs it will need by the server directly in a response. Through bookmarking or through links included in some external resource, certain links might become invalid over time, and in that case a 404 Not Found response just informs the client that no representation is available for the given URI.
A client typically is also not interested in the internals of an API but just to send or receive data it can work upon.
A further misconception many users unfortunately have is that they assume certain resources return certain types. Such assumptions may lead to failures on the client side if the expected representation format ever changes. In addition, the URI structure itself, including any path, matrix and query parameters, should not be used to deduce any logical structure of the API, its exposed endpoints or the relationship of a resource to other resources of that API. A URI as a whole is a pointer to a resource. A resource may have dozens of links pointing to it. You might think of a URI as a cache key for the representations returned, which on consecutive invocations are served by the cache instead of the actual server. This is actually one of the constraints REST imposes and is widely used on the Web.
Now, let's assume I have a POST endpoint for inserting users. What if that endpoint checks if the email is formed correctly and I return 400? How should I know if the request was not formed correctly and did not route to any servlets or if it indeed reached the servlet which decided that email is badly formed?
RFC 7231 defines POST as an all-purpose tool that should be used if other methods don't fit the task at hand. It explicitly states that the payload provided with that method will be processed according to the resource's own specific semantics. So, if you need to validate the email address of a user before persisting it or before starting a calculation, background process or whatever, fine, do that :) Even PUT, which is often said to only replace the current representation with the one given in the request, is not only allowed but also encouraged to verify any constraints the server has for the target resource, and it should therefore refuse payloads that do not fit its expectations.
The essence here is that a server should always provide a client with as much information as possible to let the client determine what to do next. Think of a Web-based application which you access through your browser. If you receive a 400 Bad Request, the browser will usually tell you what the server didn't like about your request, e.g. incomplete syntax or a missing value for a required field. The same holds true for REST APIs, as they are basically just a generalization of the interaction model used on the Web. So the same concepts that apply to the Web also apply to REST :)
Accordingly, each HTTP status code has its own semantics and should help the client determine what to do next. A 400 Bad Request, for example, states that the server either cannot or will not process the request due to something that the server considers to be a client error, and it's up to the client to correct that failure and resend the request.
A 405 Method Not Allowed, on the other hand, indicates that the client used an HTTP method not supported by the targeted endpoint. The error response not only indicates that to the client but also lists, in an Allow response header, which methods are allowed on the targeted endpoint.
Each of the HTTP status codes specified in RFC 7231 has its own semantics, and it's probably advisable to at least skim over these. You can also look up all available status codes at IANA, which provides links to the specifications describing those status codes.
Is it a good practice to always return a 200 OK response from all of my servlets indicating that the application has done its job regardless of the outcome and write the status in a json field status or is this an overkill and an anti-pattern?
As with error codes, the success codes (in the 200 range) also have their own semantics. If a new resource is created as the outcome of processing a request (via PUT or POST), the client should be notified with a 201 Created response that furthermore contains an HTTP Location header with a URI pointing at the newly created resource.
If a server may take some time to calculate a response, it is probably advisable to return a 202 Accepted response in order to inform the client that the request is pending. The client can later poll for the result, either after some threshold period or after getting notified by the server through a callback mechanism such as an email notification. For example, due to German law, German companies have to maintain archives of the messages they exchange via EDI. We, as an EDI provider, let our clients archive their exchanged messages by triggering one of our HTTP endpoints. Depending on the number of messages exchanged by that company and the time period the archive should cover, this process may take some time (a couple of hours, to be more concrete), and instead of letting the client wait for that long we simply return 202 Accepted and start the archiving process in the background. Depending on the configuration, clients either poll for the finished archive, get a notification about the final result, or get the archive sent directly by email if the file isn't too large.
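The accept-then-poll flow could look roughly like the following in-memory sketch. `ArchiveService`, the `/archive-jobs/...` URIs and the use of a Location header to point at the status resource are assumptions for illustration (a common convention, not something our actual system is shown to do here); `finish` stands in for the background worker.

```python
import itertools


class ArchiveService:
    """In-memory sketch of a long-running job behind an HTTP-style interface."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}   # job URI -> job state

    def post_archive_request(self, period):
        """Start the job and answer immediately with 202 and a status URI."""
        uri = f"/archive-jobs/{next(self._ids)}"
        self._jobs[uri] = {"period": period, "done": False}
        return 202, {"Location": uri}, None

    def finish(self, uri):
        """Stand-in for the background worker completing the job."""
        self._jobs[uri]["done"] = True

    def get_job(self, uri):
        """Polling endpoint: report progress, or where the result is once it exists."""
        job = self._jobs.get(uri)
        if job is None:
            return 404, {}, None
        if not job["done"]:
            return 200, {}, {"state": "pending"}
        return 200, {}, {"state": "done", "archive": uri + "/result"}
```

A client would call `post_archive_request`, remember the returned URI, and call `get_job` on it until the state changes from `pending` to `done`.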
204 No Content is also quite useful if a client performs an update on a resource. As PUT is generally defined as replacing the current representation with the one provided in the payload, upon receiving a 204 No Content response the client knows that the server applied the update and that the current representation looks like the one the client requested. Thus the server does not need to tell the client what the current representation looks like, as the client already knows. However, in case the server had to convert the payload to a different representation, which may have led to a different outcome, it is probably beneficial to inform the client about the new state of the resource with a 200 OK response that includes a representation of the outcome of the update.
Returning 200 OK for a failure, with a JSON payload whose fields describe the error, is for sure a bad way to proceed. Not only does it give clients a wrong hint, but the response might also be cached by intermediaries and returned to other clients requesting the same resource, even when the failure is only of a temporary nature (a DB crash or the like). In addition, such a JSON payload probably uses a non-standardized format and thus requires out-of-band knowledge to actually process the message. While we humans are quite capable of figuring out what's going on, computers aren't yet that smart on their own.
I hope you can see that HTTP offers a lot of semantics on when to use what method or response code. They are there for a reason and therefore also should be used if the circumstances are right.
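To tie the status codes discussed above together, here is a small sketch of a handler for a toy person API. It is a plain function over in-memory dictionaries, not a servlet; `handle`, the URIs, the naive email pattern and the `detail` field in error bodies are all assumptions for illustration.

```python
import re

PEOPLE = {"/person/1": {"email": "ann@example.org"}}
ALLOWED = {"/person": {"POST"}, "/person/1": {"GET", "PUT"}}
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately naive check


def handle(method, uri, body=None):
    """Return (status, headers, body) for a tiny in-memory person API."""
    if uri not in ALLOWED:
        # The client cannot tell a missing endpoint from a missing record, and need not.
        return 404, {}, {"detail": f"no representation available for {uri}"}
    if method not in ALLOWED[uri]:
        return 405, {"Allow": ", ".join(sorted(ALLOWED[uri]))}, None
    if method == "GET":
        return 200, {}, PEOPLE[uri]
    if not EMAIL.match((body or {}).get("email", "")):
        # Tell the client what to fix instead of hiding the failure in a 200.
        return 400, {}, {"detail": "field 'email' is not a valid address"}
    if method == "PUT":
        PEOPLE[uri] = body
        return 204, {}, None             # stored as sent, nothing more to say
    new_uri = f"/person/{len(PEOPLE) + 1}"
    PEOPLE[new_uri] = body
    ALLOWED[new_uri] = {"GET", "PUT"}
    return 201, {"Location": new_uri}, None
```

Each branch gives the client a different next step: fix the URI, pick another method, correct the payload, or follow the Location header.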
In a GET request, 404 is just a response code. You have to provide an error message in the body of the response when no record is found for the given id.
For a POST request, you can return a 400 error code and specify in the body which fields are missing or failing validation.
For a URL that is not found, the user will always get the 404 error code.
For a successful GET or POST request, you can return the response with a 200 status.
How am I supposed to differentiate if the endpoint does not exist on the server or the resource does not exist?
The endpoint is the IRI (URI) of the web resource in this case. If the endpoint does not exist, then there is a good chance that the web resource does not exist either. It is an unlikely scenario, since you got your URIs from the server (HATEOAS), but it can happen if something changes between two requests, e.g. the URI template changes or somebody deletes the resource. In all of these cases the 404 is a fine HTTP status code. You can elaborate in the error message or use an additional error code, but for me it does not make sense, because the URI template change is a rare event. It would make the client more flexible though, since it could clear the cache and retry with a new link.

Should I use a POST request to send a retrieval request to my server for a large array of ids?

I read the following posts; however, I still haven't found a conclusive answer to my question.
When do you use POST and when do you use GET?
How should I choose between GET and POST methods in HTML forms?
So why should we use POST instead of GET for posting data? [duplicate]
I want to make a HTTP request to my server to retrieve some data based on an array of ids that I will pass to the server. Since each id will have a length of 23 characters, sending 100 of these ids as query parameters of a GET request will exceed the character length limit of some browsers. Since a standard GET request is not feasible due to URL limits, I have been considering my other options.
Option 1: Use request body of HTTP GET request (not advisable according to following SO thread)
HTTP GET with request body
Option 2: Use body of HTTP POST request to send the array of Ids. This is the method that Dropbox appear to have used for their public-facing API.
I know that POST requests should be reserved for requests that are not idempotent and in my case, I should be using a GET request because the query is idempotent. I also know that REST is purely a guideline and since this API will only be consumed by me, I can do whatever I like; however, I thought I'd get a second opinion on the matter before I commit to any decision.
So, what should I do in my situation? Are there better alternatives that I have yet to discover and is there anything I should consider if I do use a POST request?
So, what should I do in my situation?
First step is to review the HTTP Method Registry, which is defined within RFC 7231
Additional methods, outside the scope of this specification, have been standardized for use in HTTP. All such methods ought to be registered within the "Hypertext Transfer Protocol (HTTP) Method Registry" maintained by IANA
The registry is currently here: https://www.iana.org/assignments/http-methods/http-methods.xhtml
So you can review methods that have already been standardized, to see if any of them have matching semantics.
In your case, you are trying to communicate a query with a message-body. As a rule, queries are not merely idempotent but also safe.
A quick skim of the registry might lead you to consider SEARCH
SEARCH is a safe method; it does not have any significance other than executing a query and returning a query result
That looks like a good option, until you read through the specification carefully and notice the constraints relating to the message body. In short, WebDAV probably isn't what you want.
But maybe something else is a fit.
A second option is to consider your search idiom to be a protocol. You POST (or PUT, or PATCH) the ids to the server to create a resource, and then GET a representation of that resource when you want the results.
By itself, that's not quite the single call and response that you want. What it does do is set you up to think about how to return a representation of a query result resource. In particular, you can use Content-Location to communicate to intermediaries that the response body is in fact the representation of a resource.
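A minimal sketch of that idea, assuming an in-memory store: POSTing the ids creates a query resource whose URI is derived from the ids, and the POST response already carries the result together with a Content-Location header naming the resource it represents. The `/queries/...` URI scheme, `DATA` and both function names are hypothetical.

```python
import hashlib
import json

DATA = {"a1": "apple", "b2": "banana", "c3": "cherry"}   # hypothetical records
QUERIES = {}                                             # query URI -> list of ids


def post_query(ids):
    """Create (or find) the query resource and return its result right away."""
    digest = hashlib.sha256(json.dumps(sorted(ids)).encode()).hexdigest()[:16]
    uri = f"/queries/{digest}"
    QUERIES[uri] = sorted(ids)
    status, _, body = get_query(uri)
    # Content-Location says: this body is the representation of that resource.
    return status, {"Content-Location": uri}, body


def get_query(uri):
    """Safe, URI-addressed retrieval of the same result."""
    if uri not in QUERIES:
        return 404, {}, None
    return 200, {}, {i: DATA.get(i) for i in QUERIES[uri]}
```

Because the URI depends only on the sorted ids, sending the same ids again lands on the same query resource, and later reads can use a plain GET.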
I know that POST requests should be reserved for requests that are not idempotent
That's not quite right. When making requests that align with the semantics of another method, we prefer using that other method so that intermediate components can take advantage of the semantics: an idempotent request can be retried, a safe request can be pre-fetched, and so on. Because POST doesn't offer those guarantees, clients cannot take advantage of them even if they happen to apply.
Depending on how you need to manage the origin server's URI namespace, you could use PUT -- conceptually, the query and the results are dual to one another, so can be thought of as two different representations of the same thing. You might manage this with media types - one for the request, a different one for the response.
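How a generic component might act on those properties can be sketched as a lookup on the method token alone. The two sets follow the safe and idempotent method definitions in RFC 7231; the function names are made up for this example.

```python
# Method properties as defined in RFC 7231, sections 4.2.1 and 4.2.2.
SAFE = {"GET", "HEAD", "OPTIONS", "TRACE"}
IDEMPOTENT = SAFE | {"PUT", "DELETE"}


def may_retry_automatically(method):
    """A generic client only sees the method, not what the server really does."""
    return method.upper() in IDEMPOTENT


def may_prefetch(method):
    """Only requests declared safe can be issued speculatively."""
    return method.upper() in SAFE
```

A POST that happens to be a harmless query still returns False from both functions, because nothing in the request tells the component that it is harmless.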
That gets you back idempotent, but it doesn't get you safe.
I suspect safe requests with payloads are always going to be a problem; the Vary header in HTTP doesn't have an affordance to allow the server to announce that the returned representation depends on the request body (in part because GET isn't supposed to have a request body), so it's going to be difficult for an intermediate component to understand the caching implications of the request body.
I did come across another alternative method from another SO thread, which was to tunnel a GET request using POST/PUT method by adding the X-HTTP-Method-Override request header. Do you think it's a legitimate solution to my question?
No, I don't think it solves your problem at all. X-HTTP-Method-Override (and its variant spellings) are for method tunneling, not method-override-the-specification-ing. X-HTTP-Method-Override: GET tells the server that the payload has no defined semantics, which puts you back into the same boat as just using a GET request.

GET or POST? Action not destructive, a lot of parameters, no need to cache

I am designing an API endpoint that runs a simulation, and returns the result.
The specific simulation that is run depends on many parameters.
There are no side effects. Nothing destructive (creating, updating, deleting) is happening.
I don't desire to cache the parameters in the query string of the URL for the user to save, or click refresh.
Should this endpoint accept GET requests, or POST requests?
The query string isn't big enough to hold all the parameters. And apparently you aren't supposed to send a payload along with GET requests.
There are no destructive side effects (no side effects at all). So POST doesn't seem appropriate either.
What should I do?
Should this endpoint accept GET requests, or POST requests?
I've got some good news for you. REST doesn't care.
Riddle: how would you provide this service on a web site?
You'd probably have some landing page of general interest, and onto that page you would add a link "click here to try the simulator!" When the consumer followed that link, you would provide a representation of a form describing the parameters required for the simulation, with an identifier for an endpoint and an action. The consumer would submit the filled out form, dispatching to your endpoint a representation of the simulator parameters.
A hypermedia API works the same way; the client shouldn't need to know the endpoint, or what method to use. What it needs to know is how to obtain that information from the representation of the form.
If you have a hypermedia API, you can change endpoints, or switch back and forth between http methods, without requiring that the client be updated to match.
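Here is a small sketch of a client that takes the method and endpoint from a form description instead of hard-coding them. The dictionary format of the form, the field names and `build_request` are assumptions for illustration; a real API would define this through a media type.

```python
from urllib.parse import urlencode


def build_request(form, values):
    """Turn a form description plus client-supplied values into a request."""
    data = {f["name"]: values.get(f["name"], f.get("default")) for f in form["fields"]}
    missing = [name for name, value in data.items() if value is None]
    if missing:
        raise ValueError(f"missing values for: {missing}")
    encoded = urlencode(data)
    if form["method"] == "GET":
        # GET carries the data in the target URI.
        return "GET", f'{form["action"]}?{encoded}', None
    # Other methods carry it in the body.
    return form["method"], form["action"], encoded


form_v1 = {"method": "POST", "action": "/simulations",
           "fields": [{"name": "steps", "default": 100}, {"name": "seed"}]}
form_v2 = dict(form_v1, method="GET", action="/sim")   # server changed its mind

request_v1 = build_request(form_v1, {"seed": 7})
request_v2 = build_request(form_v2, {"seed": 7})
```

The client code is identical for both form versions; only the representation it was handed differs.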
There are no destructive side effects (no side effects at all). So POST doesn't seem appropriate either.
I've got more good news for you. Using POST is fine. The current authority for using POST isn't stack overflow, but RFC-7231
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others):
Providing a block of data, such as the fields entered into an HTML form, to a data-handling process
Perfect. It's not cacheable, unless you explicitly make it so. It contains facilities for redirecting the user to a cacheable representation of the data, for cases when that makes sense.
What POST doesn't do is communicate to the browser, or the intermediary components, that it is safe to retry a lost message.
Here's what Fielding had to say about this in 2009
It isn’t RESTful to use POST for information retrieval when that information corresponds to a potential resource, because that usage prevents safe reusability and the network-effect of having a URI.
POST only becomes an issue when it is used in a situation for which some other method is ideally suited: e.g., retrieval of information that should be a representation of some resource (GET), complete replacement of a representation (PUT), or any of the other standardized methods that tell intermediaries something more valuable than “this may change something.” The other methods are more valuable to intermediaries because they say something about how failures can be automatically handled and how intermediate caches can optimize their behavior. POST does not have those characteristics, but that doesn’t mean we can live without it. POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”
HTTP doesn't specify a method for safe operations that include a payload. It does specify an idempotent method that includes a payload: PUT. Using PUT is unusual insofar as it doesn't really align with the usual understanding of "Create" or "Update", but so long as you are careful about identifiers, I think that it is valid.
Fielding, writing in 2006:
PUT does not mean store. I must have repeated that a million times in webdav and related lists. HTTP defines the intended semantics of the communication -- the expectations of each party. The protocol does not define how either side fulfills those expectations, and it makes damn sure it doesn't prevent a server from having absolute authority over its own resources.
I understand this to mean:
The server is not constrained to track the state of the resource as is; it can use an output representation, rather than an input representation
The server is not constrained on what access to allow; the resource can be write only. Or getting the resource can provide its input representation again.
The server is not constrained on the permanence of the change; "we successfully processed your request (but then immediately reverted the outcome)" is perfectly valid.
From RFC 7231:
A successful response only implies that the user agent's intent was achieved at the time of its processing by the origin server.
In addition, the definition of the 200 status code gives you some room
For the methods defined by this specification, the intended meaning of the payload can be summarized as:
GET a representation of the target resource
PUT, DELETE a representation of the status of the action;
So I think it's an option that may, upon detailed review, be more suitable to your particular circumstances than POST or GET.
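One way to be "careful about identifiers" is to derive the target URI from the parameters themselves, so that equal requests always address the same resource. The sketch below assumes that approach; `simulate` is a made-up stand-in for the real simulation, and the `/simulations/...` URI scheme and the 409 answer for a mismatched URI are illustrative choices.

```python
import hashlib
import json

RUNS = {}   # URI -> outcome of the latest PUT to that URI


def simulate(params):
    """Hypothetical stand-in for the real simulation: a pure function of params."""
    return {"final_value": params["start"] * (1 + params["rate"]) ** params["years"]}


def uri_for(params):
    """Derive the target URI from the parameters so equal requests share a URI."""
    canonical = json.dumps(params, sort_keys=True).encode()
    return "/simulations/" + hashlib.sha256(canonical).hexdigest()[:16]


def put(uri, params):
    """Idempotent: repeating the request leaves the same state and answer."""
    if uri != uri_for(params):
        return 409, {"detail": "URI does not match the parameters"}
    RUNS[uri] = simulate(params)
    # 200 with a representation of the outcome, rather than an echo of the input.
    return 200, RUNS[uri]


params = {"start": 100, "rate": 0.5, "years": 2}
first = put(uri_for(params), params)
second = put(uri_for(params), params)
```

Sending the request twice yields the same answer and leaves a single entry in `RUNS`, so a client that lost the first response can simply resend it.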
I believe with something like this you would want to have a POST request and return something to the user in the response as JSON or XML. I'm using an API for a site now that sends a POST, and I get the data out of the response; I too have a lot of parameters I'm passing.
First, this API endpoint should be refactored and simplified. HTTP requests with a lot of parameters should be avoided, no matter whether it is a GET, a POST or something else. A request with a lot of parameters means very tight coupling between two modules: the client side has to assemble the request very carefully to meet the requirements of the server. Such requests also add cost in documentation and training.
That said, it has to be a POST if a lot of parameters cannot be avoided. The reason is that although it should semantically be a GET, GET is not feasible in this situation: the request with all its parameters exceeds the maximum length of the query string, and a browser may truncate the query string and break the request.
In summary, it is not a question about what I should do, it is a question about what I have to do, unless the API endpoint is optimized.