What is the correct REST method for performing server side validation?

What is the correct REST method for performing server side validation? - rest

If I don't want to update a resource, but I just want to check if something is valid (in my case, a SQL query), what's the correct REST method?
I'm not GETting a resource (yet). I'm not PUTting, POSTing, or PATCHing anything (yet). I'm simply sending part of a form back for validation that only the server can do. Another equivalent would be checking that a password conforms to complexity requirements that can only be known by the domain, or perhaps there are other use cases.
Send object, validate, return response, continue with form. Using REST. Any ideas? Am I missing something?

What is the correct REST method for performing server side validation?
Asking whether a representation is valid should have no side effects on the server; therefore it should be safe.
If the representation that you want to validate can be expressed within the URI, then, you should prefer to use GET, as it is the simplest choice, and gives you the best semantics for caching the answer. For example, if we were trying to use a web site to create a validation api for a text (and XML or JSON validator, for instance), then we would probably have a form with a text area control, and construct the identifier that we need by processing the form input.
If the representation that you want to validate cannot be expressed within the URI, then you are going to need to put it into the message body.
Of the methods defined by RFC 7231, only POST is suitable.
Additional methods, outside the scope of this specification, have been standardized for use in HTTP. All such methods ought to be registered within the "Hypertext Transfer Protocol (HTTP) Method Registry" maintained by IANA, as defined in Section 8.1.
The HTTP method registry gives you a lot of options. For this case, I wouldn't bother with them unless you find either a perfect match, or a safe method that accepts a body and is close enough.
So maybe REPORT, which is defined in RFC 3253; I tend to steer clear of WebDAV methods, as I'm not comfortable stretching specifications for "remote Web content authoring operations" outside of their remit.

TLDR; There's a duplicate question around the topic of creating validation endpoints via REST:
In your case a GET request would seem sufficient.
The HTTP GET method is used to read (or retrieve) a representation of a resource. In the “happy” (or non-error) path, GET returns a representation in XML or JSON and an HTTP response code of 200 (OK). In an error case, it most often returns a 404 (NOT FOUND) or 400 (BAD REQUEST).
restapitutorial.com
For validating your SQL query you could use a GET request to get the valid state of your query potentially using a query parameter to achieve this.
GET: api/validateQuery?query="SELECT * FROM TABLE"
Returning:
200 (OK): Valid Query
400 (MALFORMED): Invalid Query
404 (NOT FOUND): Query valid but returns no results (if you plan on executing the query)

I think this type of endpoint is best served as a POST request. As defined in the spec, POST requests can be used for
Providing a block of data, such as the fields entered into an HTML form, to a data-handling process
The use of GET as suggested in another post, for me, is misleading and impractical based on the complexity & arbitrarity of SQL queries.

Related

HTTP GET Request Params - which error code? 400 vs 404 vs 422

I've read a lot of questions about 400 vs 422, but they are almost for HTTP POST requests for example this one: 400 vs 422 response to POST of data.
I am still not sure what should I use for GET when the required parameters are sent with wrong values.
Imagine this scenario:
I have the endpoint /searchDeclaration, with the parameter type.
My declarations have 2 types: TypeA and TypeB.
So, I can call this endpoint like this: /searchDeclaration?type=TypeA to get all TypeA declarations.
What error should I send when someone calls the endpoint with an invalid type? For example: /searchDeclaration?type=Type123
Should I send 400? I am not sure it is the best code, because the parameter is right, only the value is not valid.
The 422 code looks like it is more suitable for POST request.
EDIT:
After some responses I have another doubt.
The /searchDeclaration endpoints will return the declaration for the authenticated user. TypeA and TypeB are valid values, but some users don't have submitted a TypeB declaration, so when they call /searchDeclaration?type=TypeB which error should I send? Returning 404 does not seem right because the URI is correct, but that user does not have a declaration for that value yet. Am I overthinking this?

If the URI is wrong, use 404 Not Found.
The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource
The target resource is, as one might expect, identified by the target URI.
Aside from the semantics of the response, 404 indicates that the response is cacheable - meaning that general purpose caches will know that they can re-use this response to future requests
I am not sure it is the best code, because the parameter is right, only the value is not valid.
The fact that this URI includes parameters and values is an implementation detail; something that we don't care about at the HTTP level.
Casually: 404 means "the URI is spelled wrong"; but doesn't try to discriminate which part(s) of the URI have errors. That information is something that you can include in the body of the response, as part of the explanation of the error situation.
Am I overthinking this?
No, but I don't think you are thinking about the right things, yet.
One of the reasons that you are finding this challenging is that you have multiple logical resources sharing the same target URI. If each user declaration document had its own unique identifier, then the exercise of choosing the right response semantics would be a lot more straight forward.
Another way of handing it would be to redirect the client to the more specific URI, and then handle the response semantics there in the straight forward way.
It's trying to use a common URI for different logical resources AND respond without requiring an extra round trip that is making it hard. Bad news: this is one of the trade offs that ought to have been considered when designing your resource identifiers; if you don't want this to require harder thinking, don't use this kind of design.
The good news: 404 is still going to be fine - you are dealing with authorized requests, and the rules about sharing responses to authorized requests mean that the only possible confusion will be if different users are sharing the same private cache.
Remember: part of the point is that all resources share a common, general purpose message vocabulary. Everything is supposed to look like a document being served by a boring web server.
The fact that there's a bunch of complexity of meaning behind the resource is an implementation detail that is correctly hidden behind the uniform interface.

There are two options, it all depends on your 'type' variable.
If 'type' is an ENUM, which only allows 'typeA' and 'typeB', and your client sends 'type123', the service will respond with a '400 Bad Request' error, you don't need to check. In my opinion, this should be ideal, since if you need to add new 'type's in the future, you will only have to add them in the ENUM, instead of doing 'if-else' inside your code to check them all.
In case the 'type' variable is a String, the controller will admit a 'type123' and you should be the one to return an error, since the client request is not malformed, but rather it is trying to access a resource that does not exist.
In this case, you can return a 404 Not Found error, (that resource the client is filtering by cannot be found), or a 422 error as you say, since the server understands the request, but is not able to process it.

Let's assume for a moment that the resource you are querying is returning a set of entries that do contain certain properties. If you don't specify a filter you will basically get a (pageable) representation of those entries either as embedded objects or as links to those resources.
Now you want to filter the results based on some properties these entries have. Most programming languages nowadays provide some lambda functionality in the form of
List filteredList = list.filter(item => item.propertyX == ...)...;
The result of such a filter function is usually a list of items that fulfilled the specified conditions. If no items met the given condition then the result will be an empty list.
Applying certain filter conditions on the Web can be designed similarly. Is it really an error when a provided filter expression doesn't yield any entries? IMO it is not an error in terms of the actual message transport itself as the server was able to receive and parse the request without any issues. As such it has to be some kind of business rule that states that only admissible values are allowed as input.
If you or your company consider the case of a provided filter value for a property returning no results as an error or you perform some i.e. XML or JSON schemata validation on the received payload (for complex requests) then we should look at how those mentioned HTTP errors are defined:
The 400 (Bad Request) status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing). (Source: RFC 7230)
Here, it is clearly the case that you don't want to process requests that come with an invalid property value and as such decline the request as such.
The 422 (Unprocessable Entity) status code means the server understands the content type of the request entity (hence a 415(Unsupported Media Type) status code is inappropriate), and the syntax of the request entity is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process the contained instructions. For example, this error condition may occur if an XML request body contains well-formed (i.e., syntactically correct), but semantically erroneous, XML instructions. (Source: RFC 4918 (WebDAV))
In this case you basically say that the payload was actually syntactically correct but failed on a semmantical level.
Note that 422 Unprocessable Entity stems from WebDAV while 400 Bad Request is defined in the HTTP specification. This can have some impact if your API serves arbitrary HTTP clients. Ones that only know and support the HTTP error codes defined in the HTTP sepcification won't be able to really determine the semantics of the 422 response. They will still consider it as a user error, but won't be able to provide the client with any more help on that issue. As such, if your API needs to be as generic as possible, stick to 400 Bad Request. If you are sure all clients support 422 Unprocessable Entity go for that one.
General improvement hints
As you tagged your question with rest, let's see how we can improve this case.
REST is an architectural style with an intention of decoupling clients from servers to make the former one more failure tollerant while allowing the latter one to evolve freely over time. Servers in such architectures should therefore provide clients with all the things clients need to make valid requests. To avoid having clients to know upfront what a server expects as input, servers usually provide some kind of input mask clients can use to fill in stuff the server needs.
On the browsable Web this is usually accomplished by HTML Forms. The form not only teaches your client where to send the request to, which HTTP operation to use and which representation format the request should actually use (usually given implicitly as application/x-www-form-urlencoded) but also the sturcture and properties the server supports.
In HTML forms it is rather easy for the server to restrict the input choices of a client by using something along the lines of
<form action="/target">
<label for="cars">Choose a car:</label>
<select name="cars" id="cars">
<option value="volvo">Volvo</option>
<option value="saab">Saab</option>
<option value="opel">Opel</option>
<option value="audi">Audi</option>
</select>
<br/>
<input type="submit" value="Submit">
</form>
This doesn't really remove the needs to check and verify the correctness of the request on the server side, tough you make it much easier for the client to actually perform a valid request.
Unfortunately, HTML forms itself have their limits. I.e. they only allow POST and GET requests to be issues. While encType defaults to application/x-www-form-urlencoded, if you want to transfer files you should use multipart/form-data. Other than that, any valid content-type should be admissible.
If you prefer JSON-based payloads over HTML you might want to look into JSON Forms, HAL forms, Ion Forms among others.
Note though that you should adhere to the content type negotiation principles. Most often proactive content type negotiation is performed where a client sends its preferences within the Accept header and the server will select the best match somehow and return either the resource mapped to that representation format or respond with a 406 Not Acceptable response. While the standard doesn't prevent returning a default representation in such caes, it bears the danger that clients won't be able to process such responses then. A better alternative here would be to fall back to reactive negotiation where the server responds with a 300 Muliple Choice response where a client has to select one of the provided alternatives and then send a GET request to the selected alternative URIs to retrieve the content in the payload may be able to process.
If you want to provide a simple link a client can use to retrieve filtered results, the server should provide the client already with the full URI as well as a link relation name and/or extension relation type that the client can use to lookup the URI to retrieve the content for if interested in.
Both, forms and link-relation support, fall under the HATEOAS umbrella as they help to remove the need for any external documentation such as OpenAPI or Swagger documentation.
To sum things up, I would rethink whether a provided property value that does not exist should really end up as a business failure. I think returning an empty list is just fine here as you clearly state that way that for the given criterias no result was obtainable. If you though still want to stick to a business error check what clients actually make use of your API. If they support 422 go for that one. If you don't know, better stick to 400 as it should be understood by all HTTP clients equally.
In oder to remove the likelihood of ending up with requests that issue invalid property values, use forms to teach clients how requests should look like. Through certain elements or properties you can already teach a client that only a limited set of choices is valid for a certain property. Instead of a form you could also provide dedicated links a client can just use to obtain the filtered result. Just make sure to issue those links with meaningful link relatin names then.

RestAPI. Is it ok to demand optional parameters depending on client configuration?

We are designing a RESTful API.
Sample request:
{
"transaction_id" : abc,
"sale_value" : 100,
"profit_value" : 2
}
profit_value is an optional field. However, based on client configuration, the field is either mandatory or it is ignored even if supplied by the client (in this case we calculate it ourselves and return it in the response).
Is that good practice?
i.e. Is it ok to demand a field even when the API specification defines it as optional?
And is it ok to ignore a field, that even though it is optional, was supplied?

In a REST architecture the server should teach a client on how certain things can be achieved. This is very similar to the browseable Web where a HTML form page teaches a user not only on the available data elements that can be filled by the client, but also teaches a client on the HTTP operation to use, the HTTP endpoint to send the request to as well as the media type format the response has to be sent in.
profit_value is an optional field. However, based on client configuration, the field is either mandatory or it is ignored even if supplied by the client (in this case we calculate it ourselves and return it in the response).
Is that good practice?
If you are teaching a client to send the request via POST the request is basically processed according the resource's own specific semantics. This means that you are of course allowed to ignore or change certain user input to your liking or even decline a request that contains or misses certain input data.
From the given problem statement it is unclear whether the client itself is configuring the mandatory or optional fields or whether it is given by the service provider themselves. Therefore it is unclear how a client knows whether a certain input is mandatory or optional (and thus may be ignored as a consequence). In case a certain field will be ignored by the server, the best option here would be to not even teach that client about that field at all and as such don't include control elements within the form'esque representation format sent as a response from the server to the client. A mandatory element will for sure be evaluated on the server side and in case of constraint validations (missing or not in the correct range or the like) the server will reject the request with a 400 Bad Request response that indicates that a certain expected input parameter is missing. This is how HTTP and HTML operate for decades and this is basically how REST should behave as well.
By API specification I guess you mean something like Swagger or the like?! Unfortunately such API documentation has hardly anything to do with REST as it is pretty similar to traditional interface definition languages (IDL) of common RPC solutions (CORBA, WSDL for SOAP, stub-classes for RMI, ...). In such a case, however, the server can't really support a client on whether a field is mandatory, optional or may even be ignored at all. You might introduce a further resource a client can query to learn whether it supports a certain field or parameter and in such a case add it to the request. This, however, needs to be documented very clearly to avoid confusion on the client side.

REST Check if resource exists, how to handle on server side?

how to handle resource checking on server side?
For example, my api looks like:
/books/{id}
After googling i found, that i should use HEAD method to check, if resource exists.
https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
I know, that i can use GET endpoint and use HEAD method to fetch information about resource and server does not return body in this case.
But what should i do on server side?
I have two options.
One endpoint marked as GET. I this endpoint i can use GET method to fetch data and HEAD to check if resource is available.
Two endpoints. One marked as GET, second as HEAD.
Why i'm considering second solution?
Let's assume, that GET request fetch some data from database and process them in some way which takes some time, eg. 10 ms
But what i actually need is only to check if data exists in database. So i can run query like
select count(*) from BOOK where id = :id
and immediately return status 200 if result of query is equal to 1. In this case i don't need to process data so i get a faster response time.
But... resource in REST is a object which is transmitted via HTTP, so maybe i should do processing data but not return them when i use HEAD method?
Thanks in advance for your answer!

You could simply delegate the HEAD handler to the existing GET handler and return the status code and headers only (ignoring the response payload).
That's what some frameworks such as Spring MVC and JAX-RS do.
See the following quote from the Spring MVC documentation:
#GetMapping — and also #RequestMapping(method=HttpMethod.GET), are implicitly mapped to and also support HTTP HEAD. An HTTP HEAD request is processed as if it were HTTP GET except but instead of writing the body, the number of bytes are counted and the "
Content-Length header set.
[...]
#RequestMapping method can be explicitly mapped to HTTP HEAD and HTTP OPTIONS, but that is not necessary in the common case.
And see the following quote from the JAX-RS documentation:
HEAD and OPTIONS requests receive additional automated support. On receipt of a HEAD request an implementation MUST either:
Call a method annotated with a request method designator for HEAD or, if none present,
Call a method annotated with a request method designator for GET and discard any returned entity.
Note that option 2 may result in reduced performance where entity creation is significant.
Note: Don't use the old RFC 2616 as reference anymore. It was obsoleted by a new set of RFCs: 7230-7235. For the semantics of the HTTP protocol, refer to the RFC 7231.

Endpoint should be the same and server side script should make decision what to do based on method. If method is HEAD, then just return suitable HTTP code:
204 if content exists but server don't return it
404 if not exists
4xx or 5xx on other error
If method is GET, then process request and return content with HTTP code:
200 if content exists and server return it
404 if not exists
4xx or 5xx on other error
The important thing is that URL should be the same, just method should be different. If URL will be different then we talking about different resources in REST context.

Your reference for HTTP methods is out of date; you should be referencing RFC 7231, section 4.3.2
The HEAD method is identical to GET except that the server MUST NOT send a message body in the response (i.e., the response terminates at the end of the header section).
This method can be used for obtaining metadata about the selected representation without transferring the representation data and is often used for testing hypertext links for validity, accessibility, and recent modification.
You asked:
resource in REST is a object which is transmitted via HTTP, so maybe i should do processing data but not return them when i use HEAD method?
That's right - the primary difference between GET and HEAD is whether the server returns a message-body as part of the response.
But what i actually need is only to check if data exists in database.
My suggestion would be to use a new resource for that. "Resources" are about making your database look like a web site. It's perfectly normal in REST to have many URI that map to a queries that use the same predicate.
Jim Webber put it this way:
The web is not your domain, it's a document management system. All the HTTP verbs apply to the document management domain. URIs do NOT map onto domain objects - that violates encapsulation. Work (ex: issuing commands to the domain model) is a side effect of managing resources. In other words, the resources are part of the anti-corruption layer. You should expect to have many many more resources in your integration domain than you do business objects in your business domain.

Which HTTP Verb for Read endpoint with request body

We are exposing an endpoint that will return a large data set. There is a background process which runs once per hour and generates the data. The data will be different after each run.
The requester can ask for either the full set of data or a subset. The sub set is determined via a set of parameters but the parameters are too long to fit into a uri which has a max length of 2,083 characters. https://www.google.co.uk/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=uri%20max%20length
The parameters can easily be sent in the request body but which which is the correct HTTP verb to use?
GET would be ideal but use of a body 'has no semantic meaning to a GET request' HTTP GET with request body
PUT is not appropriate because there is no ID and no data is being updated or replaced.
POST is not appropriate because a new resource is not being replaced and more importantly the server is not generating and Id.
http://www.restapitutorial.com/lessons/httpmethods.html
GET (read) would seem to be the most appropriate but how can we include the complex set of parameters to determine the response?
Many thanks
John

POST is the correct method. POST should be used for any operation that's not standardized by HTTP, which is your case, since there's no standard for a GET operation with a body. The reference you linked is just directly mapping HTTP methods to CRUD, which is a REST anti-pattern.

You are right that GET with body is to be avoided. You can experiment with other safe methods that take a request body (such as REPORT or SEARCH), or you can indeed use POST. I see no reason why the latter is wrong; what you're citing is just an opinion, not the spec.

Assuming that the queries against that big dataset are not totally random, you should consider adding stored queries to your API. This way clients can add, remove, update queries (through request body) using POST DELETE PUT. Maybe you can call them "reports".
This way the GET requests need only a reference as query parameter to these queries/reports, you don't have to send all the details with every requests.
But only if not all the requests from clients are unique.

Proper RESTful way to handle a request that is not really creating or getting something?

I am writing a little app that does one thing only: takes some user-provided data, does some analysis on it, and returns a "tag" for that data. I am thinking that the client should either GET or POST their request to /getTag in order to get a response back.
Nothing is stored on the server when the client does this, so it feels weird to use a POST. However, there is not a uniform URI for the analysis either, so using a GET feels weird, since it will return different things depending on what data is provided.
What is the best way to represent this functionality with REST?

The "best way" is to do whatever is most appropriate for your application and its needs. Not knowing that, here are a few ideas:
GET is the most appropriate verb since you're not creating or storing anything on the server, just retrieving something that the server provides.
Don't put the word get in the URI as you've suggested. Verbs like that are already provided by HTTP, so just use /tag and GET it instead.
You should use a well-understood (or "cool") URI for this resource and pass the data as query parameters. I wouldn't worry about it feeling weird (see this question's answers to find out why).
To sum up, just GET on /tag?foo=bar&beef=dead, and you're done.

POST can represent performing an action. The action doesn't have to be a database action.
What you have really created is a Remote Procedure. RPC is usually all POST. I don't think this is a good fit for REST, but that doesn't have to stop you from using simple URLs and JSON.

It seems to me like there would probably be a reason you or the user who generated the original data would want the generated tag to persist, wouldn't they?
If that's a possibility, then I'd write it as POST /tags and pass the /tags/:id resource URI back as a Location: header.
If I really didn't care about persisting the generated tag, I'd think about what the "user-generated data" was and how much processing is happening behind the scenes. If the "tag" is different enough from whatever data is being passed into the system, GET /tag might be really confusing for an API consumer.

I'll second Brian's answer: use a GET. If the same input parameters return the same output, and you're not really creating anything, it's an idempotent action and thus perfectly suited for a GET.

You can use GET and POST either:
GET /tag?data="..." -> 200, tag
The GET method means retrieve whatever information (in the form of an
entity) is identified by the Request-URI. If the Request-URI refers to
a data-producing process, it is the produced data which shall be
returned as the entity in the response and not the source text of the
process, unless that text happens to be the output of the process.
POST /tag {data: "..."} -> 200, tag
The action performed by the POST method might not result in a resource
that can be identified by a URI. In this case, either 200 (OK) or 204
(No Content) is the appropriate response status, depending on whether
or not the response includes an entity that describes the result.
according to the HTTP standard / method definitions section.
I would use GET if I were you (and POST only if you want to send files).

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse