REST API: HTML generic error response for Invalid JSON - rest

Service X host REST APIs and is behind service Y.
Clients -> Y -> X
For invalid JSON, service Y responds back with HTML error (shown below). Service X does not have control over Y.
<html>
<head><title>400 Bad Request</title></head>
<center><h1>400 Bad Request</h1></center>
</body>
</html>
For all other type of errors, X responds back with appropriate HTTP response code and an Error JSON (Following format).
{
"errorCode": "InvalidXXX",
"message": ""
}
I am trying to check if there exists any RFC around REST API error response standards?
Is there a security risk if service returns an error response with details mentioning the JSON is invalid?
I am trying to understand whether it will be fine for us to document this special case as part of integration guide for clients.

I am trying to check if there exists any RFC around REST API error response standards?
REST is an architectural style defined by Fielding in the chapter 5 of his dissertation and it says nothing about the response format for errors.
I assume you are doing REST over HTTP, so I advise you to choose the most suitable status code for each situation. Status codes are meant to indicate the result of the server's attempt to understand and satisfy the client's request.
Status codes are sometimes not sufficient to convey enough information about an error to be helpful and some details sent in the payload can help the client to understand what caused the error.
If you are looking for a standard from IETF to report errors, the closest you'll find is probably the RFC 7807. This specification defines simple JSON and XML document formats to report errors to the client, along with the application/problem+json and application/problem+xml media types.
Is there a security risk if service returns an error response with details mentioning the JSON is invalid?
When it's a client error, it makes sense to inform the client on what's wrong, so they can fix it and perform a new request. You shouldn't, however, leak any stack trace or internal details that could be exploited by a malicious user.

Related

API design - Optional body in client request - Status code to return if validation fails

In our API, one of the endpoint will expect clients to provide body/payload only in certain scenario.
If the API is unable to generate a payload for given request based on the origin of the client then, we want our API to provide response with the right status code to the client, so that they know they have to provide additional information. Once the client fulfills the request with body/payload then the api will process the request as normal.
I just wanted to know is there any standard, predefined status code or procedure to implement this kind of endpoint in API design or do we have to just reject the request with some custom status code and then ask the client to implement a logic based on custom code?.
Thanks,
Vinoth
HTTP Status codes don't, nor are they intended to, map precisely against every real world error. They represent categories of error.
For example, a 404 means that the resource couldn't be found, but if your path is /customers/11/animals/5 then there are several things which could be wrong with the path. customer 11 may not have an animal 5 for example, or there may be no customer 11. There is no http response for "animal not found". Or your API may not have any calls with that pattern of URL to begin with.
You should return a status code which represents what "category" of error you have (in this case, something was not found), and the response body should contain more specific details about the error. To make things simpler, I find it helpful if the data structure is the same for a success and error (it makes parsing much easier) with a "data" field which varies per response.
Here is one example:
status code: 404 not found
body: {
"messageDetailCode" :"CustomerNotFound",
"messageDetail" : "Customer not found",
"data" : null
}
Further reading:
What's an appropriate HTTP status code to return by a REST API service for a validation failure?

Post condition check http status

Using a following API:
feed?url=XXX
Validations performed on the parameter url:
If missing: 400 Bad Request
If empty/invalid URL: 422 Unprocessable Entity
If URL don't point to a valid RSS/Atom feed: 422??
What status error should be returned for the 3.?
Unlike validation 2., it is not possible to check the 3. without fetching data and trying to parse it, so raw user data can't be directly validated.
I was thinking about 422 Unprocessable Entity because it is related to validation even if is not directly the data (url) but the reference of this data (content of the url).
What is your opinion?
422 is inappropriate for #2 and #3. It relates to the server understanding the Content-Type header, but the actual content in the HTTP request body could not be interpreted.
I think you could make the argument that 502 Bad Gateway is appropriate here. It's a bit weird because the problem is both user error (incoming url parameter, so a 4xx code), but also it's kind of happening on the server, and in particular an origin server (which makes sense as a 5xx and specifically 502).
But if in this context you strictly consider this an issue that the client caused (bad input in the url) and not server-side, then I would say that there's not a specific enough error code for this, and you should probably just stick to 400 for all of them.
And maybe.. perhaps you can make the argument that 409 might work here. A 409 can used in a case where:
There is nothing in particular wrong with the HTTP request perse.
But the state of another resource is causing the request to fail.
Typically that 'other resource' lives in the same system, but I don't see why the external Atom feed could also not be considered as conflicting.
Because if the Atom feed on the external server is 'fixed', then the original HTTP request that the user made will now work. So a 409 kind of is appropriate here.

HTTP status code in REST API for using GET to query a “Not Ready Yet, Try Again Later” resource?

First of all, I've read some relevant posts:
Best HTTP status code in REST API for “Not Ready Yet, Try Again Later”? It is about GET an item
Is it wrong to return 202 “Accepted” in response to HTTP GET? It is about GET an item
HTTP Status Code for Resource not yet available It is about POST
HTTP status code for in progress? It is about GET but no clear answer.
but I still think I should raise my question and my thoughts here. What should be the HTTP status code in REST API for using GET to QUERY a “Not Ready Yet, Try Again Later” resource? For example, client tries to query all local news happen in future(!) by make an HTTP GET to this url: http://example.com/news?city=chicago&date=2099-12-31 so what shall the server reply?
These are the http status code I considered, their rfc definition and why I am not fully satisfied with:
3xx Redirection. Comment: Not an option because there is no other link to be redirected to.
503 Service Unavailable: The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. Comment: The retry behavior is desired, but semantically the situation is not server's fault at all, so all 5xx look weird.
4xx Client Error. Comment: Looks promising. See below.
413 Request Entity Too Large: The server is refusing to process a request because the request entity is larger than the server is willing or able to process. ... If the condition is temporary, the server SHOULD include a Retry- After header field to indicate that it is temporary and after what time the client MAY try again. Comment: The retry behavior is desired, however the "Entity Too Large" part is somewhat misleading.
417 Expectation Failed: The expectation given in an Expect request-header field (see section 14.20) could not be met by this server. Comment: So it should be caused by an Expect request-header, not applicable to my case.
406 Not Acceptable: The resource ... not acceptable according to the accept headers sent in the request. Comment: so it is caused by the Accept request-header, not applicable to my case.
409 Conflict: The request could not be completed due to a conflict with the current state of the resource. This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request. ... Conflicts are most likely to occur in response to a PUT request. Comment: This one is close. Although my case is not about PUT, and isn't actually caused by conflict.
404 Not Found: The server has not found anything matching the Request-URI. Comment: Technically, my url path (http://example.com/news) DOES exist, it is the parameters causing problems. In this case, returning an empty collection instead of a 404, is probably more appropriate.
403 Forbidden: The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. Comment: Generally this is supposed to be used in any restricted resource?
400 Bad Request: The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications. Comment: It is not true in my case. My server understands the request, its syntax is good, only the meaning is bad.
2xx Successful. Comment: If 4xx doesn't work, how about 2xx? See below.
200 OK. Comment: Fine. So what should I include in the response body? null or [] or {} or {"date": "2099-12-31", "content_list": null} or ... which one is more intuitive? On the other hand, I prefer a way to clearly differentiate the minor "future news" error from the more common "all query criteria are good, just no news this day" situation.
202 Accepted: The request has been accepted for processing, but the processing has not been completed. The request might or might not eventually be acted upon. Comment: Providing that we can use 202 in a GET request, it is acceptable. Then refer to the 200 comment.
204 No Content: The server has fulfilled the request but does not need to return an entity-body. Comment: Providing that we can use 204 in a GET request, it is acceptable. Just don't know whether this is better than 202 or 200.
More on 2xx: Comment: I assume all 2xx response will likely be cached somewhere. But in my case, if I return an empty body for "tomorrow's news", I don't want it to be cached. Ok, explicitly specify the "no cache" headers should help.
Your thoughts?
Use 404.
Your objection to it is based on a popular understanding of a URI as not including the querystring. "Because I have multiple URI's that map to the same handler," goes the logic, "my resource does in fact exist and is just being parameterized by querystring args."
This is incorrect. As the URI spec itself says in Section 3.3 (emphasis mine),
"The path component contains data, usually organized in hierarchical
form, that, along with data in the non-hierarchical query
component (Section 3.4), serves to identify a resource within the
scope of the URI's scheme and naming authority (if any)."
Resources are identified by URI's, and any change to any part of an absolute-URI identifies a separate resource. Tweet that to everyone you know once a day until they tell you to stop. Therefore a 404 is a perfect match: "The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource or is not willing to disclose that one exists."
You are retrieving the news for that day, which is a valid day, there just isn't any news. A 200 response of an empty body, or a what ever makes sense based on the mediatype would seem the logical. This depends on the media type you have decided with the client.
404 would make more sense if the date format was wrong (you asked for the 45th day of November, or asked for a city that doesn't exist.)
As an aside the URL would be better in the format http://example.com/news/chicago/2099-12-31 since that is the specific resource you want to retrieve. This format would make things like 404s clearer as well.

400 vs 422 response to POST of data

I'm trying to figure out what the correct status code to return on different scenarios with a "REST-like" API that I'm working on. Let's say I have an end point that allows POST'ing purchases in JSON format. It looks like this:
{
"account_number": 45645511,
"upc": "00490000486",
"price": 1.00,
"tax": 0.08
}
What should I return if the client sends me "sales_tax" (instead of the expected "tax"). Currently, I'm returning a 400. But, I've started questioning myself on this. Should I really be returning a 422? I mean, it's JSON (which is supported) and it's valid JSON, it's just doesn't contain all of the required fields.
400 Bad Request would now seem to be the best HTTP/1.1 status code for your use case.
At the time of your question (and my original answer), RFC 7231 was not a thing; at which point I objected to 400 Bad Request because RFC 2616 said (with emphasis mine):
The request could not be understood by the server due to malformed syntax.
and the request you describe is syntactically valid JSON encased in syntactically valid HTTP, and thus the server has no issues with the syntax of the request.
However as pointed out by Lee Saferite in the comments, RFC 7231, which obsoletes RFC 2616, does not include that restriction:
The 400 (Bad Request) status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).
However, prior to that re-wording (or if you want to quibble about RFC 7231 only being a proposed standard right now), 422 Unprocessable Entity does not seem an incorrect HTTP status code for your use case, because as the introduction to RFC 4918 says:
While the status codes provided by HTTP/1.1 are sufficient to
describe most error conditions encountered by WebDAV methods, there
are some errors that do not fall neatly into the existing categories.
This specification defines extra status codes developed for WebDAV
methods (Section 11)
And the description of 422 says:
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a
415(Unsupported Media Type) status code is inappropriate), and the
syntax of the request entity is correct (thus a 400 (Bad Request)
status code is inappropriate) but was unable to process the contained
instructions.
(Note the reference to syntax; I suspect 7231 partly obsoletes 4918 too)
This sounds exactly like your situation, but just in case there was any doubt, it goes on to say:
For example, this error condition may occur if an XML
request body contains well-formed (i.e., syntactically correct), but
semantically erroneous, XML instructions.
(Replace "XML" with "JSON" and I think we can agree that's your situation)
Now, some will object that RFC 4918 is about "HTTP Extensions for Web Distributed Authoring and Versioning (WebDAV)" and that you (presumably) are doing nothing involving WebDAV so shouldn't use things from it.
Given the choice between using an error code in the original standard that explicitly doesn't cover the situation, and one from an extension that describes the situation exactly, I would choose the latter.
Furthermore, RFC 4918 Section 21.4 refers to the IANA Hypertext Transfer Protocol (HTTP) Status Code Registry, where 422 can be found.
I propose that it is totally reasonable for an HTTP client or server to use any status code from that registry, so long as they do so correctly.
But as of HTTP/1.1, RFC 7231 has traction, so just use 400 Bad Request!
Case study: GitHub API
https://docs.github.com/en/rest/overview/resources-in-the-rest-api#client-errors
Maybe copying from well known APIs is a wise idea:
There are three possible types of client errors on API calls that receive request bodies:
Sending invalid JSON will result in a 400 Bad Request response:
HTTP/1.1 400 Bad Request
Content-Length: 35
{"message":"Problems parsing JSON"}
Sending the wrong type of JSON values will result in a 400 Bad Request response:
HTTP/1.1 400 Bad Request
Content-Length: 40
{"message":"Body should be a JSON object"}
Sending invalid fields will result in a 422 Unprocessable Entity response:
HTTP/1.1 422 Unprocessable Entity
Content-Length: 149
{
"message": "Validation Failed",
"errors": [
{
"resource": "Issue",
"field": "title",
"code": "missing_field"
}
]
}
400 Bad Request is proper HTTP status code for your use case. The code is defined by HTTP/0.9-1.1 RFC.
The request could not be understood by the server due to malformed
syntax. The client SHOULD NOT repeat the request without
modifications.
https://www.rfc-editor.org/rfc/rfc2616#section-10.4.1
422 Unprocessable Entity is defined by RFC 4918 - WebDav. Note that there is slight difference in comparison to 400, see quoted text below.
This error condition may occur if an XML
request body contains well-formed (i.e., syntactically correct), but
semantically erroneous, XML instructions.
To keep uniform interface you should use 422 only in a case of XML responses and you should also support all status codes defined by Webdav extension, not just 422.
https://www.rfc-editor.org/rfc/rfc4918#page-78
See also Mark Nottingham's post on status codes:
it’s a mistake to try to map each part of your application “deeply”
into HTTP status codes; in most cases the level of granularity you
want to be aiming for is much coarser. When in doubt, it’s OK to use
the generic status codes 200 OK, 400 Bad Request and 500 Internal
Service Error when there isn’t a better fit.
How to Think About HTTP Status Codes
To reflect the status as of 2015:
Behaviorally both 400 and 422 response codes will be treated the same by clients and intermediaries, so it actually doesn't make a concrete difference which you use.
However I would expect to see 400 currently used more widely, and furthermore the clarifications that the HTTPbis spec provides make it the more appropriate of the two status codes:
The HTTPbis spec clarifies the intent of 400 to not be solely for syntax errors. The broader phrase "indicates that the server cannot or will not process the request due to something which is perceived to be a client error" is now used.
422 is specifically a WebDAV extension, and is not referenced in RFC 2616 or in the newer HTTPbis specification.
For context, HTTPbis is a revision of the HTTP/1.1 spec that attempts to clarify areas that were unclear or inconsistent. Once it has reached approved status it will supersede RFC2616.
There is no correct answer, since it depends on what the definition of "syntax" is for your request. The most important thing is that you:
Use the response code(s) consistently
Include as much additional information in the response body as you can to help the developer(s) using your API figure out what's going on.=
Before everyone jumps all over me for saying that there is no right or wrong answer here, let me explain a bit about how I came to the conclusion.
In this specific example, the OP's question is about a JSON request that contains a different key than expected. Now, the key name received is very similar, from a natural language standpoint, to the expected key, but it is, strictly, different, and hence not (usually) recognized by a machine as being equivalent.
As I said above, the deciding factor is what is meant by syntax. If the request was sent with a Content Type of application/json, then yes, the request is syntactically valid because it's valid JSON syntax, but not semantically valid, since it doesn't match what's expected. (assuming a strict definition of what makes the request in question semantically valid or not).
If, on the other hand, the request was sent with a more specific custom Content Type like application/vnd.mycorp.mydatatype+json that, perhaps, specifies exactly what fields are expected, then I would say that the request could easily be syntactically invalid, hence the 400 response.
In the case in question, since the key was wrong, not the value, there was a syntax error if there was a specification for what valid keys are. If there was no specification for valid keys, or the error was with a value, then it would be a semantic error.
422 Unprocessable Entity Explained Updated: March 6, 2017
What Is 422 Unprocessable Entity?
A 422 status code occurs when a request is well-formed, however, due
to semantic errors it is unable to be processed. This HTTP status was
introduced in RFC 4918 and is more specifically geared toward HTTP
extensions for Web Distributed Authoring and Versioning (WebDAV).
There is some controversy out there on whether or not developers
should return a 400 vs 422 error to clients (more on the differences
between both statuses below). However, in most cases, it is agreed
upon that the 422 status should only be returned if you support WebDAV
capabilities.
A word-for-word definition of the 422 status code taken from section
11.2 in RFC 4918 can be read below.
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a
415(Unsupported Media Type) status code is inappropriate), and the
syntax of the request entity is correct (thus a 400 (Bad Request)
status code is inappropriate) but was unable to process the contained
instructions.
The definition goes on to say:
For example, this error condition may occur if an XML request body
contains well-formed (i.e., syntactically correct), but semantically
erroneous, XML instructions.
400 vs 422 Status Codes
Bad request errors make use of the 400 status code and should be
returned to the client if the request syntax is malformed, contains
invalid request message framing, or has deceptive request routing.
This status code may seem pretty similar to the 422 unprocessable
entity status, however, one small piece of information that
distinguishes them is the fact that the syntax of a request entity for
a 422 error is correct whereas the syntax of a request that generates
a 400 error is incorrect.
The use of the 422 status should be reserved only for very particular
use-cases. In most other cases where a client error has occurred due
to malformed syntax, the 400 Bad Request status should be used.
https://www.keycdn.com/support/422-unprocessable-entity/
Your case: HTTP 400 is the right status code for your case from REST perspective as its syntactically incorrect to send sales_tax instead of tax, though its a valid JSON. This is normally enforced by most of the server side frameworks when mapping the JSON to objects. However, there are some REST implementations that ignore new key in JSON object. In that case, a custom content-type specification to accept only valid fields can be enforced by server-side.
Ideal Scenario for 422:
In an ideal world, 422 is preferred and generally acceptable to send as response if the server understands the content type of the request entity and the syntax of the request entity is correct but was unable to process the data because its semantically erroneous.
Situations of 400 over 422:
Remember, the response code 422 is an extended HTTP (WebDAV) status code. There are still some HTTP clients / front-end libraries that aren't prepared to handle 422. For them, its as simple as "HTTP 422 is wrong, because it's not HTTP". From the service perspective, 400 isn't quite specific.
In enterprise architecture, the services are deployed mostly on service layers like SOA, IDM, etc. They typically serve multiple clients ranging from a very old native client to a latest HTTP clients. If one of the clients doesn't handle HTTP 422, the options are that asking the client to upgrade or change your response code to HTTP 400 for everyone. In my experience, this is very rare these days but still a possibility. So, a careful study of your architecture is always required before deciding on the HTTP response codes.
To handle situation like these, the service layers normally use versioning or setup configuration flag for strict HTTP conformance clients to send 400, and send 422 for the rest of them. That way they provide backwards compatibility support for existing consumers but at the same time provide the ability for the new clients to consume HTTP 422.
The latest update to RFC7321 says:
The 400 (Bad Request) status code indicates that the server cannot or
will not process the request due to something that is perceived to be
a client error (e.g., malformed request syntax, invalid request
message framing, or deceptive request routing).
This confirms that servers can send HTTP 400 for invalid request. 400 doesn't refer only to syntax error anymore, however, 422 is still a genuine response provided the clients can handle it.
400 - Failed the request validation like if the data is missing, if it has a wrong type, etc. so it is given a status of 400.
422 - Passes the request validation, but failed the operation process, because the the request data, or part of it is giving an error to the to the operation, but is handled, and given a status of 422.
Firstly this is a very good question.
400 Bad Request - When a critical piece of information is missing from the request
e.g. The authorization header or content type header. Which is absolutely required by the server to understand the request. This can differ from server to server.
422 Unprocessable Entity - When the request body can't be parsed.
This is less severe than 400. The request has reached the server. The server has acknowledged the request has got the basic structure right. But the information in the request body can't be parsed or understood.
e.g. Content-Type: application/xml when request body is JSON.
Here's an article listing status codes and its use in REST APIs.
https://metamug.com/article/status-codes-for-rest-api.php

Proper way to convey error messages during calls to a REST service?

I'm writing a REST based web service, and I'm trying to figure out the best way to handle error conditions.
Currently the service is returning HTTP Errors, such as Bad Request, but how can I return extra information to give developers using the web service an idea what they're doing wrong?
For example: creating a user with a null username returns an error of Bad Request. How can I add that the error was caused by a null username parameter?
According to the HTTP spec, the text that comes after the three digit response code, the "Reason-Phrase", can only be replaced with a logical equivalent. So you can't respond with 400 null user and expect anything useful to happen. Indeed, The client is not required to examine or display the Reason- Phrase.
In general, the HTTP response entity (typically the page that accompanies the response) should contain information useful to the client to guide them forward, even when the response is an error. On the web, most such errors are HTML, and are devoid of machine readable information, but most browsers do show the error to the user (and SO's error page is pretty good!).
So for a primarily machine readable resource you have two options:
Pass a human readable message anyway. Return 400 Bad Request with a HTML response, which the client may opt to show to the user. It's dead easy but it's a bit like throwing an unchecked exception, it passes all the hard work to the client, or indeed the end user.
Allow clients to recover. Return 400 Bad Request with a machine readable response which is part of your API, so clients can recover from known error conditions. This is harder, like throwing a checked exception, it becomes part of the API, and it allows clients to recover gracefully if they want to.
You could even make the server support both scenarios by defining a media type for the machie readable error recovery document, and allow clients to "accept" them: Accept: application/atom+xml, application/my.proprietary.errors+json
Clients that forget the mandatory field can opt in to getting machine readable errors or human readable errors by choosing to Accepting the error media type.
It's stated in the HTTP spec that most error codes should return some basic text that gives a clarification of why the error is being returned. The basic Java Servlet Spec defines the HttpServletResponse.sendError(int Code, String message) for this purpose.
String desc = "my Description";
throw new WebApplicationException(Response.status(Status.BAD_REQUEST).entity(desc).type("text/plain").build());