What is the proper response status for a REST API GET returning no content? - rest

I have an endpoint like so:
GET /api/customer/primary
If a primary customer exists, I return something like
{
"name": "customerName"
}
But what if I send a GET and a primary customer doesn't exist?
Is it better to send a 200 OK with an empty JSON {}
Or is it better to only send a 204 No Content?

404 is the appropriate status code for this. You're trying to represent a 'primary customer' as a resource, but in some cases this relationship doesn't exist. The situation is pretty clear: it should be a 404 for GET requests.
This is a perfectly acceptable way to communicate this. A 404 might signal a client that the resource doesn't exist yet, and perhaps that it can be created with PUT.
204 No Content has a specific meaning, and doesn't make that much sense for your case. 204 is not just meant to signal that there's not going to be a response body (Content-Length: 0 can do that); it has a more specific application for hypermedia applications. Specifically, it signals that when a user performs an action that results in the 204, the view shouldn't refresh. This makes sense, for example, for an "Update" operation where a user can occasionally save their progress while working on a document. Contrast this with 205 Reset Content, which signals that the 'view' should reset so (perhaps) a new document can be created from scratch.
Most applications don't go this far. Frankly, I haven't seen a single one. Given that, returning 200 with Content-Length: 0 or 204 No Content is an almost completely irrelevant discussion. The HTTP specification certainly doesn't forbid 200 OK with Content-Length: 0.
That was a bit of a tangent. To conclude, 404 signals this 'thing' doesn't exist, and that's appropriate here. There are no multiple interpretations. There are the people who wrote the specifications, those who read them well, and, on the other side of the discussion, the people who are wrong.
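A minimal, framework-free sketch of the 404 approach argued for above. The function name `get_primary_customer`, the `primary` argument, and the shape of the error body are assumptions made for illustration; they are not part of the original question.

```python
import json
from http import HTTPStatus
from typing import Optional, Tuple


def get_primary_customer(primary: Optional[dict]) -> Tuple[int, str]:
    """Return (status code, JSON body) for GET /api/customer/primary.

    `primary` is a hypothetical lookup result: a dict describing the
    customer, or None when no primary customer has been assigned.
    """
    if primary is None:
        # The resource does not exist (yet): answer 404 and explain why.
        body = {"error": "no primary customer exists"}
        return HTTPStatus.NOT_FOUND.value, json.dumps(body)
    # The resource exists: answer 200 with its representation.
    return HTTPStatus.OK.value, json.dumps(primary)
```

A client can then branch on the status code alone and never has to guess whether an empty body means "missing".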

But what if I send a GET and a primary customer doesn't exist?
Is it better to send a 200 OK with an empty JSON {}
Or is it better to only send a 204 No Content?
If I'm interpreting your question correctly, you aren't really asking about status codes, but rather what kind of schema you should be using to manage the different cases in your API.
For cases like REST, where the two ends of the conversation are not necessarily controlled by the same organization and same release cycle, you may need to consider that one side of the conversation is using a more recent schema version than the other.
So how is that going to be possible? The best treatments I have seen focus on designing schema for extension - new fields are optional, and have documented semantics for how they should be understood if a field is absent.
From that perspective
{}
Doesn't look like a representation of a missing object - it looks like a representation of an object with default values for all of the optional fields.
It might be that what you want is something like Maybe or Option - where instead of promising to send back an object or not, you are promising to send back a collection of zero or one object. I would normally expect collections to be represented in JSON as an array, rather than an object.
[]
Now, with that idea in pocket, I think it's reasonable to decide that you are returning a representation of a Maybe, where the representation of None is zero bytes long, and the representation of Some(object) is the JSON representation of the object.
So in that design 204 when returning None makes a lot of sense, and you can promise that if a successful response returns a body, that there is really something there.
There's a trade off here - the list form allows consumers to always parse the data, but they have to do that even when a None is sent. On the other hand, using the empty representation for None saves a parse, but requires that the consumer be paying attention to the content length.
So, looking back to your two proposals, I would expect that using 204 is going to be the more successful long term approach.
Another possibility would be to return the null primitive type when you want to express that there is no object available. This would go with a 200 response, because the content length would be four bytes long.
null
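The Maybe-style design described in this answer could be sketched roughly as follows. The helper names `encode_maybe` and `decode_maybe` are hypothetical, and the sketch assumes UTF-8 JSON bodies.

```python
import json
from typing import Any, Optional, Tuple


def encode_maybe(customer: Optional[dict]) -> Tuple[int, bytes]:
    """Server side: None becomes 204 with a zero-byte body."""
    if customer is None:
        return 204, b""
    return 200, json.dumps(customer).encode("utf-8")


def decode_maybe(status: int, body: bytes) -> Optional[Any]:
    """Client side: only parse when the response actually carries a body."""
    if status == 204 or len(body) == 0:
        return None
    # A 200 response whose body is the JSON literal null also decodes to None.
    return json.loads(body.decode("utf-8"))
```

The client-side check is the cost mentioned in the trade-off: the consumer has to look at the status or content length before parsing.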

The text of HTTP status 404 ("Not Found") is the closest match for the situation, but:
The first digit of the Status-Code defines the class of response. The last two digits do not have any categorization role. There are 5 values for the first digit:
1xx: Informational - Request received, continuing process
2xx: Success - The action was successfully received, understood, and accepted
3xx: Redirection - Further action must be taken in order to complete the request
4xx: Client Error - The request contains bad syntax or cannot be fulfilled
5xx: Server Error - The server failed to fulfill an apparently valid request
(reference)
In practice, 4xx is treated as an error, and it is likely that some alerts will be raised by network / security / logging infrastructure.
The semantics of 204 indicate that the server has successfully fulfilled a request and there is no additional content to send, which is not exactly what is happening here.
A common use case is to return 204 as a result of a PUT request, updating the resource.
Therefore I would recommend using one of the following:
HTTP 200 with an empty object / array, like you suggested.
HTTP 200 returning a null object, e.g.:
"none" (valid JSON)
or
{
"name": "NO_PRIMARY_CUSTOMER"
}
(implementation of such a null object depends on your specific system behavior with the returned data)
Custom HTTP 2xx code with an empty result
A less common but still workable alternative is to return a custom HTTP code within the 2xx range (e.g. HTTP 230) with an empty result.
This option should be used with extra caution or even avoided if the API is exposed to a wide audience that may use unknown tools to access / monitor the API.

Related

HTTP GET Request Params - which error code? 400 vs 404 vs 422

I've read a lot of questions about 400 vs 422, but they are almost all about HTTP POST requests, for example this one: 400 vs 422 response to POST of data.
I am still not sure what I should use for GET when the required parameters are sent with wrong values.
Imagine this scenario:
I have the endpoint /searchDeclaration, with the parameter type.
My declarations have 2 types: TypeA and TypeB.
So, I can call this endpoint like this: /searchDeclaration?type=TypeA to get all TypeA declarations.
What error should I send when someone calls the endpoint with an invalid type? For example: /searchDeclaration?type=Type123
Should I send 400? I am not sure it is the best code, because the parameter is right, only the value is not valid.
The 422 code looks like it is more suitable for POST request.
EDIT:
After some responses I have another doubt.
The /searchDeclaration endpoint will return the declarations for the authenticated user. TypeA and TypeB are valid values, but some users have not submitted a TypeB declaration, so when they call /searchDeclaration?type=TypeB which error should I send? Returning 404 does not seem right because the URI is correct, but that user does not have a declaration for that value yet. Am I overthinking this?
If the URI is wrong, use 404 Not Found.
The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource
The target resource is, as one might expect, identified by the target URI.
Aside from the semantics of the response, 404 indicates that the response is cacheable, meaning that general purpose caches will know that they can re-use this response for future requests.
I am not sure it is the best code, because the parameter is right, only the value is not valid.
The fact that this URI includes parameters and values is an implementation detail; something that we don't care about at the HTTP level.
Casually: 404 means "the URI is spelled wrong"; but doesn't try to discriminate which part(s) of the URI have errors. That information is something that you can include in the body of the response, as part of the explanation of the error situation.
Am I overthinking this?
No, but I don't think you are thinking about the right things, yet.
One of the reasons that you are finding this challenging is that you have multiple logical resources sharing the same target URI. If each user declaration document had its own unique identifier, then the exercise of choosing the right response semantics would be a lot more straightforward.
Another way of handling it would be to redirect the client to the more specific URI, and then handle the response semantics there in the straightforward way.
It's trying to use a common URI for different logical resources AND respond without requiring an extra round trip that is making it hard. Bad news: this is one of the trade offs that ought to have been considered when designing your resource identifiers; if you don't want this to require harder thinking, don't use this kind of design.
The good news: 404 is still going to be fine - you are dealing with authorized requests, and the rules about sharing responses to authorized requests mean that the only possible confusion will be if different users are sharing the same private cache.
Remember: part of the point is that all resources share a common, general purpose message vocabulary. Everything is supposed to look like a document being served by a boring web server.
The fact that there's a bunch of complexity of meaning behind the resource is an implementation detail that is correctly hidden behind the uniform interface.
There are two options, it all depends on your 'type' variable.
If 'type' is an ENUM that only allows 'typeA' and 'typeB' and your client sends 'type123', the service will respond with a '400 Bad Request' error without any explicit check on your part. In my opinion this is ideal, since if you need to add new types in the future you only have to add them to the ENUM instead of extending 'if-else' checks in your code.
In case the 'type' variable is a String, the controller will admit a 'type123' and you should be the one to return an error, since the client request is not malformed, but rather it is trying to access a resource that does not exist.
In this case, you can return a 404 Not Found error, (that resource the client is filtering by cannot be found), or a 422 error as you say, since the server understands the request, but is not able to process it.
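The answer above appears to assume a framework that binds the query parameter to an enum and rejects unknown values on its own. A hand-rolled sketch of the same idea, where the names `DeclarationType` and `parse_type` and the error body are assumptions for illustration:

```python
from enum import Enum
from typing import Tuple


class DeclarationType(Enum):
    TYPE_A = "TypeA"
    TYPE_B = "TypeB"


def parse_type(raw: str) -> Tuple[int, object]:
    """Map the raw `type` query value to (status, payload)."""
    try:
        # Enum lookup by value raises ValueError for anything not listed.
        return 200, DeclarationType(raw)
    except ValueError:
        allowed = [t.value for t in DeclarationType]
        return 400, {"error": f"invalid type {raw!r}", "allowed": allowed}
```

Adding a new declaration type then only requires a new enum member; the validation logic stays untouched.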
Let's assume for a moment that the resource you are querying is returning a set of entries that do contain certain properties. If you don't specify a filter you will basically get a (pageable) representation of those entries either as embedded objects or as links to those resources.
Now you want to filter the results based on some properties these entries have. Most programming languages nowadays provide some lambda functionality in the form of
List filteredList = list.filter(item => item.propertyX == ...)...;
The result of such a filter function is usually a list of items that fulfilled the specified conditions. If no items met the given condition then the result will be an empty list.
Applying certain filter conditions on the Web can be designed similarly. Is it really an error when a provided filter expression doesn't yield any entries? IMO it is not an error in terms of the actual message transport itself as the server was able to receive and parse the request without any issues. As such it has to be some kind of business rule that states that only admissible values are allowed as input.
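As a rough illustration of treating the query as a filter rather than as a lookup that can fail, the sketch below always answers 200 and lets the list be empty. The function name `search_declarations` and the record shape are hypothetical.

```python
from typing import Dict, List, Tuple


def search_declarations(
    declarations: List[Dict], wanted_type: str
) -> Tuple[int, List[Dict]]:
    """Filter declarations by type; an empty result is not an error."""
    matches = [d for d in declarations if d["type"] == wanted_type]
    return 200, matches


# Hypothetical data for a user who has only submitted TypeA declarations.
DECLARATIONS = [
    {"id": 1, "type": "TypeA"},
    {"id": 2, "type": "TypeA"},
]
```

Calling `search_declarations(DECLARATIONS, "TypeB")` yields a 200 with an empty list, which mirrors what a local `filter` call would return.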
If you or your company consider a filter value that returns no results to be an error, or you perform some validation (e.g. XML or JSON schema validation) on the received payload (for complex requests), then we should look at how the HTTP errors mentioned are defined:
The 400 (Bad Request) status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing). (Source: RFC 7231)
Here, it is clearly the case that you don't want to process requests that come with an invalid property value, and so you decline the request.
The 422 (Unprocessable Entity) status code means the server understands the content type of the request entity (hence a 415(Unsupported Media Type) status code is inappropriate), and the syntax of the request entity is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process the contained instructions. For example, this error condition may occur if an XML request body contains well-formed (i.e., syntactically correct), but semantically erroneous, XML instructions. (Source: RFC 4918 (WebDAV))
In this case you basically say that the payload was syntactically correct but failed on a semantic level.
Note that 422 Unprocessable Entity stems from WebDAV while 400 Bad Request is defined in the HTTP specification. This can have some impact if your API serves arbitrary HTTP clients. Ones that only know and support the HTTP error codes defined in the HTTP specification won't be able to really determine the semantics of the 422 response. They will still consider it a user error, but won't be able to provide the client with any more help on that issue. As such, if your API needs to be as generic as possible, stick to 400 Bad Request. If you are sure all clients support 422 Unprocessable Entity, go for that one.
General improvement hints
As you tagged your question with rest, let's see how we can improve this case.
REST is an architectural style with an intention of decoupling clients from servers to make the former more failure tolerant while allowing the latter to evolve freely over time. Servers in such architectures should therefore provide clients with all the things clients need to make valid requests. To avoid clients having to know upfront what a server expects as input, servers usually provide some kind of input mask clients can use to fill in what the server needs.
On the browsable Web this is usually accomplished by HTML forms. The form not only teaches your client where to send the request to, which HTTP operation to use and which representation format the request should actually use (usually given implicitly as application/x-www-form-urlencoded) but also the structure and properties the server supports.
In HTML forms it is rather easy for the server to restrict the input choices of a client by using something along the lines of
<form action="/target">
<label for="cars">Choose a car:</label>
<select name="cars" id="cars">
<option value="volvo">Volvo</option>
<option value="saab">Saab</option>
<option value="opel">Opel</option>
<option value="audi">Audi</option>
</select>
<br/>
<input type="submit" value="Submit">
</form>
This doesn't really remove the need to check and verify the correctness of the request on the server side, though you make it much easier for the client to actually perform a valid request.
Unfortunately, HTML forms themselves have their limits. For example, they only allow POST and GET requests to be issued. While encType defaults to application/x-www-form-urlencoded, if you want to transfer files you should use multipart/form-data. Other than that, any valid content-type should be admissible.
If you prefer JSON-based payloads over HTML you might want to look into JSON Forms, HAL forms, Ion Forms among others.
Note though that you should adhere to the content type negotiation principles. Most often, proactive content negotiation is performed: the client sends its preferences in the Accept header, and the server selects the best match and either returns the resource in that representation format or responds with 406 Not Acceptable. While the standard doesn't prevent returning a default representation in such cases, doing so carries the danger that clients won't be able to process the response. A better alternative is to fall back to reactive negotiation, where the server responds with 300 Multiple Choices and the client selects one of the provided alternatives, then sends a GET request to the selected URI to retrieve a representation it is able to process.
If you want to provide a simple link a client can use to retrieve filtered results, the server should provide the client with the full URI as well as a link relation name and/or extension relation type, which the client can use to look up the URI for the content it is interested in.
Both forms and link-relation support fall under the HATEOAS umbrella, as they help to remove the need for any external documentation such as OpenAPI or Swagger documentation.
To sum things up, I would rethink whether a provided property value that does not exist should really end up as a business failure. I think returning an empty list is just fine here, as that way you clearly state that no result was obtainable for the given criteria. If you still want to stick to a business error, though, check which clients actually make use of your API. If they support 422, go for that one. If you don't know, better stick to 400, as it should be understood by all HTTP clients equally.
In order to reduce the likelihood of ending up with requests that carry invalid property values, use forms to teach clients what requests should look like. Through certain elements or properties you can already teach a client that only a limited set of choices is valid for a certain property. Instead of a form you could also provide dedicated links a client can just use to obtain the filtered result. Just make sure to issue those links with meaningful link relation names then.

Should this GET call return 204 or 200 with a body?

Say we have a rest api which aggregates data about a Person from other services. One of the Aggregator service routes is GET /person/(person id)/driverinfo which tells us whether the person is a licensed driver or not, license id, expiry date of license and the number of traffic violations. These data can be picked up by the Aggregator from one or more other services. This api will be used by a web page to show the "driver info" about a person. It will also be tested with automation.
Currently, the api gives 204 no content response for persons who never had a driving license. This is because one of the underlying apis gives a 204 for that scenario. So, it was decided that the Aggregator should do the same.
But, I believe that this is not a good response. Instead, we should return 200 with appropriate values for the fields. For example, licensed=false, licenseId = N.A. etc. when the underlying api gives a 204. I.e. the Aggregator should generate these fields and their values.
Which approach do you think is better and why?
204 means something specific in HTTP; it says that the server found a representation of the requested resource, and that representation is zero bytes long.
Therefore, the real question is more like "Should we use a zero byte long message to describe a situation?". Maybe? If all of the fields in your message schema are optional, and we are trying to describe a representation that means that all of the fields are taking on their default values, then a zero byte array might be the right way to communicate that.
Within the context of HTTP specifically, the headers themselves are already significant in length (compared to zero), so I wouldn't expect there to be particularly compelling performance reasons to squeeze a signal down to zero length. For instance, if we were normally passing around application/json, I would expect sending an empty object or array to be much more cost effective than sending nothing at all.
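The questioner's proposal (translate an upstream 204 into a 200 with explicit field values) could look roughly like the sketch below. The function name `driver_info_response`, the field names, and the use of null for unknown values are all assumptions for illustration; the question only names `licensed` and `licenseId`.

```python
import json
from typing import Optional, Tuple


def driver_info_response(
    upstream_status: int, upstream_body: Optional[dict]
) -> Tuple[int, str]:
    """Aggregator side: always answer 200 with a JSON document."""
    if upstream_status == 204 or not upstream_body:
        # Upstream reported "no content": spell that out as explicit values.
        info = {"licensed": False, "licenseId": None, "expiryDate": None}
    else:
        # Upstream returned license details: pass them through.
        info = {"licensed": True, **upstream_body}
    return 200, json.dumps(info)
```

With this shape the web page and the test automation can read `licensed` directly instead of special-casing an empty response.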

HTTP status code for GET request with non-existing query parameter value

Let's clarify three common scenarios when no item matches the request with a simple example:
GET /posts/{postId} and postId does not exist (status code 404, no question)
GET /posts?userId={userId} and the user with userId does not have any posts
GET /posts?userId={userId} and the user with userId does not exist itself
I know there's no strict REST guideline for the appropriate status codes for cases 2 and 3, but it seems to be a common practice to return 200 for case 2 as it's considered a "search" request on the posts resource, so it is said that 404 might not be the best choice.
Now I wonder if there's a common practice to handle case 3. Based on a similar reasoning with case 2, 200 seems to be more relevant (and of course in the response body more info could be provided), although returning 404 to highlight the fact that userId itself does not exist is also tempting.
Any thoughts?
Ok, so first, REST doesn't say anything about what the endpoints should return. That's the HTTP spec. The HTTP spec says that if you make a request for a non-existent resource, the proper response code is 404.
Case 1 is a request for a single thing. That would return 404, as you said.
The resource being returned in case 2 is typically an envelope which contains metadata and a collection of things. It doesn't matter if the envelope has any things in it or not. So 200 is the correct response code because the envelope exists, it just so happens the envelope isn't holding any things. It would be allowable under the spec to say there's no envelope if there are no things and return 404, but that's usually not done because then the API can't send the metadata.
Case 3, then, is exactly the same thing as case 2. If the expected response is an envelope, then the envelope exists whether or not the userId is valid. It would not be unreasonable to include metadata in the envelope pointing out that there is no user with userId, if the API designer thinks that information would be useful to clients.
Case 2 and Case 3 are really the same case, and should both either return 200 with an empty envelope or 404.
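A small sketch of the envelope idea, assuming a hypothetical `posts_for_user` function and a `meta` / `items` layout that the original answer does not prescribe:

```python
from typing import Dict, List, Set, Tuple


def posts_for_user(
    posts: List[Dict], known_users: Set[int], user_id: int
) -> Tuple[int, Dict]:
    """Cases 2 and 3: the envelope always exists, so the status is 200."""
    items = [p for p in posts if p["userId"] == user_id]
    envelope = {
        "meta": {
            "userId": user_id,
            # Optional hint for clients that care whether the user exists.
            "userExists": user_id in known_users,
            "count": len(items),
        },
        "items": items,
    }
    return 200, envelope
```

The metadata is the part that would be lost if the API answered 404 for an empty result.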
First piece, you need to recognize that /posts?userId={userId} identifies a resource, precisely in the same sense that /posts/{userId} or /index.html specifies a resource.
So GET /posts?userId={userId} "requests transfer of a current selected representation for the target resource."
The distinction between 200 and 404 is straightforward; if you are reporting to the consumer that "the origin server did not find a current representation for the target resource or is not willing to disclose that one exists", then you should be returning 404. If the response payload includes a current representation of the resource, then you should use the 200 status code.
404 is, of course, a response from the Client Error response class:
the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition.
So a way of figuring out which of these status codes to use, is to just look at the message body of the response. If it is a current representation of the resource, then you use the 200 status code. If it is a representation of a message that explains no current representation is available, then you use the 404 status code.
Of course that ducks the big question: what should the representation of the resource be in each case? Once you know that, you can work out the rest.
If you think that an unexpected identifier indicates an error on the client (for example, a corrupted link), then it will probably improve the consumer's experience to report that as an explicit error, rather than returning a representation of an empty list.
But that's a judgment call; different APIs are going to have different answers, and HTTP isn't particularly biased one way or the other; HTTP just asks that you ensure that the response code and headers are appropriate for the choice that you have made.

Correct HTTP status code for possible absent entity: 200 or 204 or 404

I've searched a lot about this and have found different answers, also, my case is a little different.
The context:
I have a document A with a possible sender S
Server X (so not a browser) requests the sender from document A on Server Y, but the sender is absent.
What should server Y return:
200: with a null object (not really OK and dangerous for nullpointers on Server X)
204: a correct status I think, but this is mainly used when the endpoint does not return data in general (e.g. post, update, delete), which can be confusing
404: this should definitely be the answer for .../sender/{sender_id}. But in this case we ask for the sender of a document, and no sender is a correct answer...
So, what would be the best practice, or is there another approach which is better fitting for this.
Thanks in advance!
Broad rule: don't try to make status codes specific to the details of your api or your domain model. Those are messages to generic components (like browsers, caches, proxies) that don't need to know anything about the specifics of your domain model and your integration protocol.
404: this should definitely be the answer for .../sender/{sender_id}. But in this case we ask for the sender of a document, and no sender is a correct answer...
The 4xx class of response codes indicates an error in the client request. In other words, the client asked the question wrong. 404 specifically implies that the client addressed the request to the wrong integration resource.
So it's not what you are looking for.
204: a correct status I think, but this is mainly used when the endpoint does not return data in general (e.g. post, update, delete), which can be confusing
204 has a very specific meaning - it says that the representation provided in the response is zero bytes long. "You asked me to send you the contents of this file, and I'm successfully doing so, but by the way the file is empty."
So if the representation of an absent sender is zero bytes long, aces! But if the representation is instead an empty json object
{}
Then 204 is off the table.
200 is probably your best bet.
I would suggest 404 Not Found matches best what you describe as "sender is absent"
Just in case here you can see a list of status codes: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
There are two cases here.
If you are requesting a collection, e.g. /users, and there are no users available on the server, then your service should return 200 OK with an empty list.
If you are requesting a resource by id, e.g. /users/id, and the user is not available, then you should return 404 Not Found, as the user id that you are searching for is not available on the server.
Based on your situation (assuming you are requesting a collection of resources), I would recommend returning 200 OK with an empty list. If you do not have control over the server, then you may need a null check on the client side.
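The two cases in this answer can be sketched side by side. The function names `list_users` and `get_user` and the in-memory dictionary are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Tuple


def list_users(users: Dict[int, Dict]) -> Tuple[int, List[Dict]]:
    """Collection request (/users): an empty collection is still 200."""
    return 200, list(users.values())


def get_user(users: Dict[int, Dict], user_id: int) -> Tuple[int, Optional[Dict]]:
    """Single-resource request (/users/id): an unknown id is 404."""
    user = users.get(user_id)
    if user is None:
        return 404, None
    return 200, user
```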

Is it valid to modify a REST API representation based on a If-Modified-Since header?

I want to implement a "get changed values" capability in my API. For example, say I have the following REST API call:
GET /ws/school/7/student
This gets all the students in school #7. Unfortunately, this may be a lot. So, I want to modify the API to return only the student records that have been modified since a certain time. (The use case is that a nightly process runs from another system to pull all the students from my system to theirs.)
I see http://blog.mugunthkumar.com/articles/restful-api-server-doing-it-the-right-way-part-2/ recommends using the if-modified-since header and returning a representation as follows:
Search all the students updated since the time requested in the if-modified-since header
If there are any, return those students with a 200 OK
If there are no students returned from that query, return a 304 Not Modified
I understand what he wants to do, but this seems the wrong way to go about it. The definition of the If-Modified-Since header (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24) says:
The If-Modified-Since request-header field is used with a method to make it conditional: if the requested variant has not been modified since the time specified in this field, an entity will not be returned from the server; instead, a 304 (not modified) response will be returned without any message-body.
This seems wrong to me. We would not be returning the representation or a 304 as indicated by the RFC, but some hybrid. It seems like client side code (or worse, a web cache between server and client) might misinterpret the meaning and replace the local cached value, when it should really just be updating it.
So, two questions:
Is this a correct use of the header?
If not (and I suspect not), what is the best practice? Query string parameter?
This is not the correct use of the header. The If-Modified-Since header is one which an HTTP client (browser or code) may optionally supply to the server when requesting a resource. If supplied the meaning is "I want resource X, but only if it's changed since time T." Its purpose is to allow client-side caching of resources.
The semantics of your proposed usage are "I want updates for collection X that happened since time T." It's a request for a subset of X. It does not seem like your motivation is to enable caching. Your client-side cached representation seemingly contains all of X, even though the typical request will only return you a small set of changes to X; that is, the response is not what you are directly caching, so the caching needs to happen in custom user logic client-side.
A query string parameter is a much more appropriate solution. Below {seq} would be something like a sequence number or timestamp.
GET /ws/schools/7/students/updates?since={seq}
Server-side I imagine you have a sequence of updates since the beginning of your system and a request of the above form would grab the first N updates that had a sequence value greater than {seq}. In this way, if a client ever got very far behind and needed to catch up, the results would be paged.
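A minimal sketch of the server-side behaviour described above, assuming each update carries an integer `seq` field. The function name `updates_since` and the `nextSince` / `hasMore` response fields are hypothetical.

```python
from typing import Dict, List


def updates_since(updates: List[Dict], seq: int, limit: int = 100) -> Dict:
    """Return the first `limit` updates whose sequence value exceeds `seq`."""
    newer = sorted(
        (u for u in updates if u["seq"] > seq), key=lambda u: u["seq"]
    )
    page = newer[:limit]
    # The client passes nextSince as ?since= on its next request.
    next_since = page[-1]["seq"] if page else seq
    return {
        "updates": page,
        "nextSince": next_since,
        "hasMore": len(newer) > len(page),
    }
```

A client that has fallen far behind keeps requesting with the returned `nextSince` value until `hasMore` is false.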