HTTP Response Status Code for bad data found on the DB

HTTP Response Status Code for bad data found on the DB - rest

I expect the JSON response returned by one of my Web APIs to contain a minimum set of fields annotated as mandatory by the business.
Which HTTP status code fit better in case some bad data that does not respect the contract is found on the db?
At the moment we're using 500 but we probably need to improve it (also because Varnish put in front of our service translates that 500 into a 503 Service Unavailable).
Example:
{
"id": "123",
"message": "500 - Exception during request processing. Cause: subtitles is a required field of class Movie and cannot be empty",
"_links": {
"self": {
"href": "/products/movies/123"
}
}
}
Thanks

An 500 Internal Server Error seems to be enough here to be honest as the issue is essentially from the server side and it doesn't reflect any wrong doing from the client side.

400 means BAD request but there is nothing bad with the request so don't use it.
500 means server error as in something unexpected happened - the code imploded, the galaxy crashed. Bad data is not a cause of any of that.
You have a case of bad data so you have a number of approaches at your disposal.
The first question is what do you intend to do with the bad data?
You could delete it in which case you would return a 204 which means the request is fine but there is no data to send back. I would not return a 404 for no data because 404 means endpoint doesn't exist, not data does not exist.
You could move it out of the main db into a temp data while someone fixes it.
Either way you would not return the bad data and the client doesn't care why that data is bad. All the client is concerned with is there any data or not.
Why should the client care that your data is missing a field you deem important?
You could take another approach and return the data you have.
Bottom line : decide how you're dealing with the bad data and don't put any kind of responsibility on the client.

Related

HTTP status code for GET requests when entity is corrupted

How should servers respond to GET requests for corrupted entities?
Let's say I have a Person schema
Person {
firstName: optional string
lastName: required string
}
And due to an issue in the database or due to a bug in CREATE implementation, certain Person entities with missing lastName got created.
When a client is doing a GET call to fetch such an entity, how should the server respond?
Here are a few options I can think of:
Throw a 4XX
It’s the client’s issue since they shouldn’t be fetching invalid entities. 400/422 both feel incorrect here since the request syntax is valid.
Throw a 500
It’s our (server) issue since at some point we allowed the creation of an invalid entity or something happened on server/DB end where data got corrupted. Either way, not really a client's problem.
Return the entity with the missing lastName
This feels hacky as it essentially means the server isn't following the contract it has specified.
All the above feel somewhat hacky. Is there a better HTTP error for such cases?

The 500 (Internal Server Error) status code indicates that the server encountered an unexpected condition that prevented it from fulfilling the request. -- HTTP Semantics
The server having lost or corrupted data certainly sounds like an "unexpected condition".
Of course, 500 can mean lots of different things; so if you want to share more information with the end consumer, you should include that information in the body of the response:
the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition.
But the general purpose components of the HTTP application don't really care which unexpected condition prevented the server from fulfilling the request, so 500 is sufficient for all of them.

What is the proper response status for a REST API GET returning no content?

I have an endpoint like so:
GET /api/customer/primary
If a primary customer exists, I return something like
{
name: "customerName"
}
But what if I send a GET and a primary customer doesn't exist?
Is it better to send a 200 OK with an empty JSON {}
Or is better to only send a 204 No Content?

404 is the appropriate status code for this. You're trying to represent a 'primary customer' as a resource, but in some cases this relationship doesn't exists. This situation is pretty clear, it should be a 404 for GET requests.
This is a perfectly acceptable way to communicate this. A 404 might signal a client that the resource doesn't exist yet, and perhaps that it can be created with PUT.
204 No Content has a specific meaning, and doesn't make that much sense for your case. 204 is not just meant to signal there's not going to be response body (Content-Length: 0 can do that), but it has a more specific application for hypermedia applications. Specifically, it signals that when a user performs an action that results in the 204, the view shouldn't refresh. This makes sense for for example an "Update" operation where a user can occasionally save their progress while working on a document. Contrast to 205 Reset Content, which signals that the 'view' should reset so (perhaps) a new document can be created from scratch.
Most applications don't go this far. Frankly, I haven't seen a single one. Given that, returning 200 with Content-Length: 0 or 204 No Content is an almost completely irrelevant discussion. The HTTP specification certainly doesn't forbid 200 OK with Content-Length: 0.
That was a bit of a tangent. To conclude, 404 signals this 'thing' doesn't exist, and that's appropriate here. There's no multiple interpretations. There's the people who wrote the specifications, those who read them well and on the other side of the discussion the people who are wrong.

But what if I send a GET and a primary customer doesn't exist?
Is it better to send a 200 OK with an empty JSON {}
Or is better to only send a 204 No Content?
If I'm interpreting your question correctly, you aren't really asking about status codes, but rather what kind of schema should you be using to manage the different cases in your API.
For cases like REST, where the two ends of the conversation are not necessarily controlled by the same organization and same release cycle, you may need to consider that one side of the conversation is using a more recent schema version than the other.
So how is that going to be possible? The best treatments I have seen focus on designing schema for extension - new fields are optional, and have documented semantics for how they should be understood if a field is absent.
From that perspective
{}
Doesn't look like a representation of a missing object - it looks like a representation of an object with default values for all of the optional fields.
It might be that what you want is something like Maybe or Option - where instead of promising to send back an object or not, you are promising to send back a collection of zero or one object. Collections I would normally expected to be represented in JSON as a array, rather than an object.
[]
Now, with that idea in pocket, I think it's reasonable to decide that you are returning a representation of a Maybe, where the representation of None is zero bytes long, and the representation of Some(object) is the JSON representation of the object.
So in that design 204 when returning None makes a lot of sense, and you can promise that if a successful response returns a body, that there is really something there.
There's a trade off here - the list form allows consumers to always parse the data, but they have to do that even when a None is sent. On the other hand, using the empty representation for None saves a parse, but requires that the consumer be paying attention to the content length.
So, looking back to your two proposals, I would expect that using 204 is going to be the more successful long term approach.
Another possibility would be to return the null primitive type when you want to express that there is no object available. This would go with a 200 response, because the content length would be four bytes long.
null

HTTP 404 status's text ("Not Found") is the closest to the situation, But:
The first digit of the Status-Code defines the class of response. The
last two digits do not have any categorization role. There are 5
values for the first digit:
1xx: Informational - Request received, continuing process
2xx: Success - The action was successfully received,
understood, and accepted
3xx: Redirection - Further action must be taken in order to
complete the request
4xx: Client Error - The request contains bad syntax or cannot
be fulfilled
5xx: Server Error - The server failed to fulfill an apparently
valid request
(reference)
In practice, 4xx recognized as an error and it is likely some alerts will rise from network / security / logging infrastructure
204 semantic indicate that the server has successfully fulfilled a request and there is no additional content to send - not exactly what happening.
A common use case is to return 204 as a result of a PUT request, updating the resource.
Therefore I would recommend using either:
HTTP 200 with an empty object / array
like you suggested.
HTTP 200 returning a null object, e.g.:
"none" (valid JSON)
or
{
"name": "NO_PRIMARY_CUSTOMER"
}
(implementation of such a null object depends on your specific system behavior with the returned data)
Custom HTTP 2xx code with an empty result
Less common, but still workable alternative is to return a custom HTTP code within the 2xx range (e.g. HTTP 230) with an empty result.
This option should be used with extra caution or even avoided if the API is exposed to a wide audience that may use unknown tools to access / monitor the API.

Correct HTTP status code for possible absent entity: 200 or 204 or 404

I've searched a lot about this and have found different answers, also, my case is a little different.
The context:
I have a document A with a possible sender S
Server X (so not a browser) requests the sender from document A on Server Y, but the sender is absent.
What should server Y return:
200: with a null object (not really OK and dangerous for nullpointers on Server X)
204: a correct status I think, but this is mainly used when the endpoint does not return data in general (e.g. post, update, delete), which can be confusing
404: this should definitely the answer for .../sender/{sender_id}. But in this case we ask the sender of a document, and no sender is a correct answer...
So, what would be the best practice, or is there another approach which is better fitting for this.
Thanks in advance!

Broad rule: don't try to make status codes specific the details of your api or your domain model. Those are messages to generic components (like browsers, caches, proxies) that don't need to know anything about the specifics of your domain model and your integration protocol.
404: this should definitely the answer for .../sender/{sender_id}. But in this case we ask the sender of a document, and no sender is a correct answer...
The 4xx class of response codes indicate an error in the client request. In other words, the client asked the question wrong. 404 specifically implies that the client addressed the request to the wrong integration resource.
So it's not what you are looking for.
204: a correct status I think, but this is mainly used when the endpoint does not return data in general (e.g. post, update, delete), which can be confusing
204 has a very specific meaning - is says that the representation provided in the response is zero bytes long. "You asked me to send you the contents of this file, and I'm successfully doing so, but by the way the file is empty."
So if the representation of an absent sender is zero bytes long, aces! But if the representation is instead an empty json object
{}
Then 204 is off the table.
200 is probably your best bet.

I would suggest 404 Not Found matches best what you describe as "sender is absent"
Just in case here you can see a list of status codes: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes

There are two cases here.
If you are requesting a collection eg: /users and there are no users available on the server then your service should return 200 OK with an empty list.
If you are requesting a resource by id eg: /users/id and if the user is not available then you should return 404 NOT Found as the user id that you are searching is not available on the server.
Based on your situation(assuming you are requesting for a collection of resources), I probably recommend returning 200 OK with the empty list. If you does not have control over the server then you may need to have a null check on the client side.

Sending application specific messages

There is a change to our business logic, where earlier with one of the APIs we use to return a list, for eg. list of employees. Recently we introduced authorization checks, to see if a particular user has permission to view a specific employee.
If say there are 10 employees that should be returned through method GET, due to the missing permission only 5 are returned. The request itself in this case is successful. I am currently not sure how to pass on the information back to the client that there were 5 employees that are filtered out due to missing permission.
Should this be mapped to HTTP status codes? If yes, which status code fits this? Or this is not an error at all?
What would be the best approach in this case?

A status code by itself wouldn't be sufficient to indicate the partial response. The status code 206 sounds close by name but is used when a client specifically requests a partial set of data based on headers.
Use 200. The request was fulfilled successfully after all, and the reason for the smaller set of data is proprietary to your API so extra metadata in the response to indicate a message might be sufficient.
Assuming JSON response:
{
"data": [ ... ],
"messages": [
"Only some data was returned due to permissions."
]
}
If you have many consumers and are worried about backward compatibility you may also want to provide a vendor specific versioned JSON media type:
"Content-Type": "application/vnd.myorg-v2+json"

What status code should I return when part of a bulk update fails?

Let's say I'm a server responding to a request to do a bulk create of some entity. Let's say that I've also decided to make it so that if one instance of the entity can't be created, due to a server error or user error, I will still create the other entity instances. In this scenario what should I return? A 201 because I created most of the entities in the request? Or A 4xx/5xx since there was an error while creating one of the entities?

If you return a 4xx code it implies that the entire request has failed, and the server-state has not changed.
If the intent of the request is to do 'one or more things and some may fail', then a partial application is still a success, so that puts in the 2XX range.
206 is not a good idea. This is specifically for requests that use Range, which is not the case here.
207 could be used. You'll probably want to define a custom format instead of the default XML-based on. My vote would probably just go to 200.
Also, consider just doing many requests. Requests are cheap, why lump them together? Now each request can have their own beautiful, accurate status code.

In that case you could return a multi-status (207) response where you acknowledge the result for each batch entry. That way the client would have a complete awareness of the results. However, that type of HTTP status involves more complex processing on the client-side.

I think 201 is not a good choice because it says "Accepted" which mean everything is alright. But not in your case. Maybe 206 is a good idea which means "Partial Content" according to Wiki "The server is delivering only part of the resource..."

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse