Should it matter if a call to a private REST API returns 400 or 500?

We have a private REST API that is locked down and only ever called by software we control, not the public. Many endpoints take a JSON payload. If deserialising the JSON payload fails (e.g. the payload has an int where a Guid is expected), an exception is thrown and the API returns a 500 Internal Server Error. Technically, it should return a 400 Bad Request in this circumstance.
Without knowing how much effort is required to ensure a 400 is returned in this circumstance, is there any benefit in changing the API to return a 400? The calling software and QA are the only entities that see this error, and it only occurs if the software is sending data that doesn't match the expected model, which is a critical defect anyway. I see this as extra effort and maintenance for no gain.
Am I missing something here that the distinction between 400 and 500 would significantly help with?

From a REST perspective:
If you want to follow strict REST principles, you should return 4xx, as the problem is with the data being sent and not with the server program.
5xx codes are reserved for server errors, for example when the server was not able to execute the method due to a site outage or software defect. 5xx-range status codes SHOULD NOT be utilized for validation or logical error handling.
From a technical perspective:
The reported error does not convey useful information if another programmer/team has to work on the issue tomorrow.
If tomorrow you have to log your errors in a central error log, you will pollute it with wrong status codes.
As a consequence, if QA decides to run reports/metrics on errors, they will be erroneous.
You may be increasing your technical debt, which can impact your productivity in the future.
The least you can do is to log this issue or create a ticket if you use a tool like JIRA.
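To give a feel for the effort involved: in many frameworks this is a single exception handler. A minimal sketch, assuming a Java/Spring MVC service (the question itself sounds like .NET, so treat the class and handler names as illustrative only):

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.http.converter.HttpMessageNotReadableException;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

// Illustrative sketch: translate JSON binding failures into 400 instead of letting
// them bubble up as 500.
@RestControllerAdvice
public class JsonErrorHandler {

    // Thrown when the request body cannot be deserialised into the expected model,
    // e.g. an int where a Guid/UUID is expected.
    @ExceptionHandler(HttpMessageNotReadableException.class)
    public ResponseEntity<String> onUnreadableBody(HttpMessageNotReadableException ex) {
        return ResponseEntity
                .status(HttpStatus.BAD_REQUEST)
                .body("Malformed request body: " + ex.getMostSpecificCause().getMessage());
    }
}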

Should it matter if a call to a private REST API returns 400 or 500?
A little bit.
The status code is meta data:
The status-code element is a 3-digit integer code describing the result of the server's attempt to understand and satisfy the client's corresponding request. The rest of the response message is to be interpreted in light of the semantics defined for that status code.
Because we have a shared understanding of the status codes, general purpose clients can use that meta data to understand the broad meaning of the response, and take sensible actions.
The primary difference between 4xx and 5xx is the general direction of the problem. 4xx indicates a problem in the request, and by implication with the client
The 4xx (Client Error) class of status code indicates that the client seems to have erred.
5xx indicates a problem at the server.
The 5xx (Server Error) class of status code indicates that the server is aware that it has erred or is incapable of performing the requested method
So imagine, if you would, a general purpose reverse proxy acting as a load balancer. How might the proxy take advantage of the ability to discriminate between 4xx and 5xx?
Well... 5xx suggests that the query itself might be fine. So the proxy could try routing the request to another healthy instance in the cluster, to see if a better response is available. It could look at the pattern of 5xx responses from a specific member of the cluster, and judge whether that instance is healthy or unhealthy. It could then evict that unhealthy instance and provision a replacement.
On the other hand, with a 4xx status code, none of those mitigations make any sense - we know instead that the problem is with the client, and that forwarding the request to another instance isn't going to make things any better.
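As a rough sketch of that reasoning (this is not any particular proxy's real algorithm; the Instance and LoadBalancer types below are made up for illustration):

import java.util.List;

// A stand-in for a backend instance; send() returns the HTTP status code it produced.
interface Instance { int send(String request); }

class LoadBalancer {
    int forward(String request, List<Instance> healthyInstances) {
        for (Instance instance : healthyInstances) {
            int status = instance.send(request);
            if (status < 500) {
                // 2xx/3xx/4xx: the request reached a working server. A 4xx means the
                // problem is in the request itself, so retrying elsewhere won't help.
                return status;
            }
            // 5xx: the request may be fine but this instance may be sick; a real proxy
            // would record the failure here to feed health checks and eviction.
        }
        return 502; // Bad Gateway: no instance produced a usable answer
    }
}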
Even if you aren't going to automatically mitigate the server errors, it can still be useful to discriminate between the two error codes, for internal alarms and reporting.
(In the system I maintain, we're using general purpose monitoring that distinguishes 4xx and 5xx responses, with different thresholds to determine if I should be paged. As you might imagine, I'm rather invested in having that system be well tuned.)

Related

How to handle client response during Transient Exception retrying?

Context
I'm developing a REST API that, as you might expect, is backed by multiple external cross-network services, APIs, and databases. It's very possible that a transient failure will be encountered at any point, in which case the operation should be retried. My question is, during that retry operation, how should my API respond to the client?
Suppose a client is POSTing a resource, and my server encounters a transient exception when attempting to write to the database. Using a combination of the Retry Pattern perhaps with the Circuit Breaker Pattern, my server-side code should attempt to retry the operation, following randomized linear/exponential back-off implementations. The client would obviously be left waiting during that time, which is not something we want.
Questions
Where does the client fit into the retry operation?
Should I perhaps provide an isTransient: true indicator in the JSON response and leave the client to retry?
Should I leave retrying to the server and respond with a message and status code indicative that the server is actively retrying the request and then have the client poll for updates? How would you determine the polling interval in that case without overloading the server? Or, should the server respond via a web socket instead so the client need not poll?
What happens if there is an unexpected server crash during the retry operation? Obviously, when the server recovers, it won't "remember" the fact that it was retrying an operation unless that fact was persisted somewhere. I suppose that's a non-critical issue that would just cause further unnecessary complexity if I attempted to solve it.
I'm probably over-thinking the issue, but while there is a lot of documentation about implementing transient exception retry logic, seldom have I come across resources that discuss how to leave the client "pending" during that time.
Note: I realize that similar questions have been asked, but my queries are more specific, for I'm specifically interested in the different options for where the client fits into a given retry operation, how the client should react in those cases, and what happens should a crash occur that interrupts a retry sequence.
Thank you very much.
There are some rules for retry:
always create an idempotency key so that the server can recognize that a request is a retry of an earlier one.
if your operation is complex and you want to wrap a REST call with retry, you must ensure that duplicate requests cause no additional side effects (resume from the failure point and don't re-execute the code that already succeeded).
Personally, I think the client should not know that you retry something, and of course isTransient: true should not be part of the resource.
Warning: before adding a retry policy to something you must check for side effects; putting a retry policy everywhere is bad practice.
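As a concrete illustration of the first rule, here is a minimal client-side sketch, assuming Java 11+ and a conventional Idempotency-Key header (the header name is a common convention, not an HTTP standard, and the retry limits are placeholders). The same key is reused on every attempt so the server can deduplicate:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

public class RetryingClient {
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    public static HttpResponse<String> postWithRetry(String url, String json) throws Exception {
        String idempotencyKey = UUID.randomUUID().toString(); // same key on every attempt
        long delayMillis = 200;
        for (int attempt = 1; ; attempt++) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                    .header("Content-Type", "application/json")
                    .header("Idempotency-Key", idempotencyKey)
                    .POST(HttpRequest.BodyPublishers.ofString(json))
                    .build();
            HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() < 500 || attempt >= 5) {
                return response; // success, client error, or retries exhausted
            }
            // randomized exponential back-off before the next attempt
            Thread.sleep(delayMillis + ThreadLocalRandom.current().nextLong(delayMillis));
            delayMillis *= 2;
        }
    }
}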

What is the most appropriate HTTP status code for an already processed POST request?

I have a RESTful API that is used by another internal application that posts updates to it.
The problem is that some unexpected peaks occur and, during those times, a request might take longer than 60 seconds (the limit defined by the load balancer, which I cannot change) to respond, which causes a 504 Gateway Timeout error.
When the latter application gets such a response, it retries the request after 10 minutes or so.
This caused some requests to be processed twice, because the first request was successful, but took more than 60 seconds.
So I decided to use Idempotency Keys in the requests to avoid this problem. The issue is that I don't know what I should return in this case.
Should I just stick with 200 OK? Should I return some 4xx code?
I'd say it highly depends on whether it is an error for you or not, and the exact response code is more a matter of taste than of best practice. But since I guess you're rejecting the duplicated requests, you want to report an error code such as 409 Conflict:
Indicates that the request could not be processed because of conflict in the current state of the resource, such as an edit conflict between multiple simultaneous updates.
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_errors
Whenever a resource conflict would be caused by fulfilling the request. Duplicate entries and deleting root objects when cascade-delete is not supported are a couple of examples.
https://www.restapitutorial.com/httpstatuscodes.html
A potentially useful reference is RFC 5789, which describes the PATCH method. Obviously, you aren't doing a patch, but the error handling is analogous.
For instance, if you were sending a JSON Patch document, then you might be ensuring idempotent behavior by including a test operation that checks that the resource is in the expected initial state. After your operation, that check would presumably fail. In that case, the error handling section directs your attention to RFC 5789 -- section 2.2 outlines a number of different possible cases.
Another source of inspiration is to look at RFC 7232 which describes conditional requests. The section on If-Match includes this gem:
An origin server MUST NOT perform the requested method if a received If-Match condition evaluates to false; instead, the origin server MUST respond with either a) the 412 (Precondition Failed) status code or b) one of the 2xx (Successful) status codes if the origin server has verified that a state change is being requested and the final state is already reflected in the current state of the target resource (i.e., the change requested by the user agent has already succeeded, but the user agent might not be aware of it, perhaps because the prior response was lost or a compatible change was made by some other user agent).
From this, I infer that 200 is completely acceptable if you can determine that the work was already done successfully.
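If you go the idempotency-key route, the server-side bookkeeping can be quite small. A minimal sketch (the handler and store are hypothetical; in production the store would be shared and persistent, with a TTL):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentPostHandler {

    record StoredResponse(int status, String payload) {}

    private final Map<String, StoredResponse> processed = new ConcurrentHashMap<>();

    public StoredResponse handle(String idempotencyKey, String body) {
        // computeIfAbsent makes "check, do the work once, remember the outcome" atomic
        // for this map; a retried request with the same key gets the stored response back,
        // i.e. the 200 that the RFC 7232 reasoning above says is acceptable when the change
        // already succeeded. If you prefer to treat duplicates as errors, look the key up
        // first and return a 409 Conflict instead.
        return processed.computeIfAbsent(idempotencyKey, key -> process(body));
    }

    private StoredResponse process(String body) {
        // application logic goes here
        return new StoredResponse(200, "processed: " + body);
    }
}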

If a GET request's response changes, is idempotency respected?

I am reading a lot about REST APIs and I keep stumbling upon the term idempotency. Basically GET, HEAD, PUT, DELETE and OPTIONS are all idempotent, and POST is not.
This statement on http://www.restapitutorial.com/lessons/idempotency.html made me doubt my understanding of idempotency.
From a RESTful service standpoint, for an operation (or service call) to be idempotent, clients can make that same call repeatedly while producing the same result. In other words, making multiple identical requests has the same effect as making a single request. Note that while idempotent operations produce the same result on the server (no side effects), the response itself may not be the same (e.g. a resource's state may change between requests).
So does idempotency actually have something to do with server work or with the response?
What confuses me is that if I have
GET /users/5
returning
{
    "first_name": "John",
    "last_name": "Doe",
    "minutes_active": 10
}
and then make the same request one minute later, I get
GET /users/5
{
    "first_name": "John",
    "last_name": "Doe",
    "minutes_active": 11
}
How is this idempotent?
Furthermore, if the response contains some kind of UUID which is unique for each response, would that break idempotency?
And finally, is idempotency the same server work over and over again, or the same result over and over again for the same/single request?
So does idempotency actually have something to do with server work or with the response?
It refers to the server's state after subsequent requests of the same type.
So, let's suppose that the client makes a request that changes the server's old state, for example S1, to a new state, S2, and then makes the same request again.
If the method is idempotent, then it is guaranteed that the second request will not change the server's state again; it will remain S2.
But if the method is not idempotent, there is no guarantee that the state would remain the same, S2; it may change to whatever state the server wants, for example S3 or S1. So, in this case the client should not send the command again if a communication error occurs, because the outcome may not be the same as the first time it sent the command.
GET /users/5
How is this idempotent?
You may call this URL using the GET method as many times as you want, and the server will not change its internal state, e.g. the last_name of the user; since the state does not change, GET is idempotent.
Furthermore, if the response contains some kind of UUID which is unique for each response, would that break idempotency?
The response has nothing to do with the server's state after subsequent requests of the same type, so the response could be unique for each request and the request would still be idempotent. For example, in the GET request from your question, minutes_active would be greater each minute, and this does not make GET non-idempotent.
Another example of an idempotent method is DELETE. If you delete a user, then it is gone/deleted. Because DELETE is idempotent, after a second attempt/request to delete the same user, the user would remain deleted, so the state would not change. Of course, the second response could be a little different, something like "warning, user already deleted", but this has nothing to do with idempotency.
For understanding idempotency in REST, your best starting point is probably going to be the definition included in RFC 7231:
A request method is considered "idempotent" if the intended effect on the server of multiple identical requests with that method is the same as the effect for a single such request.
For "effect", think side effect. When the server is advertising that a particular action is idempotent, it is telling you that the (semantically significant) side effects will happen at most once.
// the guarantee of an idempotent operation
oldState.andThen(PUT(newState)) === oldState.andThen(PUT(newState)).andThen(PUT(newState))
Safe methods are inherently idempotent, because they have no effect on the server.
// the guarantee of a safe operation
oldState === oldState.andThen(GET)
// therefore, the guarantee of an idempotent operation follows trivially
oldState.andThen(GET) === oldState.andThen(GET).andThen(GET)
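To make those guarantees concrete, here is a small sketch with a hypothetical in-memory user store (the method names mirror the HTTP methods):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Users {
    private final Map<Integer, String> users = new HashMap<>();
    private final List<String> auditLog = new ArrayList<>();

    // Safe (and therefore idempotent): no server state changes, however often it is called.
    public String get(int id) { return users.get(id); }

    // Idempotent but not safe: calling it once or five times leaves the same state
    // (user 5 is named "John" either way).
    public void put(int id, String name) { users.put(id, name); }

    // Idempotent: deleting an already-deleted user changes nothing further.
    public void delete(int id) { users.remove(id); }

    // Not idempotent: every call appends another entry, so two identical requests
    // leave the server in a different state than one request would.
    public void post(String entry) { auditLog.add(entry); }
}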
So does idempotency actually have something to do with server work or with the response?
Server work. More generally, it's a constraint on the receiver of a command to change state.
Roy Fielding shared this observation in 2002:
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property (money, BTW, is considered property for the sake of this definition).
If you substitute PUT/DELETE for GET, and idempotent for safe, I think you get a good picture -- if a loss of property occurs because the server received two copies of an idempotent request, the fault is that the server handled the request improperly, not that the client broadcast the request more than once.
This is important because it allows for at-least-once delivery over an unreliable network. From RFC 7231:
Idempotent methods are distinguished because the request can be repeated automatically if a communication failure occurs before the client is able to read the server's response.
Contrast this with POST, which does not promise idempotent handling. Submitting a web form twice may produce two side effects on the server, so client implementations (and intermediary components, like proxies) cannot assume it is safe to repeat a lost request.
Back in the day, browser dialogs warning about re-submitting form data were common for precisely this reason.
And finally, is idempotency the same server work over and over again, or the same result over and over again for the same/single request?
Work on the server. An idempotent change is analogous to SET, or REPLACE (aka, compare and swap).
The responses may, of course, be different. A conditional PUT, for example, will include meta data "indicating a precondition to be tested before applying the method semantics to the target resource."
So the server might change state in response to receiving the first copy of a PUT, sending back 200 OK to indicate to the client that the request was successful; but upon receiving the second request, the server will find that the now-changed state of the resource no longer matches the provided meta data, and will respond with 412 Precondition Failed.
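A minimal sketch of that exchange, with the resource and validator handling heavily simplified (the server assigns the ETag; a real implementation would also handle a missing If-Match header):

public class ConditionalPutHandler {
    private String body = "v1";
    private String etag = "\"1\"";

    // Returns the status code the server would send for a PUT carrying an If-Match header.
    public synchronized int put(String ifMatch, String newBody) {
        if (!etag.equals(ifMatch)) {
            return 412;   // Precondition Failed: the state has already moved on
        }
        body = newBody;
        etag = "\"" + Integer.toHexString(body.hashCode()) + "\"";  // new validator for the new state
        return 200;       // OK: precondition held, state changed exactly once
    }
}

// First copy of the request:    put("\"1\"", "v2") -> 200, and the ETag changes.
// Retried copy (stale If-Match): put("\"1\"", "v2") -> 412, state stays as it is.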
I noticed you said "may produce" in "Contrast this with POST, which does not promise idempotent handling. Submitting a web form twice may produce two side effects on the server...". Basically, REST standards declare POST as non-idempotent, but one could actually make POST idempotent; it would just be contrary to the REST standards...
No, that's not quite right. The HTTP specification does not require that POST support idempotent semantics -- which means that clients and intermediaries are not permitted to assume that it does. The server controls its own implementation; it may provide idempotent handling of the POST request.
But the server has no way to advertise this capability to the clients (or intermediaries), so that they can take advantage of it.

Fix Message ER Rejected No Route Defined

I'm trying to send a New Order Single message, but I'm getting an ER (Execution Report) Rejected message that says that the order reject reason is UNKNOWN ORDER and "No Route Defined", and I couldn't find any explanation for this error.
If anyone knows what "No Route Defined" means I would be grateful, thanks.
Typically this means the FIX engine and transaction infrastructure at the other end (it's usually a broker you are connecting to with FIX) does not know what to do with the order you are sending to them. Specifically, it does not know which exchange or other handling venue to route it to. Hence 'no route'.
This may be because it does not recognize the instrument in your order, or because some combination of parameters on your order is invalid, although typically you would get a more informative error message if this were the cause.
Other causes include the broker's connection to a downstream handling system (e.g. an exchange or trading desk) having been interrupted. Sometimes this is a transient situation - a service interruption or a time-of-day issue outside regular trading hours (RTH).
In any case, the message indicates that a valid servicing destination for your order cannot be found at the moment.

Is it appropriate to return HTTP 503 in response to a database deadlock?

Is it appropriate for a server to return 503 ("Service Unavailable") when the requested operation resulted in a database deadlock?
Here is my reasoning:
Initially I tried avoiding database deadlocks, but I ran across https://stackoverflow.com/a/112256/14731
Next, I tried repeating the request on the server side, but I ran across Java Servlets: How to repeat an HTTP request?. Technically speaking, I can buffer the request entity, but scalability will suffer and clients are more likely to see 503 Service Unavailable anyway.
Seeing as:
It's easier to ask clients to repeat the operation.
They need to be able to handle 503 Service Unavailable anyway.
Database deadlocks are rather rare.
I'm leaning towards this solution. What do you think?
UPDATE: I think returning 503 ("Service Unavailable") is still acceptable if you wish it, but I no longer think it is technically required. See https://stackoverflow.com/a/17960047/14731.
I think semantically 409 Conflict is a better alternative - basically if you have a deadlock there's contention for some resource, and so the operation could not be completed.
Now depending on the reason for the deadlock, the request may not succeed if submitted a second time, but that's true for anything.
For a 503, as a client I'd implement some sort of back-off/circuit-breaker behaviour, on the assumption that the system is overloaded or rate limited, whereas a 409 relates to the specific request.
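Whichever code you settle on, the mapping itself is small. A minimal sketch of the 503-plus-Retry-After variant from the question's update, assuming a Spring service (how a deadlock surfaces is database-specific; the SQLState check below is only a common convention):

import java.sql.SQLException;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

@RestControllerAdvice
public class DeadlockHandler {

    @ExceptionHandler(SQLException.class)
    public ResponseEntity<String> onSqlException(SQLException ex) {
        if (isTransientContention(ex)) {
            return ResponseEntity
                    .status(HttpStatus.SERVICE_UNAVAILABLE)
                    .header("Retry-After", "2")   // seconds; gives clients a back-off hint
                    .body("Transient database contention, please retry");
        }
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body("Database error");
    }

    // Placeholder detection: many databases report deadlocks/serialization failures
    // with SQLState values in class 40, but check your vendor's documentation.
    private boolean isTransientContention(SQLException ex) {
        return ex.getSQLState() != null && ex.getSQLState().startsWith("40");
    }
}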
Just got here with the same question and no clear answer on the issue.
a 503 is acceptable but might not be correctly interpreted
a 409 is also OK, but in my case was not OK (since multiple resources could end up returning this error for the same URL)
In my case I ended up returning a 307 redirect to the same URL.
Clients will automatically "retry" and the second call works because the resource is only raising a deadlock during its initial creation.
Be warned that this might end up in an infinite loop.
I think it's fine so long as the entire transaction is rolled back or if the request is idempotent.