REST: Processing error in long running operation, how to inform clients? - rest

I am developing a webservice that alllows users to request validation reports. Report generation might take up to 20 hours per report. When a new validation request is posted, I return a 202 Accepted answer with Location set to a processing queue (e.g./queue/5) When the queue resource is polled some processing information is provided:
<queueResponse>
<status>QUEUED</status>
<queuePosition>1</queuePosition>
</queueResponse>
Once processing completes successfully and the queue is polled, a 303 see other will redirect to the created resource (at /reports/5 e.g.).
However if a processing error occurs on server, i simply return my queueResponse without redirect and status set to <status>ERROR</status>.
Is this the best way to comunicate a processing error to the client? Or should instead simply a 500 Internal Server Error returned when polling the queue for a failed validation task?.....

Your current solution is best. A 500 error for the queued process information would indicate that the request for that resource had failed, not the process it was reporting on.
postscript: If your API is still being defined, I would suggest FAILED instead of ERROR, as it sounds more permanent. Errors are potentially recoverable situations, failures are not.

Related

How to handle client response during Transient Exception retrying?

Context
I'm developing a REST API that, as you might expect, is backed by multiple external cross-network services, APIs, and databases. It's very possible that a transient failure is encountered at any point and for which the operation should be retried. My question is, during that retry operation, how should my API respond to the client?
Suppose a client is POSTing a resource, and my server encounters a transient exception when attempting to write to the database. Using a combination of the Retry Pattern perhaps with the Circuit Breaker Pattern, my server-side code should attempt to retry the operation, following randomized linear/exponential back-off implementations. The client would obviously be left waiting during that time, which is not something we want.
Questions
Where does the client fit into the retry operation?
Should I perhaps provide an isTransient: true indicator in the JSON response and leave the client to retry?
Should I leave retrying to the server and respond with a message and status code indicative that the server is actively retrying the request and then have the client poll for updates? How would you determine the polling interval in that case without overloading the server? Or, should the server respond via a web socket instead so the client need not poll?
What happens if there is an unexpected server crash during the retry operation? Obviously, when the server recovers, it won't "remember" the fact that it was retrying an operation unless that fact was persisted somewhere. I suppose that's a non-critical issue that would just cause further unnecessary complexity if I attempted to solve it.
I'm probably over-thinking the issue, but while there is a lot of documentation about implementing transient exception retry logic, seldom have I come across resources that discuss how to leave the client "pending" during that time.
Note: I realize that similar questions have been asked, but my queries are more specific, for I'm specifically interested in the different options for where the client fits into a given retry operation, how the client should react in those cases, and what happens should a crash occur that interrupts a retry sequence.
Thank you very much.
There are some rules for retry:
always create an idempotency key to understand that there is retry operation.
if your operation a complex and you want to wrap rest call with retry, you must ensure that for duplicate requests no side effects will be done(start from failure point and don't execute success code).
Personally, I think the client should not know that you retry something, and of course, isTransient: true should not be as a part of the resource.
Warning: Before add retry policy to something you must check side effects, put retry policy everywhere is bad practice

Async resource creation with OData

In REST API if I have a resource which creation could take considerable amount of time I could return a temporary resource with status code 202. Client then could poll this temporary resource until the actual resource is created and get redirected to it when it'd done (with 303 status code). Something like described in http://restcookbook.com/Resources/asynchroneous-operations/.
Is there any standardized way to created such resources in OData?
Asynchronous requests are (briefly) mentioned in the OData V4 specification. It's probably worth reading for the fine details, but in short:
A client makes a request which includes a Prefer: respond-async header. The server can then respond with a HTTP 202 response as you described. This response includes a Location header which points to the 'status monitor resource'.
When the client sends a request to the status monitor resource there are 3 main responses:
HTTP 202: The operation is not yet completed.
HTTP 200: The operation is completed. This response must also include the AsyncResult header which holds the status code of the operation (e.g. a 200 for success, 5xx for error etc.). The body of this response contains the result of the operation.
HTTP 404:
The operation does not exist.
The operation was canceled.
The operation may have existed, but the client waited too long before requesting the status (May also be HTTP 410 (Gone)).
I don't know of any framework which implements this behavior, so you'll probably have to program it yourself.

What is the most appropriate HTTP status code for an already processed POST request?

I have a RESTful API that is used by another internal application that post updates to it.
The problem is that some unexpected peaks occur and, during those times, a request might take longer than 60 seconds (the limit defined by the load balancer, which I cannot change) to respond, which causes a 504 Gateway Timeout error.
When the latter application gets such response, it will retry the request again after 10 minutes or so.
This caused some requests to be processed twice, because the first request was successful, but took more than 60 seconds.
So I decided to use Idempotency Keys in the requests to avoid this problem. The issue is that I don't know what I should return in this case.
Should I just stick with 200 OK? Should I return some 4xx code?
I'd say it highly depends if it is an error for you or not. But I'd say the exact response code is more a matter of taste rather than best practice. But since I guess you're rejecting the duplicated requests, you want to report an error code such as 409 Conflict:
Indicates that the request could not be processed because of conflict
in the current state of the resource, such as an edit conflict between
multiple simultaneous updates.
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_errors
Whenever a resource conflict would be caused by fulfilling the request. Duplicate entries and deleting root objects when cascade-delete is not supported are a couple of examples.
https://www.restapitutorial.com/httpstatuscodes.html
A potentially useful reference is RFC 5789, which describes the PATCH method. Obviously, you aren't doing a patch, but the error handling is analogous.
For instance, if you were sending a JSON Patch document, then you might be ensuring idempotent behavior by including a test operation that checks that the resource is in the expected initial state. After your operation, that check would presumably fail. In that case, the error handling section directs your attention to RFC 5789 -- section 2.2 outlines a number of different possible cases.
Another source of inspiration is to look at RFC 7232 which describes conditional requests. The section on If-Match includes this gem:
An origin server MUST NOT perform the requested method if a received If-Match condition evaluates to false; instead, the origin server MUST respond with either a) the 412 (Precondition Failed) status code or b) one of the 2xx (Successful) status codes if the origin server has verified that a state change is being requested and the final state is already reflected in the current state of the target resource (i.e., the change requested by the user agent has already succeeded, but the user agent might not be aware of it, perhaps because the prior response was lost or a compatible change was made by some other user agent).
From this, I infer that 200 is completely acceptable if you can determine that the work was already done successfully.

http REST consumer timeout

I implemented a https/REST provider in node.js using express. The function is calling a webservice, transforming/enhancing data and returning transformed data as csv using response. Execution time of one get request is between 4 minutes 30 seconds and 5 minutes. I want to test the implementation by calling the url.
Problem:
execution in google chrome fails since it runs to long. No option to
increase the time out value.
execution in modzilla firefox:
network.http.response.timeout changed. Now the request is executed
over and over again. Looks like the response is ignored completely.
execution in postman: changed settings->general->XHR timeout in ms(...) .
Nevertheless execution stops every time after the same amount of seconds with
message: "Could not get any response" .
My question: which tool(s) can I use for reliable testing of long running http REST requests?
curl has a --max-time in seconds setting which should do what you want.
curl -m 330 http://you.url
But it might be worth creating a background job and polling for completion of the background job instead. HTTP isn't best suited to long running tasks.
I suggest you to use Socket IO to async response with pub/sub when the csv file is ready In the client send the request and put a timeout of 6 minutes for example, the server in the request return an ack to confirm the file process start, when the file is ready, return with Socket IO the file, Socket IO can be integrated with express
http://socket.io/
Do you have control over the server? If so, you should alter how it operates. Instead of the initial request expecting a response containing the answer, your API should emit a token (a URI) from where the status of the operation can be obtained. The status will either be "in progress" or "completed; here's your answer: ..."
You make the problem (the long-running operation) into its own first-class entity on your server.

Gatling synchronous Http request/response chain

I have implemented a chain of executions and each execution will send a HTTP request to the server and does check if the response status is 2XX. I need to implement a synchronous model in which the next execution in the chain should only get triggered when the previous execution is successful i.e response status is 2xx.
Below is the snapshot of the execution chain.
feed(postcodeFeeder).
exec(Seq(LocateStock.locateStockExecution, ReserveStock.reserveStockExecution, CancelOrder.cancelStockExecution,
ReserveStock.reserveStockExecution, ConfirmOrder.confirmStockExecution, CancelOrder.cancelStockExecution)
Since gatling has asynchronous IO model, what am currently observing is the HTTP requests are sent to the server in an asynchronous manner by a number of users and there is no real dependency between the executions with respect to a single user.
Also I wanted to know for an actor/user if an execution in a chain fails due the check, does it not proceed with the next execution in the chain?
there is no real dependency between the executions with respect to a single user
No, you are wrong. Except when using "resources", requests are sequential for a given user. If you want to stop the flow for a given user when it encounters an error, you can use exitblockonfail.
Gatling does not consider the failure response from the previous request before firing next in chain. You may need to cover the entire block with exitBlockOnFail{} to block the gatling to fire next.